In my package 'glmmML' I'm using old C code and linpack in the
optimizing procedure. Specifically, one part of the code looks like this: F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); if (*info == 0){ F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); ........ This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! I googled for C and nan and found a work-around: Change 'if ...' to if (*info == 0 & (hessian[0][0] == hessian[0][0])){ which works as a test of hessian[0][0] (not) being NaN. I'm using the .C interface for calling C code. Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? Göran Broström PS. Yes, I will switch to .Call! Soon... ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote: > > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: > > F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); > if (*info == 0){ > F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); > ........ > > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! > > I googled for C and nan and found a work-around: Change 'if ...' to > > if (*info == 0 & (hessian[0][0] == hessian[0][0])){ > > which works as a test of hessian[0][0] (not) being NaN. > > I'm using the .C interface for calling C code. > > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? You should/could use macro R_FINITE to test each entry of the hessian. In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: for (j = 0; j < *n; j++) for (i = 0; i < *n; i++) { if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; } There may be a more compact way with a macro in the R headers. I feel that If other code can't handle non-finite values: then test. Berend Hasselman ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
On 02/09/2017 05:05 PM, Berend Hasselman wrote:
>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote: >> >> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >> >> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >> if (*info == 0){ >> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >> ........ >> >> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >> >> I googled for C and nan and found a work-around: Change 'if ...' to >> >> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >> >> which works as a test of hessian[0][0] (not) being NaN. >> >> I'm using the .C interface for calling C code. >> >> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? > You should/could use macro R_FINITE to test each entry of the hessian. > In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; > } > > There may be a more compact way with a macro in the R headers. > I feel that If other code can't handle non-finite values: then test. > > Berend Hasselman mayHaveNaNOrInf in array.c (originally from pqR), but be careful to test the individual operands of the sum. mayHaveNaNOrInf does not do the test for performance reasons, but as a result it can have false positives. Rboolean hasNaNOrInf(double *x, R_xlen_t n) { if ((n&1) != 0 && !R_FINITE(x[0])) return TRUE; for (R_xlen_t i = n&1; i < n; i += 2) if (!R_FINITE(x[i]+x[i+1])&& (!R_FINITE(x[i]) || !R_FINITE(x[i+1])) return TRUE; return FALSE; } Tomas > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
In reply to this post by Berend Hasselman
> > On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote: > > > > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: > > > > F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); > > if (*info == 0){ > > F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); > > ........ > > > > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! > > > > I googled for C and nan and found a work-around: Change 'if ...' to > > > > if (*info == 0 & (hessian[0][0] == hessian[0][0])){ > > > > which works as a test of hessian[0][0] (not) being NaN. > > > > I'm using the .C interface for calling C code. > > > > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? > You should/could use macro R_FINITE to test each entry of the hessian. > In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; > } A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in the R sources themselves, that is not the case in package code. Hence, not only nicer to read but even faster is double *fj = REAL(sexp_fjac); for (j = 0; j < *n; j++) for (i = 0; i < *n; i++) { if( !R_FINITE(fj[(*n)*j + i]) ) error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); rjac[(*ldr)*j + i] = fj[(*n)*j + i]; } > There may be a more compact way with a macro in the R headers. > I feel that If other code can't handle non-finite values: then test. > Berend Hasselman Indeed: do test. Much better safe than going for speed and losing in rare cases! The latter is a recipe to airplanes falling out of the sky ( ;-\ ) and is unfortunately used by some (in)famous "optimized" (fast but sometimes wrong!!) Lapack/BLAS libraries. The NEWS about the next version of R (3.4.0 due in April) has a new 2-paragraph entry related to this: --------------------------------------------------------------------------------- SIGNIFICANT USER-VISIBLE CHANGES: [...........] * Matrix products now consistently bypass BLAS when the inputs have NaN/Inf values. Performance of the check of inputs has been improved. Performance when BLAS is used is improved for matrix/vector and vector/matrix multiplication (DGEMV is now used instead of DGEMM). One can now choose from alternative matrix product implementations via options(matprod = ). The "internal" implementation is unoptimized but consistent in precision with other summation in R (uses long double accumulators). "blas" calls BLAS directly for best performance, yet usually with undefined behavior for inputs with NaN/Inf. --------------------------------------------------------------------------------- Last but not least : If you are not afraid of +/- Inf, but really only of NA/NaN's (as the OP said), then note that "THE manual" (= "Writing R Extensions") does mention ISNAN(.) almost in the same place as the first occurence of R_FINITE(.). Best regards, Martin Maechler, ETH Zurich ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
> On 9 Feb 2017, at 17:44, Martin Maechler <[hidden email]> wrote: > [snip] ... >> You should/could use macro R_FINITE to test each entry of the hessian. >> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > >> for (j = 0; j < *n; j++) >> for (i = 0; i < *n; i++) { >> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >> } > > A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in > the R sources themselves, that is not the case in package code. > > Hence, not only nicer to read but even faster is > > double *fj = REAL(sexp_fjac); > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(fj[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = fj[(*n)*j + i]; > } > Tom, Martin, Thanks for both of your suggestions. Very helpful. Berend Hasselman ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
In reply to this post by Martin Maechler
Thanks to all who answered my third question. I learned something, but:
On 2017-02-09 17:44, Martin Maechler wrote: > >>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote: >>> >>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >>> >>> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >>> if (*info == 0){ >>> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >>> ........ >>> >>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >>> >>> I googled for C and nan and found a work-around: Change 'if ...' to >>> >>> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >>> >>> which works as a test of hessian[0][0] (not) being NaN. >>> >>> I'm using the .C interface for calling C code. >>> >>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? > >> You should/could use macro R_FINITE to test each entry of the hessian. >> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: > >> for (j = 0; j < *n; j++) >> for (i = 0; i < *n; i++) { >> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >> } > > A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in > the R sources themselves, that is not the case in package code. > > Hence, not only nicer to read but even faster is > > double *fj = REAL(sexp_fjac); > for (j = 0; j < *n; j++) > for (i = 0; i < *n; i++) { > if( !R_FINITE(fj[(*n)*j + i]) ) > error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); > rjac[(*ldr)*j + i] = fj[(*n)*j + i]; > } > isn't this even easier to read (and correct?): for (j = 0; j < n*; j++) for (i = 0; i < n*; i++){ if ( !R_FINITE(hessian[i][j]) ) error("blah...") } ? In .C land, that is. (And sure, I'm afraid of ±Inf in this context.) Thanks again, Göran ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
> On 10 Feb 2017, at 14:53, Göran Broström <[hidden email]> wrote: > > Thanks to all who answered my third question. I learned something, but: > > On 2017-02-09 17:44, Martin Maechler wrote: >> >>>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote: >>>> >>>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >>>> >>>> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >>>> if (*info == 0){ >>>> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >>>> ........ >>>> >>>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >>>> >>>> I googled for C and nan and found a work-around: Change 'if ...' to >>>> >>>> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >>>> >>>> which works as a test of hessian[0][0] (not) being NaN. >>>> >>>> I'm using the .C interface for calling C code. >>>> >>>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? >> >>> You should/could use macro R_FINITE to test each entry of the hessian. >>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: >> >>> for (j = 0; j < *n; j++) >>> for (i = 0; i < *n; i++) { >>> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >>> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >>> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >>> } >> >> A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in >> the R sources themselves, that is not the case in package code. >> >> Hence, not only nicer to read but even faster is >> >> double *fj = REAL(sexp_fjac); >> for (j = 0; j < *n; j++) >> for (i = 0; i < *n; i++) { >> if( !R_FINITE(fj[(*n)*j + i]) ) >> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >> rjac[(*ldr)*j + i] = fj[(*n)*j + i]; >> } >> > [...] > > isn't this even easier to read (and correct?): > > for (j = 0; j < n*; j++) > for (i = 0; i < n*; i++){ > if ( !R_FINITE(hessian[i][j]) ) error("blah...") > } > > ? In .C land, that is. (And sure, I'm afraid of ±Inf in this context.) > Only if you have lda and n equal (which you indeed have; but still worth mentioning) when calling dpoco. Berend ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
Thanks Berend, I will make that change and submit to CRAN.
Best, Göran On 2017-02-10 16:13, Berend Hasselman wrote: > >> On 10 Feb 2017, at 14:53, Göran Broström <[hidden email]> wrote: >> >> Thanks to all who answered my third question. I learned something, but: >> >> On 2017-02-09 17:44, Martin Maechler wrote: >>> >>>>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote: >>>>> >>>>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this: >>>>> >>>>> F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info); >>>>> if (*info == 0){ >>>>> F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job); >>>>> ........ >>>>> >>>>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R! >>>>> >>>>> I googled for C and nan and found a work-around: Change 'if ...' to >>>>> >>>>> if (*info == 0 & (hessian[0][0] == hessian[0][0])){ >>>>> >>>>> which works as a test of hessian[0][0] (not) being NaN. >>>>> >>>>> I'm using the .C interface for calling C code. >>>>> >>>>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector? >>> >>>> You should/could use macro R_FINITE to test each entry of the hessian. >>>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac: >>> >>>> for (j = 0; j < *n; j++) >>>> for (i = 0; i < *n; i++) { >>>> if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) ) >>>> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >>>> rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i]; >>>> } >>> >>> A minor hint on that: While REAL(.) (or INTEGER(.) ...) is really cheap in >>> the R sources themselves, that is not the case in package code. >>> >>> Hence, not only nicer to read but even faster is >>> >>> double *fj = REAL(sexp_fjac); >>> for (j = 0; j < *n; j++) >>> for (i = 0; i < *n; i++) { >>> if( !R_FINITE(fj[(*n)*j + i]) ) >>> error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1); >>> rjac[(*ldr)*j + i] = fj[(*n)*j + i]; >>> } >>> >> [...] >> >> isn't this even easier to read (and correct?): >> >> for (j = 0; j < n*; j++) >> for (i = 0; i < n*; i++){ >> if ( !R_FINITE(hessian[i][j]) ) error("blah...") >> } >> >> ? In .C land, that is. (And sure, I'm afraid of ±Inf in this context.) >> > > Only if you have lda and n equal (which you indeed have; but still worth mentioning) when calling dpoco. > > Berend > ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-devel |
Free forum by Nabble - Free Resume Builder | Edit this page |