Ancient C /Fortran code linpack error

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Ancient C /Fortran code linpack error

Göran Broström-3
In my package 'glmmML' I'm using old C code and linpack in the
optimizing procedure. Specifically, one part of the code looks like this:

     F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
     if (*info == 0){
         F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
         ........

This usually works OK, but with an ill-conditioned data set (from a user
of glmmML) it happened that the hessian was all nan. However, dpoco
returned *info = 0 (no error!) and then the call to dpodi hanged R!

I googled for C and nan and found a work-around: Change 'if ...' to

    if (*info == 0 & (hessian[0][0] == hessian[0][0])){

which works as a test of hessian[0][0] (not) being NaN.

I'm using the .C interface for calling C code.

Any thoughts on how to best handle the situation? Is this a bug in
dpoco? Is there a simple way to test for any NaNs in a vector?

Göran Broström

PS. Yes, I will switch to .Call! Soon...

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Berend Hasselman

> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote:
>
> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
>
>    F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
>    if (*info == 0){
>        F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
>        ........
>
> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
>
> I googled for C and nan and found a work-around: Change 'if ...' to
>
>   if (*info == 0 & (hessian[0][0] == hessian[0][0])){
>
> which works as a test of hessian[0][0] (not) being NaN.
>
> I'm using the .C interface for calling C code.
>
> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?

You should/could use macro R_FINITE to test each entry of the hessian.
In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:

    for (j = 0; j < *n; j++)
        for (i = 0; i < *n; i++) {
            if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
                error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
            rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
        }

There may be a more compact way with a macro in the R headers.
I feel that If other code can't handle non-finite values: then test.

Berend Hasselman

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Tomas Kalibera
On 02/09/2017 05:05 PM, Berend Hasselman wrote:

>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote:
>>
>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
>>
>>     F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
>>     if (*info == 0){
>>         F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
>>         ........
>>
>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
>>
>> I googled for C and nan and found a work-around: Change 'if ...' to
>>
>>    if (*info == 0 & (hessian[0][0] == hessian[0][0])){
>>
>> which works as a test of hessian[0][0] (not) being NaN.
>>
>> I'm using the .C interface for calling C code.
>>
>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?
> You should/could use macro R_FINITE to test each entry of the hessian.
> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:
>
>      for (j = 0; j < *n; j++)
>          for (i = 0; i < *n; i++) {
>              if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>                  error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>              rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>          }
>
> There may be a more compact way with a macro in the R headers.
> I feel that If other code can't handle non-finite values: then test.
>
> Berend Hasselman
And if performance was of importance, you could use the trick from
mayHaveNaNOrInf in array.c (originally from pqR), but be careful to test
the individual operands of the sum.
mayHaveNaNOrInf does not do the test for performance reasons, but as a
result it can have false positives.

Rboolean hasNaNOrInf(double *x, R_xlen_t n)
{
     if ((n&1) != 0 && !R_FINITE(x[0]))
         return TRUE;
     for (R_xlen_t i = n&1; i < n; i += 2)
         if (!R_FINITE(x[i]+x[i+1])&& (!R_FINITE(x[i]) || !R_FINITE(x[i+1]))
             return TRUE;
     return FALSE;
}

Tomas
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Martin Maechler
In reply to this post by Berend Hasselman

> > On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote:
> >
> > In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
> >
> >    F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
> >    if (*info == 0){
> >        F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
> >        ........
> >
> > This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
> >
> > I googled for C and nan and found a work-around: Change 'if ...' to
> >
> >   if (*info == 0 & (hessian[0][0] == hessian[0][0])){
> >
> > which works as a test of hessian[0][0] (not) being NaN.
> >
> > I'm using the .C interface for calling C code.
> >
> > Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?

> You should/could use macro R_FINITE to test each entry of the hessian.
> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:

>     for (j = 0; j < *n; j++)
>         for (i = 0; i < *n; i++) {
>             if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>                 error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>             rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>         }

A minor hint  on that:  While REAL(.)  (or INTEGER(.) ...)  is really cheap in
the R sources themselves, that is not the case in package code.

Hence, not only nicer to read but even faster is

  double *fj = REAL(sexp_fjac);
  for (j = 0; j < *n; j++)
    for (i = 0; i < *n; i++) {
        if( !R_FINITE(fj[(*n)*j + i]) )
           error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
           rjac[(*ldr)*j + i] = fj[(*n)*j + i];
     }


> There may be a more compact way with a macro in the R headers.
> I feel that If other code can't handle non-finite values: then test.

> Berend Hasselman

Indeed: do test.
Much better safe than going for speed and losing in rare cases!

The latter is a recipe to airplanes falling out of the sky  ( ;-\ )
and is unfortunately used by some (in)famous "optimized" (fast but
sometimes wrong!!) Lapack/BLAS libraries.

The NEWS about the next version of R (3.4.0 due in April) has
a new 2-paragraph entry related to this:

---------------------------------------------------------------------------------

  SIGNIFICANT USER-VISIBLE CHANGES:

[...........]

    * Matrix products now consistently bypass BLAS when the inputs have
      NaN/Inf values. Performance of the check of inputs has been
      improved. Performance when BLAS is used is improved for
      matrix/vector and vector/matrix multiplication (DGEMV is now used
      instead of DGEMM).

      One can now choose from alternative matrix product
      implementations via options(matprod = ).  The "internal"
      implementation is unoptimized but consistent in precision with
      other summation in R (uses long double accumulators).  "blas"
      calls BLAS directly for best performance, yet usually with
      undefined behavior for inputs with NaN/Inf.

---------------------------------------------------------------------------------


Last but not least :

If you are not afraid of +/- Inf, but really only of NA/NaN's (as the OP said),
then note that "THE manual" (= "Writing R Extensions") does mention
ISNAN(.) almost in the same place as the first occurence of
R_FINITE(.).

Best regards,
Martin Maechler, ETH Zurich

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Berend Hasselman

> On 9 Feb 2017, at 17:44, Martin Maechler <[hidden email]> wrote:
>

[snip] ...

>> You should/could use macro R_FINITE to test each entry of the hessian.
>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:
>
>>    for (j = 0; j < *n; j++)
>>        for (i = 0; i < *n; i++) {
>>            if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>>                error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>>            rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>>        }
>
> A minor hint  on that:  While REAL(.)  (or INTEGER(.) ...)  is really cheap in
> the R sources themselves, that is not the case in package code.
>
> Hence, not only nicer to read but even faster is
>
>  double *fj = REAL(sexp_fjac);
>  for (j = 0; j < *n; j++)
>    for (i = 0; i < *n; i++) {
>        if( !R_FINITE(fj[(*n)*j + i]) )
>           error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>           rjac[(*ldr)*j + i] = fj[(*n)*j + i];
>     }
>

Tom, Martin,

Thanks for both of your suggestions. Very helpful.

Berend Hasselman

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Göran Broström-3
In reply to this post by Martin Maechler
Thanks to all who answered my third question. I learned something, but:

On 2017-02-09 17:44, Martin Maechler wrote:

>
>>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote:
>>>
>>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
>>>
>>>    F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
>>>    if (*info == 0){
>>>        F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
>>>        ........
>>>
>>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
>>>
>>> I googled for C and nan and found a work-around: Change 'if ...' to
>>>
>>>   if (*info == 0 & (hessian[0][0] == hessian[0][0])){
>>>
>>> which works as a test of hessian[0][0] (not) being NaN.
>>>
>>> I'm using the .C interface for calling C code.
>>>
>>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?
>
>> You should/could use macro R_FINITE to test each entry of the hessian.
>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:
>
>>     for (j = 0; j < *n; j++)
>>         for (i = 0; i < *n; i++) {
>>             if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>>                 error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>>             rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>>         }
>
> A minor hint  on that:  While REAL(.)  (or INTEGER(.) ...)  is really cheap in
> the R sources themselves, that is not the case in package code.
>
> Hence, not only nicer to read but even faster is
>
>   double *fj = REAL(sexp_fjac);
>   for (j = 0; j < *n; j++)
>     for (i = 0; i < *n; i++) {
>         if( !R_FINITE(fj[(*n)*j + i]) )
>            error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>            rjac[(*ldr)*j + i] = fj[(*n)*j + i];
>      }
>
[...]

isn't this even easier to read (and correct?):

     for (j = 0; j < n*; j++)
          for (i = 0; i < n*; i++){
               if ( !R_FINITE(hessian[i][j]) ) error("blah...")
          }

? In .C land, that is. (And sure, I'm afraid of ±Inf in this context.)

Thanks again, Göran

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Berend Hasselman

> On 10 Feb 2017, at 14:53, Göran Broström <[hidden email]> wrote:
>
> Thanks to all who answered my third question. I learned something, but:
>
> On 2017-02-09 17:44, Martin Maechler wrote:
>>
>>>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote:
>>>>
>>>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
>>>>
>>>>   F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
>>>>   if (*info == 0){
>>>>       F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
>>>>       ........
>>>>
>>>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
>>>>
>>>> I googled for C and nan and found a work-around: Change 'if ...' to
>>>>
>>>>  if (*info == 0 & (hessian[0][0] == hessian[0][0])){
>>>>
>>>> which works as a test of hessian[0][0] (not) being NaN.
>>>>
>>>> I'm using the .C interface for calling C code.
>>>>
>>>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?
>>
>>> You should/could use macro R_FINITE to test each entry of the hessian.
>>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:
>>
>>>    for (j = 0; j < *n; j++)
>>>        for (i = 0; i < *n; i++) {
>>>            if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>>>                error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>>>            rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>>>        }
>>
>> A minor hint  on that:  While REAL(.)  (or INTEGER(.) ...)  is really cheap in
>> the R sources themselves, that is not the case in package code.
>>
>> Hence, not only nicer to read but even faster is
>>
>>  double *fj = REAL(sexp_fjac);
>>  for (j = 0; j < *n; j++)
>>    for (i = 0; i < *n; i++) {
>>        if( !R_FINITE(fj[(*n)*j + i]) )
>>           error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>>           rjac[(*ldr)*j + i] = fj[(*n)*j + i];
>>     }
>>
> [...]
>
> isn't this even easier to read (and correct?):
>
>    for (j = 0; j < n*; j++)
>         for (i = 0; i < n*; i++){
>              if ( !R_FINITE(hessian[i][j]) ) error("blah...")
>         }
>
> ? In .C land, that is. (And sure, I'm afraid of ±Inf in this context.)
>

Only if you have lda and n equal (which you indeed have; but still worth mentioning) when calling dpoco.

Berend

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Ancient C /Fortran code linpack error

Göran Broström-3
Thanks Berend, I will make that change and submit to CRAN.

Best, Göran

On 2017-02-10 16:13, Berend Hasselman wrote:

>
>> On 10 Feb 2017, at 14:53, Göran Broström <[hidden email]> wrote:
>>
>> Thanks to all who answered my third question. I learned something, but:
>>
>> On 2017-02-09 17:44, Martin Maechler wrote:
>>>
>>>>> On 9 Feb 2017, at 16:00, Göran Broström <[hidden email]> wrote:
>>>>>
>>>>> In my package 'glmmML' I'm using old C code and linpack in the optimizing procedure. Specifically, one part of the code looks like this:
>>>>>
>>>>>   F77_CALL(dpoco)(*hessian, &bdim, &bdim, &rcond, work, info);
>>>>>   if (*info == 0){
>>>>>       F77_CALL(dpodi)(*hessian, &bdim, &bdim, det, &job);
>>>>>       ........
>>>>>
>>>>> This usually works OK, but with an ill-conditioned data set (from a user of glmmML) it happened that the hessian was all nan. However, dpoco returned *info = 0 (no error!) and then the call to dpodi hanged R!
>>>>>
>>>>> I googled for C and nan and found a work-around: Change 'if ...' to
>>>>>
>>>>>  if (*info == 0 & (hessian[0][0] == hessian[0][0])){
>>>>>
>>>>> which works as a test of hessian[0][0] (not) being NaN.
>>>>>
>>>>> I'm using the .C interface for calling C code.
>>>>>
>>>>> Any thoughts on how to best handle the situation? Is this a bug in dpoco? Is there a simple way to test for any NaNs in a vector?
>>>
>>>> You should/could use macro R_FINITE to test each entry of the hessian.
>>>> In package nleqslv I test for a "correct" jacobian like this in file nleqslv.c in function fcnjac:
>>>
>>>>    for (j = 0; j < *n; j++)
>>>>        for (i = 0; i < *n; i++) {
>>>>            if( !R_FINITE(REAL(sexp_fjac)[(*n)*j + i]) )
>>>>                error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>>>>            rjac[(*ldr)*j + i] = REAL(sexp_fjac)[(*n)*j + i];
>>>>        }
>>>
>>> A minor hint  on that:  While REAL(.)  (or INTEGER(.) ...)  is really cheap in
>>> the R sources themselves, that is not the case in package code.
>>>
>>> Hence, not only nicer to read but even faster is
>>>
>>>  double *fj = REAL(sexp_fjac);
>>>  for (j = 0; j < *n; j++)
>>>    for (i = 0; i < *n; i++) {
>>>        if( !R_FINITE(fj[(*n)*j + i]) )
>>>           error("non-finite value(s) returned by jacobian (row=%d,col=%d)",i+1,j+1);
>>>           rjac[(*ldr)*j + i] = fj[(*n)*j + i];
>>>     }
>>>
>> [...]
>>
>> isn't this even easier to read (and correct?):
>>
>>    for (j = 0; j < n*; j++)
>>         for (i = 0; i < n*; i++){
>>              if ( !R_FINITE(hessian[i][j]) ) error("blah...")
>>         }
>>
>> ? In .C land, that is. (And sure, I'm afraid of ±Inf in this context.)
>>
>
> Only if you have lda and n equal (which you indeed have; but still worth mentioning) when calling dpoco.
>
> Berend
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel