Non identical numerical results from R code vs C/C++ code?

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Non identical numerical results from R code vs C/C++ code?

Renaud Gaujoux
Hi,

suppose you have two versions of the same algorithm: one in pure R, the
other one in C/C++ called via .Call().
Assuming there is no bug in the implementations (i.e. they both do the
same thing), is there any well known reason why the C/C++ implementation
could return numerical results non identical to the one obtained from
the pure R code? (e.g. could it be rounding errors? please explain.)
Has anybody had a similar experience?

By not identical, I mean very small differences (< 2.4 e-14), but enough
to have identical() returning FALSE. Maybe I should not bother, but I
want to be sure where the differences come from, at least by mere curiosity.

Briefly the R code perform multiple matrix product; the C code is an
optimization of those specific products via custom for loops, where
entries are not computed in the same order, etc... which improves both
memory usage and speed. The result is theoretically the same.

Thank you,
Renaud

--
Renaud Gaujoux
Computational Biology - University of Cape Town
South Africa


 

###
UNIVERSITY OF CAPE TOWN

This e-mail is subject to the UCT ICT policies and e-mail disclaimer published on our website at http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27 21 650 4500. This e-mail is intended only for the person(s) to whom it is addressed. If the e-mail has reached you in error, please notify the author. If you are not the intended recipient of the e-mail you may not use, disclose, copy, redirect or print the content. If this e-mail is not related to the business of UCT it is sent by the sender in the sender's individual capacity.

###

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Duncan Murdoch-2
On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:
> Hi,
>
> suppose you have two versions of the same algorithm: one in pure R, the
> other one in C/C++ called via .Call().
> Assuming there is no bug in the implementations (i.e. they both do the
> same thing), is there any well known reason why the C/C++ implementation
> could return numerical results non identical to the one obtained from
> the pure R code? (e.g. could it be rounding errors? please explain.)
> Has anybody had a similar experience?

R often uses extended reals (80 bit floating point values on Intel
chips) for intermediate values.  C compilers may or may not do that.
>
> By not identical, I mean very small differences (< 2.4 e-14), but enough
> to have identical() returning FALSE. Maybe I should not bother, but I
> want to be sure where the differences come from, at least by mere curiosity.
>
> Briefly the R code perform multiple matrix product; the C code is an
> optimization of those specific products via custom for loops, where
> entries are not computed in the same order, etc... which improves both
> memory usage and speed. The result is theoretically the same.

Changing the order of operations will often affect rounding.  For
example, suppose epsilon is the smallest number such that 1 + epsilon is
not equal to 1.  Then 1 + (epsilon/2) + (epsilon/2) will evaluate to
either 1 or 1 + epsilon, depending on the order of computing the additions.

Duncan Murdoch

>
> Thank you,
> Renaud
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Renaud Gaujoux
Thank you Duncan for your reply.

Currently I am using 'double' for the computations.
What type should I use for extended real in my intermediate computations?
The result will still be 'double' anyway right?



On 10/09/2010 13:00, Duncan Murdoch wrote:

> On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:
>> Hi,
>>
>> suppose you have two versions of the same algorithm: one in pure R,
>> the other one in C/C++ called via .Call().
>> Assuming there is no bug in the implementations (i.e. they both do
>> the same thing), is there any well known reason why the C/C++
>> implementation could return numerical results non identical to the
>> one obtained from the pure R code? (e.g. could it be rounding errors?
>> please explain.)
>> Has anybody had a similar experience?
>
> R often uses extended reals (80 bit floating point values on Intel
> chips) for intermediate values.  C compilers may or may not do that.
>>
>> By not identical, I mean very small differences (< 2.4 e-14), but
>> enough to have identical() returning FALSE. Maybe I should not
>> bother, but I want to be sure where the differences come from, at
>> least by mere curiosity.
>>
>> Briefly the R code perform multiple matrix product; the C code is an
>> optimization of those specific products via custom for loops, where
>> entries are not computed in the same order, etc... which improves
>> both memory usage and speed. The result is theoretically the same.
>
> Changing the order of operations will often affect rounding.  For
> example, suppose epsilon is the smallest number such that 1 + epsilon
> is not equal to 1.  Then 1 + (epsilon/2) + (epsilon/2) will evaluate
> to either 1 or 1 + epsilon, depending on the order of computing the
> additions.
>
> Duncan Murdoch
>
>>
>> Thank you,
>> Renaud
>>
>

 

###
UNIVERSITY OF CAPE TOWN

This e-mail is subject to the UCT ICT policies and e-mail disclaimer published on our website at http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27 21 650 4500. This e-mail is intended only for the person(s) to whom it is addressed. If the e-mail has reached you in error, please notify the author. If you are not the intended recipient of the e-mail you may not use, disclose, copy, redirect or print the content. If this e-mail is not related to the business of UCT it is sent by the sender in the sender's individual capacity.

###

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Duncan Murdoch-2
On 10/09/2010 7:07 AM, Renaud Gaujoux wrote:
> Thank you Duncan for your reply.
>
> Currently I am using 'double' for the computations.
> What type should I use for extended real in my intermediate computations?

I think it depends on compiler details.  On some compilers "long double"
will get it, but I don't think there's a standard type that works on all
compilers.  (In fact, the 80 bit type on Intel chips isn't necessarily
supported on other hardware.)  R defines LDOUBLE in its header files and
it is probably best to use that if you want to duplicate R results.

> The result will still be 'double' anyway right?

Yes, you do need to return type double.

Duncan Murdoch

>
>
>
> On 10/09/2010 13:00, Duncan Murdoch wrote:
>> On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:
>>> Hi,
>>>
>>> suppose you have two versions of the same algorithm: one in pure R,
>>> the other one in C/C++ called via .Call().
>>> Assuming there is no bug in the implementations (i.e. they both do
>>> the same thing), is there any well known reason why the C/C++
>>> implementation could return numerical results non identical to the
>>> one obtained from the pure R code? (e.g. could it be rounding errors?
>>> please explain.)
>>> Has anybody had a similar experience?
>> R often uses extended reals (80 bit floating point values on Intel
>> chips) for intermediate values.  C compilers may or may not do that.
>>> By not identical, I mean very small differences (< 2.4 e-14), but
>>> enough to have identical() returning FALSE. Maybe I should not
>>> bother, but I want to be sure where the differences come from, at
>>> least by mere curiosity.
>>>
>>> Briefly the R code perform multiple matrix product; the C code is an
>>> optimization of those specific products via custom for loops, where
>>> entries are not computed in the same order, etc... which improves
>>> both memory usage and speed. The result is theoretically the same.
>> Changing the order of operations will often affect rounding.  For
>> example, suppose epsilon is the smallest number such that 1 + epsilon
>> is not equal to 1.  Then 1 + (epsilon/2) + (epsilon/2) will evaluate
>> to either 1 or 1 + epsilon, depending on the order of computing the
>> additions.
>>
>> Duncan Murdoch
>>
>>> Thank you,
>>> Renaud
>>>
>
>  
>
> ###
> UNIVERSITY OF CAPE TOWN
>
> This e-mail is subject to the UCT ICT policies and e-mail disclaimer published on our website at http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27 21 650 4500. This e-mail is intended only for the person(s) to whom it is addressed. If the e-mail has reached you in error, please notify the author. If you are not the intended recipient of the e-mail you may not use, disclose, copy, redirect or print the content. If this e-mail is not related to the business of UCT it is sent by the sender in the sender's individual capacity.
>
> ###
>  
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

barry rowlingson
In reply to this post by Renaud Gaujoux
On Fri, Sep 10, 2010 at 11:46 AM, Renaud Gaujoux
<[hidden email]> wrote:

> Hi,
>
> suppose you have two versions of the same algorithm: one in pure R, the
> other one in C/C++ called via .Call().
> Assuming there is no bug in the implementations (i.e. they both do the same
> thing), is there any well known reason why the C/C++ implementation could
> return numerical results non identical to the one obtained from the pure R
> code? (e.g. could it be rounding errors? please explain.)
> Has anybody had a similar experience?
>
> By not identical, I mean very small differences (< 2.4 e-14), but enough to
> have identical() returning FALSE. Maybe I should not bother, but I want to
> be sure where the differences come from, at least by mere curiosity.
>
> Briefly the R code perform multiple matrix product; the C code is an
> optimization of those specific products via custom for loops, where entries
> are not computed in the same order, etc... which improves both memory usage
> and speed. The result is theoretically the same.

 I've had almost exactly this situation recently with an algorithm I
first implemented in R and then in C. Guess what the problem was? Yes,
a bug in the C code.

 At first I thought everything was okay because the returned values
were close-ish, and I thought 'oh, rounding error', but I wasn't happy
about that. So I dug a bit deeper. This is worth doing even if you are
sure there no bugs in the C code (remember: there's always one more
bug).

 First, construct a minimal dataset so you can demonstrate the problem
with a manageable size of matrix. I went down to 7 data points. Then
get the C to print out its inputs. Identical, and as expected? Good.

 Now debug intermediate calculations, printing or saving from C and
checking they are the same as the intermediate calculations from R. If
possible, try returning intermediate calculations in C and checking
identical() with R intermediates.

 Eventually you will find out where the first diverge - and this is
the important bit. It's no use just knowing the final answers come out
different, it's likely your answer has a sensitive dependence on
initial conditions. You need to track down when the first bits are
different.

 Or it could be rounding error...

Barry

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Renaud Gaujoux
Ok.

I quickly tried using LDOUBLE wherever I could, but it did not changed
the results. I might try harder...
I agree with you Barry, and I will re-double re-check my code.

Thank you both for your help.
Bests,
Renaud

On 10/09/2010 13:24, Barry Rowlingson wrote:

> On Fri, Sep 10, 2010 at 11:46 AM, Renaud Gaujoux
> <[hidden email]>  wrote:
>    
>> Hi,
>>
>> suppose you have two versions of the same algorithm: one in pure R, the
>> other one in C/C++ called via .Call().
>> Assuming there is no bug in the implementations (i.e. they both do the same
>> thing), is there any well known reason why the C/C++ implementation could
>> return numerical results non identical to the one obtained from the pure R
>> code? (e.g. could it be rounding errors? please explain.)
>> Has anybody had a similar experience?
>>
>> By not identical, I mean very small differences (<  2.4 e-14), but enough to
>> have identical() returning FALSE. Maybe I should not bother, but I want to
>> be sure where the differences come from, at least by mere curiosity.
>>
>> Briefly the R code perform multiple matrix product; the C code is an
>> optimization of those specific products via custom for loops, where entries
>> are not computed in the same order, etc... which improves both memory usage
>> and speed. The result is theoretically the same.
>>      
>   I've had almost exactly this situation recently with an algorithm I
> first implemented in R and then in C. Guess what the problem was? Yes,
> a bug in the C code.
>
>   At first I thought everything was okay because the returned values
> were close-ish, and I thought 'oh, rounding error', but I wasn't happy
> about that. So I dug a bit deeper. This is worth doing even if you are
> sure there no bugs in the C code (remember: there's always one more
> bug).
>
>   First, construct a minimal dataset so you can demonstrate the problem
> with a manageable size of matrix. I went down to 7 data points. Then
> get the C to print out its inputs. Identical, and as expected? Good.
>
>   Now debug intermediate calculations, printing or saving from C and
> checking they are the same as the intermediate calculations from R. If
> possible, try returning intermediate calculations in C and checking
> identical() with R intermediates.
>
>   Eventually you will find out where the first diverge - and this is
> the important bit. It's no use just knowing the final answers come out
> different, it's likely your answer has a sensitive dependence on
> initial conditions. You need to track down when the first bits are
> different.
>
>   Or it could be rounding error...
>
> Barry
>    

 

###
UNIVERSITY OF CAPE TOWN

This e-mail is subject to the UCT ICT policies and e-mail disclaimer published on our website at http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27 21 650 4500. This e-mail is intended only for the person(s) to whom it is addressed. If the e-mail has reached you in error, please notify the author. If you are not the intended recipient of the e-mail you may not use, disclose, copy, redirect or print the content. If this e-mail is not related to the business of UCT it is sent by the sender in the sender's individual capacity.

###

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Paul Gilbert
In reply to this post by Renaud Gaujoux
With fortran I have always managed to be able to get identical results
on the same computer with the same OS. You will have trouble if you
switch OS or hardware, or try the same hardware and OS with different
math libraries. All the real calculations need to be double, even
intermediate variables. Also, I've had trouble with arrays not being
initialized to double 0.0. If you initialize to single 0.0 the
straggling bits can cause differences. You also need to be careful about
conversion from integer to real, to do double conversion. I'm not sure
about C, but I would guess there are some of the same problems.

Paul

>-----Original Message-----
>From: [hidden email] [mailto:r-devel-bounces@r-
>project.org] On Behalf Of Renaud Gaujoux
>Sent: September 10, 2010 6:47 AM
>To: [hidden email]
>Subject: [Rd] Non identical numerical results from R code vs C/C++
code?
>
>Hi,
>
>suppose you have two versions of the same algorithm: one in pure R, the
>other one in C/C++ called via .Call().
>Assuming there is no bug in the implementations (i.e. they both do the
>same thing), is there any well known reason why the C/C++
implementation
>could return numerical results non identical to the one obtained from
>the pure R code? (e.g. could it be rounding errors? please explain.)
>Has anybody had a similar experience?
>
>By not identical, I mean very small differences (< 2.4 e-14), but
enough

>to have identical() returning FALSE. Maybe I should not bother, but I
>want to be sure where the differences come from, at least by mere
>curiosity.
>
>Briefly the R code perform multiple matrix product; the C code is an
>optimization of those specific products via custom for loops, where
>entries are not computed in the same order, etc... which improves both
>memory usage and speed. The result is theoretically the same.
>
>Thank you,
>Renaud
>
>--
>Renaud Gaujoux
>Computational Biology - University of Cape Town
>South Africa
>
>
>
>
>###
>UNIVERSITY OF CAPE TOWN
>
>This e-mail is subject to the UCT ICT policies and e-mail disclaimer
>published on our website at
>http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from
>+27 21 650 4500. This e-mail is intended only for the person(s) to whom
>it is addressed. If the e-mail has reached you in error, please notify
>the author. If you are not the intended recipient of the e-mail you may
>not use, disclose, copy, redirect or print the content. If this e-mail
>is not related to the business of UCT it is sent by the sender in the
>sender's individual capacity.
>
>###
>
>______________________________________________
>[hidden email] mailing list
>https://stat.ethz.ch/mailman/listinfo/r-devel
====================================================================================

La version française suit le texte anglais.

------------------------------------------------------------------------------------

This email may contain privileged and/or confidential information, and the Bank of
Canada does not waive any related rights. Any distribution, use, or copying of this
email or the information it contains by other than the intended recipient is
unauthorized. If you received this email in error please delete it immediately from
your system and notify the sender promptly by email that you have done so.

------------------------------------------------------------------------------------

Le présent courriel peut contenir de l'information privilégiée ou confidentielle.
La Banque du Canada ne renonce pas aux droits qui s'y rapportent. Toute diffusion,
utilisation ou copie de ce courriel ou des renseignements qu'il contient par une
personne autre que le ou les destinataires désignés est interdite. Si vous recevez
ce courriel par erreur, veuillez le supprimer immédiatement et envoyer sans délai à
l'expéditeur un message électronique pour l'aviser que vous avez éliminé de votre
ordinateur toute copie du courriel reçu.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Renaud Gaujoux
Thanks Paul for the hints.
After some tests, reducing portion of my code, I found that simply doing
a naive computation of 'crossprod' in C does NOT give exactly the same
results as calling the Fortran underlying routine (dsyrk) as used in the
R source code.
I will try the double 0.0 to see if it makes a difference.

What do you mean by

"You also need to be careful about
conversion from integer to real, to do double conversion."


?
Where are the trap in this type of conversion?
Thanks.

Renaud

On 10/09/2010 16:07, Paul Gilbert wrote:

> With fortran I have always managed to be able to get identical results
> on the same computer with the same OS. You will have trouble if you
> switch OS or hardware, or try the same hardware and OS with different
> math libraries. All the real calculations need to be double, even
> intermediate variables. Also, I've had trouble with arrays not being
> initialized to double 0.0. If you initialize to single 0.0 the
> straggling bits can cause differences. You also need to be careful about
> conversion from integer to real, to do double conversion. I'm not sure
> about C, but I would guess there are some of the same problems.
>
> Paul
>
>    
>> -----Original Message-----
>> From: [hidden email] [mailto:r-devel-bounces@r-
>> project.org] On Behalf Of Renaud Gaujoux
>> Sent: September 10, 2010 6:47 AM
>> To: [hidden email]
>> Subject: [Rd] Non identical numerical results from R code vs C/C++
>>      
> code?
>    
>> Hi,
>>
>> suppose you have two versions of the same algorithm: one in pure R, the
>> other one in C/C++ called via .Call().
>> Assuming there is no bug in the implementations (i.e. they both do the
>> same thing), is there any well known reason why the C/C++
>>      
> implementation
>    
>> could return numerical results non identical to the one obtained from
>> the pure R code? (e.g. could it be rounding errors? please explain.)
>> Has anybody had a similar experience?
>>
>> By not identical, I mean very small differences (<  2.4 e-14), but
>>      
> enough
>    
>> to have identical() returning FALSE. Maybe I should not bother, but I
>> want to be sure where the differences come from, at least by mere
>> curiosity.
>>
>> Briefly the R code perform multiple matrix product; the C code is an
>> optimization of those specific products via custom for loops, where
>> entries are not computed in the same order, etc... which improves both
>> memory usage and speed. The result is theoretically the same.
>>
>> Thank you,
>> Renaud
>>
>> --
>> Renaud Gaujoux
>> Computational Biology - University of Cape Town
>> South Africa
>>
>>
>>
>>
>> ###
>> UNIVERSITY OF CAPE TOWN
>>
>> This e-mail is subject to the UCT ICT policies and e-mail disclaimer
>> published on our website at
>> http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from
>> +27 21 650 4500. This e-mail is intended only for the person(s) to whom
>> it is addressed. If the e-mail has reached you in error, please notify
>> the author. If you are not the intended recipient of the e-mail you may
>> not use, disclose, copy, redirect or print the content. If this e-mail
>> is not related to the business of UCT it is sent by the sender in the
>> sender's individual capacity.
>>
>> ###
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>      
> ====================================================================================
>
> La version française suit le texte anglais.
>
> ------------------------------------------------------------------------------------
>
> This email may contain privileged and/or confidential ...{{dropped:27}}

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Ted Byers
In reply to this post by Paul Gilbert
You will also get differences if you change optimization settings; even
though the hardware and OS and development tools are the same.  The issue
there involves rounding error, particularly on intermediate results, and
propagation of that error (depending on the nature of the calculations after
the rounding has occurred).  Unless you're compiling the instance of R
you're comparing your C/C++ results against, so you can be certain both are
using the same compiler optimization settings, I would be very surprised it
you were able to obtain results identical to the least significant bit.  If
you're using a different algorithm than what the R code uses, it becomes
almost impossible to get results identical to the last significant bit
(unless both use very unusual algorithms that are not susceptible to
rounding error in intermediate results).

For initialization, if your code makes assumptions about how
variables/arrays are initialized, you have a number of options to ensure
that they're initialized to whatever you want, both in C and in C++.

My practice, for high performance number crunching, has been to be diligent
in creating unit tests and then integration tests that exploit known
mathematical properties of the math I am using (e.g. the eigensystem
computed from a matrix has a suite of mathematical properties that can be
used to test the correctness of the implementation that computes the
eigensystem of a matrix of any particular type, so one can construct tests
based on those properties and a driver that applies the test on a suitably
large number of randomly constructed matrices with specified properties).
But, given the fact that different algorithms for computing the same thing
are guaranteed to have slightly different results because of rounding
errors, I almost never test for strict equality but instead test that
differences are all less than some tolerance defined prior to the tests (and
which depend on the precision in which the calculations are done).  This
does not necessarily carry a cost WRT performance if one is careful in the
design of the classes used and makes appropriate use of inline functions
(and, if you're up to it, template metaprogramming).

HTH

Ted

On Fri, Sep 10, 2010 at 10:07 AM, Paul Gilbert <
[hidden email]> wrote:

> With fortran I have always managed to be able to get identical results
> on the same computer with the same OS. You will have trouble if you
> switch OS or hardware, or try the same hardware and OS with different
> math libraries. All the real calculations need to be double, even
> intermediate variables. Also, I've had trouble with arrays not being
> initialized to double 0.0. If you initialize to single 0.0 the
> straggling bits can cause differences. You also need to be careful about
> conversion from integer to real, to do double conversion. I'm not sure
> about C, but I would guess there are some of the same problems.
>
> Paul
>
> >-----Original Message-----
> >From: [hidden email] [mailto:r-devel-bounces@r-
> >project.org] On Behalf Of Renaud Gaujoux
> >Sent: September 10, 2010 6:47 AM
> >To: [hidden email]
> >Subject: [Rd] Non identical numerical results from R code vs C/C++
> code?
> >
> >Hi,
> >
> >suppose you have two versions of the same algorithm: one in pure R, the
> >other one in C/C++ called via .Call().
> >Assuming there is no bug in the implementations (i.e. they both do the
> >same thing), is there any well known reason why the C/C++
> implementation
> >could return numerical results non identical to the one obtained from
> >the pure R code? (e.g. could it be rounding errors? please explain.)
> >Has anybody had a similar experience?
> >
> >By not identical, I mean very small differences (< 2.4 e-14), but
> enough
> >to have identical() returning FALSE. Maybe I should not bother, but I
> >want to be sure where the differences come from, at least by mere
> >curiosity.
> >
> >Briefly the R code perform multiple matrix product; the C code is an
> >optimization of those specific products via custom for loops, where
> >entries are not computed in the same order, etc... which improves both
> >memory usage and speed. The result is theoretically the same.
> >
> >Thank you,
> >Renaud
> >
> >--
> >Renaud Gaujoux
> >Computational Biology - University of Cape Town
> >South Africa
> >
> >
> >
> >
> >###
> >UNIVERSITY OF CAPE TOWN
> >
> >This e-mail is subject to the UCT ICT policies and e-mail disclaimer
> >published on our website at
> >http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from
> >+27 21 650 4500. This e-mail is intended only for the person(s) to whom
> >it is addressed. If the e-mail has reached you in error, please notify
> >the author. If you are not the intended recipient of the e-mail you may
> >not use, disclose, copy, redirect or print the content. If this e-mail
> >is not related to the business of UCT it is sent by the sender in the
> >sender's individual capacity.
> >
> >###
> >
> >______________________________________________
> >[hidden email] mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ====================================================================================
>
> La version française suit le texte anglais.
>
>
> ------------------------------------------------------------------------------------
>
> This email may contain privileged and/or confidential information, and the
> Bank of
> Canada does not waive any related rights. Any distribution, use, or copying
> of this
> email or the information it contains by other than the intended recipient
> is
> unauthorized. If you received this email in error please delete it
> immediately from
> your system and notify the sender promptly by email that you have done so.
>
>
> ------------------------------------------------------------------------------------
>
> Le présent courriel peut contenir de l'information privilégiée ou
> confidentielle.
> La Banque du Canada ne renonce pas aux droits qui s'y rapportent. Toute
> diffusion,
> utilisation ou copie de ce courriel ou des renseignements qu'il contient
> par une
> personne autre que le ou les destinataires désignés est interdite. Si vous
> recevez
> ce courriel par erreur, veuillez le supprimer immédiatement et envoyer sans
> délai à
> l'expéditeur un message électronique pour l'aviser que vous avez éliminé de
> votre
> ordinateur toute copie du courriel reçu.
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


--
R.E.(Ted) Byers, Ph.D.,Ed.D.
[hidden email]
CTO
Merchant Services Corp.
17665 Leslie st., unit 30
Newmarket , Ontario
L3Y 3E3

        [[alternative HTML version deleted]]


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Non identical numerical results from R code vs C/C++ code?

Thomas Lumley
In reply to this post by Duncan Murdoch-2
On Fri, 10 Sep 2010, Duncan Murdoch wrote:

> On 10/09/2010 7:07 AM, Renaud Gaujoux wrote:
>> Thank you Duncan for your reply.
>>
>> Currently I am using 'double' for the computations.
>> What type should I use for extended real in my intermediate computations?
>
> I think it depends on compiler details.  On some compilers "long double" will
> get it, but I don't think there's a standard type that works on all
> compilers.  (In fact, the 80 bit type on Intel chips isn't necessarily
> supported on other hardware.)  R defines LDOUBLE in its header files and it
> is probably best to use that if you want to duplicate R results.

As a little more detail, 'long double' is in the C99 standard and seems to be fairly widely implemented, so code using it is likely to compile.   The Standard, as usual, doesn't define exactly what type it is, and permits it to be a synonym for 'double', so you may not get any extra precision.

On Intel chips it is likely to be the 80-bit type, but the Sparc architecture doesn't have any larger hardware type.  Radford Neal has recently reported much slower results on Solaris with long double, consistent with Wikipedia's statement that long double is sometimes a software-implemented 128-bit type on these systems.


>> The result will still be 'double' anyway right?
>
> Yes, you do need to return type double.
>
> Duncan Murdoch
>
>>
>>
>>
>> On 10/09/2010 13:00, Duncan Murdoch wrote:
>>> On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:
>>>> Hi,
>>>>
>>>> suppose you have two versions of the same algorithm: one in pure R, the
>>>> other one in C/C++ called via .Call().
>>>> Assuming there is no bug in the implementations (i.e. they both do the
>>>> same thing), is there any well known reason why the C/C++ implementation
>>>> could return numerical results non identical to the one obtained from the
>>>> pure R code? (e.g. could it be rounding errors? please explain.)
>>>> Has anybody had a similar experience?
>>> R often uses extended reals (80 bit floating point values on Intel chips)
>>> for intermediate values.  C compilers may or may not do that.
>>>> By not identical, I mean very small differences (< 2.4 e-14), but enough
>>>> to have identical() returning FALSE. Maybe I should not bother, but I
>>>> want to be sure where the differences come from, at least by mere
>>>> curiosity.
>>>>
>>>> Briefly the R code perform multiple matrix product; the C code is an
>>>> optimization of those specific products via custom for loops, where
>>>> entries are not computed in the same order, etc... which improves both
>>>> memory usage and speed. The result is theoretically the same.
>>> Changing the order of operations will often affect rounding.  For example,
>>> suppose epsilon is the smallest number such that 1 + epsilon is not equal
>>> to 1.  Then 1 + (epsilon/2) + (epsilon/2) will evaluate to either 1 or 1 +
>>> epsilon, depending on the order of computing the additions.
>>>
>>> Duncan Murdoch
>>>
>>>> Thank you,
>>>> Renaud
>>>>
>>
>>  
>> ###
>> UNIVERSITY OF CAPE TOWN
>> This e-mail is subject to the UCT ICT policies and e-mail disclaimer
>> published on our website at
>> http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27
>> 21 650 4500. This e-mail is intended only for the person(s) to whom it is
>> addressed. If the e-mail has reached you in error, please notify the
>> author. If you are not the intended recipient of the e-mail you may not
>> use, disclose, copy, redirect or print the content. If this e-mail is not
>> related to the business of UCT it is sent by the sender in the sender's
>> individual capacity.
>>
>> ###
>>  
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel