Computing means, variances and sums


Computing means, variances and sums

Brian Ripley
There has been a recent thread on R-help on this, which resurrected
concepts from bug reports PR#1228 and PR#6743.  Since the discussion has
included a lot of erroneous 'information' based on misunderstandings of
floating-point computations, this is an attempt to set the record straight
and explain the solutions adopted.

The problem was that var(rep(0.02, 10)) was observed to be (on some
machines) about 1e-35.  This can easily be explained.

0.02 is not exactly representable as a binary fraction.  Repeatedly adding the
represented quantity makes increasing rounding errors relative to the
exact computation, so (on Sparc Solaris)

> var(rep(0.02, 10))
[1] 1.337451e-35
> options(digits=18)
> sum(rep(0.02, 10))
[1] 0.19999999999999998
> sum(rep(0.02, 10)) -0.2
[1] -2.775557561562891e-17
> sum(rep(0.02, 10))/10 -0.02
[1] -3.469446951953614e-18
> z <- sum(rep(0.02, 10))/10 -0.02
> 10*z^2/9
[1] 1.3374513502689138e-35

and so the non-zero variance is arising from (x[i] - mean) being non-zero.
(I did check that was what was happening at C level.)

There has been talk of other ways to arrange these computations, for
example Kahan summation and Welford's algorithm (see Chan & Lewis, 1979,
CACM 22, 526-531 and references therein).  However, R already used the
two-pass algorithm which is the most accurate (in terms of error bounds)
in that reference.
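
For concreteness, the two-pass computation amounts to the following, a
minimal R sketch that ignores NA handling and the accumulation issues
discussed below:

     two_pass_var <- function(x) {
         n <- length(x)
         m <- sum(x)/n                # first pass: the mean
         sum((x - m)^2)/(n - 1)       # second pass: centred sum of squares
     }

On a strict-double platform such as Sparc this reproduces the
1.337451e-35 above for rep(0.02, 10).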

Why are most people seeing 0?  Because the way computation is done in
modern FPUs is not the computation analysed in early numerical analysis
papers, including in Chan & Lewis.  First, all FPUs that I am aware of
allow the use of guard digits, effectively doing intermediate computations
to one more bit than required.  And many use extended precision registers
for computations which they can keep in FP registers, thereby keeping
another 10 or more bits.  (This includes R on most OSes on ix86 CPUs, the
exception being on Windows where the FPU has been reset by some other
software. Typically it is not the case for RISC CPUs, e.g. Sparc.)

The use of extended precision registers invalidates the textbook
comparisons of accuracy in at least two ways.  First, the error analysis
is different if all intermediate results are stored in extended precision.
Second, the simpler the algorithm, the more intermediate results
will remain in extended precision.  This means that (for example) Kahan
summation is usually less accurate than direct summation on real-world
FPUs.
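
For reference, the compensated (Kahan) summation mentioned above can be
written out in R as follows.  This is a sketch of the algorithm only: in
interpreted R each intermediate is stored back into a 64-bit double, so
it illustrates the bookkeeping rather than the register effects just
described.

     kahan_sum <- function(x) {
         s <- 0; comp <- 0            # running sum and running compensation
         for (xi in x) {
             y <- xi - comp           # apply the carried correction
             t <- s + y               # low-order bits of y may be lost here
             comp <- (t - s) - y      # recover what was lost
             s <- t
         }
         s
     }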

There are at least two steps that can be taken to improve accuracy.
One is to ensure that intermediate results are stored in extended
precision.  Every R platform of which I am aware has a 'long double' type
which can be used.  On ix86 this is the extended precision type used
internally in the FPU and so the cost is zero or close to zero, whereas on
a Sparc the extra precision is greater but comes at some cost.  The second
step is to use iterative refinement, so the final part of mean.default
currently is

     ## sum(int) can overflow, so convert here.
     if(is.integer(x)) x <- as.numeric(x)
     ## use one round of iterative refinement
     res <- sum(x)/n
     if(is.finite(res)) res + sum(x-res)/n else res

This is a well-known technique in numerical linear algebra, and improves
the accuracy whilst doubling the cost.  (This is about to be replaced by
an internal function to allow the intermediate result to be stored in a
long double.)
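
As a small illustration of what the refinement buys (a sketch; on
systems that keep the intermediates in extended-precision registers the
first estimate may already be exact):

     x <- rep(0.02, 10); n <- length(x)
     res <- sum(x)/n          # on Sparc this is below 0.02 by about 3.5e-18 (see above)
     res + sum(x - res)/n     # one refinement step: typically the correctly rounded mean

With the refined mean the (x[i] - mean) terms, and hence the variance of
this example, come out as exactly zero.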

Note the is.finite(res) there.  R works with the extended IEC 60559 (aka
IEEE754) quantities Inf, -Inf and NaN (NA being a special NaN).  The
rearranged computations do not work correctly for those quantities.  So
although they can be more 'efficient' (in terms of flops), they have to be
supplemented by some other calculation to ensure that the specials are
handled correctly.  Remember once again that we get both speed and
accuracy advantages by keeping computations in FPU registers, so
complicating the code has considerable cost.
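
A small example of why the guard is needed:

     x <- c(1e308, 1e308)     # the sum overflows to Inf
     res <- sum(x)/2          # Inf, which is the appropriate answer here
     res + sum(x - res)/2     # NaN, as x - res is -Inf and Inf + (-Inf) is NaN

so without the is.finite() test the refinement would turn a legitimate
Inf into NaN.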

R-devel will shortly use long doubles for critical intermediate results
and iterative refinement for calculations of means.  This may be slower,
but it would take an extreme case for the speed difference to be
detectable.  Higher accuracy has a cost too: there are several packages
(dprep and mclust are two) whose results are critically dependent on fine
details of computations and will for example infinite-loop if an optimized
BLAS is used on AMD64.


The choice of algorithms in R is an eclectic mixture of accuracy and
speed.  When (some of) R-core decided to make use of a BLAS for e.g.
matrix products this produced a large speed increase for those with
optimized BLAS (and a small speed decrease for others), but it did result
in lower accuracy and problems with NAs etc (and the alternative
algorithms have since been added back to cover such cases).  But it seems
that nowadays few R users understand the notion of rounding error, and it
is easier to make the computations maximally accurate than to keep
explaining basic numerical analysis on R-help.  (The problem is thereby
just pushed farther down the line, because for example linear regression
is not maximally accurate.)

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Computing means, variances and sums

Spencer Graves
Dear Professors Ripley & Murdoch, & others:

          If this were just an issue of computations like var(rep(0.02, 10))
producing something other than 0 on certain platforms (e.g.,
combinations of operating systems and microprocessors), then I would
suggest it be documented as an FAQ and left as a tool to help explain
finite precision arithmetic and rounding issues to people who don't yet
understand those concepts.

          However, Duncan's comment shows that it is more than that, namely
that certain DLLs change the precision of the fpu (floating point unit?)
from 64- to 53-bit mantissas AND DON'T RESET THEM.  When this is not
detected, it could make the difference between a usable answer and
nonsense in poorly conditioned applications where those extra bits might be
important.  For me, the problem is NOT that one occasionally gets
nonsense from a poorly conditioned computation.  Rather it is that the
SAME computation could give a useful answer in one case and nonsense a
few seconds later ON THE SAME COMPUTER, operating system, etc.

          To test my understanding, I simplified Barry Zajdik's example further:

 > var(rep(.2, 3))
[1] 0
 > RSiteSearch("convert to binary")
A search query has been submitted to http://search.r-project.org
The results page should open in your browser shortly
 > var(rep(.2, 3))
[1] 1.155558e-33
 > sessionInfo()
R version 2.2.1, 2005-12-20, i386-pc-mingw32

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"

          This indicates there is a problem that perhaps should eventually be
fixed.  However, please do NOT spend time on this because I suggested it
was an issue.  The conditions under which this would create problems for
anyone are still so rare that I would not want to alter anyone's work
priorities for it.

          spencer graves
p.s.  If my computations are correct, 0.2 = 0/2 + 0/4 + 1/8 + 1/16 +
0/32 + 0/64 + 1/128 + 1/256 + 0/512 + 0/1024 + 1/2048 + 1/4096 + ... =
0.333333... in hexadecimal.  Perhaps someone can extend this to an FAQ
to help explain finite precision arithmetic and rounding issues.
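
For such an FAQ, here is one way to generate those digits in R (a sketch
only; it operates on the nearest double to 0.2, which agrees with 1/5 in
its leading bits):

     x <- 0.2
     bits <- character(20)
     for (i in 1:20) {
         x <- 2 * x           # shift the next binary digit into the units place
         bits[i] <- if (x >= 1) { x <- x - 1; "1" } else "0"
     }
     paste(bits, collapse = "")       # "00110011001100110011"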
       


Re: Computing means, variances and sums

hadley wickham
> p.s.  If my computations are correct, 0.2 = 0/2 + 0/4 + 1/8 + 1/16 +
> 0/32 + 0/64 + 1/128 + 1/256 + 0/512 + 0/1024 + 1/2048 + 1/4096 + ... =
> 0.333333... in hexadecimal.  Perhaps someone can extend this to an FAQ
> to help explain finite precision arithmetic and rounding issues.

This is drifting a bit off topic, but the other day I discovered this
rather nice illustration of the perils of finite precision arithmetic
while creating a contrast matrix:

> n <- 13
> a <- matrix(-1/n, ncol=n, nrow=n) + diag(n)
> rowSums(a)
 [1]  2.775558e-16  2.775558e-16  5.551115e-17  5.551115e-17  5.551115e-17
 [6]  5.551115e-17  0.000000e+00 -5.551115e-17  0.000000e+00  5.551115e-17
[11]  1.110223e-16  1.665335e-16  2.220446e-16

Not only do most of the rows not sum to 0, they do not even sum to the
same number!  It is hard to remember that the familiar rules of arithmetic
do not always apply.
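
One practical moral, as a sketch: compare against zero with a tolerance
rather than testing for exact equality:

     n <- 13
     a <- matrix(-1/n, ncol = n, nrow = n) + diag(n)
     isTRUE(all.equal(rowSums(a), rep(0, n)))     # TRUE: zero up to a numerical tolerance
     all(rowSums(a) == 0)                         # may well be FALSE, as above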

Hadley


Re: Computing means, variances and sums

Brian Ripley
In reply to this post by Spencer Graves
As I have said before on R-help, it is _already_ fixed in this and
related cases, and the reasons it is fixed were explained here.  If you
read carefully, you will have noticed that some platforms (such as Sparc)
only use 53 bits (+ a guard bit).

Not that I accept that the 53-bit calculation is `nonsense', or anything
close to it.


--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Computing means, variances and sums

Brian Ripley
In reply to this post by hadley wickham

I think you will find this example does give all 0's in R-devel, even
on platforms like Sparc.  But users do need to remember that computer
arithmetic is inexact except in rather narrowly delimited cases.

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Computing means, variances and sums

Duncan Murdoch
On 2/19/2006 3:18 PM, Prof Brian Ripley wrote:

> I think you will find this example does give all 0's in R-devel, even
> on platforms like Sparc.  

Only until the fpu precision gets changed:

 > n <- 13
 > a <- matrix(-1/n, ncol=n, nrow=n) + diag(n)
 > rowSums(a)
  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
 > RSiteSearch('junk')
A search query has been submitted to http://search.r-project.org
The results page should open in your browser shortly

 > n <- 13
 > a <- matrix(-1/n, ncol=n, nrow=n) + diag(n)
 > rowSums(a)
  [1]  2.775558e-16  2.775558e-16  5.551115e-17  5.551115e-17  5.551115e-17
  [6]  5.551115e-17  0.000000e+00 -5.551115e-17  0.000000e+00  5.551115e-17
[11]  1.110223e-16  1.665335e-16  2.220446e-16

We still need to protect against these changes.  I'll put something
together, unless you're already working on it.

The approach I'm thinking of is to define a macro to be called in risky
situations.  On platforms where this isn't an issue, the macro would be
null; on Windows, it would reset the fpu to full precision.

For example, RSiteSearch causes damage in the ShellExecute call in
do_shellexec called from browseURL, so I'd add protection there.  I
think we should also add detection code somewhere in the evaluation loop
to help in diagnosing these problems.
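
A crude R-level illustration of the kind of check I mean, purely as a
sketch based on the rowSums example above (the real detection code would
be C in the evaluation loop, and the function name here is made up):

     ## hypothetical: TRUE while extended-precision intermediates are in
     ## effect on ix86 (with current R-devel); FALSE after something has
     ## reset the FPU to 53-bit precision
     fpu_looks_ok <- function(n = 13) {
         a <- matrix(-1/n, ncol = n, nrow = n) + diag(n)
         all(rowSums(a) == 0)
     }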

> But users do need to remember that computer
> arithmetic is inexact except in rather narrowly delimited cases.

Yes, that too.

Duncan Murdoch


Re: Computing means, variances and sums

Brian Ripley
I've managed to track this down.  The setting of the FPU control word on a
ix86 machine changes the precision of (gcc) long double calculations in
the FPU, as well as those of double.  So if it gets changed to PC_53, long
doubles lose accuracy even though 10 bytes are carried around.

For some discussion of extended-precision issues, see Priest's annex to
Goldberg's paper at http://www.validlab.com/goldberg/paper.pdf.

Many OSes other than Windows on ix86 do consider the FPU control
word when doing context switching.  There is (often heated) debate as to
what the correct default should be; see e.g. the thread beginning

http://gcc.gnu.org/ml/gcc/1998-12/msg00473.html

Whereas Linux has selected extended precision, apparently FreeBSD selected
a 53-bit mantissa, and Solaris x86 used the equivalent of -ffloat-store to
ensure stricter compliance with the IEEE754 'double' model (and, cynics say,
to make the Sparc Solaris performance look good).

It will be interesting to see what MacIntel has selected (I suspect the
FreeBSD solution).

This does mean that using long doubles for accumulators is not necessarily
always a win, although the potential loss is likely to be small.



--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: Computing means, variances and sums

Duncan Murdoch
On 2/22/2006 3:52 AM, Prof Brian Ripley wrote:

> Many OSes other than Windows on ix86 do consider the FPU control
> word when doing context switching.  There is (often heated) debate as to
> what the correct default should be; see e.g. the thread beginning

I think Windows also preserves the FPU control word across context
switches.  The problem is that it doesn't do a context switch when you
make a call into the system.  In particular, the shell services (file
dialogs, URL recognition, etc.) involve a large number of DLLs, all
running within the same context.  MSVC++ normally chooses 53 bit
precision as the default, and will set that when a DLL starts up, or
whenever a function in it asks to reset the FPU.

Duncan Murdoch



Re: Computing means, variances and sums

Brian Ripley
On Wed, 22 Feb 2006, Duncan Murdoch wrote:

> I think Windows also preserves the FPU control word across context switches.
> The problem is that it doesn't do a context switch when you make a call into
> the system.  In particular, the shell services (file dialogs, URL
> recognition, etc.) involve a large number of DLLs, all running within the
> same context.  MSVC++ normally chooses 53 bit precision as the default, and
> will set that when a DLL starts up, or whenever a function in it asks to
> reset the FPU.

Yes, I think you are correct. In that case the difference is that it is
considered legitimate for a shared resource (a DLL) to change the control
word in the context that initializes it.

Since, as I understand it, Win64 does not have a working controlfp nor the
ability to change precision in its native mode (from
http://msdn2.microsoft.com/en-us/library/e9b52ceh.aspx), maybe one day this
will go away.

Brian Ripley

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
