Numerical stability in chisq.test

Previous Topic Next Topic
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view

Numerical stability in chisq.test

The chisq.test contains following code:
        STATISTIC <- sum(sort((x - E)^2/E, decreasing = TRUE))

However, based on book Accuracy and stability of numerical algorithms <> Table 4.1 on page 89, it is better to sort the data in increasing order than in decreasing order, when the data are non-negative.

A demonstrative example:
x = matrix(c(rep(1.1, 10000)), 10^16, nrow = 10001, ncol = 1)    # We have a vector with 10000*1.1 and 1*10^16
c(sum(sort(x, decreasing = TRUE)), sum(sort(x, decreasing = FALSE)))
The result:
        10000000000010996 10000000000011000
When we sort the data in the increasing order, we get the correct result. If we sort the data in the decreasing order, we get a result that is off by 4.

Shouldn't the sort be in the increasing order rather than in the decreasing order?

Best regards,
 Jan Motl

PS: This post is based on discussion on <>.

        [[alternative HTML version deleted]]

[hidden email] mailing list -- To UNSUBSCRIBE and more, see
PLEASE do read the posting guide
and provide commented, minimal, self-contained, reproducible code.