). Using a test of

> Recently, I came across a strange and potentially troublesome behaviour of

> the lm and aov functions that ask questions about calculation accuracy. Let

> us consider the 2 following datasets dat1 & dat2 :

>

> > (dat1 <- data.frame(Y=c(1:3, 10+1:3), F=c(rep("A",3), rep("B",3))))

> Y F

> 1 1 A

> 2 2 A

> 3 3 A

> 4 11 B

> 5 12 B

> 6 13 B

> > (dat2 <- data.frame(Y=c(10+1:3, 1:3), F=c(rep("A",3), rep("B",3))))

> Y F

> 1 11 A

> 2 12 A

> 3 13 A

> 4 1 B

> 5 2 B

> 6 3 B

>

> They only differ in the order of values that were exchanged between

> samples A and B. Thus the sd is 1 for each sample in either data sets, and

> the absolute mean difference |A-B| is 10 in both datasets.

> Now, let us perform an anova to compare samples A and B in both datasets

> (of course, in such simple case, a bilateral T test would do the job, but

> an anova is nevertheless allowed and should give the same probability than

> Student's test):

>

> > (anova1 <- anova(lm(Y~F, dat1)))

> Analysis of Variance Table

>

> Response: Y

> Df Sum Sq Mean Sq F value Pr(>F)

> F 1 150 150 150 0.0002552 ***

> Residuals 4 4 1

>

> > (anova2 <- anova(lm(Y~F, dat2)))

> Analysis of Variance Table

>

> Response: Y

> Df Sum Sq Mean Sq F value Pr(>F)

> F 1 150 150 150 0.0002552 ***

> Residuals 4 4 1

>

> As expected, both datasets give a same anova table, but this is only

> apparent. Indeed :

>

> > anova1$F[1] == anova2$F[1]

> [1] FALSE

> > anova1$F[1] - anova2$F[1]

> [1] 5.684342e-14

>

> In fact the F values differ slightly, and this holds also for the aov

> function. I checked also (not shown) that both the residual and factorial

> sums of squares differ between dat1 and dat2. Thus, for some undocumented

> reason (at least for the end user), the F values depend on the order of

> data!

> While such tiny differences (e-14 in this example) are devoid of

> consequences on the risk evaluation by Fisher's distribution, they may have

> huge consequences on the risk evaluation by the permutation method. Indeed,

> the shift from continuous to discrete distributions is far from being

> insignificant.

>

> For instance, the following code in R is at the basis of many permutation

> algorithms found in the internet and in teaching because it seems quite

> straightforward (see for example

>

http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Permutation%20Anova/PermTestsAnova.html>

http://www.usna.edu/Users/math/jager/courses/sm439/labs/Lab7.pdf>

>

http://biostatistics1014.blogspot.fr/2013/04/one-way-anova-permutation-test-and.html>

http://adn.biol.umontreal.ca/~numericalecology/Rcode/):>

> > nperm <- 1000 # number of permutations

> > Fperm <- replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) #

> calculates nperm F values

> > (prob <- (sum(anova1$F[1] <= Fperm) + 1)/(nperm +1)) # risk calculation

> [1] 0.04695305

>

> Of course, because of the sample function, repeating this code gives

> different prob values, but they remain always close to 5% instead of the

> exact probability of 10%. Indeed, there are only choose(6,3) = 20 possible

> permutations, but because they are symmetric, they give only 10 absolute

> mean differences. Thus, the only exact probabilities are 10%, 20% ... 100%.

> In our example where samples do not overlap, 10% is obviously the right

> answer.

>

> Thus, the use of lm and aov functions in permutation methods does not seem

> a good idea as it results in biases that underestimate the exact risk. In

> the simple case of one-way anova, it is quite simple to remedy this

> problem. As the total Sums of squares and the degrees of freedom do not

> change with permutations, it is easier and much faster to compare the

> residual sums of squares. For instance, the exact probabilities can be

> calculated that way :

>

> > combi <- combn(6, 3, FUN=function(i) {append(dat1$Y[i], dat1$Y[-i])}) #

> all permutations

> > SCEResid <- apply(combi, 2, FUN=function(x) sum(tapply(x, dat1$F,

> function(x) sum((x - mean(x))^2)))) # all resi SS

> > (prob <- mean(SCEResid <= SCEResid[1])) # risk calculation

> [1] 0.1

>

> 10% is indeed the exact risk.

>

> Finally, my problem is : How can we know if R libraries that use

> randomization procedures are not biased ? In the basic case of one way

> anova, it seems easy to submit the above example (by the way, the defunct

> lmPerm library does not succeed ...), but how can we check more complex

> anova models ?

>

>

>

>

> [[alternative HTML version deleted]]

>

> ______________________________________________

>

[hidden email] mailing list

>

https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide

>

http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code.

>

Joshua F. Wiley

Ph.D. Student, UCLA Department of Psychology

Senior Analyst, Elkhart Group Ltd.