This concerns whether p-values from lmer can be trusted. From

simulations, it seems that lmer can produce very small, and probably

spurious, p-values. I realize that lmer is not yet a finished product.

Is it likely that the problem will be fixed in a future release of the

lme4 package?

Using simulated data for a quite standard mixed-model anova (a balanced

two-way design; see code for the function SimMixed pasted below), I

compared the output of lmer, for three slightly different models, with

the output of aov. For an example where there is no fixed treatment

effect (null hypothesis is true), with 4 blocks, 2 treatments, and 40

observations per treatment-block combination, I find that lmer gives

more statistical significances than it should, whereas aov does not have

this problem. An example of output I generated by calling

> SimMixed(1000)

is the following:

Proportion significances at the 0.05 level

aov: 0.05

lmer.1: 0.148

lmer.2: 0.148

lmer.3: 0.151

Proportion significances at the 0.01 level

aov: 0.006

lmer.1: 0.076

lmer.2: 0.076

lmer.3: 0.077

Proportion significances at the 0.001 level

aov: 0.001

lmer.1: 0.047

lmer.2: 0.047

lmer.3: 0.047

which is based on 1000 simulations (and takes about 5 min on my PowerMac

G5). The different models fitted are:

fm.aov <- aov(y ~ Treat + Error(Block/Treat), data = dat)

fm.lmer.1 <- lmer(y ~ Treat + (Treat|Block), data = dat)

fm.lmer.2 <- lmer(y ~ Treat + (Treat-1|Block), data = dat)

fm.lmer.3 <- lmer(y ~ Treat + (1|Block) + (Treat-1|Block), data = dat)

It seems that, depending on the level of the test, lmer gives between a

factor of 3 to a factor of around 50 times too many significances. The

first two lmer models seem to give identical results, whereas the third

(which I think perhaps is the one that best represents the data

generated by the simulation) differs slightly. In running the

simulations, warnings like this are occasionally generated:

Warning message:

optim or nlminb returned message false convergence (8)

in: "LMEoptimize<-"(`*tmp*`, value = list(maxIter = 200, tolerance =

1.49011611938477e-08,

They seem to derive from the third of the lmer models. Perhaps there is

some numerical issue in the lmer function? From running SimMixed()

several times, I have noticed that large p-values (say, larger than 0.5)

agree very well between lmer and aov, but there seems to be a systematic

discrepancy for smaller p-values, where lmer gives smaller values than

aov. The F-values agree between all analyzes (except for fm.lmer.3 when

there is a warning), so there is a systematic difference between lmer

and aov in how a p-value is obtained from the F-value, which becomes

severe for small p-values.

My output from sessionInfo()

R version 2.2.1, 2005-12-20, powerpc-apple-darwin7.9.0

attached base packages:

[1] "methods" "stats" "graphics" "grDevices" "utils"

"datasets" "base"

other attached packages:

lme4 lattice Matrix

"0.98-1" "0.12-11" "0.99-3"

Pasted code for the SimMixed function (some lines might wrap):

# This function generates n.sims random data sets for a design with 4

# blocks, 2 treatments applied to each block, and 40 replicate

# observations for each block-treatment combination. There is no true

# fixed treatment effect, so a statistical significance of a test for

# a fixed treatment effect ought to occur with a probability equal to

# the nominal level of the test. Four tests are applied to each

# simulated data set: the classical aov and three versions of lmer,

# corresponding to different model formulations. The proportion of

# tests for a fixed treatment effect that become significant at the

# 0.05 0.01 and 0.001 levels are printed, as well as the p-values for

# the last of the simulations. In my runs, lmer gives significance

# more often than indicated by the nominal level, for each of the

# three models, whereas aov is OK. The package lme4 needs to be loaded

# to run the code.

SimMixed <- function(n.sims = 1) {

k <- 4 # number of blocks

n <- 40 # num obs per block X treatment combination

m1 <- 1.0 # fixed effect of level 1 of treatment

m2 <- m1 # fixed effect of level 2 of treatment

sd.block <- 0.5 # SD of block random effect

sd.block.trt <- 1.0 # SD of random effect for block X treatm

sd.res <- 0.1 # Residual SD

Block <- factor( rep(1:k, each=2*n) )

Treat <- factor( rep( rep(c("Tr1","Tr2"), k), each=n) )

m <- rep( rep(c(m1, m2), k), each=n) # fixed effects

# storage for p-values

p.aov <- rep(0, n.sims)

p.lmer.1 <- rep(0, n.sims)

p.lmer.2 <- rep(0, n.sims)

p.lmer.3 <- rep(0, n.sims)

for (i in 1:n.sims) {

# first get block and treatment random deviations

b <- rep( rep(rnorm(k, 0, sd.block), each=2) +

rnorm(2*k, 0, sd.block.trt), each=n )

# then get response

y <- m + b + rnorm(2*k*n, 0, sd.res)

dat <- data.frame(Block, Treat, y)

# perform the tests

fm.aov <- aov(y ~ Treat+Error(Block/Treat), data = dat)

fm.lmer.1 <- lmer(y ~ Treat+(Treat|Block), data = dat)

fm.lmer.2 <- lmer(y ~ Treat+(Treat-1|Block), data = dat)

fm.lmer.3 <- lmer(y ~ Treat+(1|Block)+(Treat-1|Block), data = dat)

# store the p-values

p.aov[i] <- summary(fm.aov)$"Error: Block:Treat"[[1]]$"Pr(>F)"[1]

p.lmer.1[i] <- anova(fm.lmer.1)[6]

p.lmer.2[i] <- anova(fm.lmer.2)[6]

p.lmer.3[i] <- anova(fm.lmer.3)[6]

}

cat("\nProportion significances at the 0.05 level \n")

cat("aov: ", sum(p.aov<0.05)/n.sims, "\n")

cat("lmer.1: ", sum(p.lmer.1<0.05)/n.sims, "\n")

cat("lmer.2: ", sum(p.lmer.2<0.05)/n.sims, "\n")

cat("lmer.3: ", sum(p.lmer.3<0.05)/n.sims, "\n")

cat("\nProportion significances at the 0.01 level \n")

cat("aov: ", sum(p.aov<0.01)/n.sims, "\n")

cat("lmer.1: ", sum(p.lmer.1<0.01)/n.sims, "\n")

cat("lmer.2: ", sum(p.lmer.2<0.01)/n.sims, "\n")

cat("lmer.3: ", sum(p.lmer.3<0.01)/n.sims, "\n")

cat("\nProportion significances at the 0.001 level \n")

cat("aov: ", sum(p.aov<0.001)/n.sims, "\n")

cat("lmer.1: ", sum(p.lmer.1<0.001)/n.sims, "\n")

cat("lmer.2: ", sum(p.lmer.2<0.001)/n.sims, "\n")

cat("lmer.3: ", sum(p.lmer.3<0.001)/n.sims, "\n")

cat("\nFinal aov analysis: \n")

print(summary(fm.aov)$"Error: Block:Treat")

cat("\nFinal lmer analysis 1: \n")

print(anova(fm.lmer.1))

cat("\nFinal lmer analysis 2: \n")

print(anova(fm.lmer.2))

cat("\nFinal lmer analysis 3: \n")

print(anova(fm.lmer.3))

}

--

Olof Leimar, Professor

Department of Zoology

Stockholm University

SE-106 91 Stockholm

Sweden

[hidden email]
http://www.zoologi.su.se/research/leimar/______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide!

http://www.R-project.org/posting-guide.html