Does anyone know of any good resources on specifying
anova models in R with aov? In particular, I am
interested in the details and functioning of the
Error() structure. I could not find anything in the
documentation and help(Error) bounced me into the
aov() help pages.
See the reference on ?aov, and MASS (the book, see the FAQ).
I think you need to understand the underlying theory first, and that is no
longer (even in my time) part of a statistical education. I learnt it
from Bill Venables, who was educated in the 1960s -- so his account in MASS
comes with at least one satisfied client.
On Tue, 12 Aug 2008, Brett Magill wrote:
> Hello all,
> Does anyone know of any good resources on specifying
> anova models in R with aov? In particular, I am
> interested in the details and functioning of the
> Error() structure. I could not find anything in the
> documentation and help(Error) bounced me into the
> aov() help pages.
> Thank you.
Prof Brian Ripley wrote:
> See the reference on ?aov, and MASS (the book, see the FAQ).
> I think you need to understand the underlying theory first, and that
> is no longer (even in my time) part of a statistical education. I
> learnt it from Bill Venables, who was educated in the 1960s -- so his
> account in MASS comes with at least one satisfied client.
Hmm, I'm younger than Brian and I did study this extensively, based on
the description in the Genstat manual (1977) and Tue Tjur's lecture
notes (later developed into his 1984 paper in Int.Statist.Rev 52, pp.
The way I prefer to think about it is the following. It works only when
the error model is completely balanced and factorial, but there are
hardly any other models that are interpretable.
Assume for the sake of discussion a complete two-way layout (A*B) within
Subject. A relevant model could be y ~ A*B + Error(Subj/(A*B))
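For concreteness, here is a minimal sketch of fitting that model. The data are simulated, and the factor names, level counts, and number of subjects are assumptions made purely for illustration: 6 subjects, each observed exactly once in every A x B cell.

```r
# Hypothetical balanced data: 6 subjects, each observed once in every
# A x B cell (names and sizes are assumptions, not from the thread).
set.seed(1)
d <- expand.grid(Subj = factor(1:6),
                 A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")))
subj_effect <- rnorm(nlevels(d$Subj))   # one random shift per subject
d$y <- subj_effect[as.integer(d$Subj)] + rnorm(nrow(d))

# Two-way within-subject model with the Error() structure discussed here
fit <- aov(y ~ A * B + Error(Subj / (A * B)), data = d)
print(names(fit))    # the error strata that aov() set up
print(summary(fit))  # one ANOVA table per stratum
```

With an Error() term, aov() returns a list of fits, one per stratum, rather than a single fit.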
Start by expanding the Error() terms into simple interactions, i.e.
Subj/(A*B) = Subj + Subj:A + Subj:B + Subj:A:B. Each term defines a
table containing a (constant) number of observations in each cell, and
the error model is that there is a variance component that is common to
observations within the same cell, but has independent contributions to
different cells.
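The expansion can be checked in R itself. The sketch below asks terms() for the simple terms of the Error() formula, then tabulates a hypothetical balanced layout (6 subjects, 2 levels of A, 3 levels of B, all assumed for illustration) to confirm that each term defines a table with a constant count per cell.

```r
# Expand the nested formula into its simple interaction terms
err_terms <- attr(terms(~ Subj / (A * B)), "term.labels")
print(err_terms)

# Hypothetical balanced layout: one observation per Subj x A x B cell
d <- expand.grid(Subj = factor(1:6),
                 A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")))

# One table per error term
tabs <- list("Subj"     = table(d$Subj),
             "Subj:A"   = table(d$Subj, d$A),
             "Subj:B"   = table(d$Subj, d$B),
             "Subj:A:B" = table(d$Subj, d$A, d$B))

# Distinct cell counts within each table; a single value per table
# means the count is constant, as the error model requires
counts <- sapply(tabs, function(tab) unique(as.vector(tab)))
print(counts)
```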
This error model defines a decomposition of data into "error strata"
which correspond to certain contrasts of means: Variation of subject
means around the grand mean, variation of within-subject "A" means
around the subject mean. Ditto for the "B" means, and finally the
residual, alias the within-subject interaction contrasts.
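In a balanced layout this decomposition can be written down directly with cell means. The following is only a sketch on simulated data (design sizes and names are assumed): it builds the four components from means and checks that their sums of squares add up to the total around the grand mean.

```r
# Hypothetical balanced data, one observation per Subj x A x B cell
set.seed(2)
d <- expand.grid(Subj = factor(1:6),
                 A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")))
d$y <- rnorm(nlevels(d$Subj))[as.integer(d$Subj)] + rnorm(nrow(d))

grand  <- mean(d$y)
m_subj <- ave(d$y, d$Subj)         # subject means
m_sa   <- ave(d$y, d$Subj, d$A)    # within-subject A means
m_sb   <- ave(d$y, d$Subj, d$B)    # within-subject B means

# One component per error stratum
parts <- list(
  "Subj"     = m_subj - grand,                # subject means around grand mean
  "Subj:A"   = m_sa - m_subj,                 # A means around the subject mean
  "Subj:B"   = m_sb - m_subj,                 # B means around the subject mean
  "Subj:A:B" = d$y - m_sa - m_sb + m_subj     # within-subject interaction
)
ss <- sapply(parts, function(p) sum(p^2))
print(ss)

# The components are orthogonal here, so the sums of squares add up
print(c(total = sum((d$y - grand)^2), sum_of_strata = sum(ss)))
```

The additivity relies on the balance of the design; with unequal cell counts the components built this way are no longer orthogonal.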
There are now two crucial points: (1) You can treat each component as if
it had been based on independent data with a different variance for each
stratum, and (2) in "nice" (orthogonal) designs it turns out that the
systematic terms distribute into error strata, so that significance of A
is evaluated in the Subj:A stratum, etc.
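Point (2) can be illustrated by recomputing the F statistic for A by hand from the Subj:A stratum and comparing it with what aov() reports. Again a sketch on simulated balanced data; the design and all variable names are assumptions.

```r
# Hypothetical balanced data, one observation per Subj x A x B cell
set.seed(3)
d <- expand.grid(Subj = factor(1:6),
                 A = factor(c("a1", "a2")),
                 B = factor(c("b1", "b2", "b3")))
d$y <- rnorm(nlevels(d$Subj))[as.integer(d$Subj)] + rnorm(nrow(d))

fit <- aov(y ~ A * B + Error(Subj / (A * B)), data = d)
tab_A <- summary(fit)[["Error: Subj:A"]][[1]]  # ANOVA table of the Subj:A stratum
print(tab_A)

# The same test by hand: A against the remaining Subj:A variation
grand      <- mean(d$y)
m_subj     <- ave(d$y, d$Subj)
ss_stratum <- sum((ave(d$y, d$Subj, d$A) - m_subj)^2)  # all of the Subj:A stratum
ss_A       <- sum((ave(d$y, d$A) - grand)^2)           # systematic A effect
df_A       <- nlevels(d$A) - 1
df_stratum <- nlevels(d$Subj) * df_A
F_A <- (ss_A / df_A) / ((ss_stratum - ss_A) / (df_stratum - df_A))
print(F_A)
```

In the same way, B would be tested in the Subj:B stratum and A:B in the Subj:A:B stratum.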
(As you see, this easily gets long-winded to explain, and I even glossed
over a number of rather important details.)
-- 
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907