The underlying least squares arithmetic of aov and lm is identical.
In R, both use a QR decomposition. The difference between the two is
the intent of the analysis and the default presentation of the results.
With lm [Linear Model], the focus is on the effect of the individual
columns of the predictor matrix. The columns are usually treated
as real-valued observations. The regression coefficients
are usually meaningful and interesting.
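
To make the lm side concrete, here is a minimal sketch on the built-in
cars data (the dataset is my choice for illustration, not part of the
original question). It fits a one-predictor model and then recovers the
same coefficients from an explicit QR decomposition of the predictor
matrix using qr() and qr.coef().

```r
# Built-in 'cars' data: stopping distance (dist) against speed.
# The dataset is an illustrative assumption, not from the thread.
fit <- lm(dist ~ speed, data = cars)
coef(fit)   # intercept and slope, each directly interpretable

# The same coefficients from an explicit QR decomposition of the
# predictor matrix.
X <- model.matrix(fit)
qr_coef <- qr.coef(qr(X), cars$dist)

# Compare the two sets of coefficients, ignoring names.
all.equal(unname(coef(fit)), unname(qr_coef))
```

The slope here reads directly as the change in stopping distance per
unit of speed, which is the kind of coefficient lm is meant to show.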
With aov [Analysis Of Variance], the focus is on the effects of
factors. These are multi-degree-of-freedom effects associated with
categorical variables. The arithmetic is based on a set of dummy
variables constructed from a contrast matrix. The individual
regression coefficients themselves are not easily interpretable.
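
A small sketch of the aov side, again on a built-in dataset that I picked
only for illustration (PlantGrowth, whose group factor has three levels).
It shows the contrast matrix, the dummy variables built from it, and the
single 2-degree-of-freedom line that aov reports for the factor. It
assumes R's default treatment contrasts.

```r
# Built-in 'PlantGrowth' data: 'group' is a factor with three levels.
# The dataset is an illustrative assumption, not from the thread.
contrasts(PlantGrowth$group)   # contrast matrix: 3 levels -> 2 columns

# Dummy variables built from that contrast matrix (plus an intercept).
X <- model.matrix(weight ~ group, data = PlantGrowth)
head(X)

# aov() reports 'group' as one effect with 2 degrees of freedom ...
fit_aov <- aov(weight ~ group, data = PlantGrowth)
summary(fit_aov)

# ... although the fit underneath has one coefficient per dummy column.
coef(fit_aov)
```

What each coefficient means depends on the contrast coding in use, which
is why the factor-level summary is usually the more useful view.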
You can pursue the details of this summary in any good statistical
Matthew Bridgman wrote:
> Why would someone use lm and ANOVA (anova(lm(x))) instead of AOV (or
> the other way around)?
> The mean squares and sum of squares are the same, but the F values
> and p-values are slightly different.
Crudely put, aov() is effectively useless on unbalanced designs. On the
other hand, it will allow you to handle models with multistratum error
I am somewhat at a loss as to how you manage to get the same MS and SS
but different F. Presumably, the denominator is different, but if you're
not messing with Error() terms, then I believe both aov() and
anova(lm()) would use the residual MS. An example might help.
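
For reference, a minimal sketch of that comparison on the built-in
warpbreaks data (my choice of dataset, not the original poster's), with
no Error() term. Both routes then use the residual mean square as the
denominator, so the F values and p-values should agree.

```r
# Same model fitted both ways; 'warpbreaks' is an illustrative
# assumption, not the original poster's data.
fit_aov <- aov(breaks ~ wool + tension, data = warpbreaks)
fit_lm  <- lm(breaks ~ wool + tension, data = warpbreaks)

tab_aov <- summary(fit_aov)[[1]]   # ANOVA table from aov()
tab_lm  <- anova(fit_lm)           # ANOVA table from lm()

# Compare the F statistics and p-values term by term
# (the Residuals row is NA in both tables).
cbind(F_aov = tab_aov[["F value"]], F_lm = tab_lm[["F value"]])
cbind(p_aov = tab_aov[["Pr(>F)"]],  p_lm = tab_lm[["Pr(>F)"]])
```

If your own data show different F values with matching SS and MS, running
the same two tables side by side should show where the denominators part
ways.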