Hello all:

I am confused about the output from an lm() model with an incomplete design/missing level.

I have two categorical predictors and a continuous covariate (day) that I am using to model larval mass (l.mass):

leaf.species has three levels - map, syc, and oak

cond.time has two levels - 30 and 150.

There are no response values for Map-150, so that entire two-way level is missing.

When running anova() on the model with Type I SS, the full factorial design does not return errors; however, using package:car Anova() and Type III SS, I receive a singularity error unless I use the argument 'singular.ok = T' (it defaults to F).

So, why don't I receive an error with anova() when I do with Anova(type = "III")? How do anova() and Anova() handle incomplete designs, and how can interactions of variables with missing levels be interpreted?

I realize these are fairly broad questions, but any insight would be helpful. Thanks, all.

Below is code to illustrate my question(s):

> lmMass <- lm(log(l.mass) ~ day*leaf.species + cond.time, data = growth.data) #lm() without cond.time interactions
> lmMassInt <- lm(log(l.mass) ~ day*leaf.species*cond.time, data = growth.data) #lm() with cond.time interactions
> anova(lmMass); anova(lmMassInt) #ANOVA summary of both models with Type I SS
Analysis of Variance Table

Response: log(l.mass)
                  Df  Sum Sq Mean Sq F value    Pr(>F)
day                1  51.373  51.373 75.7451 2.073e-15
leaf.species       2   0.340   0.170  0.2506    0.7786
cond.time          1   0.161   0.161  0.2369    0.6271
day:leaf.species   2   1.296   0.648  0.9551    0.3867
Residuals        179 121.404   0.678
Analysis of Variance Table

Response: log(l.mass)
                            Df  Sum Sq Mean Sq F value    Pr(>F)
day                          1  51.373  51.373 76.5651 1.693e-15
leaf.species                 2   0.340   0.170  0.2533   0.77654
cond.time                    1   0.161   0.161  0.2394   0.62523
day:leaf.species             2   1.296   0.648  0.9655   0.38281
day:cond.time                1   0.080   0.080  0.1198   0.72965
leaf.species:cond.time       1   1.318   1.318  1.9642   0.16282
day:leaf.species:cond.time   1   1.915   1.915  2.8539   0.09293
Residuals                  176 118.091   0.671
> Anova(lmMass, type = 'III'); Anova(lmMassInt, type = 'III') #ANOVA summary of both models with Type III SS
Anova Table (Type III tests)

Response: log(l.mass)
                  Sum Sq  Df F value   Pr(>F)
(Intercept)       39.789   1 58.6653 1.13e-12
day                3.278   1  4.8336  0.02919
leaf.species       0.934   2  0.6888  0.50352
cond.time          0.168   1  0.2472  0.61968
day:leaf.species   1.296   2  0.9551  0.38672
Residuals        121.404 179
Error in Anova.III.lm(mod, error, singular.ok = singular.ok, ...) :
  there are aliased coefficients in the model
> Anova(lmMassInt, type = 'III', singular.ok = T) #Given the error in Anova() above, set singular.ok = T
Anova Table (Type III tests)

Response: log(l.mass)
                            Sum Sq  Df F value    Pr(>F)
(Intercept)                 39.789   1 59.3004 9.402e-13
day                          3.278   1  4.8860   0.02837
leaf.species                 1.356   2  1.0103   0.36623
cond.time                    0.124   1  0.1843   0.66822
day:leaf.species             2.783   2  2.0738   0.12877
day:cond.time                0.805   1  1.1994   0.27493
leaf.species:cond.time       0.568   1  0.8462   0.35888
day:leaf.species:cond.time   1.915   1  2.8539   0.09293
Residuals                  118.091 176

--
Justin Montemarano
Graduate Student
Kent State University - Biological Sciences

http://www.montegraphia.com

[[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

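For readers following along, the empty Map-150 cell and the aliased coefficients it produces can be inspected directly in base R. The sketch below is only an illustration under stated assumptions: the original growth.data was not posted, so the data frame here is simulated with made-up values, and cond.time is assumed to be a factor.

```r
# Simulated stand-in for growth.data (hypothetical values; the real data were not posted)
set.seed(1)
growth.data <- expand.grid(
  leaf.species = c("map", "oak", "syc"),
  cond.time    = c("30", "150"),   # assumed to be a factor
  day          = c(5, 10, 15, 20),
  rep          = 1:4
)
# Drop the map-150 cell to mimic the incomplete design described in the post
growth.data <- subset(growth.data, !(leaf.species == "map" & cond.time == "150"))
growth.data$l.mass <- exp(0.05 * growth.data$day +
                          rnorm(nrow(growth.data), sd = 0.5))

# 1. Cross-tabulate the two factors: the empty cell shows up as a zero count
cell.counts <- xtabs(~ leaf.species + cond.time, data = growth.data)
print(cell.counts)

# 2. Fit the full factorial model; lm() reports inestimable (aliased) coefficients as NA
lmMassInt <- lm(log(l.mass) ~ day * leaf.species * cond.time, data = growth.data)
aliased <- names(coef(lmMassInt))[is.na(coef(lmMassInt))]
print(aliased)

# 3. alias() shows how the aliased columns depend on the retained ones
print(alias(lmMassInt))
```

With one of the six leaf.species by cond.time cells empty, two coefficients (one intercept-type and one day-slope-type interaction term) cannot be estimated, which is consistent with the single degree of freedom reported for leaf.species:cond.time and for day:leaf.species:cond.time in the tables above.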
Dear Justin,

anova() and Anova() are entirely different functions; the former is part of the standard R distribution and the second part of the car package. By default, Anova() produces an error for type-III tests conducted on rank-deficient models because the hypotheses tested aren't generally sensible.

From ?Anova:

"singular.ok
defaults to TRUE for type-II tests, and FALSE for type-III tests (where the tests for models with aliased coefficients will not be straightforwardly interpretable); if FALSE, a model with aliased coefficients produces an error."

and

"The designations "type-II" and "type-III" are borrowed from SAS, but the definitions used here do not correspond precisely to those employed by SAS. Type-II tests are calculated according to the principle of marginality, testing each term after all others, except ignoring the term's higher-order relatives; so-called type-III tests violate marginality, testing each term in the model after all of the others. This definition of Type-II tests corresponds to the tests produced by SAS for analysis-of-variance models, where all of the predictors are factors, but not more generally (i.e., when there are quantitative predictors). Be very careful in formulating the model for type-III tests, or the hypotheses tested will not make sense."

I hope this helps,
John

------------------------------------------------
John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Fri, 15 Jun 2012 15:01:27 -0400
Justin Montemarano <[hidden email]> wrote:
> [...]

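A small sketch may help show why anova() returns a table without complaint: lm() sets aside the aliased columns when it fits a rank-deficient model, and the sequential table is built from the estimable columns only. The data and the names a, b, and y below are hypothetical, and the car calls are run only if that package happens to be installed.

```r
# Hypothetical two-factor data with one empty cell (a1, b2)
set.seed(2)
d <- expand.grid(a = c("a1", "a2", "a3"), b = c("b1", "b2"), rep = 1:6)
d <- subset(d, !(a == "a1" & b == "b2"))
d$y <- rnorm(nrow(d))

fit <- lm(y ~ a * b, data = d)

# Six coefficients are requested, but only five are estimable; one is NA
n.columns  <- length(coef(fit))
model.rank <- fit$rank
print(coef(fit))

# anova() builds its sequential (Type I) table from the estimable columns,
# so a:b gets 1 Df rather than (3 - 1) * (2 - 1) = 2
tab <- anova(fit)
print(tab)
interaction.df <- tab["a:b", "Df"]

# Optional: compare with car::Anova() if car is available
if (requireNamespace("car", quietly = TRUE)) {
  # Type II: singular.ok defaults to TRUE according to ?Anova
  print(try(car::Anova(fit, type = "II"), silent = TRUE))
  # Type III: singular.ok defaults to FALSE, so an error is expected here
  res <- try(car::Anova(fit, type = "III"), silent = TRUE)
  print(inherits(res, "try-error"))
}
```

The reduced degrees of freedom in the anova() table are the only visible hint that the design is incomplete, so it is worth checking coef() for NA values before interpreting any of the tests.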
Thanks for your response, John. That was helpful.

I was using Type III from Anova() as a comparison to some results I had obtained in JMP (I have since lost access to JMP and moved on to R), and I was confused by the error. Given that I do have a continuous covariate, the analyses are not likely comparable, considering your response.

I am still confused about the interpretation of interactions within an anova() with an incomplete design, as mine is. Is the interaction term still informative?

--
Justin Montemarano
Graduate Student
Kent State University - Biological Sciences

http://www.montegraphia.com

On Sat, Jun 16, 2012 at 9:20 PM, John Fox <[hidden email]> wrote:
> [...]

Dear Justin,

On Mon, 18 Jun 2012 11:24:33 -0400
Justin Montemarano <[hidden email]> wrote:
> Thanks for your response, John. That was helpful.
>
> I was using Type III from Anova() as a comparison to some results I had
> obtained JMP, which I've lost access to and have moved on to R, and I was
> confused by the error. Given that I do have a continuous covariate, the
> analyses are not likely comparable, considering your response.

If you look more carefully, you'll see that the reference here was to type-II tests. I believe that the definition used by Anova() is more sensible.

> I am still confused about interpretation of interactions within an anova()
> with an incomplete design, as mine is. Is the interaction term still
> informative?

I don't think that these matters are easily discussed on an email list. Briefly, I'd argue that the type-II tests (as defined by Anova) still have a straightforward interpretation, since the test for a (say, two-way) interaction represents a contrast to a model that's additive with respect to the predictors involved in the interaction.

Best,
John

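The point about an interaction test being a contrast with an additive model can be sketched with an explicit model comparison in base R. This is an illustration only, using hypothetical simulated data with one empty cell; the names a, b, y, fit.add, and fit.int are not from the thread.

```r
# Hypothetical two-factor data with one empty cell (a1, b2)
set.seed(3)
d <- expand.grid(a = c("a1", "a2", "a3"), b = c("b1", "b2"), rep = 1:6)
d <- subset(d, !(a == "a1" & b == "b2"))
d$y <- rnorm(nrow(d))

fit.add <- lm(y ~ a + b, data = d)   # additive in a and b
fit.int <- lm(y ~ a * b, data = d)   # adds the a:b interaction

# F test of the interaction as a comparison of the two nested models
cmp <- anova(fit.add, fit.int)
print(cmp)
comparison.df <- cmp$Df[2]

# Because a:b is the last term in fit.int, the same F statistic appears
# in the a:b row of the sequential table for that model
f.compare    <- cmp$F[2]
f.sequential <- anova(fit.int)["a:b", "F value"]
print(c(f.compare, f.sequential))
```

With the empty cell, the comparison has a single degree of freedom: it asks whether the one estimable interaction contrast improves on the additive fit, and it says nothing about the cell that was never observed.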
