Hello,
I’m using zero-inflated Poisson regression via the zeroinfl() function to analyze data on seed set of plants, but for some reason I don’t seem to be getting output for all three levels of my two categorical predictors.

More about my data and model:
My response variable is the number of viable seeds (AVInt), and my two categorical predictors are elevation (Elev) and treatment (Treatment). Elev has three levels: 01-Low, 02-Mid, and 03-High; Treatment also has three levels: B, F, and O.

Because the response variable (AVInt) is a zero-inflated count, I’m using zeroinfl() from the pscl package as an alternative to factorial ANOVA (I’ve also tried the zero-inflated negative binomial). This is early in my data analysis, but I will likely incorporate additional categorical and continuous predictors later.

Running the model, I have:

> zipclay=zeroinfl(AVInt ~ Elev + Treatment)
> summary(zipclay)

Call:
zeroinfl(formula = AVInt ~ Elev + Treatment)

Pearson residuals:
    Min      1Q  Median      3Q     Max
-1.1958 -0.5612 -0.3764  0.2704  5.5130

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.2035     0.3435  -0.592  0.55368
Elev02-Mid    0.3937     0.1806   2.180  0.02923 *
Elev03-High   0.1635     0.1792   0.912  0.36159
TreatmentF    1.0026     0.3305   3.033  0.00242 **
TreatmentO    0.5915     0.3293   1.796  0.07244 .

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.6086     0.5080   3.167  0.00154 **
Elev02-Mid   -0.3813     0.4345  -0.878  0.38020
Elev03-High  -0.9512     0.4532  -2.099  0.03584 *
TreatmentF   -0.9774     0.4690  -2.084  0.03718 *
TreatmentO   -3.0242     0.6561  -4.609 4.05e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 16
Log-likelihood: -363.2 on 10 Df

So my question is, where did my "Elev01-Low" and "TreatmentB" go? Why aren't they appearing in the output table?

Any insight would be greatly appreciated!

- Jason
On Sun, 4 Mar 2012, j.straka wrote:
> So my question is, where did my "Elev01-Low" and "TreatmentB" go?? Why
> aren't they appearing in the output table?

Both factors are coded with treatment contrasts, and hence the main effect of the first category is constrained to zero to make the model identifiable. This is the same as in a 2-way ANOVA. Compare with:

  summary(lm(AVInt ~ Elev + Treatment))

where the intercept corresponds to the mean for Elev01-Low/TreatmentB. The regressors in the zero-inflated models are set up in exactly the same way as in such a 2-way ANOVA.

hth,
Z

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
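To make the point about treatment contrasts concrete, here is a minimal sketch in base R. The data frame `d` is a made-up toy grid (an assumption for illustration, not the poster's data); only the factor names and level labels come from the thread. It assumes R's default contrast setting for unordered factors, and shows that the first level of each factor gets no column of its own in the design matrix, which is why only five coefficients appear per model part (ten in total, matching "on 10 Df").

```r
# Toy data (hypothetical): one row per combination of the two factors.
# Only the factor names and level labels are taken from the thread.
d <- expand.grid(
  Elev      = factor(c("01-Low", "02-Mid", "03-High")),
  Treatment = factor(c("B", "F", "O"))
)

# The first level of each factor is the reference level.
ref_levels <- c(Elev = levels(d$Elev)[1], Treatment = levels(d$Treatment)[1])

# Treatment contrasts: the reference level's row is all zeros,
# so it gets no dummy column of its own.
elev_contrasts <- contrasts(d$Elev)

# Design matrix built from the right-hand side of AVInt ~ Elev + Treatment.
X <- model.matrix(~ Elev + Treatment, data = d)
design_columns <- colnames(X)

print(ref_levels)
print(elev_contrasts)
print(design_columns)
```

The column names match the coefficient names in the summary output, and the row of `X` for Elev01-Low/TreatmentB has a 1 in the intercept column and 0 everywhere else.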
In reply to this post by j.straka
Hi
I am far from an expert in this field, but I assume the answer is that they are included in the intercept term. It has something to do with the contrasts specification, but its usage is beyond my modelling knowledge. You could see what ?contrasts tells you.

Regards
Petr
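As a small illustration of what ?contrasts and the related ?relevel describe, the sketch below changes which level acts as the reference. The vector `Treatment` here is a made-up example (an assumption, not the poster's data), and the name `Treatment_refO` is purely illustrative.

```r
# Hypothetical treatment factor with the three levels named in the thread.
Treatment <- factor(c("B", "F", "O", "B", "F", "O"))

# Default: alphabetical level order, so "B" is the reference level.
default_cols <- colnames(model.matrix(~ Treatment))

# Make "O" the reference level instead; "B" now gets its own column.
Treatment_refO <- relevel(Treatment, ref = "O")
releveled_cols <- colnames(model.matrix(~ Treatment_refO))

print(default_cols)
print(releveled_cols)
```

Whichever level is chosen as the reference, one level per factor is always absorbed into the intercept; releveling only changes which comparisons the reported coefficients express.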
Thank you Achim and Petr for your quick replies. My apologies, but I'm quite new to statistical modeling (and the language surrounding it) as well. I'm not sure exactly what was meant by "making the model identifiable" but I think I understand the purpose...
Yes, I think this means that the factor levels I was looking for are represented by the intercept term (their own effects are set to zero under treatment contrasts). I think the rest of my questions are "statistical" questions about how to interpret the model, so I will have to do a bit more reading!

Cheers,
- Jason
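For the interpretation question, a rough worked example using the coefficients reported in the first post may help. It assumes the usual zero-inflated Poisson parameterization, in which the zero-inflation part models the probability of an excess zero and the overall expected count is (1 - p) * mu. The helper `cell_summary()` is hypothetical, written only for this illustration.

```r
# Coefficients copied from the summary(zipclay) output in the first post.
count_coef <- c("(Intercept)" = -0.2035, "Elev02-Mid" = 0.3937,
                "Elev03-High" = 0.1635, "TreatmentF" = 1.0026,
                "TreatmentO" = 0.5915)
zero_coef  <- c("(Intercept)" = 1.6086, "Elev02-Mid" = -0.3813,
                "Elev03-High" = -0.9512, "TreatmentF" = -0.9774,
                "TreatmentO" = -3.0242)

# Hypothetical helper: model-implied quantities for one Elev/Treatment cell.
# Reference levels contribute nothing beyond the intercept.
cell_summary <- function(elev, treatment) {
  terms <- c("(Intercept)", paste0("Elev", elev), paste0("Treatment", treatment))
  terms <- intersect(terms, names(count_coef))  # drops reference-level terms
  mu <- exp(sum(count_coef[terms]))    # Poisson mean (log link)
  p0 <- plogis(sum(zero_coef[terms]))  # probability of an excess zero (logit link)
  c(poisson_mean = mu, excess_zero_prob = p0, overall_mean = (1 - p0) * mu)
}

low_B <- cell_summary("01-Low", "B")  # the "missing" reference cell
low_F <- cell_summary("01-Low", "F")
print(rbind(low_B, low_F))
```

The reference cell Elev01-Low/TreatmentB is described entirely by the two intercepts, and each remaining coefficient is a difference from that cell on the link scale. For example, exp(1.0026) is the ratio of the Poisson mean under treatment F to that under treatment B at the same elevation.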