Can't find all levels of categorical predictors in output of zeroinfl()

j.straka
Hello,
I’m using zero-inflated Poisson regression via the zeroinfl() function to analyze data on seed-set of plants, but for some reason, I don’t seem to be getting the output for all three levels of my two categorical predictors.

More about my data and model:
My response variable is the number of viable seeds (AVInt), and my two categorical predictors are elevation (Elev) and Treatment (Treatment).  Elev has three levels: 01-Low, 02-Mid, and 03-High; Treatment also has three possibilities: B, F, or O.

Because the response variable (AVInt) is zero-inflated and Poisson-distributed, I’m using zeroinfl() from the pscl package as an alternative to factorial ANOVA (I’ve also tried the zero-inflated negative binomial).  This is early in my data analysis, but I will likely incorporate additional categorical and continuous predictors at a later time.

This gives me the following model:

zipclay=zeroinfl(AVInt ~ Elev + Treatment)

So running the model, I have:
> zipclay=zeroinfl(AVInt ~ Elev + Treatment)
> summary(zipclay)

Call:
zeroinfl(formula = AVInt ~ Elev + Treatment)

Pearson residuals:
    Min      1Q  Median      3Q     Max
-1.1958 -0.5612 -0.3764  0.2704  5.5130

Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)  -0.2035     0.3435  -0.592  0.55368  
Elev02-Mid    0.3937     0.1806   2.180  0.02923 *
Elev03-High   0.1635     0.1792   0.912  0.36159  
TreatmentF    1.0026     0.3305   3.033  0.00242 **
TreatmentO    0.5915     0.3293   1.796  0.07244 .

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   1.6086     0.5080   3.167  0.00154 **
Elev02-Mid   -0.3813     0.4345  -0.878  0.38020    
Elev03-High  -0.9512     0.4532  -2.099  0.03584 *  
TreatmentF   -0.9774     0.4690  -2.084  0.03718 *  
TreatmentO   -3.0242     0.6561  -4.609 4.05e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 16
Log-likelihood: -363.2 on 10 Df

So my question is, where did my "Elev01-Low" and "TreatmentB" go??  Why aren't they appearing in the output table?

Any insight would be greatly appreciated!

- Jason

Re: Can't find all levels of categorical predictors in output of zeroinfl()

Achim Zeileis-4
On Sun, 4 Mar 2012, j.straka wrote:

> So my question is, where did my "Elev01-Low" and "TreatmentB" go??  Why
> aren't they appearing in the output table?

Both factors are coded with treatment contrasts and hence the main effect
of the first category is constrained to zero to make the model
identifiable. But this is the same as in a 2-way ANOVA. Compare with:

   summary(lm(AVInt ~ Elev + Treatment))

where the intercept corresponds to the mean for Elev01-Low/TreatmentB.

The regressors in the zero-inflated models are set up in exactly the same
way as in such a 2-way ANOVA.
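
To make the point about treatment contrasts concrete, here is a minimal sketch in base R. The data frame `d` is a hypothetical stand-in (the poster's data are not available), and the sketch assumes R's default contrast option (`contr.treatment` for unordered factors). It only builds the design matrix for the formula; it does not fit a zero-inflated model.

```r
# Hypothetical stand-in data with the same factor levels as in the post.
d <- data.frame(
  Elev      = factor(rep(c("01-Low", "02-Mid", "03-High"), each = 6)),
  Treatment = factor(rep(c("B", "F", "O"), times = 6))
)

# Under treatment contrasts the first level of each factor is the reference.
ref_elev  <- levels(d$Elev)[1]
ref_treat <- levels(d$Treatment)[1]
print(c(ref_elev, ref_treat))

# Design matrix for the right-hand side of the model formula.
# The reference levels get no column of their own; they are absorbed
# into the intercept column.
X <- model.matrix(~ Elev + Treatment, data = d)
print(colnames(X))
```

The column names match the coefficient names in the summary output above: there is a column for every level except the first one of each factor.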

hth,
Z

> View this message in context: http://r.789695.n4.nabble.com/Can-t-find-all-levels-of-categorical-predictors-in-output-of-zeroinfl-tp4444214p4444214.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Re: Can't find all levels of categorical predictors in output of zeroinfl()

PIKAL Petr
In reply to this post by j.straka
Hi

I am far from an expert in this field, but I assume the answer is that it is
included in the intercept term. It has something to do with the contrasts
specification, but its usage is beyond my modelling knowledge.

You could try to see what

?contrasts

tells you.
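
As a small illustration of what `contrasts()` reports, the following sketch uses a hypothetical three-level factor with the same labels as in the post and assumes R's default contrast settings. It also shows `relevel()`, one way to choose a different reference level if another baseline is easier to interpret.

```r
# Hypothetical factor with the same level labels as Elev in the post.
Elev <- factor(c("01-Low", "02-Mid", "03-High"))

# Rows are factor levels, columns are the dummy variables in the model.
# The reference level is the row that is zero in every column.
cm <- contrasts(Elev)
print(cm)

# Pick a different reference level; the model would then report
# coefficients for the remaining two levels instead.
Elev2 <- relevel(Elev, ref = "03-High")
print(levels(Elev2))
print(colnames(contrasts(Elev2)))
```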

Regards
Petr


Re: Can't find all levels of categorical predictors in output of zeroinfl()

j.straka
Thank you Achim and Petr for your quick replies.  My apologies, but I'm quite new to statistical modeling (and the language surrounding it) as well.   I'm not sure exactly what was meant by "making the model identifiable" but I think I understand the purpose...

Yes, I think this means that the factors I was looking for were in the intercept term (set to zero, for contrasts).  I think the rest of my questions are "statistical" questions to do with how to interpret the model, so I will have to do a bit more reading!
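
For readers following along, here is a rough sketch of how the reference levels show up when the coefficients are turned back into predictions. The numbers are copied from the summary output in the first post; the variable names are made up for this illustration, and the last step assumes the usual zero-inflated Poisson mean, (1 - pi) * mu.

```r
# Estimates copied from the summary(zipclay) output in the first post.
count_coef <- c("(Intercept)" = -0.2035, "Elev02-Mid" = 0.3937,
                "Elev03-High" = 0.1635, "TreatmentF" = 1.0026,
                "TreatmentO" = 0.5915)
zero_coef  <- c("(Intercept)" = 1.6086, "Elev02-Mid" = -0.3813,
                "Elev03-High" = -0.9512, "TreatmentF" = -0.9774,
                "TreatmentO" = -3.0242)

# Reference cell (Elev 01-Low, Treatment B): only the intercepts contribute.
mu_base <- exp(count_coef[["(Intercept)"]])    # Poisson mean, log link
pi_base <- plogis(zero_coef[["(Intercept)"]])  # excess-zero probability, logit link

# Elev 02-Mid with Treatment F: add both shifts to each intercept.
terms_midF <- c("(Intercept)", "Elev02-Mid", "TreatmentF")
mu_midF <- exp(sum(count_coef[terms_midF]))
pi_midF <- plogis(sum(zero_coef[terms_midF]))

# Expected seed count per cell under the zero-inflated Poisson.
expected <- c(baseline = (1 - pi_base) * mu_base,
              midF     = (1 - pi_midF) * mu_midF)
print(expected)
```

The other coefficients are differences from the reference cell on the link scale, which is why Elev01-Low and TreatmentB never get rows of their own.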

Cheers,

- Jason