Dear R crew:

I am sorry this question has been posted before, but I can't seem to solve

this problem yet.

I have a simple dataset consisting of two variables: cestode intensity and

chick size (defined as CAPI).

Intensity is a count and clearly overdispersed, with way too many zeroes.

I'm interested in looking at the association between these two variables,

i.e. how well does chick size predict tape intensity?

Since I have a small sample size, I fit a zero inflated negat. Binomial (not

Poisson) model using the "pscl" package.

I built tried two models and got the outputs below.

> model <- zeroinfl(Int_Cesto ~ CAPI, dist = "negbin", EM = TRUE)

Call:

zeroinfl(formula = Int_Cesto ~ CAPI, dist = "negbin", EM = TRUE)

Count model coefficients (negbin with log link):

(Intercept) CAPI

-2.99182 0.06817

Theta = 0.4528

Zero-inflation model coefficients (binomial with logit link):

(Intercept) CAPI

12.1364 -0.1572

> summary(model)

Call:

zeroinfl(formula = Int_Cesto ~ CAPI, dist = "negbin", EM = TRUE)

Pearson residuals:

Min 1Q Median 3Q Max

-0.62751 -0.38842 -0.21303 -0.06899 7.29566

Count model coefficients (negbin with log link):

Estimate Std. Error z value Pr(>|z|)

(Intercept) -2.99182 3.39555 -0.881 0.3783

CAPI 0.06817 0.04098 1.664 0.0962 .

Log(theta) -0.79222 0.45031 -1.759 0.0785 .

Zero-inflation model coefficients (binomial with logit link):

Estimate Std. Error z value Pr(>|z|)

(Intercept) 12.13636 3.71918 3.263 0.00110 **

CAPI -0.15720 0.04989 -3.151 0.00163 **

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Theta = 0.4528

Number of iterations in BFGS optimization: 1

Log-likelihood: -140.2 on 5 Df

QUESTIONS

1. Is my model adequately specified?

2. CAPI is included in blocks 1 of output containing negative binomial

regression coefficients for CAPI, and is also included also in block 2

corresponding to the inflation model. Does this make sense?

If I specify my model slightly differently, I get what I believe is more

reasonable results:

> model12 <- zeroinfl(Int_Cesto ~ 1|CAPI, dist = "negbin", EM = TRUE)

> model12

Call:

zeroinfl(formula = Int_Cesto ~ 1 | CAPI, dist = "negbin", EM = TRUE)

Count model coefficients (negbin with log link):

(Intercept)

2.692

Theta = 0.4346

Zero-inflation model coefficients (binomial with logit link):

(Intercept) CAPI

13.2476 -0.1708

> summary(model12)

Call:

zeroinfl(formula = Int_Cesto ~ 1 | CAPI, dist = "negbin", EM = TRUE)

Pearson residuals:

Min 1Q Median 3Q Max

-0.61616 -0.36902 -0.19466 -0.06666 4.85481

Count model coefficients (negbin with log link):

Estimate Std. Error z value Pr(>|z|)

(Intercept) 2.6924 0.3031 8.883 <2e-16 ***

Log(theta) -0.8334 0.4082 -2.042 0.0412 *

Zero-inflation model coefficients (binomial with logit link):

Estimate Std. Error z value Pr(>|z|)

(Intercept) 13.24757 3.64531 3.634 0.000279 ***

CAPI -0.17078 0.04921 -3.471 0.000519 ***

---

Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Theta = 0.4346

Number of iterations in BFGS optimization: 1

Log-likelihood: -141.9 on 4 Df

QUESTION:

1. Is this model specification and output more reasonable?

2. CAPI appears only in the second block that corresponds to the

inflation model.

Thanks in advance!

Luciano

[[alternative HTML version deleted]]

______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide

http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.