Mixed-effects model for overdispersed count data?

Marie-Hélène Hachey

Hi,

I have to analyse the number of provisioning trips to nestlings in relation to a number of biological and environmental factors. I was thinking of building a mixed-effects model with species and nestid as random effects, using a Poisson distribution, but the data are overdispersed (variance/mean = 5). I then thought of using a mixed-effects model with a negative binomial distribution, but I have 2 problems:
 
1- The only package I found that fits mixed models with a neg. bin. distribution is glmmADMB, but I have a hard time understanding the output. Does anyone know of an R package whose output gives p values?
 
2- Two people I asked for advice told me that I should use either a mixed-effects model with a Poisson distribution (the random effects will take care of the overdispersion) OR a glm with a neg. bin. distribution, but not both at the same time, which would be unnecessary.
 
Any advice is welcome!
 
Thank you
 
Marie-Helene Hachey
M.Sc. student
Universite Laval, Quebec
     
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Mixed-effects model for overdispersed count data?

dave fournier
According to the documentation for glmmADMB, if you fit
your model with a statement like


fit =glmm.admb(y~Base*trt+Age+Visit, ...  data=epil2,family="nbinom")

then the parameter estimates are in

    fit$b  while their estimated standard deviations are
in

     fit$stdbeta

so presumably  p values can be constructed from the
quotient

      fit$b/fit$stdbeta

by assuming a t distribution with (somehow) the correct
degrees of freedom.


Re: Mixed-effects model for overdispersed count data?

bbolker
In reply to this post by Marie-Hélène Hachey
Marie-Hélène Hachey <marie_helene48 <at> hotmail.com> writes:

>
>
> Hi,
>
> I have to analyse the number of provisioning trips to nestlings
> according to a number of biological and
> environmental factors. I was thinking of building a mixed-effects model
> with species and nestid as
> random effects, using a Poisson distribution, but the data are
> overdispersed (variance/mean = 5). I then
> thought of using a mixed-effects model with negative binomial
> distribution, but I have 2 problems:
>
> 1- The only package building mixed models with neg. bin.
> distribution I found is the package glmmADMB but I
> have a hard time understanding the output. Does anyone know of an R
> package whose output gives p values?
>
> 2- Two people I asked for advice told me that I should use either a
> mixed-effect model with a Poisson
> distribution (the random effects will take care of the overdispersion)
> OR a glm using neg. bin.
> distribution but not both at the same time, which would be unnecessary.
>

  Several pieces of advice:

* this question is probably most appropriate for r-sig-mixed-models
(or perhaps r-sig-ecology)

* glmmADMB is admittedly a bit scratchy at the moment, but you
may not find a package that gives much easier-to-understand output --
almost all packages will give output in terms of fixed effect
coefficients, standard errors, and variances/covariances/standard deviations
of random effects.

* you might want to consider Poisson-lognormal models instead,
which allow for overdispersion and are a bit easier to fit in
the context of mixed models, by defining an individual-level
random effect (one level per observation): see e.g.
Elston, D. A., R. Moss, T. Boulinier, C. Arrowsmith, and X. Lambin. 2001.
Analysis of Aggregation, a Worked Example: Numbers of Ticks on Red Grouse
Chicks. Parasitology 122, no. 05: 563-569. doi:10.1017/S0031182001007740.
http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701.

  Such models can be fitted in (at least) MCMCglmm and recent versions
of lme4 (via glmer).

* p values will be tricky indeed.  Sorry about that.

* as to the advice about using either mixed models or NB models but not
both -- that's an empirical question.  It may indeed be the case that
one or the other takes care of the overdispersion, but you won't know
until you try.  It is certainly possible to have overdispersion even
within a species/nestid combination.
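
As a rough illustration of the Poisson-lognormal suggestion above, here is a
minimal sketch in R on simulated data. The data frame `d`, its columns
(`trips`, `temp`, `species`, `nestid`, `obs`) and all parameter values are
made up for illustration; they are not the poster's data. The `glmer` call
assumes that lme4 is installed and is skipped otherwise.

```r
# Simulated stand-in for the provisioning data (all names and values hypothetical)
set.seed(101)
n_species <- 5
nests_per_species <- 8
visits_per_nest <- 6
d <- expand.grid(visit = seq_len(visits_per_nest),
                 nest = seq_len(nests_per_species),
                 species = factor(seq_len(n_species)))
d$nestid <- interaction(d$species, d$nest, drop = TRUE)
d$obs <- factor(seq_len(nrow(d)))   # one level per observation
d$temp <- rnorm(nrow(d))            # hypothetical environmental covariate

# Random effects on the log scale: species, nest and observation level
sp_eff <- rnorm(n_species, sd = 0.3)[as.integer(d$species)]
nest_eff <- rnorm(nlevels(d$nestid), sd = 0.4)[as.integer(d$nestid)]
obs_eff <- rnorm(nrow(d), sd = 0.7)  # lognormal extra-Poisson noise
eta <- 1.5 + 0.3 * d$temp + sp_eff + nest_eff + obs_eff
d$trips <- rpois(nrow(d), lambda = exp(eta))

# Crude overdispersion check: variance/mean ratio (close to 1 for a pure Poisson)
disp_ratio <- var(d$trips) / mean(d$trips)
print(disp_ratio)

# Poisson-lognormal GLMM: Poisson family plus an observation-level random effect
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_pln <- lme4::glmer(trips ~ temp + (1 | species) + (1 | nestid) + (1 | obs),
                         data = d, family = poisson)
  print(summary(fit_pln))
}
```

The `(1 | obs)` term is what turns the plain Poisson mixed model into a
Poisson-lognormal one; comparing fits with and without it is one way to
"try" whether the species and nest effects alone absorb the overdispersion.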

I would suggest <http://glmm.wikidot.com/faq> as a starting point for
further reading ...

   good luck


Re: Mixed-effects model for overdispersed count data?

bbolker
In reply to this post by dave fournier
dave fournier <otter <at> otter-rsch.com> writes:

>
> According to the documentation for glmmADMB if you fit
> your model with a statement like
>
> fit =glmm.admb(y~Base*trt+Age+Visit, ...  data=epil2,family="nbinom")
>
> then the parameter estimates are in
>
>     fit$b  while their estimated standard deviations are
> in
>
>      fit$stdbeta
>
> so presumably  p values can be constructed from the
> quotient
>
>       fit$b/fit$stdbeta
>
> by assuming a t distribution with (somehow) the correct
> degrees of freedom.


  As I commented elsewhere (for the record in this group),
you would do that in R via

2*pnorm(-abs(fit$b/fit$stdbeta))

for a 2-tailed test, but these values should be taken as
order-of-magnitude estimates of the 'true' (???) p-value at
best, because they are Wald tests (not score or likelihood-ratio
tests, both of which are more reliable) and because they assume
infinite 'denominator degrees of freedom' (i.e. the equivalent of
a Z/chi-squared test rather than a t/F test).
   They are probably reliable only for a large, well-behaved data set
(e.g., >40 random-effects levels (species or nests)) ...
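
To make the degrees-of-freedom caveat concrete, the following sketch compares
the normal-based Wald p-value with t-based ones. The estimate and standard
error are made-up numbers standing in for one element of fit$b and
fit$stdbeta, and the degrees of freedom tried are arbitrary, since the
'correct' value is exactly what is unclear.

```r
# Hypothetical estimate and standard error (stand-ins for fit$b[i], fit$stdbeta[i])
b <- 0.50
se <- 0.22
z <- b / se

# Two-tailed Wald p-value with a normal reference (infinite denominator df)
p_norm <- 2 * pnorm(-abs(z))

# Same statistic with a t reference, for some arbitrary denominator df
df_try <- c(5, 10, 40, 1000)
p_t <- 2 * pt(-abs(z), df = df_try)

print(round(c(normal = p_norm, setNames(p_t, paste0("t_df", df_try))), 4))
```

The t-based values are always larger than the normal-based one and approach
it as the degrees of freedom grow, which is one reason the normal-based
p-value is best read as a rough guide when there are few random-effects levels.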
