Re: Logistic regression model selection with overdispersed/autocorrelated data
Thanks for pointing out the aod package and the beta-binomial logistic
models. While I see how betabinom could be applied to some of our other
analyses, I don't see how it can be used in our habitat selection analysis,
where individual locations are coded as 0 or 1 rather than as proportions. GEE
models (geeglm from geepack) could be used for our analyses. However,
these models are fit with quasi-likelihood estimating equations rather than
full maximum likelihood, so they have no likelihood-based AIC and do not
solve our model selection problem.
Beta-coefficients from gee, glm, GLMMs, and lrm are nearly identical. The
only thing that varies is the variance-covariance matrix and the resulting
standard errors. Consequently, the deviances should be similar because
predicted values (p) are calculated from the beta-coefficients. For an
individual data point, the loglikelihood = y * log(p) + (1 - y) * log(1 - p)
and the deviance = -2 * sum(loglikelihoods). Because this sum treats every
location as independent, the difference in deviance between two models is
amplified by autocorrelated data, which causes models to be
overparameterized when using AIC or likelihood ratio tests.
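
The amplification can be seen in a small base-R sketch. It is only an
illustration under strong assumptions: the data are simulated, the names
(used, x, k, dev_drop) are hypothetical, and autocorrelation is mimicked in
its most extreme form by repeating every record k times, which real GPS
data would only approximate.

```r
# Hypothetical toy data: used = 1 for an animal location, 0 for a random one
set.seed(1)
n <- 200
x <- rnorm(n)
used <- rbinom(n, 1, plogis(0.3 * x))
d1 <- data.frame(used = used, x = x)

# Extreme autocorrelation (assumption): each record repeated k times,
# so the larger data set carries no new information
k <- 4
dk <- d1[rep(seq_len(n), each = k), ]

# Deviance as written above: -2 * sum(y * log(p) + (1 - y) * log(1 - p))
dev <- function(fit, y) {
  p <- fitted(fit)
  -2 * sum(y * log(p) + (1 - y) * log(1 - p))
}

# Slope estimate and drop in deviance from adding x to the null model
dev_drop <- function(d) {
  null <- glm(used ~ 1, family = binomial, data = d)
  full <- glm(used ~ x, family = binomial, data = d)
  c(coef = unname(coef(full)["x"]),
    dev_drop = dev(null, d$used) - dev(full, d$used))
}

r1 <- dev_drop(d1)
rk <- dev_drop(dk)
print(rbind(original = r1, repeated = rk))
```

The slope is the same in both fits, but the drop in deviance is k times
larger for the repeated data, so AIC or a likelihood ratio test would favour
the extra covariate far more strongly than the information in the data
justifies.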
I am curious how others select models with autocorrelated data.
Thanks for your help,
From: <renaud.lancelot@gmail.com>
To: "[hidden email]" <[hidden email]>
cc: [hidden email]
Date: 31/01/2006 01:02
Subject: Re: [R] Logistic regression model selection with overdispersed/autocorrelated data
If you're not interested in fitting caribou-specific responses, you
can use beta-binomial logistic models. There are several packages
available for this purpose on CRAN, among them aod. Because these
models are fitted using maximum-likelihood methods, you can use AIC
(or other information criteria) to compare different models.
2006/1/30, [hidden email] <[hidden email]>:
> I am creating habitat selection models for caribou and other species with
> data collected from GPS collars. In my current situation the collars
> recorded the locations of 30 caribou every 6 hours. I am then comparing
> resources used at caribou locations to random locations using logistic
> regression (standard habitat analysis).
> The data are therefore highly autocorrelated and this causes Type I error
> in two ways: small standard errors around beta-coefficients and
> over-parameterization during model selection. Robust standard errors are
> easily calculated by block-bootstrapping the data using "animal" as a
> cluster with the Design library; however, I haven't found a satisfactory
> solution for model selection.
> A couple options are:
> 1. Using QAIC where the deviance is divided by a variance inflation
> factor (Burnham & Anderson). However, this VIF can vary greatly depending
> on the data set and the set of covariates used in the global model.
> 2. Manual forward stepwise regression using both changes in deviance and
> robust p-values for the beta-coefficients.
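
For reference, option 1 can be sketched in base R as follows. This is a
rough illustration, not a recommended recipe: the function name qaic, the
toy data, and the value c_hat = 3 are all hypothetical, and the sketch
assumes the variance inflation factor has already been estimated elsewhere
from the global model.

```r
# QAIC = -2 * logLik / c_hat + 2 * (K + 1)
# The extra 1 counts c_hat itself as an estimated parameter.
qaic <- function(fit, c_hat) {
  k <- attr(logLik(fit), "df")
  as.numeric(-2 * logLik(fit)) / c_hat + 2 * (k + 1)
}

# Hypothetical toy data: used = 1 for an animal location, 0 for a random one
set.seed(2)
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
d$used <- rbinom(300, 1, plogis(0.4 * d$x1))

m1 <- glm(used ~ x1, family = binomial, data = d)
m2 <- glm(used ~ x1 + x2, family = binomial, data = d)

c_hat <- 3  # assumed variance inflation factor, for illustration only
scores <- c(m1 = qaic(m1, c_hat), m2 = qaic(m2, c_hat))
print(scores)
```

Dividing by c_hat shrinks every difference in deviance by the same factor,
so the ranking of models depends directly on how c_hat was estimated, which
is the sensitivity noted in option 1.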
> I have been looking for a solution to this problem for a couple years and
> would appreciate any advice.
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!