mgcv: bam warning messages and non-convergence


R help mailing list-2
I have a large dataset of 118225 observations and 16 columns, so I’ve been using bam, rather than gam, for my analyses.

The response variable is count data, but it is overdispersed, so I thought I’d use a negative binomial model. I have 5 explanatory variables, which are biologically important: two are numerical and 3 are categorical. I’ve only applied a smoother to the first numerical explanatory variable (SE_score), because in some prior analyses I found that TL had an edf of 1.01 and was therefore effectively linear. I have also included two categorical random effects in the model.

# First pass: theta is estimated as part of the fit
m3 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
            s(code, bs = 're') + s(monthyear, bs = 're'),
          family=nb(), data=node_dat, method = "REML")

th <- m3$family$getTheta(TRUE) # extracts theta on its original (untransformed) scale

# Second pass: refit with theta fixed at the estimate from the first pass
m3 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
            s(code, bs = 're') + s(monthyear, bs = 're'),
          family=nb(th), data=node_dat, method = "REML")

summary(m3)

However, I’m getting this warning and I can’t find out what it means:

There were 32 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In pmax(1, y)/mu :
  longer object length is not a multiple of shorter object length
2: In y * log(pmax(1, y)/mu) :
  longer object length is not a multiple of shorter object length
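
On the wording of the warning itself: R raises "longer object length is not a multiple of shorter object length" whenever two vectors of mismatched length are combined element-wise and the shorter one has to be recycled. The sketch below reproduces the same message with two made-up vectors (y and mu here are hypothetical and unrelated to node_dat). It only illustrates what the message means; it does not show where inside bam the mismatch arises.

```r
# Hypothetical vectors whose lengths do not divide evenly (5 vs 2)
y  <- c(0, 0, 3, 1, 0)
mu <- c(0.5, 1.2)

# Capture the warning text instead of letting it print to the console
msg <- tryCatch(
  {
    pmax(1, y) / mu     # same expression shape as in the warning above
    NA_character_       # only reached if no warning is raised
  },
  warning = function(w) conditionMessage(w)
)

print(msg)
```

If the lengths matched, or one were an exact multiple of the other, the expression would evaluate silently.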

Is this an issue? The model converges, and I’ve checked for overdispersion again and get this value:

> E3 <- resid(m3, type = "pearson")
> sum(E3^2)/m3$df.res
[1] 0.7436045
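
For comparison, the same Pearson ratio can be computed on simulated data where the true model is known. This is a minimal sketch, assuming a single smooth term and a made-up theta of 0.5; the data and the names x, y and m are hypothetical, and gam() is used instead of bam() only because the example is small.

```r
library(mgcv)

set.seed(1)
n  <- 2000
x  <- runif(n)
mu <- exp(0.5 + sin(2 * pi * x))        # hypothetical smooth mean function
y  <- rnbinom(n, size = 0.5, mu = mu)   # negative binomial counts, theta = 0.5 (assumed)

# Fit with theta estimated as part of the model
m <- gam(y ~ s(x), family = nb(), method = "REML")

theta_hat <- m$family$getTheta(TRUE)    # estimated theta on its original scale

# Sum of squared Pearson residuals over residual degrees of freedom
pearson <- residuals(m, type = "pearson")
disp    <- sum(pearson^2) / df.residual(m)

cat("estimated theta:", theta_hat, "\n")
cat("Pearson dispersion ratio:", disp, "\n")
```

When the fitted family matches the data-generating process, this ratio would be expected to sit near 1, which gives a rough reference point for a value such as 0.74.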

So does this suggest there is now some underdispersion? Also, the model summary gives:

> summary(m3)

Family: Negative Binomial(0.055)
Link function: log

I’ve read that the 0.055 is also a measure of dispersion, so which one is correct?
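
For reference, the number printed in "Negative Binomial(0.055)" is theta, and the mgcv documentation for the nb family gives the variance as mu + mu^2/theta, so theta is a parameter of the distribution rather than a goodness-of-fit ratio like the Pearson statistic. The sketch below shows what theta = 0.055 implies for the variance and for the probability of a zero count. The mean mu = 2 is a hypothetical value chosen only for illustration, not something taken from the fitted model.

```r
theta <- 0.055   # value reported in the model summary
mu    <- 2       # hypothetical mean, for illustration only

# Variance implied by the negative binomial: mu + mu^2 / theta
nb_var <- mu + mu^2 / theta

# Probability of observing a zero under each distribution at the same mean
p0_nb   <- dnbinom(0, size = theta, mu = mu)
p0_pois <- dpois(0, lambda = mu)

cat("NB variance at mu = 2:", nb_var, "\n")
cat("P(y = 0), negative binomial:", p0_nb, "\n")
cat("P(y = 0), Poisson:", p0_pois, "\n")
```

A small theta means a variance far above the mean and a large share of zeros from the negative binomial alone, without any separate zero-inflation component.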

I was confused about all this, and I have a lot of zeros in my data (about 96%), so I thought I’d also try a zero-inflated Poisson model; however, it does not converge.

m4 <- bam(deg ~ s(SE_score) + TL + species + sex + season + year +
            s(code, bs = 're') + s(monthyear, bs = 're'),
          family=ziP(), data=node_dat, method = "REML")

Warning message:
In bgam.fit(G, mf, chunk.size, gp, scale, gamma, method = method,  :
  algorithm did not converge

Is there any reason why it does not converge? Maybe a zero-inflated negative binomial would be better, but I’m not sure how to fit one.
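
One way to judge whether a zero-inflated model is needed at all is to compare the observed share of zeros with the share a fitted negative binomial model would predict. The following is a rough sketch on simulated data, assuming a plain negative binomial process with no extra zeros; the data, the names x, y and m, and the chosen theta of 0.5 are all hypothetical, and gam() stands in for bam() to keep the example small.

```r
library(mgcv)

set.seed(2)
n <- 3000
x <- runif(n)
y <- rnbinom(n, size = 0.5, mu = exp(-1 + x))   # simulated counts, mostly zeros

m <- gam(y ~ s(x), family = nb(), method = "REML")
theta_hat <- m$family$getTheta(TRUE)

# Observed proportion of zeros in the data
obs_zero <- mean(y == 0)

# Proportion of zeros implied by the fitted model:
# average P(y = 0) over the fitted means, using the estimated theta
exp_zero <- mean(dnbinom(0, size = theta_hat, mu = fitted(m)))

cat("observed zeros:", obs_zero, "\n")
cat("expected zeros under fitted NB:", exp_zero, "\n")
```

If the two proportions are close, the negative binomial is already accounting for the zeros; a clear excess of observed zeros would be the case for considering zero inflation.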

I know there’s a lot here but any help would be appreciated.

Many thanks,

Mike



Michael Williamson
London NERC DTP Candidate

Email: [hidden email] Phone: +447764836592 Skype: mikejwilliamson Twitter: @mjw_marine Website: www.thenetlab.uk

Most recent paper:
Williamson, M. J. et al. (2021). Analysing detection gaps in acoustic telemetry data to infer differential movement patterns in fish. Ecology and Evolution, 11, 2717-2730. https://doi.org/10.1002/ece3.7226



