|
Hi,
I am trying to replicate Lambert (1992)'s simulation with zero-inflated Poisson models. The citation is here: @article{lambert1992zero, Author = {Lambert, D.}, Journal = {Technometrics}, Pages = {1--14}, Publisher = {JSTOR}, Title = {Zero-inflated {P}oisson regression, with an application to defects in manufacturing}, Year = {1992}} Specifically I am trying to recreate Table 2. But my estimates for Gamma are not working out. Any ideas why? Please cc me on a reply! Thanks, Chris ## ZIP simulations based on Lambert (1992)'s conditions # Empty workspace rm(list=ls()) # Load zero-inflation package library(pscl) # Simulation conditions #3 NumSimulations=2000 X=seq(from=0,to=1,length.out=N) Model.Matrix=cbind(rep(1,length(X)),X) Gamma=c(-1.5,2) Beta=c(1.5,-2) P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) Lambda=exp(Model.Matrix%*%Beta) CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) for(i in 1 : NumSimulations){ Lambda.Draw=rpois(N,Lambda) U=runif(N) Y=ifelse(U<=P,0,Lambda.Draw) CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) } # What were the estimates? colMeans(CoefSimulations) # My gamma estimates aren't even close ... [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
On Thu, 26 Apr 2012, Christopher Desjardins wrote:
> Hi, > I am trying to replicate Lambert (1992)'s simulation with zero-inflated > Poisson models. The citation is here: > > @article{lambert1992zero, > Author = {Lambert, D.}, > Journal = {Technometrics}, > Pages = {1--14}, > Publisher = {JSTOR}, > Title = {Zero-inflated {P}oisson regression, with an application to defects > in manufacturing}, > Year = {1992}} > > Specifically I am trying to recreate Table 2. But my estimates for Gamma > are not working out. Any ideas why? Your implementation of the inverse logit link is wrong, it should be exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by hand but either use qlogis()/plogis() or make.link("logit"). Your setting resulting in an almost constant probability of zero inflation and hence gamma was completely off. Below is my cleaned up code which nicely replicates the results for n = 100. The other two would require additional work because one would need to catch non-convergence etc. hth, Z ## data-generating process dgp <- function(nobs = 100) { gamma <- c(-1.5, 2) beta <- c(1.5, -2) x <- seq(0, 1, length.out = nobs) lambda <- exp(beta[1] + beta[2] * x) p <- plogis(gamma[1] + gamma[2] * x) y <- ifelse(runif(nobs) <= p, 0, rpois(nobs, lambda = lambda)) data.frame(y = y, x = x) } ## simulation of coefficients and standard errors sim <- function(nrep = 2000, ...) { onesim <- function(i) { d <- dgp(...) m <- zeroinfl(y ~ x, data = d) c(coef(m), sqrt(diag(vcov(m)))) } t(sapply(1:nrep, onesim)) } ## run simulation #3 library("pscl") set.seed(1) cfse <- sim(nobs = 100) ## mean coefficient estimates apply(cfse[, 1:4], 2, mean) ## median coefficient estimates apply(cfse[, 1:4], 2, median) ## sd of coefficient estimates apply(cfse[, 1:4], 2, sd) ## mean of the standard deviation estimates from observed information apply(cfse[, 5:8], 2, mean) > Please cc me on a reply! > Thanks, > Chris > > ## ZIP simulations based on Lambert (1992)'s conditions > > # Empty workspace > rm(list=ls()) > > # Load zero-inflation package > library(pscl) > > # Simulation conditions #3 > NumSimulations=2000 > X=seq(from=0,to=1,length.out=N) > Model.Matrix=cbind(rep(1,length(X)),X) > Gamma=c(-1.5,2) > Beta=c(1.5,-2) > P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma) > Lambda=exp(Model.Matrix%*%Beta) > CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2]) > > for(i in 1 : NumSimulations){ > Lambda.Draw=rpois(N,Lambda) > U=runif(N) > Y=ifelse(U<=P,0,Lambda.Draw) > CoefSimulations[i,]=coef(zeroinfl(Y~X|X)) > } > > # What were the estimates? > colMeans(CoefSimulations) # My gamma estimates aren't even close ... > > [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
On Fri, Apr 27, 2012 at 1:53 AM, Achim Zeileis <[hidden email]>wrote:
> On Thu, 26 Apr 2012, Christopher Desjardins wrote: > > Hi, >> I am trying to replicate Lambert (1992)'s simulation with zero-inflated >> Poisson models. The citation is here: >> >> @article{lambert1992zero, >> Author = {Lambert, D.}, >> Journal = {Technometrics}, >> Pages = {1--14}, >> Publisher = {JSTOR}, >> Title = {Zero-inflated {P}oisson regression, with an application to >> defects >> in manufacturing}, >> Year = {1992}} >> >> Specifically I am trying to recreate Table 2. But my estimates for Gamma >> are not working out. Any ideas why? >> > > Your implementation of the inverse logit link is wrong, it should be > exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by > hand but either use qlogis()/plogis() or make.link("logit"). > Doh! So obvious. I really should have noticed that. Thanks! I'll have a look at the rest of your code too. Cheers, Chris > > Your setting resulting in an almost constant probability of zero inflation > and hence gamma was completely off. > > Below is my cleaned up code which nicely replicates the results for n = > 100. The other two would require additional work because one would need to > catch non-convergence etc. > > hth, > Z > > ## data-generating process > dgp <- function(nobs = 100) { > gamma <- c(-1.5, 2) > beta <- c(1.5, -2) > x <- seq(0, 1, length.out = nobs) > lambda <- exp(beta[1] + beta[2] * x) > p <- plogis(gamma[1] + gamma[2] * x) > y <- ifelse(runif(nobs) <= p, 0, rpois(nobs, lambda = lambda)) > data.frame(y = y, x = x) > } > > ## simulation of coefficients and standard errors > sim <- function(nrep = 2000, ...) { > onesim <- function(i) { > d <- dgp(...) > m <- zeroinfl(y ~ x, data = d) > c(coef(m), sqrt(diag(vcov(m)))) > } > t(sapply(1:nrep, onesim)) > } > > ## run simulation #3 > library("pscl") > set.seed(1) > cfse <- sim(nobs = 100) > > ## mean coefficient estimates > apply(cfse[, 1:4], 2, mean) > > ## median coefficient estimates > apply(cfse[, 1:4], 2, median) > > ## sd of coefficient estimates > apply(cfse[, 1:4], 2, sd) > > ## mean of the standard deviation estimates from observed information > apply(cfse[, 5:8], 2, mean) > > > > Please cc me on a reply! >> Thanks, >> Chris >> >> ## ZIP simulations based on Lambert (1992)'s conditions >> >> # Empty workspace >> rm(list=ls()) >> >> # Load zero-inflation package >> library(pscl) >> >> # Simulation conditions #3 >> NumSimulations=2000 >> X=seq(from=0,to=1,length.out=**N) >> Model.Matrix=cbind(rep(1,**length(X)),X) >> Gamma=c(-1.5,2) >> Beta=c(1.5,-2) >> P=exp(Model.Matrix%*%Gamma)/**exp(1+Model.Matrix%*%Gamma) >> Lambda=exp(Model.Matrix%*%**Beta) >> CoefSimulations=matrix(nrow=**NumSimulations,ncol=2*dim(** >> Model.Matrix)[2]) >> >> for(i in 1 : NumSimulations){ >> Lambda.Draw=rpois(N,Lambda) >> U=runif(N) >> Y=ifelse(U<=P,0,Lambda.Draw) >> CoefSimulations[i,]=coef(**zeroinfl(Y~X|X)) >> } >> >> # What were the estimates? >> colMeans(CoefSimulations) # My gamma estimates aren't even close ... >> >> [[alternative HTML version deleted]] >> >> ______________________________**________________ >> [hidden email] mailing list >> https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help> >> PLEASE do read the posting guide http://www.R-project.org/** >> posting-guide.html <http://www.R-project.org/posting-guide.html> >> and provide commented, minimal, self-contained, reproducible code. >> >> [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
|
In reply to this post by Achim Zeileis-4
On Sat, 5 May 2012, Christopher Desjardins wrote:
> Hi, > I am a little confused at the output from predict() for a zeroinfl object. > > Here's my confusion: > > ## From zeroinfl package > fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin") > > > ## The raw zero-inflated overdispersed data > > table(bioChemists$art) > > 0 1 2 3 4 5 6 7 8 9 10 11 12 16 19 > 275 246 178 84 67 27 17 12 1 2 1 1 2 1 1 > > ## The default output from predict. It looks like it is doing a horrible > job. Does it really predict 7 zeros? predicting no 0s": https://stat.ethz.ch/pipermail/r-help/2011-June/279765.html The predicted _mean_ of a negative binomial distribution is not the most likely outcome (i.e., the _mode_) of the distribution. The post above presents some hands on examples. > > table(round(predict(fm_zinb2)) ) > > 0 1 2 3 4 5 6 10 > 7 354 487 45 12 6 3 1 > > ## The output from predict using "count" > > table(round(predict(fm_zinb2,type="count"))) > > 1 2 3 4 5 6 10 > 312 536 45 12 6 3 1 > > ## The output from predict using "zero", but here it predicts 24 > "structural" zeros? > > table(round(predict(fm_zinb2,type="zero"))) > > 0 1 > 891 24 > > > So my question is how do I interpret these different outputs from the > zeroinf object? What are the differences? The help page just left me > confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y) > and would most accurately track table(bioChemists$art) but I am wrong. How > can I find the E(Y) that would most closely track the raw data? > > Thanks, > Chris > > [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
| Powered by Nabble | Edit this page |
