Quantcast

Lambert (1992) simulation

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Lambert (1992) simulation

cddesjar
Hi,
I am trying to replicate Lambert (1992)'s simulation with zero-inflated
Poisson models. The citation is here:

@article{lambert1992zero,
Author = {Lambert, D.},
Journal = {Technometrics},
Pages = {1--14},
Publisher = {JSTOR},
Title = {Zero-inflated {P}oisson regression, with an application to defects
in manufacturing},
Year = {1992}}

Specifically I am trying to recreate Table 2. But my estimates for Gamma
are not working out. Any ideas why? Please cc me on a reply!
Thanks,
Chris

## ZIP simulations based on Lambert (1992)'s conditions

# Empty workspace
rm(list=ls())

# Load zero-inflation package
library(pscl)

# Simulation conditions #3
NumSimulations=2000
X=seq(from=0,to=1,length.out=N)
Model.Matrix=cbind(rep(1,length(X)),X)
Gamma=c(-1.5,2)
Beta=c(1.5,-2)
P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma)
Lambda=exp(Model.Matrix%*%Beta)
CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2])

for(i in 1 : NumSimulations){
Lambda.Draw=rpois(N,Lambda)
U=runif(N)
Y=ifelse(U<=P,0,Lambda.Draw)
CoefSimulations[i,]=coef(zeroinfl(Y~X|X))
}

# What were the estimates?
colMeans(CoefSimulations) # My gamma estimates aren't even close ...

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Lambert (1992) simulation

Achim Zeileis-4
On Thu, 26 Apr 2012, Christopher Desjardins wrote:

> Hi,
> I am trying to replicate Lambert (1992)'s simulation with zero-inflated
> Poisson models. The citation is here:
>
> @article{lambert1992zero,
> Author = {Lambert, D.},
> Journal = {Technometrics},
> Pages = {1--14},
> Publisher = {JSTOR},
> Title = {Zero-inflated {P}oisson regression, with an application to defects
> in manufacturing},
> Year = {1992}}
>
> Specifically I am trying to recreate Table 2. But my estimates for Gamma
> are not working out. Any ideas why?

Your implementation of the inverse logit link is wrong, it should be
exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by
hand but either use qlogis()/plogis() or make.link("logit").

Your setting resulting in an almost constant probability of zero inflation
and hence gamma was completely off.

Below is my cleaned up code which nicely replicates the results for n =
100. The other two would require additional work because one would need to
catch non-convergence etc.

hth,
Z

## data-generating process
dgp <- function(nobs = 100) {
   gamma <- c(-1.5, 2)
   beta <- c(1.5, -2)
   x <- seq(0, 1, length.out = nobs)
   lambda <- exp(beta[1] + beta[2] * x)
   p <- plogis(gamma[1] + gamma[2] * x)
   y <- ifelse(runif(nobs) <= p, 0, rpois(nobs, lambda = lambda))
   data.frame(y = y, x = x)
}

## simulation of coefficients and standard errors
sim <- function(nrep = 2000, ...) {
   onesim <- function(i) {
     d <- dgp(...)
     m <- zeroinfl(y ~ x, data = d)
     c(coef(m), sqrt(diag(vcov(m))))
   }
   t(sapply(1:nrep, onesim))
}

## run simulation #3
library("pscl")
set.seed(1)
cfse <- sim(nobs = 100)

## mean coefficient estimates
apply(cfse[, 1:4], 2, mean)

## median coefficient estimates
apply(cfse[, 1:4], 2, median)

## sd of coefficient estimates
apply(cfse[, 1:4], 2, sd)

## mean of the standard deviation estimates from observed information
apply(cfse[, 5:8], 2, mean)



> Please cc me on a reply!
> Thanks,
> Chris
>
> ## ZIP simulations based on Lambert (1992)'s conditions
>
> # Empty workspace
> rm(list=ls())
>
> # Load zero-inflation package
> library(pscl)
>
> # Simulation conditions #3
> NumSimulations=2000
> X=seq(from=0,to=1,length.out=N)
> Model.Matrix=cbind(rep(1,length(X)),X)
> Gamma=c(-1.5,2)
> Beta=c(1.5,-2)
> P=exp(Model.Matrix%*%Gamma)/exp(1+Model.Matrix%*%Gamma)
> Lambda=exp(Model.Matrix%*%Beta)
> CoefSimulations=matrix(nrow=NumSimulations,ncol=2*dim(Model.Matrix)[2])
>
> for(i in 1 : NumSimulations){
> Lambda.Draw=rpois(N,Lambda)
> U=runif(N)
> Y=ifelse(U<=P,0,Lambda.Draw)
> CoefSimulations[i,]=coef(zeroinfl(Y~X|X))
> }
>
> # What were the estimates?
> colMeans(CoefSimulations) # My gamma estimates aren't even close ...
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Lambert (1992) simulation

cddesjar
On Fri, Apr 27, 2012 at 1:53 AM, Achim Zeileis <[hidden email]>wrote:

> On Thu, 26 Apr 2012, Christopher Desjardins wrote:
>
>  Hi,
>> I am trying to replicate Lambert (1992)'s simulation with zero-inflated
>> Poisson models. The citation is here:
>>
>> @article{lambert1992zero,
>> Author = {Lambert, D.},
>> Journal = {Technometrics},
>> Pages = {1--14},
>> Publisher = {JSTOR},
>> Title = {Zero-inflated {P}oisson regression, with an application to
>> defects
>> in manufacturing},
>> Year = {1992}}
>>
>> Specifically I am trying to recreate Table 2. But my estimates for Gamma
>> are not working out. Any ideas why?
>>
>
> Your implementation of the inverse logit link is wrong, it should be
> exp(x)/(1 + exp(x)) and not exp(x)/exp(1 + x). In R I never code this by
> hand but either use qlogis()/plogis() or make.link("logit").
>


Doh! So obvious. I really should have noticed that. Thanks! I'll have a
look at the rest of your code too.
Cheers,
Chris


>
> Your setting resulting in an almost constant probability of zero inflation
> and hence gamma was completely off.
>
> Below is my cleaned up code which nicely replicates the results for n =
> 100. The other two would require additional work because one would need to
> catch non-convergence etc.
>
> hth,
> Z
>
> ## data-generating process
> dgp <- function(nobs = 100) {
>  gamma <- c(-1.5, 2)
>  beta <- c(1.5, -2)
>  x <- seq(0, 1, length.out = nobs)
>  lambda <- exp(beta[1] + beta[2] * x)
>  p <- plogis(gamma[1] + gamma[2] * x)
>  y <- ifelse(runif(nobs) <= p, 0, rpois(nobs, lambda = lambda))
>  data.frame(y = y, x = x)
> }
>
> ## simulation of coefficients and standard errors
> sim <- function(nrep = 2000, ...) {
>  onesim <- function(i) {
>    d <- dgp(...)
>    m <- zeroinfl(y ~ x, data = d)
>    c(coef(m), sqrt(diag(vcov(m))))
>  }
>  t(sapply(1:nrep, onesim))
> }
>
> ## run simulation #3
> library("pscl")
> set.seed(1)
> cfse <- sim(nobs = 100)
>
> ## mean coefficient estimates
> apply(cfse[, 1:4], 2, mean)
>
> ## median coefficient estimates
> apply(cfse[, 1:4], 2, median)
>
> ## sd of coefficient estimates
> apply(cfse[, 1:4], 2, sd)
>
> ## mean of the standard deviation estimates from observed information
> apply(cfse[, 5:8], 2, mean)
>
>
>
>  Please cc me on a reply!
>> Thanks,
>> Chris
>>
>> ## ZIP simulations based on Lambert (1992)'s conditions
>>
>> # Empty workspace
>> rm(list=ls())
>>
>> # Load zero-inflation package
>> library(pscl)
>>
>> # Simulation conditions #3
>> NumSimulations=2000
>> X=seq(from=0,to=1,length.out=**N)
>> Model.Matrix=cbind(rep(1,**length(X)),X)
>> Gamma=c(-1.5,2)
>> Beta=c(1.5,-2)
>> P=exp(Model.Matrix%*%Gamma)/**exp(1+Model.Matrix%*%Gamma)
>> Lambda=exp(Model.Matrix%*%**Beta)
>> CoefSimulations=matrix(nrow=**NumSimulations,ncol=2*dim(**
>> Model.Matrix)[2])
>>
>> for(i in 1 : NumSimulations){
>> Lambda.Draw=rpois(N,Lambda)
>> U=runif(N)
>> Y=ifelse(U<=P,0,Lambda.Draw)
>> CoefSimulations[i,]=coef(**zeroinfl(Y~X|X))
>> }
>>
>> # What were the estimates?
>> colMeans(CoefSimulations) # My gamma estimates aren't even close ...
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________**________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/**listinfo/r-help<https://stat.ethz.ch/mailman/listinfo/r-help>
>> PLEASE do read the posting guide http://www.R-project.org/**
>> posting-guide.html <http://www.R-project.org/posting-guide.html>
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate
star

Re: Lambert (1992) simulation

Achim Zeileis-4
In reply to this post by Achim Zeileis-4
On Sat, 5 May 2012, Christopher Desjardins wrote:

> Hi,
> I am a little confused at the output from predict() for a zeroinfl object.
>
> Here's my confusion:
>
> ## From zeroinfl package
> fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")
>
>
> ## The raw zero-inflated overdispersed data
> > table(bioChemists$art)
>
>   0   1   2   3   4   5   6   7   8   9  10  11  12  16  19
> 275 246 178  84  67  27  17  12   1   2   1   1   2   1   1
>
> ## The default output from predict. It looks like it is doing a horrible
> job. Does it really predict 7 zeros?
No, see also this R-help post on "Zero-inflated regression models:
predicting no 0s":
https://stat.ethz.ch/pipermail/r-help/2011-June/279765.html

The predicted _mean_ of a negative binomial distribution is not the most
likely outcome (i.e., the _mode_) of the distribution. The post above
presents some hands on examples.


> > table(round(predict(fm_zinb2)) )
>
>   0   1   2   3   4   5   6  10
>   7 354 487  45  12   6   3   1
>
> ##  The output from predict using "count"
> > table(round(predict(fm_zinb2,type="count")))
>
>   1   2   3   4   5   6  10
> 312 536  45  12   6   3   1
>
> ## The output from predict using "zero", but here it predicts 24
> "structural" zeros?
> > table(round(predict(fm_zinb2,type="zero")))
>
>   0   1
> 891  24
>
>
> So my question is how do I interpret these different outputs from the
> zeroinf object? What are the differences? The help page just left me
> confused. I would expect that table(round(predict(fm_zinb2))) would be E(Y)
> and would most accurately track table(bioChemists$art) but I am wrong. How
> can I find the E(Y) that would most closely track the raw data?
>
> Thanks,
> Chris
>
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Loading...