hurdle model - count and response predictions


John Wilson
Hello,

I'm using pscl to run a hurdle model. Everything works great until I get to
the point of making predictions. All of my "count" predictions are lower
than my actual data, and lower than the "response" predictions, similar to
the issue described here:
https://stat.ethz.ch/pipermail/r-help/2012-August/320426.html
and here:
https://stackoverflow.com/questions/48794622/hurdle-model-prediction-count-vs-response

Since the issue is the same (and not resolved), I'll just use the example
from the second link:

library("pscl")
data("RecreationDemand", package = "AER")

## model
m <- hurdle(trips ~ quality | ski, data = RecreationDemand, dist = "negbin")
nd <- data.frame(quality = 0:5, ski = "no")
predict(m, newdata = nd, type = "count")
predict(m, newdata = nd, type = "response")

The presence/absence part of the model gives estimates identical to those
of a logistic model run on the data. However, I thought that the negbin part
of the hurdle should give estimates identical to those of a separate glm.nb
model of the positive data. But I get completely different values...

library(MASS)
m.nb <- glm.nb(trips ~ quality,
  data = RecreationDemand[RecreationDemand$trips > 0, ])
predict(m, newdata = nd, type = "count") ## hurdle
predict(m.nb, newdata = nd, type = "response") ## positive counts only

Any help would be appreciated.


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: hurdle model - count and response predictions

Achim Zeileis-4
I answered the question on SO. In short, the differences come from
truncated vs. untruncated models and conditional vs. unconditional
expectations. Feel free to follow up on SO or here on the list...
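
The distinction between those expectations can be sketched in base R. This is only an illustration, not taken from the SO answer: the values of mu, theta and p_pos are made up, and it assumes the usual hurdle setup in which the count part is a zero-truncated negative binomial. It compares the mean of the untruncated distribution, the mean conditional on a positive count, and the unconditional mean once the hurdle probability is applied.

```r
## Hypothetical parameter values, chosen only for illustration
mu    <- 2.5   # mean of the untruncated negative binomial
theta <- 0.8   # dispersion (the 'size' argument of dnbinom)
p_pos <- 0.4   # assumed hurdle probability of a non-zero count

## Probability of a zero under the untruncated count distribution
p0 <- dnbinom(0, size = theta, mu = mu)

## Mean of the zero-truncated distribution: E[Y | Y > 0]
mu_trunc <- mu / (1 - p0)

## Unconditional mean of the hurdle model: P(Y > 0) * E[Y | Y > 0]
mu_hurdle <- p_pos * mu_trunc

print(c(untruncated = mu, truncated = mu_trunc, hurdle = mu_hurdle))

## Simulation check: keep only the positive draws from the
## untruncated distribution and compare their average with mu_trunc
set.seed(1)
y <- rnbinom(1e5, size = theta, mu = mu)
print(mean(y[y > 0]))  # close to mu_trunc, not to mu
```

The truncated mean is always larger than the untruncated one, so a model fitted to the positive counts alone without accounting for truncation (as glm.nb does) targets a different quantity than the untruncated mean of the count component.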

On Fri, 16 Feb 2018, John Wilson wrote:

> Hello,
>
> I'm using pscl to run a hurdle model. Everything works great until I get to
> the point of making predictions. All of my "count" predictions are lower
> than my actual data, and lower than the "response" predictions, similar to
> the issue described here:
> https://stat.ethz.ch/pipermail/r-help/2012-August/320426.html
> and here:
> https://stackoverflow.com/questions/48794622/hurdle-model-prediction-count-vs-response
>
> Since the issue is the same (and not resolved), I'll just use the example
> from the second link:
>
> library("pscl")
> data("RecreationDemand", package = "AER")
>
> ## model
> m <- hurdle(trips ~ quality | ski, data = RecreationDemand, dist = "negbin")
> nd <- data.frame(quality = 0:5, ski = "no")
> predict(m, newdata = nd, type = "count")
> predict(m, newdata = nd, type = "response")
>
> The presence/absence part of the model gives estimates identical to those
> of a logistic model run on the data. However, I thought that the negbin part
> of the hurdle should give estimates identical to those of a separate glm.nb
> model of the positive data. But I get completely different values...
>
> library(MASS)
> m.nb <- glm.nb(trips ~ quality,
>   data = RecreationDemand[RecreationDemand$trips > 0, ])
> predict(m, newdata = nd, type = "count") ## hurdle
> predict(m.nb, newdata = nd, type = "response") ## positive counts only
>
> Any help would be appreciated.