Prediction with two fixed-effects - large number of IDs

Miluji Sb
Dear all,

I am running a panel regression with time and location fixed effects:

###

reg1 <- lm(lny ~ factor(id) + factor(year) + x1+ I(x1)^2 + x2+ I(x2)^2 ,
 data=mydata, na.action="na.omit")
###

My goal is to use the estimation for prediction. However, I have 8,500 IDs,
which is resulting in very slow computation. Ideally, I would like to do
the following:

###
reg2 <- felm(lny ~ x1+ I(x1)^2 + x2+ I(x2)^2 | id + year , data=mydata,
na.action="na.omit")
###

However, predict does not work with felm. Is there a way to either make lm
faster or use predict with felm? Is parallelizing an option?

Any help will be appreciated. Thank you!

Sincerely,

Milu

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Prediction with two fixed-effects - large number of IDs

Jeff Newmiller
I have no direct experience with such horrific models, but your formula is a mess and Google suggests the biglm package with ffdf.

Specifically, you should convert your discrete variables to factors before you build the model, particularly since you want to use predict after the fact, for which you will need a new data set with the exact same levels in the factors.
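
A minimal sketch of that advice on simulated data. The data frame `mydata` here is made up for illustration (5 ids, 4 years, one regressor), not the poster's actual data. The point it tries to show is that the factors are created once before fitting, and the prediction data reuses the levels stored in the training data.

```r
set.seed(1)

# Hypothetical small panel: 5 ids observed over 4 years
mydata <- expand.grid(id = 1:5, year = 2001:2004)
mydata$x1  <- rnorm(nrow(mydata))
mydata$lny <- 0.5 * mydata$x1 + 0.1 * mydata$id + rnorm(nrow(mydata), sd = 0.1)

# Convert the discrete variables to factors before fitting
mydata$id   <- factor(mydata$id)
mydata$year <- factor(mydata$year)

fit <- lm(lny ~ id + year + x1, data = mydata)

# New data must carry the same factor levels as the training data
newdata <- data.frame(
  id   = factor(c(2, 5), levels = levels(mydata$id)),
  year = factor(c(2003, 2004), levels = levels(mydata$year)),
  x1   = c(0.3, -1.2)
)

pred <- predict(fit, newdata = newdata)
print(pred)
```

An id or year that never appeared in the training data has no estimated effect, so predict() cannot produce a value for it under this setup.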

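Also, your use of I() is broken and redundant: in I(x1)^2 the exponent sits outside I(), so the formula parser treats ^ as term crossing and the term collapses to a duplicate of x1.  I think the formula

lny ~ id + year + x1 + I(x1^2) + x2 + I(x2^2)

would obtain the intended prediction results.  Note that x1^2 without I() would not, because ^ is a formula operator there as well.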
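
The difference between these spellings can be checked without fitting anything, by asking R which terms it extracts from each formula. The sketch below uses only base R and the variable names from the thread; no data is needed. It is meant as a quick check, not a full account of formula syntax.

```r
# Term labels show what the formula parser actually keeps

# Exponent outside I(): ^ acts as a formula operator, not arithmetic
labs_broken <- attr(terms(lny ~ x1 + I(x1)^2), "term.labels")

# Bare x1^2: also a formula operator, x1 crossed with itself
labs_bare <- attr(terms(lny ~ x1 + x1^2), "term.labels")

# Exponent inside I(): arithmetic squaring, a genuine quadratic term
labs_fixed <- attr(terms(lny ~ x1 + I(x1^2)), "term.labels")

print(labs_broken)
print(labs_bare)
print(labs_fixed)
```

Only the last formula should list a squared term; the first two leave the model linear in x1.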


Re: Prediction with two fixed-effects - large number of IDs

Miluji Sb
Dear Jeff,

Thank you so much and apologies for the typo in I() - it was silly.

I will try the biglm package - thanks!

Sincerely,

Milu


Re: Prediction with two fixed-effects - large number of IDs

David Winsemius
> On Jun 17, 2017, at 12:01 PM, Jeff Newmiller <[hidden email]> wrote:
> Also, your use of I() is broken and redundant.  I think formulas
>
> lny ~ id + year + x1 + I(x1^2) + x2 + I(x2^2)
>
> or
>
> lny ~ id + year + x1^2 + x2^2

This was offered as a formula to `felm` (but with no data example), a function from the lfe package, with which I have no experience either. If experience with `lm` and `glm` is any guide, an inferentially safer approach might be:

   lny ~ id + year + poly(x1,2) + poly(x2,2)
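
A small sketch of what poly() does in an `lm` fit, on simulated data (the data frame `d` and its coefficients are invented for illustration, and the fixed effects are left out to keep it short). By default poly() builds an orthogonal polynomial basis, so the coefficients differ from those on x1 and I(x1^2), but the two fits span the same columns and should give the same predictions.

```r
set.seed(42)

# Simulated data with a quadratic relationship (illustrative only)
d <- data.frame(x1 = runif(200, -2, 2))
d$lny <- 1 + 0.8 * d$x1 - 0.3 * d$x1^2 + rnorm(200, sd = 0.2)

fit_raw  <- lm(lny ~ x1 + I(x1^2), data = d)  # raw polynomial terms
fit_poly <- lm(lny ~ poly(x1, 2), data = d)   # orthogonal polynomial basis

# Predict at a few new values of x1 with both fits
nd <- data.frame(x1 = c(-1, 0, 1.5))
p_raw  <- predict(fit_raw,  newdata = nd)
p_poly <- predict(fit_poly, newdata = nd)

# Compare the two sets of predictions
print(all.equal(unname(p_raw), unname(p_poly)))
```

Two practical notes: poly(x1, 2, raw = TRUE) returns the raw powers if those coefficients are wanted, and poly() stops with an error when x1 contains missing values, so NAs need to be dropped before fitting.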

David Winsemius
Alameda, CA, USA
