predicting values from multiple regression

Anna Lee
Hey List,

I did a multiple regression and my final model looks as follows:

model9<-lm(calP ~ nsP + I(st^2) + distPr + I(distPr^2))

Now I tried to predict the values for calP from this model using the
following function:

xv<-seq(0,89,by=1)
yv<-predict(model9,list(distPr=xv,st=xv,nsP=xv))

The predicted values are, however, strange. I do not know whether the
model simply does not fit the data (all coefficients are actually
significant and plot(model) shows a good shape) or whether I did
something wrong with my prediction command. Does anyone have an
idea?

Thanks a lot,
Anna

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: predicting values from multiple regression

Ista Zahn-2
Hi Anna,

On Sun, Mar 20, 2011 at 2:54 PM, Anna Lee <[hidden email]> wrote:

> Hey List,
>
> I did a multiple regression and my final model looks as follows:
>
> model9<-lm(calP ~ nsP + I(st^2) + distPr + I(distPr^2))
>
> Now I tried to predict the values for calP from this model using the
> following function:
>
> xv<-seq(0,89,by=1)
> yv<-predict(model9,list(distPr=xv,st=xv,nsP=xv))

The second argument to predict.lm is newdata, which should be a
data.frame. See ?predict.lm.

Beyond that though, I'm not sure what you are trying to accomplish.
The way you've set this up you would get predicted values for cases
like

distPr   st   nsP
     0    0     0
     1    1     1
     2    2     2
     .    .     .
    89   89    89


Is that really what you want?
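
If the aim is instead to see how the prediction changes along one predictor, a possible sketch is below. The data are simulated stand-ins (the real measurements are not in this thread, so the ranges and coefficients are assumptions), and holding nsP and st at their means is just one illustrative choice:

```r
# Simulated stand-in for the real measurements (hypothetical values)
set.seed(1)
n <- 90
dat <- data.frame(
  nsP    = runif(n, 0, 50),
  st     = runif(n, 0, 10),
  distPr = runif(n, 0, 89)
)
dat$calP <- 5 + 0.1 * dat$nsP + 0.05 * dat$st^2 +
  0.2 * dat$distPr - 0.002 * dat$distPr^2 + rnorm(n)

# Fit with data = so that predict() can look the variables up by name
model9 <- lm(calP ~ nsP + I(st^2) + distPr + I(distPr^2), data = dat)

# Vary distPr over its range; hold the other predictors at their means
newd <- data.frame(
  distPr = seq(0, 89, by = 1),
  nsP    = mean(dat$nsP),
  st     = mean(dat$st)
)
yv <- predict(model9, newdata = newd)
```

Here yv holds one prediction per row of newd, so it can be plotted against newd$distPr.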

Best,
Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Re: predicting values from multiple regression

Anna Lee
Dear Ista!

Thank you for replying. The point you made is exactly the
problem: I want to predict the values at different points in space.
calP stands for the water content at each sampling point (n=90), but I
don't quite understand what R does. calP is my vector of measured data,
and I thought that with the predict function the program would calculate
a value from the model function for every value of calP... ?

This e-mail is confidential. If you have received it in error, please
notify me immediately and delete it from your system.

Re: predicting values from multiple regression

David Winsemius

On Mar 20, 2011, at 3:56 PM, Anna Lee wrote:

> Dear Ista!
>
> Thank you for replying. The point you made is exactly the
> problem: I want to predict the values at different points in space.
> calP stands for the water content at each sampling point (n=90), but I
> don't quite understand what R does. calP is my vector of measured data,
> and I thought that with the predict function the program would calculate
> a value from the model function for every value of calP... ?

I do not think you are reading Ista Zahn's comments carefully. Ista said
you needed to supply your newdata argument as a data.frame. That is the
fundamental problem.

Ista also pointed out that the arguments in your list didn't seem to
represent a full exploration of the data "space". If all of the
variables in the regression problem have the same value for each case,
then there is no point in adding multiple variables.
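
To make the point about the data "space" concrete, here is a rough sketch comparing the lockstep values from the original call with a grid of combinations. The 0 to 89 range for st and nsP and the 10 grid points per variable are assumptions carried over from xv for illustration, not properties of the real data:

```r
xv <- seq(0, 89, by = 1)

# Every predictor moves in lockstep: 90 rows along the diagonal only
diagonal <- data.frame(distPr = xv, st = xv, nsP = xv)

# All combinations of a coarser set of values for each predictor
grid <- expand.grid(
  distPr = seq(0, 89, length.out = 10),
  st     = seq(0, 89, length.out = 10),
  nsP    = seq(0, 89, length.out = 10)
)

nrow(diagonal)  # 90
nrow(grid)      # 1000

# In the lockstep version the columns are perfectly correlated
cor(diagonal$distPr, diagonal$st)
```

Either data frame could be passed as newdata, but only the grid covers combinations in which the predictors differ from one another.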

--
David.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Re: predicting values from multiple regression

Ista Zahn-2
Hi Anna,
Maybe you can start again and tell us what you are trying to
accomplish. If you are trying to calculate predicted values for the
cases in the data set used to fit the model, you don't need the newdata
argument at all. Just do

predict(model9)

If you are trying to do something else, please describe what that
something else is and I'm sure someone will help.
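
For what it's worth, a quick check on simulated data (the data frame dat and its columns are invented for this sketch) suggests that predict() without newdata simply returns the fitted values, i.e. one value per row of predictors used in the fit, not one per value of the response:

```r
set.seed(42)
dat <- data.frame(x1 = rnorm(90), x2 = runif(90))
dat$y <- 1 + 2 * dat$x1 + 3 * dat$x2^2 + rnorm(90, sd = 0.5)

fit <- lm(y ~ x1 + I(x2^2), data = dat)

# No newdata: predictions at the observed predictor values
p <- predict(fit)

# These should match the fitted values, and also the predictions
# obtained by passing the original data frame back in as newdata
same_as_fitted  <- isTRUE(all.equal(p, fitted(fit)))
same_as_newdata <- isTRUE(all.equal(p, predict(fit, newdata = dat)))
```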

Best,
Ista

Re: predicting values from multiple regression

Anna Lee
Dennis: Thank you so much! I got it now - it just works perfectly. Thanks a lot!
Anna

2011/3/21 Dennis Murphy <[hidden email]>:

> Hi:
>
> To amplify Ista's and David's comments:
>
> (1) You should not be inputting separate vectors into lm(), especially if
> you intend to do prediction. They should be combined into a data frame
> instead. This is not a requirement, but it's a much safer strategy for
> modeling in R.
> (2) Your covariate st does not have a linear component. It should,
> particularly if this is an empirical model rather than a theoretical one.
> (3) You should be using poly(var, 2) to create orthogonal columns in the
> model matrix for the variables that are to contain quadratic terms.
> (4) The newdata =  argument of predict.lm() [whose help page you should read
> carefully] requires a data frame with columns having precisely the same
> variable names as exist in the RHS of the model formula in lm().
>
> Example:
> dd <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = runif(50, -2, 2),
>                  x3 = rpois(50, 10))
>
> # Fit yhat = b0 + b1 * x1 + b2 * x1^2 + b3 * x2 + b4 * x3 + b5 * x3^2
> mod <- lm(y ~ poly(x1, 2) + x2 + poly(x3, 2), data = dd)
>
> # Note that the names of the variables in newd are the same as those
> # on the RHS of the formula in mod
> newd <- data.frame(x1 = rnorm(5), x2 = runif(5, -2, 2),
>                    x3 = rpois(5, 10))    # new data points
> # Append predictions (at the new data points) to newd
> cbind(newd, predict(mod, newdata = newd))
>
> # To just get predictions at the observed points, all you need is
> predict(mod)
>
> HTH,
> Dennis
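
A small sketch of points (3) and (4), using simulated data rather than the real measurements (the data frame dd and its columns x and y are made up for illustration). It suggests that I(x^2) and poly(x, 2) are two parameterizations of the same quadratic fit, so predictions at new points agree as long as newdata uses the predictor name from the formula:

```r
set.seed(123)
dd <- data.frame(x = runif(60, 0, 10))
dd$y <- 2 + 0.5 * dd$x - 0.08 * dd$x^2 + rnorm(60, sd = 0.3)

# Raw quadratic terms versus orthogonal polynomial columns
mod_raw  <- lm(y ~ x + I(x^2), data = dd)
mod_poly <- lm(y ~ poly(x, 2), data = dd)

# newdata must use the predictor name from the formula: x
newd <- data.frame(x = c(1, 2.5, 5, 7.5, 9))

pred_raw  <- predict(mod_raw,  newdata = newd)
pred_poly <- predict(mod_poly, newdata = newd)

# The coefficients differ, but the predictions should agree
all.equal(pred_raw, pred_poly)
```

The orthogonal version mainly helps with numerical stability and with interpreting the individual terms; the fitted curve itself is the same.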

Re: predicting values from multiple regression

Anna Lee
Dennis: thank you so much! I got it now and it works just perfectly.
Thanks a lot to the others too!
Anna
