Help on predict.lm


Nederjaard
Hello, 


I'm new here, but will try to be as specific and complete as possible. I'm trying to use “lm” to first estimate parameter values from a set of calibration measurements, and then to use those estimates to calculate another set of values with “predict.lm”.

First I have a calibration dataset of absorbance values measured from standard solutions with known concentrations of bromide:

> stds
      abs conc
1 -0.0021    0
2  0.1003  200
3  0.2395  500
4  0.3293  800

On this small calibration series, I perform a linear regression to find the parameter estimates of the relationship between absorbance (abs) and concentration (conc):

> linear1 <- lm(abs~conc, data=stds)
> summary(linear1)

Call:
lm(formula = abs ~ conc, data = stds)

Residuals:
        1         2         3         4
-0.012600  0.006467  0.020667 -0.014533

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept) 1.050e-02  1.629e-02   0.645  0.58527  
conc        4.167e-04  3.378e-05  12.333  0.00651 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.02048 on 2 degrees of freedom
Multiple R-squared: 0.987,      Adjusted R-squared: 0.9805
F-statistic: 152.1 on 1 and 2 DF,  p-value: 0.00651


Next I have another dataset, which contains measured absorbance values of bromide in solution:

> brom
    hours     abs
1    -1.0  0.0633
2     1.0  0.2686
3     5.0  0.2446
4    18.0  0.2274
5    29.0  0.2091
6    42.0  0.1961
7    53.0  0.1310
8    76.0  0.1504
9    91.0  0.1317
10   95.5  0.1169
11  101.0  0.0977
12  115.0  0.1023
13  123.5  0.0879
14  138.5  0.0724
15  147.5  0.0564
16  163.0  0.0495
17  171.0  0.0325
18  189.0  0.0182
19  211.0  0.0047
20  212.5      NA
21  815.5 -0.2112
22  816.5 -0.1896
23  817.5 -0.0783
24  818.5  0.2963
25  819.5  0.1448
26  839.5  0.0936
27  864.0  0.0560
28  888.0  0.0310
29  960.5  0.0056
30 1009.0 -0.0163

The values in column brom$abs, measured at 30 successive points in time, need to be converted to bromide concentrations using the previously established relationship “linear1”.
At first, I thought it could be done by:

> predict.lm(linear1, brom$abs)
Error in eval(predvars, data, env) : numeric 'envir' arg not of length one

But R gives the above error message. Then, after some searching around on different fora and R communities (including this one), I learned that the “newdata” argument of “predict.lm” actually needs to be supplied as a data frame. Thus:

> mabs <- data.frame(Abs = brom$abs)
> predict.lm(linear1, mabs)
Error in eval(expr, envir, enclos) : object 'conc' not found

Again, R gives an error, presumably because I made a mistake somewhere, but I truly fail to see where. I hope somebody can explain clearly what I'm doing wrong and what I should do instead.
Any help is greatly appreciated, thanks!

Re: Help on predict.lm

Berend Hasselman

On 27-03-2012, at 19:24, Nederjaard wrote:

>> mabs <- data.frame(Abs = brom$abs)
>> predict.lm(linear1, mabs)
> Error in eval(expr, envir, enclos) : object 'conc' not found
>

There is no column with name "conc" in your dataframe mabs.

You regressed abs on conc. For prediction you need data for conc and not abs.
So provide data for conc. Or change the regression around: lm(conc ~ abs, data=stds) if that makes any sense.
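
Both suggestions can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `inverse1`, `new_conc` and `brom_abs` are made up here, the concentrations in `new_conc` are hypothetical, and only the first three absorbance readings from `brom` are used. The generic `predict()` is called, which dispatches to `predict.lm` for an `lm` fit. Whether regressing conc on abs is statistically appropriate for a calibration problem is a separate question that is not settled here.

```r
# Calibration data from the original post
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))

# Option 1: keep abs ~ conc and supply new values of the predictor, conc.
linear1  <- lm(abs ~ conc, data = stds)
new_conc <- data.frame(conc = c(100, 400))   # hypothetical concentrations
pred_abs <- predict(linear1, newdata = new_conc)

# Option 2: turn the regression around so that abs becomes the predictor.
inverse1  <- lm(conc ~ abs, data = stds)
brom_abs  <- data.frame(abs = c(0.0633, 0.2686, 0.2446))  # first rows of brom
pred_conc <- predict(inverse1, newdata = brom_abs)
```

In both cases the column name in `newdata` is the name of the variable on the right-hand side of the model formula.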

What you did with mabs wouldn't have worked anyway because Abs is not the same as abs.
And it wasn't necessary.
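
The point about names can be checked directly. The sketch below assumes the reversed model from the suggestion above and uses illustrative names (`inverse1`, `wrong`, `right`, `needed`); R variable names are case-sensitive, so a column called `Abs` does not satisfy a formula that refers to `abs`.

```r
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
inverse1 <- lm(conc ~ abs, data = stds)   # reversed model, as suggested above

wrong <- data.frame(Abs = c(0.0633, 0.2686))  # capital A: not the model's 'abs'
right <- data.frame(abs = c(0.0633, 0.2686))

# Names of the predictor variables the fitted model expects in newdata
needed <- all.vars(delete.response(terms(inverse1)))

needed %in% names(wrong)   # FALSE
needed %in% names(right)   # TRUE

# With the wrong name, predict() has no usable 'abs' column and fails;
# tryCatch() captures the error object instead of stopping the script.
res_wrong <- tryCatch(predict(inverse1, newdata = wrong), error = function(e) e)
res_right <- predict(inverse1, newdata = right)
```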

Berend



______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Help on predict.lm

Peter Ehlers
In reply to this post by Nederjaard

R tries hard to keep you from committing scientific abuse.
As stated, your problem seems to me akin to

1. Given that a man's age can be modelled as a function
    of the grayness of his hair,
2. predict a man's age from the temperature in Barcelona.

Your calibration relates 'abs' and 'conc'. Now you want
to predict 'abs' from _'hours'_ (I think). I suspect that
concentration is actually related to time and this is
the missing link that you'll have to provide.

BTW, I'm surprised that you didn't find the requirement
for 'newdata' to be a data frame on the predict.lm help
page - it's pretty clearly stated there.
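
A small sketch of the data frame requirement, using the calibration fit from the original post. The values in `conc_values` are hypothetical, and the names `as_vector` and `as_frame` are chosen here only for illustration.

```r
stds <- data.frame(abs  = c(-0.0021, 0.1003, 0.2395, 0.3293),
                   conc = c(0, 200, 500, 800))
linear1 <- lm(abs ~ conc, data = stds)

conc_values <- c(100, 400)   # hypothetical concentrations

# A bare numeric vector is not a data frame, so this call fails;
# the error message is captured as a character string.
as_vector <- tryCatch(predict(linear1, newdata = conc_values),
                      error = function(e) conditionMessage(e))

# Wrapping the values in a data frame named after the predictor works.
as_frame <- predict(linear1, newdata = data.frame(conc = conc_values))
```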

Peter Ehlers



Re: Help on predict.lm

Bert Gunter
FORTUNE!!!
-- Bert

On Tue, Mar 27, 2012 at 11:44 AM, Peter Ehlers <[hidden email]> wrote:
>
> R tries hard to keep you from committing scientific abuse.
> As stated, your problem seems to me akin to
>
> 1. Given that a man's age can be modelled as a function
>   of the grayness of his hair,
> 2. predict a man's age from the temperature in Barcelona.
>

...




Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


Re: Help on predict.lm

Nederjaard
Hello all,

Thanks for all your replies. I have studied it some more in the meantime, and indeed found out that what I was trying to do was not correct to begin with. Sorry to have wasted your time, but thanks for the comments.