Repeated cross-validation for a lm object


samuel-rosa
Dear R users,

Is there a function that performs repeated k-fold cross-validation for an lm object and returns the predicted values for every observation? The situation is as follows:
I had a data set of 174 observations, from which I randomly sampled a subset of 150 observations. With the subset (n = 150) I fitted the model y = a + bx. The model has to be validated using repeated k-fold cross-validation on the complete data set (n = 174). I need to use 10 folds and repeat the cross-validation 100 times. At the end of the procedure, I need access to the predicted values for each observation, that is, the 100 predicted values for each observation. The function lmCV() in the package chemometrics provides the predicted values. However, it works only with multiple linear regression models.
I hope there is a way of doing it.
Best regards,
Bc.Sc.Agri. Alessandro Samuel-Rosa
Postgraduate Program in Soil Science
Federal University of Santa Maria
Av. Roraima, nº 1000, Bairro Camobi, CEP 97105-970
Santa Maria, Rio Grande do Sul, Brazil

Re: Repeated cross-validation for a lm object

glsnow
The validate function in the rms package can cross-validate ols
objects (ols is similar to lm, but stores additional information).
The default is bootstrap validation, but you can specify
cross-validation instead.
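
A minimal sketch of this suggestion, assuming the rms package is installed. The data frame `dat` and its columns `x` and `y` are simulated stand-ins, not the data from the question. As far as I can tell, validate() reports summary indexes (R-square, MSE and so on) rather than per-observation predictions, and a single call performs one k-fold run, so it would have to be called repeatedly to get repeated cross-validation.

```r
library(rms)

set.seed(42)
# Hypothetical data standing in for the 174 observations in the question
dat <- data.frame(x = runif(174))
dat$y <- 2 + 3 * dat$x + rnorm(174, sd = 0.5)

# ols() must keep the design matrix and response (x = TRUE, y = TRUE)
# so that validate() can refit the model on the resampled data
fit <- ols(y ~ x, data = dat, x = TRUE, y = TRUE)

# With method = "crossvalidation", B is the number of groups (folds)
val <- validate(fit, method = "crossvalidation", B = 10)
print(val)
```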

--
Gregory (Greg) L. Snow Ph.D.
[hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Repeated cross-validation for a lm object

Mxkuhn
The train function in the caret package will do this. In the trainControl function, set method = "repeatedcv" and repeats = 100.
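
A minimal sketch of that setup, assuming the caret package is installed. The data frame `dat` and its columns are simulated stand-ins for the data in the question, and `savePredictions = "final"` is an addition not mentioned above: it asks train to keep the hold-out predictions, which then appear in the `pred` component of the fitted object.

```r
library(caret)

set.seed(42)
# Hypothetical data standing in for the 174 observations in the question
dat <- data.frame(x = runif(174))
dat$y <- 2 + 3 * dat$x + rnorm(174, sd = 0.5)

# 10-fold cross-validation repeated 100 times, keeping hold-out predictions
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 100,
                     savePredictions = "final")

fit <- train(y ~ x, data = dat, method = "lm", trControl = ctrl)

# One row per held-out prediction; rowIndex identifies the observation
preds <- fit$pred
head(preds[, c("rowIndex", "obs", "pred", "Resample")])

# Number of hold-out predictions collected for each observation
n_per_obs <- table(preds$rowIndex)
```

Each observation falls in exactly one hold-out fold per repeat, so `n_per_obs` should show 100 predictions for every row of `dat`.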


Re: Repeated cross-validation for a lm object

samuel-rosa
Dear Max

Thank you for your attention. The train function in the caret package really does what I need.

Best regards,
Bc.Sc.Agri. Alessandro Samuel-Rosa

Re: Repeated cross-validation for a lm object

samuel-rosa
In reply to this post by glsnow
Dear Greg

Thank you for your attention. I've seen that the train function in the caret package does the job. However, I'll also try to do it following your suggestion.

Best regards,
Bc.Sc.Agri. Alessandro Samuel-Rosa
Postgraduate Program in Soil Science
Federal University of Santa Maria
Av. Roraima, nº 1000, Bairro Camobi, CEP 97105-970
Santa Maria, Rio Grande do Sul, Brazil