How to make an x,y dataset from a formula-based entry

trekvana
Hello all,

I am using the formula interface of randomForest:

randomForest(y~x1+x2+...+x39+x40,data=xxx,...)

The issue is that some of the functions in that package don't accept a formula; you have to supply the y vector and the x data explicitly:

randomForest(x=xxx[,c('x1','x2',...,'x40')],y=xxx[,'y'],...)

My question is whether there is a function, or some other way, to tell R to take a formula and build the two corresponding datasets [x,y], so that I don't have to create the x dataset manually from all 40 variables.

There must be a more elegant way to do this than x=xxx[,c('x1','x2',...,'x40')]

Thanks!
George

Jean-Christophe BOUËTTÉ
Hello,
You can check ?model.frame.
I do not know, however, how to extract only the right-hand or left-hand
side of a formula.
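
A minimal sketch of the ?model.frame suggestion, for illustration only: model.frame() evaluates a formula against a data frame and puts the response in the first column, and model.response() pulls that response out. The data frame DF and its values are invented here, and dropping the first column to get the predictors assumes a single, untransformed response.

```r
# Toy data standing in for the poster's data frame (hypothetical values).
set.seed(1)
DF <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

fo <- y ~ x1 + x2 + x3

# model.frame() keeps only the variables named in the formula,
# with the response in the first column.
mf <- model.frame(fo, data = DF)

y <- model.response(mf)        # left-hand side as a (named) vector
x <- mf[, -1, drop = FALSE]    # right-hand side as a data frame

str(y)
names(x)
```

Note that model.frame() drops rows with missing values by default, so x and y stay aligned with each other but may have fewer rows than DF.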

JC

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Helios de Rosario
In reply to this post by trekvana
To separate the parts of a formula, use as.character
(check the examples in ?character)
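
As a small sketch of what this looks like in practice (the formula below is only an example): as.character() on a two-sided formula returns three strings, and indexing the formula with [[ ]] returns the two sides as unevaluated expressions instead of text.

```r
fo <- y ~ x1 + x2 + x3

# For a two-sided formula, as.character() returns three strings:
# the operator, the left-hand side, and the right-hand side.
parts <- as.character(fo)
lhs <- parts[2]
rhs <- parts[3]

# Indexing the formula itself keeps the two sides as language objects.
lhs_expr <- fo[[2]]
rhs_expr <- fo[[3]]

print(parts)
```

The strings are convenient for display or for pasting a new formula together; the language objects are the safer choice if the sides are to be evaluated later.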

Helios

On 22 Sep 2011 16:14:05 -0400, Jean-Christophe BOUËTTÉ <[hidden email]> wrote:
> Hello,
> You can check ?model.frame.
> I do not know however to extract only the right-hand of left-hand part
> of a formula.
>
> JC

INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org

Jean-Christophe BOUËTTÉ
Also, if your formula is really of the form y ~ x1 + ... + xn,
you can have a look at the last example in ?formula for a simple way
to generate the formula.
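
A sketch along the lines of that help-page example, assuming the predictors are literally named x1 to x40 as in the question: the names are pasted into a string and converted with as.formula(). reformulate() is shown as an alternative that takes the character vector directly.

```r
# Hypothetical predictor names x1..x40, matching the question.
xnam <- paste0("x", 1:40)

# Paste the names into "y ~ x1+x2+...+x40" and convert to a formula.
fmla <- as.formula(paste("y ~", paste(xnam, collapse = "+")))

# reformulate() builds the same kind of formula from a character vector.
fmla2 <- reformulate(xnam, response = "y")

length(all.vars(fmla))
```

The same character vector xnam can also be used to index the data frame directly, which is often all that the x/y interface needs.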
HTH,
JC

2011/9/23 Helios de Rosario <[hidden email]>:

> To separate the parts of a formula, use as.character
> (check the examples in ?character)
>
> Helios

Gabor Grothendieck
In reply to this post by trekvana
On Thu, Sep 22, 2011 at 2:54 PM, trekvana <[hidden email]> wrote:

> There must be a more elegant way to do this than
> x=xxx[,c('x1','x2',...,'x40')]

We assume that the formula is of the form:

fo <- y ~ x1 + x2 + x3

Now if we set:

v <- all.vars(fo)

and if DF is our data frame then DF[, v[1]] and DF[v[-1]] are the
response and predictors.  (You may need to add an intercept to the
predictors and convert the predictors from data frame to a matrix
depending on what you intend to do next.)
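
For illustration, here is the same idea run end to end on a made-up data frame (DF, its values, and the extra column unused are assumptions for this sketch). It relies on the response being the first variable in the formula, and it will not work with a formula such as y ~ ., because all.vars() returns the dot literally.

```r
# Toy data frame (hypothetical); any data frame with these columns works.
set.seed(42)
DF <- data.frame(y = rnorm(8), x1 = rnorm(8), x2 = rnorm(8), x3 = rnorm(8),
                 unused = rnorm(8))

fo <- y ~ x1 + x2 + x3
v <- all.vars(fo)            # variable names in order of appearance

y <- DF[, v[1]]              # response as a plain vector
x <- DF[v[-1]]               # predictors as a data frame

# If a numeric matrix is needed, without or with an intercept column:
x_mat <- as.matrix(x)
x_int <- model.matrix(fo, data = DF)   # first column is "(Intercept)"

print(v)
dim(x_mat)
```

The resulting x and y are in the shape the question asks for, so they could be passed on as randomForest(x = x, y = y, ...) without typing out the 40 column names.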

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
