metaprogramming with lm


June Kim
Hello,

Say I want to make a multiple regression model with the following expression:

lm(y~x1 + x2 + x3 + ... + x_n,data=mydata)

It gets tedious to type out all the independent variables, in this
case the x_i. Is there a simple way to do this with metaprogramming?
(In some cases the names of the independent variables follow an
obvious pattern; in others they do not.)

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: metaprogramming with lm

Erik Iverson
The special name "." may be used on the right side of the "~" operator,
to stand for all the variables in a data.frame other than the response.

--John Chambers, Statistical Models in S, p. 101

So, if the y and Xi (in your case) were the only variables in mydata, then

lm(y ~ . , data = mydata)

would be of use.
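
If mydata holds a few extra columns, the dot can still be used by subtracting the unwanted terms. A minimal sketch, assuming a made-up data frame with an `id` column that should not enter the model (the column names and data here are hypothetical, not from the original question):

```r
# Hypothetical data: a response y, three predictors, and an id column.
set.seed(1)
mydata <- data.frame(id = 1:20, y = rnorm(20),
                     x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))

# "." expands to every column except the response; "- id" removes that term.
fit <- lm(y ~ . - id, data = mydata)
names(coef(fit))
```

Only the intercept and x1, x2, x3 should appear among the coefficients.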

Erik

Re: metaprogramming with lm

Bill.Venables
In reply to this post by June Kim
Two possible ways around this are

1. If the x's are *all* the other variables in your data frame you can use a dot:

fm <- lm(y ~ ., data = mydata)

2. Here is another idea

> as.formula(paste("y~", paste("x",1:10, sep="", collapse="+")))
y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10

(You bore easily!)
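
For names that follow no pattern, the same idea works from any character vector. A small sketch using reformulate() from the stats package, which builds the formula from term labels without hand-written paste() calls; the predictor names below are invented for illustration:

```r
# Hypothetical predictor names with no common pattern; terms may be expressions.
predictors <- c("age", "height", "log(income)")

# reformulate() joins the term labels with "+" and attaches the response.
form <- reformulate(predictors, response = "y")
form
```

The resulting object is an ordinary formula and can be passed to lm() together with a data argument.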


Bill Venables
http://www.cmis.csiro.au/bill.venables/


Re: metaprogramming with lm

Simon Blomberg-4
In reply to this post by June Kim
You can construct the formula on the fly. Say you have a data frame with
columns y, x1, ..., x10:

dat <- data.frame(matrix(rnorm(1100), ncol=11, dimnames=list(NULL,c("y",
paste("x", 1:10, sep="")))))

Then you could construct the formula using:

form <- formula(paste("y ~ ", paste(names(dat)[which(names(dat) !=
"y")], collapse="+")))

fit <- lm(form, data=dat)
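
The same construction can be narrowed to columns whose names match a pattern. A rough sketch, assuming the data frame layout above and an arbitrary choice of x1 to x3 (the regular expression is only an example):

```r
set.seed(42)
dat <- data.frame(matrix(rnorm(1100), ncol = 11,
                         dimnames = list(NULL, c("y", paste0("x", 1:10)))))

# Every column except the response.
all_x <- setdiff(names(dat), "y")

# Only the columns whose names match a pattern (here x1, x2, x3).
some_x <- grep("^x[1-3]$", names(dat), value = TRUE)

form <- as.formula(paste("y ~", paste(some_x, collapse = " + ")))
fit <- lm(form, data = dat)
```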

HTH,

Simon.


--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for
an answer does not ensure that a reasonable answer can
be extracted from a given body of data. - John Tukey.

Re: metaprogramming with lm

Simon Blomberg-4
In reply to this post by Bill.Venables
Yet again my baroque programming style shows itself. The . notation is
great, although solution 2. is perhaps more versatile, allowing you to
pick and choose your predictors more easily.
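
As an illustration of that versatility, formulas built from strings make it easy to fit several candidate models in one go. A sketch under invented assumptions (the data and the three predictor sets are hypothetical):

```r
set.seed(7)
dat <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

# Hypothetical candidate predictor sets, one model per set.
candidates <- list(small  = "x1",
                   medium = c("x1", "x2"),
                   full   = c("x1", "x2", "x3"))

fits <- lapply(candidates, function(vars) {
  lm(as.formula(paste("y ~", paste(vars, collapse = " + "))), data = dat)
})

# Compare the fits, for example by AIC.
sapply(fits, AIC)
```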
