

Hi, I'm quite new to R (1 month full time use so far). I have to run loop regressions VERY often in my work, so I would appreciate some new methodology that I'm not considering.
#
y<-matrix(rnorm(100),ncol=10,nrow=10)
x<-matrix(rnorm(50),ncol=5,nrow=10)
#Suppose I want to run the specification y=A+Bx+error, for each and every y[,n] onto each and every x[,n].
#So with:
ncol(y);ncol(x)
#I should end up with 10*5=50 regressions in total.
#I know how to do this fine:
MISC1<-0
for(i in 1:ncol(y)){
for(j in 1:ncol(x)){
reg<-lm(y[,i]~x[,j])
MISC1<-cbind(MISC1,coef(reg)) #for coefficients
}}
coef<-matrix(MISC1[,-1],ncol=50) #drop the placeholder first column
coef[,1];coef(lm(y[,1]~x[,1])) #test passed
ncol(coef) #as desired, 50 regressions.
#
Now for my question: Are there easier or better methods of doing this? I know of a lapply method, but the only lapply way I know of for lm(..) is basically doing a lapply inside of a lapply, meaning it's exactly the same as the double loop above... I'm looking to escape from loops.
Also, if any of you could share your top R tips that you've learned over the years, I'd really appreciate it. Tiny things like learning that array() and matrix() can have a 3rd dimension, learning of strsplit, etc. have helped me immeasurably. (Note that I'm also googling for this stuff! I'm doing R 14 hours a day!)
Thanks.

Isaac
Research Assistant
Quantitative Finance Faculty, UTS


You can get the OLS coefficients with basic matrix operations as well (
https://files.nyu.edu/mrg217/public/ols_matrix.pdf) and by that avoid one
of the loops. I do not know how efficient this is, but I have attached an
example you can paste below your code. Here, one x-array is used as a
right-hand-side variable for all y-arrays in each loop.
The coefficients match, but they are in a different order.
#
#Original code here....
m1=array(1,nrow(x))
#Creates an array of ones
MISC2<-0
for(j in 1:ncol(x)){
mX=cbind(m1,x[,j])
reg2<-solve(t(mX)%*%mX)%*%t(mX)%*%y
MISC2<-cbind(MISC2,reg2) #for coefficients
}
coef<-matrix(MISC2[,-1],ncol=50)
coef[,1];coef(lm(y[,1]~x[,1])) #test passed
ncol(coef)
#as desired, 50 regressions.
MISC1
MISC2
#
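A minimal sketch of the same matrix idea, written to make the "different order" remark concrete. It is not the poster's code: the seed, the name coef_mat and the preallocated layout are assumptions added for illustration, and solve() is called with two arguments so that the inverse of X'X is never formed explicitly.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
y <- matrix(rnorm(100), ncol = 10, nrow = 10)
x <- matrix(rnorm(50), ncol = 5, nrow = 10)

# Preallocate: one 2 x 10 block of coefficients per x column, side by side
coef_mat <- matrix(NA_real_, nrow = 2, ncol = ncol(x) * ncol(y))
for (j in seq_len(ncol(x))) {
  mX <- cbind(1, x[, j])                        # intercept column plus x[, j]
  cols <- (j - 1) * ncol(y) + seq_len(ncol(y))  # target columns for this x
  # Solve (X'X) B = X'Y for all ten y columns in one step
  coef_mat[, cols] <- solve(crossprod(mX), crossprod(mX, y))
}

# Column (j - 1) * 10 + i holds the fit of y[, i] on x[, j],
# so column 13 should agree with lm(y[, 3] ~ x[, 2])
all.equal(coef_mat[, 13], unname(coef(lm(y[, 3] ~ x[, 2]))))
```

In this layout x varies slowest, whereas the double loop in the original post varies y slowest, which is why the two results hold the same numbers in a different column order.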
2011/12/26 iliketurtles < [hidden email]>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Hello, iliketurtles (?),
for whatever strange reasons you want to regress all y-columns on all
x-columns, maybe
reg <- apply( x, 2, function( xx) lm( y ~ xx))
do.call( "cbind", lapply( reg, coef))
does what you want. (To understand what the code above does, check the
documentation for lm(): "If response is a matrix a linear model is fitted
separately by least-squares to each column of the matrix.")
Hth -- Gerrit
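To see the matrix-response behaviour of lm() in isolation, here is a small sketch that checks one of the fifty coefficient pairs against a single-response fit. The seed and the names fit, coef_all and single are assumptions for illustration; lapply() over column indices is used in place of apply() only to make it explicit that a list of fits comes back.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
y <- matrix(rnorm(100), ncol = 10, nrow = 10)
x <- matrix(rnorm(50), ncol = 5, nrow = 10)

# A matrix response gives one fit per column of y in a single call
fit <- lm(y ~ x[, 1])
dim(coef(fit))  # rows: intercept and slope; columns: one per column of y

# One multi-response fit per x column, coefficients bound side by side
reg <- lapply(seq_len(ncol(x)), function(j) lm(y ~ x[, j]))
coef_all <- do.call(cbind, lapply(reg, coef))

# Spot check: y[, 4] on x[, 3] sits in column (3 - 1) * 10 + 4
single <- coef(lm(y[, 4] ~ x[, 3]))
all.equal(unname(coef_all[, (3 - 1) * ncol(y) + 4]), unname(single))
```

Only the loop over the x columns remains; the loop over the y columns is handled inside lm().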


Dear anonymous:
1. You may be more likely to get "useful tips" on this list if you
sign with your real name. It's friendlier.
2. If you are using R "14 hours/day", get and read a good R book. The
CRAN site or Amazon lists many; choose one or more that suit your
needs.
3. Read the R Help files carefully. ?lm tells you that you do not
need a loop to fit many y's simultaneously:
"If response is a matrix a linear model is fitted separately by
least-squares to each column of the matrix."
4. Loops are not necessarily so terrible. "apply"-type functions are
basically loops, also; their chief advantage is often just code
readability, not efficiency.
5. For tasks such as yours, ?update methods are typically useful.
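As a rough illustration of point 5, the sketch below refits a model with a different regressor through update() instead of writing out a new lm() call. The data frame dat and its column names y1, x1 and x2 are hypothetical, made up for this example only.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
# Hypothetical data: one response and two candidate regressors
dat <- data.frame(y1 = rnorm(10), x1 = rnorm(10), x2 = rnorm(10))

base_fit <- lm(y1 ~ x1, data = dat)

# "." on the left-hand side means "keep the response as it was"
fit_x2 <- update(base_fit, . ~ x2)
all.equal(coef(fit_x2), coef(lm(y1 ~ x2, data = dat)))

# The same idea over several regressors, building each formula from a name
vars <- c("x1", "x2")
fits <- lapply(vars, function(v) update(base_fit, as.formula(paste(". ~", v))))
lapply(fits, coef)
```

This is still a loop underneath, in line with point 4; the gain is that the model specification is written once and only the changing part is stated.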
Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 4677374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdbfunctionalgroups/pdbbiostatistics/pdbncbhome.htm


Thanks for the advice everyone. All very helpful.
@Bert
Added my information to signature, thanks.

Isaac
Research Assistant
Quantitative Finance Faculty, UTS

