Hi, I'm quite new to R (1 month full time use so far). I have to run loop regressions VERY often in my work, so I would appreciate some new methodology that I'm not considering.
#---------------------------------------------------------------------------------------------
y<-matrix(rnorm(100),ncol=10,nrow=10)
x<-matrix(rnorm(50),ncol=5,nrow=10)

#Suppose I want to run the specification y=A+Bx+error, regressing each and every y[,n] on each and every x[,n].
#So with:
ncol(y);ncol(x)
#I should end up with 10*5=50 regressions in total.

#I know how to do this fine:
MISC1<-0
for(i in 1:ncol(y)){
  for(j in 1:ncol(x)){
    reg<-lm(y[,i]~x[,j])
    MISC1<-cbind(MISC1,coef(reg)) #for coefficients
  }}
coef<-matrix(MISC1[,-1],ncol=50)

coef[,1];coef(lm(y[,1]~x[,1])) #test passed
ncol(coef) #as desired, 50 regressions.
#---------------------------------------------------------------------------------------------

Now for my question: Are there easier or better methods of doing this? I know of an lapply method, but the only lapply way I know of for lm(..) is basically doing an lapply inside of an lapply, meaning it's exactly the same as the double loop above... I'm looking to escape from loops.

Also, if any of you could share your top R tips that you've learned over the years, I'd really appreciate it. Tiny things like learning that array() can have a 3rd dimension, learning of strsplit, etc. have helped me immeasurably. (Note that I'm also googling for this stuff! I'm doing R 14 hours a day!)

Thanks.
----
Isaac Research Assistant Quantitative Finance Faculty, UTS
You can also get the OLS coefficients with basic matrix operations (https://files.nyu.edu/mrg217/public/ols_matrix.pdf) and thereby avoid one of the loops. I do not know how efficient this is, but I have attached an example you can paste below your code. Here, each x column in turn is used as the right-hand-side variable for all y columns at once. The coefficients match, but they are in a different order.

#----------------------------------------------------------------------------------------------------------------------------
#Original code here....
m1=array(1,nrow(x)) #Creates an array of ones
MISC2<-0
for(j in 1:ncol(x)){
  mX=cbind(m1,x[,j]) #design matrix: intercept column plus x[,j]
  reg2<-solve(t(mX)%*%mX)%*%t(mX)%*%y #normal equations, all y columns at once
  MISC2<-cbind(MISC2,reg2) #for coefficients
}
coef<-matrix(MISC2[,-1],ncol=50)

coef[,1];coef(lm(y[,1]~x[,1])) #test passed
ncol(coef) #as desired, 50 regressions.
MISC1
MISC2
#-------------------------------------------------------------------------------------------------------------------------------------
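For the single-regressor case, the normal equations used above reduce to slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x), so both loops can be dropped. The following is only a sketch under that assumption (exactly one x column plus an intercept per regression); the names `slopes` and `intercepts` and the `set.seed()` call are additions for illustration, not part of the original posts.

```r
set.seed(1)                                  # assumption: seed added only for reproducibility
y <- matrix(rnorm(100), nrow = 10, ncol = 10)
x <- matrix(rnorm(50),  nrow = 10, ncol = 5)

# cov(x, y) is a 5 x 10 matrix: entry [j, i] is cov(x[, j], y[, i]).
# Dividing by the length-5 vector of variances recycles down the columns,
# so row j is divided by var(x[, j]).
slopes <- cov(x, y) / apply(x, 2, var)

# intercept[j, i] = mean(y[, i]) - slopes[j, i] * mean(x[, j])
intercepts <- matrix(colMeans(y), nrow = ncol(x), ncol = ncol(y), byrow = TRUE) -
  slopes * colMeans(x)

# Spot check against lm() for one (x, y) pair
all.equal(unname(coef(lm(y[, 3] ~ x[, 2]))), c(intercepts[2, 3], slopes[2, 3]))
```

This only returns coefficients. If standard errors or residuals are needed, fitting with lm() is probably still the simpler route.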
In reply to this post by iliketurtles
Hello, iliketurtles (?),
for whatever strange reasons you want to regress all y-columns on all x-columns, maybe

reg <- apply( x, 2, function( xx) lm( y ~ xx))
do.call( "cbind", lapply( reg, coef))

does what you want. (To understand what the code above does, check the documentation for lm(): "If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix.")

Hth -- Gerrit
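A variant of the same idea, sketched with lapply() over column indices so that the result is explicitly a list of fits. It also shows one way to put the 50 coefficient columns back into the order produced by the original double loop (y outer, x inner). The names `fits`, `coefs` and `coefs_loop_order` and the `set.seed()` call are made up for this illustration.

```r
set.seed(1)                                  # assumption: seed added only for reproducibility
y <- matrix(rnorm(100), nrow = 10, ncol = 10)
x <- matrix(rnorm(50),  nrow = 10, ncol = 5)

# One multi-response fit per x column; each coef() is a 2 x 10 matrix
fits  <- lapply(seq_len(ncol(x)), function(j) lm(y ~ x[, j]))
coefs <- do.call(cbind, lapply(fits, coef))  # 2 x 50, ordered x-major

# Reorder to the double-loop layout: view as [coefficient, y, x],
# swap the last two dimensions, then flatten back to 2 x 50.
coefs_loop_order <- matrix(
  aperm(array(coefs, dim = c(2, ncol(y), ncol(x))), c(1, 3, 2)),
  nrow = 2
)

dim(coefs_loop_order)
```

There is still one implicit loop over the x columns here; lm() itself handles all y columns in a single call.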
In reply to this post by iliketurtles
Dear anonymous:
1. You may be more likely to get "useful tips" on this list if you sign with your real name. It's friendlier.
2. If you are using R "14 hours/day," get and read a good R book. The CRAN site or Amazon lists many; choose one or more that suit your needs.
3. Read the R Help files carefully. ?lm tells you that you do not need a loop to fit many y's simultaneously: "If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix."
4. Loops are not necessarily so terrible. "apply"-type functions are basically loops, also; their chief advantage is often just code readability, not efficiency.
5. For tasks such as yours, ?update methods are typically useful.

Cheers,
Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
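Point 5 might look roughly like this in practice. It is a minimal sketch assuming the data sit in a data frame; the data frame `d` and its column names (y1, y2, x1, x2) are hypothetical and not taken from the thread.

```r
set.seed(1)                                  # assumption: seed added only for reproducibility
d <- data.frame(y1 = rnorm(10), y2 = rnorm(10),
                x1 = rnorm(10), x2 = rnorm(10))

fit  <- lm(y1 ~ x1, data = d)

# In update(), "." stands for the corresponding side of the old formula
fit2 <- update(fit, . ~ x2)                  # refit as y1 ~ x2
fit3 <- update(fit, y2 ~ .)                  # refit as y2 ~ x1

formula(fit2)
formula(fit3)
```

update() re-evaluates the original call with the modified formula, so other arguments such as `data` carry over unchanged.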
Thanks for the advice everyone. All very helpful.
@Bert Added my information to signature, thanks.
----
Isaac Research Assistant Quantitative Finance Faculty, UTS