

Hi, I'm quite new to R (1 month of full-time use so far). I have to run looped regressions VERY often in my work, so I would appreciate pointers to any methods I'm not considering.
#
y <- matrix(rnorm(100), ncol=10, nrow=10)
x <- matrix(rnorm(50), ncol=5, nrow=10)
# Suppose I want to run the specification y = A + Bx + error, regressing each y[,i] on each x[,j].
# So with:
ncol(y); ncol(x)
# I should end up with 10*5 = 50 regressions in total.
# I know how to do this fine:
MISC1 <- 0
for(i in 1:ncol(y)){
for(j in 1:ncol(x)){
reg <- lm(y[,i]~x[,j])
MISC1 <- cbind(MISC1, coef(reg)) # for coefficients
}}
coef <- matrix(MISC1[,-1], ncol=50) # drop the placeholder first column of zeros
coef[,1]; coef(lm(y[,1]~x[,1])) # test passed
ncol(coef) # as desired, 50 regressions.
#
Now for my question: are there easier or better methods of doing this? I know of an lapply method, but the only lapply way I know of for lm(..) is basically an lapply inside an lapply, which is exactly the same as the double loop above... I'm looking to escape from loops.
Also, if any of you could share the top R tips you've learned over the years, I'd really appreciate it. Tiny things like learning that array() can have a 3rd dimension, learning of strsplit, etc., have helped me immeasurably. (Note that I'm also googling for this stuff! I'm doing R 14 hours a day!)
Thanks.

Isaac
Research Assistant
Quantitative Finance Faculty, UTS


You can also get the OLS coefficients with basic matrix operations
(https://files.nyu.edu/mrg217/public/ols_matrix.pdf) and thereby avoid one
of the loops. I do not know how efficient this is, but I have attached an
example you can paste below your code. Here, each column of x is used as the
right-hand-side variable for all columns of y in a single pass of the loop.
The coefficients match, but they are in a different order.
#
#Original code here....
m1 <- array(1, nrow(x))
# Creates an array of ones (the intercept column)
MISC2 <- 0
for(j in 1:ncol(x)){
mX <- cbind(m1, x[,j])
reg2 <- solve(t(mX)%*%mX)%*%t(mX)%*%y # 2 x 10: coefficients for every y column
MISC2 <- cbind(MISC2, reg2) # for coefficients
}
coef <- matrix(MISC2[,-1], ncol=50) # drop the placeholder first column of zeros
coef[,1]; coef(lm(y[,1]~x[,1])) # test passed
ncol(coef)
# as desired, 50 regressions.
MISC1
MISC2
#
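The same normal-equations idea can be sketched without growing a matrix by repeated cbind(). This is only an illustration, not a benchmark: the seed, the helper name fit_one_x and the index arithmetic are assumptions added here, and crossprod(X) is just base R shorthand for t(X) %*% X. It also makes the "different order" explicit: the columns are grouped by x column first, then by y column.

```r
set.seed(1)                                  # assumption: fixed seed for a repeatable run
y <- matrix(rnorm(100), nrow = 10, ncol = 10)
x <- matrix(rnorm(50),  nrow = 10, ncol = 5)

# Hypothetical helper: one solve per x column, all y columns handled at once.
fit_one_x <- function(xj, y) {
  X <- cbind(1, xj)                          # design matrix: intercept + one regressor
  solve(crossprod(X), crossprod(X, y))       # 2 x ncol(y) block of coefficients
}

coef_by_x <- do.call(cbind, lapply(seq_len(ncol(x)),
                                   function(j) fit_one_x(x[, j], y)))
dim(coef_by_x)                               # 2 rows (intercept, slope), 50 columns

# Column (j - 1) * ncol(y) + i holds the fit of y[, i] on x[, j].
i <- 3; j <- 2
block_col <- (j - 1) * ncol(y) + i
all.equal(unname(coef_by_x[, block_col]),
          unname(coef(lm(y[, i] ~ x[, j]))))
```

In the double-loop version the matching column would instead be (i - 1) * ncol(x) + j, since there the y index varies slowest.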






Hello, iliketurtles (?),
for whatever strange reasons you want to regress all y-columns on all
x-columns, maybe
reg <- apply( x, 2, function( xx) lm( y ~ xx))
do.call( "cbind", lapply( reg, coef))
does what you want. (To understand what the code above does, check the
documentation for lm(): "If response is a matrix a linear model is fitted
separately by least-squares to each column of the matrix.")
Hth  Gerrit
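As a small variation on the matrix-response approach, the 50 coefficient pairs can be kept in a labelled 3-d array so that each fit can be looked up by name. This is a sketch under assumptions: the column labels y1..y10 and x1..x5, the seed, and the use of sapply(..., simplify = "array") are choices made here for illustration, not part of the suggestion above.

```r
set.seed(1)                                       # assumption: fixed seed
y <- matrix(rnorm(100), nrow = 10, ncol = 10)
x <- matrix(rnorm(50),  nrow = 10, ncol = 5)
colnames(y) <- paste0("y", seq_len(ncol(y)))      # hypothetical labels
colnames(x) <- paste0("x", seq_len(ncol(x)))

# One lm() call per x column; the matrix response fits all 10 y columns at once.
fits <- lapply(seq_len(ncol(x)), function(j) lm(y ~ x[, j]))
names(fits) <- colnames(x)

# Each coef() is a 2 x 10 matrix; stack the five of them along a third dimension.
coef_arr <- sapply(fits, coef, simplify = "array")
dim(coef_arr)                                     # 2 (intercept, slope) x 10 y's x 5 x's

# Look up one regression by name and compare with a direct fit.
all.equal(unname(coef_arr[, "y3", "x2"]),
          unname(coef(lm(y[, 3] ~ x[, 2]))))
```

The array layout avoids having to remember the column ordering of a flat 2 x 50 matrix.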



Dear anonymous:
1. You may be more likely to get "useful tips" on this list if you
sign with your real name. It's friendlier.
2. If you are using R "14 hours/day," get and read a good R book. The
CRAN site or Amazon lists many; choose one or more that suits your
needs.
3. Read the R Help files carefully. ?lm tells you that you do not
need a loop to fit many y's simultaneously:
"If response is a matrix a linear model is fitted separately by
least-squares to each column of the matrix."
4. Loops are not necessarily so terrible. "apply" type functions are
basically loops, also; their chief advantage is often just code
readability, not efficiency.
5. For tasks such as yours, ?update methods are typically useful.
Cheers,
Bert




Bert Gunter
Genentech Nonclinical Biostatistics
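To illustrate point 5, here is a minimal sketch of refitting a model with update(). The data frame dat, its column names and the seed are assumptions made up for this example; it also assumes the model was fitted with a data= argument, which is what lets update() re-evaluate the call with a new formula.

```r
set.seed(1)                                   # assumption: fixed seed
dat <- data.frame(matrix(rnorm(60), nrow = 10))
names(dat) <- c("y1", paste0("x", 1:5))       # hypothetical column names

base_fit <- lm(y1 ~ x1, data = dat)

# ". ~ x2" keeps the response and swaps the right-hand side.
refits <- lapply(paste0("x", 2:5), function(v) {
  update(base_fit, as.formula(paste(". ~", v)))
})
names(refits) <- paste0("x", 2:5)

# Slopes of the refitted models, one per regressor.
sapply(refits, function(f) unname(coef(f)[2]))
```

As point 4 notes, lapply() here is still a loop; the gain is mainly that the model specification is written once and only the changing part is spelled out.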

Thanks for the advice everyone. All very helpful.
@Bert
Added my information to signature, thanks.

Isaac
Research Assistant
Quantitative Finance Faculty, UTS

