# repeating regression

 Hi, I think my problem is a bit mundane, but it's quite intriguing. Imagine I have a matrix with 2 million rows and 10 columns. The first 5 columns are x values and the last 5 are y values. I have to regress y on x (assume a zero intercept) for each row to observe a time series of the slope. Is there any way to speed this calculation up? I tried apply, but it is still slow. Is there any trick I should know? Thank you. Robert
 You can probably specify the problem with model.matrix and use lm.fit directly, but what's probably even better is to remember that, with one independent variable, the slope has a closed form you can implement directly. For a fit with an intercept it is correlation * sd_y / sd_x; with the intercept forced to zero, as you describe, it is sum(x * y) / sum(x^2) instead. E.g., something like `apply(Data, 1, function(r) sd(r[6:10]) / sd(r[1:5]) * cor(r[1:5], r[6:10]))` for the intercept case, or `apply(Data, 1, function(r) sum(r[1:5] * r[6:10]) / sum(r[1:5]^2))` for the zero-intercept case. You can do this even faster by unrolling x and y into two long vectors, taking a rolling sd and cor with window length 5, and keeping every 5th value, but it's after midnight and I'm doing a disastrous OJ project (don't ask....) for school so I can't really think right now. Michael
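 Because the per-row slope has a closed form, the whole matrix can be processed without apply. Below is a minimal sketch in base R, assuming the data sit in a numeric matrix called Data with x in columns 1:5 and y in columns 6:10. The small n and the random values are stand-ins for the real 2 million rows, and the names X, Y, slope0 and slope1 are only illustrative.

```r
set.seed(1)
n <- 1000                                  # stand-in for the 2 million rows
Data <- matrix(rnorm(n * 10), nrow = n, ncol = 10)

X <- Data[, 1:5]                           # x values, one regression per row
Y <- Data[, 6:10]                          # y values, one regression per row

# Zero-intercept slope for every row at once: sum(x * y) / sum(x^2)
slope0 <- rowSums(X * Y) / rowSums(X^2)

# Slope of a fit WITH an intercept: centre each row, then use the same formula.
# Subtracting a length-n vector from an n-row matrix recycles down the columns,
# so every element of row i has that row's mean removed.
Xc <- X - rowMeans(X)
Yc <- Y - rowMeans(Y)
slope1 <- rowSums(Xc * Yc) / rowSums(Xc^2)

# Spot-check the first row against lm.fit
coef0 <- unname(lm.fit(cbind(X[1, ]), Y[1, ])$coefficients)
coef1 <- unname(lm.fit(cbind(1, X[1, ]), Y[1, ])$coefficients[2])
```

 rowSums and rowMeans loop in compiled code, so this avoids the overhead of calling an R function once per row, which is where apply spends most of its time.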
 You may be interested in the fastLm function from the RcppArmadillo package.
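 For reference, a single zero-intercept fit with fastLm might look like the sketch below. It assumes RcppArmadillo is installed and uses the matrix interface fastLm(X, y); the variables x, y, slope_ref and slope_fast are made up for illustration. Calling it once per row still costs one R-level call per regression, so for millions of 5-point fits a closed-form formula is likely the faster route.

```r
set.seed(2)
x <- rnorm(5)                              # one row's x values (illustrative)
y <- 2 * x + rnorm(5, sd = 0.1)            # one row's y values (illustrative)

# Closed-form zero-intercept slope, used here as a reference value
slope_ref <- sum(x * y) / sum(x^2)

if (requireNamespace("RcppArmadillo", quietly = TRUE)) {
  # A one-column design matrix with no column of ones means no intercept
  fit <- RcppArmadillo::fastLm(cbind(x), y)
  slope_fast <- as.numeric(fit$coefficients)
}
```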