Hi
I am most likely making a mistake when predicting with a linear regression (lm) model. Please help me figure out what I am doing wrong. I am trying to regress an index on its constituents. Here is the code:

#split ts into two parts
a<-300;

x1<-x[1:a,];
y1<-y[1:a,];

x2<-x[(a+1):nrow(x),];
y2<-y[(a+1):nrow(y),];

#regression
m1<-lm( y1~x1)
r1<-residuals(m1)
coef(m1)

##out of sample
y_hat<-predict.lm(m1,x2);
r2<-y_hat-y2;

x and y are xts objects; x contains multiple time series. y_hat turns out to have only 300 samples, whereas x2 contains 1400 samples.

Please help me figure out how to predict using the model that I have fitted by regression.
Hi Amol,
My guess is that you can't use lm() directly on xts objects. See this:

https://stackoverflow.com/questions/21692560/linear-regression-with-xts-object

Regards,
-Ed
On Mon, Jul 24, 2017 at 1:10 PM, Ed Herranz <[hidden email]> wrote:

> Hi Amol,
>
> My guess is that you can't use lm() directly on xts objects. See this:
>
> https://stackoverflow.com/questions/21692560/linear-regression-with-xts-object
>
Bad guess. :)

library(xts)
data(sample_matrix)
xtsObject <- as.xts(sample_matrix)
xtsObject$t <- seq_len(nrow(xtsObject))-1
lm(Open ~ t, data=xtsObject)

> Regards,
> -Ed
>
> On Sun, Jul 16, 2017 at 4:31 PM, amol gupta <[hidden email]> wrote:
>
>> Please help me figure out how to predict using model that I have found
>> using regression.
>>
Please provide a reproducible example. Most people do not have, and will not spend, the time it takes to imagine and create data required to reproduce the issue you describe. Please see: https://stackoverflow.com/q/5963269/271616
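Building on the sample_matrix example above, here is a minimal sketch of how out-of-sample prediction could look when the formula and the data are kept separate. The split point (120 rows), the choice of Close as the response, and the names train_df, test_df, and oos are illustrative assumptions, not taken from the original data.

```r
library(xts)
data(sample_matrix)
prices <- as.xts(sample_matrix)      # columns: Open, High, Low, Close

a   <- 120                           # assumed in-sample length
oos <- (a + 1):nrow(prices)          # out-of-sample row positions

# Convert each part to a data.frame so that lm() and predict()
# both look variables up by column name.
train_df <- as.data.frame(prices[1:a, ])
test_df  <- as.data.frame(prices[oos, ])

m1    <- lm(Close ~ Open + High + Low, data = train_df)
y_hat <- predict(m1, newdata = test_df)

# Put the predictions back on the time index and compute residuals.
y_hat_xts <- xts(y_hat, order.by = index(prices)[oos])
r2        <- y_hat_xts - prices[oos, "Close"]

length(y_hat) == nrow(test_df)       # one prediction per out-of-sample row
```

Because the formula refers only to column names, predict() can find the same names in newdata and returns one fitted value per out-of-sample row.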
Hi Amol,
The lm function is not intended to be used the way you are calling it. Although you can pass y and x as actual data objects in the formula argument (y~x), it is better to pass the data set in the data argument and use column names in the formula, especially when you want to call predict on the fitted object. predict.lm looks up the variables named in the formula in newdata and, if they are not found there, in the environment of the formula. In your example newdata has no variable called x1, so predict falls back to the original 300-row x1, which is why y_hat has length 300.

There might be some clever way to get around this with the same function call that you used (you could try making the variable name in newdata match the name used in the formula), but I would rather suggest this:

a<-300
data_fit = data.frame(x = matrix(rnorm(1700*5), ncol = 5), y = matrix(rnorm(1700)))
data_fit_is = data_fit[1:a,] #In Sample
data_fit_os = data_fit[(a+1):nrow(data_fit), ] #Out of Sample
m1 = lm(y~., data = data_fit_is)
length(predict(m1, data_fit_os[, 1:5])) #Should be equal to 1400 now, not 300

Regards,
Kshitij Dhingra
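To make the mechanism concrete, the following sketch reproduces the length-300 result using plain matrices in place of xts objects, then shows the name-matching workaround for the original y1~x1 call. The simulated data, the column names s1 to s3, and the object names m_bad, y_hat_bad, and y_hat_alt are all made up for illustration.

```r
set.seed(1)
n <- 1700
a <- 300

# Simulated stand-ins for the constituents (x) and the index (y).
x <- matrix(rnorm(n * 3), ncol = 3,
            dimnames = list(NULL, c("s1", "s2", "s3")))
y <- drop(x %*% c(0.5, 0.3, 0.2)) + rnorm(n, sd = 0.1)

x1 <- x[1:a, ]
y1 <- y[1:a]
x2 <- x[(a + 1):n, ]

# Same pattern as in the question: the formula names the objects x1 and y1.
m_bad <- lm(y1 ~ x1)

# newdata has columns s1, s2, s3 but nothing called x1, so predict()
# silently reuses the 300-row x1 (R emits a warning, suppressed here).
y_hat_bad <- suppressWarnings(predict(m_bad, as.data.frame(x2)))
length(y_hat_bad)   # 300

# Workaround for the same formula: give newdata a matrix column named x1.
y_hat_alt <- predict(m_bad, newdata = data.frame(x1 = I(x2)))
length(y_hat_alt)   # 1400
```

The workaround does work, but it depends on remembering the exact object name used at fitting time, which is why passing a data frame through the data argument is usually the less fragile choice.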
All
Thank you all for the responses. I resolved the issue by separating the formula from the data; that is where I was going wrong. Joshua Ulrich, I will try to send a data set along or otherwise ensure that the problem is reproducible.
