Hi
I am most likely making an error in trying to predict with a linear regression (lm) model. Please help me figure out what I am doing wrong. I am trying to regress an index on its constituents. Here is the code:

    # split the time series into two parts
    a <- 300

    x1 <- x[1:a, ]
    y1 <- y[1:a, ]

    x2 <- x[(a + 1):nrow(x), ]
    y2 <- y[(a + 1):nrow(y), ]

    # regression
    m1 <- lm(y1 ~ x1)
    r1 <- residuals(m1)
    coef(m1)

    # out of sample
    y_hat <- predict.lm(m1, x2)
    r2 <- y_hat - y2

x and y are xts objects; x contains multiple time series. y_hat turns out to have only 300 samples, whereas x2 contains 1400 samples.

Please help me figure out how to predict using the model that I have fitted by regression.

--
Regards
Amol

_______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only. If you want to post, subscribe first.
-- Also note that this is not the r-help list where general R questions should go.
Hi Amol,
My guess is that you can't use lm() directly on xts objects. See this:

https://stackoverflow.com/questions/21692560/linear-regression-with-xts-object

Regards,
-Ed

On Sun, Jul 16, 2017 at 4:31 PM, amol gupta <[hidden email]> wrote:
> I am most likely committing an error in trying to predict using linear
> regression lm model. [...]
On Mon, Jul 24, 2017 at 1:10 PM, Ed Herranz <[hidden email]> wrote:
> Hi Amol,
>
> My guess is that you can't use lm() directly on xts objects. See this:
>
> https://stackoverflow.com/questions/21692560/linear-regression-with-xts-object
>

Bad guess. :)

    library(xts)
    data(sample_matrix)
    xtsObject <- as.xts(sample_matrix)
    xtsObject$t <- seq_len(nrow(xtsObject)) - 1
    lm(Open ~ t, data = xtsObject)

> On Sun, Jul 16, 2017 at 4:31 PM, amol gupta <[hidden email]> wrote:
>
>> x,y are xts. X contains multiple time series. The y_ hat turns out to be of
>> 300 samples only, whereas x2 contains 1400 samples.
>>
>> Please help me figure out how to predict using model that I have found
>> using regression.
>>

[...] example. Most people do not have, and will not spend, the time it takes to imagine and create data required to reproduce the issue you describe. Please see: https://stackoverflow.com/q/5963269/271616

--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
R/Finance 2017 | www.rinfinance.com
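Building on the lm() call above, here is a minimal sketch of how out-of-sample prediction might look when the inputs are xts objects. The data are simulated and the names (s1, s2, s3, idx, dat, train, test) are made up for illustration; they are not the original poster's data. The sketch assumes the index and its constituents share the same dates, so that merge() lines them up row by row.

```r
library(xts)

set.seed(1)
n <- 1700
dates <- as.Date("2010-01-01") + seq_len(n) - 1

# Hypothetical constituents: three simulated series in one xts object
x <- xts(matrix(rnorm(n * 3), ncol = 3,
                dimnames = list(NULL, c("s1", "s2", "s3"))),
         order.by = dates)

# Hypothetical index: a noisy linear combination of the constituents
idx_values <- drop(coredata(x) %*% c(0.5, 0.3, 0.2)) + rnorm(n, sd = 0.1)
y <- xts(matrix(idx_values, ncol = 1, dimnames = list(NULL, "idx")),
         order.by = dates)

# Put response and regressors in one data frame with named columns
dat <- as.data.frame(merge(y, x))

a <- 300
train <- dat[1:a, ]               # in sample
test  <- dat[(a + 1):nrow(dat), ] # out of sample

# The formula refers to column names; the data come from the data argument
m1 <- lm(idx ~ ., data = train)

# predict() now finds s1, s2, s3 in newdata, one prediction per row of test
y_hat <- predict(m1, newdata = test)
r2 <- y_hat - test$idx

# Optionally put the predictions back on the time index
y_hat_xts <- xts(unname(y_hat), order.by = index(x)[(a + 1):n])

length(y_hat)
```

Converting to a data frame before fitting is a design choice made here for clarity: it keeps the column names visible to both lm() and predict().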
Hi Amol,
The lm function is not intended to be used the way you are calling it. Although you can pass y and x as actual data in the formula argument (y ~ x), it's better to pass the data set in the data argument and use column names in the formula, especially when you want to use predict on the fitted object: predict.lm looks for the variables named in the formula first in newdata and then in the environment of the formula. In your example, newdata has no variable called x1, so predict falls back on the original x1 with 300 rows, which is why y_hat has length 300.

Now there might be some clever way to get around this with the same function call that you used (you can try making the variable name in newdata match the name used in the formula), but I would rather suggest using this:

    a <- 300
    data_fit <- data.frame(x = matrix(rnorm(1700 * 5), ncol = 5),
                           y = matrix(rnorm(1700)))
    data_fit_is <- data_fit[1:a, ]                   # in sample
    data_fit_os <- data_fit[(a + 1):nrow(data_fit), ] # out of sample
    m1 <- lm(y ~ ., data = data_fit_is)
    length(predict(m1, data_fit_os[, 1:5]))          # should now be 1400, not 300

Regards,
Kshitij Dhingra

--
Kshitij Dhingra
Applied Academics LLC
Website: http://www.AppliedAcademics.com
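The behaviour described in the reply above can be reproduced with a small base-R sketch. It uses plain matrices instead of xts and simulated data, and the object names simply mirror those in the original post. The second half illustrates one possible form of the "clever way" mentioned above (a newdata column whose name matches the formula term); this is an illustration, not a recommendation over the data-frame approach.

```r
set.seed(42)
x <- matrix(rnorm(1700 * 5), ncol = 5)  # simulated regressors
y <- rnorm(1700)                        # simulated response

a  <- 300
x1 <- x[1:a, ]
y1 <- y[1:a]
x2 <- x[(a + 1):nrow(x), ]

# The formula refers to the objects y1 and x1 themselves
m1 <- lm(y1 ~ x1)

# newdata has columns V1..V5 but nothing called "x1", so predict() picks up
# the 300-row x1 from the formula's environment and warns that the number
# of rows does not match
y_hat_wrong <- suppressWarnings(predict(m1, newdata = as.data.frame(x2)))
length(y_hat_wrong)  # 300

# Supplying a matrix column named x1 lets the same model predict on x2
y_hat <- predict(m1, newdata = data.frame(x1 = I(x2)))
length(y_hat)        # 1400
```

Without suppressWarnings(), the first predict() call reports the row mismatch, which is a useful hint that newdata was not actually used.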
All
Thank you all for the responses. I could resolve the issue by separating the formula and the data; that is where I was making the error.

Joshua Ulrich, I will try to send a data set along or otherwise ensure that the problem is reproducible.

On Fri, Jul 28, 2017 at 7:28 PM, Kevin Dhingra <[hidden email]> wrote:
> The lm function is not intended to be used in the way you are calling it.
> Even though you can actually pass y and x as actual data in the formula
> argument (y~x), its better to pass the data set in the data argument and
> use column names in the formula argument [...]

--
Regards
Amol