Hello,
I am synthesising some sales data over a twelve-month period, and then trying to use the "predict" function, firstly to generate a thirteenth-month forecast with upper and lower 95% confidence limits. So far so good.

But what I then want to do is add the upper sales value at the 95% confidence limit to the vector of thirteen months and their respective sales, to create a fourteenth month with a predicted sale and its 95% upper confidence limit, and so on. The idea is to create a "trumpet" of extreme positions.

But instead of one line of predictions for the fourteenth month, I keep getting a whole set. What I don't understand is why it works OK with my original synthetic set of twelve months, but doesn't like the set of thirteen sales data points, even though as far as I can see I'm just repeating the process, albeit with a different label. I have tried using different column labels in case that was the problem, but it doesn't seem to make any difference.

I am also getting these weird warning messages telling me that things are being "masked":

    The following object is masked _by_ .GlobalEnv:

        sales

    The following object is masked from highdf (pos = 4):

        sales

    Etc.

Is it something to do with attaching the various data frames?
I am a bit at sea on this and would be thankful for any pointers.

Nick

My code:

    m<-runif(1,0,1)
    m
    mres<-m*(seq(1,12))
    mres
    ssd<-rexp(1,1)
    ssd
    devs<-rep(0,length(mres))
    for(i in 1:length(mres)){devs[i]<-rnorm(1,0,ssd)}
    devs
    plot(-10,-10,xlim=c(1,24),ylim=c(0,20000))
    sales<-round((mres+devs)*1000)

    points(sales,pch=19)

    ptr<-cbind(1:length(sales),sales,sales,sales)

    ptr
    sdf<-data.frame(cbind(1:nrow(ptr),sales))
    sdf

    colnames(sdf)<-c("monat","mitte")
    sdf
    attach(sdf)
    s.lm<-lm(mitte~monat)

    s.lm
    abline(s.lm,lty=2)
    news<-data.frame(monat=nrow(sdf)+1)
    news
    fcs<-predict(s.lm,news,interval="predict")
    fcs

    points(1+nrow(ptr),fcs[,1],col="grey",pch=19)
    points(1+nrow(ptr),fcs[,2])
    points(1+nrow(ptr),fcs[,3])
    ptr<-rbind(ptr,c(1+nrow(ptr),fcs[2],fcs[1],fcs[3]))
    ptr

    highdf<-data.frame(ptr[,c(1,4)])
    highdf
    colnames(highdf)<-c("month","sales")
    highdf

    attach(highdf)
    h.lm<-lm(highdf[,2]~highdf[,1])
    h.lm
    abline(h.lm,col="gray",lty=2)
    news<-data.frame(month=nrow(ptr)+1)
    news
    hcs<-predict(h.lm,news,interval="predict")
    hcs
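The masking messages come from attaching your data frames to the R session with attach(). Your workspace already holds a vector called sales, so when highdf (which also has a sales column) is attached, R reports that one copy hides the other. In general, attach() is bad practice because it leads to confusing code. It is typically better to pass the data frame through the "data" argument of functions such as lm().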
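A minimal sketch of the "data" argument approach, with no attach() call anywhere. The data frame toy, its values, and the seed are made up for illustration and are not from the original post:

```r
# Illustrative data only: 12 months of synthetic sales (assumed values).
set.seed(1)
toy <- data.frame(month = 1:12)
toy$sales <- round(500 * toy$month + rnorm(12, sd = 300))

# Fit without attach(): lm() looks up month and sales inside toy.
fit <- lm(sales ~ month, data = toy)

# The new data must use the same column name as the formula ("month").
new_month <- data.frame(month = 13)
fc <- predict(fit, new_month, interval = "prediction")
print(fc)  # a single row with columns fit, lwr, upr
```

Because nothing is attached, a stray sales object in the workspace cannot shadow the column, and no masking messages should appear.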
As near as I can tell, your second set of predictions is not working because your call to lm() directly references vectors from the highdf data frame. If you do this:

    h.lm <- lm(sales ~ month, data = highdf)
    news <- data.frame(month = nrow(ptr) + 1)
    hcs <- predict(h.lm, news, interval = "predict")

you should see the expected results. Note that here I'm referring directly to the variables "sales" and "month" and not using the bracket notation.

> On Jan 31, 2018, at 11:08 AM, WRAY NICHOLAS via R-help <[hidden email]> wrote:
> [...]
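The difference between the bracket notation and the named-variable formula can be checked on a small made-up data set. The sketch below is illustrative only: the values, the seed, and the names bad.lm, good.lm, and trumpet are assumptions, and the closing loop is just one guess at how the "and so on" step from the question might be repeated.

```r
# Illustrative data only (assumed values, not the poster's).
set.seed(42)
highdf <- data.frame(month = 1:13)
highdf$sales <- round(600 * highdf$month + rnorm(13, sd = 400))
news <- data.frame(month = nrow(highdf) + 1)

# Bracket notation: the model term is literally `highdf[, 1]`, so predict()
# cannot match it to the month column in news. It re-evaluates
# highdf[, 1] from the workspace and returns one row per original month
# (with a warning, suppressed here).
bad.lm <- lm(highdf[, 2] ~ highdf[, 1])
bad <- suppressWarnings(predict(bad.lm, news, interval = "prediction"))
nrow(bad)   # 13

# Named variables plus data=: predict() finds month in news.
good.lm <- lm(sales ~ month, data = highdf)
good <- predict(good.lm, news, interval = "prediction")
nrow(good)  # 1

# Repeating the step: append each upper limit as the next month's sales,
# refit, and forecast again (three extra months here).
trumpet <- highdf
for (i in 1:3) {
  fit <- lm(sales ~ month, data = trumpet)
  nxt <- data.frame(month = nrow(trumpet) + 1)
  p <- predict(fit, nxt, interval = "prediction")
  nxt$sales <- unname(p[1, "upr"])  # upper limit becomes the new data point
  trumpet <- rbind(trumpet, nxt)
}
```

Keeping every refit inside one data frame with fixed column names means the same formula and the same newdata column work at every step.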
