11 messages

Hi,

I have a follow-up question about subsetting list items. After using the list and min(sapply()) method to adjust the length of the variables, I specify a dynamic regression equation using the variables in the list. My list looks like this:

    Dcr <- list(Dcre1=DCred1, Dcre2=DCred2, Dcre3=DCred3, Dbobc1=DBoBC1, Dbobc2=DBoBC2, Dbobc3=DBoBC3, ...)

By naming the list items, I thought I could reference them (or subset the list) as, e.g., Dcr$Dcre1 to get DCred1, Dcr$Dbobc1 to get DBoBC1, etc., so that the explanatory variables of the equation can be easily associated with their original names. This way, I would avoid specifying the list as Dcr <- list(Dcr1, Dcr2, Dcr3, ..., Dcr15) and then subsetting it using Dcr[[1]][1:29], Dcr[[2]][1:29], ..., Dcr[[15]][1:29], because the list has many variables (15) and referencing the variables by number makes them lose their original names.

When I specify the list as Dcr <- list(Dcr1, Dcr2, ..., Dcr15), the regression equation specified as:

    # Regression
    regCred<- lm(Dcr[[1]][1:29]~Dcr[[2]][1:29]+Dcr[[3]][1:29]+Dcr[[4]][1:29]+Dcr[[5]][1:29]+Dcr[[6]][1:29]+...)

runs without problems. The results are shown below:

    Call:
    lm(formula = Dcr[[1]][1:29] ~ Dcr[[2]][1:29] + Dcr[[3]][1:29] +
        Dcr[[4]][1:29] + Dcr[[5]][1:29] + Dcr[[6]][1:29])

    Residuals:
        Min      1Q  Median      3Q     Max
    -86.293 -33.586  -9.969  40.147 117.965

    Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
    (Intercept)    81.02064   13.28632   6.098 3.21e-06 ***
    Dcr[[2]][1:29] -0.97407    0.11081  -8.791 8.20e-09 ***
    Dcr[[3]][1:29] -0.27950    0.05899  -4.738 8.95e-05 ***
    Dcr[[4]][1:29] -0.07961    0.04856  -1.639    0.115
    Dcr[[5]][1:29] -0.07180    0.05515  -1.302    0.206
    Dcr[[6]][1:29] -0.01562    0.02086  -0.749    0.462

But when I specify the list with names as shown above, the equation does not run, as shown by the following error messages:

    > # Regression
    > regCred<- lm(Dcr[[1]][1:29]~Dcr[[2]][1:29]+Dcr[[3]][1:29]+Dcr[[4]][1:29]+
    + Dcr[[5]][1:29]+Dcr$Dbobc3)
    Error in model.frame.default(formula = Dcr[[1]][1:29] ~ Dcr[[2]][1:29] +  :
      variable lengths differ (found for 'Dcr$Dbobc3')
    > Dcr[[5]][1:29]+Dcr$Dbobc3[1:29])
    Error: unexpected ')' in "Dcr[[5]][1:29]+Dcr$Dbobc3[1:29])"

NB: In the equation with the error message, only the last term is specified by referencing its name (i.e., Dcr$Dbobc3[1:29]). Also note that the error occurs whether I append '[1:29]' to Dcr$Dbobc or not. How do I resolve this?

Thanks.
Lexi

NB: I tried typing the above in the same email Petr used to reply to me, but the email could not be delivered due to size problems.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
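The two errors in the post above look like they have different causes, and a small sketch may help separate them. The first ("variable lengths differ") is what `lm()` reports when one term is left at full length while the others are cut to `[1:29]`. The second ("unexpected ')'") is a syntax error that appears when only the continuation line of a call is retyped at the prompt. The list below is hypothetical (the names `y`, `x1`, `x2` and the lengths are assumptions, since the real series were not posted); it only illustrates the first case.

```r
# Hypothetical data: three series of unequal length (names are illustrative only)
set.seed(1)
Dcr <- list(y = rnorm(31), x1 = rnorm(30), x2 = rnorm(29))

# `$` and `[[` return the same element of a named list
same_by_name  <- identical(Dcr$x2, Dcr[["x2"]])
same_by_index <- identical(Dcr$x2, Dcr[[3]])

# Common length: the shortest series
n <- min(sapply(Dcr, length))

# Mixing a truncated term with an untruncated one triggers the length error
res <- tryCatch(
  lm(Dcr$y[1:n] ~ Dcr$x1),
  error = function(e) conditionMessage(e)
)
res

# Truncating every term to the common length works, by name or by position
fit <- lm(Dcr$y[1:n] ~ Dcr$x1[1:n] + Dcr$x2[1:n])
coef(fit)
```

Under these assumptions, naming the list elements is not what breaks the model; the untruncated term is.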

## Re: Adjusting length of series

On Jun 30, 2012, at 6:04 PM, Lekgatlhamang, lexi Setlhare wrote:

> Hi
> I have a follow up question, relating to subsetting to list items. After using the list and min(sapply()) method to adjust the length of the variables, I specify a dynamic regression equation using the variables in the list. My list looks like this:
> Dcr<- list(Dcre1=DCred1,Dcre2=DCred2,Dcre3=DCred3,Dbobc1=DBoBC1,Dbobc2=DBoBC2,Dbobc3=DBoBC3,...)

This should have been done like this:

    Dcr<- data.frame(Dcre1=DCred1, Dcre2=DCred2, Dcre3=DCred3,
                     Dbobc1=DBoBC1, Dbobc2=DBoBC2, Dbobc3=DBoBC3)

> By specifying the list items with names, I thought I could end by referencing them (or subsetting the list) as, eg., Dcr$Dcre1 and get DCred1, Dcr$Dbobc1 and get DBoBC1, etc so that the explanatory variables of the equation can be easily associated with their respective original names. This way, I would avoid specifying the list as Dcr<-list(Dcr1, Dcr2, Dcr3, ..., Dcr15) and then subsetting the list using Dcr[[1]][1:29], Dcr[[2]][1:29], ..., Dcr[[15]][1:29] because the list has many variables (15) and referencing the variables with numbers makes them lose their original names.
> When I specify the list as Dcr<- list(Dcr1, Dcr2, ..., Dcr15), then the regression equation specified as:
> # Regression
> regCred<- lm(Dcr[[1]][1:29]~Dcr[[2]][1:29]+Dcr[[3]][1:29]+Dcr[[4]][1:29]+Dcr[[5]][1:29]+Dcr[[6]][1:29]+...)

And then you could have done

    regCred<- lm(Dcre1 ~ . , data=Dcr [ , 1:29] )

(Leaving out the , ...)

> runs without problems - the results are shown here below:
> Call:
> lm(formula = Dcr[[1]][1:29] ~ Dcr[[2]][1:29] + Dcr[[3]][1:29] +
>     Dcr[[4]][1:29] + Dcr[[5]][1:29] + Dcr[[6]][1:29])
> Residuals:
>     Min      1Q  Median      3Q     Max
> -86.293 -33.586  -9.969  40.147 117.965
> Coefficients:
>                Estimate Std. Error t value Pr(>|t|)
> (Intercept)    81.02064   13.28632   6.098 3.21e-06 ***
> Dcr[[2]][1:29] -0.97407    0.11081  -8.791 8.20e-09 ***
> Dcr[[3]][1:29] -0.27950    0.05899  -4.738 8.95e-05 ***
> Dcr[[4]][1:29] -0.07961    0.04856  -1.639    0.115
> Dcr[[5]][1:29] -0.07180    0.05515  -1.302    0.206
> Dcr[[6]][1:29] -0.01562    0.02086  -0.749    0.462
>
> But when I specify the list with names as shown above, then the equation does not run - as shown by the following error message
>> # Regression
>> regCred<- lm(Dcr[[1]][1:29]~Dcr[[2]][1:29]+Dcr[[3]][1:29]+Dcr[[4]][1:29]+
> + Dcr[[5]][1:29]+Dcr$Dbobc3)
> Error in model.frame.default(formula = Dcr[[1]][1:29] ~ Dcr[[2]][1:29] +  :
>   variable lengths differ (found for 'Dcr$Dbobc3')
>> Dcr[[5]][1:29]+Dcr$Dbobc3[1:29])
> Error: unexpected ')' in "Dcr[[5]][1:29]+Dcr$Dbobc3[1:29])"
>
> NB: In the equation with error message, only the last term is specified by referencing its name (ie., Dcr$Dbobc3[1:29]. Also note that the error occurs whether I append '[1:29]' to Dcr$Dbobc or not.
> How do I resolve this?

You should have offered str(Dcr)

> Thanks. Lexi
>
> NB: I tried typing the above in the same email Petr used to reply me, but the email could not be delivered due to size problems

David Winsemius, MD
West Hartford, CT
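For readers wondering what `str(Dcr)` would have revealed, here is a minimal sketch on a hypothetical list (the element lengths 31, 30 and 29 are assumptions; the real object was never posted). It shows the structure summary, the per-element lengths, and why `data.frame()` cannot be built directly from series of unequal length.

```r
# Hypothetical stand-in for the poster's list; the real Dcr was never posted
set.seed(7)
Dcr <- list(Dcre1 = rnorm(31), Dcre2 = rnorm(30), Dbobc3 = rnorm(29))

# One line per element: type, length, and the first few values
str(Dcr)

# Named integer vector of element lengths
lens <- sapply(Dcr, length)
lens

# data.frame() refuses columns of unequal length
msg <- tryCatch(
  data.frame(Dcre1 = Dcr$Dcre1, Dcre2 = Dcr$Dcre2, Dbobc3 = Dcr$Dbobc3),
  error = function(e) conditionMessage(e)
)
msg
```

A length check like this makes a "variable lengths differ" error from `lm()` much easier to trace to a specific element.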

## Re: Adjusting length of series

On Jun 30, 2012, at 8:47 PM, David Winsemius wrote:

> On Jun 30, 2012, at 6:04 PM, Lekgatlhamang, lexi Setlhare wrote:
>
>> When I specify the list as Dcr<- list(Dcr1, Dcr2, ..., Dcr15), then the regression equation specified as:
>> # Regression
>> regCred<- lm(Dcr[[1]][1:29]~Dcr[[2]][1:29]+Dcr[[3]][1:29]+Dcr[[4]][1:29]+Dcr[[5]][1:29]+Dcr[[6]][1:29]+...)
>
> And then you could have done
>
>     regCred<- lm(Dcre1 ~ . , data=Dcr [ , 1:29] )

Oh, Nuts! I meant to type:

    regCred<- lm(Dcre1 ~ . , data=Dcr [ 1:29, ] )

> (Leaving out the , ...)

>> How do I resolve this?

This still applies: You should have offered str(Dcr)

David Winsemius, MD
West Hartford, CT
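The correction above comes down to which side of the comma the index sits on when subsetting a data frame. The following sketch uses a hypothetical data frame (35 rows and three columns are assumptions made for illustration) to show that `Dcr[1:29, ]` keeps the first 29 observations, while `Dcr[, 1:29]` asks for 29 columns.

```r
set.seed(42)
# Hypothetical data frame standing in for Dcr (35 rows, 3 columns)
Dcr <- data.frame(Dcre1 = rnorm(35), Dcre2 = rnorm(35), Dbobc1 = rnorm(35))

# Rows come before the comma: keep observations 1..29, all columns
fit <- lm(Dcre1 ~ ., data = Dcr[1:29, ])
nobs(fit)          # number of observations used in the fit
names(coef(fit))   # the "." expands to every column except the response

# Columns come after the comma: this requests 29 columns, which do not exist here
col_err <- tryCatch(Dcr[, 1:29], error = function(e) conditionMessage(e))
col_err
```

With the data frame form, the coefficient names are the column names, which addresses the original wish to keep the variables' names in the regression output.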

## Re: Adjusting length of series


## Re: Adjusting length of series


## Re: Adjusting length of series

In reply to this post by arun kirshna

Hi David and AK,

I have been trying to implement your suggestions since yesterday, but I encountered some challenges.

As for David's suggestion, I could only implement it after some modifications. Using an abridged version of my data, I dput my dataset and then show my steps below.

    > dput(ydata)
    structure(c(68.1000000000004, -34.8000000000002, 90.3999999999996,
    54.6000000000004, -172.3, 51.8000000000002, 175, 79.8000000000002,
    -35.7000000000007, 130.5, 116.8, -67.5, 164.5, 514.8, -326.1,
    98.4000000000005, 160.2, 53.1999999999998, 283.6, -111.6, 127.8,
    -17.3000000000002, 286.3, NA, NA, -102.900000000001, 125.2,
    -35.7999999999993, -226.900000000001, 224.1, 123.2, -95.1999999999998,
    -115.500000000001, 166.200000000001, -13.6999999999998, -184.3,
    232, 350.3, -840.900000000001, 424.500000000001, 61.7999999999993,
    -107, 230.400000000001, -395.200000000001, 239.400000000001,
    -145.1, 303.6, NA, NA, NA, 228.1, -160.999999999999, -191.100000000001,
    451.000000000001, -100.900000000001, -218.4, -20.3000000000011,
    281.700000000002, -179.900000000001, -170.6, 416.3, 118.3, -1191.2,
    1265.4, -362.700000000002, -168.799999999999, 337.400000000001,
    -625.600000000001, 634.600000000001, -384.500000000001, 448.700000000001,
    NA, NA, -164.457840999999, 17.0793539999995, 95.9767880000009,
    680.238166999999, -491.348690999999, -274.694009, -256.332907,
    469.62296, -146.431891, -41.0772019999995, -106.970104, 757.688263999999,
    -1689.214533, 2320.098952, -1446.97942, 516.384521, -375.277650999999,
    293.867029999999, 417.845195, 278.198807, -968.592033999999,
    -314.195986, NA, NA, NA, 181.537194999999, 78.8974340000013,
    584.261378999998, -1171.586858, 216.654681999999, 18.3611019999998,
    725.955867, -616.054851, 105.354689000001, -65.8929020000005,
    864.658367999999, -2446.902797, 4009.313485, -3767.078372, 1963.363941,
    -891.662171999999, 669.144680999999, 123.978165, -139.646388,
    -1246.790841, 654.396048, NA, 4937, 5005.1, 4970.3, 5060.7, 5115.3,
    4943, 4994.8, 5169.8, 5249.6, 5213.9, 5344.4, 5461.2, 5393.7,
    5558.2, 6073, 5746.9, 5845.3, 6005.5, 6058.7, 6342.3, 6230.7,
    6358.5, 6341.2, 6627.5, 4187.5, 4296.004835, 4240.051829, 4201.178177,
    4258.281313, 4995.622616, 5241.615228, 5212.913831, 4927.879527,
    5112.468183, 5150.624948, 5147.704511, 5037.81397, 5685.611693,
    4644.194883, 5922.877025, 5754.579747, 6102.66699, 6075.476582,
    6342.153204, 7026.675021, 7989.395645, 7983.524235, 7663.456839),
    .Dim = c(24L, 7L), .Dimnames = list(
        NULL, c("DCred1", "DCred2", "DCred3", "DBoBC2", "DBoBC3",
        "CredL1", "BoBCL1")), .Tsp = c(2001.08333333333, 2003, 12
    ), class = c("mts", "ts"))

NB: the NAs in the dataset emanated from lagging or differencing the series.

David's suggestion:

    > df<-data.frame(DCred1,DCred2,DCred3,DBoBC2,DBoBC3,CredL1,BoBCL1)
    Error in data.frame(DCred1, DCred2, DCred3, DBoBC2, DBoBC3, CredL1, BoBCL1) :
      arguments imply differing number of rows: 23, 22, 21, 24

So I modified it as follows:

    > length(DCred3)  # finding the minimum length of various series
    [1] 21
    > # Then dataframe construction
    > dframe<- data.frame(Dcre1=DCred1[1:21],Dcre2=DCred2[1:21],Dcre3=DCred3[1:21],
    + Dbobc2=DBoBC2[1:21],Dbobc3=DBoBC3[1:21],CredL=CredL1[1:21],BoBCL=BoBCL1[1:21])
    > # Then estimated regression
    > regCred<- lm(Dcre1~Dcre2+Dcre3+Dbobc2+Dbobc3+CredL+BoBCL, data=dframe)
    > summary(regCred) # Worked well as shown by results below

    Call:
    lm(formula = Dcre1 ~ Dcre2 + Dcre3 + Dbobc2 + Dbobc3 + CredL +
        BoBCL, data = dframe)

    Residuals:
        Min      1Q  Median      3Q     Max
    -69.516 -27.695  -8.085  13.851 107.276

    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
    (Intercept) 159.32304  157.15209   1.014 0.327873
    Dcre2        -0.75527    0.17262  -4.375 0.000634 ***
    Dcre3        -0.21006    0.08656  -2.427 0.029329 *
    Dbobc2        0.05111    0.06565   0.779 0.449197
    Dbobc3        0.03106    0.03510   0.885 0.391108
    CredL        -0.10967    0.04933  -2.223 0.043177 *
    BoBCL         0.09756    0.03097   3.150 0.007087 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 52.3 on 14 degrees of freedom
    Multiple R-squared: 0.9331,     Adjusted R-squared: 0.9044
    F-statistic: 32.55 on 6 and 14 DF,  p-value: 1.911e-07

This is good, but couldn't I code the process for my 15-variable model? Perhaps that is where the use of Dcr<- lapply(..., function(x) ...) comes in?

AK, if you can spare some minutes, please use my dput data to illustrate the suggestion you made. I searched for the lapply function (using ??lapply) but could not get a handle on how to use it in my case. My data is as shown below.

             DCred1 DCred2  DCred3      DBoBC2      DBoBC3 CredL1   BoBCL1
    Feb 2001   68.1     NA      NA          NA          NA 4937.0 4187.500
    Mar 2001  -34.8 -102.9      NA  -164.45784          NA 5005.1 4296.005
    Apr 2001   90.4  125.2   228.1    17.07935   181.53719 4970.3 4240.052
    May 2001   54.6  -35.8  -161.0    95.97679    78.89743 5060.7 4201.178
    Jun 2001 -172.3 -226.9  -191.1   680.23817   584.26138 5115.3 4258.281
    Jul 2001   51.8  224.1   451.0  -491.34869 -1171.58686 4943.0 4995.623
    Aug 2001  175.0  123.2  -100.9  -274.69401   216.65468 4994.8 5241.615
    Sep 2001   79.8  -95.2  -218.4  -256.33291    18.36110 5169.8 5212.914
    Oct 2001  -35.7 -115.5   -20.3   469.62296   725.95587 5249.6 4927.880
    Nov 2001  130.5  166.2   281.7  -146.43189  -616.05485 5213.9 5112.468
    Dec 2001  116.8  -13.7  -179.9   -41.07720   105.35469 5344.4 5150.625
    Jan 2002  -67.5 -184.3  -170.6  -106.97010   -65.89290 5461.2 5147.705
    Feb 2002  164.5  232.0   416.3   757.68826   864.65837 5393.7 5037.814
    Mar 2002  514.8  350.3   118.3 -1689.21453 -2446.90280 5558.2 5685.612
    Apr 2002 -326.1 -840.9 -1191.2  2320.09895  4009.31348 6073.0 4644.195
    May 2002   98.4  424.5  1265.4 -1446.97942 -3767.07837 5746.9 5922.877
    Jun 2002  160.2   61.8  -362.7   516.38452  1963.36394 5845.3 5754.580
    Jul 2002   53.2 -107.0  -168.8  -375.27765  -891.66217 6005.5 6102.667
    Aug 2002  283.6  230.4   337.4   293.86703   669.14468 6058.7 6075.477
    Sep 2002 -111.6 -395.2  -625.6   417.84519   123.97817 6342.3 6342.153
    Oct 2002  127.8  239.4   634.6   278.19881  -139.64639 6230.7 7026.675
    Nov 2002  -17.3 -145.1  -384.5  -968.59203 -1246.79084 6358.5 7989.396
    Dec 2002  286.3  303.6   448.7  -314.19599   654.39605 6341.2 7983.524
    Jan 2003     NA     NA      NA          NA          NA 6627.5 7663.457

Thanks kindly.
Lexi
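One way the question above ("couldn't I code the process for my 15-variable model?") might be approached is to keep the series in a named list, cut each one to the common length with `lapply()`, and convert the result to a data frame. The sketch below is an illustration under assumptions, not the method proposed elsewhere in the thread: the level series `Cred` is simulated, and the list `series` and its element names are hypothetical stand-ins for the poster's variables.

```r
set.seed(123)
# Hypothetical level series; the differenced versions get progressively shorter
Cred <- cumsum(rnorm(24, mean = 50, sd = 100)) + 5000
series <- list(
  DCred1 = diff(Cred),                    # length 23
  DCred2 = diff(Cred, differences = 2),   # length 22
  DCred3 = diff(Cred, differences = 3),   # length 21
  CredL1 = Cred                           # length 24
)

# Common length: the shortest series in the list
n <- min(sapply(series, length))

# Truncate every element to n observations, then bind into a data frame;
# the list names become the column names
dframe <- as.data.frame(lapply(series, function(x) x[seq_len(n)]))

# "." stands for every column other than the response, so the formula
# does not have to be retyped as variables are added
fit <- lm(DCred1 ~ ., data = dframe)
summary(fit)
```

This mirrors the `[1:21]` truncation used in the post, so it scales to 15 variables without listing them by hand. Note that taking the first `n` values of each series only lines the observations up in time if the series start at the same date; whether that holds depends on how the lags and differences were built. For a matrix that is already padded with `NA`, as in the printed data, `lm()`'s default missing-value handling would typically drop the incomplete rows instead.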

## Re: Adjusting length of series


## Re: Adjusting length of series
