# Problem with ldply



I've been examining a number of linear regression models on a large dataset, following the basic ideas presented in "Calculating all possible linear regressions". I run into a problem with ldply when a formula includes no intercept. Here's a simple test that shows what happens.

    library(plyr)

    # data and two linear model regressions
    # (x is created first so that it exists when y is computed)
    x <- 0:10
    xy <- data.frame(x = x, y = 2*x + 0.2*rnorm(11))
    models <- as.list(c('y ~ x', 'y ~ -1 + x'))
    models <- lapply(models, function(x) as.formula(x))
    fits <- lapply(models, function(x) lm(x, data=xy))

    # regression summaries specified individually (OK)
    coef(summary(fits[[1]]))
    #               Estimate Std. Error     t value     Pr(>|t|)
    # (Intercept) -0.0594176 0.10507394  -0.5654837 5.855640e-01
    # x            2.0163534 0.01776074 113.5286997 1.620614e-15
    coef(summary(fits[[2]]))
    #   Estimate Std. Error  t value     Pr(>|t|)
    # x 2.007865 0.00916494 219.0811 9.652427e-20

    # Coefficients as a dataframe using ldply (OK)
    ldply(fits, function(x) as.data.frame(t(coef(x))))
    #   (Intercept)        x
    # 1  -0.0594176 2.016353
    # 2          NA 2.007865

    # Std Errors as a dataframe using ldply (FAIL)
    # The variable name 'x' is lost for the second model, which has no intercept.
    # The default variable name V1 appears in the output instead.
    # The same behaviour is observed for 't value' and 'Pr(>|t|)'.
    ldply(fits, function(x) as.data.frame(t(coef(summary(x))[,'Std. Error'])))
    #   (Intercept)          x         V1
    # 1   0.1050739 0.01776074         NA
    # 2          NA         NA 0.00916494

Is this a bug or (hopefully) user error? Any ideas for a workaround? Thanks.
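One possible explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: for the no-intercept model, coef(summary(fit)) is a matrix with a single row, and selecting one column with `[, 'Std. Error']` drops both dimensions, so the result is an unnamed length-one vector. The example below checks this in base R and tries `drop = FALSE` as a workaround. The helper name `se_row` and the `set.seed()` call are assumptions added for illustration and are not part of the original post or of plyr.

```r
# Rebuild the two fits from the post. The seed is an assumption, added only
# so the sketch is repeatable; the estimates will differ from the post's.
set.seed(1)
x <- 0:10
xy <- data.frame(x = x, y = 2 * x + 0.2 * rnorm(11))
fits <- lapply(c("y ~ x", "y ~ -1 + x"),
               function(f) lm(as.formula(f), data = xy))

# Suspected cause: the no-intercept model has a 1-row coefficient matrix.
# Selecting a single column from it drops every dimension, and the 'x'
# label is lost along the way.
cm <- coef(summary(fits[[2]]))
dim(cm)
names(cm[, "Std. Error"])

# Workaround sketch: keep the matrix shape with drop = FALSE so the row
# name survives the transpose and becomes the column name.
# se_row is a hypothetical helper name.
se_row <- function(fit) {
  se <- coef(summary(fit))[, "Std. Error", drop = FALSE]
  as.data.frame(t(se))
}

names(se_row(fits[[1]]))
names(se_row(fits[[2]]))

# Combine with ldply only if plyr is installed.
if (requireNamespace("plyr", quietly = TRUE)) {
  print(plyr::ldply(fits, se_row))
}
```

The same `drop = FALSE` change should apply to the 't value' and 'Pr(>|t|)' columns, since they are extracted from the same matrix in the same way.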