Problem with ldply

Problem with ldply

cfriedl
I've been examining a number of linear regression models on a large dataset, following the basic ideas presented in "Calculating all possible linear regressions" (http://www.r-bloggers.com/r-calculating-all-possible-linear-regression-models-for-a-given-set-of-predictors/). I run into a problem with ldply when a formula has no intercept. Here's a simple test to show what happens.

# data and two linear model regressions
library(plyr)  # provides ldply
x <- 0:10      # define x first so that y can be computed from it
xy <- data.frame(x = x, y = 2*x + 0.2*rnorm(11))
models <- as.list(c('y ~ x', 'y ~ -1 + x'))
models <- lapply(models, function(x) (as.formula(x)) )
fits <- lapply(models, function(x) lm(x, data=xy))

# regression summaries specified individually (OK)
coef(summary(fits[[1]]))

#               Estimate Std. Error     t value     Pr(>|t|)
# (Intercept) -0.0594176 0.10507394  -0.5654837 5.855640e-01
# x            2.0163534 0.01776074 113.5286997 1.620614e-15

coef(summary(fits[[2]]))

#   Estimate Std. Error  t value     Pr(>|t|)
# x 2.007865 0.00916494 219.0811 9.652427e-20


# Coefficients as a dataframe using ldply (OK)
ldply(fits, function(x) as.data.frame(t(coef(x))))

#   (Intercept)        x
# 1  -0.0594176 2.016353
# 2          NA 2.007865



# Std Errors as a dataframe using ldply  (FAIL)
# The variable name 'x' is lost for the second model, which has no intercept. The default
# variable name V1 appears in the output instead.
# The same behaviour is observed for 't value' and 'Pr(>|t|)'.
ldply(fits, function(x) as.data.frame(t(coef(summary(x))[,'Std. Error'])))

#   (Intercept)          x         V1
# 1   0.1050739 0.01776074         NA
# 2          NA         NA 0.00916494


Is this a bug or (hopefully) user error? Any ideas for a workaround?

Thanks.






Re: Problem with ldply

Ista Zahn-2
The problem is that in the case of model 2 the standard error terms
reduce to a vector of length one. Since subsetting with '[' drops
unneeded dimensions by default, this vector loses its name. The
solution is to add 'drop = FALSE' to your subset call, like this:

ldply(fits, function(x) as.data.frame(t(coef(summary(x))[, 'Std. Error', drop=FALSE])))
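
The dropping behaviour described above can be seen without plyr. The following is a minimal sketch in base R, assuming the built-in `cars` data as a stand-in for the poster's `xy`; the object names (`fit_int`, `fit_noint`, `se2`, `se1`, `se1_kept`) are made up for illustration and are not from the original post.

```r
# Base R only; `cars` (columns speed, dist) ships with R and stands in for xy.
fit_int   <- lm(dist ~ speed, data = cars)       # with intercept: 2 coefficients
fit_noint <- lm(dist ~ -1 + speed, data = cars)  # no intercept: 1 coefficient

# Two-row coefficient matrix: the column becomes a named vector.
se2 <- coef(summary(fit_int))[, "Std. Error"]
names(se2)                           # "(Intercept)" "speed"

# One-row coefficient matrix: both dimensions collapse and the name is lost.
se1 <- coef(summary(fit_noint))[, "Std. Error"]
names(se1)                           # NULL

# drop = FALSE keeps the result as a 1 x 1 matrix with its dimnames.
se1_kept <- coef(summary(fit_noint))[, "Std. Error", drop = FALSE]
dim(se1_kept)                        # 1 1

# After transposing, only the version that kept its dimnames has a usable column name.
names(as.data.frame(t(se1)))         # "V1"
names(as.data.frame(t(se1_kept)))    # "speed"
```

With the name preserved, ldply can line up the single-coefficient model with the matching column of the two-coefficient model instead of creating a separate V1 column.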

Best,
Ista

On Mon, May 17, 2010 at 5:20 AM, cfriedl <[hidden email]> wrote:

>
> I've been examining a number of linear regression models on a large dataset,
> following the basic ideas presented in "Calculating all possible linear regressions":
> http://www.r-bloggers.com/r-calculating-all-possible-linear-regression-models-for-a-given-set-of-predictors/
> I run into a problem with ldply when a formula has no intercept.
> Here's a simple test to show what happens.
>



--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

Re: Problem with ldply

cfriedl
Thanks for teaching me something new. :)