Improving function that estimates regressions for all variables specified


Jorge Cimentada
Hi, I'd like some feedback on how to make this function faster and more
parsimonious.

I normally run several regressions like this:
y ~ x1
y ~ x1 + x2
y ~ x1 + x2 + xn

Instead, I created a function in which I specify y and the covariates
(x1, x2, ..., xn), and the function automatically generates:
y ~ x1
y ~ x1 + x2
y ~ x1 + x2 + xn

This is the function:

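models <- function(dv, covariates, data) {
    # Left-hand side plus intercept, e.g. "mpg ~ 1"
    dv <- paste(dv, "~ 1")
    # Index sets 1, 1:2, ..., 1:n for the nested models
    combinations <- lapply(seq_along(covariates), seq_len)
    # One formula per index set: y ~ 1 + x1, y ~ 1 + x1 + x2, ...
    formulas <- lapply(combinations, function(p)
        as.formula(paste(c(dv, covariates[p]), collapse = " + ")))
    # Fit a linear model for each formula
    results <- lapply(formulas, function(o) lm(o, data = data))
    return(results)
}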

And an example:
models("mpg",c("cyl","disp","hp","am"), mtcars)

I'm concerned about how long it takes with other regression models, such
as those from the survey package (I know these models are heavy and take
time), but I'm sure the function has room for improvement.
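
One way to reuse the same loop with other model types is to pass the fitting function as an argument. The following is only a sketch under assumptions: the name models_with and its fit argument are hypothetical, and extra arguments are simply forwarded through `...`. It has not been tried with the survey package; a fitting function with a different interface (for example one that takes a design object rather than a data frame) would probably need a small wrapper. Most of the run time is likely spent inside the fitting calls rather than in building the formulas, so this helps with reuse more than with speed.

```r
# Hypothetical generalisation of models(): the fitting function is an argument.
models_with <- function(dv, covariates, data, fit = lm, ...) {
    # Nested formulas: y ~ x1, y ~ x1 + x2, ...
    formulas <- lapply(seq_along(covariates), function(i)
        reformulate(covariates[seq_len(i)], response = dv))
    # Fit one model per formula, forwarding extra arguments to `fit`
    lapply(formulas, function(f) fit(f, data = data, ...))
}

# Linear models, as in the original example
fits_lm <- models_with("mpg", c("cyl", "disp", "hp", "am"), mtcars)

# The same nested structure with logistic regressions via glm()
fits_glm <- models_with("am", c("wt", "hp"), mtcars,
                        fit = glm, family = binomial)
```

Here reformulate() from base R replaces the paste()/as.formula() step; the fitted models should be the same, since the intercept is included by default.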

I'd also like to specify the variables as a formula. I managed to do it but
I get different results when using things like scale() for predictors.

Formula version of the function:
models2 <- function(formula, data) {
    dv <- paste(all.vars(formula)[1], "~ 1")
    covariates <- all.vars(formula)[-1]
    combinations <- lapply(seq_along(covariates), seq_len)
    lfo <- lapply(combinations, function(p)
        as.formula(paste(c(dv, covariates[p]), collapse = " + ")))
    results <- lapply(lfo, function(o) lm(o, data = data))
    return(results)
}

models("mpg",c("cyl","scale(disp)"), mtcars)

models2(mpg ~ cyl + scale(disp), mtcars)

See the difference between the disp variables?
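
The difference comes from all.vars(), which returns only the bare variable names in a formula, so scale(disp) is reduced to disp and models2() fits the unscaled variable. A minimal sketch of one possible fix, assuming the terms should be kept exactly as written, is to take the term labels from terms() instead. The name models3 is hypothetical and not part of the original post.

```r
f <- mpg ~ cyl + scale(disp)

all.vars(f)                      # "mpg" "cyl" "disp": scale() is lost
attr(terms(f), "term.labels")    # "cyl" "scale(disp)": terms kept as written

# Hypothetical formula-based version that keeps transformations such as scale()
models3 <- function(formula, data) {
    covariates <- attr(terms(formula), "term.labels")
    # Left-hand side exactly as written (a name or a call such as log(mpg))
    dv <- formula[[2]]
    # Nested formulas: y ~ x1, y ~ x1 + x2, ...
    formulas <- lapply(seq_along(covariates), function(i)
        reformulate(covariates[seq_len(i)], response = dv))
    lapply(formulas, function(fo) lm(fo, data = data))
}

fits <- models3(mpg ~ cyl + scale(disp), mtcars)
coef(fits[[2]])
```

With this version the second model should have a coefficient named scale(disp), matching what models("mpg", c("cyl", "scale(disp)"), mtcars) produces.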

Any feedback is appreciated!


*Jorge Cimentada*
*Ph.D. Candidate*
Dpt. Ciències Polítiques i Socials
Ramon Trias Fargas, 25-27 | 08005 Barcelona

Office 24.331
[Tel.] 697 382 009
www.jorgecimentada.com


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Improving function that estimates regressions for all variables specified

Adams, Jean
You may be able to find someone else's function that already does what you
want. For example, the dredge() function in the MuMIn package.

http://rpackages.ianhowson.com/cran/MuMIn/man/MuMIn-package.html

Jean

On Fri, Aug 26, 2016 at 1:11 PM, Jorge Cimentada <[hidden email]>
wrote:

> Hi, I'd like some feedback on how to make this function faster and more
> parsimonious.
