Passing formula and weights error


John Smith-5
Dear R-help:

I am writing a function based on glm and would like to use some variations
of the weights. In the code below, I can't understand why the second glm
call fails, and I don't know how to fix it:

Error in eval(extras, data, env) : object 'newweights' not found
 Calls: print ... eval -> <Anonymous> -> model.frame.default -> eval -> eval
 Execution halted

### R code
y <- rnorm(100)
x <- rnorm(100)
data <- data.frame(x, y)
weights <- rep(1, 100)
n <- 100
myglm <- function(formula, data, weights){
    ## this works
    print(glm(formula, data, family=gaussian(), weights))
    ## this is not working
    newweights <- rep(1, n)
    glm(formula, data, family=gaussian(), weights=newweights)
}
myglm(y~., data, weights)

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Passing formula and weights error

Duncan Murdoch-2
This came up recently in a discussion of lm() on the R-devel list.  I'd
assume the same issue applies to glm.

The problem is that the weights argument is evaluated in the same way as
the variables in the formula:  first in data, then in the environment of
the formula.  The latter will eventually lead back to the global
environment, but won't lead to the local evaluation frame in myglm(),
which is where newweights was created.
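
A small sketch of that lookup rule, using only base R.  The names
wrapper, make_formula and local_only are made up for illustration, and
the check assumes no object called local_only exists at top level:

```r
# A wrapper with a local variable, like newweights in myglm().
wrapper <- function(formula) {
    local_only <- 1
    # Search outward from the formula's environment, which is where
    # names that are not columns of data get looked up.
    exists("local_only", envir = environment(formula))
}

# The formula is created by the caller, so its environment is the
# caller's, and the wrapper's frame is not on the search chain.
seen_from_caller_formula <- wrapper(y ~ x)

# A formula created inside a function carries that function's frame.
make_formula <- function() {
    local_only <- 1
    y ~ x
}
seen_from_inner_formula <-
    exists("local_only", envir = environment(make_formula()))

print(c(caller = seen_from_caller_formula, inner = seen_from_inner_formula))
```

The first value should be FALSE and the second TRUE, which is the same
situation as newweights being invisible to glm() in the original code.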

The easiest solution is to add newweights as a column of the data
argument, but there are a few gotchas here.

First, if newweights is already a column in data, you'll overwrite it.
So be sure to use a name that can't be there.  That's okay in your
example.

The second problem is that a dot in the formula will cause problems,
because it will try to include newweights as a predictor variable.  It's
possible to work around this, but it's probably better to use a more
complicated solution instead:  modify the formula environment so it
starts with a small environment holding newweights.  You don't want to
add newweights directly to environment(formula), because that will have
side effects outside your function.
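
For comparison, the simpler approach of adding the weights as a column
could look roughly like the sketch below.  The function name myglm_col
and the column name .newweights are hypothetical, the weights 1/(1:n)
are just an example, and the formula names its predictors explicitly so
that the dot problem does not arise:

```r
myglm_col <- function(formula, data) {
    # Refuse to overwrite an existing column (the first gotcha above).
    stopifnot(!".newweights" %in% names(data))
    data$.newweights <- 1 / seq_len(nrow(data))
    # The weights expression is now found in data itself.
    glm(formula, data = data, family = gaussian(), weights = .newweights)
}

n <- 16
d <- data.frame(x = log2(1:n), y = 1:n)
w <- 1 / seq_len(n)

fit_col <- myglm_col(y ~ x, d)                  # explicit predictor, no dot
fit_direct <- glm(y ~ x, data = d, weights = w)
print(all.equal(coef(fit_col), coef(fit_direct)))
```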

This version of your function takes this more complicated approach:

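  myglm <- function(formula, data, weights){
      ## this works
      print(glm(formula, data, family=gaussian(), weights))
      ## small environment holding newweights, with the formula's
      ## original environment as its parent
      env <- new.env(parent = environment(formula))
      env$newweights <- rep(1, n)
      ## only the local copy of formula is changed, not the caller's
      environment(formula) <- env

      glm(formula, data, family=gaussian(), weights=newweights)
  }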
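
A possible way to check this approach, as a sketch only: the function
below is a cut-down variant (the name myglm_env and the weights 1/(1:n)
are assumptions for the example), compared against a direct glm() call
with a dot in the formula:

```r
myglm_env <- function(formula, data) {
    env <- new.env(parent = environment(formula))
    env$newweights <- 1 / seq_len(nrow(data))
    environment(formula) <- env     # changes the local copy only
    glm(formula, data, family = gaussian(), weights = newweights)
}

n <- 16
d <- data.frame(x = log2(1:n), y = 1:n)
w <- 1 / seq_len(n)
f <- y ~ .

fit_env <- myglm_env(f, d)
fit_direct <- glm(y ~ ., data = d, weights = w)

# Same fit as the direct call, and the dot did not pick up newweights.
print(all.equal(coef(fit_env), coef(fit_direct)))

# The caller's formula environment was not given a newweights object.
leaked <- exists("newweights", envir = environment(f), inherits = FALSE)
print(leaked)
```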

Duncan Murdoch


On 28/08/2020 11:32 a.m., John Smith wrote:

> Dear R-help:
>
> I am writing a function based on glm and would like some variations of
> weights. In the code below, I couldn't understand why the second glm
> function fails and don't know how to fix it:
>
> Error in eval(extras, data, env) : object 'newweights' not found
>   Calls: print ... eval -> <Anonymous> -> model.frame.default -> eval -> eval
>   Execution halted

Re: Passing formula and weights error

R help mailing list-2
In reply to this post by John Smith-5
Note that neither call to glm in your myglm function really works -
the first one is using the 'weights' object from the global
environment, not the weights argument.  E.g., in a fresh R session,
where I avoid making unneeded assignments and use fixed x and y for
repeatability,

  > n <- 16
  > data <- data.frame(x = log2(1:n), y = 1:n)
  > myglm2 <- function(formula, data, weights)
      {
          glm(formula, data=data, family=gaussian(), weights=weights)
      }
  > myglm2(y~., data=data, weights=1/(1:n))
  Error in model.frame.default(formula = formula, data = data, weights = weights,  :
    invalid type (closure) for variable '(weights)'

The error arises because glm finds stats::weights, a function, not the
argument called weights:  there is no 'weights' column in data and, in a
fresh session, no 'weights' object in the global environment, so the
lookup continues along the search path.  glm(), lm() and their ilk
evaluate their weights and subset arguments in data first and then in
the environment of the formula.  In this case environment(y~.) is
.GlobalEnv, not the function's environment.  The following function
gives one way to deal with this, by giving formula a new environment
that inherits from its original environment and contains the extra
variables.

  > myglm3 <- function(formula, data, weights)
      {
          envir <- list2env(list(weights=weights), parent=environment(formula))
          environment(formula) <- envir
          glm(formula, data=data, family=gaussian(), weights=weights)
      }
  > myglm3(y~., data=data, weights=1/(1:n))

  Call:  glm(formula = formula, family = gaussian(), data = data, weights = weights)

  Coefficients:
  (Intercept)            x
     -0.09553      2.93352

  Degrees of Freedom: 15 Total (i.e. Null);  14 Residual
  Null Deviance:      60.28
  Residual Deviance: 7.72         AIC: 70.42

This is the same result you get with a direct call to
  glm(y~., data=data, weights=1/(1:n))

This is a common problem and I don't know if there is a FAQ on it or a
standard function to deal with it.
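
In the absence of a standard function, the same trick could be wrapped
in a small helper.  This is only a sketch: with_formula_vars and myglm4
are hypothetical names, not part of base R or stats, and the extra
variables must be passed as named arguments:

```r
# Hypothetical helper: returns a copy of the formula whose environment
# holds the named extra variables and inherits from the original one.
with_formula_vars <- function(formula, ...) {
    env <- list2env(list(...), parent = environment(formula))
    environment(formula) <- env
    formula
}

myglm4 <- function(formula, data, weights) {
    # '.w' is an arbitrary name that is assumed not to be a column of data.
    formula <- with_formula_vars(formula, .w = weights)
    glm(formula, data = data, family = gaussian(), weights = .w)
}

n <- 16
d <- data.frame(x = log2(1:n), y = 1:n)
w <- 1 / seq_len(n)

fit_helper <- myglm4(y ~ ., data = d, weights = w)
fit_direct <- glm(y ~ ., data = d, weights = w)
print(all.equal(coef(fit_helper), coef(fit_direct)))
```

Using a name other than weights inside the new environment avoids any
confusion with stats::weights, at the cost of a less readable Call line
in the printed fit.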

Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: Passing formula and weights error

John Smith-5
Thanks to Duncan and Bill for very helpful tips.

On Fri, Aug 28, 2020 at 11:38 AM William Dunlap <[hidden email]> wrote:

> Note that neither call to glm in your myglm function really works -
> the first one is using the 'weights' object from the global
> environment, not the weights argument.  [...]