Suggestion for `glm.fit`

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Suggestion for `glm.fit`

Benjamin Christoffersen
Dear sirs,

One gets unexpected `residuals` if one is not aware of the meaning of
weights when a weight is set to zero and the outcome is one in the
`binomial` family in a call to `glm.fit`. The reason is the following
line from `binomial()$initialize`
> y[weights == 0] <- 0

Here is an example:
pval <- seq(.05, .95, length.out = 25)
X <- log(pval / (1 - pval)) - 2
Y <- c(
  FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE,
  FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE,
  TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)

W <- rep(1, length(Y))
W[length(W)] <- 0
fit <- glm(Y ~ X, binomial(), weights = W)
fit$residuals[25]
#R        25
#R -45.77847

# Maybe it should be the following. Otherwise maybe there should be a
# warning in `binomial()$initialize` when `y`s are set to zero?
with(
  fit, tail((Y - fitted.values) / binomial()$mu.eta(linear.predictors), 1))
#R       25
#R 1.022332

sessionInfo()
#R R version 3.5.1 (2018-07-02)
#R Platform: x86_64-w64-mingw32/x64 (64-bit)
#R Running under: Windows >= 8 x64 (build 9200)
#R
#R Matrix products: default
#R
#R locale:
#R [1] LC_COLLATE=English_United States.1252
#R [2] LC_CTYPE=English_United States.1252
#R [3] LC_MONETARY=English_United States.1252
#R [4] LC_NUMERIC=C
#R [5] LC_TIME=English_United States.1252
#R
#R attached base packages:
#R [1] stats     graphics  grDevices utils     datasets  methods
#R [7] base
#R
#R loaded via a namespace (and not attached):
#R [1] compiler_3.5.1 tools_3.5.1    yaml_2.1.18

Sincerely yours,
Benjamin Christoffersen

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Suggestion for `glm.fit`

bbolker

 I don't know whether this helps or not, but using residuals(fit) rather
than fit$residuals returns 0 for the last value.  This is different from
(predict(fit,type="response")[25] - Y[25]) (or the equivalent Pearson
residual) because the *weighted* residuals are returned by definition
(not that I see this being explained super-clearly in the documentation) ...

FWIW, ?residuals says:

     The abbreviated form ‘resid’ is an alias for ‘residuals’.  It is
     intended to encourage users to access object components through an
     accessor function rather than by directly referencing an object
     slot.

 - in other words, if you use fit$residuals you're expected to know
exactly what you're doing and how things might get weird ...



On 2018-11-26 5:08 p.m., Benjamin Christoffersen wrote:

> Dear sirs,
>
> One gets unexpected `residuals` if one is not aware of the meaning of
> weights when a weight is set to zero and the outcome is one in the
> `binomial` family in a call to `glm.fit`. The reason is the following
> line from `binomial()$initialize`
>> y[weights == 0] <- 0
>
> Here is an example:
> pval <- seq(.05, .95, length.out = 25)
> X <- log(pval / (1 - pval)) - 2
> Y <- c(
>   FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE,
>   FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE,
>   TRUE, TRUE, TRUE, FALSE, TRUE, TRUE)
>
> W <- rep(1, length(Y))
> W[length(W)] <- 0
> fit <- glm(Y ~ X, binomial(), weights = W)
> fit$residuals[25]
> #R        25
> #R -45.77847
>
> # Maybe it should be the following. Otherwise maybe there should be a
> # warning in `binomial()$initialize` when `y`s are set to zero?
> with(
>   fit, tail((Y - fitted.values) / binomial()$mu.eta(linear.predictors), 1))
> #R       25
> #R 1.022332
>
> sessionInfo()
> #R R version 3.5.1 (2018-07-02)
> #R Platform: x86_64-w64-mingw32/x64 (64-bit)
> #R Running under: Windows >= 8 x64 (build 9200)
> #R
> #R Matrix products: default
> #R
> #R locale:
> #R [1] LC_COLLATE=English_United States.1252
> #R [2] LC_CTYPE=English_United States.1252
> #R [3] LC_MONETARY=English_United States.1252
> #R [4] LC_NUMERIC=C
> #R [5] LC_TIME=English_United States.1252
> #R
> #R attached base packages:
> #R [1] stats     graphics  grDevices utils     datasets  methods
> #R [7] base
> #R
> #R loaded via a namespace (and not attached):
> #R [1] compiler_3.5.1 tools_3.5.1    yaml_2.1.18
>
> Sincerely yours,
> Benjamin Christoffersen
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel