FWIW p. 206 of the White Book gives the following for

names(binomial()): family, names, link, inverse, deriv, initialize,

variance, deviance, weight.

So $aic wasn't there In The Beginning. I haven't done any more

archaeology to try to figure out when/by whom it was first introduced

...

Section 6.3.3, on extending families, doesn't give any other relevant info.

A patch for src/library/stats/man/family.Rd below: please check what

I've said about $aic and $mu.eta to make sure they're correct! I can

submit this to the r bug list if preferred.

----

Index: family.Rd

===================================================================

--- family.Rd (revision 74904)

+++ family.Rd (working copy)

@@ -31,7 +31,7 @@

\arguments{

\item{link}{a specification for the model link function. This can be

a name/expression, a literal character string, a length-one character

- vector or an object of class

+ vector, or an object of class

\code{"\link[=make.link]{link-glm}"} (such as generated by

\code{\link{make.link}}) provided it is not specified

\emph{via} one of the standard names given next.

@@ -45,7 +45,7 @@

the \code{Gamma} family the links \code{inverse}, \code{identity}

and \code{log};

the \code{poisson} family the links \code{log}, \code{identity},

- and \code{sqrt} and the \code{inverse.gaussian} family the links

+ and \code{sqrt}; and the \code{inverse.gaussian} family the links

\code{1/mu^2}, \code{inverse}, \code{identity}

and \code{log}.

@@ -105,8 +105,8 @@

\note{

The \code{link} and \code{variance} arguments have rather awkward

semantics for back-compatibility. The recommended way is to supply

- them is as quoted character strings, but they can also be supplied

- unquoted (as names or expressions). In addition, they can also be

+ them as quoted character strings, but they can also be supplied

+ unquoted (as names or expressions). Additionally, they can be

supplied as a length-one character vector giving the name of one of

the options, or as a list (for \code{link}, of class

\code{"link-glm"}). The restrictions apply only to links given as

@@ -130,10 +130,18 @@

\item{dev.resids}{function giving the deviance residuals as a function

of \code{(y, mu, wt)}.}

\item{aic}{function giving the AIC value if appropriate (but \code{NA}

- for the quasi- families). See \code{\link{logLik}} for the assumptions

- made about the dispersion parameter.}

- \item{mu.eta}{function: derivative \code{function(eta)}

- \eqn{d\mu/d\eta}.}

+ for the quasi- families). More precisely, this function

+ returns \eqn{-2 L + 2 s}, where \eqn{L} is the log-likelihood and \eqn{s}

+ is the number of estimated scale parameters; the penalty term for the

+ location parameters is added elsewhere.

+ See \code{\link{logLik}} for the assumptions

+ made about the dispersion parameter.}

+ \item{mu.eta}{function: derivative of the inverse-link function

+ with respect to the linear predictor. If the inverse-link

+ function is \eqn{\mu=g^{-1}(\eta)}{mu=ginv(eta)}

+ where \eqn{eta}{\eta} is the value

+ of the linear predictor, then this function returns

+ \eqn{d(g^{-1})/d\eta=d\mu/d\eta}{d(ginv(eta))/d(eta)=d(mu)/d(eta)}.}

\item{initialize}{expression. This needs to set up whatever data

objects are needed for the family as well as \code{n} (needed for

AIC in the binomial family) and \code{mustart} (see \code{\link{glm}}).}

@@ -224,8 +232,8 @@

## which case use an offset of 0 in the corresponding formula

## to get the null deviance right.

-## Binomial with identity link: often not a good idea.

-\dontrun{binomial(link = make.link("identity"))}

+## Binomial with identity link: often computationally and

conceptually difficult

+\dontrun{binomial(link = "identity")}

## tests of quasi

x <- rnorm(100)

@@ -236,7 +244,7 @@

glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))

\dontrun{glm(y ~ x, family = quasi(variance = "mu^3", link = "log")) # fails}

y <- rbinom(100, 1, plogis(x))

-# needs to set a starting value for the next fit

+# need to set a starting value for the next fit

glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"),

start = c(0,1))

}

\keyword{models}

On Mon, Jun 4, 2018 at 10:46 AM, Martin Maechler

<

[hidden email]> wrote:

>>>>>> Ben Bolker

>>>>>> on Sun, 3 Jun 2018 17:33:18 -0400 writes:

>

> > Is it generally known/has it been previously discussed here that the

> > $aic() component in GLM-family objects (e.g. results of binomial(),

> > poisson(), etc.) does not as implemented actually return the AIC, but

> > rather -2*log-likelihood + 2*(model_has_scale_parameter) ?

>

> This rings a faint bell from the last millennium with me,

> and the following "fortune" may contain the answer implicitly :

>

> ------------------------------------------------------------

> > if(!require("fortunes")) install.packages("fortunes")

> > fortune("bug compatib")

>

> For quite a while, bug-for-bug compatibility with S-PLUS v 3.x was considered

> important to allow people to port their packages between systems.

> -- Peter Dalgaard

> R-help (February 2009)

> >

> ------------------------------------------------------------

>

> Ideally, readers who still have access to a version of S-PLUS / S+

> or who have read and internalized or (even co-written !)

> "The white book", notably Ch.6, may be able to shed a historic

> light on this.

>

> I note that the white book's Appendix B with function help

> pages, has a page ?family.object, accessible here

>

https://sites.oxy.edu/lengyel/M150/Sueselbeck/helpfiles/family.object.html>

> which does *not* mention a <fam>$dev.resid() component, but instead

> allows to use <fam>$residuals(*, residuals=TRUE)

> get the

> "

> vector of deviance residual, whose weighted sum of

> squares is the deviance

> "

>

> Given the above, and also the ?glm entry

>

> || Author(s):

> ||

> || The original R implementation of ‘glm’ was written by Simon Davies

> || working for Ross Ihaka at the University of Auckland, but has

> || since been extensively re-written by members of the R Core team.

> ||

> || The design was inspired by the S function of the same name

> || described in Hastie & Pregibon (1992).

>

> actually suggest that it may be hard nowadays to find the

> original "design specs" that Simon and Ross had used at the

> time, and also that they only were _inspired_ by the white book

> chapter 6 (= Hastie & Pregibon (1992)).

>

>

>

> > Can anyone in this forum gauge how a documentation patch

> > would be received?

>

> It depends on further answers to your questions (i.e, this

> thread), but I'd currently say "gratefully".

> I'd expect it would be a patch mainly to

> src/library/stats/man/family.Rd

>

> Note that help(AIC) has a non-small 'Details' section, but

> indeed it does not mention the family(*)$aic function.

>

> > This behaviour does not seem to be documented in ?family (or anywhere

> > else I can find), which says:

>

> > aic: function giving the AIC value if appropriate (but ‘NA’ for

> > the quasi- families). See ‘logLik’ for the assumptions made

> > about the dispersion parameter.

>

>

> > For a demonstration that e.g. binomial()$aic() is really -2*log L and

> > not the AIC, see:

>

> >

https://github.com/wch/r-source/blob/trunk/src/library/stats/R/family.R#L317>

> > This document

> > <

https://github.com/lme4/lme4/blob/master/misc/notes/deviance.rmd>

> > explicates the details a bit more ('L' denotes log-likelihood):

>

> > * family()$aic computes $-2L$, which glm.fit translates to an AIC by

> > adding $2k$ and storing it in model$aic

> > * logLik.default retrieves model$aic and converts it back to a

> > log-likelihood

> > * stats:::AIC.default retrieves the log-likelihood and converts it

> > back to an AIC (!)

> > * family()$dev.resid() computes the squared deviance residuals

> > * stats:::residuals.glm retrieves these values and takes the signed

> > square root

>

> > cheers

> > Ben Bolker

______________________________________________

[hidden email] mailing list

https://stat.ethz.ch/mailman/listinfo/r-devel