[R] extend summary.lm for hccm?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[R] extend summary.lm for hccm?

ivo welch-2
dear R experts:

I wonder whether it is possible to extend the summary method for the
lm function, so that it uses an option "hccm" (well, model "hc0").  In
my line of work, it is pretty much required in reporting of almost all
linear regressions these days, which means that it would be very nice
not to have to manually library car, then sqrt the diagonal, and
recompute T-stats; instead, I would love to get everything in the same
format as the current output---except errors heteroskedasticity
adjusted.

easy or hard?

regards,

/iaw

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Dirk Eddelbuettel

On 23 December 2006 at 20:46, ivo welch wrote:
| I wonder whether it is possible to extend the summary method for the
| lm function, so that it uses an option "hccm" (well, model "hc0").  In
| my line of work, it is pretty much required in reporting of almost all
| linear regressions these days, which means that it would be very nice
| not to have to manually library car, then sqrt the diagonal, and
| recompute T-stats; instead, I would love to get everything in the same
| format as the current output---except errors heteroskedasticity
| adjusted.
|
| easy or hard?

Did you consider the 'sandwich' package? A simple

        > install.packages("sandwich")
        > library(sandwich)
        > ?vcovHC
        > ?vcovHAC
       
should get you there.

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
                                                  -- Thomas A. Edison

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Fox, John
Dear Dirk and Ivo,

It's true that the sandwich package has more extensive facilities for
sandwich variance estimation than the hccm function in the car package, but
I think that the thrust of Ivo's question is to get a model summary that
automatically uses the adjusted standard errors rather than having to
compute them and use them "manually."  Here's one approach, which could be
modified slightly if Ivo wants "hc0" as the default; it could also be
adapted to use the sandwich package.

I hope this helps,
 John

----------- snip --------------

summaryHCCM <- function(model, ...) UseMethod("summaryHCCM")

summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"),

    ...){
    if (!require(car)) stop("Required car package is missing.")
    type <- match.arg(type)
    V <- hccm(model, type=type)
    sumry <- summary(model)
    table <- coef(sumry)
    table[,2] <- sqrt(diag(V))
    table[,3] <- table[,1]/table[,2]
    table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE)
    sumry$coefficients <- table
    print(sumry)
    cat("Note: Heteroscedasticity-consistant standard errors using
adjustment",
        type, "\n")
    }    

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
--------------------------------

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Dirk
> Eddelbuettel
> Sent: Saturday, December 23, 2006 11:16 PM
> To: ivo welch
> Cc: [hidden email]
> Subject: Re: [R] extend summary.lm for hccm?
>
>
> On 23 December 2006 at 20:46, ivo welch wrote:
> | I wonder whether it is possible to extend the summary
> method for the
> | lm function, so that it uses an option "hccm" (well, model
> "hc0").  In
> | my line of work, it is pretty much required in reporting of
> almost all
> | linear regressions these days, which means that it would be
> very nice
> | not to have to manually library car, then sqrt the diagonal, and
> | recompute T-stats; instead, I would love to get everything
> in the same
> | format as the current output---except errors heteroskedasticity
> | adjusted.
> |
> | easy or hard?
>
> Did you consider the 'sandwich' package? A simple
>
> > install.packages("sandwich")
> > library(sandwich)
> > ?vcovHC
> > ?vcovHAC
>
> should get you there.
>
> Dirk
>
> --
> Hell, there are no rules here - we're trying to accomplish something.
>                                                   -- Thomas A. Edison
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Frank Harrell
John Fox wrote:

> Dear Dirk and Ivo,
>
> It's true that the sandwich package has more extensive facilities for
> sandwich variance estimation than the hccm function in the car package, but
> I think that the thrust of Ivo's question is to get a model summary that
> automatically uses the adjusted standard errors rather than having to
> compute them and use them "manually."  Here's one approach, which could be
> modified slightly if Ivo wants "hc0" as the default; it could also be
> adapted to use the sandwich package.
>
> I hope this helps,
>  John

Another approach:

library(Design)  # also requires library(Hmisc)
f <- ols(y ~ x1 + x2, x=TRUE, y=TRUE)
f <- robcov(f)   # sandwich; also allows clustering.  Also see bootcov
anova(f)         # all later steps use sandwich variance matrix
summmary(f)
contrast(f, list(x1=.5), list(x1=.2))

BUT note that sandwich covariance matrix estimators can have poor mean
squared error (a paper by Bill Gould in Stata Technical Bulletin years
ago related to logistic regression showed an example with a 100-fold
increase in the variance of a variance estimate) and can give you the
right estimate of the wrong quantity (reference below).

Frank Harrell

@Article{free06so,
   author =               {Freedman, David A.},
   title =                {On the so-called ``{Huber} sandwich
estimator'' and
``robust standard errors''},
   journal =      The American Statistician,
   year =                 2006,
   volume =               60,
   pages =                {299-302},
   annote =               {nice summary of derivation of sandwich
estimators;questions why we should be interested in getting the right
variance of the wrong parameters when the model doesn't fit}
}


>
> ----------- snip --------------
>
> summaryHCCM <- function(model, ...) UseMethod("summaryHCCM")
>
> summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"),
>
>     ...){
>     if (!require(car)) stop("Required car package is missing.")
>     type <- match.arg(type)
>     V <- hccm(model, type=type)
>     sumry <- summary(model)
>     table <- coef(sumry)
>     table[,2] <- sqrt(diag(V))
>     table[,3] <- table[,1]/table[,2]
>     table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE)
>     sumry$coefficients <- table
>     print(sumry)
>     cat("Note: Heteroscedasticity-consistant standard errors using
> adjustment",
>         type, "\n")
>     }    
>
> --------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> http://socserv.mcmaster.ca/jfox 
> --------------------------------
>
>> -----Original Message-----
>> From: [hidden email]
>> [mailto:[hidden email]] On Behalf Of Dirk
>> Eddelbuettel
>> Sent: Saturday, December 23, 2006 11:16 PM
>> To: ivo welch
>> Cc: [hidden email]
>> Subject: Re: [R] extend summary.lm for hccm?
>>
>>
>> On 23 December 2006 at 20:46, ivo welch wrote:
>> | I wonder whether it is possible to extend the summary
>> method for the
>> | lm function, so that it uses an option "hccm" (well, model
>> "hc0").  In
>> | my line of work, it is pretty much required in reporting of
>> almost all
>> | linear regressions these days, which means that it would be
>> very nice
>> | not to have to manually library car, then sqrt the diagonal, and
>> | recompute T-stats; instead, I would love to get everything
>> in the same
>> | format as the current output---except errors heteroskedasticity
>> | adjusted.
>> |
>> | easy or hard?
>>
>> Did you consider the 'sandwich' package? A simple
>>
>> > install.packages("sandwich")
>> > library(sandwich)
>> > ?vcovHC
>> > ?vcovHAC
>>
>> should get you there.
>>
>> Dirk
>>
>> --



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Fox, John
Dear Frank,

If I remember Freedman's recent paper correctly, he argues that sandwich
variance estimator, though problematic in general, is not problematic in the
case that White described -- an otherwise correctly specified linear model
with heteroscedasticity estimated by least-squares.

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
--------------------------------

> -----Original Message-----
> From: [hidden email]
> [mailto:[hidden email]] On Behalf Of Frank
> E Harrell Jr
> Sent: Sunday, December 24, 2006 10:29 AM
> To: John Fox
> Cc: [hidden email]; 'ivo welch'; 'Dirk Eddelbuettel'
> Subject: Re: [R] extend summary.lm for hccm?
>
> John Fox wrote:
> > Dear Dirk and Ivo,
> >
> > It's true that the sandwich package has more extensive
> facilities for
> > sandwich variance estimation than the hccm function in the car
> > package, but I think that the thrust of Ivo's question is to get a
> > model summary that automatically uses the adjusted standard errors
> > rather than having to compute them and use them "manually."  Here's
> > one approach, which could be modified slightly if Ivo wants
> "hc0" as
> > the default; it could also be adapted to use the sandwich package.
> >
> > I hope this helps,
> >  John
>
> Another approach:
>
> library(Design)  # also requires library(Hmisc) f <- ols(y ~
> x1 + x2, x=TRUE, y=TRUE)
> f <- robcov(f)   # sandwich; also allows clustering.  Also see bootcov
> anova(f)         # all later steps use sandwich variance matrix
> summmary(f)
> contrast(f, list(x1=.5), list(x1=.2))
>
> BUT note that sandwich covariance matrix estimators can have
> poor mean squared error (a paper by Bill Gould in Stata
> Technical Bulletin years ago related to logistic regression
> showed an example with a 100-fold increase in the variance of
> a variance estimate) and can give you the right estimate of
> the wrong quantity (reference below).
>
> Frank Harrell
>
> @Article{free06so,
>    author =               {Freedman, David A.},
>    title =                {On the so-called ``{Huber} sandwich
> estimator'' and
> ``robust standard errors''},
>    journal =      The American Statistician,
>    year =                 2006,
>    volume =               60,
>    pages =                {299-302},
>    annote =               {nice summary of derivation of sandwich
> estimators;questions why we should be interested in getting
> the right variance of the wrong parameters when the model
> doesn't fit} }
>
>
> >
> > ----------- snip --------------
> >
> > summaryHCCM <- function(model, ...) UseMethod("summaryHCCM")
> >
> > summaryHCCM.lm <- function(model, type=c("hc3", "hc0",
> "hc1", "hc2",
> > "hc4"),
> >
> >     ...){
> >     if (!require(car)) stop("Required car package is missing.")
> >     type <- match.arg(type)
> >     V <- hccm(model, type=type)
> >     sumry <- summary(model)
> >     table <- coef(sumry)
> >     table[,2] <- sqrt(diag(V))
> >     table[,3] <- table[,1]/table[,2]
> >     table[,4] <- 2*pt(abs(table[,3]), df.residual(model),
> lower.tail=FALSE)
> >     sumry$coefficients <- table
> >     print(sumry)
> >     cat("Note: Heteroscedasticity-consistant standard errors using
> > adjustment",
> >         type, "\n")
> >     }    
> >
> > --------------------------------
> > John Fox
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario
> > Canada L8S 4M4
> > 905-525-9140x23604
> > http://socserv.mcmaster.ca/jfox
> > --------------------------------
> >
> >> -----Original Message-----
> >> From: [hidden email]
> >> [mailto:[hidden email]] On Behalf Of Dirk
> >> Eddelbuettel
> >> Sent: Saturday, December 23, 2006 11:16 PM
> >> To: ivo welch
> >> Cc: [hidden email]
> >> Subject: Re: [R] extend summary.lm for hccm?
> >>
> >>
> >> On 23 December 2006 at 20:46, ivo welch wrote:
> >> | I wonder whether it is possible to extend the summary
> >> method for the
> >> | lm function, so that it uses an option "hccm" (well, model
> >> "hc0").  In
> >> | my line of work, it is pretty much required in reporting of
> >> almost all
> >> | linear regressions these days, which means that it would be
> >> very nice
> >> | not to have to manually library car, then sqrt the diagonal, and
> >> | recompute T-stats; instead, I would love to get everything
> >> in the same
> >> | format as the current output---except errors heteroskedasticity
> >> | adjusted.
> >> |
> >> | easy or hard?
> >>
> >> Did you consider the 'sandwich' package? A simple
> >>
> >> > install.packages("sandwich")
> >> > library(sandwich)
> >> > ?vcovHC
> >> > ?vcovHAC
> >>
> >> should get you there.
> >>
> >> Dirk
> >>
> >> --
>
>
>
> --
> Frank E Harrell Jr   Professor and Chair           School of Medicine
>                       Department of Biostatistics  
> Vanderbilt University
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Frank Harrell
John Fox wrote:
> Dear Frank,
>
> If I remember Freedman's recent paper correctly, he argues that sandwich
> variance estimator, though problematic in general, is not problematic in the
> case that White described -- an otherwise correctly specified linear model
> with heteroscedasticity estimated by least-squares.
>
> Regards,
>  John

That's right John although the precision of the variance estimators is
still worth studying in that case.

Best regards,
Frank

>
> --------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 905-525-9140x23604
> http://socserv.mcmaster.ca/jfox 
> --------------------------------
>
>> -----Original Message-----
>> From: [hidden email]
>> [mailto:[hidden email]] On Behalf Of Frank
>> E Harrell Jr
>> Sent: Sunday, December 24, 2006 10:29 AM
>> To: John Fox
>> Cc: [hidden email]; 'ivo welch'; 'Dirk Eddelbuettel'
>> Subject: Re: [R] extend summary.lm for hccm?
>>
>> John Fox wrote:
>>> Dear Dirk and Ivo,
>>>
>>> It's true that the sandwich package has more extensive
>> facilities for
>>> sandwich variance estimation than the hccm function in the car
>>> package, but I think that the thrust of Ivo's question is to get a
>>> model summary that automatically uses the adjusted standard errors
>>> rather than having to compute them and use them "manually."  Here's
>>> one approach, which could be modified slightly if Ivo wants
>> "hc0" as
>>> the default; it could also be adapted to use the sandwich package.
>>>
>>> I hope this helps,
>>>  John
>> Another approach:
>>
>> library(Design)  # also requires library(Hmisc) f <- ols(y ~
>> x1 + x2, x=TRUE, y=TRUE)
>> f <- robcov(f)   # sandwich; also allows clustering.  Also see bootcov
>> anova(f)         # all later steps use sandwich variance matrix
>> summmary(f)
>> contrast(f, list(x1=.5), list(x1=.2))
>>
>> BUT note that sandwich covariance matrix estimators can have
>> poor mean squared error (a paper by Bill Gould in Stata
>> Technical Bulletin years ago related to logistic regression
>> showed an example with a 100-fold increase in the variance of
>> a variance estimate) and can give you the right estimate of
>> the wrong quantity (reference below).
>>
>> Frank Harrell
>>
>> @Article{free06so,
>>    author =               {Freedman, David A.},
>>    title =                {On the so-called ``{Huber} sandwich
>> estimator'' and
>> ``robust standard errors''},
>>    journal =      The American Statistician,
>>    year =                 2006,
>>    volume =               60,
>>    pages =                {299-302},
>>    annote =               {nice summary of derivation of sandwich
>> estimators;questions why we should be interested in getting
>> the right variance of the wrong parameters when the model
>> doesn't fit} }
>>
>>
>>> ----------- snip --------------
>>>
>>> summaryHCCM <- function(model, ...) UseMethod("summaryHCCM")
>>>
>>> summaryHCCM.lm <- function(model, type=c("hc3", "hc0",
>> "hc1", "hc2",
>>> "hc4"),
>>>
>>>     ...){
>>>     if (!require(car)) stop("Required car package is missing.")
>>>     type <- match.arg(type)
>>>     V <- hccm(model, type=type)
>>>     sumry <- summary(model)
>>>     table <- coef(sumry)
>>>     table[,2] <- sqrt(diag(V))
>>>     table[,3] <- table[,1]/table[,2]
>>>     table[,4] <- 2*pt(abs(table[,3]), df.residual(model),
>> lower.tail=FALSE)
>>>     sumry$coefficients <- table
>>>     print(sumry)
>>>     cat("Note: Heteroscedasticity-consistant standard errors using
>>> adjustment",
>>>         type, "\n")
>>>     }    
>>>
>>> --------------------------------
>>> John Fox
>>> Department of Sociology
>>> McMaster University
>>> Hamilton, Ontario
>>> Canada L8S 4M4
>>> 905-525-9140x23604
>>> http://socserv.mcmaster.ca/jfox
>>> --------------------------------
>>>
>>>> -----Original Message-----
>>>> From: [hidden email]
>>>> [mailto:[hidden email]] On Behalf Of Dirk
>>>> Eddelbuettel
>>>> Sent: Saturday, December 23, 2006 11:16 PM
>>>> To: ivo welch
>>>> Cc: [hidden email]
>>>> Subject: Re: [R] extend summary.lm for hccm?
>>>>
>>>>
>>>> On 23 December 2006 at 20:46, ivo welch wrote:
>>>> | I wonder whether it is possible to extend the summary
>>>> method for the
>>>> | lm function, so that it uses an option "hccm" (well, model
>>>> "hc0").  In
>>>> | my line of work, it is pretty much required in reporting of
>>>> almost all
>>>> | linear regressions these days, which means that it would be
>>>> very nice
>>>> | not to have to manually library car, then sqrt the diagonal, and
>>>> | recompute T-stats; instead, I would love to get everything
>>>> in the same
>>>> | format as the current output---except errors heteroskedasticity
>>>> | adjusted.
>>>> |
>>>> | easy or hard?
>>>>
>>>> Did you consider the 'sandwich' package? A simple
>>>>
>>>> > install.packages("sandwich")
>>>> > library(sandwich)
>>>> > ?vcovHC
>>>> > ?vcovHAC
>>>>
>>>> should get you there.
>>>>
>>>> Dirk

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

ivo welch-2
thank you, everybody.  It's hard to find much fault with R, but one
feature that would be nice to have would be hooks to output functions
that make it easy to augment to the output---e.g., to add a new
statistic to summary (e.g., mean, sd, different stderrs).  not a big
flaw, because one can write replacement functions...

 /ivo

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Achim Zeileis
In reply to this post by Fox, John
On Sun, 24 Dec 2006, John Fox wrote:

> Dear Dirk and Ivo,
>
> It's true that the sandwich package has more extensive facilities for
> sandwich variance estimation than the hccm function in the car package, but
> I think that the thrust of Ivo's question is to get a model summary that
> automatically uses the adjusted standard errors rather than having to
> compute them and use them "manually."

I've written the function coeftest() in package "lmtest" for this purpose.
With this you can
  coeftest(obj, vcov = sandwich)
or you can put this into a specialized summary() function as John
suggested (where you probably would want the F statistic to use the other
vcov as well). See also function waldtest() in "lmtest",
linear.hypothesis() in "car" and
  vignette("sandwich", package = "sandwich")

Although this works, it is still a nuisance to use a different function
and not summary() directly. In addition, it would also be nice to plug in
different vcovs into confint() or predict() methods. Of course, one could
write different generics or overload the methods in certain packages.
However, I guess that many practitioners want to use different vcov
estimators - especially in the social and political scieneces, and
econometrics etc. - so that this might justify that the original "lm" (and
"glm") methods are extended to allow for plugging in different vcov
matrices. Maybe we could try to convince R-core to include somthing like
this?

Best wishes,
Z

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Achim Zeileis
In reply to this post by Fox, John
On Sun, 24 Dec 2006, John Fox wrote:

> If I remember Freedman's recent paper correctly, he argues that sandwich
> variance estimator, though problematic in general, is not problematic in the
> case that White described -- an otherwise correctly specified linear model
> with heteroscedasticity estimated by least-squares.

More generally, sandwich-type estimators are valid (i.e., estimate
the right quantity, although not necessarily precisely, as Frank pointed
out) in situations where the estimating functions are correctly specified
but remaining aspets of the likelihood (not captured in the estimating
functions) are potentially not.

In linear models, it is easy to see what this means: the mean function has
to be correctly specified (i.e., the errors have zero mean) but the
correlation structure of the errors (i.e., their (co-)variances) might
differ from the usual assumptions. In GLMs, in particular logistic
regression, it is much more difficult to see against which types of
misspecification sandwich-based inference is robust.

Freedman's paper stresses the point that many model misspecifications also
imply misspecified estimating functions (and in his example this is rather
obvious) so that consequently the sandwich-type estimators estimate the
wrong quantity.

Best wishes,
Z

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

Fox, John
In reply to this post by Achim Zeileis
Dear Achim,

> -----Original Message-----
> From: Achim Zeileis [mailto:[hidden email]]
> Sent: Monday, December 25, 2006 8:38 AM
> To: John Fox
> Cc: 'Dirk Eddelbuettel'; 'ivo welch'; [hidden email]
> Subject: Re: [R] extend summary.lm for hccm?
>
> On Sun, 24 Dec 2006, John Fox wrote:
>
> > Dear Dirk and Ivo,
> >
> > It's true that the sandwich package has more extensive
> facilities for
> > sandwich variance estimation than the hccm function in the car
> > package, but I think that the thrust of Ivo's question is to get a
> > model summary that automatically uses the adjusted standard errors
> > rather than having to compute them and use them "manually."
>
> I've written the function coeftest() in package "lmtest" for
> this purpose.
> With this you can
>   coeftest(obj, vcov = sandwich)

Since coeftest() already exists in a package, I think that this is a
preferable solution.

> or you can put this into a specialized summary() function as
> John suggested (where you probably would want the F statistic
> to use the other vcov as well).

Good point. Here's a modified version that also fixes up the F-test:

summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"),

    ...){
    if (!require(car)) stop("Required car package is missing.")
    type <- match.arg(type)
    V <- hccm(model, type=type)
    sumry <- summary(model)
    table <- coef(sumry)
    table[,2] <- sqrt(diag(V))
    table[,3] <- table[,1]/table[,2]
    table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE)
    sumry$coefficients <- table
    p <- nrow(table)
    hyp <- if (has.intercept(model)) cbind(0, diag(p - 1)) else diag(p)
    sumry$fstatistic[1] <- linear.hypothesis(model, hyp,
white.adjust=type)[2,"F"]
    print(sumry)
    cat("Note: Heteroscedasticity-consistant standard errors using
adjustment",
        type, "\n")
    }

Regards,
 John

> See also function waldtest()
> in "lmtest",
> linear.hypothesis() in "car" and
>   vignette("sandwich", package = "sandwich")
>
> Although this works, it is still a nuisance to use a
> different function and not summary() directly. In addition,
> it would also be nice to plug in different vcovs into
> confint() or predict() methods. Of course, one could write
> different generics or overload the methods in certain packages.
> However, I guess that many practitioners want to use
> different vcov estimators - especially in the social and
> political scieneces, and econometrics etc. - so that this
> might justify that the original "lm" (and
> "glm") methods are extended to allow for plugging in
> different vcov matrices. Maybe we could try to convince
> R-core to include somthing like this?
>
> Best wishes,
> Z
>
>
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [R] extend summary.lm for hccm?

cberry
In reply to this post by Achim Zeileis
On Mon, 25 Dec 2006, Achim Zeileis wrote:

> On Sun, 24 Dec 2006, John Fox wrote:
>
>> If I remember Freedman's recent paper correctly, he argues that sandwich
>> variance estimator, though problematic in general, is not problematic in the
>> case that White described -- an otherwise correctly specified linear model
>> with heteroscedasticity estimated by least-squares.
>
> More generally, sandwich-type estimators are valid (i.e., estimate
> the right quantity, although not necessarily precisely, as Frank pointed
> out) in situations where the estimating functions are correctly specified
> but remaining aspets of the likelihood (not captured in the estimating
> functions) are potentially not.

Not exactly. The asymptotic properites are good, but in samples of
moderate size the properties (including both biasedness and variance) can
be surprisingly bad. And if you are trying to calculate a p-value, getting
a too-small variance estimate gives you spuriously small p-values. FWIW,
I've seen a case in which the nominal size of a test based on the sandwich
estimator was several orders of magnitude smaller than a test with correct
nominal size.

There is a modest literature on this. Some refs:

Background Papers:

Drum M, McCullagh P. Comment. Statistical Science 1993; 8:300-301.

Freedman DA. On the So-Called "Huber Sandwich Estimator" and "Robust"
Standard Errors. The American Statistician, Volume 60, Number 4,
November 2006, pp. 299-302(4)

-----

Some Proposed Corrections:

Fay MP and Graubard BI. Small-Sample Adjustments for
Walt-Type Tests Using Sandwich Estimators. Biometrics 2001; 57:
1198-1206.

Guo X, Pan W, Connett JE, Hannan PJ and French SA.  Small-sample
performance of the robust score test and its modifications in
generalized estimating equations Statistics in Medicine 2005;
24:3479-3495

Mancl LA and DeRouen TA, A Covariance Estimator for GEE with Improved
Small-Sample Properties. Biometrics 2001; 57:126-134.

Morel JG, Bokossa MC, and Neerchal NK.  Small Sample Correction for
the Variance of GEE Estimators Biometrical Journal 2003; 45(4):
395-409.

Pan W and Wall MM. Small-sample adjustments in using the
sandwich variance estimator in generalized estimating
equations. Statistics in Medicine  2002; 21:1429-1441.


HTH,

Chuck

>
> In linear models, it is easy to see what this means: the mean function has
> to be correctly specified (i.e., the errors have zero mean) but the
> correlation structure of the errors (i.e., their (co-)variances) might
> differ from the usual assumptions. In GLMs, in particular logistic
> regression, it is much more difficult to see against which types of
> misspecification sandwich-based inference is robust.
>
> Freedman's paper stresses the point that many model misspecifications also
> imply misspecified estimating functions (and in his example this is rather
> obvious) so that consequently the sandwich-type estimators estimate the
> wrong quantity.
>
> Best wishes,
> Z
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:[hidden email]         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Loading...