

dear R experts:
I wonder whether it is possible to extend the summary method for the
lm function, so that it uses an option "hccm" (well, model "hc0"). In
my line of work, it is pretty much required in reporting of almost all
linear regressions these days, which means that it would be very nice
not to have to manually library car, then sqrt the diagonal, and
recompute Tstats; instead, I would love to get everything in the same
format as the current outputexcept errors heteroskedasticity
adjusted.
easy or hard?
regards,
/iaw
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


On 23 December 2006 at 20:46, ivo welch wrote:
 I wonder whether it is possible to extend the summary method for the
 lm function, so that it uses an option "hccm" (well, model "hc0"). In
 my line of work, it is pretty much required in reporting of almost all
 linear regressions these days, which means that it would be very nice
 not to have to manually library car, then sqrt the diagonal, and
 recompute Tstats; instead, I would love to get everything in the same
 format as the current outputexcept errors heteroskedasticity
 adjusted.

 easy or hard?
Did you consider the 'sandwich' package? A simple
> install.packages("sandwich")
> library(sandwich)
> ?vcovHC
> ?vcovHAC
should get you there.
Dirk

Hell, there are no rules here  we're trying to accomplish something.
 Thomas A. Edison
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Dear Dirk and Ivo,
It's true that the sandwich package has more extensive facilities for
sandwich variance estimation than the hccm function in the car package, but
I think that the thrust of Ivo's question is to get a model summary that
automatically uses the adjusted standard errors rather than having to
compute them and use them "manually." Here's one approach, which could be
modified slightly if Ivo wants "hc0" as the default; it could also be
adapted to use the sandwich package.
I hope this helps,
John
 snip 
summaryHCCM < function(model, ...) UseMethod("summaryHCCM")
summaryHCCM.lm < function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"),
...){
if (!require(car)) stop("Required car package is missing.")
type < match.arg(type)
V < hccm(model, type=type)
sumry < summary(model)
table < coef(sumry)
table[,2] < sqrt(diag(V))
table[,3] < table[,1]/table[,2]
table[,4] < 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE)
sumry$coefficients < table
print(sumry)
cat("Note: Heteroscedasticityconsistant standard errors using
adjustment",
type, "\n")
}

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
9055259140x23604
http://socserv.mcmaster.ca/jfox

> Original Message
> From: [hidden email]
> [mailto: [hidden email]] On Behalf Of Dirk
> Eddelbuettel
> Sent: Saturday, December 23, 2006 11:16 PM
> To: ivo welch
> Cc: [hidden email]
> Subject: Re: [R] extend summary.lm for hccm?
>
>
> On 23 December 2006 at 20:46, ivo welch wrote:
>  I wonder whether it is possible to extend the summary
> method for the
>  lm function, so that it uses an option "hccm" (well, model
> "hc0"). In
>  my line of work, it is pretty much required in reporting of
> almost all
>  linear regressions these days, which means that it would be
> very nice
>  not to have to manually library car, then sqrt the diagonal, and
>  recompute Tstats; instead, I would love to get everything
> in the same
>  format as the current outputexcept errors heteroskedasticity
>  adjusted.
> 
>  easy or hard?
>
> Did you consider the 'sandwich' package? A simple
>
> > install.packages("sandwich")
> > library(sandwich)
> > ?vcovHC
> > ?vcovHAC
>
> should get you there.
>
> Dirk
>
> 
> Hell, there are no rules here  we're trying to accomplish something.
>  Thomas A. Edison
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide
> http://www.Rproject.org/postingguide.html> and provide commented, minimal, selfcontained, reproducible code.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


John Fox wrote:
> Dear Dirk and Ivo,
>
> It's true that the sandwich package has more extensive facilities for
> sandwich variance estimation than the hccm function in the car package, but
> I think that the thrust of Ivo's question is to get a model summary that
> automatically uses the adjusted standard errors rather than having to
> compute them and use them "manually." Here's one approach, which could be
> modified slightly if Ivo wants "hc0" as the default; it could also be
> adapted to use the sandwich package.
>
> I hope this helps,
> John
Another approach:
library(Design) # also requires library(Hmisc)
f < ols(y ~ x1 + x2, x=TRUE, y=TRUE)
f < robcov(f) # sandwich; also allows clustering. Also see bootcov
anova(f) # all later steps use sandwich variance matrix
summmary(f)
contrast(f, list(x1=.5), list(x1=.2))
BUT note that sandwich covariance matrix estimators can have poor mean
squared error (a paper by Bill Gould in Stata Technical Bulletin years
ago related to logistic regression showed an example with a 100fold
increase in the variance of a variance estimate) and can give you the
right estimate of the wrong quantity (reference below).
Frank Harrell
@Article{free06so,
author = {Freedman, David A.},
title = {On the socalled ``{Huber} sandwich
estimator'' and
``robust standard errors''},
journal = The American Statistician,
year = 2006,
volume = 60,
pages = {299302},
annote = {nice summary of derivation of sandwich
estimators;questions why we should be interested in getting the right
variance of the wrong parameters when the model doesn't fit}
}
>
>  snip 
>
> summaryHCCM < function(model, ...) UseMethod("summaryHCCM")
>
> summaryHCCM.lm < function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"),
>
> ...){
> if (!require(car)) stop("Required car package is missing.")
> type < match.arg(type)
> V < hccm(model, type=type)
> sumry < summary(model)
> table < coef(sumry)
> table[,2] < sqrt(diag(V))
> table[,3] < table[,1]/table[,2]
> table[,4] < 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE)
> sumry$coefficients < table
> print(sumry)
> cat("Note: Heteroscedasticityconsistant standard errors using
> adjustment",
> type, "\n")
> }
>
> 
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 9055259140x23604
> http://socserv.mcmaster.ca/jfox
> 
>
>> Original Message
>> From: [hidden email]
>> [mailto: [hidden email]] On Behalf Of Dirk
>> Eddelbuettel
>> Sent: Saturday, December 23, 2006 11:16 PM
>> To: ivo welch
>> Cc: [hidden email]
>> Subject: Re: [R] extend summary.lm for hccm?
>>
>>
>> On 23 December 2006 at 20:46, ivo welch wrote:
>>  I wonder whether it is possible to extend the summary
>> method for the
>>  lm function, so that it uses an option "hccm" (well, model
>> "hc0"). In
>>  my line of work, it is pretty much required in reporting of
>> almost all
>>  linear regressions these days, which means that it would be
>> very nice
>>  not to have to manually library car, then sqrt the diagonal, and
>>  recompute Tstats; instead, I would love to get everything
>> in the same
>>  format as the current outputexcept errors heteroskedasticity
>>  adjusted.
>> 
>>  easy or hard?
>>
>> Did you consider the 'sandwich' package? A simple
>>
>> > install.packages("sandwich")
>> > library(sandwich)
>> > ?vcovHC
>> > ?vcovHAC
>>
>> should get you there.
>>
>> Dirk
>>
>> 

Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University


Dear Frank,
If I remember Freedman's recent paper correctly, he argues that sandwich
variance estimator, though problematic in general, is not problematic in the
case that White described  an otherwise correctly specified linear model
with heteroscedasticity estimated by leastsquares.
Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
9055259140x23604
http://socserv.mcmaster.ca/jfox

> Original Message
> From: [hidden email]
> [mailto: [hidden email]] On Behalf Of Frank
> E Harrell Jr
> Sent: Sunday, December 24, 2006 10:29 AM
> To: John Fox
> Cc: [hidden email]; 'ivo welch'; 'Dirk Eddelbuettel'
> Subject: Re: [R] extend summary.lm for hccm?
>
> John Fox wrote:
> > Dear Dirk and Ivo,
> >
> > It's true that the sandwich package has more extensive
> facilities for
> > sandwich variance estimation than the hccm function in the car
> > package, but I think that the thrust of Ivo's question is to get a
> > model summary that automatically uses the adjusted standard errors
> > rather than having to compute them and use them "manually." Here's
> > one approach, which could be modified slightly if Ivo wants
> "hc0" as
> > the default; it could also be adapted to use the sandwich package.
> >
> > I hope this helps,
> > John
>
> Another approach:
>
> library(Design) # also requires library(Hmisc) f < ols(y ~
> x1 + x2, x=TRUE, y=TRUE)
> f < robcov(f) # sandwich; also allows clustering. Also see bootcov
> anova(f) # all later steps use sandwich variance matrix
> summmary(f)
> contrast(f, list(x1=.5), list(x1=.2))
>
> BUT note that sandwich covariance matrix estimators can have
> poor mean squared error (a paper by Bill Gould in Stata
> Technical Bulletin years ago related to logistic regression
> showed an example with a 100fold increase in the variance of
> a variance estimate) and can give you the right estimate of
> the wrong quantity (reference below).
>
> Frank Harrell
>
> @Article{free06so,
> author = {Freedman, David A.},
> title = {On the socalled ``{Huber} sandwich
> estimator'' and
> ``robust standard errors''},
> journal = The American Statistician,
> year = 2006,
> volume = 60,
> pages = {299302},
> annote = {nice summary of derivation of sandwich
> estimators;questions why we should be interested in getting
> the right variance of the wrong parameters when the model
> doesn't fit} }
>
>
> >
> >  snip 
> >
> > summaryHCCM < function(model, ...) UseMethod("summaryHCCM")
> >
> > summaryHCCM.lm < function(model, type=c("hc3", "hc0",
> "hc1", "hc2",
> > "hc4"),
> >
> > ...){
> > if (!require(car)) stop("Required car package is missing.")
> > type < match.arg(type)
> > V < hccm(model, type=type)
> > sumry < summary(model)
> > table < coef(sumry)
> > table[,2] < sqrt(diag(V))
> > table[,3] < table[,1]/table[,2]
> > table[,4] < 2*pt(abs(table[,3]), df.residual(model),
> lower.tail=FALSE)
> > sumry$coefficients < table
> > print(sumry)
> > cat("Note: Heteroscedasticityconsistant standard errors using
> > adjustment",
> > type, "\n")
> > }
> >
> > 
> > John Fox
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario
> > Canada L8S 4M4
> > 9055259140x23604
> > http://socserv.mcmaster.ca/jfox> > 
> >
> >> Original Message
> >> From: [hidden email]
> >> [mailto: [hidden email]] On Behalf Of Dirk
> >> Eddelbuettel
> >> Sent: Saturday, December 23, 2006 11:16 PM
> >> To: ivo welch
> >> Cc: [hidden email]
> >> Subject: Re: [R] extend summary.lm for hccm?
> >>
> >>
> >> On 23 December 2006 at 20:46, ivo welch wrote:
> >>  I wonder whether it is possible to extend the summary
> >> method for the
> >>  lm function, so that it uses an option "hccm" (well, model
> >> "hc0"). In
> >>  my line of work, it is pretty much required in reporting of
> >> almost all
> >>  linear regressions these days, which means that it would be
> >> very nice
> >>  not to have to manually library car, then sqrt the diagonal, and
> >>  recompute Tstats; instead, I would love to get everything
> >> in the same
> >>  format as the current outputexcept errors heteroskedasticity
> >>  adjusted.
> >> 
> >>  easy or hard?
> >>
> >> Did you consider the 'sandwich' package? A simple
> >>
> >> > install.packages("sandwich")
> >> > library(sandwich)
> >> > ?vcovHC
> >> > ?vcovHAC
> >>
> >> should get you there.
> >>
> >> Dirk
> >>
> >> 
>
>
>
> 
> Frank E Harrell Jr Professor and Chair School of Medicine
> Department of Biostatistics
> Vanderbilt University
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide
> http://www.Rproject.org/postingguide.html> and provide commented, minimal, selfcontained, reproducible code.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


John Fox wrote:
> Dear Frank,
>
> If I remember Freedman's recent paper correctly, he argues that sandwich
> variance estimator, though problematic in general, is not problematic in the
> case that White described  an otherwise correctly specified linear model
> with heteroscedasticity estimated by leastsquares.
>
> Regards,
> John
That's right John although the precision of the variance estimators is
still worth studying in that case.
Best regards,
Frank
>
> 
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> 9055259140x23604
> http://socserv.mcmaster.ca/jfox
> 
>
>> Original Message
>> From: [hidden email]
>> [mailto: [hidden email]] On Behalf Of Frank
>> E Harrell Jr
>> Sent: Sunday, December 24, 2006 10:29 AM
>> To: John Fox
>> Cc: [hidden email]; 'ivo welch'; 'Dirk Eddelbuettel'
>> Subject: Re: [R] extend summary.lm for hccm?
>>
>> John Fox wrote:
>>> Dear Dirk and Ivo,
>>>
>>> It's true that the sandwich package has more extensive
>> facilities for
>>> sandwich variance estimation than the hccm function in the car
>>> package, but I think that the thrust of Ivo's question is to get a
>>> model summary that automatically uses the adjusted standard errors
>>> rather than having to compute them and use them "manually." Here's
>>> one approach, which could be modified slightly if Ivo wants
>> "hc0" as
>>> the default; it could also be adapted to use the sandwich package.
>>>
>>> I hope this helps,
>>> John
>> Another approach:
>>
>> library(Design) # also requires library(Hmisc) f < ols(y ~
>> x1 + x2, x=TRUE, y=TRUE)
>> f < robcov(f) # sandwich; also allows clustering. Also see bootcov
>> anova(f) # all later steps use sandwich variance matrix
>> summmary(f)
>> contrast(f, list(x1=.5), list(x1=.2))
>>
>> BUT note that sandwich covariance matrix estimators can have
>> poor mean squared error (a paper by Bill Gould in Stata
>> Technical Bulletin years ago related to logistic regression
>> showed an example with a 100fold increase in the variance of
>> a variance estimate) and can give you the right estimate of
>> the wrong quantity (reference below).
>>
>> Frank Harrell
>>
>> @Article{free06so,
>> author = {Freedman, David A.},
>> title = {On the socalled ``{Huber} sandwich
>> estimator'' and
>> ``robust standard errors''},
>> journal = The American Statistician,
>> year = 2006,
>> volume = 60,
>> pages = {299302},
>> annote = {nice summary of derivation of sandwich
>> estimators;questions why we should be interested in getting
>> the right variance of the wrong parameters when the model
>> doesn't fit} }
>>
>>
>>>  snip 
>>>
>>> summaryHCCM < function(model, ...) UseMethod("summaryHCCM")
>>>
>>> summaryHCCM.lm < function(model, type=c("hc3", "hc0",
>> "hc1", "hc2",
>>> "hc4"),
>>>
>>> ...){
>>> if (!require(car)) stop("Required car package is missing.")
>>> type < match.arg(type)
>>> V < hccm(model, type=type)
>>> sumry < summary(model)
>>> table < coef(sumry)
>>> table[,2] < sqrt(diag(V))
>>> table[,3] < table[,1]/table[,2]
>>> table[,4] < 2*pt(abs(table[,3]), df.residual(model),
>> lower.tail=FALSE)
>>> sumry$coefficients < table
>>> print(sumry)
>>> cat("Note: Heteroscedasticityconsistant standard errors using
>>> adjustment",
>>> type, "\n")
>>> }
>>>
>>> 
>>> John Fox
>>> Department of Sociology
>>> McMaster University
>>> Hamilton, Ontario
>>> Canada L8S 4M4
>>> 9055259140x23604
>>> http://socserv.mcmaster.ca/jfox>>> 
>>>
>>>> Original Message
>>>> From: [hidden email]
>>>> [mailto: [hidden email]] On Behalf Of Dirk
>>>> Eddelbuettel
>>>> Sent: Saturday, December 23, 2006 11:16 PM
>>>> To: ivo welch
>>>> Cc: [hidden email]
>>>> Subject: Re: [R] extend summary.lm for hccm?
>>>>
>>>>
>>>> On 23 December 2006 at 20:46, ivo welch wrote:
>>>>  I wonder whether it is possible to extend the summary
>>>> method for the
>>>>  lm function, so that it uses an option "hccm" (well, model
>>>> "hc0"). In
>>>>  my line of work, it is pretty much required in reporting of
>>>> almost all
>>>>  linear regressions these days, which means that it would be
>>>> very nice
>>>>  not to have to manually library car, then sqrt the diagonal, and
>>>>  recompute Tstats; instead, I would love to get everything
>>>> in the same
>>>>  format as the current outputexcept errors heteroskedasticity
>>>>  adjusted.
>>>> 
>>>>  easy or hard?
>>>>
>>>> Did you consider the 'sandwich' package? A simple
>>>>
>>>> > install.packages("sandwich")
>>>> > library(sandwich)
>>>> > ?vcovHC
>>>> > ?vcovHAC
>>>>
>>>> should get you there.
>>>>
>>>> Dirk
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University


thank you, everybody. It's hard to find much fault with R, but one
feature that would be nice to have would be hooks to output functions
that make it easy to augment to the outpute.g., to add a new
statistic to summary (e.g., mean, sd, different stderrs). not a big
flaw, because one can write replacement functions...
/ivo
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


On Sun, 24 Dec 2006, John Fox wrote:
> Dear Dirk and Ivo,
>
> It's true that the sandwich package has more extensive facilities for
> sandwich variance estimation than the hccm function in the car package, but
> I think that the thrust of Ivo's question is to get a model summary that
> automatically uses the adjusted standard errors rather than having to
> compute them and use them "manually."
I've written the function coeftest() in package "lmtest" for this purpose.
With this you can
coeftest(obj, vcov = sandwich)
or you can put this into a specialized summary() function as John
suggested (where you probably would want the F statistic to use the other
vcov as well). See also function waldtest() in "lmtest",
linear.hypothesis() in "car" and
vignette("sandwich", package = "sandwich")
Although this works, it is still a nuisance to use a different function
and not summary() directly. In addition, it would also be nice to plug in
different vcovs into confint() or predict() methods. Of course, one could
write different generics or overload the methods in certain packages.
However, I guess that many practitioners want to use different vcov
estimators  especially in the social and political scieneces, and
econometrics etc.  so that this might justify that the original "lm" (and
"glm") methods are extended to allow for plugging in different vcov
matrices. Maybe we could try to convince Rcore to include somthing like
this?
Best wishes,
Z
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


On Sun, 24 Dec 2006, John Fox wrote:
> If I remember Freedman's recent paper correctly, he argues that sandwich
> variance estimator, though problematic in general, is not problematic in the
> case that White described  an otherwise correctly specified linear model
> with heteroscedasticity estimated by leastsquares.
More generally, sandwichtype estimators are valid (i.e., estimate
the right quantity, although not necessarily precisely, as Frank pointed
out) in situations where the estimating functions are correctly specified
but remaining aspets of the likelihood (not captured in the estimating
functions) are potentially not.
In linear models, it is easy to see what this means: the mean function has
to be correctly specified (i.e., the errors have zero mean) but the
correlation structure of the errors (i.e., their (co)variances) might
differ from the usual assumptions. In GLMs, in particular logistic
regression, it is much more difficult to see against which types of
misspecification sandwichbased inference is robust.
Freedman's paper stresses the point that many model misspecifications also
imply misspecified estimating functions (and in his example this is rather
obvious) so that consequently the sandwichtype estimators estimate the
wrong quantity.
Best wishes,
Z
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


Dear Achim,
> Original Message
> From: Achim Zeileis [mailto: [hidden email]]
> Sent: Monday, December 25, 2006 8:38 AM
> To: John Fox
> Cc: 'Dirk Eddelbuettel'; 'ivo welch'; [hidden email]
> Subject: Re: [R] extend summary.lm for hccm?
>
> On Sun, 24 Dec 2006, John Fox wrote:
>
> > Dear Dirk and Ivo,
> >
> > It's true that the sandwich package has more extensive
> facilities for
> > sandwich variance estimation than the hccm function in the car
> > package, but I think that the thrust of Ivo's question is to get a
> > model summary that automatically uses the adjusted standard errors
> > rather than having to compute them and use them "manually."
>
> I've written the function coeftest() in package "lmtest" for
> this purpose.
> With this you can
> coeftest(obj, vcov = sandwich)
Since coeftest() already exists in a package, I think that this is a
preferable solution.
> or you can put this into a specialized summary() function as
> John suggested (where you probably would want the F statistic
> to use the other vcov as well).
Good point. Here's a modified version that also fixes up the Ftest:
summaryHCCM.lm < function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"),
...){
if (!require(car)) stop("Required car package is missing.")
type < match.arg(type)
V < hccm(model, type=type)
sumry < summary(model)
table < coef(sumry)
table[,2] < sqrt(diag(V))
table[,3] < table[,1]/table[,2]
table[,4] < 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE)
sumry$coefficients < table
p < nrow(table)
hyp < if (has.intercept(model)) cbind(0, diag(p  1)) else diag(p)
sumry$fstatistic[1] < linear.hypothesis(model, hyp,
white.adjust=type)[2,"F"]
print(sumry)
cat("Note: Heteroscedasticityconsistant standard errors using
adjustment",
type, "\n")
}
Regards,
John
> See also function waldtest()
> in "lmtest",
> linear.hypothesis() in "car" and
> vignette("sandwich", package = "sandwich")
>
> Although this works, it is still a nuisance to use a
> different function and not summary() directly. In addition,
> it would also be nice to plug in different vcovs into
> confint() or predict() methods. Of course, one could write
> different generics or overload the methods in certain packages.
> However, I guess that many practitioners want to use
> different vcov estimators  especially in the social and
> political scieneces, and econometrics etc.  so that this
> might justify that the original "lm" (and
> "glm") methods are extended to allow for plugging in
> different vcov matrices. Maybe we could try to convince
> Rcore to include somthing like this?
>
> Best wishes,
> Z
>
>
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.


On Mon, 25 Dec 2006, Achim Zeileis wrote:
> On Sun, 24 Dec 2006, John Fox wrote:
>
>> If I remember Freedman's recent paper correctly, he argues that sandwich
>> variance estimator, though problematic in general, is not problematic in the
>> case that White described  an otherwise correctly specified linear model
>> with heteroscedasticity estimated by leastsquares.
>
> More generally, sandwichtype estimators are valid (i.e., estimate
> the right quantity, although not necessarily precisely, as Frank pointed
> out) in situations where the estimating functions are correctly specified
> but remaining aspets of the likelihood (not captured in the estimating
> functions) are potentially not.
Not exactly. The asymptotic properites are good, but in samples of
moderate size the properties (including both biasedness and variance) can
be surprisingly bad. And if you are trying to calculate a pvalue, getting
a toosmall variance estimate gives you spuriously small pvalues. FWIW,
I've seen a case in which the nominal size of a test based on the sandwich
estimator was several orders of magnitude smaller than a test with correct
nominal size.
There is a modest literature on this. Some refs:
Background Papers:
Drum M, McCullagh P. Comment. Statistical Science 1993; 8:300301.
Freedman DA. On the SoCalled "Huber Sandwich Estimator" and "Robust"
Standard Errors. The American Statistician, Volume 60, Number 4,
November 2006, pp. 299302(4)

Some Proposed Corrections:
Fay MP and Graubard BI. SmallSample Adjustments for
WaltType Tests Using Sandwich Estimators. Biometrics 2001; 57:
11981206.
Guo X, Pan W, Connett JE, Hannan PJ and French SA. Smallsample
performance of the robust score test and its modifications in
generalized estimating equations Statistics in Medicine 2005;
24:34793495
Mancl LA and DeRouen TA, A Covariance Estimator for GEE with Improved
SmallSample Properties. Biometrics 2001; 57:126134.
Morel JG, Bokossa MC, and Neerchal NK. Small Sample Correction for
the Variance of GEE Estimators Biometrical Journal 2003; 45(4):
395409.
Pan W and Wall MM. Smallsample adjustments in using the
sandwich variance estimator in generalized estimating
equations. Statistics in Medicine 2002; 21:14291441.
HTH,
Chuck
>
> In linear models, it is easy to see what this means: the mean function has
> to be correctly specified (i.e., the errors have zero mean) but the
> correlation structure of the errors (i.e., their (co)variances) might
> differ from the usual assumptions. In GLMs, in particular logistic
> regression, it is much more difficult to see against which types of
> misspecification sandwichbased inference is robust.
>
> Freedman's paper stresses the point that many model misspecifications also
> imply misspecified estimating functions (and in his example this is rather
> obvious) so that consequently the sandwichtype estimators estimate the
> wrong quantity.
>
> Best wishes,
> Z
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rhelp> PLEASE do read the posting guide http://www.Rproject.org/postingguide.html> and provide commented, minimal, selfcontained, reproducible code.
>
Charles C. Berry (858) 5342098
Dept of Family/Preventive Medicine
E mailto: [hidden email] UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 920930717
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rhelpPLEASE do read the posting guide http://www.Rproject.org/postingguide.htmland provide commented, minimal, selfcontained, reproducible code.

