dear R experts:
I wonder whether it is possible to extend the summary method for the lm function, so that it uses an option "hccm" (well, model "hc0"). In my line of work, it is pretty much required in reporting of almost all linear regressions these days, which means that it would be very nice not to have to manually library car, then sqrt the diagonal, and recompute T-stats; instead, I would love to get everything in the same format as the current output---except errors heteroskedasticity adjusted. easy or hard? regards, /iaw ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
On 23 December 2006 at 20:46, ivo welch wrote: | I wonder whether it is possible to extend the summary method for the | lm function, so that it uses an option "hccm" (well, model "hc0"). In | my line of work, it is pretty much required in reporting of almost all | linear regressions these days, which means that it would be very nice | not to have to manually library car, then sqrt the diagonal, and | recompute T-stats; instead, I would love to get everything in the same | format as the current output---except errors heteroskedasticity | adjusted. | | easy or hard? Did you consider the 'sandwich' package? A simple > install.packages("sandwich") > library(sandwich) > ?vcovHC > ?vcovHAC should get you there. Dirk -- Hell, there are no rules here - we're trying to accomplish something. -- Thomas A. Edison ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
Dear Dirk and Ivo,
It's true that the sandwich package has more extensive facilities for sandwich variance estimation than the hccm function in the car package, but I think that the thrust of Ivo's question is to get a model summary that automatically uses the adjusted standard errors rather than having to compute them and use them "manually." Here's one approach, which could be modified slightly if Ivo wants "hc0" as the default; it could also be adapted to use the sandwich package. I hope this helps, John ----------- snip -------------- summaryHCCM <- function(model, ...) UseMethod("summaryHCCM") summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"), ...){ if (!require(car)) stop("Required car package is missing.") type <- match.arg(type) V <- hccm(model, type=type) sumry <- summary(model) table <- coef(sumry) table[,2] <- sqrt(diag(V)) table[,3] <- table[,1]/table[,2] table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE) sumry$coefficients <- table print(sumry) cat("Note: Heteroscedasticity-consistant standard errors using adjustment", type, "\n") } -------------------------------- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -------------------------------- > -----Original Message----- > From: [hidden email] > [mailto:[hidden email]] On Behalf Of Dirk > Eddelbuettel > Sent: Saturday, December 23, 2006 11:16 PM > To: ivo welch > Cc: [hidden email] > Subject: Re: [R] extend summary.lm for hccm? > > > On 23 December 2006 at 20:46, ivo welch wrote: > | I wonder whether it is possible to extend the summary > method for the > | lm function, so that it uses an option "hccm" (well, model > "hc0"). In > | my line of work, it is pretty much required in reporting of > almost all > | linear regressions these days, which means that it would be > very nice > | not to have to manually library car, then sqrt the diagonal, and > | recompute T-stats; instead, I would love to get everything > in the same > | format as the current output---except errors heteroskedasticity > | adjusted. > | > | easy or hard? > > Did you consider the 'sandwich' package? A simple > > > install.packages("sandwich") > > library(sandwich) > > ?vcovHC > > ?vcovHAC > > should get you there. > > Dirk > > -- > Hell, there are no rules here - we're trying to accomplish something. > -- Thomas A. Edison > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
John Fox wrote:
> Dear Dirk and Ivo, > > It's true that the sandwich package has more extensive facilities for > sandwich variance estimation than the hccm function in the car package, but > I think that the thrust of Ivo's question is to get a model summary that > automatically uses the adjusted standard errors rather than having to > compute them and use them "manually." Here's one approach, which could be > modified slightly if Ivo wants "hc0" as the default; it could also be > adapted to use the sandwich package. > > I hope this helps, > John Another approach: library(Design) # also requires library(Hmisc) f <- ols(y ~ x1 + x2, x=TRUE, y=TRUE) f <- robcov(f) # sandwich; also allows clustering. Also see bootcov anova(f) # all later steps use sandwich variance matrix summmary(f) contrast(f, list(x1=.5), list(x1=.2)) BUT note that sandwich covariance matrix estimators can have poor mean squared error (a paper by Bill Gould in Stata Technical Bulletin years ago related to logistic regression showed an example with a 100-fold increase in the variance of a variance estimate) and can give you the right estimate of the wrong quantity (reference below). Frank Harrell @Article{free06so, author = {Freedman, David A.}, title = {On the so-called ``{Huber} sandwich estimator'' and ``robust standard errors''}, journal = The American Statistician, year = 2006, volume = 60, pages = {299-302}, annote = {nice summary of derivation of sandwich estimators;questions why we should be interested in getting the right variance of the wrong parameters when the model doesn't fit} } > > ----------- snip -------------- > > summaryHCCM <- function(model, ...) UseMethod("summaryHCCM") > > summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"), > > ...){ > if (!require(car)) stop("Required car package is missing.") > type <- match.arg(type) > V <- hccm(model, type=type) > sumry <- summary(model) > table <- coef(sumry) > table[,2] <- sqrt(diag(V)) > table[,3] <- table[,1]/table[,2] > table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE) > sumry$coefficients <- table > print(sumry) > cat("Note: Heteroscedasticity-consistant standard errors using > adjustment", > type, "\n") > } > > -------------------------------- > John Fox > Department of Sociology > McMaster University > Hamilton, Ontario > Canada L8S 4M4 > 905-525-9140x23604 > http://socserv.mcmaster.ca/jfox > -------------------------------- > >> -----Original Message----- >> From: [hidden email] >> [mailto:[hidden email]] On Behalf Of Dirk >> Eddelbuettel >> Sent: Saturday, December 23, 2006 11:16 PM >> To: ivo welch >> Cc: [hidden email] >> Subject: Re: [R] extend summary.lm for hccm? >> >> >> On 23 December 2006 at 20:46, ivo welch wrote: >> | I wonder whether it is possible to extend the summary >> method for the >> | lm function, so that it uses an option "hccm" (well, model >> "hc0"). In >> | my line of work, it is pretty much required in reporting of >> almost all >> | linear regressions these days, which means that it would be >> very nice >> | not to have to manually library car, then sqrt the diagonal, and >> | recompute T-stats; instead, I would love to get everything >> in the same >> | format as the current output---except errors heteroskedasticity >> | adjusted. >> | >> | easy or hard? >> >> Did you consider the 'sandwich' package? A simple >> >> > install.packages("sandwich") >> > library(sandwich) >> > ?vcovHC >> > ?vcovHAC >> >> should get you there. >> >> Dirk >> >> -- -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University |
Dear Frank,
If I remember Freedman's recent paper correctly, he argues that sandwich variance estimator, though problematic in general, is not problematic in the case that White described -- an otherwise correctly specified linear model with heteroscedasticity estimated by least-squares. Regards, John -------------------------------- John Fox Department of Sociology McMaster University Hamilton, Ontario Canada L8S 4M4 905-525-9140x23604 http://socserv.mcmaster.ca/jfox -------------------------------- > -----Original Message----- > From: [hidden email] > [mailto:[hidden email]] On Behalf Of Frank > E Harrell Jr > Sent: Sunday, December 24, 2006 10:29 AM > To: John Fox > Cc: [hidden email]; 'ivo welch'; 'Dirk Eddelbuettel' > Subject: Re: [R] extend summary.lm for hccm? > > John Fox wrote: > > Dear Dirk and Ivo, > > > > It's true that the sandwich package has more extensive > facilities for > > sandwich variance estimation than the hccm function in the car > > package, but I think that the thrust of Ivo's question is to get a > > model summary that automatically uses the adjusted standard errors > > rather than having to compute them and use them "manually." Here's > > one approach, which could be modified slightly if Ivo wants > "hc0" as > > the default; it could also be adapted to use the sandwich package. > > > > I hope this helps, > > John > > Another approach: > > library(Design) # also requires library(Hmisc) f <- ols(y ~ > x1 + x2, x=TRUE, y=TRUE) > f <- robcov(f) # sandwich; also allows clustering. Also see bootcov > anova(f) # all later steps use sandwich variance matrix > summmary(f) > contrast(f, list(x1=.5), list(x1=.2)) > > BUT note that sandwich covariance matrix estimators can have > poor mean squared error (a paper by Bill Gould in Stata > Technical Bulletin years ago related to logistic regression > showed an example with a 100-fold increase in the variance of > a variance estimate) and can give you the right estimate of > the wrong quantity (reference below). > > Frank Harrell > > @Article{free06so, > author = {Freedman, David A.}, > title = {On the so-called ``{Huber} sandwich > estimator'' and > ``robust standard errors''}, > journal = The American Statistician, > year = 2006, > volume = 60, > pages = {299-302}, > annote = {nice summary of derivation of sandwich > estimators;questions why we should be interested in getting > the right variance of the wrong parameters when the model > doesn't fit} } > > > > > > ----------- snip -------------- > > > > summaryHCCM <- function(model, ...) UseMethod("summaryHCCM") > > > > summaryHCCM.lm <- function(model, type=c("hc3", "hc0", > "hc1", "hc2", > > "hc4"), > > > > ...){ > > if (!require(car)) stop("Required car package is missing.") > > type <- match.arg(type) > > V <- hccm(model, type=type) > > sumry <- summary(model) > > table <- coef(sumry) > > table[,2] <- sqrt(diag(V)) > > table[,3] <- table[,1]/table[,2] > > table[,4] <- 2*pt(abs(table[,3]), df.residual(model), > lower.tail=FALSE) > > sumry$coefficients <- table > > print(sumry) > > cat("Note: Heteroscedasticity-consistant standard errors using > > adjustment", > > type, "\n") > > } > > > > -------------------------------- > > John Fox > > Department of Sociology > > McMaster University > > Hamilton, Ontario > > Canada L8S 4M4 > > 905-525-9140x23604 > > http://socserv.mcmaster.ca/jfox > > -------------------------------- > > > >> -----Original Message----- > >> From: [hidden email] > >> [mailto:[hidden email]] On Behalf Of Dirk > >> Eddelbuettel > >> Sent: Saturday, December 23, 2006 11:16 PM > >> To: ivo welch > >> Cc: [hidden email] > >> Subject: Re: [R] extend summary.lm for hccm? > >> > >> > >> On 23 December 2006 at 20:46, ivo welch wrote: > >> | I wonder whether it is possible to extend the summary > >> method for the > >> | lm function, so that it uses an option "hccm" (well, model > >> "hc0"). In > >> | my line of work, it is pretty much required in reporting of > >> almost all > >> | linear regressions these days, which means that it would be > >> very nice > >> | not to have to manually library car, then sqrt the diagonal, and > >> | recompute T-stats; instead, I would love to get everything > >> in the same > >> | format as the current output---except errors heteroskedasticity > >> | adjusted. > >> | > >> | easy or hard? > >> > >> Did you consider the 'sandwich' package? A simple > >> > >> > install.packages("sandwich") > >> > library(sandwich) > >> > ?vcovHC > >> > ?vcovHAC > >> > >> should get you there. > >> > >> Dirk > >> > >> -- > > > > -- > Frank E Harrell Jr Professor and Chair School of Medicine > Department of Biostatistics > Vanderbilt University > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
John Fox wrote:
> Dear Frank, > > If I remember Freedman's recent paper correctly, he argues that sandwich > variance estimator, though problematic in general, is not problematic in the > case that White described -- an otherwise correctly specified linear model > with heteroscedasticity estimated by least-squares. > > Regards, > John That's right John although the precision of the variance estimators is still worth studying in that case. Best regards, Frank > > -------------------------------- > John Fox > Department of Sociology > McMaster University > Hamilton, Ontario > Canada L8S 4M4 > 905-525-9140x23604 > http://socserv.mcmaster.ca/jfox > -------------------------------- > >> -----Original Message----- >> From: [hidden email] >> [mailto:[hidden email]] On Behalf Of Frank >> E Harrell Jr >> Sent: Sunday, December 24, 2006 10:29 AM >> To: John Fox >> Cc: [hidden email]; 'ivo welch'; 'Dirk Eddelbuettel' >> Subject: Re: [R] extend summary.lm for hccm? >> >> John Fox wrote: >>> Dear Dirk and Ivo, >>> >>> It's true that the sandwich package has more extensive >> facilities for >>> sandwich variance estimation than the hccm function in the car >>> package, but I think that the thrust of Ivo's question is to get a >>> model summary that automatically uses the adjusted standard errors >>> rather than having to compute them and use them "manually." Here's >>> one approach, which could be modified slightly if Ivo wants >> "hc0" as >>> the default; it could also be adapted to use the sandwich package. >>> >>> I hope this helps, >>> John >> Another approach: >> >> library(Design) # also requires library(Hmisc) f <- ols(y ~ >> x1 + x2, x=TRUE, y=TRUE) >> f <- robcov(f) # sandwich; also allows clustering. Also see bootcov >> anova(f) # all later steps use sandwich variance matrix >> summmary(f) >> contrast(f, list(x1=.5), list(x1=.2)) >> >> BUT note that sandwich covariance matrix estimators can have >> poor mean squared error (a paper by Bill Gould in Stata >> Technical Bulletin years ago related to logistic regression >> showed an example with a 100-fold increase in the variance of >> a variance estimate) and can give you the right estimate of >> the wrong quantity (reference below). >> >> Frank Harrell >> >> @Article{free06so, >> author = {Freedman, David A.}, >> title = {On the so-called ``{Huber} sandwich >> estimator'' and >> ``robust standard errors''}, >> journal = The American Statistician, >> year = 2006, >> volume = 60, >> pages = {299-302}, >> annote = {nice summary of derivation of sandwich >> estimators;questions why we should be interested in getting >> the right variance of the wrong parameters when the model >> doesn't fit} } >> >> >>> ----------- snip -------------- >>> >>> summaryHCCM <- function(model, ...) UseMethod("summaryHCCM") >>> >>> summaryHCCM.lm <- function(model, type=c("hc3", "hc0", >> "hc1", "hc2", >>> "hc4"), >>> >>> ...){ >>> if (!require(car)) stop("Required car package is missing.") >>> type <- match.arg(type) >>> V <- hccm(model, type=type) >>> sumry <- summary(model) >>> table <- coef(sumry) >>> table[,2] <- sqrt(diag(V)) >>> table[,3] <- table[,1]/table[,2] >>> table[,4] <- 2*pt(abs(table[,3]), df.residual(model), >> lower.tail=FALSE) >>> sumry$coefficients <- table >>> print(sumry) >>> cat("Note: Heteroscedasticity-consistant standard errors using >>> adjustment", >>> type, "\n") >>> } >>> >>> -------------------------------- >>> John Fox >>> Department of Sociology >>> McMaster University >>> Hamilton, Ontario >>> Canada L8S 4M4 >>> 905-525-9140x23604 >>> http://socserv.mcmaster.ca/jfox >>> -------------------------------- >>> >>>> -----Original Message----- >>>> From: [hidden email] >>>> [mailto:[hidden email]] On Behalf Of Dirk >>>> Eddelbuettel >>>> Sent: Saturday, December 23, 2006 11:16 PM >>>> To: ivo welch >>>> Cc: [hidden email] >>>> Subject: Re: [R] extend summary.lm for hccm? >>>> >>>> >>>> On 23 December 2006 at 20:46, ivo welch wrote: >>>> | I wonder whether it is possible to extend the summary >>>> method for the >>>> | lm function, so that it uses an option "hccm" (well, model >>>> "hc0"). In >>>> | my line of work, it is pretty much required in reporting of >>>> almost all >>>> | linear regressions these days, which means that it would be >>>> very nice >>>> | not to have to manually library car, then sqrt the diagonal, and >>>> | recompute T-stats; instead, I would love to get everything >>>> in the same >>>> | format as the current output---except errors heteroskedasticity >>>> | adjusted. >>>> | >>>> | easy or hard? >>>> >>>> Did you consider the 'sandwich' package? A simple >>>> >>>> > install.packages("sandwich") >>>> > library(sandwich) >>>> > ?vcovHC >>>> > ?vcovHAC >>>> >>>> should get you there. >>>> >>>> Dirk ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Frank Harrell
Department of Biostatistics, Vanderbilt University |
thank you, everybody. It's hard to find much fault with R, but one
feature that would be nice to have would be hooks to output functions that make it easy to augment to the output---e.g., to add a new statistic to summary (e.g., mean, sd, different stderrs). not a big flaw, because one can write replacement functions... /ivo ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
In reply to this post by Fox, John
On Sun, 24 Dec 2006, John Fox wrote:
> Dear Dirk and Ivo, > > It's true that the sandwich package has more extensive facilities for > sandwich variance estimation than the hccm function in the car package, but > I think that the thrust of Ivo's question is to get a model summary that > automatically uses the adjusted standard errors rather than having to > compute them and use them "manually." I've written the function coeftest() in package "lmtest" for this purpose. With this you can coeftest(obj, vcov = sandwich) or you can put this into a specialized summary() function as John suggested (where you probably would want the F statistic to use the other vcov as well). See also function waldtest() in "lmtest", linear.hypothesis() in "car" and vignette("sandwich", package = "sandwich") Although this works, it is still a nuisance to use a different function and not summary() directly. In addition, it would also be nice to plug in different vcovs into confint() or predict() methods. Of course, one could write different generics or overload the methods in certain packages. However, I guess that many practitioners want to use different vcov estimators - especially in the social and political scieneces, and econometrics etc. - so that this might justify that the original "lm" (and "glm") methods are extended to allow for plugging in different vcov matrices. Maybe we could try to convince R-core to include somthing like this? Best wishes, Z ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
In reply to this post by Fox, John
On Sun, 24 Dec 2006, John Fox wrote:
> If I remember Freedman's recent paper correctly, he argues that sandwich > variance estimator, though problematic in general, is not problematic in the > case that White described -- an otherwise correctly specified linear model > with heteroscedasticity estimated by least-squares. More generally, sandwich-type estimators are valid (i.e., estimate the right quantity, although not necessarily precisely, as Frank pointed out) in situations where the estimating functions are correctly specified but remaining aspets of the likelihood (not captured in the estimating functions) are potentially not. In linear models, it is easy to see what this means: the mean function has to be correctly specified (i.e., the errors have zero mean) but the correlation structure of the errors (i.e., their (co-)variances) might differ from the usual assumptions. In GLMs, in particular logistic regression, it is much more difficult to see against which types of misspecification sandwich-based inference is robust. Freedman's paper stresses the point that many model misspecifications also imply misspecified estimating functions (and in his example this is rather obvious) so that consequently the sandwich-type estimators estimate the wrong quantity. Best wishes, Z ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
In reply to this post by Achim Zeileis
Dear Achim,
> -----Original Message----- > From: Achim Zeileis [mailto:[hidden email]] > Sent: Monday, December 25, 2006 8:38 AM > To: John Fox > Cc: 'Dirk Eddelbuettel'; 'ivo welch'; [hidden email] > Subject: Re: [R] extend summary.lm for hccm? > > On Sun, 24 Dec 2006, John Fox wrote: > > > Dear Dirk and Ivo, > > > > It's true that the sandwich package has more extensive > facilities for > > sandwich variance estimation than the hccm function in the car > > package, but I think that the thrust of Ivo's question is to get a > > model summary that automatically uses the adjusted standard errors > > rather than having to compute them and use them "manually." > > I've written the function coeftest() in package "lmtest" for > this purpose. > With this you can > coeftest(obj, vcov = sandwich) Since coeftest() already exists in a package, I think that this is a preferable solution. > or you can put this into a specialized summary() function as > John suggested (where you probably would want the F statistic > to use the other vcov as well). Good point. Here's a modified version that also fixes up the F-test: summaryHCCM.lm <- function(model, type=c("hc3", "hc0", "hc1", "hc2", "hc4"), ...){ if (!require(car)) stop("Required car package is missing.") type <- match.arg(type) V <- hccm(model, type=type) sumry <- summary(model) table <- coef(sumry) table[,2] <- sqrt(diag(V)) table[,3] <- table[,1]/table[,2] table[,4] <- 2*pt(abs(table[,3]), df.residual(model), lower.tail=FALSE) sumry$coefficients <- table p <- nrow(table) hyp <- if (has.intercept(model)) cbind(0, diag(p - 1)) else diag(p) sumry$fstatistic[1] <- linear.hypothesis(model, hyp, white.adjust=type)[2,"F"] print(sumry) cat("Note: Heteroscedasticity-consistant standard errors using adjustment", type, "\n") } Regards, John > See also function waldtest() > in "lmtest", > linear.hypothesis() in "car" and > vignette("sandwich", package = "sandwich") > > Although this works, it is still a nuisance to use a > different function and not summary() directly. In addition, > it would also be nice to plug in different vcovs into > confint() or predict() methods. Of course, one could write > different generics or overload the methods in certain packages. > However, I guess that many practitioners want to use > different vcov estimators - especially in the social and > political scieneces, and econometrics etc. - so that this > might justify that the original "lm" (and > "glm") methods are extended to allow for plugging in > different vcov matrices. Maybe we could try to convince > R-core to include somthing like this? > > Best wishes, > Z > > > ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
In reply to this post by Achim Zeileis
On Mon, 25 Dec 2006, Achim Zeileis wrote:
> On Sun, 24 Dec 2006, John Fox wrote: > >> If I remember Freedman's recent paper correctly, he argues that sandwich >> variance estimator, though problematic in general, is not problematic in the >> case that White described -- an otherwise correctly specified linear model >> with heteroscedasticity estimated by least-squares. > > More generally, sandwich-type estimators are valid (i.e., estimate > the right quantity, although not necessarily precisely, as Frank pointed > out) in situations where the estimating functions are correctly specified > but remaining aspets of the likelihood (not captured in the estimating > functions) are potentially not. Not exactly. The asymptotic properites are good, but in samples of moderate size the properties (including both biasedness and variance) can be surprisingly bad. And if you are trying to calculate a p-value, getting a too-small variance estimate gives you spuriously small p-values. FWIW, I've seen a case in which the nominal size of a test based on the sandwich estimator was several orders of magnitude smaller than a test with correct nominal size. There is a modest literature on this. Some refs: Background Papers: Drum M, McCullagh P. Comment. Statistical Science 1993; 8:300-301. Freedman DA. On the So-Called "Huber Sandwich Estimator" and "Robust" Standard Errors. The American Statistician, Volume 60, Number 4, November 2006, pp. 299-302(4) ----- Some Proposed Corrections: Fay MP and Graubard BI. Small-Sample Adjustments for Walt-Type Tests Using Sandwich Estimators. Biometrics 2001; 57: 1198-1206. Guo X, Pan W, Connett JE, Hannan PJ and French SA. Small-sample performance of the robust score test and its modifications in generalized estimating equations Statistics in Medicine 2005; 24:3479-3495 Mancl LA and DeRouen TA, A Covariance Estimator for GEE with Improved Small-Sample Properties. Biometrics 2001; 57:126-134. Morel JG, Bokossa MC, and Neerchal NK. Small Sample Correction for the Variance of GEE Estimators Biometrical Journal 2003; 45(4): 395-409. Pan W and Wall MM. Small-sample adjustments in using the sandwich variance estimator in generalized estimating equations. Statistics in Medicine 2002; 21:1429-1441. HTH, Chuck > > In linear models, it is easy to see what this means: the mean function has > to be correctly specified (i.e., the errors have zero mean) but the > correlation structure of the errors (i.e., their (co-)variances) might > differ from the usual assumptions. In GLMs, in particular logistic > regression, it is much more difficult to see against which types of > misspecification sandwich-based inference is robust. > > Freedman's paper stresses the point that many model misspecifications also > imply misspecified estimating functions (and in his example this is rather > obvious) so that consequently the sandwich-type estimators estimate the > wrong quantity. > > Best wishes, > Z > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Charles C. Berry (858) 534-2098 Dept of Family/Preventive Medicine E mailto:[hidden email] UC San Diego http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717 ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
Free forum by Nabble | Edit this page |