p-values

p-values

Celso Barros
Dear All,

When I run rlm to obtain robust standard errors, my output does not include
p-values. Is there any reason p-values should not be used in this case? Is
there an argument I could use in rlm so that the output does
include p-values?

Thanks in advance,

Celso

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: summary.rlm (was p-values)

Prof Brian Ripley
On Wed, 5 Jul 2006, Celso Barros wrote:

> Dear All,
>
> When I run rlm to obtain robust standard errors, my output does not include
> p-values. Is there any reason p-values should not be used in this case? Is
> there an argument I could use in rlm so that the output does
> include p-values?

First you would have to derive a reliable theory for the tests you want
p-values for.  The t ratios are approximately normally distributed, but to
quote realistic p-values you would need much more accurate distribution
theory.

For summary.lm we can use Student's t distribution and say the results are
exact only under normality.  There is no analogue for summary.rlm.  Note
that S does not quote p-values for summary.glm for similar reasons: R
does but I do not consider it to be a good idea.
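
As a minimal sketch of what this means in practice (not something MASS
does for you): summary() of an rlm fit returns a coefficient table with
values, standard errors and t ratios only. If one is willing to lean on
the normal approximation mentioned above, rough two-sided p-values can
be computed by hand. The stackloss data and the name p_approx are
assumptions made for illustration, and the resulting numbers are
large-sample approximations, not exact p-values.

```r
library(MASS)  # rlm() lives in the recommended package MASS

# stackloss is a built-in dataset, used here only for illustration
fit <- rlm(stack.loss ~ ., data = stackloss)

# Coefficient table: columns "Value", "Std. Error", "t value" (no p-values)
ctab <- summary(fit)$coefficients

# Two-sided p-values from a standard normal reference distribution.
# These are rough asymptotic approximations and should be read as such.
p_approx <- 2 * pnorm(abs(ctab[, "t value"]), lower.tail = FALSE)
print(cbind(ctab, "approx. p" = p_approx))
```

Treat the extra column as a screening aid at best; it does not supply
the more accurate distribution theory asked for above.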

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: p-values for robust regression

Martin Maechler
In reply to this post by Celso Barros
  [Oops! Written 6 hours ago, the following was accidentally not sent.]

>>>>> "Celso" == Celso Barros <[hidden email]>
>>>>>     on Wed, 5 Jul 2006 04:09:17 -0300 writes:

    Celso> When I run rlm to obtain robust standard errors, my output does not include
    Celso> p-values. Is there any reason p-values should not be used in this case?

yes (see also below).

    Celso> Is there an argument I could use in rlm so that the output does
    Celso> include p-values?
no.

What are the reasons?

How to properly do hypothesis testing in the context of robust
regression has partly been an open research problem.  Although
it has been solved in Elvezio Ronchetti's PhD thesis (1982)
by tau-tests (see chapter 7 of Hampel, Ronchetti, Rousseeuw,
Stahel (1986)), these are not (directly) related to standard
errors and t-tests with some degrees of freedom.
Hence they are not so intuitively explainable, and not entirely
trivial to implement.  This is probably one of the reasons why they
(tau-tests) haven't been programmed for MASS (the book and the R package).

Recent research, namely,
     Croux, C., Dhaene, G. and Hoorelbeke, D. (2003) _Robust standard
     errors for robust estimators_, Discussion Papers Series 03.16,
     K.U. Leuven, CES.
is used by Matias Salibian-Barrera's roblm() function,
now available as  lmrob() in package 'robustbase'.
There,  mod <- lmrob(........);  summary( mod )
does provide you with P-values.
But we still recommend *not* to ``believe in the P-values''
blindly, but rather to base your data analysis on serious analysis
of residuals and other model checking.
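
A minimal sketch of that workflow, assuming package 'robustbase' is
installed from CRAN. The stackloss data, the 0.5 cutoff on the
robustness weights and the variable names are illustrative assumptions
only:

```r
library(robustbase)  # provides lmrob(); assumed to be installed

set.seed(1)  # the initial estimate in lmrob() uses random subsampling
mod <- lmrob(stack.loss ~ ., data = stackloss)

# Coefficient table of the summary, including a "Pr(>|t|)" column
ctab <- summary(mod)$coefficients
print(ctab)

# Model checking: small robustness weights mark observations that the
# fit has downweighted as potential outliers
rw <- mod$rweights
print(which(rw < 0.5))  # 0.5 is an arbitrary cutoff for illustration

# Residuals against fitted values
plot(fitted(mod), residuals(mod),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

Looking at the downweighted observations and the residual plot first,
and at the P-values only afterwards, is in the spirit of the advice
above.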

Re: [RsR] p-values for robust regression

Matias Salibian-Barrera

Dear Celso,

I would only add a few comments to Martin's explanation:

- tau-tests were primarily meant for general nested hypotheses, whereas
for a hypothesis of the form "beta_j = 0" for a single index "j" one can
also use (as it's done for glm estimators, say) *approximate* p-values
based on a normal approximation to the distribution of the ratio
"estimator / standard error" -- these are the p-values that
"summary.lmrob" currently reports, based on the *robust standard errors*
of Croux et al. 2003 (full reference in Martin's e-mail) that
remain valid even when the data may contain asymmetric outliers;

- the asymptotic distribution of tau-tests is known for symmetrically
distributed errors and, furthermore, it involves a weighted sum of
independent chi-squared distributions, with weights depending on the
eigenvalues of the (asymptotic) covariance matrix of the explanatory
variables. Not surprisingly, their p-values are rather difficult to
calculate in practice (although approximations do exist: see Alfio
Marazzi's ROBETH and S-PLUS's robust libraries);

- for nested linear hypotheses, the tests in Markatou and Hettmansperger
(1990, "Robust bounded-influence tests in linear models", JASA, 85,
187-190) provide an alternative to the tau-tests with the "usual"
asymptotic chi-squared distribution, although this asymptotic
approximation is also known to hold for symmetrically distributed
errors, and moreover, seems to be rather sensitive to the presence of
outliers (see my paper in JSPI, 2005, 128, 241-257), while the Robust
Bootstrap performs quite well in estimating the p-values for these
robust tests.
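
To illustrate why a null distribution of the "weighted sum of
chi-squareds" kind is less convenient than a plain chi-squared table
lookup, here is a small Monte Carlo sketch of the tail probability of
such a sum. The weights w and the observed statistic t_obs are made-up
numbers for illustration; in a real tau-test the weights would have to
be estimated from the data, and this is neither the approximation used
in ROBETH or S-PLUS nor the Robust Bootstrap.

```r
set.seed(1)

w     <- c(1.8, 1.1, 0.6)  # hypothetical weights, NOT estimated from data
t_obs <- 9.5               # hypothetical observed value of the statistic
n_sim <- 1e5               # number of Monte Carlo draws

# Each row holds one draw of length(w) independent chi-squared(1)
# variables; multiplying by w gives the weighted sum for that draw
chi  <- matrix(rchisq(n_sim * length(w), df = 1), ncol = length(w))
sims <- as.vector(chi %*% w)

# Monte Carlo estimate of P(weighted sum >= t_obs)
p_mc <- mean(sims >= t_obs)
print(p_mc)
```

With equal weights the sum reduces to a scaled chi-squared variable and
pchisq() would do; it is the unequal, data-dependent weights that make
the calculation awkward.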


Summarizing:

- the standard errors and p-values for individual hypotheses of the form
"beta_j=0" reported by summary.lmrob (in robustbase) are (robust)
asymptotic approximations, which should be interpreted and used accordingly;
- if you're interested in nested linear hypotheses, there are some
proposals in the literature to obtain robust p-values for robust tests
although they have not been implemented in robustbase yet (hopefully
they will be in the near future).

Best,

Matias

--
______________________________________________________________
Matias Salibian-Barrera - Department of Statistics
University of British Columbia - [hidden email]
Phone: (604) 822-3410 - Fax: (604) 822-6960

