default for 'signif.stars'

default for 'signif.stars'

Lenth, Russell V
Dear R-Devel,

As I am sure many of you know, a special issue of The American Statistician just came out, and its theme is the [mis]use of P values and the many common ways in which they are abused. The lead editorial in that issue mentions the 2014 ASA guidelines on P values, and goes one step further, by now recommending that the words "statistically significant" and related simplistic interpretations no longer be used. There is much discussion of the problems with drawing "bright lines" concerning P values.

This is the position of a US society, but my sense is that the statistical community worldwide is pretty much on the same page.

Meanwhile, functions such as 'print.summary.lm' and 'print.anova' have an argument 'signif.stars' that really does involve drawing bright lines when it is set to TRUE. And the default setting for the "show.signif.stars" option is TRUE. Isn't it time to at least make "show.signif.stars" default to FALSE? And, indeed, to consider deprecating those 'signif.stars' options altogether?
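For readers unfamiliar with the two controls mentioned above, here is a minimal sketch of both. The model and the use of the built-in `mtcars` data are arbitrary assumptions for illustration only; they are not part of the original post.

```r
# Illustrative model on a built-in dataset (the model choice is arbitrary).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Per-call control: the 'signif.stars' argument of the print method.
print(summary(fit), signif.stars = FALSE)

# Session-wide control: the "show.signif.stars" option.
old <- options(show.signif.stars = FALSE)
out_no_stars <- capture.output(print(summary(fit)))

# Restore whatever setting was in effect before.
options(old)
```

With either control set to FALSE, the coefficient table is printed without the star column and without the "Signif. codes" legend.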

Thanks

Russ

Russell V. Lenth  -  Professor Emeritus
Department of Statistics and Actuarial Science  
The University of Iowa  -  Iowa City, IA 52242  USA  
Voice (319)335-0712 (Dept. office)  -  FAX (319)335-3017

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: default for 'signif.stars'

Abby Spurdle
I read through the editorial.
This is one of the most mega-ultra-super-biased articles I've ever read.

e.g.
The authors encourage Bayesian methods, and literally encourage subjective
approaches.
However, there's only one reference to robust methods and one reference to
nonparametric methods, both of which are labelled as purely exploratory
methods, which I regard as extremely offensive.
And there don't appear to be any references to semiparametric methods, or
machine learning.

Surprisingly, they encourage multiple testing but don't mention the
multiple comparison problem, which is something I can't understand at all.

So, maybe we should replace signif.stars with emoji...?


Re: default for 'signif.stars'

Martin Maechler
In reply to this post by Lenth, Russell V
>>>>> Lenth, Russell V
>>>>>     on Wed, 27 Mar 2019 00:06:08 +0000 writes:

    > Dear R-Devel, As I am sure many of you know, a special
    > issue of The American Statistician just came out, and its
    > theme is the [mis]use of P values and the many common ways
    > in which they are abused. The lead editorial in that issue
    > mentions the 2014 ASA guidelines on P values, and goes one
    > step further, by now recommending that the words
    > "statistically significant" and related simplistic
    > interpretations no longer be used. There is much
    > discussion of the problems with drawing "bright lines"
    > concerning P values.

    > This is the position of a US society, but my sense is that
    > the statistical community worldwide is pretty much on the
    > same page.

    > Meanwhile, functions such as 'print.summary.lm' and
    > 'print.anova' have an argument 'signif.stars' that really
    > does involve drawing bright lines when it is set to
    > TRUE. And the default setting for the "show.signif.stars"
    > option is TRUE. Isn't it time to at least make
    > "show.signif.stars" default to FALSE? And, indeed, to
    > consider deprecating those 'signif.stars' options
    > altogether?

Dear Russ,
Abs has already given good reasons why this article may well be
considered problematic.

However, I think you and (many but not all) others who've raised
this issue before you, slightly miss the following point.

If p-values are misleading they should not be shown (and hence
neither should the signif.stars).
That has been the approach adopted e.g., in the lme4 package
*AND* has been an approach originally used in S and I think
parts of R as well, in more places than now, notably, e.g., for
print( summary(<glm>) ).

Fact is that users will write wrappers and their own packages
just to get to p values, even in very doubtful cases...
But anyway that (p values or not) is a different discussion
which has some value.

You however focus on the "significance stars".  I've argued for
years why they are useful, as they are just a simple
visualization of p values, and save a lot of human time when
there are many (fixed) effects looked at simultaneously.
Why should users have to visually scan 20 or 50 numbers?  In
modern data analysis they should never have to, but should rather
look at a visualization of those numbers ... and that's what
significance stars are, no more, no less.
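As a rough sketch of the mapping under discussion, the snippet below turns p-values into symbols with `symnum()`, using the cutpoints shown in R's "Signif. codes" legend. The p-values are made up for illustration, and this is only meant to mirror what the coefficient printing does, not to reproduce its code.

```r
# Made-up p-values, for illustration only.
p <- c(0.0005, 0.02, 0.07, 0.3)

# Map each p-value to a symbol using the cutpoints from the
# "Signif. codes" legend: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
stars <- symnum(p, corr = FALSE, na = FALSE,
                cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                symbols = c("***", "**", "*", ".", " "))

print(stars)
```

Each interval between adjacent cutpoints collapses to a single symbol, which is exactly the discretization that the rest of the thread argues about.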

Martin


Re: default for 'signif.stars'

Therneau, Terry M., Ph.D. via R-devel
In reply to this post by Lenth, Russell V
The addition of significance stars was, in my opinion, one of the worst defaults ever added
to R.   I would be delighted to see it removed, or at least to see the default changed.  It is one
of the few overrides that I have argued to add to our site-wide defaults file.

My bias comes from 30+ years in a medical statistics career where fighting the disease of
"dichotomania" has been an eternal struggle.  Continuous covariates are split in two,
nuanced risk scores are thresholded, decisions become yes/no, ....    Adding stars to
output is, to me, simply a gateway drug to this pernicious addiction.   We shouldn't
encourage it.

Wrt Abe's rant about the Nature article:  I've read the article and found it to be well
reasoned, and I can't say the same about the rant.   The issue in biomedical science is
that the p-value has fallen victim to Goodhart's law: "When a measure becomes a target, it
ceases to be a good measure."  The article argues, and I would agree, that the .05 yes/no
decision rule is currently doing more harm than good in biomedical research.   What to do
instead of this is a tough question, but it is fairly clear that the current plan isn't
working.   I have seen many cases of two papers which both found a risk increase of 1.9
for something where one paper claimed "smoking gun" and the other "completely
exonerated".   Do YOU want to take a drug with 2x risk and a p = 0.2 'proof' that it is
okay?   Of course, if there is too much to do and too little time, people will find a way
to create a shortcut yes/no rule no matter what we preach.   (We statisticians will do it
too.)

Terry T.





Re: default for 'signif.stars'

Fox, John
In reply to this post by Lenth, Russell V
Dear all,

I agree with both Russ and Terry that the significance stars option should default to FALSE. Here's what Sandy Weisberg and I say about significance stars in the current edition of the R Companion to Applied Regression:

        'If you find the “statistical-significance” asterisks that R prints to the right of the p-values annoying, as we do, you can suppress them, as we will in the remainder of the R Companion, by entering the command: options(show.signif.stars=FALSE).'

This is a rare case in which I find myself disagreeing with Martin, whose arguments are almost invariably careful and considered. In particular, the crude discretization of p-values into several categories seems a poor visualization to me, and in any event "scanning" many p-values quickly, which is the use-case that Martin cites, avoids serious issues of simultaneous inference.
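To make the simultaneous-inference point concrete, here is a small sketch comparing raw p-values with Holm-adjusted ones via `p.adjust()`. The p-values are invented for illustration, and Holm's method is just one of several adjustments that could be used; it is not something the thread prescribes.

```r
# Made-up p-values for five coefficients, for illustration only.
p <- c(0.004, 0.011, 0.039, 0.041, 0.20)

# Holm adjustment for looking at all five tests at once.
p_holm <- p.adjust(p, method = "holm")

# Side-by-side comparison of raw and adjusted values.
print(data.frame(raw = p, holm = p_holm))
```

The adjusted values are never smaller than the raw ones, so a quick scan of unadjusted p-values (with or without stars) can overstate the evidence when many terms are inspected together.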

Best,
 John

> -----Original Message-----
> From: R-devel [mailto:[hidden email]] On Behalf Of
> Therneau, Terry M., Ph.D. via R-devel
> Sent: Thursday, March 28, 2019 9:28 AM
> To: [hidden email]
> Subject: Re: [Rd] default for 'signif.stars'

Re: default for 'signif.stars'

Andrew Robinson
In reply to this post by Lenth, Russell V
Hi Martin,

I take your point - but I'd argue that significance stars are a clumsy
solution to the very real problem that you outline, and their inclusion as
a default sends a signal about their appropriateness that I would prefer R
not to endorse.

My preference (to the extent that it matters) would be to see the
significance stars be an option but not a default one, and the addition of
different functionality to handle the many-predictor problem, perhaps a new
summary that more efficiently provides more useful information.

If we were to invent lm() now, how would we solve the problem of big P?  I
don't think we would use stars.

Cheers,

Andrew




On Thu, 28 Mar 2019 at 20:19, Martin Maechler <[hidden email]>
wrote:



Re: default for 'signif.stars'

Abby Spurdle
In reply to this post by Lenth, Russell V
> If we were to invent lm() now, how would we solve the problem of big P?
> I don't think we would use stars.

Assuming that this is a good idea in the first place, here's a simple
solution, in the context of backward selection.

One could sort the terms, from lowest p-value to highest p-value.
If each variable is associated with more than one parameter (e.g.
interactions), that complicates things, but the same principle
applies.

It would be possible to group terms based on their significance level,
although this is unlikely to be popular. You could also use a head() and
tail() approach, something I've been using a lot in other contexts.

However, I think a better solution is to automate the backward selection
process, but that requires decision rules, and we're back to the
original problem.
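The sorting and head()/tail() ideas above can be sketched roughly as follows. The model and the `mtcars` data are arbitrary assumptions for illustration, and the sketch only handles the simple case of one parameter per term.

```r
# Illustrative model on a built-in dataset (the model choice is arbitrary).
fit <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# Coefficient table as a matrix; the last column holds the p-values.
coefs <- coef(summary(fit))

# Drop the intercept and order the remaining terms by p-value.
terms_only <- coefs[rownames(coefs) != "(Intercept)", , drop = FALSE]
sorted <- terms_only[order(terms_only[, "Pr(>|t|)"]), , drop = FALSE]

# Smallest and largest p-values, in the head()/tail() style.
print(head(sorted, 2))
print(tail(sorted, 2))
```

With many predictors, only the two ends of the sorted table need to be read, without any symbols added to the output.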


Re: [External] re: default for 'signif.stars'

Lenth, Russell V
In reply to this post by Abby Spurdle
Abs,

There are definitely problems with the editorial, but I think "most mega-ultra-super-biased" is an overreaction. It appears that you have overlooked some of the points made there, and the fact that it does not pretend to be an exhaustive list of alternative methods. The editorial attempts to digest what is in 43 articles in that special issue. Some of those articles do promote Bayesian methods – not a surprise – and some advocate using P values but without ascribing magical properties to P < 0.05. My own emmeans package does present P values (sans stars, or emojis either) in a lot of contexts.

More to the point, the criticisms you offer have to do with later sections of the editorial – not the initial part, which is largely a repeat of an earlier ASA statement on interpretation of P values with the added recommendation that people should never say "statistically significant." It is that initial part that I think does describe a consensus of a large and growing proportion of statisticians and other scientists that placing undue emphasis on "statistical significance" is a bad thing. Emphasizing P values by adding stars encourages that kind of misdirected emphasis.

It seems fairly harmless to change the default for "show.signif.stars" to FALSE. However, I do recognize that no change to R's defaults should be taken lightly or done without careful consideration. I only ask that such careful consideration take place, and hope in fact that a plan can be made to phase in such a change.

Thanks,

Russ

Russell V. Lenth  -  Professor Emeritus
Department of Statistics and Actuarial Science  
The University of Iowa  -  Iowa City, IA 52242  USA  
Voice (319)335-0712 (Dept. office)  -  FAX (319)335-3017



From: Abs Spurdle <[hidden email]>
Sent: Thursday, March 28, 2019 12:19 AM
To: Lenth, Russell V <[hidden email]>; r-devel <[hidden email]>
Subject: [External] re: [Rd] default for 'signif.stars'
