Regression stars


Regression stars

Frank Harrell
Today's GNU R tutorial in http://how-to.linuxcareer.com/a-quick-gnu-r-tutorial-to-statistical-models-and-graphics points out how bad statistical practice is being further perpetuated, by virtue of "significance stars" still being the default in printed output from lm models.
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: Regression stars

Fox, John
Dear Frank,

I'd like to second your implicit motion to make options(show.signif.stars=FALSE) the default.

Thanks for raising this point.

John
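
For readers who have not met the option, a minimal sketch of what the proposed default would change. The built-in `cars` data set and the helper `has_legend()` are illustrative choices only, not anything from the thread.

```r
# Fit a simple linear model on a built-in data set.
fit <- lm(dist ~ speed, data = cars)

# Default discussed in the thread: stars plus a "Signif. codes" legend.
old <- options(show.signif.stars = TRUE)
with_stars <- capture.output(print(summary(fit)))

# Proposed default: same estimates and p-values, no stars, no legend.
options(show.signif.stars = FALSE)
without_stars <- capture.output(print(summary(fit)))

options(old)  # restore whatever the session had before

# Illustrative helper: does the printed summary contain the star legend?
has_legend <- function(lines) any(grepl("Signif. codes", lines, fixed = TRUE))
has_legend(with_stars)
has_legend(without_stars)
```

Only the printed display differs between the two calls; the fitted model and its p-values are untouched.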

On Thu, 7 Feb 2013 05:32:04 -0800 (PST)
 Frank Harrell <[hidden email]> wrote:

> Today's GNU R tutorial in
> http://how-to.linuxcareer.com/a-quick-gnu-r-tutorial-to-statistical-models-and-graphics
> points out how bad statistical practice is being further perpetuated, by
> virtue of "significance stars" still being the default in printed output
> from lm models.
>
>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context: http://r.789695.n4.nabble.com/Regression-stars-tp4657795.html
> Sent from the R devel mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Regression stars

Marc Schwartz-3
FWIW, that has been my default setting for years in my .Rprofile.

If there is some agreement on this from R Core, it would seem that version 3.0.0 would be a reasonable breakpoint for this change in default behavior.

Regards,

Marc Schwartz



Re: regression stars

Therneau, Terry M., Ph.D.
In reply to this post by Frank Harrell
  There are only a few things in R where we override the global defaults on a departmental
level -- we really don't like to do so.  But "show.signif.stars" is one of the 3.

   The other 2 if you are curious: set stringsAsFactors=FALSE and make NA included by
default in the output of table. We've been overriding both of these for 10+ years.

Terry Therneau
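
The thread does not say how these three overrides are applied departmentally, so the following is only a per-call sketch of the same three behaviours. The data frame `d` and its column names are made up for illustration.

```r
# 1. No significance stars in printed model summaries.
old <- options(show.signif.stars = FALSE)

# 2. Keep character columns as character. The argument is passed explicitly
#    so the result does not depend on the R version's default.
d <- data.frame(id = 1:4, grp = c("a", "b", NA, "a"),
                stringsAsFactors = FALSE)
class(d$grp)

# 3. Include missing values in the output of table() rather than dropping them.
counts <- table(d$grp, useNA = "ifany")
counts

options(old)  # restore the previous option value
```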


Re: Regression stars

Norm Matloff
In reply to this post by Frank Harrell
Thanks for bringing this up, Frank.

Since many of us are "educators," I'd like to suggest a bolder approach.
Discontinue even offering the stars as an option.  Sadly, we can't stop
reporting p-values, as the world expects them, but does R need to cater
to that attitude by offering star display?  For that matter, why not
have R report confidence intervals as a default?

Many years ago, I wrote a short textbook on stat, and included a
substantial section on the dangers of significance testing.  All three
internal reviewers liked it, but the funny part is that all three said,
"I agree with this, but no one else will." :-)

Norm
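
Confidence intervals are not part of the default printed summary, but they are one call away. A minimal sketch, again assuming the built-in `cars` data purely for illustration:

```r
fit <- lm(dist ~ speed, data = cars)

# 95% confidence intervals for every coefficient in the model.
ci <- confint(fit, level = 0.95)

# Put the point estimates next to their interval limits.
est <- cbind(estimate = coef(fit), ci)
est
```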


Re: Regression stars

Tim Triche, Jr.
Changing the default for show.signif.stars should be sufficient to ensure
that, if people are going to get themselves into trouble, they will have to
do it on purpose.  It's just a visual cue; removing it will not remove the
underlying issue, namely blind acceptance of unlikely null models and
distributions.

For any complex problem, there is a solution that is simple, elegant, and
wrong.  As grants and careers can depend on these magic numbers, Upton
Sinclair might save everyone some trouble... It is difficult to get a man
to understand something, when his salary depends upon his not
understanding.

stringsAsFactors, however, is responsible for an endless stream of mildly
irritating misunderstandings, and defaulting that to FALSE would be very
nice.
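
One commonly cited example of such a misunderstanding (an illustration chosen here, not one Tim names) is converting a factor of digit strings to numbers:

```r
x <- c("10", "5", "20")

# What automatic string-to-factor conversion would hand you.
f <- factor(x)

# Converting the factor directly returns the internal level codes ...
codes <- as.numeric(f)

# ... whereas going through character recovers the original values.
values <- as.numeric(as.character(f))

codes
values  # 10 5 20
```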

Just my $0.02.  Defaults are one of the most powerful forces in the
universe.

Also, I liked your book.



--
*A model is a lie that helps you see the truth.*
Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>


Re: Regression stars

Tim Triche, Jr.
To clarify, I favor changing the defaults for stringsAsFactors and
show.signif.stars to FALSE in R-3.0.0, and view any attempt to remove
either functionality as a seemingly simple but fundamentally misguided idea.

This is just my opinion, of course.  The change could easily be accompanied
by a startup notice or release notes indicating that the changes have been
made, and can be reverted to past behavior if the user so desires.  Perhaps
more users will investigate the various settings, as a happy side effect.

My thanks to everyone who spends time supporting and working on R-core.




Re: Regression stars

Norm Matloff
In reply to this post by Frank Harrell
I appreciate Tim's comments.

I myself have a "social science" paper coming out soon in which I felt
forced to use p-values, given their ubiquity.  However, I also told
readers of the paper that confidence intervals are much more informative
and I do provide them.  As I said earlier, there is no avoiding that,
and R needs to report p-values for that reason.  

Instead, the question is what to do about the stars; I proposed
eliminating them altogether.  Star-crazed users know how to determine
them themselves from the p-values, but deleting them from R would send a
message.
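
To make the "determine them themselves" remark concrete, here is a minimal sketch that maps p-values to the usual star symbols. The function name `p_to_stars()` is hypothetical, and the cut points are the ones shown in R's printed "Signif. codes" legend; base R's `symnum()` can be used for the same kind of mapping.

```r
# Hypothetical helper: map p-values to the conventional star symbols.
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   labels = c("***", "**", "*", ".", " "),
                   include.lowest = TRUE))
}

p_to_stars(c(0.0004, 0.02, 0.3))  # "***" "*" " "
```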

I did say my proposal was "bold," which really meant I was suggesting
that R do SOMETHING to send that message, not necessarily star
elimination.

One such "something" would be the proposal I made, which would be to add
confidence intervals to the output.  This too could be just an option,
but again offering that option would send a message.  Indeed, I would
suggest that the help page explain that confidence intervals are more
informative.  (The help page could make a similar statement regarding
the stars.)

When I pitch R to people, I say that in addition to the large function
and library base and the nice graphics capabilities, R is above all
Statistically Correct--it's written by statisticians who know what they
are doing, rather than some programmer simply implementing a formula
from a textbook.  I know that a lot of people feel this is one of R's
biggest strengths.  Given that, one might argue that R should do what it
can to help users engage in good statistical practice.  I think this was
Frank's point.

Norm


Re: Regression stars

Frank Harrell
Great discussion. Tim's Sinclair quote is priceless and relates to the non-reproducible research done in some quarters. Norm's wish to remove stars altogether is entirely consistent with good statistical practice and would make a statement that R base adheres to good practice. I don't think it will work to add confidence intervals, because models can have nonlinear or interaction terms, and the reference cell for a factor variable may not be what the analyst chooses for a comparison group.
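
The reference-cell point can be sketched with the built-in `PlantGrowth` data (an illustrative choice, not one made in the thread): the same model reports different coefficients, and hence different default intervals, depending on which level is the baseline.

```r
# Default baseline: the first level of `group`, which is "ctrl".
fit_ctrl <- lm(weight ~ group, data = PlantGrowth)

# Same model with "trt1" as the comparison group instead.
pg <- transform(PlantGrowth, group = relevel(group, ref = "trt1"))
fit_trt1 <- lm(weight ~ group, data = pg)

names(coef(fit_ctrl))  # "(Intercept)" "grouptrt1" "grouptrt2"
names(coef(fit_trt1))  # "(Intercept)" "groupctrl" "grouptrt2"

# The fits are identical; only the contrasts being reported differ.
all.equal(fitted(fit_ctrl), fitted(fit_trt1))
```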

I would like for us to find a way to, over time, implement Norm's wish to de-emphasize P-values in general. The harm done by P-values is immeasurable.

Frank
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: Regression stars

Duncan Murdoch-2
In reply to this post by Tim Triche, Jr.
On 13-02-09 3:49 PM, Tim Triche, Jr. wrote:
> To clarify, I favor changing the defaults for stringsAsFactors and
> show.signif.stars to FALSE in R-3.0.0, and view any attempt to remove
> either functionality as a seemingly simple but fundamentally misguided idea.

Both of these were discussed by R Core.  I think it's unlikely the
default for stringsAsFactors will be changed (some R Core members like
the current behaviour), but it's fairly likely the show.signif.stars
default will change.  (That's if someone gets around to it:  I
personally don't care about that one.  P-values are commonly used
statistics, and the stars are just a simple graphical display of them.
I find some p-values to be useful, and the display to be harmless.)

I think it's really unlikely the more extreme changes (i.e. dropping
show.signif.stars completely, or dropping p-values) will happen.

Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
I'll let the people who like it defend it.  What I will likely do is
make a few changes so that character vectors are automatically changed
to factors in modelling functions, so that operating with
stringsAsFactors=FALSE doesn't trigger silly warnings.

Duncan Murdoch
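
Until such a change is in place, character columns can be converted explicitly before modelling. This is a minimal sketch with made-up data, and it is not the change Duncan describes making inside the modelling functions.

```r
# Illustrative data with a character grouping column.
d <- data.frame(
  y = c(1.0, 1.2, 2.1, 2.3, 3.0, 3.4),
  g = c("a", "a", "b", "b", "c", "c"),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor just before modelling.
is_chr <- vapply(d, is.character, logical(1))
d[is_chr] <- lapply(d[is_chr], factor)

fit <- lm(y ~ g, data = d)
names(coef(fit))  # "(Intercept)" "gb" "gc"
```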



Re: Regression stars

bbolker
Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:

  [snip]
>
> Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
> I'll let the people who like it defend it.  

  Would someone (anyone) like to come forward and give us a defense
of stringsAsFactors=TRUE -- even someone who doesn't personally like
it but would like to play devil's advocate?

> What I will likely do is
> make a few changes so that character vectors are automatically changed
> to factors in modelling functions, so that operating with
> stringsAsFactors=FALSE doesn't trigger silly warnings.
>
> Duncan Murdoch
>

 [apologies for snipping context: "gmane made me do it"]


Re: Regression stars

Uwe Ligges-3


On 12.02.2013 14:54, Ben Bolker wrote:

> Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
>
>    [snip]
>>
>> Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
>> I'll let the people who like it defend it.
>
>    Would someone (anyone) like to come forward and give us a defense
> of stringsAsFactors=TRUE -- even someone who doesn't personally like
> it but would like to play devil's advocate?

Sure:
I will have to change all my scripts, my teaching examples, my book, and
lots of code examples for research and particularly consulting jobs.

Personally, I think having stringsAsFactors=TRUE is not too bad for
read.table() but less useful for data.frame().
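
Uwe does not spell out his reasons, so the following is only one possible reading: columns read from a file are often categorical codes, while strings typed into `data.frame()` are often free-text labels. The data are invented, and both calls pass `stringsAsFactors` explicitly so the result does not depend on the R version.

```r
# Reading tabular text: the string column is a grouping code.
txt <- "id grp\n1 a\n2 b\n3 a\n"
from_file <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
class(from_file$grp)   # "factor"

# Building a data frame by hand: the string column is a free-text label.
built <- data.frame(id = 1:3,
                    label = c("first", "second", "third"),
                    stringsAsFactors = FALSE)
class(built$label)     # "character"
```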

And since you ask for the devil's advocate already, related to the
subject line: removing stars would be horrible for consulting. All those
people from biology, medicine and other fields even ask us questions
in terms of significance stars, which are obviously very common for them.
Many of them will certainly ask us for the stars, and ask us to switch
to another software product once they do not get them from R. They may not
be interested in being taught about the advantages or disadvantages of
p-values or stars.

There are different use cases of R, and I want to keep stars for
consulting tasks where things have to be delivered within minutes. I am
happy with or without for teaching, where I have the time and can easily
talk about the sense and nonsense of p-values.


Best,
Uwe


Re: Regression stars

Frank Harrell
Uwe, I've been consulting for decades and have never once been asked for such stars. And when a clinical researcher puts a sentence in a study protocol that P<0.05 will be considered "significant", I get them to take it out.
Frank
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: Regression stars

Duncan Murdoch-2
In reply to this post by Uwe Ligges-3
On 12/02/2013 9:20 AM, Uwe Ligges wrote:

>
> On 12.02.2013 14:54, Ben Bolker wrote:
> > Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
> >
> >    [snip]
> >>
> >> Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
> >> I'll let the people who like it defend it.
> >
> >    Would someone (anyone) like to come forward and give us a defense
> > of stringsAsFactors=TRUE -- even someone who doesn't personally like
> > it but would like to play devil's advocate?
>
> Sure:
> I will have to change all my scripts, my teaching examples, my book, and
> lots of code examples for research and particularly consulting jobs.

Could you post an example of a non-trivial one?  (By trivial, I mean one
that says "data.frame() converts character vectors to factors".
Obviously that would need to change.  I mean one that just assumes
current behaviour, and would be broken by the change.)

Duncan Murdoch
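
A hypothetical example of the non-trivial kind (not taken from anyone's scripts in this thread) is code that silently assumes a string column has levels. The helper `count_groups()` is invented for the sketch.

```r
# Hypothetical helper that assumes `grp` is a factor.
count_groups <- function(d) nlevels(d$grp)

as_factor <- data.frame(grp = c("a", "b", "a"), stringsAsFactors = TRUE)
as_char   <- data.frame(grp = c("a", "b", "a"), stringsAsFactors = FALSE)

count_groups(as_factor)  # 2
count_groups(as_char)    # 0, because a character vector has no levels
```

The second call raises no error or warning, so a change of default would alter the result without any visible signal.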



Re: Regression stars

Ravi Varadhan-2
In reply to this post by Frank Harrell
I think that we should use P < .03 (which approximates the probability of 5 consecutive heads) for assigning significance!

Ravi
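
For the record, the arithmetic behind the quip, assuming a fair coin:

```r
# Probability of five heads in five independent tosses of a fair coin.
p_five_heads <- 0.5^5
p_five_heads                      # 0.03125

# The same value from the binomial probability mass function.
dbinom(5, size = 5, prob = 0.5)
```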



Re: Regression stars

Uwe Ligges-3
In reply to this post by Frank Harrell


On 12.02.2013 15:42, Frank Harrell wrote:
> Uwe I've been consulting for decades and have never once been asked for such
> stars.

Honestly: the last time I was asked was last week.

And when I answered (in another case a few months ago) "OK, I can add
another 5 stars for you for p values smaller than 0.5" they did not find it too
funny.

Best,
Uwe


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Regression stars

Ravi Varadhan-2
They are "reaching for the stars".  Pardon my jest, but I couldn't resist.

Ravi

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Uwe Ligges
Sent: Tuesday, February 12, 2013 10:01 AM
To: Frank Harrell
Cc: [hidden email]
Subject: Re: [Rd] Regression stars



On 12.02.2013 15:42, Frank Harrell wrote:
> Uwe I've been consulting for decades and have never once been asked
> for such stars.

Honestly: the last time I was asked was last week.

And when I answered (in another case a few months ago) "OK, I can add another 5 stars for you for p-values smaller than 0.5", they did not find it too funny.

Best,
Uwe

> And when a clinical researcher puts a sentence in a study protocol
> that P<0.05 will be considered "significant" I get them to take it out.
>
> Frank
>
> Uwe Ligges-3 wrote
>> On 12.02.2013 14:54, Ben Bolker wrote:
>>> Duncan Murdoch
>> <murdoch.duncan <at>
>>   gmail.com> writes:
>>>
>>>     [snip]
>>>>
>>>> Regarding stringsAsFactors:  I'm not going to defend keeping it as
>>>> is, I'll let the people who like it defend it.
>>>
>>>     Would someone (anyone) like to come forward and give us a
>>> defense of stringsAsFactors=TRUE -- even someone who doesn't
>>> personally like it but would like to play devil's advocate?
>>
>> Sure:
>> I will have to change all my scripts, my teaching examples, my book,
>> and lots of code examples for research and particularly consulting jobs.
>>
>> Personally, I think having stringsAsFactors=TRUE is not too bad for
>> read.table() but less useful for data.frame().
>>
>> And since you ask for the devil's advocate already, related to the
>> subject line: Removing stars is horrible for consulting: With all
>> those people from biology, medicine and other fields who even ask us
>> questions in terms of significance stars that are obviously very common for them.
>> Many of them will certainly ask us for the stars, and ask us to
>> switch to another software product once they do not get it from R.
>> They may not be interested in being taught about the advantages or
>> disadvantages of p-values or stars.
>>
>> There are different use cases of R, and I want to keep stars for
>> consulting tasks where things have to be delivered within minutes. I
>> am happy with or without for teaching, where I have the time and can
>> easily talk about the sense and nonsense of p-values.
>>
>>
>> Best,
>> Uwe
>>>
>>>> What I will likely do is
>>>> make a few changes so that character vectors are automatically
>>>> changed to factors in modelling functions, so that operating with
>>>> stringsAsFactors=FALSE doesn't trigger silly warnings.
>>>>
>>>> Duncan Murdoch
>>>>
>>>
>>>    [apologies for snipping context: "gmane made me do it"]
>>>
>>> ______________________________________________
>>> R-devel@ mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> ______________________________________________
>> R-devel@ mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
>
>
> -----
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Regression-stars-tp4657795p4658268.html
> Sent from the R devel mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: Regression stars

bbolker
In reply to this post by Uwe Ligges-3
On 13-02-12 09:20 AM, Uwe Ligges wrote:

>
>
> On 12.02.2013 14:54, Ben Bolker wrote:
>> Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
>>
>>    [snip]
>>>
>>> Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
>>> I'll let the people who like it defend it.
>>
>>    Would someone (anyone) like to come forward and give us a defense
>> of stringsAsFactors=TRUE -- even someone who doesn't personally like
>> it but would like to play devil's advocate?
>
> Sure:
> I will have to change all my scripts, my teaching examples, my book, and
> lots of code examples for research and particularly consulting jobs.
>
> Personally, I think having stringsAsFactors=TRUE is not too bad for
> read.table() but less useful for data.frame().
>
> And since you ask for the devil's advocate already, related to the
> subject line: Removing stars is horrible for consulting: With all those
> people from biology, medicine and other fields who even ask us questions
> in terms of significance stars that are obviously very common for them.
> Many of them will certainly ask us for the stars, and ask us to switch
> to another software product once they do not get it from R. They may not
> be interested in being taught about the advantages or disadvantages of
> p-values or stars.
>
> There are different use cases of R, and I want to keep stars for
> consulting tasks where things have to be delivered within minutes. I am
> happy with or without for teaching, where I have the time and can easily
> talk about the sense and nonsense of p-values.
>
>
> Best,
> Uwe

  Thanks, Uwe.
  Now let me go one step farther.

  Can you (or anyone) give a good argument **other than backward
compatibility** for keeping the stringsAsFactors=TRUE argument on
data.frame()?

  I appreciate your distinction between data.frame() and read.table()'s
use of stringsAsFactors, and I can see that there is some point for
quick-and-dirty interactive use in setting all non-numeric variables to
factors (arguing that wanting non-numerics as factors is somewhat more
common than wanting them as strings).
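
[Editorial illustration, not part of the original message.] A minimal sketch of the distinction being discussed, with made-up data. The stringsAsFactors argument is passed explicitly in both calls so the result does not depend on whatever default a given R version uses:

```r
# Same input, two explicit settings of stringsAsFactors.
as_factor <- data.frame(id = 1:3, grp = c("a", "b", "a"),
                        stringsAsFactors = TRUE)
as_string <- data.frame(id = 1:3, grp = c("a", "b", "a"),
                        stringsAsFactors = FALSE)

class(as_factor$grp)   # "factor"
class(as_string$grp)   # "character"
levels(as_factor$grp)  # "a" "b"
```

The factor version is convenient for quick modelling; the character version is the one to use when the column is free text that will be edited or pasted together later.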

  It might be nice to add an optional stringsAsFactors (and check.names)
argument to transform(): I've had to write my own Transform() function
to allow the defaults to be overridden, since transform() calls
data.frame() with the defaults.  (Setting the stringsAsFactors option
globally would work, although not for check.names.)
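
[Editorial illustration, not part of the original message.] The Transform() function mentioned above is not shown anywhere in the thread. The following is only a hypothetical sketch of what such a wrapper might look like, assuming it simply forwards stringsAsFactors and check.names to data.frame(); the name, the argument defaults and the example data are all assumptions:

```r
# Hypothetical wrapper: like transform(), but lets the caller choose
# stringsAsFactors and check.names instead of inheriting the defaults.
Transform <- function(.data, ..., stringsAsFactors = FALSE,
                      check.names = FALSE) {
  # Evaluate the name = value pairs with the columns of .data in scope.
  new_cols <- eval(substitute(list(...)), .data, parent.frame())
  # Keep only the columns that are not being replaced.
  kept <- .data[setdiff(names(.data), names(new_cols))]
  # Rebuild the data frame, passing the two options through explicitly.
  do.call(data.frame,
          c(list(kept), new_cols,
            list(stringsAsFactors = stringsAsFactors,
                 check.names = check.names)))
}

df  <- data.frame(x = 1:3)
out <- Transform(df, label = c("a", "b", "c"), `x squared` = x^2)
str(out)
```

In this sketch a replaced column moves to the end of the data frame rather than staying in place, which differs from transform(); a fuller version would preserve the original column order.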

  Ben Bolker

>
>>
>>> What I will likely do is
>>> make a few changes so that character vectors are automatically changed
>>> to factors in modelling functions, so that operating with
>>> stringsAsFactors=FALSE doesn't trigger silly warnings.
>>>
>>> Duncan Murdoch
>>>
>>
>>   [apologies for snipping context: "gmane made me do it"]
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Regression stars

Uwe Ligges-3


On 12.02.2013 16:40, Ben Bolker wrote:

> On 13-02-12 09:20 AM, Uwe Ligges wrote:
>>
>>
>> On 12.02.2013 14:54, Ben Bolker wrote:
>>> Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
>>>
>>>     [snip]
>>>>
>>>> Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
>>>> I'll let the people who like it defend it.
>>>
>>>     Would someone (anyone) like to come forward and give us a defense
>>> of stringsAsFactors=TRUE -- even someone who doesn't personally like
>>> it but would like to play devil's advocate?
>>
>> Sure:
>> I will have to change all my scripts, my teaching examples, my book, and
>> lots of code examples for research and particularly consulting jobs.
>>
>> Personally, I think having stringsAsFactors=TRUE is not too bad for
>> read.table() but less useful for data.frame().
>>
>> And since you ask for the devil's advocate already, related to the
>> subject line: Removing stars is horrible for consulting: With all those
>> people from biology, medicine and other fields who even ask us questions
>> in terms of significance stars that are obviously very common for them.
>> Many of them will certainly ask us for the stars, and ask us to switch
>> to another software product once they do not get it from R. They may not
>> be interested in being taught about the advantages or disadvantages of
>> p-values or stars.
>>
>> There are different use cases of R, and I want to keep stars for
>> consulting tasks where things have to be delivered within minutes. I am
>> happy with or without for teaching, where I have the time and can easily
>> talk about the sense and nonsense of p-values.
>>
>>
>> Best,
>> Uwe
>
>    Thanks, Uwe.
>    Now let me go one step farther.
>
>    Can you (or anyone) give a good argument **other than backward
> compatibility** for keeping the stringsAsFactors=TRUE argument on
> data.frame()?

No, I cannot,
Uwe


>
>    I appreciate your distinction between data.frame() and read.table()'s
> use of stringsAsFactors, and I can see that there is some point for
> quick-and-dirty interactive use in setting all non-numeric variables to
> factors (arguing that wanting non-numerics as factors is somewhat more
> common than wanting them as strings).
>
>    It might be nice to add an optional stringsAsFactors (and check.names)
> argument to transform(): I've had to write my own Transform() function
> to allow the defaults to be overridden, since transform() calls
> data.frame() with the defaults.  (Setting the stringsAsFactors option
> globally would work, although not for check.names.)
>
>    Ben Bolker
>
>>
>>>
>>>> What I will likely do is
>>>> make a few changes so that character vectors are automatically changed
>>>> to factors in modelling functions, so that operating with
>>>> stringsAsFactors=FALSE doesn't trigger silly warnings.
>>>>
>>>> Duncan Murdoch
>>>>
>>>
>>>    [apologies for snipping context: "gmane made me do it"]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Regression stars

Duncan Murdoch-2
In reply to this post by bbolker
On 12/02/2013 10:40 AM, Ben Bolker wrote:

> On 13-02-12 09:20 AM, Uwe Ligges wrote:
> >
> >
> > On 12.02.2013 14:54, Ben Bolker wrote:
> >> Duncan Murdoch <murdoch.duncan <at> gmail.com> writes:
> >>
> >>    [snip]
> >>>
> >>> Regarding stringsAsFactors:  I'm not going to defend keeping it as is,
> >>> I'll let the people who like it defend it.
> >>
> >>    Would someone (anyone) like to come forward and give us a defense
> >> of stringsAsFactors=TRUE -- even someone who doesn't personally like
> >> it but would like to play devil's advocate?
> >
> > Sure:
> > I will have to change all my scripts, my teaching examples, my book, and
> > lots of code examples for research and particularly consulting jobs.
> >
> > Personally, I think having stringsAsFactors=TRUE is not too bad for
> > read.table() but less useful for data.frame().
> >
> > And since you ask for the devil's advocate already, related to the
> > subject line: Removing stars is horrible for consulting: With all those
> > people from biology, medicine and other fields who even ask us questions
> > in terms of significance stars that are obviously very common for them.
> > Many of them will certainly ask us for the stars, and ask us to switch
> > to another software product once they do not get it from R. They may not
> > be interested in being taught about the advantages or disadvantages of
> > p-values or stars.
> >
> > There are different use cases of R, and I want to keep stars for
> > consulting tasks where things have to be delivered within minutes. I am
> > happy with or without for teaching, where I have the time and can easily
> > talk about the sense and nonsense of p-values.
> >
> >
> > Best,
> > Uwe
>
>    Thanks, Uwe.
>    Now let me go one step farther.
>
>    Can you (or anyone) give a good argument **other than backward
> compatibility** for keeping the stringsAsFactors=TRUE argument on
> data.frame()?

I can, under two assumptions:

   1.  We keep stringsAsFactors=TRUE on read.table().
   2.  We keep the stringsAsFactors argument in data.frame().

Under those assumptions, it would just be confusing to have opposite
defaults.  (Just in case someone hasn't read all of this thread: I'd be
happier to have the default be FALSE in both cases, but not until
3.1.x.  For 3.0.x I think I'd just change the default value of
default.stringsAsFactors() to FALSE, so people could easily get the old
behaviour.)
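
[Editorial illustration, not part of the original message.] A small sketch of the confusion that opposite defaults would cause, simulated here by passing opposite values explicitly. The data are made up, and the text= argument of read.table() stands in for a file:

```r
# The same three rows, once read from text and once built directly,
# with the two functions given opposite stringsAsFactors settings.
txt <- "id grp\n1 a\n2 b\n3 a"
from_text <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)
built     <- data.frame(id = 1:3, grp = c("a", "b", "a"),
                        stringsAsFactors = FALSE)

sapply(from_text, class)  # grp is a factor here
sapply(built, class)      # grp is a character vector here

# The columns hold the same values but are not the same kind of object.
identical(from_text$grp, built$grp)                # FALSE
identical(as.character(from_text$grp), built$grp)  # TRUE
```

Code that works on one of these objects (for example, anything relying on levels()) can silently behave differently on the other, which is the argument for keeping the two defaults aligned.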

Duncan Murdoch

>
>    I appreciate your distinction between data.frame() and read.table()'s
> use of stringsAsFactors, and I can see that there is some point for
> quick-and-dirty interactive use in setting all non-numeric variables to
> factors (arguing that wanting non-numerics as factors is somewhat more
> common than wanting them as strings).
>
>    It might be nice to add an optional stringsAsFactors (and check.names)
> argument to transform(): I've had to write my own Transform() function
> to allow the defaults to be overridden, since transform() calls
> data.frame() with the defaults.  (Setting the stringsAsFactors option
> globally would work, although not for check.names.)
>
>    Ben Bolker
>
> >
> >>
> >>> What I will likely do is
> >>> make a few changes so that character vectors are automatically changed
> >>> to factors in modelling functions, so that operating with
> >>> stringsAsFactors=FALSE doesn't trigger silly warnings.
> >>>
> >>> Duncan Murdoch
> >>>
> >>
> >>   [apologies for snipping context: "gmane made me do it"]
> >>
> >> ______________________________________________
> >> [hidden email] mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel