specials issue, a heads up

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

specials issue, a heads up

R devel mailing list
I recently had a long argument wrt the survival package, namely that the following code
didn't do what they expected, and so they reported it as a bug

   survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst),
data=lung)

a. The Google R style guide  recommends that one put :: everywhere
b. This breaks the recognition of cluster as a "special" in the terms function.

I've been stubborn and said that their misunderstanding of how formulas work is not my
problem.   But I'm sure that the issue will come up again, and multiple other packages
will break.

A big problem is that the code runs, it just gives the wrong answer.

Suggestions?

Terry T.


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

Hugh Parsonage
I mean if the person filing the bug regards style as more important than
the truth of how R treats formulas then they’re literally talking in
another language.

I strongly recommend you do nothing or at most make a note in the
documentation addressing this. Your time is too valuable.

On Tue, 25 Feb 2020 at 12:56 am, Therneau, Terry M., Ph.D. via R-devel <
[hidden email]> wrote:

> I recently had a long argument wrt the survival package, namely that the
> following code
> didn't do what they expected, and so they reported it as a bug
>
>    survival::coxph( survival::Surv(time, status) ~ age + sex +
> survival::strata(inst),
> data=lung)
>
> a. The Google R style guide  recommends that one put :: everywhere
> b. This breaks the recognition of cluster as a "special" in the terms
> function.
>
> I've been stubborn and said that their misunderstanding of how formulas
> work is not my
> problem.   But I'm sure that the issue will come up again, and multiple
> other packages
> will break.
>
> A big problem is that the code runs, it just gives the wrong answer.
>
> Suggestions?
>
> Terry T.
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

bbolker
In the long run, coming up with a way to parse specials in formulas
that is both clean and robust is a good idea - annoying users are a
little bit like CRAN maintainers in this respect. I think I would
probably do this by testing identical(eval(extracted_head),
survival::Surv) - but this has lots of potential annoyances (what if
extracted_head is a symbol that can't be found in any attached
environment?  Do we have to start with if
(length(find(deparse(extracted_head))>0) ?

In the short run, a clear note in the documentation seems entirely sufficient.

On Mon, Feb 24, 2020 at 12:01 PM Hugh Parsonage
<[hidden email]> wrote:

>
> I mean if the person filing the bug regards style as more important than
> the truth of how R treats formulas then they’re literally talking in
> another language.
>
> I strongly recommend you do nothing or at most make a note in the
> documentation addressing this. Your time is too valuable.
>
> On Tue, 25 Feb 2020 at 12:56 am, Therneau, Terry M., Ph.D. via R-devel <
> [hidden email]> wrote:
>
> > I recently had a long argument wrt the survival package, namely that the
> > following code
> > didn't do what they expected, and so they reported it as a bug
> >
> >    survival::coxph( survival::Surv(time, status) ~ age + sex +
> > survival::strata(inst),
> > data=lung)
> >
> > a. The Google R style guide  recommends that one put :: everywhere
> > b. This breaks the recognition of cluster as a "special" in the terms
> > function.
> >
> > I've been stubborn and said that their misunderstanding of how formulas
> > work is not my
> > problem.   But I'm sure that the issue will come up again, and multiple
> > other packages
> > will break.
> >
> > A big problem is that the code runs, it just gives the wrong answer.
> >
> > Suggestions?
> >
> > Terry T.
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

Duncan Murdoch-2
In reply to this post by R devel mailing list
On 24/02/2020 8:55 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:

> I recently had a long argument wrt the survival package, namely that the following code
> didn't do what they expected, and so they reported it as a bug
>
>     survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst),
> data=lung)
>
> a. The Google R style guide  recommends that one put :: everywhere
> b. This breaks the recognition of cluster as a "special" in the terms function.
>
> I've been stubborn and said that their misunderstanding of how formulas work is not my
> problem.   But I'm sure that the issue will come up again, and multiple other packages
> will break.
>
> A big problem is that the code runs, it just gives the wrong answer.
>
> Suggestions?

I don't know how widely used survival::strata is versus the special
strata (or cluster, or other specials).  If you were just introducing
this now, I'd try to make sure that only one of those worked:  don't
have any functions matching the names of specials, or have functions
that generate an error if you call them.  I did that in the much less
widely used "tables" package, e.g. Heading() has special interpretation,
and the Heading function is defined as

Heading <- function(name = NULL, override = TRUE,
                     character.only = FALSE,
                    nearData = TRUE)
     stop("This is a pseudo-function, not meant to be called.")

However, survival has far more users than tables does, so changing the
name of your special functions or the corresponding regular functions
could be a huge headache.

Perhaps there's a way to set a flag before evaluating the function in
the formula, and generate a warning if survival::strata is called when
it looks like the special function is intended.

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

Peter Dalgaard-2
Notice that the stats package contains the same issue: For some reason it defines an offset() function (for no particular reason, afaics) which just returns its argument. So

> x <- rnorm(10)
> y <- z <- 1:10
> lm(x~y+offset(z))

Call:
lm(formula = x ~ y + offset(z))

Coefficients:
(Intercept)            y  
     0.8253      -1.0840  

> lm(x~y+stats::offset(z))

Call:
lm(formula = x ~ y + stats::offset(z))

Coefficients:
     (Intercept)                 y  stats::offset(z)  
         0.82531          -0.08397                NA

So I'm inclined to say that formulas are formulas and functions using formulas interpret functions and operators at their own convenience. You also deserve what you get from

> lm(x~base::`+`(y,z))

Call:
lm(formula = x ~ base::`+`(y, z))

Coefficients:
    (Intercept)  base::`+`(y, z)  
        0.82531         -0.04198  

-pd

> On 24 Feb 2020, at 19:21 , Duncan Murdoch <[hidden email]> wrote:
>
> On 24/02/2020 8:55 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:
>> I recently had a long argument wrt the survival package, namely that the following code
>> didn't do what they expected, and so they reported it as a bug
>>    survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst),
>> data=lung)
>> a. The Google R style guide  recommends that one put :: everywhere
>> b. This breaks the recognition of cluster as a "special" in the terms function.
>> I've been stubborn and said that their misunderstanding of how formulas work is not my
>> problem.   But I'm sure that the issue will come up again, and multiple other packages
>> will break.
>> A big problem is that the code runs, it just gives the wrong answer.
>> Suggestions?
>
> I don't know how widely used survival::strata is versus the special strata (or cluster, or other specials).  If you were just introducing this now, I'd try to make sure that only one of those worked: don't have any functions matching the names of specials, or have functions that generate an error if you call them.  I did that in the much less widely used "tables" package, e.g. Heading() has special interpretation, and the Heading function is defined as
>
> Heading <- function(name = NULL, override = TRUE,
>                    character.only = FALSE,
>    nearData = TRUE)
>    stop("This is a pseudo-function, not meant to be called.")
>
> However, survival has far more users than tables does, so changing the name of your special functions or the corresponding regular functions could be a huge headache.
>
> Perhaps there's a way to set a flag before evaluating the function in the formula, and generate a warning if survival::strata is called when it looks like the special function is intended.
>
> Duncan Murdoch
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

robin hankin-3
In reply to this post by R devel mailing list
Terry, speaking as a package author I would say that the package is the
primary unit of organisation of R functionality, and package considerations
should trump R style considerations.  Packages should be self-contained as
far as possible.

Having said that, many of my own packages use---shall we say---distinct
idiom which is easy to misunderstand.  My suggestion would be to document
the misunderstanding. Add the survival::coxph() expression you quote above
to coxph.Rd,  maybe under a \warning{} section, explaining both a
reasonable but wrong, and the correct way, to parse such constructions.

Best wishes

Robin






On Tue, Feb 25, 2020 at 2:56 AM Therneau, Terry M., Ph.D. via R-devel <
[hidden email]> wrote:

> I recently had a long argument wrt the survival package, namely that the
> following code
> didn't do what they expected, and so they reported it as a bug
>
>    survival::coxph( survival::Surv(time, status) ~ age + sex +
> survival::strata(inst),
> data=lung)
>
> a. The Google R style guide  recommends that one put :: everywhere
> b. This breaks the recognition of cluster as a "special" in the terms
> function.
>
> I've been stubborn and said that their misunderstanding of how formulas
> work is not my
> problem.   But I'm sure that the issue will come up again, and multiple
> other packages
> will break.
>
> A big problem is that the code runs, it just gives the wrong answer.
>
> Suggestions?
>
> Terry T.
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

Peter Dalgaard-2
In reply to this post by bbolker
<musing>
For some reason, I can't read this without thinking of the tech support guy going

"No, it is http:// ... aitch - tee - tee - pee - colon - AS IN WHERE YOUR HEAD IS - slash - slash ..."
</musing>

-pd

> On 24 Feb 2020, at 18:26 , Ben Bolker <[hidden email]> wrote:
>
> In the long run, coming up with a way to parse specials in formulas
> that is both clean and robust is a good idea - annoying users are a
> little bit like CRAN maintainers in this respect. I think I would
> probably do this by testing identical(eval(extracted_head),
> survival::Surv) - but this has lots of potential annoyances (what if
> extracted_head is a symbol that can't be found in any attached
> environment?  Do we have to start with if
> (length(find(deparse(extracted_head))>0) ?
>
> In the short run, a clear note in the documentation seems entirely sufficient.
>
> On Mon, Feb 24, 2020 at 12:01 PM Hugh Parsonage
> <[hidden email]> wrote:
>>
>> I mean if the person filing the bug regards style as more important than
>> the truth of how R treats formulas then they’re literally talking in
>> another language.
>>
>> I strongly recommend you do nothing or at most make a note in the
>> documentation addressing this. Your time is too valuable.
>>
>> On Tue, 25 Feb 2020 at 12:56 am, Therneau, Terry M., Ph.D. via R-devel <
>> [hidden email]> wrote:
>>
>>> I recently had a long argument wrt the survival package, namely that the
>>> following code
>>> didn't do what they expected, and so they reported it as a bug
>>>
>>>   survival::coxph( survival::Surv(time, status) ~ age + sex +
>>> survival::strata(inst),
>>> data=lung)
>>>
>>> a. The Google R style guide  recommends that one put :: everywhere
>>> b. This breaks the recognition of cluster as a "special" in the terms
>>> function.
>>>
>>> I've been stubborn and said that their misunderstanding of how formulas
>>> work is not my
>>> problem.   But I'm sure that the issue will come up again, and multiple
>>> other packages
>>> will break.
>>>
>>> A big problem is that the code runs, it just gives the wrong answer.
>>>
>>> Suggestions?
>>>
>>> Terry T.
>>>
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: specials issue, a heads up

R devel mailing list
In reply to this post by Duncan Murdoch-2
Thanks to all who commented.   In some defense of the person who reported the "bug", it
appeared to be a company policy from above, influenced by the fact that they had been
often burned by not using :: when multiple packages use the same symbol.

Some further technical detail:  coxph has 3 specials: strata, cluster, and tt.  For the
last of these I took a different route, and did not make it an exported symbol in the
survival package. (I try to be smarter over time).  Instead, coxph first looks at the
formula.  If that formula contains tt() then it creates an envionment that contains the
function  "tt <- funciton(x) x", whose parent is the environment of the formula, changes
the formula's environment to this, and then proceeds as usual.    As a curiousity, I'm
going to change cluster to the same form, and them run my script that executes R CMD 
check on every package that has surivval as depend/import/suggest and see how many
break.   I'm guessing I'll find several that used cluster for their own ends. The strata
function will be worse.

Duncan -- to use your approach, I'd instead hack the formula before calling model.frame,
so that it does not try to call cluster?

Another approach I tried was not exporting cluster(), but that fails when model.frame
tries to call it.


Terry


On 2/24/20 12:21 PM, Duncan Murdoch wrote:

> On 24/02/2020 8:55 a.m., Therneau, Terry M., Ph.D. via R-devel wrote:
>> I recently had a long argument wrt the survival package, namely that the following code
>> didn't do what they expected, and so they reported it as a bug
>>
>>     survival::coxph( survival::Surv(time, status) ~ age + sex + survival::strata(inst),
>> data=lung)
>>
>> a. The Google R style guide  recommends that one put :: everywhere
>> b. This breaks the recognition of cluster as a "special" in the terms function.
>>
>> I've been stubborn and said that their misunderstanding of how formulas work is not my
>> problem.   But I'm sure that the issue will come up again, and multiple other packages
>> will break.
>>
>> A big problem is that the code runs, it just gives the wrong answer.
>>
>> Suggestions?
>
> I don't know how widely used survival::strata is versus the special strata (or cluster,
> or other specials).  If you were just introducing this now, I'd try to make sure that
> only one of those worked:  don't have any functions matching the names of specials, or
> have functions that generate an error if you call them.  I did that in the much less
> widely used "tables" package, e.g. Heading() has special interpretation, and the Heading
> function is defined as
>
> Heading <- function(name = NULL, override = TRUE,
>                     character.only = FALSE,
>             nearData = TRUE)
>     stop("This is a pseudo-function, not meant to be called.")
>
> However, survival has far more users than tables does, so changing the name of your
> special functions or the corresponding regular functions could be a huge headache.
>
> Perhaps there's a way to set a flag before evaluating the function in the formula, and
> generate a warning if survival::strata is called when it looks like the special function
> is intended.
>
> Duncan Murdoch


        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel