findInterval Documentation Suggestion

findInterval Documentation Suggestion

R devel mailing list
I've found over time that R documentation that comes off as terse at
first blush is usually revealed to be precise, concise, and complete
on close reading.  I'm sure this is also true of `?findInterval`, but
for whatever reason my brain simply refuses to extract meaning from it.

Part of the problem may be that we interact with the function via a
compressed form of the bounds of the intervals (only specify left bounds
for 2nd interval onwards), but the semantics are described mostly in
terms of the intervals themselves.  This requires indirections to map
the parameters to the concepts.

An alternative is to first describe what the function does directly in
terms of its inputs, and subsequently relate that to the intervals.  If I
understand correctly (in default mode) the function can be described as:

     Given a vector of non-decreasing values 'vec', for each value in
     'x' return the highest position in 'vec' that corresponds to a
     value less than or equal to that 'x' value, or zero if none are.
     Equivalently, if the values in 'vec' are taken to be the closed
     left-bounds of contiguous half-open intervals, return which of
     those intervals each value of 'x' lies in.
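
A minimal sketch of that reading in R, assuming only base R.  The helper
`highest_pos_le` and the sample values are hypothetical and exist purely for
illustration; it counts the entries of 'vec' that are <= each 'x' value, which
for a non-decreasing 'vec' is the highest such position:

```r
# Hypothetical helper (not part of R): for each x value, count the entries of a
# non-decreasing 'vec' that are <= that value. The count equals the highest
# position holding such an entry, or 0 if there is none.
highest_pos_le <- function(x, vec) {
  vapply(x, function(xi) sum(vec <= xi), integer(1))
}

vec <- c(1, 3, 5, 8)        # made-up breakpoints
x   <- c(0, 1, 2.5, 5, 9)   # made-up query values

res_builtin <- findInterval(x, vec)   # default mode
res_naive   <- highest_pos_le(x, vec)

print(res_builtin)                      # 0 1 1 3 4
print(identical(res_builtin, res_naive))  # TRUE
```

The agreement on this one input is of course not a proof that the prose and
the function coincide in every corner case.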

Compared to the original:

     Given a vector of non-decreasing breakpoints in ‘vec’, find the
     interval containing each element of ‘x’; i.e., if ‘i <-
     findInterval(x,v)’, for each index ‘j’ in ‘x’ v[i[j]] <= x[j] <
     v[i[j] + 1] where v[0] := - Inf, v[N+1] := + Inf, and ‘N <-
     length(v)’.  At the two boundaries, the returned index may differ
     by 1, depending on the optional arguments ‘rightmost.closed’ and
     ‘all.inside’.
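
The inequality in the original can be checked directly.  A minimal sketch,
assuming base R only; `v_ext`, `lower`, `upper`, and `holds` are illustrative
names, and the padded vector stands in for the v[0] := -Inf and
v[N+1] := +Inf convention, since R vectors have no position 0:

```r
v <- c(1, 3, 5, 8)        # made-up breakpoints
x <- c(0, 1, 2.5, 5, 9)   # made-up query values
i <- findInterval(x, v)   # default mode

# Pad v so that v_ext[k + 1] plays the role of v[k] for k = 0, ..., N + 1.
v_ext <- c(-Inf, v, Inf)
lower <- v_ext[i + 1]     # v[i[j]]
upper <- v_ext[i + 2]     # v[i[j] + 1]

# Documented property: v[i[j]] <= x[j] < v[i[j] + 1] for every j.
holds <- lower <= x & x < upper
print(all(holds))         # TRUE
```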

Obviously you would be right to question whether someone who claims not
to understand the documentation should venture to re-write it.
Nonetheless I attach a proposed alternate version in the hopes that
someone who clearly understands the original might use or adapt parts of it to
make `?findInterval` more accessible to those comprehension-challenged
like me.


Best,

Brodie
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: findInterval Documentation Suggestion

R devel mailing list
Trying the attachment as .txt instead of Rd.


On Thursday, March 5, 2020, 5:20:25 PM EST, brodie gaslam via R-devel <[hidden email]> wrote:

<snip>


Attachment: findInterval2.txt (5K)

Re: findInterval Documentation Suggestion

Martin Maechler
In reply to this post by R devel mailing list
>>>>> brodie gaslam via R-devel
>>>>>     on Thu, 5 Mar 2020 22:18:33 +0000 (UTC) writes:

    > I've found over time that R documentation that comes off as terse at
    > first blush is usually revealed to be precise, concise, and complete
    > on close reading.  I'm sure this is also true of `?findInterval`, but
    > for whatever reason my brain simply refuses to extract meaning from it.

    > Part of the problem may be that we interact with the function via a
    > compressed form of the bounds of the intervals (only specify left bounds
    > for 2nd interval onwards), but the semantics are described mostly in
    > terms of the intervals themselves.  This requires indirections to map
    > the parameters to the concepts.

    > An alternative is to first describe what the function does directly in
    > terms of its inputs, and subsequently relate that to the intervals.  If I
    > understand correctly (in default mode) the function can be described as:

    >      Given a vector of non-decreasing values 'vec', for each value in
    >      'x' return the highest position in 'vec' that corresponds to a
    >      value less than or equal to that 'x' value, or zero if none are.
    >      Equivalently, if the values in 'vec' are taken to be the closed
    >      left-bounds of contiguous half-open intervals, return which of
    >      those intervals each value of 'x' lies in.

    > Compared to the original:

    >      Given a vector of non-decreasing breakpoints in ‘vec’, find the
    >      interval containing each element of ‘x’; i.e., if ‘i <-
    >      findInterval(x,v)’, for each index ‘j’ in ‘x’ v[i[j]] <= x[j] <
    >      v[i[j] + 1] where v[0] := - Inf, v[N+1] := + Inf, and ‘N <-
    >      length(v)’.  At the two boundaries, the returned index may differ
    >      by 1, depending on the optional arguments ‘rightmost.closed’ and
    >      ‘all.inside’.

Note that the  * -> LaTeX -> PDF rendered version looks a bit
nicer.

  See lower part of page 206 of (the 33nn pages of)
  https://cran.r-project.org/doc/manuals/r-release/fullrefman.pdf
 
I wrote the function and that help page originally.  Of
course, I'm interested to hear how to improve the documentation.
However, the help pages make up the "Reference Manual", and so
-- as you mention initially -- should be precise and (mostly)
comprehensive.

For that reason, replacing the well defined precise
inequality-based definition by *much* less precise English prose
is out of the question.

Extending that very long first sentence
    "Given .... .... .... length(v)'.
by adding some helper words or other means may be fine and
indeed an improvement, .. so I'm happy for another try.

Martin

    > Obviously you would be right to question whether someone who claims not
    > to understand the documentation should venture to re-write it.
    > Nonetheless I attach a proposed alternate version in the hopes that
    > someone who clearly understands the original might use or adapt parts of it to
    > make `?findInterval` more accessible to those comprehension-challenged
    > like me.


    > Best,
    > Brodie


Re: findInterval Documentation Suggestion

R devel mailing list
 > On Friday, March 6, 2020, 8:56:54 AM EST, Martin Maechler <[hidden email]> wrote:

> Note that the  * -> LaTeX -> PDF rendered version looks a bit nicer.

Ah yes, that does indeed look quite a bit nicer.

> I wrote the function and that help page originally.

And thank you for doing so. It is a wonderful function.
(0 sarcasm here).

> For that reason, replacing the well defined precise
> inequality-based definition by *much* less precise English prose
> is out of the question.

I figured that might be an issue.  Would you be open to
providing a prose translation, but putting that in the
details? If so, it would be useful to get feedback on
what parts of the prose I proposed are imprecise enough
to be incorrect/incomplete for some corner case.

Finally, would it make sense to move this discussion to
bugzilla?

Best,

Brodie.


Re: findInterval Documentation Suggestion

R devel mailing list

> On Mar 6, 2020, at 9:17 AM, brodie gaslam via R-devel <[hidden email]> wrote:
>
>> On Friday, March 6, 2020, 8:56:54 AM EST, Martin Maechler <[hidden email]> wrote:
>
>> Note that the  * -> LaTeX -> PDF rendered version looks a bit nicer.
>
> Ah yes, that does indeed look quite a bit nicer.
>
>> I wrote the function and that help page originally.
>
> And thank you for doing so. It is a wonderful function.
> (0 sarcasm here).
>
>> For that reason, replacing the well defined precise
>> inequality-based definition by *much* less precise English prose
>> is out of the question.
>
> I figured that might be an issue.  Would you be open to
> providing a prose translation, but putting that in the
> details? If so, it would be useful to get feedback on
> what parts of the prose I proposed are imprecise enough
> to be incorrect/incomplete for some corner case.
>
> Finally, would it make sense to move this discussion to
> bugzilla?
>
> Best,
>
> Brodie.


Hi,

Just to put forth an alternative to modifying the existing, precise content that Martin wrote: in many cases, that content can reasonably be supplemented with specific examples, and perhaps concise comments, that demonstrate what may otherwise be surprising behavior.

If Brodie can construct one or more such examples that provide additional insight, then perhaps they can be considered for inclusion in the help file. That would meet both goals: expanding comprehension without compromising the language that Martin has contributed.
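
One such example, offered only as a sketch of what could be proposed and not as text from the help file, contrasts the default result with `rightmost.closed` and `all.inside` at the two boundaries. The values of `v` and `x` and the result names are made up for illustration:

```r
v <- c(1, 3, 5, 8)   # made-up breakpoints, N = 4
x <- c(0, 1, 8, 9)   # below v[1], equal to v[1], equal to v[N], above v[N]

# Default: 0 for values below v[1], N for values at or above v[N].
res_default <- findInterval(x, v)
print(res_default)   # 0 1 4 4

# rightmost.closed = TRUE: a value equal to v[N] is placed in interval N - 1.
res_closed <- findInterval(x, v, rightmost.closed = TRUE)
print(res_closed)    # 0 1 3 4

# all.inside = TRUE: results are forced into 1, ..., N - 1.
res_inside <- findInterval(x, v, all.inside = TRUE)
print(res_inside)    # 1 1 3 3
```

Only the boundary positions change between the three calls; interior values of `x` would get the same index in each.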

Regards,

Marc Schwartz
