Proposed Patch for poly.Rd

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
13 messages Options
Reply | Threaded
Open this post in threaded view
|

Proposed Patch for poly.Rd

Marc Schwartz-3
Hi All,

As per the discussion today on R-Help:

  https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html

I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.

This is based upon the svn trunk version of poly.Rd.

Thanks for your consideration.

Regards,

Marc Schwartz





______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

patch.txt (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Duncan Murdoch-2
On 13/07/2017 4:08 PM, Marc Schwartz wrote:
> Hi All,
>
> As per the discussion today on R-Help:
>
>   https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>
> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
>
> This is based upon the svn trunk version of poly.Rd.

I don't think this is the right fix.  The use of the unnamed 2nd arg as
degree happens whether the first arg is a matrix or not.

I didn't read the whole thread in detail, but it appears there's a bug
somewhere, in the report or in the poly() code or in the plsr() code.
That bug should be reported on the bug list if it turns out to be in
base R, and to the package maintainer if it is in plsr().

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Marc Schwartz-3

> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
>
> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>> Hi All,
>>
>> As per the discussion today on R-Help:
>>
>>  https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>>
>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
>>
>> This is based upon the svn trunk version of poly.Rd.
>
> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
>
> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
>
> Duncan Murdoch


Duncan,

Thanks for your reply. You only really need to read that last post in the thread linked to above.

I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():

x1 <- runif(20)
x2 <- runif(20)
mx <- cbind(x1, x2)

> poly(mx, 2)
Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
  'degree' must be less than number of unique points

The above error occurs because of the way in which 'mx' is transformed internally in poly(), as per the R-Help post I linked to above.


Compare that to:

> poly(mx, degree = 2)
              1.0          2.0         0.1          1.1         0.2
 [1,] -0.11175349 -0.112802655  0.34729146 -0.038811031  0.29371194
 [2,]  0.27620511 -0.102592711  0.27672559  0.076433023  0.10192546
 [3,]  0.31709686 -0.000822981 -0.06017089 -0.019080000 -0.20283645
 [4,] -0.05873472 -0.213373684  0.26314361 -0.015455666  0.07009778
 [5,] -0.17389885  0.046175314  0.08393899 -0.014596893 -0.19610518
 [6,] -0.07143282 -0.192226574  0.12931566 -0.009237383 -0.15572309
 [7,] -0.20924410  0.156380030 -0.38783860  0.081152937  0.46977236
 [8,]  0.09192574 -0.322960534 -0.13012298 -0.011961651 -0.13946871
 [9,] -0.08030862 -0.176345544 -0.11855987  0.009521379 -0.15294790
[10,]  0.26551532 -0.126030940 -0.09225246 -0.024494442 -0.17918115
[11,] -0.16961102  0.033781845  0.23980484 -0.040673544  0.01924080
[12,] -0.23503411  0.245845222  0.37898576 -0.089074579  0.39427472
[13,]  0.44343189  0.434902694  0.19305658  0.085607445 -0.06804699
[14,] -0.16429372  0.018706099 -0.04315970  0.007090868 -0.21166328
[15,]  0.04616179 -0.317237087 -0.09818924 -0.004532591 -0.17379927
[16,] -0.20148531  0.130959507 -0.32805340  0.066097939  0.27578123
[17,] -0.25585213  0.323634018 -0.34406268  0.088029169  0.32460950
[18,] -0.21168308  0.164513794 -0.10037452  0.021247587 -0.17173927
[19,]  0.41817752  0.333143463 -0.04018127 -0.016802902 -0.21294380
[20,]  0.08481772 -0.323649275 -0.16929688 -0.014359375 -0.08495871
attr(,"degree")
[1] 1 2 1 2 2
attr(,"coefs")
attr(,"coefs")[[1]]
attr(,"coefs")[[1]]$alpha
[1] 0.3596862 0.5799695

attr(,"coefs")[[1]]$norm2
[1]  1.000000 20.000000  1.898620  0.109334


attr(,"coefs")[[2]]
attr(,"coefs")[[2]]$alpha
[1] 0.5123548 0.5290189

attr(,"coefs")[[2]]$norm2
[1]  1.0000000 20.0000000  1.5765605  0.1255148


attr(,"class")
[1] "poly"   "matrix"



Thoughts?

Regards,

Marc

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Duncan Murdoch-2
On 13/07/2017 4:37 PM, Marc Schwartz wrote:

>
>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
>>
>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>>> Hi All,
>>>
>>> As per the discussion today on R-Help:
>>>
>>>  https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>>>
>>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
>>>
>>> This is based upon the svn trunk version of poly.Rd.
>>
>> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
>>
>> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
>>
>> Duncan Murdoch
>
>
> Duncan,
>
> Thanks for your reply. You only really need to read that last post in the thread linked to above.
>
> I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():
>
> x1 <- runif(20)
> x2 <- runif(20)
> mx <- cbind(x1, x2)
>
>> poly(mx, 2)
> Error in poly(dots[[i]], degree, raw = raw, simple = raw) :
>   'degree' must be less than number of unique points
>
> The above error occurs because of the way in which 'mx' is transformed internally in poly(), as per the R-Help post I linked to above.
>
>
> Compare that to:
>
>> poly(mx, degree = 2)
>               1.0          2.0         0.1          1.1         0.2
>  [1,] -0.11175349 -0.112802655  0.34729146 -0.038811031  0.29371194
>  [2,]  0.27620511 -0.102592711  0.27672559  0.076433023  0.10192546
>  [3,]  0.31709686 -0.000822981 -0.06017089 -0.019080000 -0.20283645
>  [4,] -0.05873472 -0.213373684  0.26314361 -0.015455666  0.07009778
>  [5,] -0.17389885  0.046175314  0.08393899 -0.014596893 -0.19610518
>  [6,] -0.07143282 -0.192226574  0.12931566 -0.009237383 -0.15572309
>  [7,] -0.20924410  0.156380030 -0.38783860  0.081152937  0.46977236
>  [8,]  0.09192574 -0.322960534 -0.13012298 -0.011961651 -0.13946871
>  [9,] -0.08030862 -0.176345544 -0.11855987  0.009521379 -0.15294790
> [10,]  0.26551532 -0.126030940 -0.09225246 -0.024494442 -0.17918115
> [11,] -0.16961102  0.033781845  0.23980484 -0.040673544  0.01924080
> [12,] -0.23503411  0.245845222  0.37898576 -0.089074579  0.39427472
> [13,]  0.44343189  0.434902694  0.19305658  0.085607445 -0.06804699
> [14,] -0.16429372  0.018706099 -0.04315970  0.007090868 -0.21166328
> [15,]  0.04616179 -0.317237087 -0.09818924 -0.004532591 -0.17379927
> [16,] -0.20148531  0.130959507 -0.32805340  0.066097939  0.27578123
> [17,] -0.25585213  0.323634018 -0.34406268  0.088029169  0.32460950
> [18,] -0.21168308  0.164513794 -0.10037452  0.021247587 -0.17173927
> [19,]  0.41817752  0.333143463 -0.04018127 -0.016802902 -0.21294380
> [20,]  0.08481772 -0.323649275 -0.16929688 -0.014359375 -0.08495871
> attr(,"degree")
> [1] 1 2 1 2 2
> attr(,"coefs")
> attr(,"coefs")[[1]]
> attr(,"coefs")[[1]]$alpha
> [1] 0.3596862 0.5799695
>
> attr(,"coefs")[[1]]$norm2
> [1]  1.000000 20.000000  1.898620  0.109334
>
>
> attr(,"coefs")[[2]]
> attr(,"coefs")[[2]]$alpha
> [1] 0.5123548 0.5290189
>
> attr(,"coefs")[[2]]$norm2
> [1]  1.0000000 20.0000000  1.5765605  0.1255148
>
>
> attr(,"class")
> [1] "poly"   "matrix"
>
>
>
> Thoughts?

I think based on this, there's a bug in poly():  poly(mx, 2) and
poly(mx, degree = 2) should be the same.  I'll put in a bug report.

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Marc Schwartz-3
In reply to this post by Marc Schwartz-3

> On Jul 13, 2017, at 3:37 PM, Marc Schwartz <[hidden email]> wrote:
>
>
>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
>>
>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>>> Hi All,
>>>
>>> As per the discussion today on R-Help:
>>>
>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>>>
>>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
>>>
>>> This is based upon the svn trunk version of poly.Rd.
>>
>> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
>>
>> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
>>
>> Duncan Murdoch
>
>
> Duncan,
>
> Thanks for your reply. You only really need to read that last post in the thread linked to above.
>
> I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():
>
> x1 <- runif(20)
> x2 <- runif(20)
> mx <- cbind(x1, x2)
>

<snip>

Duncan,

Tracing through the code for poly() using debug once with:

  poly(mx, 2)

and then with:

  poly(mx, degree = 2)

there is a difference in the transformation of 'mx' internally by the use of:

if (is.matrix(x)) {
    m <- unclass(as.data.frame(cbind(x, ...)))
    return(do.call(polym, c(m, degree = degree, raw = raw, list(coefs = coefs))))
}


In the first case, 'mx' ends up being transformed to:

Browse[2]> m
$x1
 [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
 [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
[13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
[19] 0.58641410 0.08858699

$x2
 [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
 [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
[13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
[19] 0.5110733 0.4122336

$V3
 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

attr(,"row.names")
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

Thus, when do.call() is used, m$V3 is passed as the 'x' argument on the third iteration, essentially resulting in:

> polym(rep(2, 20), degree = 2)
Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) :
 'degree' must be less than number of unique points


Note also that in this case, 'dots', which is the result of using list(...) on the initial call, is:

Browse[2]> dots
[[1]]
[1] 2


In the second case:

Browse[2]> m
$x1
 [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
 [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
[13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
[19] 0.58641410 0.08858699

$x2
 [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
 [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
[13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
[19] 0.5110733 0.4122336

attr(,"row.names")
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20


So, there is no m$V3.

Note also that 'dots' ends up being:

Browse[2]> dots
list()


In both cases, 'degree' is indeed 2, but the result of 'list(...)' on the initial function call is quite different.

So, I may be hypo-caffeinated, but if there is a bug here, it may be due to the way in which cbind() is being called in the code above, where the three dots are being used?

I can replicate the presumably correct behavior by using:

  m <- unclass(as.data.frame(cbind(x)))

instead of:

  m <- unclass(as.data.frame(cbind(x, ...)))

But I am not sure if removing the three dots in the cbind() call may have other unintended consequences.

Regards,

Marc

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Marc Schwartz-3

> On Jul 13, 2017, at 5:07 PM, Marc Schwartz <[hidden email]> wrote:
>
>
>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz <[hidden email]> wrote:
>>
>>
>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
>>>
>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>>>> Hi All,
>>>>
>>>> As per the discussion today on R-Help:
>>>>
>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>>>>
>>>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
>>>>
>>>> This is based upon the svn trunk version of poly.Rd.
>>>
>>> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
>>>
>>> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
>>>
>>> Duncan Murdoch
>>
>>
>> Duncan,
>>
>> Thanks for your reply. You only really need to read that last post in the thread linked to above.
>>
>> I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():
>>
>> x1 <- runif(20)
>> x2 <- runif(20)
>> mx <- cbind(x1, x2)
>>
>
> <snip>
>
> Duncan,
>
> Tracing through the code for poly() using debug once with:
>
>  poly(mx, 2)
>
> and then with:
>
>  poly(mx, degree = 2)
>
> there is a difference in the transformation of 'mx' internally by the use of:
>
> if (is.matrix(x)) {
>    m <- unclass(as.data.frame(cbind(x, ...)))
>    return(do.call(polym, c(m, degree = degree, raw = raw, list(coefs = coefs))))
> }
>
>
> In the first case, 'mx' ends up being transformed to:
>
> Browse[2]> m
> $x1
> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
> [19] 0.58641410 0.08858699
>
> $x2
> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
> [19] 0.5110733 0.4122336
>
> $V3
> [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>
> attr(,"row.names")
> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
>
> Thus, when do.call() is used, m$V3 is passed as the 'x' argument on the third iteration, essentially resulting in:
>
>> polym(rep(2, 20), degree = 2)
> Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) :
> 'degree' must be less than number of unique points
>
>
> Note also that in this case, 'dots', which is the result of using list(...) on the initial call, is:
>
> Browse[2]> dots
> [[1]]
> [1] 2
>
>
> In the second case:
>
> Browse[2]> m
> $x1
> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
> [19] 0.58641410 0.08858699
>
> $x2
> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
> [19] 0.5110733 0.4122336
>
> attr(,"row.names")
> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
>
>
> So, there is no m$V3.
>
> Note also that 'dots' ends up being:
>
> Browse[2]> dots
> list()
>
>
> In both cases, 'degree' is indeed 2, but the result of 'list(...)' on the initial function call is quite different.
>
> So, I may be hypo-caffeinated, but if there is a bug here, it may be due to the way in which cbind() is being called in the code above, where the three dots are being used?
>
> I can replicate the presumably correct behavior by using:
>
>  m <- unclass(as.data.frame(cbind(x)))
>
> instead of:
>
>  m <- unclass(as.data.frame(cbind(x, ...)))
>
> But I am not sure if removing the three dots in the cbind() call may have other unintended consequences.
>
> Regards,
>
> Marc


Duncan,

Some additional information here.

Reviewing the source code for the function in SVN:

  https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R

there is a relevant comment in the code:

 if(is.matrix(x)) { ## FIXME: fails when combined with 'unnamed degree' above
        m <- unclass(as.data.frame(cbind(x, ...)))
        return(do.call(polym, c(m, degree = degree, raw = raw,
                                list(coefs=coefs))))
    }


A version review would suggest that the above comment was added to the code back in 2015.

So it would appear that the behavior being discussed here is known.

I am still confused by the need for the '...' in the call to cbind(), which as far as I can tell, has been in the code at least back to 2003, when the poly() code was split from base. I am not sure why one would want to pass on other '...' arguments to cbind(), but I am presumably missing something here.

Regards,

Marc

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Martin Maechler
>>>>> Marc Schwartz <[hidden email]>
>>>>>     on Fri, 14 Jul 2017 06:57:26 -0500 writes:

    >> On Jul 13, 2017, at 5:07 PM, Marc Schwartz <[hidden email]> wrote:
    >>
    >>
    >>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz <[hidden email]> wrote:
    >>>
    >>>
    >>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
    >>>>
    >>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
    >>>>> Hi All,
    >>>>>
    >>>>> As per the discussion today on R-Help:
    >>>>>
    >>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
    >>>>>
    >>>>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
    >>>>>
    >>>>> This is based upon the svn trunk version of poly.Rd.
    >>>>
    >>>> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
    >>>>
    >>>> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
    >>>>
    >>>> Duncan Murdoch
    >>>
    >>>
    >>> Duncan,
    >>>
    >>> Thanks for your reply. You only really need to read that last post in the thread linked to above.
    >>>
    >>> I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():
    >>>
    >>> x1 <- runif(20)
    >>> x2 <- runif(20)
    >>> mx <- cbind(x1, x2)
    >>>
    >>
    >> <snip>
    >>
    >> Duncan,
    >>
    >> Tracing through the code for poly() using debug once with:
    >>
    >> poly(mx, 2)
    >>
    >> and then with:
    >>
    >> poly(mx, degree = 2)
    >>
    >> there is a difference in the transformation of 'mx' internally by the use of:
    >>
    >> if (is.matrix(x)) {
    >> m <- unclass(as.data.frame(cbind(x, ...)))
    >> return(do.call(polym, c(m, degree = degree, raw = raw, list(coefs = coefs))))
    >> }
    >>
    >>
    >> In the first case, 'mx' ends up being transformed to:
    >>
    >> Browse[2]> m
    >> $x1
    >> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
    >> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
    >> [19] 0.58641410 0.08858699
    >>
    >> $x2
    >> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
    >> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
    >> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
    >> [19] 0.5110733 0.4122336
    >>
    >> $V3
    >> [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
    >>
    >> attr(,"row.names")
    >> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    >>
    >> Thus, when do.call() is used, m$V3 is passed as the 'x' argument on the third iteration, essentially resulting in:
    >>
    >>> polym(rep(2, 20), degree = 2)
    >> Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) :
    >> 'degree' must be less than number of unique points
    >>
    >>
    >> Note also that in this case, 'dots', which is the result of using list(...) on the initial call, is:
    >>
    >> Browse[2]> dots
    >> [[1]]
    >> [1] 2
    >>
    >>
    >> In the second case:
    >>
    >> Browse[2]> m
    >> $x1
    >> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
    >> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
    >> [19] 0.58641410 0.08858699
    >>
    >> $x2
    >> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
    >> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
    >> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
    >> [19] 0.5110733 0.4122336
    >>
    >> attr(,"row.names")
    >> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    >>
    >>
    >> So, there is no m$V3.
    >>
    >> Note also that 'dots' ends up being:
    >>
    >> Browse[2]> dots
    >> list()
    >>
    >>
    >> In both cases, 'degree' is indeed 2, but the result of 'list(...)' on the initial function call is quite different.
    >>
    >> So, I may be hypo-caffeinated, but if there is a bug here, it may be due to the way in which cbind() is being called in the code above, where the three dots are being used?
    >>
    >> I can replicate the presumably correct behavior by using:
    >>
    >> m <- unclass(as.data.frame(cbind(x)))
    >>
    >> instead of:
    >>
    >> m <- unclass(as.data.frame(cbind(x, ...)))
    >>
    >> But I am not sure if removing the three dots in the cbind() call may have other unintended consequences.
    >>
    >> Regards,
    >>
    >> Marc


    > Duncan,

    > Some additional information here.
    > Reviewing the source code for the function in SVN:

    > https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R

    > there is a relevant comment in the code:

    > if(is.matrix(x)) { ## FIXME: fails when combined with 'unnamed degree' above
    >   m <- unclass(as.data.frame(cbind(x, ...)))
    >   return(do.call(polym, c(m, degree = degree, raw = raw,
    >                  list(coefs=coefs))))
    > }


    > A version review would suggest that the above comment was added to the code back in 2015.

Yes, by me, possibly here :

   $ svn log -v -c68727
   ------------------------------------------------------------------------
   r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23 Jul 2015) | 1 line
   Changed paths:
      M /trunk/doc/NEWS.Rd
      M /trunk/src/library/stats/R/contr.poly.R
      M /trunk/src/library/stats/man/poly.Rd
      M /trunk/tests/Examples/stats-Ex.Rout.save
      M /trunk/tests/reg-tests-1c.R

   poly(), polym() now work better notably for prediction
   ------------------------------------------------------------------------
   $ svn-diffB -c68727 doc/NEWS.Rd
   Index: doc/NEWS.Rd
   ===================================================================
   126a127,133
   >
   >       \item \code{polym()} gains a \code{coefs = NULL} argument and
   >       returns class \code{"poly"} just like \code{poly()} which gets a
   >       new \code{simple=FALSE} option.  They now lead to correct
   >       \code{predict()}ions, e.g., on subsets of the original data.
   >       %% see https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html


    > So it would appear that the behavior being discussed here is known.

Indeed!  I remember to have spent quite a few hours with the
code and its different uses before committing that patch.

    > I am still confused by the need for the '...' in the call to cbind(), which as far as I can tell, has been in the code at least back to 2003, when the poly() code was split from base.

    > I am not sure why one would want to pass on other '...' arguments to cbind(), but I am presumably missing something here.

Yes, I think passing the '...' is important there...
OTOH, I'm almost sure that I wrote the 'FIXME' because I thought
one should be able to do things better.
So, I'm happy to e-talk to you about how to get rid of the FIXME
and still remain back-compatible: Notably with the paragraph in  ?poly
|> Details:
|>
|>      Although formally ‘degree’ should be named (as it follows ‘...’),
|>      an unnamed second argument of length 1 will be interpreted as the
|>      degree, such that ‘poly(x, 3)’ can be used in formulas.

I'm appending an -- I think -- nice and pedagogical example on
how you can use  trace() in an easy but powerful way to see when
'...' is used in one example:

trExp <- quote({
    cat("'...':  "); str(list(...))
    cat("Local variables: "); print(ls.str())
    cat("---end{trace}---------------------\n")
    })
trace(poly, trExp)
trace(polym,trExp)

alm <- lm(stack.loss ~ poly(Air.Flow, Water.Temp, degree=3), stackloss)

which shows as

> alm <- lm(stack.loss ~ poly(Air.Flow, Water.Temp, degree=3), stackloss)
Tracing poly(Air.Flow, Water.Temp, degree = 3) on entry
'...':  List of 1
 $ : num [1:21] 27 27 25 24 22 23 24 24 23 18 ...
Local variables: coefs :  NULL
degree :  num 3
raw :  logi FALSE
simple :  logi FALSE
x :  num [1:21] 80 80 75 62 62 62 62 62 58 58 ...
---end{trace}---------------------
Tracing polym(x, ..., degree = degree, coefs = coefs, raw = raw) on entry
'...':  List of 2
 $ : num [1:21] 80 80 75 62 62 62 62 62 58 58 ...
 $ : num [1:21] 27 27 25 24 22 23 24 24 23 18 ...
Local variables: coefs :  NULL
degree :  num 3
raw :  logi FALSE
---end{trace}---------------------
Tracing poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) on entry
'...':  List of 1
 $ : num 3
Local variables: coefs :  NULL
degree :  num 1
raw :  logi FALSE
simple :  logi FALSE
x :  num [1:21] 80 80 75 62 62 62 62 62 58 58 ...
---end{trace}---------------------
Tracing poly(dots[[i]], degree, raw = raw, simple = raw) on entry
'...':  List of 1
 $ : num 3
Local variables: coefs :  NULL
degree :  num 1
raw :  logi FALSE
simple :  logi FALSE
x :  num [1:21] 27 27 25 24 22 23 24 24 23 18 ...
---end{trace}---------------------
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Martin Maechler
>>>>> Martin Maechler <[hidden email]>
>>>>>     on Fri, 14 Jul 2017 16:30:50 +0200 writes:

>>>>> Marc Schwartz <[hidden email]>
>>>>>     on Fri, 14 Jul 2017 06:57:26 -0500 writes:

    >>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz <[hidden email]> wrote:
    >>>
    >>>
    >>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz <[hidden email]> wrote:
    >>>>
    >>>>
    >>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
    >>>>>
    >>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
    >>>>>> Hi All,
    >>>>>>
    >>>>>> As per the discussion today on R-Help:
    >>>>>>
    >>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
    >>>>>>
    >>>>>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
    >>>>>>
    >>>>>> This is based upon the svn trunk version of poly.Rd.
    >>>>>
    >>>>> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
    >>>>>
    >>>>> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
    >>>>>
    >>>>> Duncan Murdoch
    >>>>
    >>>>
    >>>> Duncan,
    >>>>
    >>>> Thanks for your reply. You only really need to read that last post in the thread linked to above.
    >>>>
    >>>> I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():
    >>>>
    >>>> x1 <- runif(20)
    >>>> x2 <- runif(20)
    >>>> mx <- cbind(x1, x2)
    >>>>
    >>>
    >>> <snip>
    >>>
    >>> Duncan,
    >>>
    >>> Tracing through the code for poly() using debug once with:
    >>>
    >>> poly(mx, 2)
    >>>
    >>> and then with:
    >>>
    >>> poly(mx, degree = 2)
    >>>
    >>> there is a difference in the transformation of 'mx' internally by the use of:
    >>>
    >>> if (is.matrix(x)) {
    >>> m <- unclass(as.data.frame(cbind(x, ...)))
    >>> return(do.call(polym, c(m, degree = degree, raw = raw, list(coefs = coefs))))
    >>> }
    >>>
    >>>
    >>> In the first case, 'mx' ends up being transformed to:
    >>>
    >>> Browse[2]> m
    >>> $x1
    >>> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
    >>> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
    >>> [19] 0.58641410 0.08858699
    >>>
    >>> $x2
    >>> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
    >>> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
    >>> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
    >>> [19] 0.5110733 0.4122336
    >>>
    >>> $V3
    >>> [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
    >>>
    >>> attr(,"row.names")
    >>> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    >>>
    >>> Thus, when do.call() is used, m$V3 is passed as the 'x' argument on the third iteration, essentially resulting in:
    >>>
    >>>> polym(rep(2, 20), degree = 2)
    >>> Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) :
    >>> 'degree' must be less than number of unique points
    >>>
    >>>
    >>> Note also that in this case, 'dots', which is the result of using list(...) on the initial call, is:
    >>>
    >>> Browse[2]> dots
    >>> [[1]]
    >>> [1] 2
    >>>
    >>>
    >>> In the second case:
    >>>
    >>> Browse[2]> m
    >>> $x1
    >>> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
    >>> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
    >>> [19] 0.58641410 0.08858699
    >>>
    >>> $x2
    >>> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
    >>> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
    >>> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
    >>> [19] 0.5110733 0.4122336
    >>>
    >>> attr(,"row.names")
    >>> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    >>>
    >>>
    >>> So, there is no m$V3.
    >>>
    >>> Note also that 'dots' ends up being:
    >>>
    >>> Browse[2]> dots
    >>> list()
    >>>
    >>>
    >>> In both cases, 'degree' is indeed 2, but the result of 'list(...)' on the initial function call is quite different.
    >>>
    >>> So, I may be hypo-caffeinated, but if there is a bug here, it may be due to the way in which cbind() is being called in the code above, where the three dots are being used?
    >>>
    >>> I can replicate the presumably correct behavior by using:
    >>>
    >>> m <- unclass(as.data.frame(cbind(x)))
    >>>
    >>> instead of:
    >>>
    >>> m <- unclass(as.data.frame(cbind(x, ...)))
    >>>
    >>> But I am not sure if removing the three dots in the cbind() call may have other unintended consequences.
    >>>
    >>> Regards,
    >>>
    >>> Marc


    >> Duncan,

    >> Some additional information here.
    >> Reviewing the source code for the function in SVN:

    >> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R

    >> there is a relevant comment in the code:

    >> if(is.matrix(x)) { ## FIXME: fails when combined with 'unnamed degree' above
    >> m <- unclass(as.data.frame(cbind(x, ...)))
    >> return(do.call(polym, c(m, degree = degree, raw = raw,
    >> list(coefs=coefs))))
    >> }


    >> A version review would suggest that the above comment was added to the code back in 2015.

    > Yes, by me, possibly here :

    > $ svn log -v -c68727
    > ------------------------------------------------------------------------
    > r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23 Jul 2015) | 1 line
    > Changed paths:
    > M /trunk/doc/NEWS.Rd
    > M /trunk/src/library/stats/R/contr.poly.R
    > M /trunk/src/library/stats/man/poly.Rd
    > M /trunk/tests/Examples/stats-Ex.Rout.save
    > M /trunk/tests/reg-tests-1c.R

    > poly(), polym() now work better notably for prediction
    > ------------------------------------------------------------------------
    > $ svn-diffB -c68727 doc/NEWS.Rd
    > Index: doc/NEWS.Rd
    > ===================================================================
    > 126a127,133
    >>
    >> \item \code{polym()} gains a \code{coefs = NULL} argument and
    >> returns class \code{"poly"} just like \code{poly()} which gets a
    >> new \code{simple=FALSE} option.  They now lead to correct
    >> \code{predict()}ions, e.g., on subsets of the original data.
    >> %% see https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html


    >> So it would appear that the behavior being discussed here is known.

    > Indeed!  I remember to have spent quite a few hours with the
    > code and its different uses before committing that patch.

    >> I am still confused by the need for the '...' in the call to cbind(), which as far as I can tell, has been in the code at least back to 2003, when the poly() code was split from base.

    >> I am not sure why one would want to pass on other '...' arguments to cbind(), but I am presumably missing something here.

    > Yes, I think passing the '...' is important there...
    > OTOH, I'm almost sure that I wrote the 'FIXME' because I thought
    > one should be able to do things better.
    > So, I'm happy to e-talk to you about how to get rid of the FIXME
    > and still remain back-compatible: Notably with the paragraph in  ?poly
    > |> Details:
    > |>
    > |>      Although formally ‘degree’ should be named (as it follows ‘...’),
    > |>      an unnamed second argument of length 1 will be interpreted as the
    > |>      degree, such that ‘poly(x, 3)’ can be used in formulas.

As a matter of fact,  a patch seems very simple, and I am
testing it now.

Won't have much more time today, but will return "on this channel"
later, maybe tomorrow.

Martin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Marc Schwartz-3

> On Jul 14, 2017, at 9:50 AM, Martin Maechler <[hidden email]> wrote:
>
>>>>>> Martin Maechler <[hidden email]>
>>>>>>    on Fri, 14 Jul 2017 16:30:50 +0200 writes:
>
>>>>>> Marc Schwartz <[hidden email]>
>>>>>>    on Fri, 14 Jul 2017 06:57:26 -0500 writes:
>
>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz <[hidden email]> wrote:
>>>>
>>>>
>>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz <[hidden email]> wrote:
>>>>>
>>>>>
>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch <[hidden email]> wrote:
>>>>>>
>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> As per the discussion today on R-Help:
>>>>>>>
>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>>>>>>>
>>>>>>> I am attaching a proposed patch for poly.Rd to provide clarifying wording relative to naming the 'degree' argument explicitly, in the case where the 'x' argument is a matrix, rather than a vector.
>>>>>>>
>>>>>>> This is based upon the svn trunk version of poly.Rd.
>>>>>>
>>>>>> I don't think this is the right fix.  The use of the unnamed 2nd arg as degree happens whether the first arg is a matrix or not.
>>>>>>
>>>>>> I didn't read the whole thread in detail, but it appears there's a bug somewhere, in the report or in the poly() code or in the plsr() code. That bug should be reported on the bug list if it turns out to be in base R, and to the package maintainer if it is in plsr().
>>>>>>
>>>>>> Duncan Murdoch
>>>>>
>>>>>
>>>>> Duncan,
>>>>>
>>>>> Thanks for your reply. You only really need to read that last post in the thread linked to above.
>>>>>
>>>>> I won't deny the possibility of a bug in poly(), relative to the handling of 'x' as a matrix. The behavior occurs in the poly() function in a pure stand alone fashion, without the need for plsr():
>>>>>
>>>>> x1 <- runif(20)
>>>>> x2 <- runif(20)
>>>>> mx <- cbind(x1, x2)
>>>>>
>>>>
>>>> <snip>
>>>>
>>>> Duncan,
>>>>
>>>> Tracing through the code for poly() using debug once with:
>>>>
>>>> poly(mx, 2)
>>>>
>>>> and then with:
>>>>
>>>> poly(mx, degree = 2)
>>>>
>>>> there is a difference in the transformation of 'mx' internally by the use of:
>>>>
>>>> if (is.matrix(x)) {
>>>> m <- unclass(as.data.frame(cbind(x, ...)))
>>>> return(do.call(polym, c(m, degree = degree, raw = raw, list(coefs = coefs))))
>>>> }
>>>>
>>>>
>>>> In the first case, 'mx' ends up being transformed to:
>>>>
>>>> Browse[2]> m
>>>> $x1
>>>> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
>>>> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
>>>> [19] 0.58641410 0.08858699
>>>>
>>>> $x2
>>>> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
>>>> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
>>>> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
>>>> [19] 0.5110733 0.4122336
>>>>
>>>> $V3
>>>> [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>>>>
>>>> attr(,"row.names")
>>>> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
>>>>
>>>> Thus, when do.call() is used, m$V3 is passed as the 'x' argument on the third iteration, essentially resulting in:
>>>>
>>>>> polym(rep(2, 20), degree = 2)
>>>> Error in poly(dots[[1L]], degree, raw = raw, simple = raw && nd > 1) :
>>>> 'degree' must be less than number of unique points
>>>>
>>>>
>>>> Note also that in this case, 'dots', which is the result of using list(...) on the initial call, is:
>>>>
>>>> Browse[2]> dots
>>>> [[1]]
>>>> [1] 2
>>>>
>>>>
>>>> In the second case:
>>>>
>>>> Browse[2]> m
>>>> $x1
>>>> [1] 0.99056941 0.13953093 0.38965567 0.35353514 0.90838486 0.97552474
>>>> [7] 0.01135743 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 0.31164777 0.81694822
>>>> [19] 0.58641410 0.08858699
>>>>
>>>> $x2
>>>> [1] 0.6628658 0.9221436 0.3162418 0.8494452 0.4665010 0.3403719
>>>> [7] 0.4040692 0.4916650 0.9091161 0.2956006 0.3454689 0.3331070
>>>> [13] 0.8788974 0.5614636 0.7794396 0.2304009 0.6566537 0.6875646
>>>> [19] 0.5110733 0.4122336
>>>>
>>>> attr(,"row.names")
>>>> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
>>>>
>>>>
>>>> So, there is no m$V3.
>>>>
>>>> Note also that 'dots' ends up being:
>>>>
>>>> Browse[2]> dots
>>>> list()
>>>>
>>>>
>>>> In both cases, 'degree' is indeed 2, but the result of 'list(...)' on the initial function call is quite different.
>>>>
>>>> So, I may be hypo-caffeinated, but if there is a bug here, it may be due to the way in which cbind() is being called in the code above, where the three dots are being used?
>>>>
>>>> I can replicate the presumably correct behavior by using:
>>>>
>>>> m <- unclass(as.data.frame(cbind(x)))
>>>>
>>>> instead of:
>>>>
>>>> m <- unclass(as.data.frame(cbind(x, ...)))
>>>>
>>>> But I am not sure if removing the three dots in the cbind() call may have other unintended consequences.
>>>>
>>>> Regards,
>>>>
>>>> Marc
>
>
>>> Duncan,
>
>>> Some additional information here.
>>> Reviewing the source code for the function in SVN:
>
>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R
>
>>> there is a relevant comment in the code:
>
>>> if(is.matrix(x)) { ## FIXME: fails when combined with 'unnamed degree' above
>>> m <- unclass(as.data.frame(cbind(x, ...)))
>>> return(do.call(polym, c(m, degree = degree, raw = raw,
>>> list(coefs=coefs))))
>>> }
>
>
>>> A version review would suggest that the above comment was added to the code back in 2015.
>
>> Yes, by me, possibly here :
>
>> $ svn log -v -c68727
>> ------------------------------------------------------------------------
>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23 Jul 2015) | 1 line
>> Changed paths:
>> M /trunk/doc/NEWS.Rd
>> M /trunk/src/library/stats/R/contr.poly.R
>> M /trunk/src/library/stats/man/poly.Rd
>> M /trunk/tests/Examples/stats-Ex.Rout.save
>> M /trunk/tests/reg-tests-1c.R
>
>> poly(), polym() now work better notably for prediction
>> ------------------------------------------------------------------------
>> $ svn-diffB -c68727 doc/NEWS.Rd
>> Index: doc/NEWS.Rd
>> ===================================================================
>> 126a127,133
>>>
>>> \item \code{polym()} gains a \code{coefs = NULL} argument and
>>> returns class \code{"poly"} just like \code{poly()} which gets a
>>> new \code{simple=FALSE} option.  They now lead to correct
>>> \code{predict()}ions, e.g., on subsets of the original data.
>>> %% see https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html
>
>
>>> So it would appear that the behavior being discussed here is known.
>
>> Indeed!  I remember to have spent quite a few hours with the
>> code and its different uses before committing that patch.
>
>>> I am still confused by the need for the '...' in the call to cbind(), which as far as I can tell, has been in the code at least back to 2003, when the poly() code was split from base.
>
>>> I am not sure why one would want to pass on other '...' arguments to cbind(), but I am presumably missing something here.
>
>> Yes, I think passing the '...' is important there...
>> OTOH, I'm almost sure that I wrote the 'FIXME' because I thought
>> one should be able to do things better.
>> So, I'm happy to e-talk to you about how to get rid of the FIXME
>> and still remain back-compatible: Notably with the paragraph in  ?poly
>> |> Details:
>> |>
>> |>      Although formally ‘degree’ should be named (as it follows ‘...’),
>> |>      an unnamed second argument of length 1 will be interpreted as the
>> |>      degree, such that ‘poly(x, 3)’ can be used in formulas.
>
> As a matter of fact,  a patch seems very simple, and I am
> testing it now.
>
> Won't have much more time today, but will return "on this channel"
> later, maybe tomorrow.
>
> Martin


Martin,

Thanks for taking the time to look at this!

Marc

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Martin Maechler
>>>>> Marc Schwartz <[hidden email]>
>>>>>     on Fri, 14 Jul 2017 11:01:03 -0500 writes:

    >> On Jul 14, 2017, at 9:50 AM, Martin Maechler
    >> <[hidden email]> wrote:
    >>
    >>>>>>> Martin Maechler <[hidden email]> on Fri,
    >>>>>>> 14 Jul 2017 16:30:50 +0200 writes:
    >>
    >>>>>>> Marc Schwartz <[hidden email]> on Fri, 14 Jul
    >>>>>>> 2017 06:57:26 -0500 writes:
    >>
    >>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz
    >>>>> <[hidden email]> wrote:
    >>>>>
    >>>>>
>>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz
    >>>>> <[hidden email]> wrote:
>>>>>
>>>>>
    >>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch
    >>>>>>> <[hidden email]> wrote:
    >>>>>>>
    >>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
    >>>>>>>> Hi All,
    >>>>>>>>
    >>>>>>>> As per the discussion today on R-Help:
    >>>>>>>>
    >>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
    >>>>>>>>
    >>>>>>>> I am attaching a proposed patch for poly.Rd to
    >>>>>>>> provide clarifying wording relative to naming the
    >>>>>>>> 'degree' argument explicitly, in the case where the
    >>>>>>>> 'x' argument is a matrix, rather than a vector.
    >>>>>>>>
    >>>>>>>> This is based upon the svn trunk version of
    >>>>>>>> poly.Rd.
    >>>>>>>
    >>>>>>> I don't think this is the right fix.  The use of the
    >>>>>>> unnamed 2nd arg as degree happens whether the first
    >>>>>>> arg is a matrix or not.
    >>>>>>>
    >>>>>>> I didn't read the whole thread in detail, but it
    >>>>>>> appears there's a bug somewhere, in the report or in
    >>>>>>> the poly() code or in the plsr() code. That bug
    >>>>>>> should be reported on the bug list if it turns out
    >>>>>>> to be in base R, and to the package maintainer if it
    >>>>>>> is in plsr().
    >>>>>>>
    >>>>>>> Duncan Murdoch
>>>>>
>>>>>
>>>>> Duncan,
>>>>>
>>>>> Thanks for your reply. You only really need to read that
    >>>>>>> last post in the thread linked to above.
>>>>>
>>>>> I won't deny the possibility of a bug in poly(), relative
    >>>>>>> to the handling of 'x' as a matrix. The behavior
    >>>>>>> occurs in the poly() function in a pure stand alone
    >>>>>>> fashion, without the need for plsr():
>>>>>
>>>>> x1 <- runif(20)
>>>>> x2 <- runif(20)
>>>>> mx <- cbind(x1, x2)
>>>>>
    >>>>>
    >>>>> <snip>
    >>>>>
    >>>>> Duncan,
    >>>>>
    >>>>> Tracing through the code for poly() using debug once
    >>>>> with:
    >>>>>
    >>>>> poly(mx, 2)
    >>>>>
    >>>>> and then with:
    >>>>>
    >>>>> poly(mx, degree = 2)
    >>>>>
    >>>>> there is a difference in the transformation of 'mx'
    >>>>> internally by the use of:
    >>>>>
    >>>>> if (is.matrix(x)) { m <-
    >>>>> unclass(as.data.frame(cbind(x, ...)))
    >>>>> return(do.call(polym, c(m, degree = degree, raw = raw,
    >>>>> list(coefs = coefs)))) }
    >>>>>
    >>>>>
    >>>>> In the first case, 'mx' ends up being transformed to:
    >>>>>
    >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
    >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
    >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
    >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
    >>>>>
    >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
    >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
    >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
    >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
    >>>>> 0.4122336
    >>>>>
    >>>>> $V3 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
    >>>>>
    >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
    >>>>> 14 15 16 17 18 19 20
    >>>>>
    >>>>> Thus, when do.call() is used, m$V3 is passed as the
    >>>>> 'x' argument on the third iteration, essentially
    >>>>> resulting in:
    >>>>>
>>>>> polym(rep(2, 20), degree = 2) Error in poly(dots[[1L]],
    >>>>> degree, raw = raw, simple = raw && nd > 1) : 'degree'
    >>>>> must be less than number of unique points
    >>>>>
    >>>>>
    >>>>> Note also that in this case, 'dots', which is the
    >>>>> result of using list(...) on the initial call, is:
    >>>>>
    >>>>> Browse[2]> dots [[1]] [1] 2
    >>>>>
    >>>>>
    >>>>> In the second case:
    >>>>>
    >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
    >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
    >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
    >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
    >>>>>
    >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
    >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
    >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
    >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
    >>>>> 0.4122336
    >>>>>
    >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
    >>>>> 14 15 16 17 18 19 20
    >>>>>
    >>>>>
    >>>>> So, there is no m$V3.
    >>>>>
    >>>>> Note also that 'dots' ends up being:
    >>>>>
    >>>>> Browse[2]> dots list()
    >>>>>
    >>>>>
    >>>>> In both cases, 'degree' is indeed 2, but the result of
    >>>>> 'list(...)' on the initial function call is quite
    >>>>> different.
    >>>>>
    >>>>> So, I may be hypo-caffeinated, but if there is a bug
    >>>>> here, it may be due to the way in which cbind() is
    >>>>> being called in the code above, where the three dots
    >>>>> are being used?
    >>>>>
    >>>>> I can replicate the presumably correct behavior by
    >>>>> using:
    >>>>>
    >>>>> m <- unclass(as.data.frame(cbind(x)))
    >>>>>
    >>>>> instead of:
    >>>>>
    >>>>> m <- unclass(as.data.frame(cbind(x, ...)))
    >>>>>
    >>>>> But I am not sure if removing the three dots in the
    >>>>> cbind() call may have other unintended consequences.
    >>>>>
    >>>>> Regards,
    >>>>>
    >>>>> Marc
    >>
    >>
    >>>> Duncan,
    >>
    >>>> Some additional information here.  Reviewing the source
    >>>> code for the function in SVN:
    >>
    >>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R
    >>
    >>>> there is a relevant comment in the code:
    >>
    >>>> if(is.matrix(x)) { ## FIXME: fails when combined with
    >>>> 'unnamed degree' above m <-
    >>>> unclass(as.data.frame(cbind(x, ...)))
    >>>> return(do.call(polym, c(m, degree = degree, raw = raw,
    >>>> list(coefs=coefs)))) }
    >>
    >>
    >>>> A version review would suggest that the above comment
    >>>> was added to the code back in 2015.
    >>
    >>> Yes, by me, possibly here :
    >>
    >>> $ svn log -v -c68727
    >>> ------------------------------------------------------------------------
    >>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23
    >>> Jul 2015) | 1 line Changed paths: M /trunk/doc/NEWS.Rd M
    >>> /trunk/src/library/stats/R/contr.poly.R M
    >>> /trunk/src/library/stats/man/poly.Rd M
    >>> /trunk/tests/Examples/stats-Ex.Rout.save M
    >>> /trunk/tests/reg-tests-1c.R
    >>
    >>> poly(), polym() now work better notably for prediction
    >>> ------------------------------------------------------------------------
    >>> $ svn-diffB -c68727 doc/NEWS.Rd Index: doc/NEWS.Rd
    >>> ===================================================================
    >>> 126a127,133
    >>>>
    >>>> \item \code{polym()} gains a \code{coefs = NULL}
    >>>> argument and returns class \code{"poly"} just like
    >>>> \code{poly()} which gets a new \code{simple=FALSE}
    >>>> option.  They now lead to correct \code{predict()}ions,
    >>>> e.g., on subsets of the original data.  %% see
    >>>> https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html
    >>
    >>
    >>>> So it would appear that the behavior being discussed
    >>>> here is known.
    >>
    >>> Indeed!  I remember to have spent quite a few hours with
    >>> the code and its different uses before committing that
    >>> patch.
    >>
    >>>> I am still confused by the need for the '...' in the
    >>>> call to cbind(), which as far as I can tell, has been
    >>>> in the code at least back to 2003, when the poly() code
    >>>> was split from base.
    >>
    >>>> I am not sure why one would want to pass on other '...'
    >>>> arguments to cbind(), but I am presumably missing
    >>>> something here.
    >>
    >>> Yes, I think passing the '...' is important there...
    >>> OTOH, I'm almost sure that I wrote the 'FIXME' because I
    >>> thought one should be able to do things better.  So, I'm
    >>> happy to e-talk to you about how to get rid of the FIXME
    >>> and still remain back-compatible: Notably with the
    >>> paragraph in ?poly |> Details:
    >>> |>
    >>> |> Although formally ‘degree’ should be named (as it
    >>> follows ‘...’), |> an unnamed second argument of length
    >>> 1 will be interpreted as the |> degree, such that
    >>> ‘poly(x, 3)’ can be used in formulas.
    >>
    >> As a matter of fact, a patch seems very simple, and I am
    >> testing it now.
    >>
    >> Won't have much more time today, but will return "on this
    >> channel" later, maybe tomorrow.
    >>
    >> Martin


    > Martin,
    > Thanks for taking the time to look at this!

    > Marc

Duncan had in the mean time filed a bug report about this,
 --> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17310
but I had fixed the issue even before seeing the PR.
    [currently fixed in R-devel only (svn r 72919)]

Martin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Duncan Murdoch-2
On 15/07/2017 11:37 AM, Martin Maechler wrote:

>>>>>> Marc Schwartz <[hidden email]>
>>>>>>     on Fri, 14 Jul 2017 11:01:03 -0500 writes:
>
>     >> On Jul 14, 2017, at 9:50 AM, Martin Maechler
>     >> <[hidden email]> wrote:
>     >>
>     >>>>>>> Martin Maechler <[hidden email]> on Fri,
>     >>>>>>> 14 Jul 2017 16:30:50 +0200 writes:
>     >>
>     >>>>>>> Marc Schwartz <[hidden email]> on Fri, 14 Jul
>     >>>>>>> 2017 06:57:26 -0500 writes:
>     >>
>     >>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz
>     >>>>> <[hidden email]> wrote:
>     >>>>>
>     >>>>>
>>>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz
>     >>>>> <[hidden email]> wrote:
>>>>>>
>>>>>>
>     >>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch
>     >>>>>>> <[hidden email]> wrote:
>     >>>>>>>
>     >>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>     >>>>>>>> Hi All,
>     >>>>>>>>
>     >>>>>>>> As per the discussion today on R-Help:
>     >>>>>>>>
>     >>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>     >>>>>>>>
>     >>>>>>>> I am attaching a proposed patch for poly.Rd to
>     >>>>>>>> provide clarifying wording relative to naming the
>     >>>>>>>> 'degree' argument explicitly, in the case where the
>     >>>>>>>> 'x' argument is a matrix, rather than a vector.
>     >>>>>>>>
>     >>>>>>>> This is based upon the svn trunk version of
>     >>>>>>>> poly.Rd.
>     >>>>>>>
>     >>>>>>> I don't think this is the right fix.  The use of the
>     >>>>>>> unnamed 2nd arg as degree happens whether the first
>     >>>>>>> arg is a matrix or not.
>     >>>>>>>
>     >>>>>>> I didn't read the whole thread in detail, but it
>     >>>>>>> appears there's a bug somewhere, in the report or in
>     >>>>>>> the poly() code or in the plsr() code. That bug
>     >>>>>>> should be reported on the bug list if it turns out
>     >>>>>>> to be in base R, and to the package maintainer if it
>     >>>>>>> is in plsr().
>     >>>>>>>
>     >>>>>>> Duncan Murdoch
>>>>>>
>>>>>>
>>>>>> Duncan,
>>>>>>
>>>>>> Thanks for your reply. You only really need to read that
>     >>>>>>> last post in the thread linked to above.
>>>>>>
>>>>>> I won't deny the possibility of a bug in poly(), relative
>     >>>>>>> to the handling of 'x' as a matrix. The behavior
>     >>>>>>> occurs in the poly() function in a pure stand alone
>     >>>>>>> fashion, without the need for plsr():
>>>>>>
>>>>>> x1 <- runif(20)
>>>>>> x2 <- runif(20)
>>>>>> mx <- cbind(x1, x2)
>>>>>>
>     >>>>>
>     >>>>> <snip>
>     >>>>>
>     >>>>> Duncan,
>     >>>>>
>     >>>>> Tracing through the code for poly() using debug once
>     >>>>> with:
>     >>>>>
>     >>>>> poly(mx, 2)
>     >>>>>
>     >>>>> and then with:
>     >>>>>
>     >>>>> poly(mx, degree = 2)
>     >>>>>
>     >>>>> there is a difference in the transformation of 'mx'
>     >>>>> internally by the use of:
>     >>>>>
>     >>>>> if (is.matrix(x)) { m <-
>     >>>>> unclass(as.data.frame(cbind(x, ...)))
>     >>>>> return(do.call(polym, c(m, degree = degree, raw = raw,
>     >>>>> list(coefs = coefs)))) }
>     >>>>>
>     >>>>>
>     >>>>> In the first case, 'mx' ends up being transformed to:
>     >>>>>
>     >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
>     >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
>     >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>     >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
>     >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
>     >>>>>
>     >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
>     >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
>     >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
>     >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
>     >>>>> 0.4122336
>     >>>>>
>     >>>>> $V3 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>     >>>>>
>     >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
>     >>>>> 14 15 16 17 18 19 20
>     >>>>>
>     >>>>> Thus, when do.call() is used, m$V3 is passed as the
>     >>>>> 'x' argument on the third iteration, essentially
>     >>>>> resulting in:
>     >>>>>
>>>>>> polym(rep(2, 20), degree = 2) Error in poly(dots[[1L]],
>     >>>>> degree, raw = raw, simple = raw && nd > 1) : 'degree'
>     >>>>> must be less than number of unique points
>     >>>>>
>     >>>>>
>     >>>>> Note also that in this case, 'dots', which is the
>     >>>>> result of using list(...) on the initial call, is:
>     >>>>>
>     >>>>> Browse[2]> dots [[1]] [1] 2
>     >>>>>
>     >>>>>
>     >>>>> In the second case:
>     >>>>>
>     >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
>     >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
>     >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>     >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
>     >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
>     >>>>>
>     >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
>     >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
>     >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
>     >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
>     >>>>> 0.4122336
>     >>>>>
>     >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
>     >>>>> 14 15 16 17 18 19 20
>     >>>>>
>     >>>>>
>     >>>>> So, there is no m$V3.
>     >>>>>
>     >>>>> Note also that 'dots' ends up being:
>     >>>>>
>     >>>>> Browse[2]> dots list()
>     >>>>>
>     >>>>>
>     >>>>> In both cases, 'degree' is indeed 2, but the result of
>     >>>>> 'list(...)' on the initial function call is quite
>     >>>>> different.
>     >>>>>
>     >>>>> So, I may be hypo-caffeinated, but if there is a bug
>     >>>>> here, it may be due to the way in which cbind() is
>     >>>>> being called in the code above, where the three dots
>     >>>>> are being used?
>     >>>>>
>     >>>>> I can replicate the presumably correct behavior by
>     >>>>> using:
>     >>>>>
>     >>>>> m <- unclass(as.data.frame(cbind(x)))
>     >>>>>
>     >>>>> instead of:
>     >>>>>
>     >>>>> m <- unclass(as.data.frame(cbind(x, ...)))
>     >>>>>
>     >>>>> But I am not sure if removing the three dots in the
>     >>>>> cbind() call may have other unintended consequences.
>     >>>>>
>     >>>>> Regards,
>     >>>>>
>     >>>>> Marc
>     >>
>     >>
>     >>>> Duncan,
>     >>
>     >>>> Some additional information here.  Reviewing the source
>     >>>> code for the function in SVN:
>     >>
>     >>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R
>     >>
>     >>>> there is a relevant comment in the code:
>     >>
>     >>>> if(is.matrix(x)) { ## FIXME: fails when combined with
>     >>>> 'unnamed degree' above m <-
>     >>>> unclass(as.data.frame(cbind(x, ...)))
>     >>>> return(do.call(polym, c(m, degree = degree, raw = raw,
>     >>>> list(coefs=coefs)))) }
>     >>
>     >>
>     >>>> A version review would suggest that the above comment
>     >>>> was added to the code back in 2015.
>     >>
>     >>> Yes, by me, possibly here :
>     >>
>     >>> $ svn log -v -c68727
>     >>> ------------------------------------------------------------------------
>     >>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23
>     >>> Jul 2015) | 1 line Changed paths: M /trunk/doc/NEWS.Rd M
>     >>> /trunk/src/library/stats/R/contr.poly.R M
>     >>> /trunk/src/library/stats/man/poly.Rd M
>     >>> /trunk/tests/Examples/stats-Ex.Rout.save M
>     >>> /trunk/tests/reg-tests-1c.R
>     >>
>     >>> poly(), polym() now work better notably for prediction
>     >>> ------------------------------------------------------------------------
>     >>> $ svn-diffB -c68727 doc/NEWS.Rd Index: doc/NEWS.Rd
>     >>> ===================================================================
>     >>> 126a127,133
>     >>>>
>     >>>> \item \code{polym()} gains a \code{coefs = NULL}
>     >>>> argument and returns class \code{"poly"} just like
>     >>>> \code{poly()} which gets a new \code{simple=FALSE}
>     >>>> option.  They now lead to correct \code{predict()}ions,
>     >>>> e.g., on subsets of the original data.  %% see
>     >>>> https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html
>     >>
>     >>
>     >>>> So it would appear that the behavior being discussed
>     >>>> here is known.
>     >>
>     >>> Indeed!  I remember to have spent quite a few hours with
>     >>> the code and its different uses before committing that
>     >>> patch.
>     >>
>     >>>> I am still confused by the need for the '...' in the
>     >>>> call to cbind(), which as far as I can tell, has been
>     >>>> in the code at least back to 2003, when the poly() code
>     >>>> was split from base.
>     >>
>     >>>> I am not sure why one would want to pass on other '...'
>     >>>> arguments to cbind(), but I am presumably missing
>     >>>> something here.
>     >>
>     >>> Yes, I think passing the '...' is important there...
>     >>> OTOH, I'm almost sure that I wrote the 'FIXME' because I
>     >>> thought one should be able to do things better.  So, I'm
>     >>> happy to e-talk to you about how to get rid of the FIXME
>     >>> and still remain back-compatible: Notably with the
>     >>> paragraph in ?poly |> Details:
>     >>> |>
>     >>> |> Although formally ‘degree’ should be named (as it
>     >>> follows ‘...’), |> an unnamed second argument of length
>     >>> 1 will be interpreted as the |> degree, such that
>     >>> ‘poly(x, 3)’ can be used in formulas.
>     >>
>     >> As a matter of fact, a patch seems very simple, and I am
>     >> testing it now.
>     >>
>     >> Won't have much more time today, but will return "on this
>     >> channel" later, maybe tomorrow.
>     >>
>     >> Martin
>
>
>     > Martin,
>     > Thanks for taking the time to look at this!
>
>     > Marc
>
> Duncan had in the mean time filed a bug report about this,
>  --> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17310
> but I had fixed the issue even before seeing the PR.
>     [currently fixed in R-devel only (svn r 72919)]

I wrote to you the next day, when Marc pointed out the FIXME comment.
Did you not receive my message?

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Martin Maechler
>>>>> Duncan Murdoch <[hidden email]>
>>>>>     on Sat, 15 Jul 2017 19:27:57 -0400 writes:

    > On 15/07/2017 11:37 AM, Martin Maechler wrote:
    >>>>>>> Marc Schwartz <[hidden email]>
    >>>>>>> on Fri, 14 Jul 2017 11:01:03 -0500 writes:
    >>
    >> >> On Jul 14, 2017, at 9:50 AM, Martin Maechler
    >> >> <[hidden email]> wrote:
    >> >>
    >> >>>>>>> Martin Maechler <[hidden email]> on Fri,
    >> >>>>>>> 14 Jul 2017 16:30:50 +0200 writes:
    >> >>
    >> >>>>>>> Marc Schwartz <[hidden email]> on Fri, 14 Jul
    >> >>>>>>> 2017 06:57:26 -0500 writes:
    >> >>
    >> >>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz
    >> >>>>> <[hidden email]> wrote:
    >> >>>>>
    >> >>>>>
    >>>>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz
    >> >>>>> <[hidden email]> wrote:
    >>>>>>>
    >>>>>>>
    >> >>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch
    >> >>>>>>> <[hidden email]> wrote:
    >> >>>>>>>
    >> >>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
    >> >>>>>>>> Hi All,
    >> >>>>>>>>
    >> >>>>>>>> As per the discussion today on R-Help:
    >> >>>>>>>>
    >> >>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
    >> >>>>>>>>
    >> >>>>>>>> I am attaching a proposed patch for poly.Rd to
    >> >>>>>>>> provide clarifying wording relative to naming the
    >> >>>>>>>> 'degree' argument explicitly, in the case where the
    >> >>>>>>>> 'x' argument is a matrix, rather than a vector.
    >> >>>>>>>>
    >> >>>>>>>> This is based upon the svn trunk version of
    >> >>>>>>>> poly.Rd.
    >> >>>>>>>
    >> >>>>>>> I don't think this is the right fix.  The use of the
    >> >>>>>>> unnamed 2nd arg as degree happens whether the first
    >> >>>>>>> arg is a matrix or not.
    >> >>>>>>>
    >> >>>>>>> I didn't read the whole thread in detail, but it
    >> >>>>>>> appears there's a bug somewhere, in the report or in
    >> >>>>>>> the poly() code or in the plsr() code. That bug
    >> >>>>>>> should be reported on the bug list if it turns out
    >> >>>>>>> to be in base R, and to the package maintainer if it
    >> >>>>>>> is in plsr().
    >> >>>>>>>
    >> >>>>>>> Duncan Murdoch
    >>>>>>>
    >>>>>>>
    >>>>>>> Duncan,
    >>>>>>>
    >>>>>>> Thanks for your reply. You only really need to read that
    >> >>>>>>> last post in the thread linked to above.
    >>>>>>>
    >>>>>>> I won't deny the possibility of a bug in poly(), relative
    >> >>>>>>> to the handling of 'x' as a matrix. The behavior
    >> >>>>>>> occurs in the poly() function in a pure stand alone
    >> >>>>>>> fashion, without the need for plsr():
    >>>>>>>
    >>>>>>> x1 <- runif(20)
    >>>>>>> x2 <- runif(20)
    >>>>>>> mx <- cbind(x1, x2)
    >>>>>>>
    >> >>>>>
    >> >>>>> <snip>
    >> >>>>>
    >> >>>>> Duncan,
    >> >>>>>
    >> >>>>> Tracing through the code for poly() using debug once
    >> >>>>> with:
    >> >>>>>
    >> >>>>> poly(mx, 2)
    >> >>>>>
    >> >>>>> and then with:
    >> >>>>>
    >> >>>>> poly(mx, degree = 2)
    >> >>>>>
    >> >>>>> there is a difference in the transformation of 'mx'
    >> >>>>> internally by the use of:
    >> >>>>>
    >> >>>>> if (is.matrix(x)) { m <-
    >> >>>>> unclass(as.data.frame(cbind(x, ...)))
    >> >>>>> return(do.call(polym, c(m, degree = degree, raw = raw,
    >> >>>>> list(coefs = coefs)))) }
    >> >>>>>
    >> >>>>>
    >> >>>>> In the first case, 'mx' ends up being transformed to:
    >> >>>>>
    >> >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
    >> >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
    >> >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >> >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
    >> >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
    >> >>>>>
    >> >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
    >> >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
    >> >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
    >> >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
    >> >>>>> 0.4122336
    >> >>>>>
    >> >>>>> $V3 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
    >> >>>>>
    >> >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
    >> >>>>> 14 15 16 17 18 19 20
    >> >>>>>
    >> >>>>> Thus, when do.call() is used, m$V3 is passed as the
    >> >>>>> 'x' argument on the third iteration, essentially
    >> >>>>> resulting in:
    >> >>>>>
    >>>>>>> polym(rep(2, 20), degree = 2) Error in poly(dots[[1L]],
    >> >>>>> degree, raw = raw, simple = raw && nd > 1) : 'degree'
    >> >>>>> must be less than number of unique points
    >> >>>>>
    >> >>>>>
    >> >>>>> Note also that in this case, 'dots', which is the
    >> >>>>> result of using list(...) on the initial call, is:
    >> >>>>>
    >> >>>>> Browse[2]> dots [[1]] [1] 2
    >> >>>>>
    >> >>>>>
    >> >>>>> In the second case:
    >> >>>>>
    >> >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
    >> >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
    >> >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
    >> >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
    >> >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
    >> >>>>>
    >> >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
    >> >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
    >> >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
    >> >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
    >> >>>>> 0.4122336
    >> >>>>>
    >> >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
    >> >>>>> 14 15 16 17 18 19 20
    >> >>>>>
    >> >>>>>
    >> >>>>> So, there is no m$V3.
    >> >>>>>
    >> >>>>> Note also that 'dots' ends up being:
    >> >>>>>
    >> >>>>> Browse[2]> dots list()
    >> >>>>>
    >> >>>>>
    >> >>>>> In both cases, 'degree' is indeed 2, but the result of
    >> >>>>> 'list(...)' on the initial function call is quite
    >> >>>>> different.
    >> >>>>>
    >> >>>>> So, I may be hypo-caffeinated, but if there is a bug
    >> >>>>> here, it may be due to the way in which cbind() is
    >> >>>>> being called in the code above, where the three dots
    >> >>>>> are being used?
    >> >>>>>
    >> >>>>> I can replicate the presumably correct behavior by
    >> >>>>> using:
    >> >>>>>
    >> >>>>> m <- unclass(as.data.frame(cbind(x)))
    >> >>>>>
    >> >>>>> instead of:
    >> >>>>>
    >> >>>>> m <- unclass(as.data.frame(cbind(x, ...)))
    >> >>>>>
    >> >>>>> But I am not sure if removing the three dots in the
    >> >>>>> cbind() call may have other unintended consequences.
    >> >>>>>
    >> >>>>> Regards,
    >> >>>>>
    >> >>>>> Marc
    >> >>
    >> >>
    >> >>>> Duncan,
    >> >>
    >> >>>> Some additional information here.  Reviewing the source
    >> >>>> code for the function in SVN:
    >> >>
    >> >>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R
    >> >>
    >> >>>> there is a relevant comment in the code:
    >> >>
    >> >>>> if(is.matrix(x)) { ## FIXME: fails when combined with
    >> >>>> 'unnamed degree' above m <-
    >> >>>> unclass(as.data.frame(cbind(x, ...)))
    >> >>>> return(do.call(polym, c(m, degree = degree, raw = raw,
    >> >>>> list(coefs=coefs)))) }
    >> >>
    >> >>
    >> >>>> A version review would suggest that the above comment
    >> >>>> was added to the code back in 2015.
    >> >>
    >> >>> Yes, by me, possibly here :
    >> >>
    >> >>> $ svn log -v -c68727
    >> >>> ------------------------------------------------------------------------
    >> >>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23
    >> >>> Jul 2015) | 1 line Changed paths: M /trunk/doc/NEWS.Rd M
    >> >>> /trunk/src/library/stats/R/contr.poly.R M
    >> >>> /trunk/src/library/stats/man/poly.Rd M
    >> >>> /trunk/tests/Examples/stats-Ex.Rout.save M
    >> >>> /trunk/tests/reg-tests-1c.R
    >> >>
    >> >>> poly(), polym() now work better notably for prediction
    >> >>> ------------------------------------------------------------------------
    >> >>> $ svn-diffB -c68727 doc/NEWS.Rd Index: doc/NEWS.Rd
    >> >>> ===================================================================
    >> >>> 126a127,133
    >> >>>>
    >> >>>> \item \code{polym()} gains a \code{coefs = NULL}
    >> >>>> argument and returns class \code{"poly"} just like
    >> >>>> \code{poly()} which gets a new \code{simple=FALSE}
    >> >>>> option.  They now lead to correct \code{predict()}ions,
    >> >>>> e.g., on subsets of the original data.  %% see
    >> >>>> https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html
    >> >>
    >> >>
    >> >>>> So it would appear that the behavior being discussed
    >> >>>> here is known.
    >> >>
    >> >>> Indeed!  I remember to have spent quite a few hours with
    >> >>> the code and its different uses before committing that
    >> >>> patch.
    >> >>
    >> >>>> I am still confused by the need for the '...' in the
    >> >>>> call to cbind(), which as far as I can tell, has been
    >> >>>> in the code at least back to 2003, when the poly() code
    >> >>>> was split from base.
    >> >>
    >> >>>> I am not sure why one would want to pass on other '...'
    >> >>>> arguments to cbind(), but I am presumably missing
    >> >>>> something here.
    >> >>
    >> >>> Yes, I think passing the '...' is important there...
    >> >>> OTOH, I'm almost sure that I wrote the 'FIXME' because I
    >> >>> thought one should be able to do things better.  So, I'm
    >> >>> happy to e-talk to you about how to get rid of the FIXME
    >> >>> and still remain back-compatible: Notably with the
    >> >>> paragraph in ?poly |> Details:
    >> >>> |>
    >> >>> |> Although formally ‘degree’ should be named (as it
    >> >>> follows ‘...’), |> an unnamed second argument of length
    >> >>> 1 will be interpreted as the |> degree, such that
    >> >>> ‘poly(x, 3)’ can be used in formulas.
    >> >>
    >> >> As a matter of fact, a patch seems very simple, and I am
    >> >> testing it now.
    >> >>
    >> >> Won't have much more time today, but will return "on this
    >> >> channel" later, maybe tomorrow.
    >> >>
    >> >> Martin
    >>
    >>
    >> > Martin,
    >> > Thanks for taking the time to look at this!
    >>
    >> > Marc
    >>
    >> Duncan had in the mean time filed a bug report about this,
    --> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17310
    >> but I had fixed the issue even before seeing the PR.
    >> [currently fixed in R-devel only (svn r 72919)]

    > I wrote to you the next day, when Marc pointed out the FIXME comment.
    > Did you not receive my message?

    > Duncan Murdoch

I'm sorry Duncan,
there must have been messages crossing each other, possibly
delayed on this end, where some (including me) are using a new
spam/virus filtering service.
Yes, I saw that too.. but also quite a bit *after* having
replied on R-devel that I was the author of the FIXME and that I
was going to look into the issue...
... and then I did look into the issue instead of checking my
(almost always much too many) e-mails.

But there's no problem about this, right?
I'm sorry if I gave the impression in some way.
It's good to have a bug report that we can quickly close isn't it?

Best,
Martin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Proposed Patch for poly.Rd

Duncan Murdoch-2
On 17/07/2017 2:57 AM, Martin Maechler wrote:

>>>>>> Duncan Murdoch <[hidden email]>
>>>>>>     on Sat, 15 Jul 2017 19:27:57 -0400 writes:
>
>     > On 15/07/2017 11:37 AM, Martin Maechler wrote:
>     >>>>>>> Marc Schwartz <[hidden email]>
>     >>>>>>> on Fri, 14 Jul 2017 11:01:03 -0500 writes:
>     >>
>     >> >> On Jul 14, 2017, at 9:50 AM, Martin Maechler
>     >> >> <[hidden email]> wrote:
>     >> >>
>     >> >>>>>>> Martin Maechler <[hidden email]> on Fri,
>     >> >>>>>>> 14 Jul 2017 16:30:50 +0200 writes:
>     >> >>
>     >> >>>>>>> Marc Schwartz <[hidden email]> on Fri, 14 Jul
>     >> >>>>>>> 2017 06:57:26 -0500 writes:
>     >> >>
>     >> >>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz
>     >> >>>>> <[hidden email]> wrote:
>     >> >>>>>
>     >> >>>>>
>     >>>>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz
>     >> >>>>> <[hidden email]> wrote:
>     >>>>>>>
>     >>>>>>>
>     >> >>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch
>     >> >>>>>>> <[hidden email]> wrote:
>     >> >>>>>>>
>     >> >>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote:
>     >> >>>>>>>> Hi All,
>     >> >>>>>>>>
>     >> >>>>>>>> As per the discussion today on R-Help:
>     >> >>>>>>>>
>     >> >>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html
>     >> >>>>>>>>
>     >> >>>>>>>> I am attaching a proposed patch for poly.Rd to
>     >> >>>>>>>> provide clarifying wording relative to naming the
>     >> >>>>>>>> 'degree' argument explicitly, in the case where the
>     >> >>>>>>>> 'x' argument is a matrix, rather than a vector.
>     >> >>>>>>>>
>     >> >>>>>>>> This is based upon the svn trunk version of
>     >> >>>>>>>> poly.Rd.
>     >> >>>>>>>
>     >> >>>>>>> I don't think this is the right fix.  The use of the
>     >> >>>>>>> unnamed 2nd arg as degree happens whether the first
>     >> >>>>>>> arg is a matrix or not.
>     >> >>>>>>>
>     >> >>>>>>> I didn't read the whole thread in detail, but it
>     >> >>>>>>> appears there's a bug somewhere, in the report or in
>     >> >>>>>>> the poly() code or in the plsr() code. That bug
>     >> >>>>>>> should be reported on the bug list if it turns out
>     >> >>>>>>> to be in base R, and to the package maintainer if it
>     >> >>>>>>> is in plsr().
>     >> >>>>>>>
>     >> >>>>>>> Duncan Murdoch
>     >>>>>>>
>     >>>>>>>
>     >>>>>>> Duncan,
>     >>>>>>>
>     >>>>>>> Thanks for your reply. You only really need to read that
>     >> >>>>>>> last post in the thread linked to above.
>     >>>>>>>
>     >>>>>>> I won't deny the possibility of a bug in poly(), relative
>     >> >>>>>>> to the handling of 'x' as a matrix. The behavior
>     >> >>>>>>> occurs in the poly() function in a pure stand alone
>     >> >>>>>>> fashion, without the need for plsr():
>     >>>>>>>
>     >>>>>>> x1 <- runif(20)
>     >>>>>>> x2 <- runif(20)
>     >>>>>>> mx <- cbind(x1, x2)
>     >>>>>>>
>     >> >>>>>
>     >> >>>>> <snip>
>     >> >>>>>
>     >> >>>>> Duncan,
>     >> >>>>>
>     >> >>>>> Tracing through the code for poly() using debug once
>     >> >>>>> with:
>     >> >>>>>
>     >> >>>>> poly(mx, 2)
>     >> >>>>>
>     >> >>>>> and then with:
>     >> >>>>>
>     >> >>>>> poly(mx, degree = 2)
>     >> >>>>>
>     >> >>>>> there is a difference in the transformation of 'mx'
>     >> >>>>> internally by the use of:
>     >> >>>>>
>     >> >>>>> if (is.matrix(x)) { m <-
>     >> >>>>> unclass(as.data.frame(cbind(x, ...)))
>     >> >>>>> return(do.call(polym, c(m, degree = degree, raw = raw,
>     >> >>>>> list(coefs = coefs)))) }
>     >> >>>>>
>     >> >>>>>
>     >> >>>>> In the first case, 'mx' ends up being transformed to:
>     >> >>>>>
>     >> >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
>     >> >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
>     >> >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>     >> >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
>     >> >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
>     >> >>>>>
>     >> >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
>     >> >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
>     >> >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
>     >> >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
>     >> >>>>> 0.4122336
>     >> >>>>>
>     >> >>>>> $V3 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
>     >> >>>>>
>     >> >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
>     >> >>>>> 14 15 16 17 18 19 20
>     >> >>>>>
>     >> >>>>> Thus, when do.call() is used, m$V3 is passed as the
>     >> >>>>> 'x' argument on the third iteration, essentially
>     >> >>>>> resulting in:
>     >> >>>>>
>     >>>>>>> polym(rep(2, 20), degree = 2) Error in poly(dots[[1L]],
>     >> >>>>> degree, raw = raw, simple = raw && nd > 1) : 'degree'
>     >> >>>>> must be less than number of unique points
>     >> >>>>>
>     >> >>>>>
>     >> >>>>> Note also that in this case, 'dots', which is the
>     >> >>>>> result of using list(...) on the initial call, is:
>     >> >>>>>
>     >> >>>>> Browse[2]> dots [[1]] [1] 2
>     >> >>>>>
>     >> >>>>>
>     >> >>>>> In the second case:
>     >> >>>>>
>     >> >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567
>     >> >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743
>     >> >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973
>     >> >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157
>     >> >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699
>     >> >>>>>
>     >> >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452
>     >> >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161
>     >> >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636
>     >> >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733
>     >> >>>>> 0.4122336
>     >> >>>>>
>     >> >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13
>     >> >>>>> 14 15 16 17 18 19 20
>     >> >>>>>
>     >> >>>>>
>     >> >>>>> So, there is no m$V3.
>     >> >>>>>
>     >> >>>>> Note also that 'dots' ends up being:
>     >> >>>>>
>     >> >>>>> Browse[2]> dots list()
>     >> >>>>>
>     >> >>>>>
>     >> >>>>> In both cases, 'degree' is indeed 2, but the result of
>     >> >>>>> 'list(...)' on the initial function call is quite
>     >> >>>>> different.
>     >> >>>>>
>     >> >>>>> So, I may be hypo-caffeinated, but if there is a bug
>     >> >>>>> here, it may be due to the way in which cbind() is
>     >> >>>>> being called in the code above, where the three dots
>     >> >>>>> are being used?
>     >> >>>>>
>     >> >>>>> I can replicate the presumably correct behavior by
>     >> >>>>> using:
>     >> >>>>>
>     >> >>>>> m <- unclass(as.data.frame(cbind(x)))
>     >> >>>>>
>     >> >>>>> instead of:
>     >> >>>>>
>     >> >>>>> m <- unclass(as.data.frame(cbind(x, ...)))
>     >> >>>>>
>     >> >>>>> But I am not sure if removing the three dots in the
>     >> >>>>> cbind() call may have other unintended consequences.
>     >> >>>>>
>     >> >>>>> Regards,
>     >> >>>>>
>     >> >>>>> Marc
>     >> >>
>     >> >>
>     >> >>>> Duncan,
>     >> >>
>     >> >>>> Some additional information here.  Reviewing the source
>     >> >>>> code for the function in SVN:
>     >> >>
>     >> >>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R
>     >> >>
>     >> >>>> there is a relevant comment in the code:
>     >> >>
>     >> >>>> if(is.matrix(x)) { ## FIXME: fails when combined with
>     >> >>>> 'unnamed degree' above m <-
>     >> >>>> unclass(as.data.frame(cbind(x, ...)))
>     >> >>>> return(do.call(polym, c(m, degree = degree, raw = raw,
>     >> >>>> list(coefs=coefs)))) }
>     >> >>
>     >> >>
>     >> >>>> A version review would suggest that the above comment
>     >> >>>> was added to the code back in 2015.
>     >> >>
>     >> >>> Yes, by me, possibly here :
>     >> >>
>     >> >>> $ svn log -v -c68727
>     >> >>> ------------------------------------------------------------------------
>     >> >>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23
>     >> >>> Jul 2015) | 1 line Changed paths: M /trunk/doc/NEWS.Rd M
>     >> >>> /trunk/src/library/stats/R/contr.poly.R M
>     >> >>> /trunk/src/library/stats/man/poly.Rd M
>     >> >>> /trunk/tests/Examples/stats-Ex.Rout.save M
>     >> >>> /trunk/tests/reg-tests-1c.R
>     >> >>
>     >> >>> poly(), polym() now work better notably for prediction
>     >> >>> ------------------------------------------------------------------------
>     >> >>> $ svn-diffB -c68727 doc/NEWS.Rd Index: doc/NEWS.Rd
>     >> >>> ===================================================================
>     >> >>> 126a127,133
>     >> >>>>
>     >> >>>> \item \code{polym()} gains a \code{coefs = NULL}
>     >> >>>> argument and returns class \code{"poly"} just like
>     >> >>>> \code{poly()} which gets a new \code{simple=FALSE}
>     >> >>>> option.  They now lead to correct \code{predict()}ions,
>     >> >>>> e.g., on subsets of the original data.  %% see
>     >> >>>> https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html
>     >> >>
>     >> >>
>     >> >>>> So it would appear that the behavior being discussed
>     >> >>>> here is known.
>     >> >>
>     >> >>> Indeed!  I remember to have spent quite a few hours with
>     >> >>> the code and its different uses before committing that
>     >> >>> patch.
>     >> >>
>     >> >>>> I am still confused by the need for the '...' in the
>     >> >>>> call to cbind(), which as far as I can tell, has been
>     >> >>>> in the code at least back to 2003, when the poly() code
>     >> >>>> was split from base.
>     >> >>
>     >> >>>> I am not sure why one would want to pass on other '...'
>     >> >>>> arguments to cbind(), but I am presumably missing
>     >> >>>> something here.
>     >> >>
>     >> >>> Yes, I think passing the '...' is important there...
>     >> >>> OTOH, I'm almost sure that I wrote the 'FIXME' because I
>     >> >>> thought one should be able to do things better.  So, I'm
>     >> >>> happy to e-talk to you about how to get rid of the FIXME
>     >> >>> and still remain back-compatible: Notably with the
>     >> >>> paragraph in ?poly |> Details:
>     >> >>> |>
>     >> >>> |> Although formally ‘degree’ should be named (as it
>     >> >>> follows ‘...’), |> an unnamed second argument of length
>     >> >>> 1 will be interpreted as the |> degree, such that
>     >> >>> ‘poly(x, 3)’ can be used in formulas.
>     >> >>
>     >> >> As a matter of fact, a patch seems very simple, and I am
>     >> >> testing it now.
>     >> >>
>     >> >> Won't have much more time today, but will return "on this
>     >> >> channel" later, maybe tomorrow.
>     >> >>
>     >> >> Martin
>     >>
>     >>
>     >> > Martin,
>     >> > Thanks for taking the time to look at this!
>     >>
>     >> > Marc
>     >>
>     >> Duncan had in the mean time filed a bug report about this,
>     --> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17310
>     >> but I had fixed the issue even before seeing the PR.
>     >> [currently fixed in R-devel only (svn r 72919)]
>
>     > I wrote to you the next day, when Marc pointed out the FIXME comment.
>     > Did you not receive my message?
>
>     > Duncan Murdoch
>
> I'm sorry Duncan,
> there must have been messages crossing each other, possibly
> delayed on this end, where some (including me) are using a new
> spam/virus filtering service.
> Yes, I saw that too.. but also quite a bit *after* having
> replied on R-devel that I was the author of the FIXME and that I
> was going to look into the issue...
> ... and then I did look into the issue instead of checking my
> (almost always much too many) e-mails.
>
> But there's no problem about this, right?
> I'm sorry if I gave the impression in some way.
> It's good to have a bug report that we can quickly close isn't it?

No, there's no problem.  I was just concerned because it sounded as
though you hadn't received it; if messages were getting lost, that would
be a problem.

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel