Why missing values are not allowed in 'poly'?

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Why missing values are not allowed in 'poly'?

Liviu Andronic
Dear all,
I'm a bit surprised by this behavior in poly:

x <- c(NA, 1:10)
poly(x, degree = 2, raw=TRUE)
## Error in poly(x, degree = 2, raw = TRUE) :
##   missing values are not allowed in 'poly'
x^2
## [1] NA 1 4 9 16 25 36 49 64 81 100

As you can see, poly() will fail if the vector contains NAs, whereas
it is perfectly possible to obtain the square of the vector manually.

Is there a reason for this limitation in poly?

Regards,
Liviu


--
Do you think you know what math is?
http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
Or what it means to be intelligent?
http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
Think again:
http://www.ideasroadshow.com/library

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Why missing values are not allowed in 'poly'?

R help mailing list-2
I think the worst aspect of this restriction in poly() is that when
you use poly in the formula of a model-fitting function you cannot
have any missing values in the data, even if you supply
na.action=na.exclude.

  > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
  Warning message:
  In log(y) : NaNs produced
  > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
  Error in poly(x, 3) : missing values are not allowed in 'poly'

Thus people are pushed to using a less stable formulation like
  > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d, na.action=na.exclude)


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic <[hidden email]>
wrote:

> Dear all,
> I'm a bit surprised by this behavior in poly:
>
> x <- c(NA, 1:10)
> poly(x, degree = 2, raw=TRUE)
> ## Error in poly(x, degree = 2, raw = TRUE) :
> ##   missing values are not allowed in 'poly'
> x^2
> ## [1] NA 1 4 9 16 25 36 49 64 81 100
>
> As you can see, poly() will fail if the vector contains NAs, whereas
> it is perfectly possible to obtain the square of the vector manually.
>
> Is there a reason for this limitation in poly?
>
> Regards,
> Liviu
>
>
> --
> Do you think you know what math is?
> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
> Or what it means to be intelligent?
> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
> Think again:
> http://www.ideasroadshow.com/library
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Why missing values are not allowed in 'poly'?

Liviu Andronic
On Wed, Mar 23, 2016 at 9:29 PM, William Dunlap <[hidden email]> wrote:

> I think the worst aspect of this restriction in poly() is that when
> you use poly in the formula of a model-fitting function you cannot
> have any missing values in the data, even if you supply
> na.action=na.exclude.
>
>   > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
>   Warning message:
>   In log(y) : NaNs produced
>   > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
>   Error in poly(x, 3) : missing values are not allowed in 'poly'
>
> Thus people are pushed to using a less stable formulation like
>   > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d, na.action=na.exclude)
>
My difficulty precisely. What's more, I inspected the code for `poly`
and at least for the simple case of raw=TRUE it seems trivial to
support NAs. It suffices to change line 15 of the function:
if (anyNA(x)) stop("missing values are not allowed in 'poly'")

to:
if (!raw && anyNA(x)) stop("missing values are not allowed in 'poly'")

This way for raw polynomials estimation continues unimpeded. With the
change above, I get this:
> poly(x, degree = 2, raw=TRUE)
       1 2
[1,] NA NA
[2,] 1 1
[3,] 2 4
[4,] 3 9
[5,] 4 16
[6,] 5 25
[7,] 6 36
[8,] 7 49
[9,] 8 64
[10,] 9 81
[11,] 10 100
attr(,"degree")
[1] 1 2
attr(,"class")
[1] "poly" "matrix"


Regards,
Liviu


>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic <[hidden email]>
> wrote:
>>
>> Dear all,
>> I'm a bit surprised by this behavior in poly:
>>
>> x <- c(NA, 1:10)
>> poly(x, degree = 2, raw=TRUE)
>> ## Error in poly(x, degree = 2, raw = TRUE) :
>> ##   missing values are not allowed in 'poly'
>> x^2
>> ## [1] NA 1 4 9 16 25 36 49 64 81 100
>>
>> As you can see, poly() will fail if the vector contains NAs, whereas
>> it is perfectly possible to obtain the square of the vector manually.
>>
>> Is there a reason for this limitation in poly?
>>
>> Regards,
>> Liviu
>>
>>
>> --
>> Do you think you know what math is?
>> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
>> Or what it means to be intelligent?
>> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
>> Think again:
>> http://www.ideasroadshow.com/library
>>
>> ______________________________________________
>> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>



--
Do you think you know what math is?
http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
Or what it means to be intelligent?
http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
Think again:
http://www.ideasroadshow.com/library

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Why missing values are not allowed in 'poly'?

R help mailing list-2
I don't know what is in R's poly(), but if it is like S+'s or TERR's then
one could do

            if (anyNA(x))  {
                nax <- na.exclude(x)
                px <- poly(x = nax, degree = degree, coefs = coefs, raw =
raw, simple = simple)
                px <- structure(naresid(attr(nax, "na.action"), px), coefs
= attr(px, "coefs"), degree = attr(px, "degree"), class = attr(px, "class"))
                return(px)
            }

and get nice results in the usual raw=FALSE case as well.  Similar stuff
could be done in the multivariate cases.



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Mar 23, 2016 at 1:41 PM, Liviu Andronic <[hidden email]>
wrote:

> On Wed, Mar 23, 2016 at 9:29 PM, William Dunlap <[hidden email]> wrote:
> > I think the worst aspect of this restriction in poly() is that when
> > you use poly in the formula of a model-fitting function you cannot
> > have any missing values in the data, even if you supply
> > na.action=na.exclude.
> >
> >   > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
> >   Warning message:
> >   In log(y) : NaNs produced
> >   > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
> >   Error in poly(x, 3) : missing values are not allowed in 'poly'
> >
> > Thus people are pushed to using a less stable formulation like
> >   > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d, na.action=na.exclude)
> >
> My difficulty precisely. What's more, I inspected the code for `poly`
> and at least for the simple case of raw=TRUE it seems trivial to
> support NAs. It suffices to change line 15 of the function:
> if (anyNA(x)) stop("missing values are not allowed in 'poly'")
>
> to:
> if (!raw && anyNA(x)) stop("missing values are not allowed in 'poly'")
>
> This way for raw polynomials estimation continues unimpeded. With the
> change above, I get this:
> > poly(x, degree = 2, raw=TRUE)
>        1 2
> [1,] NA NA
> [2,] 1 1
> [3,] 2 4
> [4,] 3 9
> [5,] 4 16
> [6,] 5 25
> [7,] 6 36
> [8,] 7 49
> [9,] 8 64
> [10,] 9 81
> [11,] 10 100
> attr(,"degree")
> [1] 1 2
> attr(,"class")
> [1] "poly" "matrix"
>
>
> Regards,
> Liviu
>
>
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic <[hidden email]
> >
> > wrote:
> >>
> >> Dear all,
> >> I'm a bit surprised by this behavior in poly:
> >>
> >> x <- c(NA, 1:10)
> >> poly(x, degree = 2, raw=TRUE)
> >> ## Error in poly(x, degree = 2, raw = TRUE) :
> >> ##   missing values are not allowed in 'poly'
> >> x^2
> >> ## [1] NA 1 4 9 16 25 36 49 64 81 100
> >>
> >> As you can see, poly() will fail if the vector contains NAs, whereas
> >> it is perfectly possible to obtain the square of the vector manually.
> >>
> >> Is there a reason for this limitation in poly?
> >>
> >> Regards,
> >> Liviu
> >>
> >>
> >> --
> >> Do you think you know what math is?
> >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
> >> Or what it means to be intelligent?
> >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
> >> Think again:
> >> http://www.ideasroadshow.com/library
> >>
> >> ______________________________________________
> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
>
>
> --
> Do you think you know what math is?
> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
> Or what it means to be intelligent?
> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
> Think again:
> http://www.ideasroadshow.com/library
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Why missing values are not allowed in 'poly'?

Martin Maechler
>>>>> William Dunlap via R-help <[hidden email]>
>>>>>     on Wed, 23 Mar 2016 13:56:35 -0700 writes:

    > I don't know what is in R's poly(), but if it is like S+'s or TERR's then
    > one could do

    > if (anyNA(x))  {
    > nax <- na.exclude(x)
    > px <- poly(x = nax, degree = degree, coefs = coefs, raw =
    > raw, simple = simple)
    > px <- structure(naresid(attr(nax, "na.action"), px), coefs
    > = attr(px, "coefs"), degree = attr(px, "degree"), class = attr(px, "class"))
    > return(px)
    > }

    > and get nice results in the usual raw=FALSE case as well.  Similar stuff
    > could be done in the multivariate cases.

I don't have too much time for that now,
and I know that Bill Dunlap cannot provide patches for R --- for
good reasons, though it's a pity for us! ---
but you can, Liviu!
So, and  as you see at every startup of R :

   "R is a collaborative project with many contributors."

I'm willing to try "good-looking" patches.
(to the *sources*, *NOT* to a printout of the function in your R console!)

Martin Maechler
ETH Zurich and R Core Team.

    > Bill Dunlap
    > TIBCO Software
    > wdunlap tibco.com

    > On Wed, Mar 23, 2016 at 1:41 PM, Liviu Andronic <[hidden email]>
    > wrote:

    >> On Wed, Mar 23, 2016 at 9:29 PM, William Dunlap <[hidden email]> wrote:
    >> > I think the worst aspect of this restriction in poly() is that when
    >> > you use poly in the formula of a model-fitting function you cannot
    >> > have any missing values in the data, even if you supply
    >> > na.action=na.exclude.
    >> >
    >> >   > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
    >> >   Warning message:
    >> >   In log(y) : NaNs produced
    >> >   > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
    >> >   Error in poly(x, 3) : missing values are not allowed in 'poly'
    >> >
    >> > Thus people are pushed to using a less stable formulation like
    >> >   > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d, na.action=na.exclude)
    >> >
    >> My difficulty precisely. What's more, I inspected the code for `poly`
    >> and at least for the simple case of raw=TRUE it seems trivial to
    >> support NAs. It suffices to change line 15 of the function:
    >> if (anyNA(x)) stop("missing values are not allowed in 'poly'")
    >>
    >> to:
    >> if (!raw && anyNA(x)) stop("missing values are not allowed in 'poly'")
    >>
    >> This way for raw polynomials estimation continues unimpeded. With the
    >> change above, I get this:
    >> > poly(x, degree = 2, raw=TRUE)
    >> 1 2
    >> [1,] NA NA
    >> [2,] 1 1
    >> [3,] 2 4
    >> [4,] 3 9
    >> [5,] 4 16
    >> [6,] 5 25
    >> [7,] 6 36
    >> [8,] 7 49
    >> [9,] 8 64
    >> [10,] 9 81
    >> [11,] 10 100
    >> attr(,"degree")
    >> [1] 1 2
    >> attr(,"class")
    >> [1] "poly" "matrix"
    >>
    >>
    >> Regards,
    >> Liviu
    >>
    >>
    >> >
    >> > Bill Dunlap
    >> > TIBCO Software
    >> > wdunlap tibco.com
    >> >
    >> > On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic <[hidden email]
    >> >
    >> > wrote:
    >> >>
    >> >> Dear all,
    >> >> I'm a bit surprised by this behavior in poly:
    >> >>
    >> >> x <- c(NA, 1:10)
    >> >> poly(x, degree = 2, raw=TRUE)
    >> >> ## Error in poly(x, degree = 2, raw = TRUE) :
    >> >> ##   missing values are not allowed in 'poly'
    >> >> x^2
    >> >> ## [1] NA 1 4 9 16 25 36 49 64 81 100
    >> >>
    >> >> As you can see, poly() will fail if the vector contains NAs, whereas
    >> >> it is perfectly possible to obtain the square of the vector manually.
    >> >>
    >> >> Is there a reason for this limitation in poly?
    >> >>
    >> >> Regards,
    >> >> Liviu
    >> >>
    >> >>
    >> >> --
    >> >> Do you think you know what math is?
    >> >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
    >> >> Or what it means to be intelligent?
    >> >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
    >> >> Think again:
    >> >> http://www.ideasroadshow.com/library
    >> >>
    >> >> ______________________________________________
    >> >> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
    >> >> https://stat.ethz.ch/mailman/listinfo/r-help
    >> >> PLEASE do read the posting guide
    >> >> http://www.R-project.org/posting-guide.html
    >> >> and provide commented, minimal, self-contained, reproducible code.
    >> >
    >> >
    >>
    >>
    >>
    >> --
    >> Do you think you know what math is?
    >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
    >> Or what it means to be intelligent?
    >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
    >> Think again:
    >> http://www.ideasroadshow.com/library
    >>

    > [[alternative HTML version deleted]]

    > ______________________________________________
    > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Why missing values are not allowed in 'poly'?

R help mailing list-2
If poly were to be changed to allow NA's, how should it act in the
multivariate case when raw=FALSE?

Suppose x1 had an NA in position 1 only and x2 had NA in position 2 only.
Should the output matrix have NAs in all columns of rows 1 and 2 with its
coefs attribute reflectiing only the data in na.omit(cbind(x1,x2))?  This
would make the output matrix orthonormal after NA removal.

Or should it remove NAs from the input vectors independently?  This is
analogous to what predict.poly() currently does - different columns of the
output will have different patterns of NAs.

An example of the first method would be:
> poly(c(NA,2,5,7,8), c(1,NA,3,4,9), degree=2)
            1.0        2.0        0.1         1.1        0.2
[1,]         NA         NA         NA          NA         NA
[2,]         NA         NA         NA          NA         NA
[3,] -0.7715167  0.2672612 -0.5132649  0.39599247  0.6350006
[4,]  0.1543033 -0.8017837 -0.2932942 -0.04525628 -0.7620008
[5,]  0.6172134  0.5345225  0.8065591  0.49781910  0.1270001
attr(,"degree")
[1] 1 2 1 2 2
attr(,"coefs")
[[1]]
[[1]]$alpha
[1] 6.666667 6.190476

[[1]]$norm2
[1] 1.000000 3.000000 4.666667 2.571429

[[2]]
[[2]]$alpha
[1] 5.333333 6.989247

[[2]]$norm2
[1]  1.00000  3.00000 20.66667 14.51613
attr(,"class")
[1] "poly"   "matrix"

and the second
>  poly(c(NA,2,5,7,8), c(1,NA,3,4,9), degree=2)
            1.0        2.0         0.1         1.1        0.2
[1,]         NA         NA -0.55132280          NA  0.6398330
[2,] -0.7637626  0.3988620          NA          NA         NA
[3,] -0.1091089 -0.7407437 -0.21204723  0.02313625 -0.3442756
[4,]  0.3273268 -0.1709409 -0.04240945 -0.01388175 -0.6106021
[5,]  0.5455447  0.5128226  0.80577948  0.43958875  0.3150447
attr(,"degree")
[1] 1 2 1 2 2
attr(,"coefs")
[[1]]
[[1]]$alpha
[1] 5.500000 4.357143

[[1]]$norm2
[1]  1.00000  4.00000 21.00000 56.57143

[[2]]
[[2]]$alpha
[1] 4.250000 6.289568

[[2]]$norm2
[1]   1.0000   4.0000  34.7500 176.6331
attr(,"class")
[1] "poly"   "matrix"



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Mar 24, 2016 at 4:54 AM, Martin Maechler <[hidden email]
> wrote:

> >>>>> William Dunlap via R-help <[hidden email]>
> >>>>>     on Wed, 23 Mar 2016 13:56:35 -0700 writes:
>
>     > I don't know what is in R's poly(), but if it is like S+'s or TERR's
> then
>     > one could do
>
>     > if (anyNA(x))  {
>     > nax <- na.exclude(x)
>     > px <- poly(x = nax, degree = degree, coefs = coefs, raw =
>     > raw, simple = simple)
>     > px <- structure(naresid(attr(nax, "na.action"), px), coefs
>     > = attr(px, "coefs"), degree = attr(px, "degree"), class = attr(px,
> "class"))
>     > return(px)
>     > }
>
>     > and get nice results in the usual raw=FALSE case as well.  Similar
> stuff
>     > could be done in the multivariate cases.
>
> I don't have too much time for that now,
> and I know that Bill Dunlap cannot provide patches for R --- for
> good reasons, though it's a pity for us! ---
> but you can, Liviu!
> So, and  as you see at every startup of R :
>
>    "R is a collaborative project with many contributors."
>
> I'm willing to try "good-looking" patches.
> (to the *sources*, *NOT* to a printout of the function in your R console!)
>
> Martin Maechler
> ETH Zurich and R Core Team.
>
>     > Bill Dunlap
>     > TIBCO Software
>     > wdunlap tibco.com
>
>     > On Wed, Mar 23, 2016 at 1:41 PM, Liviu Andronic <
> [hidden email]>
>     > wrote:
>
>     >> On Wed, Mar 23, 2016 at 9:29 PM, William Dunlap <[hidden email]>
> wrote:
>     >> > I think the worst aspect of this restriction in poly() is that
> when
>     >> > you use poly in the formula of a model-fitting function you cannot
>     >> > have any missing values in the data, even if you supply
>     >> > na.action=na.exclude.
>     >> >
>     >> >   > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
>     >> >   Warning message:
>     >> >   In log(y) : NaNs produced
>     >> >   > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
>     >> >   Error in poly(x, 3) : missing values are not allowed in 'poly'
>     >> >
>     >> > Thus people are pushed to using a less stable formulation like
>     >> >   > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d,
> na.action=na.exclude)
>     >> >
>     >> My difficulty precisely. What's more, I inspected the code for
> `poly`
>     >> and at least for the simple case of raw=TRUE it seems trivial to
>     >> support NAs. It suffices to change line 15 of the function:
>     >> if (anyNA(x)) stop("missing values are not allowed in 'poly'")
>     >>
>     >> to:
>     >> if (!raw && anyNA(x)) stop("missing values are not allowed in
> 'poly'")
>     >>
>     >> This way for raw polynomials estimation continues unimpeded. With
> the
>     >> change above, I get this:
>     >> > poly(x, degree = 2, raw=TRUE)
>     >> 1 2
>     >> [1,] NA NA
>     >> [2,] 1 1
>     >> [3,] 2 4
>     >> [4,] 3 9
>     >> [5,] 4 16
>     >> [6,] 5 25
>     >> [7,] 6 36
>     >> [8,] 7 49
>     >> [9,] 8 64
>     >> [10,] 9 81
>     >> [11,] 10 100
>     >> attr(,"degree")
>     >> [1] 1 2
>     >> attr(,"class")
>     >> [1] "poly" "matrix"
>     >>
>     >>
>     >> Regards,
>     >> Liviu
>     >>
>     >>
>     >> >
>     >> > Bill Dunlap
>     >> > TIBCO Software
>     >> > wdunlap tibco.com
>     >> >
>     >> > On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic <
> [hidden email]
>     >> >
>     >> > wrote:
>     >> >>
>     >> >> Dear all,
>     >> >> I'm a bit surprised by this behavior in poly:
>     >> >>
>     >> >> x <- c(NA, 1:10)
>     >> >> poly(x, degree = 2, raw=TRUE)
>     >> >> ## Error in poly(x, degree = 2, raw = TRUE) :
>     >> >> ##   missing values are not allowed in 'poly'
>     >> >> x^2
>     >> >> ## [1] NA 1 4 9 16 25 36 49 64 81 100
>     >> >>
>     >> >> As you can see, poly() will fail if the vector contains NAs,
> whereas
>     >> >> it is perfectly possible to obtain the square of the vector
> manually.
>     >> >>
>     >> >> Is there a reason for this limitation in poly?
>     >> >>
>     >> >> Regards,
>     >> >> Liviu
>     >> >>
>     >> >>
>     >> >> --
>     >> >> Do you think you know what math is?
>     >> >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
>     >> >> Or what it means to be intelligent?
>     >> >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
>     >> >> Think again:
>     >> >> http://www.ideasroadshow.com/library
>     >> >>
>     >> >> ______________________________________________
>     >> >> [hidden email] mailing list -- To UNSUBSCRIBE and more,
> see
>     >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>     >> >> PLEASE do read the posting guide
>     >> >> http://www.R-project.org/posting-guide.html
>     >> >> and provide commented, minimal, self-contained, reproducible
> code.
>     >> >
>     >> >
>     >>
>     >>
>     >>
>     >> --
>     >> Do you think you know what math is?
>     >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
>     >> Or what it means to be intelligent?
>     >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
>     >> Think again:
>     >> http://www.ideasroadshow.com/library
>     >>
>
>     > [[alternative HTML version deleted]]
>
>     > ______________________________________________
>     > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>     > https://stat.ethz.ch/mailman/listinfo/r-help
>     > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
>     > and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Why missing values are not allowed in 'poly'?

Liviu Andronic
On Mon, Mar 28, 2016 at 8:32 PM, William Dunlap <[hidden email]> wrote:
> If poly were to be changed to allow NA's, how should it act in the
> multivariate case when raw=FALSE?
>
Thank you Bill for looking into this. I was playing myself with your
proposed code to come up with a clean patch, and ran into the same
queries as you did, i.e. what to do when the NA structure is different
in the multivariate case? In the univariate case or in the
multivariate case when the NA structure is identical things are
straightforward. But in the multivariate case with differing NA
structure I'm not sure what is the expected output.

Regards,
Liviu


> Suppose x1 had an NA in position 1 only and x2 had NA in position 2 only.
> Should the output matrix have NAs in all columns of rows 1 and 2 with its
> coefs attribute reflectiing only the data in na.omit(cbind(x1,x2))?  This
> would make the output matrix orthonormal after NA removal.
>
> Or should it remove NAs from the input vectors independently?  This is
> analogous to what predict.poly() currently does - different columns of the
> output will have different patterns of NAs.
>
> An example of the first method would be:
>> poly(c(NA,2,5,7,8), c(1,NA,3,4,9), degree=2)
>             1.0        2.0        0.1         1.1        0.2
> [1,]         NA         NA         NA          NA         NA
> [2,]         NA         NA         NA          NA         NA
> [3,] -0.7715167  0.2672612 -0.5132649  0.39599247  0.6350006
> [4,]  0.1543033 -0.8017837 -0.2932942 -0.04525628 -0.7620008
> [5,]  0.6172134  0.5345225  0.8065591  0.49781910  0.1270001
> attr(,"degree")
> [1] 1 2 1 2 2
> attr(,"coefs")
> [[1]]
> [[1]]$alpha
> [1] 6.666667 6.190476
>
> [[1]]$norm2
> [1] 1.000000 3.000000 4.666667 2.571429
>
> [[2]]
> [[2]]$alpha
> [1] 5.333333 6.989247
>
> [[2]]$norm2
> [1]  1.00000  3.00000 20.66667 14.51613
> attr(,"class")
> [1] "poly"   "matrix"
>
> and the second
>>  poly(c(NA,2,5,7,8), c(1,NA,3,4,9), degree=2)
>             1.0        2.0         0.1         1.1        0.2
> [1,]         NA         NA -0.55132280          NA  0.6398330
> [2,] -0.7637626  0.3988620          NA          NA         NA
> [3,] -0.1091089 -0.7407437 -0.21204723  0.02313625 -0.3442756
> [4,]  0.3273268 -0.1709409 -0.04240945 -0.01388175 -0.6106021
> [5,]  0.5455447  0.5128226  0.80577948  0.43958875  0.3150447
> attr(,"degree")
> [1] 1 2 1 2 2
> attr(,"coefs")
> [[1]]
> [[1]]$alpha
> [1] 5.500000 4.357143
>
> [[1]]$norm2
> [1]  1.00000  4.00000 21.00000 56.57143
>
> [[2]]
> [[2]]$alpha
> [1] 4.250000 6.289568
>
> [[2]]$norm2
> [1]   1.0000   4.0000  34.7500 176.6331
> attr(,"class")
> [1] "poly"   "matrix"
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Mar 24, 2016 at 4:54 AM, Martin Maechler
> <[hidden email]> wrote:
>>
>> >>>>> William Dunlap via R-help <[hidden email]>
>> >>>>>     on Wed, 23 Mar 2016 13:56:35 -0700 writes:
>>
>>     > I don't know what is in R's poly(), but if it is like S+'s or TERR's
>> then
>>     > one could do
>>
>>     > if (anyNA(x))  {
>>     > nax <- na.exclude(x)
>>     > px <- poly(x = nax, degree = degree, coefs = coefs, raw =
>>     > raw, simple = simple)
>>     > px <- structure(naresid(attr(nax, "na.action"), px), coefs
>>     > = attr(px, "coefs"), degree = attr(px, "degree"), class = attr(px,
>> "class"))
>>     > return(px)
>>     > }
>>
>>     > and get nice results in the usual raw=FALSE case as well.  Similar
>> stuff
>>     > could be done in the multivariate cases.
>>
>> I don't have too much time for that now,
>> and I know that Bill Dunlap cannot provide patches for R --- for
>> good reasons, though it's a pity for us! ---
>> but you can, Liviu!
>> So, and  as you see at every startup of R :
>>
>>    "R is a collaborative project with many contributors."
>>
>> I'm willing to try "good-looking" patches.
>> (to the *sources*, *NOT* to a printout of the function in your R console!)
>>
>> Martin Maechler
>> ETH Zurich and R Core Team.
>>
>>     > Bill Dunlap
>>     > TIBCO Software
>>     > wdunlap tibco.com
>>
>>     > On Wed, Mar 23, 2016 at 1:41 PM, Liviu Andronic
>> <[hidden email]>
>>     > wrote:
>>
>>     >> On Wed, Mar 23, 2016 at 9:29 PM, William Dunlap <[hidden email]>
>> wrote:
>>     >> > I think the worst aspect of this restriction in poly() is that
>> when
>>     >> > you use poly in the formula of a model-fitting function you
>> cannot
>>     >> > have any missing values in the data, even if you supply
>>     >> > na.action=na.exclude.
>>     >> >
>>     >> >   > d <- transform(data.frame(y=c(-1,1:10)), x=log(y))
>>     >> >   Warning message:
>>     >> >   In log(y) : NaNs produced
>>     >> >   > fit <- lm(y ~ poly(x, 3), data=d, na.action=na.exclude)
>>     >> >   Error in poly(x, 3) : missing values are not allowed in 'poly'
>>     >> >
>>     >> > Thus people are pushed to using a less stable formulation like
>>     >> >   > fit <- lm(y ~ x + I(x^2) + I(x^3), data=d,
>> na.action=na.exclude)
>>     >> >
>>     >> My difficulty precisely. What's more, I inspected the code for
>> `poly`
>>     >> and at least for the simple case of raw=TRUE it seems trivial to
>>     >> support NAs. It suffices to change line 15 of the function:
>>     >> if (anyNA(x)) stop("missing values are not allowed in 'poly'")
>>     >>
>>     >> to:
>>     >> if (!raw && anyNA(x)) stop("missing values are not allowed in
>> 'poly'")
>>     >>
>>     >> This way for raw polynomials estimation continues unimpeded. With
>> the
>>     >> change above, I get this:
>>     >> > poly(x, degree = 2, raw=TRUE)
>>     >> 1 2
>>     >> [1,] NA NA
>>     >> [2,] 1 1
>>     >> [3,] 2 4
>>     >> [4,] 3 9
>>     >> [5,] 4 16
>>     >> [6,] 5 25
>>     >> [7,] 6 36
>>     >> [8,] 7 49
>>     >> [9,] 8 64
>>     >> [10,] 9 81
>>     >> [11,] 10 100
>>     >> attr(,"degree")
>>     >> [1] 1 2
>>     >> attr(,"class")
>>     >> [1] "poly" "matrix"
>>     >>
>>     >>
>>     >> Regards,
>>     >> Liviu
>>     >>
>>     >>
>>     >> >
>>     >> > Bill Dunlap
>>     >> > TIBCO Software
>>     >> > wdunlap tibco.com
>>     >> >
>>     >> > On Wed, Mar 23, 2016 at 12:59 PM, Liviu Andronic
>> <[hidden email]
>>     >> >
>>     >> > wrote:
>>     >> >>
>>     >> >> Dear all,
>>     >> >> I'm a bit surprised by this behavior in poly:
>>     >> >>
>>     >> >> x <- c(NA, 1:10)
>>     >> >> poly(x, degree = 2, raw=TRUE)
>>     >> >> ## Error in poly(x, degree = 2, raw = TRUE) :
>>     >> >> ##   missing values are not allowed in 'poly'
>>     >> >> x^2
>>     >> >> ## [1] NA 1 4 9 16 25 36 49 64 81 100
>>     >> >>
>>     >> >> As you can see, poly() will fail if the vector contains NAs,
>> whereas
>>     >> >> it is perfectly possible to obtain the square of the vector
>> manually.
>>     >> >>
>>     >> >> Is there a reason for this limitation in poly?
>>     >> >>
>>     >> >> Regards,
>>     >> >> Liviu
>>     >> >>
>>     >> >>
>>     >> >> --
>>     >> >> Do you think you know what math is?
>>     >> >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
>>     >> >> Or what it means to be intelligent?
>>     >> >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
>>     >> >> Think again:
>>     >> >> http://www.ideasroadshow.com/library
>>     >> >>
>>     >> >> ______________________________________________
>>     >> >> [hidden email] mailing list -- To UNSUBSCRIBE and more,
>> see
>>     >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>>     >> >> PLEASE do read the posting guide
>>     >> >> http://www.R-project.org/posting-guide.html
>>     >> >> and provide commented, minimal, self-contained, reproducible
>> code.
>>     >> >
>>     >> >
>>     >>
>>     >>
>>     >>
>>     >> --
>>     >> Do you think you know what math is?
>>     >> http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
>>     >> Or what it means to be intelligent?
>>     >> http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
>>     >> Think again:
>>     >> http://www.ideasroadshow.com/library
>>     >>
>>
>>     > [[alternative HTML version deleted]]
>>
>>     > ______________________________________________
>>     > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>>     > https://stat.ethz.ch/mailman/listinfo/r-help
>>     > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>     > and provide commented, minimal, self-contained, reproducible code.
>
>



--
Do you think you know what math is?
http://www.ideasroadshow.com/issues/ian-stewart-2013-08-02
Or what it means to be intelligent?
http://www.ideasroadshow.com/issues/john-duncan-2013-08-30
Think again:
http://www.ideasroadshow.com/library

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.