Orthogonal polynomials used by R

Orthogonal polynomials used by R

Ashim Kapoor
Dear All,

I have created a time trend by doing x <- 1:93 because I have a time series
with 93 data points. Next I did:

y = lm(series ~ poly(x,4))$residuals

to detrend series.
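
For reference, a minimal self-contained version of this step (the quadratic
trend and AR(1) noise below are simulated stand-ins for my real series,
purely for illustration):

set.seed(1)
x <- 1:93
## hypothetical stand-in for the real series: trend plus AR(1) noise
series <- 0.001 * x^2 + arima.sim(list(ar = 0.5), n = 93)
detrended <- residuals(lm(series ~ poly(x, 4)))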

I chose 4 as the order of my polynomial using cross-validation and by
checking for the absence of trend in the residuals, so I think I have not
overfit this series.

I wish to document the formula of poly(x,4), but I am not able to find it
in ?poly.

Can someone please tell me what the formula for the orthogonal polynomial
used by R is?

Thank you,
Ashim


Re: Orthogonal polynomials used by R

PIKAL Petr
You could get an answer quickly by searching the net:

https://stackoverflow.com/questions/39031172/how-poly-generates-orthogonal-polynomials-how-to-understand-the-coefs-ret/39051154#39051154

Cheers
Petr


Re: Orthogonal polynomials used by R

Ashim Kapoor
Dear Petr,

Many thanks for the quick response.

I also read this:
https://en.wikipedia.org/wiki/Discrete_orthogonal_polynomials

I also read in ?poly:
     The orthogonal polynomial is summarized by the coefficients, which
     can be used to evaluate it via the three-term recursion given in
     Kennedy & Gentle (1980, pp. 343-4), and used in the ‘predict’ part
     of the code.

I don't have access to the mentioned book.

Out of curiosity, what is the name of the discrete orthogonal polynomial
used by R? What discrete measure is it orthogonal with respect to?
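
From reading the source of stats::poly, the recursion appears to be
reconstructible from the stored "coefs" attribute. Here is a sketch of
what I tried (the indexing follows the coefs= branch of poly, as far as
I can tell):

x <- 1:93
p <- poly(x, 4)
cf <- attr(p, "coefs")  # list(alpha, norm2) that predict() uses

## Three-term recursion, as in the coefs= branch of stats::poly:
##   F_0(x) = 1
##   F_1(x) = x - alpha[1]
##   F_i(x) = (x - alpha[i]) F_{i-1}(x) - (norm2[i+1]/norm2[i]) F_{i-2}(x)
## followed by scaling each column to unit length by sqrt(norm2).
Z <- matrix(1, length(x), 5)
Z[, 2] <- x - cf$alpha[1]
for (i in 2:4)
  Z[, i + 1] <- (x - cf$alpha[i]) * Z[, i] -
    (cf$norm2[i + 1] / cf$norm2[i]) * Z[, i - 1]
Z <- Z / rep(sqrt(cf$norm2[-1]), each = length(x))

all.equal(c(Z[, -1]), c(unclass(p)))  # TRUE: matches poly(x, 4)

## The columns come out orthonormal with respect to the counting
## measure on the sample points x (crossprod(Z) is the identity).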

Many thanks,
Ashim





Re: Orthogonal polynomials used by R

PIKAL Petr
Hi

Your questions are beyond my expertise; maybe others can answer them.

However, if you know enough theoretical statistics, you could get many answers by searching, e.g., for:

R lm poly

Cheers
Petr


Re: Orthogonal polynomials used by R

Fox, John
Dear Ashim,

Orthogonal polynomials are used because they tend to produce more accurate numerical computations, not because their coefficients are interpretable, so I wonder why you're interested in the coefficients.

The regressors produced are orthogonal to the constant regressor and are orthogonal to each other (and in fact are orthonormal), as it's simple to demonstrate:

------- snip -------

> x <- 1:93
> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> (m <- lm(y ~ poly(x, 4)))

Call:
lm(formula = y ~ poly(x, 4))

Coefficients:
(Intercept)  poly(x, 4)1  poly(x, 4)2  poly(x, 4)3  poly(x, 4)4  
   15574516    172715069     94769949     27683528      3429259  

> X <- model.matrix(m)
> head(X)
  (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
1           1  -0.1776843   0.2245083  -0.2572066  0.27935949
2           1  -0.1738216   0.2098665  -0.2236579  0.21862917
3           1  -0.1699589   0.1955464  -0.1919525  0.16390514
4           1  -0.1660962   0.1815482  -0.1620496  0.11487597
5           1  -0.1622335   0.1678717  -0.1339080  0.07123722
6           1  -0.1583708   0.1545171  -0.1074869  0.03269145

> zapsmall(crossprod(X))# X'X
            (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
(Intercept)          93           0           0           0           0
poly(x, 4)1           0           1           0           0           0
poly(x, 4)2           0           0           1           0           0
poly(x, 4)3           0           0           0           1           0
poly(x, 4)4           0           0           0           0           1

------- snip -------

If for some not immediately obvious reason you're interested in the regression coefficients, why not just use a "raw" polynomial:

------- snip -------

> (m1 <- lm(y ~ poly(x, 4, raw=TRUE)))

Call:
lm(formula = y ~ poly(x, 4, raw = TRUE))

Coefficients:
            (Intercept)  poly(x, 4, raw = TRUE)1  poly(x, 4, raw = TRUE)2  poly(x, 4, raw = TRUE)3  
                 1.5640                   0.8985                   1.0037                   1.0000  
poly(x, 4, raw = TRUE)4  
                 1.0000  

------- snip -------

These coefficients are simply interpretable but the model matrix is more poorly conditioned:

------- snip -------

> X1 <- model.matrix(m1)
> head(X1)
  (Intercept) poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3
1           1                       1                       1                       1
2           1                       2                       4                       8
3           1                       3                       9                      27
4           1                       4                      16                      64
5           1                       5                      25                     125
6           1                       6                      36                     216
  poly(x, 4, raw = TRUE)4
1                       1
2                      16
3                      81
4                     256
5                     625
6                    1296
> round(cor(X1[, -1]), 2)
                        poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3
poly(x, 4, raw = TRUE)1                    1.00                    0.97                    0.92
poly(x, 4, raw = TRUE)2                    0.97                    1.00                    0.99
poly(x, 4, raw = TRUE)3                    0.92                    0.99                    1.00
poly(x, 4, raw = TRUE)4                    0.87                    0.96                    0.99
                        poly(x, 4, raw = TRUE)4
poly(x, 4, raw = TRUE)1                    0.87
poly(x, 4, raw = TRUE)2                    0.96
poly(x, 4, raw = TRUE)3                    0.99
poly(x, 4, raw = TRUE)4                    1.00

------- snip -------
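
To make "more poorly conditioned" concrete, you can compare condition
numbers. A quick check with the model matrices above:

## condition numbers of the two model matrices (X and X1 as above)
kappa(X)    # near-orthonormal columns: roughly sqrt(93), about 10
kappa(X1)   # raw powers of 1:93: larger by many orders of magnitude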

The two parametrizations are equivalent, however, in that they represent the same regression surface, and so, e.g., produce the same fitted values:

------- snip -------

> all.equal(fitted(m), fitted(m1))
[1] TRUE

------- snip -------

Because one is usually not interested in the individual coefficients of a polynomial there usually isn't a reason to prefer one parametrization to the other on the grounds of interpretability, so why do you need to interpret the regression equation?

I hope this helps,
 John

  -----------------------------
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox


Re: Orthogonal polynomials used by R

Ashim Kapoor
Dear Petr and John,

Many thanks for your prompt replies.

Here is what I was trying to do. I was trying to build a statistical model
of a given time series using the Box-Jenkins methodology. The series has 93
data points. Before I analyse the ACF and PACF, I am required to de-trend
the series, which seems to have an upward trend. I wanted to find out what
order of polynomial I should fit to the series without overfitting. For
this I want to use orthogonal polynomials (I think someone on the internet
mentioned preventing overfitting by using orthogonal polynomials). This
seems to me like a poor man's cross-validation.

So my plan is to keep increasing the degree of the orthogonal polynomials
till the coefficient of the last orthogonal polynomial becomes
insignificant.
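
Something like this sketch (with 'series' again a stand-in for my data;
because the regressors are orthogonal, the lower-order coefficient
estimates don't change as the degree grows, so the test on the top term
is clean):

for (d in 1:6) {
  fit <- lm(series ~ poly(x, d))
  pval <- summary(fit)$coefficients[d + 1, 4]  # Pr(>|t|) of the top term
  cat("degree", d, "p-value of highest term:", format.pval(pval), "\n")
}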

Note: if I do NOT use orthogonal polynomials, I will overfit the data set,
and I don't think that is a good way to detect the true order of the
polynomial.

Also, now that I have detrended the series and built an ARIMA model of the
residuals, I want to forecast. For this I need the original polynomials
and their coefficients.

I hope I was clear and that my methodology is ok.

I have another query:

Note: if I use cross-validation to determine the order of the polynomial,
I don't get a clear answer.

See here:
library(boot)
mydf <- data.frame(gdp = gdp, x = x)  # gdp is the series, x <- 1:93
d<-(c(
cv.glm(data = mydf,glm(gdp~x),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~poly(x,2)),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~poly(x,3)),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~poly(x,4)),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~poly(x,5)),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~poly(x,6)),K=10)$delta[1]))
print(d)
## [1] 2.178574e+13 7.303031e+11 5.994783e+11 4.943586e+11 4.596648e+11
## [6] 4.980159e+11

# Here it chooses 5. (but 4 and 5 are kind of similar).


d1 <- (c(
cv.glm(data = mydf,glm(gdp~1+x),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~1+x+x^2),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5),K=10)$delta[1],
cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5+x^6),K=10)$delta[1]))

print(d1)
## [1] 2.149647e+13 2.253999e+13 2.182175e+13 2.177170e+13 2.198675e+13
## [6] 2.145754e+13

# here it chooses 1 or 6

Query: why does it choose 1? Is this just round-off noise, or sampling
noise from the random K folds that cross-validation creates? Or is it due
to the ill-conditioned model matrix?

Best Regards,
Ashim.






Re: Orthogonal polynomials used by R

Bert Gunter
Statistical questions are generally off topic on this list. Try
stats.stackexchange.com instead.

But FWIW, I recommend that you work with someone with expertise in time
series analysis, as your efforts to shake and bake your own methodology
seem rather unwise to me.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: Orthogonal polynomials used by R

Ashim Kapoor
Dear Bert,

OK and thank you.

@Fox, John <[hidden email]>: I would be grateful for an offline reply.

Best,
Ashim


Re: Orthogonal polynomials used by R

Fox, John
Dear Ashim,

I'm afraid that much of what you say here is confused.

First, because poly(x) and poly(x, raw=TRUE) produce the same fitted values (as I previously explained), they also produce the same residuals, and consequently the same CV criteria. From the point of view of CV, there's therefore no reason to prefer orthogonal polynomials. And you still don't explain why you want to interpret the coefficients of the polynomial.
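
For instance, continuing with m and m1 from my earlier message:

## identical fits imply identical residuals, hence identical CV errors
## (up to the random assignment of observations to folds):
all.equal(residuals(m), residuals(m1))  # TRUE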

Second, the model formula gdp~1+x+x^2 and other similar formulas in your message don't do what you think. Like + and *, the ^ operator has special meaning on the right-hand side of an R model formula. See ?formula and perhaps read something about statistical models in R. For example:

> x <- 1:93
> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> (m <- lm(y ~ x + x^2))

Call:
lm(formula = y ~ x + x^2)

Coefficients:
(Intercept)            x  
  -15781393       667147  

While gdp ~ x + I(x^2) would work, a better way to fit a raw quadratic is as gdp ~ poly(x, 2, raw=TRUE), as I suggested in my earlier message.
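
For example, with the simulated x and y from above:

## I() protects ^ so that it is interpreted arithmetically rather than
## as a formula operator; both calls fit the same quartic:
m2 <- lm(y ~ x + I(x^2) + I(x^3) + I(x^4))
m3 <- lm(y ~ poly(x, 4, raw = TRUE))
all.equal(fitted(m2), fitted(m3))  # TRUE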

Finally, as to what you should do, I generally try to avoid statistical consulting by email. If you can find competent statistical help locally, such as at a nearby university, I'd recommend talking to someone about the purpose of your research and the nature of your data. If that's not possible, then others have suggested where you might find help, but to get useful advice you'll have to provide much more information about your research.

Best,
 John

  -----------------------------
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Nov 28, 2019, at 12:46 AM, Ashim Kapoor <[hidden email]> wrote:
>
> Dear Peter and John,
>
> Many thanks for your prompt replies.
>
> Here is what I was trying to do.  I was trying to build a statistical model of a given time series using Box Jenkins methodology. The series has 93 data points. Before I analyse the ACF and PACF, I am required to de-trend the series. The series seems to have an upward trend. I wanted to find out what order polynomial should I fit the series
> without overfitting.  For this I want to use orthogonal polynomials(I think someone on the internet was talking about preventing overfitting by using orthogonal polynomials) . This seems to me as a poor man's cross validation.
>
> So my plan is to keep increasing the degree of the orthogonal polynomials till the coefficient of the last orthogonal polynomial becomes insignificant.
>
> Note : If I do NOT use orthogonal polynomials, I will overfit the data set and I don't think that is a good way to detect the true order of the polynomial.
>
> Also now that I have detrended the series and built an ARIMA model of the residuals, now I want to forecast. For this I need to use the original polynomials and their coefficients.
>
> I hope I was clear and that my methodology is ok.
>
> I have another query here :-
>
> Note : If I used cross-validation to determine the order of the polynomial, I don't get a clear answer.
>
> See here :-
> library(boot)
> mydf = data.frame(cbind(gdp,x))
> d<-(c(
> cv.glm(data = mydf,glm(gdp~x),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,2)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,3)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,4)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,5)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,6)),K=10)$delta[1]))
> print(d)
> ## [1] 2.178574e+13 7.303031e+11 5.994783e+11 4.943586e+11 4.596648e+11
> ## [6] 4.980159e+11
>
> # Here it chooses 5. (but 4 and 5 are kind of similar).
>
>
> d1 <- (c(
> cv.glm(data = mydf,glm(gdp~1+x),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5+x^6),K=10)$delta[1]))
>
> print(d1)
> ## [1] 2.149647e+13 2.253999e+13 2.182175e+13 2.177170e+13 2.198675e+13
> ## [6] 2.145754e+13
>
> # here it chooses 1 or 6
>
> Query : Why does it choose 1? Notice : Is this just round off noise / noise due to sampling error created by Cross Validation when it creates the K folds? Is this due to the ill conditioned model matrix?
>
> Best Regards,
> Ashim.
>
>
>
>
>
> On Wed, Nov 27, 2019 at 10:41 PM Fox, John <[hidden email]> wrote:
> Dear Ashim,
>
> Orthogonal polynomials are used because they tend to produce more accurate numerical computations, not because their coefficients are interpretable, so I wonder why you're interested in the coefficients.
>
> The regressors produced are orthogonal to the constant regressor and are orthogonal to each other (and in fact are orthonormal), as it's simple to demonstrate:
>
> ------- snip -------
>
> > x <- 1:93
> > y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> > (m <- lm(y ~ poly(x, 4)))
>
> Call:
> lm(formula = y ~ poly(x, 4))
>
> Coefficients:
> (Intercept)  poly(x, 4)1  poly(x, 4)2  poly(x, 4)3  poly(x, 4)4  
>    15574516    172715069     94769949     27683528      3429259  
>
> > X <- model.matrix(m)
> > head(X)
>   (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
> 1           1  -0.1776843   0.2245083  -0.2572066  0.27935949
> 2           1  -0.1738216   0.2098665  -0.2236579  0.21862917
> 3           1  -0.1699589   0.1955464  -0.1919525  0.16390514
> 4           1  -0.1660962   0.1815482  -0.1620496  0.11487597
> 5           1  -0.1622335   0.1678717  -0.1339080  0.07123722
> 6           1  -0.1583708   0.1545171  -0.1074869  0.03269145
>
> > zapsmall(crossprod(X))# X'X
>             (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
> (Intercept)          93           0           0           0           0
> poly(x, 4)1           0           1           0           0           0
> poly(x, 4)2           0           0           1           0           0
> poly(x, 4)3           0           0           0           1           0
> poly(x, 4)4           0           0           0           0           1
>
> ------- snip -------
>
> If for some not immediately obvious reason you're interested in the regression coefficients, why not just use a "raw" polynomial:
>
> ------- snip -------
>
> > (m1 <- lm(y ~ poly(x, 4, raw=TRUE)))
>
> Call:
> lm(formula = y ~ poly(x, 4, raw = TRUE))
>
> Coefficients:
>             (Intercept)  poly(x, 4, raw = TRUE)1  poly(x, 4, raw = TRUE)2  poly(x, 4, raw = TRUE)3  
>                  1.5640                   0.8985                   1.0037                   1.0000  
> poly(x, 4, raw = TRUE)4  
>                  1.0000  
>
> ------- snip -------
>
> These coefficients are simply interpretable but the model matrix is more poorly conditioned:
>
> ------- snip -------
>
> > X1 <- model.matrix(m1)
> > head(X1)
>   (Intercept) poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3
> 1           1                       1                       1                       1
> 2           1                       2                       4                       8
> 3           1                       3                       9                      27
> 4           1                       4                      16                      64
> 5           1                       5                      25                     125
> 6           1                       6                      36                     216
>   poly(x, 4, raw = TRUE)4
> 1                       1
> 2                      16
> 3                      81
> 4                     256
> 5                     625
> 6                    1296
> > round(cor(X1[, -1]), 2)
>                         poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3
> poly(x, 4, raw = TRUE)1                    1.00                    0.97                    0.92
> poly(x, 4, raw = TRUE)2                    0.97                    1.00                    0.99
> poly(x, 4, raw = TRUE)3                    0.92                    0.99                    1.00
> poly(x, 4, raw = TRUE)4                    0.87                    0.96                    0.99
>                         poly(x, 4, raw = TRUE)4
> poly(x, 4, raw = TRUE)1                    0.87
> poly(x, 4, raw = TRUE)2                    0.96
> poly(x, 4, raw = TRUE)3                    0.99
> poly(x, 4, raw = TRUE)4                    1.00
>
> ------- snip -------
>
> The two parametrizations are equivalent, however, in that they represent the same regression surface, and so, e.g., produce the same fitted values:
>
> ------- snip -------
>
> > all.equal(fitted(m), fitted(m1))
> [1] TRUE
>
> ------- snip -------
>
> Because one is usually not interested in the individual coefficients of a polynomial there usually isn't a reason to prefer one parametrization to the other on the grounds of interpretability, so why do you need to interpret the regression equation?
>
> I hope this helps,
>  John
>
>   -----------------------------
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: http://socserv.mcmaster.ca/jfox
>
> > [the Nov 27 messages from Ashim and Petr, quoted in full at the start of this thread; trimmed here]
>


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Orthogonal polynomials used by R

Ashim Kapoor
On Thu, Nov 28, 2019 at 7:38 PM Fox, John <[hidden email]> wrote:

> Dear Ashim,
>
> I'm afraid that much of what you say here is confused.
>
> First, because poly(x) and poly(x, raw=TRUE) produce the same fitted
> values (as I previously explained), they also produce the same residuals,
> and consequently the same CV criteria. From the point of view of CV,
> there's therefore no reason to prefer orthogonal polynomials. And you still
> don't explain why you want to interpret the coefficients of the polynomial.
>

The trend in the variable that I am trying to create an ARIMA model for is
given by poly(x,4). That is why I wished to know what these polynomials
look like.

I used  :

trend <- predict(lm(gdp~poly(x,4)),newdata = data.frame(
x=94:103),interval="confidence")

and I was able to (numerically) extrapolate the poly(x,4) trend. Still,
I think it would be interesting to know which polynomials I was dealing
with in this case: just some intuition as to whether the linear,
quadratic, cubic, or fourth-order component of the trend dominates. I
don't know exactly how I would interpret them, but it would be fun to know.
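
A quick way to see what the basis functions look like is to plot them directly. A small sketch (using the same time index as above):

P <- poly(1:93, 4)                     # the orthonormal basis, degrees 1 to 4
matplot(1:93, P, type = "l", lty = 1,
        xlab = "t", ylab = "basis value")
legend("topleft", legend = paste("degree", 1:4), col = 1:4, lty = 1)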

Please allow me to show you a trick. I read this on the internet, here :-

https://www.datasciencecentral.com/profiles/blogs/deep-dive-into-polynomial-regression-and-overfitting

Please see the LAST comment by Scott Stelljes where he suggests using an
orthogonal polynomial basis. He does not elaborate but leaves the reader to
work out the details.

Here is what I think of this. Take a big number, say 20, and a variable
whose polynomial trend order we are trying to find. Like this :-

> summary(lm(gdp ~ poly(x,20)))

Call:
lm(formula = gdp ~ poly(x, 20))

Residuals:
     Min       1Q   Median       3Q      Max
-1235661  -367798   -80453   240360  1450906

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)    17601482      66934 262.968  < 2e-16 ***
poly(x, 20)1  125679081     645487 194.704  < 2e-16 ***
poly(x, 20)2   43108747     645487  66.785  < 2e-16 ***
poly(x, 20)3    3605839     645487   5.586 3.89e-07 ***
poly(x, 20)4   -2977277     645487  -4.612 1.69e-05 ***
poly(x, 20)5    1085732     645487   1.682   0.0969 .
poly(x, 20)6    1124125     645487   1.742   0.0859 .
poly(x, 20)7    -108676     645487  -0.168   0.8668
poly(x, 20)8    -976915     645487  -1.513   0.1345
poly(x, 20)9   -1635444     645487  -2.534   0.0135 *
poly(x, 20)10   -715019     645487  -1.108   0.2717
poly(x, 20)11    347102     645487   0.538   0.5924
poly(x, 20)12   -176728     645487  -0.274   0.7850
poly(x, 20)13   -634151     645487  -0.982   0.3292
poly(x, 20)14   -537725     645487  -0.833   0.4076
poly(x, 20)15    -58674     645487  -0.091   0.9278
poly(x, 20)16    -67030     645487  -0.104   0.9176
poly(x, 20)17   -809443     645487  -1.254   0.2139
poly(x, 20)18   -668879     645487  -1.036   0.3036
poly(x, 20)19   -302384     645487  -0.468   0.6409
poly(x, 20)20    359134     645487   0.556   0.5797
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 645500 on 72 degrees of freedom
Multiple R-squared:  0.9983, Adjusted R-squared:  0.9978
F-statistic:  2122 on 20 and 72 DF,  p-value: < 2.2e-16

>


The CV estimate of the trend order is 4. I am not saying this method is
perfect, but, as shown above, this method also suggests n = 4.

I CANNOT do this with raw polynomials: since they are correlated,
individual terms may not be significant in the presence of the others.
Please see below.

> summary(lm(gdp ~ poly(x,20,raw=T)))

Call:
lm(formula = gdp ~ poly(x, 20, raw = T))

Residuals:
     Min       1Q   Median       3Q      Max
-1286007  -372221   -81320   248510  1589130

Coefficients: (4 not defined because of singularities)
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)             2.067e+06  2.649e+06   0.780    0.438
poly(x, 20, raw = T)1   1.633e+06  3.556e+06   0.459    0.647
poly(x, 20, raw = T)2  -7.601e+05  1.679e+06  -0.453    0.652
poly(x, 20, raw = T)3   1.861e+05  3.962e+05   0.470    0.640
poly(x, 20, raw = T)4  -2.634e+04  5.480e+04  -0.481    0.632
poly(x, 20, raw = T)5   2.370e+03  4.854e+03   0.488    0.627
poly(x, 20, raw = T)6  -1.434e+02  2.906e+02  -0.493    0.623
poly(x, 20, raw = T)7   6.022e+00  1.213e+01   0.496    0.621
poly(x, 20, raw = T)8  -1.784e-01  3.587e-01  -0.497    0.620
poly(x, 20, raw = T)9   3.727e-03  7.503e-03   0.497    0.621
poly(x, 20, raw = T)10 -5.373e-05  1.086e-04  -0.495    0.622
poly(x, 20, raw = T)11  5.016e-07  1.018e-06   0.493    0.624
poly(x, 20, raw = T)12 -2.483e-09  5.069e-09  -0.490    0.626
poly(x, 20, raw = T)13         NA         NA      NA       NA
poly(x, 20, raw = T)14  5.656e-14  1.167e-13   0.485    0.629
poly(x, 20, raw = T)15         NA         NA      NA       NA
poly(x, 20, raw = T)16 -1.933e-18  4.011e-18  -0.482    0.631
poly(x, 20, raw = T)17         NA         NA      NA       NA
poly(x, 20, raw = T)18  5.181e-23  1.076e-22   0.482    0.631
poly(x, 20, raw = T)19         NA         NA      NA       NA
poly(x, 20, raw = T)20 -7.173e-28  1.480e-27  -0.485    0.629

Residual standard error: 641000 on 76 degrees of freedom
Multiple R-squared:  0.9982, Adjusted R-squared:  0.9979
F-statistic:  2690 on 16 and 76 DF,  p-value: < 2.2e-16

>

Note, however, that once the orthogonal polynomials have suggested a
number, 4 in this case, I can do this :-

summary(lm(gdp ~ poly(x,4,raw=T)))

Call:
lm(formula = gdp ~ poly(x, 4, raw = T))

Residuals:
     Min       1Q   Median       3Q      Max
-1278673  -424315   -22357   310977  1731813

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           3.022e+06  3.676e+05   8.220 1.64e-12 ***
poly(x, 4, raw = T)1  1.741e+05  5.357e+04   3.249  0.00164 **
poly(x, 4, raw = T)2 -6.434e+03  2.300e+03  -2.797  0.00633 **
poly(x, 4, raw = T)3  1.878e+02  3.667e+01   5.123 1.76e-06 ***
poly(x, 4, raw = T)4 -8.682e-01  1.935e-01  -4.486 2.19e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 663700 on 88 degrees of freedom
Multiple R-squared:  0.9978, Adjusted R-squared:  0.9977
F-statistic: 1.003e+04 on 4 and 88 DF,  p-value: < 2.2e-16

>

Although, due to the correlations, the terms might not all have come
out significant in each other's presence, in this case all 4 powers are
significant.


> Second, the model formula gdp~1+x+x^2 and other similar formulas in your
> message don't do what you think. Like + and *, the ^ operator has special
> meaning on the right-hand side of an R model formula. See ?formula and
> perhaps read something about statistical models in R. For example:
>
> > x <- 1:93
> > y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> > (m <- lm(y ~ x + x^2))
>
> Call:
> lm(formula = y ~ x + x^2)
>
> Coefficients:
> (Intercept)            x
>   -15781393       667147
>
> While gdp ~ x + I(x^2) would work, a better way to fit a raw quadratic is
> as gdp ~ poly(x, 2, raw=TRUE), as I suggested in my earlier message.
>

My bad. Yes, I have some idea of the Wilkinson-Rogers notation. I have seen
it in books, although it slipped my mind that I had to use I( ).


> Finally, as to what you should do, I generally try to avoid statistical
> consulting by email. If you can find competent statistical help locally,
> such as at a nearby university, I'd recommend talking to someone about the
> purpose of your research and the nature of your data. If that's not
> possible, then others have suggested where you might find help, but to get
> useful advice you'll have to provide much more information about your
> research.
>

My original query was about the polynomials used by R which I think is ON
topic. My apologies that this query turned into a statistics query.


> Best,
>  John
>
>   -----------------------------
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: http://socserv.mcmaster.ca/jfox
>
> > [Ashim's message of Nov 28, 12:46 AM, together with the earlier thread it quotes, trimmed here]
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Orthogonal polynomials used by R

Michael Dewey-3
Dear Ashim

As John said, your two examples give the same model to within rounding
error, so it is not clear what you see as the problem. You can always
remove some of the correlation by subtracting a large constant from x
before you use poly() on it.
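
For instance (a small sketch, with x <- 1:93 as elsewhere in this thread):

x <- 1:93
round(cor(cbind(x, x^2, x^3)), 2)     # raw powers: all pairwise correlations high
xc <- x - mean(x)                     # centre x first
round(cor(cbind(xc, xc^2, xc^3)), 2)  # odd and even powers now uncorrelated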

Michael

On 28/11/2019 16:02, Ashim Kapoor wrote:

> [Ashim's message of 28/11/2019 16:02, quoted in full above; trimmed here]

--
Michael
http://www.dewey.myzen.co.uk/home.html

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Orthogonal polynomials used by R

Fox, John
Dear Ashim,

Please see my brief remarks below:

> On Nov 28, 2019, at 11:02 AM, Ashim Kapoor <[hidden email]> wrote:
>
> On Thu, Nov 28, 2019 at 7:38 PM Fox, John <[hidden email]> wrote:
>
>> Dear Ashim,
>>
>> I'm afraid that much of what you say here is confused.
>>
>> First, because poly(x) and poly(x, raw=TRUE) produce the same fitted
>> values (as I previously explained), they also produce the same residuals,
>> and consequently the same CV criteria. From the point of view of CV,
>> there's therefore no reason to prefer orthogonal polynomials. And you still
>> don't explain why you want to interpret the coefficients of the polynomial.
>>
>
> The trend in the variable that I am trying to create an ARIMA model for is
> given by poly(x,4). That is why I wished to know what these polynomials
> look like.

The polynomial "looks" exactly the same whether or not you use raw or orthogonal regressors as a basis for it. That is, the two bases represent exactly the same regression surface (i.e., curve in the case of one x). To see what the fitted polynomial looks like, graph it. But I've now made essentially this point three times, so if it's not clear I regret the unclarity but I don't really have anything to add.

For other points, see below.

>
> I used  :
>
> trend <- predict(lm(gdp~poly(x,4)),newdata = data.frame(
> x=94:103),interval="confidence")
>
> and I was able to (numerically) extrapolate the poly(x,4) trend, although,
> I think it would be interesting to know what polynomials I was dealing with
> in this case. Just some intuition as to if the linear / quadratic / cubic /
> fourth order polynomial trend is dominating. I don't know how I would
> interpret them, but it would be fun to know.

I'm not sure how you intend to interpret the coefficients, say of the raw polynomial. Their magnitudes shouldn't be compared because the size of the powers of x grows with the powers.

BTW, it's very risky to use high-order polynomials for extrapolation beyond the observed range of x, even if the model fits well within the observed range of x, and of course raw and orthogonal polynomial produce exactly the same (problematic) extrapolations (although those produced by raw polynomials may be subject to more rounding error). To be clear, I'm not arguing that one should in general use raw polynomials in preference to orthogonal polynomials, just that the former have generally interpretable coefficients and the latter don't.
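
For instance, a sketch with the gdp series from this thread:

m4  <- lm(gdp ~ poly(x, 4))
m10 <- lm(gdp ~ poly(x, 10))
new <- data.frame(x = 94:103)          # beyond the observed range 1:93
round(cbind(deg4 = predict(m4, new), deg10 = predict(m10, new)))
# the two fits are close inside the data, but their extrapolations can drift far apart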

>
> Please allow me to show you a trick. I read this on the internet, here :-
>
> https://www.datasciencecentral.com/profiles/blogs/deep-dive-into-polynomial-regression-and-overfitting
>
> Please see the LAST comment by Scott Stelljes where he suggests using an
> orthogonal polynomial basis. He does not elaborate but leaves the reader to
> work out the details.

This blog focuses on the numerical stability of raw versus orthogonal polynomials. If by "stepwise" you mean adding successive powers to the model, you'll get exactly the same sequence of fits with raw as with orthogonal polynomials, as I've now explained several times.
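
For instance, with the same folds the CV errors agree exactly. A sketch reusing mydf from your earlier message (with library(boot) loaded):

set.seed(1)
a <- cv.glm(mydf, glm(gdp ~ poly(x, 4)), K = 10)$delta[1]
set.seed(1)                            # same folds for both fits
b <- cv.glm(mydf, glm(gdp ~ poly(x, 4, raw = TRUE)), K = 10)$delta[1]
all.equal(a, b)    # TRUE: same regression surface, same residuals, same CV error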

>
> Here is what I think of this. Take a big number say 20 and take a variable
> in which we are trying to find the order of the polynomial in the trend.
> Like this :-
>
> [summary(lm(gdp ~ poly(x,20))) output, quoted in full earlier in the thread; trimmed]
>
>
> The CV estimate of the trend is 4. I am not saying this method is perfect,
> but look above this method also suggests n=4.
>
> I CANNOT do this with raw polynomials, since they are correlated and
> JOINTLY in the presence of others they may not be significant, please see
> below.
>
> [summary(lm(gdp ~ poly(x,20,raw=T))) output, quoted in full earlier in the thread; trimmed]
>
> Note,however, once the orthogonal polynomials have suggested a number, 4 in
> this case, I can do this :-
>
> [summary(lm(gdp ~ poly(x,4,raw=T))) output, quoted in full earlier in the thread; trimmed]
>
> Although due to correlations they may not be significant jointly, but in
> this case all 4 powers come out significant.

Yes, the coefficients of orthogonal polynomials permit stepwise tests of each term after the previous ones, because the orthogonalization is done stepwise. But (1) interpreting these tests is problematic because of, e.g., issues of simultaneous inference, and (2) you're using CV for model selection anyway (aren't you?), and you'll get (once more) exactly the same CV results from raw and orthogonal polynomials.
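
To illustrate the stepwise structure, a sketch with the x and y from my first message:

m3 <- lm(y ~ poly(x, 3))
m4 <- lm(y ~ poly(x, 4))
cbind(deg3 = coef(m3), deg4 = coef(m4)[1:4])  # lower-order coefficients unchanged
anova(m3, m4)   # this incremental F is the square of the t for poly(x, 4)4 in summary(m4)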

>
>
> Second, the model formula gdp~1+x+x^2 and other similar formulas in your
>> message don't do what you think. Like + and *, the ^ operator has special
>> meaning on the right-hand side of an R model formula. See ?formula and
>> perhaps read something about statistical models in R. For example:
>>
>>> x <- 1:93
>>> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
>>> (m <- lm(y ~ x + x^2))
>>
>> Call:
>> lm(formula = y ~ x + x^2)
>>
>> Coefficients:
>> (Intercept)            x
>>  -15781393       667147
>>
>> While gdp ~ x + I(x^2) would work, a better way to fit a raw quadratic is
>> as gdp ~ poly(x, 2, raw=TRUE), as I suggested in my earlier message.
>>
>
> My bad. Yes, I have some idea of the Wilkinson-Rogers notation. I have seen
> it in books, although it slipped my mind that I had to use I( ).
>
>
>> Finally, as to what you should do, I generally try to avoid statistical
>> consulting by email. If you can find competent statistical help locally,
>> such as at a nearby university, I'd recommend talking to someone about the
>> purpose of your research and the nature of your data. If that's not
>> possible, then others have suggested where you might find help, but to get
>> useful advice you'll have to provide much more information about your
>> research.
>>
>
> My original query was about the polynomials used by R which I think is ON
> topic.

I think that question was answered a while ago.

> My apologies that this query turned into a statistics query.

I don't feel the need for an apology, and although the list focuses on using R, often related statistical issues arise. On the other hand, I don't have anything more to say about your problem. Others are welcome to pick it up.

Best,
 John


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Orthogonal polynomials used by R / another perspective

J C Nash
I'm not going to comment at all on the original question, but on a very common --
and often troublesome -- mixing of viewpoints about data modelling.

R and other software are used both to "fit equations to data" and to "estimate models".
Unfortunately, a good deal of the machinery is common to the two tasks. Usually (but NOT
exclusively), we fit by minimizing a sum of squared deviations and then carry the
calculations forward to make inferences about the parameters of the model equations.
Quite frequently, a single equation can be written in different ways, e.g., with
ordinary or orthogonal polynomials. It gets worse, much worse, for nonlinear models.

Moreover, sometimes (in about 90% of the instances I see, because people tend to send me
the troublesome problems) there are very different sets of numerical values for the
parameters of the modelling equations that give essentially the same sum of squares (or
other loss function).
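
A small sketch of the phenomenon, on hypothetical data: with nearly collinear columns,
wildly different coefficient vectors can yield essentially the same residual sum of
squares.

------- snip -------

set.seed(123)
x1 <- runif(50)
x2 <- x1 + rnorm(50, sd = 1e-6)   # almost an exact copy of x1
y  <- 3*x1 + rnorm(50, sd = 0.1)

rss <- function(b) sum((y - b[1]*x1 - b[2]*x2)^2)
rss(c(3, 0))         # the generating parameters
rss(c(1003, -1000))  # radically different parameters, nearly the same RSS

------- snip -------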

Over several decades, because I am sometimes quite happy to use ANY of the choices, even
when there is a linear dependence in a linear modelling equation for the given data, I have
received a lot of nasty comments. If I'm using the FIT, e.g., for approximation, then such
criticism is misplaced. If I want to use the particular parameters to say something about
the system I'm studying, then indeed I should go back to school (my critics might say
kindergarten). A serious concern about some machine learning is that fit alone is used
as the criterion for predicting from the "equations" (though some models are just
algorithms). There is a jump from a good fit to the existing data to the hope that we can
get good predictions outside of the available data space.
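
That jump is easy to demonstrate on simulated data: a high-degree polynomial can fit the
observed range beautifully and still go badly wrong a short distance outside it.

------- snip -------

set.seed(7)
x <- 1:93
y <- log(x) + rnorm(93, sd = 0.05)

fit <- lm(y ~ poly(x, 10))
max(abs(resid(fit)))   # small: an excellent in-sample fit

newd <- data.frame(x = 94:103)        # just 10 points past the data
cbind(truth = log(94:103), pred = predict(fit, newdata = newd))
# The predictions drift rapidly away from log(x) outside 1:93.

------- snip -------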

Having also taught statistics for several decades, I know how difficult it is to impart
a good perspective on these issues. However, I'll continue to urge data scientists and
statisticians to keep a wide view and to be clear about what they want the software to do
for them.

For now, end of rant.

John Nash (package author of several packages for fitting and optimizing)


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Orthogonal polynomials used by R / another perspective

Bert Gunter-2
Warning: This may be off topic, but as several éminences grises have now
offered comments, I recommend this striking discussion on many related
issues by yet another éminence grise.

https://academic.oup.com/aje/article/186/6/639/3886035

**PLEASE DO NOT REPLY ON LIST**  This is not the place for a discussion of
the many issues Greenland raises. I am open to private replies, but I am
not an authority and my opinions aren't worth much. (See tagline below).

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Nov 28, 2019 at 9:53 AM J C Nash <[hidden email]> wrote:

> I'm not going to comment at all on the original question, but on a very
> common --
> and often troublesome -- mixing of viewpoints about data modelling.
>
> R and other software is used to "fit equations to data" and to "estimate
> models".
> Unfortunately, a good bit of both these tasks is common. Usually (but NOT
> exclusively),
> we fit by minimizing a sum of squared deviations then carry forward the
> calculations to
> make inferences about the parameters of the model equations. Quite
> frequently, a single
> equation can be written different ways e.g., ordinary or orthogonal
> polynomials.
> It gets worse, much worse, for nonlinear models.
>
> Moreover, sometimes (in my case because I get people sending me
> troublesome problems, about
> 90% of instances) there are very different sets of numerical values for
> the parameters of the
> modelling equations that give essentially the same sum of squares (or
> other loss function).
>
> Over several decades, because I am sometimes quite happy to use ANY of the
> choices, even
> when there is a linear dependence in a linear modelling equation for the
> data given, I'm
> often granted a lot of nasty comments. If I'm using the FIT e.g. for
> approximation, then such
> criticism is mis-placed. If I want to use the particular parameters to say
> something about
> the system I'm studying, then indeed I should go back to school (my
> critics might say
> kindergarten). A serious concern about some machine learning is that fit
> alone is used
> as a criteria from which to predict using the "equations" (though some
> models are just
> algorithms). There is a jump from a good fit to existing data to a hope
> that we can get
> good predictions outside of the available data space.
>
> Having taught statistics for several decades also, I know how difficult it
> is to impart
> a good perspective on these issues. However, I'll continue to urge data
> scientists and
> statisticians to keep a wide view and be clear on what they want the
> software to do for
> them.
>
> For now, end of rant.
>
> John Nash (package author of several packages for fitting and optimizing)
>
>
> On 2019-11-28 12:12 p.m., Fox, John wrote:
> > Dear Ashim,
> >
> > Please see my brief remarks below:
> >
> >> On Nov 28, 2019, at 11:02 AM, Ashim Kapoor <[hidden email]>
> wrote:
> >>
> >> On Thu, Nov 28, 2019 at 7:38 PM Fox, John <[hidden email]> wrote:
> >>
> >>> Dear Ashim,
> >>>
> >>> I'm afraid that much of what you say here is confused.
> >>>
> >>> First, because poly(x) and poly(x, raw=TRUE) produce the same fitted
> >>> values (as I previously explained), they also produce the same
> residuals,
> >>> and consequently the same CV criteria. From the point of view of CV,
> >>> there's therefore no reason to prefer orthogonal polynomials. And you
> still
> >>> don't explain why you want to interpret the coefficients of the
> polynomial.
> >>>
> >>
> >> The trend in the variable that I am trying to create an ARIMA model for
> is
> >> given by poly(x,4). That is why I wished to know what these polynomials
> >> look like.
> >
> > The polynomial "looks" exactly the same whether or not you use raw or
> orthogonal regressors as a basis for it. That is, the two bases represent
> exactly the same regression surface (i.e., curve in the case of one x). To
> see what the fitted polynomial looks like, graph it. But I've now made
> essentially this point three times, so if it's not clear I regret the
> unclarity but I don't really have anything to add.
> >
> > For other points, see below.
> >
> >>
> >> I used  :
> >>
> >> trend <- predict(lm(gdp~poly(x,4)),newdata = data.frame(
> >> x=94:103),interval="confidence")
> >>
> >> and I was able to (numerically) extrapolate the poly(x,4) trend,
> although,
> >> I think it would be interesting to know what polynomials I was dealing
> with
> >> in this case. Just some intuition as to if the linear / quadratic /
> cubic /
> >> fourth order polynomial trend is dominating. I don't know how I would
> >> interpret them, but it would be fun to know.
> >
> > I'm not sure how you intend to interpret the coefficients, say of the
> raw polynomial. Their magnitudes shouldn't be compared because the size of
> the powers of x grows with the powers.
> >
> > BTW, it's very risky to use high-order polynomials for extrapolation
> beyond the observed range of x, even if the model fits well within the
> observed range of x, and of course raw and orthogonal polynomial produce
> exactly the same (problematic) extrapolations (although those produced by
> raw polynomials may be subject to more rounding error). To be clear, I'm
> not arguing that one should in general use raw polynomials in preference to
> orthogonal polynomials, just that the former have generally interpretable
> coefficients and the latter don't.
> >
> >>
> >> Please allow me to show you a trick. I read this on the internet, here
> :-
> >>
> >>
> https://www.datasciencecentral.com/profiles/blogs/deep-dive-into-polynomial-regression-and-overfitting
> >>
> >> Please see the LAST comment by Scott Stelljes where he suggests using an
> >> orthogonal polynomial basis. He does not elaborate buttoleaves the
> reader to
> >> work out the details.
> >
> > This blog focuses on the numerical stability of raw versus orthogonal
> polynomials. If by "stepwise" you mean adding successive powers to the
> model, you'll get exactly the same sequence of fits with raw as with
> orthogonal polynomial, as I've now explained several times.
> >
> >>
> >> Here is what I think of this. Take a big number say 20 and take a
> variable
> >> in which we are trying to find the order of the polynomial in the trend.
> >> Like this :-
> >>
> >>> summary(lm(gdp ~ poly(x,20)))
> >>
> >> Call:
> >> lm(formula = gdp ~ poly(x, 20))
> >>
> >> Residuals:
> >>     Min       1Q   Median       3Q      Max
> >> -1235661  -367798   -80453   240360  1450906
> >>
> >> Coefficients:
> >>               Estimate Std. Error t value Pr(>|t|)
> >> (Intercept)    17601482      66934 262.968  < 2e-16 ***
> >> poly(x, 20)1  125679081     645487 194.704  < 2e-16 ***
> >> poly(x, 20)2   43108747     645487  66.785  < 2e-16 ***
> >> poly(x, 20)3    3605839     645487   5.586 3.89e-07 ***
> >> poly(x, 20)4   -2977277     645487  -4.612 1.69e-05 ***
> >> poly(x, 20)5    1085732     645487   1.682   0.0969 .
> >> poly(x, 20)6    1124125     645487   1.742   0.0859 .
> >> poly(x, 20)7    -108676     645487  -0.168   0.8668
> >> poly(x, 20)8    -976915     645487  -1.513   0.1345
> >> poly(x, 20)9   -1635444     645487  -2.534   0.0135 *
> >> poly(x, 20)10   -715019     645487  -1.108   0.2717
> >> poly(x, 20)11    347102     645487   0.538   0.5924
> >> poly(x, 20)12   -176728     645487  -0.274   0.7850
> >> poly(x, 20)13   -634151     645487  -0.982   0.3292
> >> poly(x, 20)14   -537725     645487  -0.833   0.4076
> >> poly(x, 20)15    -58674     645487  -0.091   0.9278
> >> poly(x, 20)16    -67030     645487  -0.104   0.9176
> >> poly(x, 20)17   -809443     645487  -1.254   0.2139
> >> poly(x, 20)18   -668879     645487  -1.036   0.3036
> >> poly(x, 20)19   -302384     645487  -0.468   0.6409
> >> poly(x, 20)20    359134     645487   0.556   0.5797
> >> ---
> >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>
> >> Residual standard error: 645500 on 72 degrees of freedom
> >> Multiple R-squared:  0.9983, Adjusted R-squared:  0.9978
> >> F-statistic:  2122 on 20 and 72 DF,  p-value: < 2.2e-16
> >>
> >>>
> >>
> >>
> >> The CV estimate of the trend is 4. I am not saying this method is
> perfect,
> >> but look above this method also suggests n=4.
> >>
> >> I CANNOT do this with raw polynomials, since they are correlated and
> >> JOINTLY in the presence of others they may not be significant, please
> see
> >> below.
> >>
> >>> summary(lm(gdp ~ poly(x,20,raw=T)))
> >>
> >> Call:
> >> lm(formula = gdp ~ poly(x, 20, raw = T))
> >>
> >> Residuals:
> >>     Min       1Q   Median       3Q      Max
> >> -1286007  -372221   -81320   248510  1589130
> >>
> >> Coefficients: (4 not defined because of singularities)
> >>                         Estimate Std. Error t value Pr(>|t|)
> >> (Intercept)             2.067e+06  2.649e+06   0.780    0.438
> >> poly(x, 20, raw = T)1   1.633e+06  3.556e+06   0.459    0.647
> >> poly(x, 20, raw = T)2  -7.601e+05  1.679e+06  -0.453    0.652
> >> poly(x, 20, raw = T)3   1.861e+05  3.962e+05   0.470    0.640
> >> poly(x, 20, raw = T)4  -2.634e+04  5.480e+04  -0.481    0.632
> >> poly(x, 20, raw = T)5   2.370e+03  4.854e+03   0.488    0.627
> >> poly(x, 20, raw = T)6  -1.434e+02  2.906e+02  -0.493    0.623
> >> poly(x, 20, raw = T)7   6.022e+00  1.213e+01   0.496    0.621
> >> poly(x, 20, raw = T)8  -1.784e-01  3.587e-01  -0.497    0.620
> >> poly(x, 20, raw = T)9   3.727e-03  7.503e-03   0.497    0.621
> >> poly(x, 20, raw = T)10 -5.373e-05  1.086e-04  -0.495    0.622
> >> poly(x, 20, raw = T)11  5.016e-07  1.018e-06   0.493    0.624
> >> poly(x, 20, raw = T)12 -2.483e-09  5.069e-09  -0.490    0.626
> >> poly(x, 20, raw = T)13         NA         NA      NA       NA
> >> poly(x, 20, raw = T)14  5.656e-14  1.167e-13   0.485    0.629
> >> poly(x, 20, raw = T)15         NA         NA      NA       NA
> >> poly(x, 20, raw = T)16 -1.933e-18  4.011e-18  -0.482    0.631
> >> poly(x, 20, raw = T)17         NA         NA      NA       NA
> >> poly(x, 20, raw = T)18  5.181e-23  1.076e-22   0.482    0.631
> >> poly(x, 20, raw = T)19         NA         NA      NA       NA
> >> poly(x, 20, raw = T)20 -7.173e-28  1.480e-27  -0.485    0.629
> >>
> >> Residual standard error: 641000 on 76 degrees of freedom
> >> Multiple R-squared:  0.9982, Adjusted R-squared:  0.9979
> >> F-statistic:  2690 on 16 and 76 DF,  p-value: < 2.2e-16
> >>
> >>>
> >>
> >> Note,however, once the orthogonal polynomials have suggested a number,
> 4 in
> >> this case, I can do this :-
> >>
> >> summary(lm(gdp ~ poly(x,4,raw=T)))
> >>
> >> Call:
> >> lm(formula = gdp ~ poly(x, 4, raw = T))
> >>
> >> Residuals:
> >>     Min       1Q   Median       3Q      Max
> >> -1278673  -424315   -22357   310977  1731813
> >>
> >> Coefficients:
> >>                       Estimate Std. Error t value Pr(>|t|)
> >> (Intercept)           3.022e+06  3.676e+05   8.220 1.64e-12 ***
> >> poly(x, 4, raw = T)1  1.741e+05  5.357e+04   3.249  0.00164 **
> >> poly(x, 4, raw = T)2 -6.434e+03  2.300e+03  -2.797  0.00633 **
> >> poly(x, 4, raw = T)3  1.878e+02  3.667e+01   5.123 1.76e-06 ***
> >> poly(x, 4, raw = T)4 -8.682e-01  1.935e-01  -4.486 2.19e-05 ***
> >> ---
> >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>
> >> Residual standard error: 663700 on 88 degrees of freedom
> >> Multiple R-squared:  0.9978, Adjusted R-squared:  0.9977
> >> F-statistic: 1.003e+04 on 4 and 88 DF,  p-value: < 2.2e-16
> >>
> >>>
> >>
> >> Although due to correlations they may not be significant jointly, but in
> >> this case all 4 powers come out significant.
> >
> > Yes, the coefficients of orthogonal polynomials permit stepwise tests of
> each term after the previous ones because the orthogonalization is done
> stepwise. But (1) interpreting these tests is problematic because, e.g., of
> issues of simultaneous inference, and (2) you're using CV for model
> selection anyway (aren't you?) and you'll get (once more) exactly the same
> CV results from raw and orthogonal polynomials.
> >
> >>
> >>
> >> Second, the model formula gdp~1+x+x^2 and other similar formulas in your
> >>> message don't do what you think. Like + and *, the ^ operator has
> special
> >>> meaning on the right-hand side of an R model formula. See ?Formula and
> >>> perhaps read something about statistical models in R. For example:
> >>>
> >>>> x <- 1:93
> >>>> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> >>>> (m <- lm(y ~ x + x^2))
> >>>
> >>> Call:
> >>> lm(formula = y ~ x + x^2)
> >>>
> >>> Coefficients:
> >>> (Intercept)            x
> >>>  -15781393       667147
> >>>
> >>> While gpp ~ x + I(x^2) would work, a better way to fit a raw quadratic
> is
> >>> as gdp ~ poly(x, 2, raw=TRUE), as I suggested in my earlier message.
> >>>
> >>
> >> My bad. Yes, I have some idea of the Wilkinson-Rogers notation. I have
> seen
> >> it in books, although it slipped my mind that I had to use I( ).
> >>
> >>
> >>> Finally, as to what you should do, I generally try to avoid statistical
> >>> consulting by email. If you can find competent statistical help
> locally,
> >>> such as at a nearby university, I'd recommend talking to someone about
> the
> >>> purpose of your research and the nature of your data. If that's not
> >>> possible, then others have suggested where you might find help, but to
> get
> >>> useful advice you'll have to provide much more information about your
> >>> research.
> >>>
> >>
> >> My original query was about the polynomials used by R which I think is
> ON
> >> topic.
> >
> > I think that question was answered a while ago.
> >
> >> My apologies that this query turned into a statistics query.
> >
> > I don't feel the need for an apology, and although the list focuses on
> using R, often related statistical issues arise. On the other hand, I don't
> have anything more to say about your problem. Others are welcome to pick it
> up.
> >
> > Best,
> >  John
> >
> >>
> >>
> >>> Best,
> >>> John
> >>>
> >>>  -----------------------------
> >>>  John Fox, Professor Emeritus
> >>>  McMaster University
> >>>  Hamilton, Ontario, Canada
> >>>  Web: http::/socserv.mcmaster.ca/jfox
> >>>
> >>>> On Nov 28, 2019, at 12:46 AM, Ashim Kapoor <[hidden email]>
> >>> wrote:
> >>>>
> >>>> Dear Peter and John,
> >>>>
> >>>> Many thanks for your prompt replies.
> >>>>
> >>>> Here is what I was trying to do.  I was trying to build a statistical
> >>> model of a given time series using Box Jenkins methodology. The series
> has
> >>> 93 data points. Before I analyse the ACF and PACF, I am required to
> >>> de-trend the series. The series seems to have an upward trend. I
> wanted to
> >>> find out what order polynomial should I fit the series
> >>>> without overfitting.  For this I want to use orthogonal polynomials(I
> >>> think someone on the internet was talking about preventing overfitting
> by
> >>> using orthogonal polynomials) . This seems to me as a poor man's cross
> >>> validation.
> >>>>
> >>>> So my plan is to keep increasing the degree of the orthogonal
> >>> polynomials till the coefficient of the last orthogonal polynomial
> becomes
> >>> insignificant.
> >>>>
> >>>> Note : If I do NOT use orthogonal polynomials, I will overfit the data
> >>> set and I don't think that is a good way to detect the true order of
> the
> >>> polynomial.
> >>>>
> >>>> Also now that I have detrended the series and built an ARIMA model of
> >>> the residuals, now I want to forecast. For this I need to use the
> original
> >>> polynomials and their coefficients.
> >>>>
> >>>> I hope I was clear and that my methodology is ok.
> >>>>
> >>>> I have another query here :-
> >>>>
> >>>> Note : If I used cross-validation to determine the order of the
> >>> polynomial, I don't get a clear answer.
> >>>>
> >>>> See here :-
> >>>> library(boot)
> >>>> mydf = data.frame(cbind(gdp,x))
> >>>> d<-(c(
> >>>> cv.glm(data = mydf,glm(gdp~x),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~poly(x,2)),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~poly(x,3)),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~poly(x,4)),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~poly(x,5)),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~poly(x,6)),K=10)$delta[1]))
> >>>> print(d)
> >>>> ## [1] 2.178574e+13 7.303031e+11 5.994783e+11 4.943586e+11
> 4.596648e+11
> >>>> ## [6] 4.980159e+11
> >>>>
> >>>> # Here it chooses 5. (but 4 and 5 are kind of similar).
> >>>>
> >>>>
> >>>> d1 <- (c(
> >>>> cv.glm(data = mydf,glm(gdp~1+x),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~1+x+x^2),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5),K=10)$delta[1],
> >>>> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5+x^6),K=10)$delta[1]))
> >>>>
> >>>> print(d1)
> >>>> ## [1] 2.149647e+13 2.253999e+13 2.182175e+13 2.177170e+13
> 2.198675e+13
> >>>> ## [6] 2.145754e+13
> >>>>
> >>>> # here it chooses 1 or 6
> >>>>
> >>>> Query : Why does it choose 1? Notice : Is this just round off noise /
> >>> noise due to sampling error created by Cross Validation when it
> creates the
> >>> K folds? Is this due to the ill conditioned model matrix?
> >>>>
> >>>> Best Regards,
> >>>> Ashim.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Nov 27, 2019 at 10:41 PM Fox, John <[hidden email]> wrote:
> >>>> Dear Ashim,
> >>>>
> >>>> Orthogonal polynomials are used because they tend to produce more
> >>> accurate numerical computations, not because their coefficients are
> >>> interpretable, so I wonder why you're interested in the coefficients.
> >>>>
> >>>> The regressors produced are orthogonal to the constant regressor and
> are
> >>> orthogonal to each other (and in fact are orthonormal), as it's simple
> to
> >>> demonstrate:
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>>> x <- 1:93
> >>>>> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> >>>>> (m <- lm(y ~ poly(x, 4)))
> >>>>
> >>>> Call:
> >>>> lm(formula = y ~ poly(x, 4))
> >>>>
> >>>> Coefficients:
> >>>> (Intercept)  poly(x, 4)1  poly(x, 4)2  poly(x, 4)3  poly(x, 4)4
> >>>>   15574516    172715069     94769949     27683528      3429259
> >>>>
> >>>>> X <- model.matrix(m)
> >>>>> head(X)
> >>>>  (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
> >>>> 1           1  -0.1776843   0.2245083  -0.2572066  0.27935949
> >>>> 2           1  -0.1738216   0.2098665  -0.2236579  0.21862917
> >>>> 3           1  -0.1699589   0.1955464  -0.1919525  0.16390514
> >>>> 4           1  -0.1660962   0.1815482  -0.1620496  0.11487597
> >>>> 5           1  -0.1622335   0.1678717  -0.1339080  0.07123722
> >>>> 6           1  -0.1583708   0.1545171  -0.1074869  0.03269145
> >>>>
> >>>>> zapsmall(crossprod(X))# X'X
> >>>>            (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
> >>>> (Intercept)          93           0           0           0
>  0
> >>>> poly(x, 4)1           0           1           0           0
>  0
> >>>> poly(x, 4)2           0           0           1           0
>  0
> >>>> poly(x, 4)3           0           0           0           1
>  0
> >>>> poly(x, 4)4           0           0           0           0
>  1
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>> If for some not immediately obvious reason you're interested in the
> >>> regression coefficients, why not just use a "raw" polynomial:
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>>> (m1 <- lm(y ~ poly(x, 4, raw=TRUE)))
> >>>>
> >>>> Call:
> >>>> lm(formula = y ~ poly(x, 4, raw = TRUE))
> >>>>
> >>>> Coefficients:
> >>>>            (Intercept)  poly(x, 4, raw = TRUE)1  poly(x, 4, raw =
> >>> TRUE)2  poly(x, 4, raw = TRUE)3
> >>>>                 1.5640                   0.8985
> >>> 1.0037                   1.0000
> >>>> poly(x, 4, raw = TRUE)4
> >>>>                 1.0000
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>> These coefficients are simply interpretable but the model matrix is
> more
> >>> poorly conditioned:
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>>> X1 <- model.matrix(m1)
> >>>>> head(X1)
> >>>>   (Intercept) poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3 poly(x, 4, raw = TRUE)4
> >>>> 1           1                       1                       1                       1                       1
> >>>> 2           1                       2                       4                       8                      16
> >>>> 3           1                       3                       9                      27                      81
> >>>> 4           1                       4                      16                      64                     256
> >>>> 5           1                       5                      25                     125                     625
> >>>> 6           1                       6                      36                     216                    1296
> >>>>> round(cor(X1[, -1]), 2)
> >>>>                         poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3 poly(x, 4, raw = TRUE)4
> >>>> poly(x, 4, raw = TRUE)1                    1.00                    0.97                    0.92                    0.87
> >>>> poly(x, 4, raw = TRUE)2                    0.97                    1.00                    0.99                    0.96
> >>>> poly(x, 4, raw = TRUE)3                    0.92                    0.99                    1.00                    0.99
> >>>> poly(x, 4, raw = TRUE)4                    0.87                    0.96                    0.99                    1.00
> >>>>
> >>>> ------- snip -------
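> >>>>
> >>>> The difference in conditioning can be quantified with base R's
> >>>> kappa(), which estimates a matrix condition number (a quick check on
> >>>> the two model matrices above):
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>>> kappa(X)    # modest, thanks to the orthonormal columns
> >>>>> kappa(X1)   # enormous, from the nearly collinear raw powers
> >>>>
> >>>> ------- snip -------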
> >>>>
> >>>> The two parametrizations are equivalent, however, in that they
> >>>> represent the same regression surface and so, e.g., produce the same
> >>>> fitted values:
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>>> all.equal(fitted(m), fitted(m1))
> >>>> [1] TRUE
> >>>>
> >>>> ------- snip -------
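> >>>>
> >>>> Since the fitted values agree, the residuals agree as well, which is
> >>>> what matters for the detrending in your original question:
> >>>>
> >>>> ------- snip -------
> >>>>
> >>>>> all.equal(residuals(m), residuals(m1))  # residuals also agree (up to numerical fuzz)
> >>>>
> >>>> ------- snip -------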
> >>>>
> >>>> Because one is usually not interested in the individual coefficients
> >>>> of a polynomial, there is rarely a reason to prefer one
> >>>> parametrization to the other on grounds of interpretability; so why
> >>>> do you need to interpret the regression equation?
> >>>>
> >>>> I hope this helps,
> >>>> John
> >>>>
> >>>>  -----------------------------
> >>>>  John Fox, Professor Emeritus
> >>>>  McMaster University
> >>>>  Hamilton, Ontario, Canada
> >>>>  Web: http://socserv.mcmaster.ca/jfox
> >>>>
> >>>>> On Nov 27, 2019, at 10:17 AM, Ashim Kapoor <[hidden email]> wrote:
> >>>>>
> >>>>> Dear Petr,
> >>>>>
> >>>>> Many thanks for the quick response.
> >>>>>
> >>>>> I also read this:-
> >>>>> https://en.wikipedia.org/wiki/Discrete_orthogonal_polynomials
> >>>>>
> >>>>> Also I read  in ?poly:-
> >>>>>    The orthogonal polynomial is summarized by the coefficients, which
> >>>>>    can be used to evaluate it via the three-term recursion given in
> >>>>>    Kennedy & Gentle (1980, pp. 343-4), and used in the ‘predict’ part
> >>>>>    of the code.
> >>>>>
> >>>>> I don't have access to the mentioned book.
> >>>>>
> >>>>> Out of curiosity, what is the name of the discrete orthogonal
> >>>>> polynomial used by R ?
> >>>>> What discrete measure is it orthogonal with respect to ?
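> >>>>>
> >>>>> For equally spaced x these are, up to normalization, the Gram or
> >>>>> "discrete Chebyshev" polynomials, orthogonal with respect to the
> >>>>> counting measure on the sample points. A small sketch of how the
> >>>>> stored coefficients are used (predict() re-evaluates the basis at
> >>>>> new points via the three-term recursion):
> >>>>>
> >>>>> P <- poly(1:93, 4)
> >>>>> attr(P, "coefs")       # the alpha and norm2 of the recursion
> >>>>> predict(P, c(94, 95))  # the same polynomials evaluated at new x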
> >>>>>
> >>>>> Many thanks,
> >>>>> Ashim
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, Nov 27, 2019 at 6:11 PM PIKAL Petr <[hidden email]> wrote:
> >>>>>
> >>>>>> You could get answer quickly by searching net.
> >>>>>>
> >>>>>> https://stackoverflow.com/questions/39031172/how-poly-generates-orthogonal-polynomials-how-to-understand-the-coefs-ret/39051154#39051154
> >>>>>>
> >>>>>> Cheers
> >>>>>> Petr
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: R-help <[hidden email]> On Behalf Of Ashim Kapoor
> >>>>>>> Sent: Wednesday, November 27, 2019 12:55 PM
> >>>>>>> To: R Help <[hidden email]>
> >>>>>>> Subject: [R] Orthogonal polynomials used by R
> >>>>>>>
> >>>>>>> Dear All,
> >>>>>>>
> >>>>>>> I have created a time trend by doing x <- 1:93 because I have a
> >>>>>>> time series with 93 data points. Next I did :-
> >>>>>>>
> >>>>>>> y = lm(series ~ poly(x,4))$residuals
> >>>>>>>
> >>>>>>> to detrend series.
> >>>>>>>
> >>>>>>> I chose this 4 as the order of my polynomial using cross
> >>>>>>> validation / checking the absence of trend in the residuals, so I
> >>>>>>> think I have not overfit this series.
> >>>>>>>
> >>>>>>> I wish to document the formula of poly(x,4). I am not able to
> >>>>>>> find it in ?poly.
> >>>>>>>
> >>>>>>> Can someone please tell me what the formula for the orthogonal
> >>>>>>> polynomial used by R is ?
> >>>>>>>
> >>>>>>> Thank you,
> >>>>>>> Ashim
> >>>>>>>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.