
Ljung-Box test (Box.test)

Steven Winter
I fit a simple linear model y = bX to a data set today, which produced 24 residuals (I have 24 data points, one for each year from 1984 to 2007). I would like to test the time-independence of my model's residuals, and my supervisor recommended the Ljung-Box test. The Box.test function in R takes 4 arguments:

x      a numeric vector or univariate time series.
lag    the statistic will be based on lag autocorrelation coefficients.
type   test to be performed: partial matching is used.
fitdf  number of degrees of freedom to be subtracted if x is a series of residuals.

Unfortunately, I never took a statistics class where I learned the Ljung-Box test, and information about it online is hard to find. What does "lag" mean, and what value would you guys recommend I use for the test? Also, what does "fitdf" represent, and what would the value for that parameter be in my case? Finally, the value of x is a vector of my 24 residuals, correct?

Thank you all so much. I apologize for the basic nature of the question.

Steven

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Ljung-Box test (Box.test)

Rui Barradas
Hello,

That's a statistics question, but it's also about using an R function.

The Ljung-Box test isn't supposed to be used in such a context, to test
the residuals of an OLS fit y = bX + e. It is used to test time independence
of the original series or of the residuals of an ARMA(p, q) fit.

In both cases you are right, 'x' is a series.
'lag' can be explained as follows: you have a time series and want to
know if the value observed today depends on what was observed in the
past. Then, a linear regression of "today" on "yesterday" could be

X[t] = b[1]*X[t-1] + e[t], e ~ Normal(0, sigma^2)

A linear regression on two time units in the past would be

X[t] = b[1]*X[t-1] + b[2]*X[t-2] + e[t], e ~ Normal(0, sigma^2)

etc. This is a regression of the series on itself lagged by a certain
number of time units: the present is regressed on the past. The function
ar() fits this kind of model to a time series. In the first case the
order is p=1; in the second, p=2.
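
As a rough illustration of the two regressions above, here is a minimal sketch that fits them with ar(). The series is simulated with arima.sim(); the coefficient 0.6, the length 200 and the seed are arbitrary assumptions made only for the example, not anything taken from the data in this thread.

```r
# Simulated AR(1) series; coefficient, length and seed are illustrative assumptions
set.seed(1)
x <- arima.sim(model = list(ar = 0.6), n = 200)

# aic = FALSE makes ar() fit exactly the order given in order.max
fit1 <- ar(x, order.max = 1, aic = FALSE)  # X[t] on X[t-1],           p = 1
fit2 <- ar(x, order.max = 2, aic = FALSE)  # X[t] on X[t-1] and X[t-2], p = 2

fit1$order  # fitted order p
fit1$ar     # estimate of b[1]
fit2$ar     # estimates of b[1] and b[2]
```

With the default aic = TRUE, ar() would instead pick the order itself, up to order.max.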

Now, in the first case, is there second-order serial correlation? Test
the residuals with lag=2 and fitdf=1, the value of p. Third order? Use lag=3,
fitdf=p=1, etc.
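
The two tests just described might look like this in R. It is only a sketch on a simulated series (the AR coefficient, length and seed are assumptions), meant to show how lag and fitdf enter the call.

```r
# Simulated series and AR(1) fit; all numbers here are illustrative assumptions
set.seed(2)
x <- arima.sim(model = list(ar = 0.6), n = 200)
fit <- ar(x, order.max = 1, aic = FALSE)   # p = 1

# The first p residuals are NA because there is no past value to regress on
res <- na.omit(fit$resid)

# Second-order check: lag = 2, with fitdf = p = 1 subtracted from the df
bt2 <- Box.test(res, lag = 2, type = "Ljung-Box", fitdf = 1)
# Third-order check: lag = 3, still fitdf = p = 1
bt3 <- Box.test(res, lag = 3, type = "Ljung-Box", fitdf = 1)

bt2$parameter  # degrees of freedom = lag - fitdf = 1
bt3$parameter  # degrees of freedom = lag - fitdf = 2
bt2$p.value
bt3$p.value
```

The chi-squared reference distribution has lag - fitdf degrees of freedom, which is why lag has to exceed fitdf.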

You are NOT fitting this type of model, so the Ljung-Box test is
misused. Test the original series with the default parameters (lag=1). If
there is serial correlation, fit an AR (autoregressive) model with
ar(). See the help page ?ar. And see a statistician with experience in
time series. It's a world of its own; I haven't even mentioned
seasonality, or almost anything else about time series.
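
A minimal sketch of that workflow follows. The yearly series is a simulated stand-in for the real 24 observations, and the 0.05 cutoff is just a conventional assumption, not a recommendation from this thread.

```r
# Stand-in for the real data: 24 simulated yearly values, 1984-2007 (assumption)
set.seed(3)
y <- ts(as.numeric(arima.sim(model = list(ar = 0.5), n = 24)), start = 1984)

# Defaults of Box.test: lag = 1, type = "Box-Pierce", fitdf = 0
bt <- Box.test(y)
print(bt)

# If the test suggests serial correlation, fit an AR model;
# ar() chooses the order by AIC unless told otherwise
if (bt$p.value < 0.05) {
  fit <- ar(y)
  print(fit)
}
```

With only 24 points the test has little power, so a large p-value here is weak evidence of independence.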

Do ask someone near you.

Hope this helps,

Rui Barradas

Re: Ljung-Box test (Box.test)

Rui Barradas
Hello,

No, the Ljung-Box test wouldn't be inappropriate in that case. First you
detrend the series and then test for serial independence. It's even
usual to do so. I would use the default values for lag and fitdf, but
use type="Ljung"; the Box-Pierce test is nowadays seldom used in
practice, if at all. It does have great historical and pedagogic
interest: the Ljung-Box test statistic follows from it and corrects the
bias in its variance estimation.
The parameter fitdf is relevant if you test the residuals of a fitted
ARMA(p, q) model, which isn't the case here, so keep it equal to zero.
In any case, lag must be chosen such that lag > fitdf; 1 will do.
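
Put together, the detrend-then-test procedure could be sketched as below. The data are made up (a linear trend plus noise, with arbitrary slope, noise level and seed), and the names year, y, trend_fit and res are only illustrative. The last lines recompute both statistics by hand for lag = 1 to show how the Ljung-Box version rescales the Box-Pierce one.

```r
# Hypothetical yearly data, 1984-2007: linear trend plus noise (assumption)
set.seed(4)
year <- 1984:2007
y <- 2 + 0.05 * (year - 1984) + rnorm(length(year), sd = 0.3)

# Detrend with a linear model; a quadratic trend could use poly(year, 2) instead
trend_fit <- lm(y ~ year)
res <- residuals(trend_fit)

# Ljung-Box test on the residuals: lag = 1, fitdf = 0 (no ARMA model was fitted)
lb <- Box.test(res, lag = 1, type = "Ljung", fitdf = 0)
print(lb)

# The same statistics by hand, for lag = 1
n  <- length(res)
r1 <- acf(res, lag.max = 1, plot = FALSE)$acf[2]  # lag-1 autocorrelation
q_bp <- n * r1^2                                  # Box-Pierce
q_lb <- n * (n + 2) * r1^2 / (n - 1)              # Ljung-Box
c(box_pierce = q_bp, ljung_box = q_lb)
```

The factor (n + 2) / (n - 1) is larger than 1, so with n = 24 the Ljung-Box statistic is noticeably bigger than the Box-Pierce one for the same residuals.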

Oh, and please, it's Rui, not Mr.

Rui Barradas

On 27-06-2012 16:55, Steven Winter wrote:

> Dear Mr. Barradas,
>
> Thank you for your help. Let's say I have the yearly standard deviation
> of temperatures over New York City for the past 24 years. So, there are
> 24 data points. I would like to put a linear/quadratic/some kind of
> model on top of the data to show that there might be a trend in the data
> over time. But to do so, I have to test the time independence of the
> residuals. Would you say the Ljung-Box test is inappropriate in this
> case? If so, what would be my values for "lag" and "fitdf" that I plug
> into the Box.test function in R?
>
> Thank you,
> Steven