multiple comparisons of time series data


multiple comparisons of time series data

Kyle Hall-2
I am interested in a statistical comparison of multiple (5) time series
generated from modeling software (Hydrologic Simulation Program Fortran). The
model output simulates daily bacteria concentration in a stream. The multiple
time series are a result of varying our representation of the stream within
the model.

Our main question is: Do the different methods used to represent a stream
produce different results at a statistically significant level?

We want to compare each output time series to determine if there is a
difference before looking into the cause within the model.  In a previous
study, the Kolmogorov-Smirnov k-sample test was used to compare multiple time
series.

I am unsure about the strength of the Kolmogorov-Smirnov test and I have set
out to determine if there are any other tests to compare multiple time
series.

I know that R has the ks.test but I am unsure how this test handles multiple
comparisons.  Is there something similar to a pairwise.t.test with a
Bonferroni correction, only with time series data?

Does R currently (v 2.3.0) have a comparison test that takes into account the
strong serial correlation of time series data?


Kyle Hall

Graduate Research Assistant
Biological Systems Engineering
Virginia Tech

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: multiple comparisons of time series data

Spencer Graves
PAIRWISE KOLMOGOROV-SMIRNOV:

          I don't know, but it looks like you could just type "pairwise.t.test"
at a command prompt, copy the code into an R script file, and create a
function "pairwise.ks.test" just by replacing the call to "t.test" with
one to "ks.test".  Try it.  If you have trouble making it work, submit a
post on that.

          I would NOT do this, however, because the "ks.test" assumes samples
of INDEPENDENT observations.  If you've got time series, I would expect
the assumption of independence to be violated, and I would not believe
the results of a KS test.  If you want to try what I just suggested,
please also try it with multiple time series WITHOUT "varying our
representation of the stream within the model", preferably several times.
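
A minimal sketch of the suggestion above, for illustration only.  The
name "pairwise.ks.test" is hypothetical (it is not a function shipped
with R); the sketch reuses "pairwise.table" and "ks.test" from the stats
package.  The second half is one assumed way to run the sanity check
described above: it compares pairs of simulated series that come from
the SAME process, once with independent noise and once with an AR(1)
process (coefficient 0.9, chosen arbitrarily), and counts how often the
KS test rejects at the 5% level.  All data here are made up.

```r
# Hypothetical helper: pairwise two-sample KS tests with adjusted p-values.
# x = numeric values, g = grouping labels (one label per series).
pairwise.ks.test <- function(x, g, p.adjust.method = "bonferroni") {
  g <- factor(g)
  compare.levels <- function(i, j) {
    xi <- x[as.integer(g) == i]
    xj <- x[as.integer(g) == j]
    ks.test(xi, xj)$p.value
  }
  # pairwise.table() fills the lower triangle and applies p.adjust()
  pairwise.table(compare.levels, levels(g), p.adjust.method)
}

# Made-up data: 5 "series" of independent draws from one distribution
set.seed(1)
k <- 5
n <- 200
x <- rnorm(k * n)
g <- rep(paste0("run", 1:k), each = n)
pv <- pairwise.ks.test(x, g)
print(round(pv, 3))

# Sanity check: rejection rate when both samples come from the same process
nrep <- 200
n.sim <- 500
reject.rate <- function(sim) {
  mean(replicate(nrep, ks.test(sim(), sim())$p.value < 0.05))
}
rate.iid <- reject.rate(function() rnorm(n.sim))
rate.ar1 <- reject.rate(function() {
  as.numeric(arima.sim(list(ar = 0.9), n = n.sim))
})
print(c(independent = rate.iid, ar1 = rate.ar1))
```

If the rejection rate for the autocorrelated pairs is well above the
nominal 5%, that is the independence problem showing up in practice.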

COMPARING MULTIPLE TIME SERIES

          If I had k different time series to compare, I might proceed as
follows:

          1.  Make normal probability plots using, e.g., qqnorm.  If the
observations did NOT look normal, I'd consider some transformation.  If
the numbers were all positive, I might consider using the "boxcox"
function in library(MASS) to help select one.  However, I wouldn't
completely believe the results, because this also assumes the
observations are independent, and I know they're not.

          2.  Try to fit some traditional time series model as described, e.g.,
in the chapter on time series in Venables and Ripley (2002) Modern
Applied Statistics with S (Springer).  There are better books on time
series, but this is probably the first book I would recommend to anyone
using R, and this chapter would be a reasonable start.  I'd play with
this until I seemed to get sensible fits for nearly all series with the
same model and with residuals that looked fairly, though not totally, (a)
white by the Box-Ljung criteria, and (b) normal in normal probability
plots.  If I saw consistent non-normal behavior in the residuals, it
would indicate a problem bigger than I can handle in a brief email like
this.

          3.  With k different time series, most of the results of "2" could be
summarized in k sets of estimated regression coefficients, all for the
same model, with estimated standard errors plus whitened residuals.  If
you had m parameters, each pair of time series could then be summarized
into m z-scores = (b.i-b.j)/sqrt(var.b.i+var.b.j), which could then be
further converted into m p.values.  You would then add the p.value from
ks.test, making (m+1) p.values for each of the k*(k-1)/2 = 10 pairs of
series with k = 5 series.  I'd then feed these 10*(m+1) p.values into
"p.adjust" to get an answer.  (Note:  "pairwise.t.test" calls
"pairwise.table", which further calls "p.adjust".  I didn't know any of
this before I read your post.)  I might experiment with the different
"methods" for p.adjust, and if I got different answers from the different
methods, I might worry about which to believe.  The Bonferroni is the
simplest, most widely known and understood, but also perhaps the most
conservative.  I might tend to believe some of the others more, but if I
got different answers, I'd suspect that the case was marginal, and I
might want to generate other sets of simulations and try those.

          4.  There are other facilities in R for multiple comparisons, e.g.,
in the multcomp and pgirmess packages.  Before I actually undertook
steps 1, 2, and 3, above, I might review these packages to familiarize
myself more with their contents.

          5.  Virginia Tech has an excellent Statistics department with a
consulting center.  You might try them.
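
Steps 2 and 3 above might be sketched roughly as follows.  This is an
illustration under several assumptions, not a recommended analysis: the
five series are simulated stand-ins for the model output, the common
model is taken to be AR(1) with a mean (so m = 2), the KS test is
applied to the whitened residuals (one reading of step 3), and "holm" is
just one of the available p.adjust methods.  The z-scores also treat the
two fits in each pair as independent, which may not hold for model runs
that share the same inputs.

```r
set.seed(42)
k <- 5
n <- 365

# Made-up stand-ins for the k model outputs (e.g. after a log transform)
series <- lapply(1:k, function(i) arima.sim(list(ar = 0.7), n = n) + 2)

# Step 2: fit the same model to every series (AR(1) is an assumption)
fits <- lapply(series, function(y) arima(y, order = c(1, 0, 0)))

# Box-Ljung check that the residuals look reasonably white
lb.p <- sapply(fits, function(f) {
  Box.test(residuals(f), lag = 10, type = "Ljung-Box", fitdf = 1)$p.value
})

# Step 3: coefficients and their variances, one column per series
b <- sapply(fits, coef)                           # m x k
v <- sapply(fits, function(f) diag(f$var.coef))   # m x k

# All k*(k-1)/2 pairs; m z-score p-values plus one KS p-value per pair
pairs <- combn(k, 2)
p.raw <- sapply(seq_len(ncol(pairs)), function(col) {
  i <- pairs[1, col]
  j <- pairs[2, col]
  z <- (b[, i] - b[, j]) / sqrt(v[, i] + v[, j])
  p.coef <- 2 * pnorm(-abs(z))
  p.ks <- ks.test(as.numeric(residuals(fits[[i]])),
                  as.numeric(residuals(fits[[j]])))$p.value
  c(p.coef, ks = p.ks)
})
colnames(p.raw) <- apply(pairs, 2, paste, collapse = "-")

# Adjust all 10*(m+1) p-values together, keeping the matrix layout
p.adj <- p.raw
p.adj[] <- p.adjust(as.vector(p.raw), method = "holm")

print(round(lb.p, 3))
print(round(p.adj, 3))
```

Rerunning the last step with other "method" values (for example
"bonferroni" or "BH") shows how much the conclusion depends on the
adjustment.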

          hope this helps,
          Spencer Graves

Re: multiple comparisons of time series data

Thomas Adams
In reply to this post by Kyle Hall-2
Kyle,

You might try the Wilcoxon rank sum test (there is also a paired version,
the Wilcoxon signed rank test), which may be useful. Both are available in
R through wilcox.test. There is an application of the test in the textbook
by Loucks, D.P., Stedinger J.R., and Haith, D., 1981. "Water Resources
Systems Planning and Analysis", Prentice-Hall, Englewood Cliffs, New
Jersey. I hope this helps…
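
A small sketch of how this could look for five series, using
pairwise.wilcox.test from the stats package with a Bonferroni
adjustment. The data are made up (independent lognormal draws), and the
names "conc" and "run" are placeholders. Note that the Wilcoxon tests
also assume independent observations, so the caution about serial
correlation raised earlier in the thread applies here as well.

```r
set.seed(7)
k <- 5
n <- 100

# Made-up concentrations for 5 runs, stacked into one vector
conc <- rlnorm(k * n, meanlog = 3, sdlog = 1)
run <- factor(rep(paste0("run", 1:k), each = n))

# Unpaired comparisons (rank sum test for each pair of runs)
pw <- pairwise.wilcox.test(conc, run, p.adjust.method = "bonferroni")
print(round(pw$p.value, 3))

# Paired comparisons (signed rank test), matching values day by day
pw.paired <- pairwise.wilcox.test(conc, run,
                                  p.adjust.method = "bonferroni",
                                  paired = TRUE)
print(round(pw.paired$p.value, 3))
```

The paired form is the natural one when each run produces a value for
the same days, since it compares the runs day by day.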

Go HOKIES!

Tom


--
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL: [hidden email]

VOICE: 937-383-0528
FAX: 937-383-0033
