I am interested in a statistical comparison of multiple (5) time series generated from modeling software (Hydrologic Simulation Program Fortran). The model output simulates daily bacteria concentration in a stream. The multiple time series are a result of varying our representation of the stream within the model.

Our main question is: Do the different methods used to represent a stream produce different results at a statistically significant level?

We want to compare each output time series to determine if there is a difference before looking into the cause within the model. In a previous study, the Kolmogorov-Smirnov k-sample test was used to compare multiple time series.

I am unsure about the strength of the Kolmogorov-Smirnov test and I have set out to determine if there are any other tests to compare multiple time series.

I know that R has the ks.test but I am unsure how this test handles multiple comparisons. Is there something similar to a pairwise.t.test with a Bonferroni correction, only with time series data?

Does R currently (v 2.3.0) have a comparison test that takes into account the strong serial correlation of time series data?

Kyle Hall
Graduate Research Assistant
Biological Systems Engineering
Virginia Tech

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
PAIRWISE KOLMOGOROV-SMIRNOV:
I don't know, but it looks like you could just type "pairwise.t.test" at a command prompt, copy the code into an R script file, and create a function "pairwise.ks.test" just by replacing the call to "t.test" with one to "ks.test". Try it. If you have trouble making it work, submit a post on that.

I would NOT do this, however, because "ks.test" assumes samples of INDEPENDENT observations. If you've got time series, I would expect the assumption of independence to be violated, and I would not believe the results of a KS test. If you want to try what I just suggested, please also try it with multiple time series WITHOUT "varying our representation of the stream within the model", preferably several times.

COMPARING MULTIPLE TIME SERIES

If I had k different time series to compare, I might proceed as follows:

1. Make normal probability plots using, e.g., qqnorm. If the observations did NOT look normal, I'd consider some transformation. If the numbers were all positive, I might consider using the "boxcox" function in library(MASS) to help select one. However, I wouldn't completely believe the results, because this also assumes the observations are independent, and I know they're not.

2. Try to fit some traditional time series model as described, e.g., in the chapter on time series in Venables and Ripley (2002) Modern Applied Statistics with S (Springer). There are better books on time series, but this is probably the first book I would recommend to anyone using R, and this chapter would be a reasonable start. I'd play with this until I seemed to get sensible fits for nearly all series with the same model and with residuals that looked fairly, though not totally, (a) white by the Box-Ljung criteria and (b) normal in normal probability plots. If I saw consistent non-normal behavior in the residuals, it would indicate a problem bigger than I can handle in a brief email like this.

3. With k different time series, most of the results of "2" could be summarized in k sets of estimated regression coefficients, all for the same model, with estimated standard errors plus whitened residuals. If you had m parameters, each pair of time series could then be summarized into m z-scores = (b.i-b.j)/sqrt(var.b.i+var.b.j), which could then be further converted into m p.values. You would then add the p.values from ks.test, making (m+1) p.values for each of the k*(k-1)/2 = 10 pairs of series with k = 5 series. I'd then feed these 10*(m+1) p.values into "p.adjust" to get an answer. (Note: "pairwise.t.test" calls "pairwise.table", which further calls "p.adjust". I didn't know any of this before I read your post.) I might experiment with the different "methods" for p.adjust, and if I got different answers from the different methods, I might worry about which to believe. The Bonferroni is the simplest, most widely known and understood, but also perhaps the most conservative. I might tend to believe some of the others more, but if I got different answers, I'd suspect that the case was marginal, and I might want to generate other sets of simulations and try those.

4. There are other facilities in R for multiple comparisons, e.g., in the multcomp and pgirmess packages. Before I actually undertook steps 1, 2, and 3, above, I might review these packages to familiarize myself more with their contents.

5. Virginia Tech has an excellent Statistics department with a consulting center. You might try them.

hope this helps,
Spencer Graves
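Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original posts: the five series are simulated AR(1) stand-ins rather than HSPF output, an AR(1) model with a mean is assumed to fit every series, the ks.test is applied to the whitened residuals (one possible reading of the suggestion), and the z-scores divide by the square root of the summed variances, which treats the estimates from different series as independent. Series driven by the same model inputs may well violate that last assumption.

```r
set.seed(1)
k <- 5      # number of series, as in the question
n <- 365    # one year of daily values (illustrative length)

# Simulated stand-ins for the five model outputs (NOT real HSPF data)
series <- lapply(seq_len(k), function(i) {
  arima.sim(model = list(ar = 0.6), n = n) + 2
})

# Step 2: fit the same model to every series (AR(1) with mean is an assumption)
fits <- lapply(series, function(x) arima(x, order = c(1, 0, 0)))

# Check whiteness of the residuals with the Ljung-Box test
lb_p <- sapply(fits, function(f) {
  Box.test(residuals(f), lag = 10, type = "Ljung-Box", fitdf = 1)$p.value
})

# Step 3: for each pair, m coefficient p-values plus one ks.test p-value
pairs <- combn(k, 2)
pvals <- apply(pairs, 2, function(ij) {
  fi <- fits[[ij[1]]]
  fj <- fits[[ij[2]]]
  # z-score of the coefficient difference, assuming independent estimates
  z <- (coef(fi) - coef(fj)) / sqrt(diag(fi$var.coef) + diag(fj$var.coef))
  p_coef <- 2 * pnorm(-abs(z))
  # KS test on the whitened residuals rather than the raw series
  p_ks <- ks.test(as.numeric(residuals(fi)), as.numeric(residuals(fj)))$p.value
  c(p_coef, ks = p_ks)
})
colnames(pvals) <- apply(pairs, 2, paste, collapse = "-")

# Adjust all (m + 1) * k * (k - 1) / 2 p-values together, two methods
adjust <- function(p, method) {
  matrix(p.adjust(as.vector(p), method = method),
         nrow = nrow(p), dimnames = dimnames(p))
}
adj_bonf <- adjust(pvals, "bonferroni")
adj_holm <- adjust(pvals, "holm")

print(round(lb_p, 3))
print(round(adj_holm, 3))
```

Comparing adj_bonf with adj_holm gives a quick view of whether the conclusion depends on the adjustment method. If it does, the case is probably marginal, as noted in step 3.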
In reply to this post by Kyle Hall-2
Kyle,
You might try the Wilcoxon Rank Sum test (there is also a paired version, the Wilcoxon signed rank test), which may be useful. Both are found in R. There is an application of the test in the textbook by Loucks, D.P., Stedinger, J.R., and Haith, D., 1981. "Water Resources Systems Planning and Analysis", Prentice-Hall, Englewood Cliffs, New Jersey.

I hope this helps…

Go HOKIES!

Tom

--
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL: [hidden email]
VOICE: 937-383-0528
FAX: 937-383-0033
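For reference, a minimal sketch of how the rank-based comparison might be run across several series with a multiplicity correction, using pairwise.wilcox.test from the stats package. The data here are simulated lognormal values and the group labels "A", "B", "C" are hypothetical, not model output. Note that these tests, like ks.test, treat the observations as independent, so serial correlation remains a concern with daily series.

```r
set.seed(2)
n <- 200  # days per series (illustrative)

# Simulated concentrations for three hypothetical stream representations
conc <- c(rlnorm(n, meanlog = 0,   sdlog = 1),
          rlnorm(n, meanlog = 0,   sdlog = 1),
          rlnorm(n, meanlog = 0.5, sdlog = 1))
method <- factor(rep(c("A", "B", "C"), each = n))

# Unpaired rank sum tests for every pair, Bonferroni-adjusted
res_unpaired <- pairwise.wilcox.test(conc, method,
                                     p.adjust.method = "bonferroni")

# Paired (signed rank) version, pairing values that share the same day
res_paired <- pairwise.wilcox.test(conc, method,
                                   p.adjust.method = "bonferroni",
                                   paired = TRUE)

print(res_unpaired$p.value)
print(res_paired$p.value)
```

The paired form may be the more natural choice when every series is simulated for the same dates, since it compares day-by-day differences instead of pooled distributions.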