Dear fellow R users,
Keywords: Kruskal-Wallis, Post-Hoc, pair-wise comparisons, Nemenyi-Damico-Wolfe-Dunn test, coin package, oneway_test I am using the "oneway_test" function in the R package "coin" and I am obtaining results which I cannot believe are accurate. I do not wish to waste anyone's time and so if the following problem is rather trivial, I apologize, however I could not seem to resolve my problem with an online search and I am fresh out of ideas. I have carried out a Kruskal-Wallis test to compare breeding strategy variance of my study organisms (rank data, therefore non-parametric, in oder of increasing degree of "terrestrialization", in this case: adaptations to breeding on land as opposed to in aquatic habitats) between habitat groups (I, II and III). Subsequently I would like to do a "Post-Hoc test" or in other words a set of corrected pair-wise comparisons to test the relationship between individual groups. For this I would like to use the Nemenyi-Damico-Wolfe-Dunn test in the "coin" package (aka oneway_test). However, when I apply it to my data, I receive highly significant differences between all of my groups, which when looking at my data, cannot be true. I have posted one of my command blocks below containing my data set as well as the script adapted from the coin package manual. library(coin) library(multcomp) ###this is my data: mydata <- data.frame(breeding = c(4,4,4,4,1,1,1,1,8,8,8,8,9,7,7,4,4,4,6,1,1,1,1,1,1,4,1,4,4,1,1,4,4,1,1,1,1,6,6,6,6,6,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,11,7,1,1,1,1,1,4,4,4,4,1,1,1,1,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,7,7,7,7,7,7,7,7,7,7,5,5,5,5,5,4,4,6,6,6,6,6,6,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,3,3,3,1,4,1,7,7,7,7,7,7,12,12,12,12,12,12,1,4,7,4,1,1,1,1,1,1,1,1,4), habitat = factor(c(rep("I", 68), rep("II", 89), rep("III", 12)))) ###box plot to visualize data boxplot(breeding~habitat,data=mydata,main="Boxplot of breeding strategies",ylab="breeding strategy",col="gold",lty=1) ### Kruskal-Wallis test kruskal_test(breeding ~ habitat, data = mydata, distribution = approximate(B = 9999)) ### Nemenyi-Damico-Wolfe-Dunn test (joint ranking) NDWD <- oneway_test(breeding ~ habitat, data = mydata, ytrafo = function(data) trafo(data, numeric_trafo = rank), xtrafo = function(data) trafo(data, factor_trafo = function(x) model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))), teststat = "max", distribution = approximate(B = 900000)) ### global p-value print(pvalue(NDWD)) ### sites-by-site p values at alpha = 0.01 (page 244) print(pvalue(NDWD, method = "single-step")) I should be detecting some non-significance between groups I and III at least, but the test comes back with extremely low p-values. Where am I going wrong? Thank you very much for your help. With kind regards Christoph [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
On Jan 09, 2012 at 11:48am Christoph Liedtke wrote:
> I should be detecting some non-significance between groups I and III at least, but the test comes back with > extremely low p-values. Where am I going wrong? Nowhere, I think. This does seem to be an error in coin. You should send your example to the maintainers of the package. Apart from the visual you have provided, other reasons for thinking that this is an error are the following. First, if you redo the analysis excluding habitat II, then the contrast is not significant, as expected. Secondly, if you repeat the full analysis using package nparcomp then you get the results you are expecting, based on the graphical representation of the data. See the examples below. ## drop habitat == II NDWD <- oneway_test(breeding ~ habitat, data = droplevels(subset(mydata, habitat != "II")), ytrafo = function(data) trafo(data, numeric_trafo = rank), xtrafo = function(data) trafo(data, factor_trafo = function(x) model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))), teststat = "max", distribution = approximate(B = 900000)) print(NDWD) print(pvalue(NDWD, method = "single-step")) ## use nparcomp library(nparcomp) npar <- nparcomp(breeding ~ habitat, data = mydata, type = "Tukey") npar Regards, Mark.
Mark Difford (Ph.D.)
Research Associate Botany Department Nelson Mandela Metropolitan University Port Elizabeth, South Africa |
In reply to this post by Christoph Liedtke
The usual way to pose questions about "unexpected behavior" is to look
up the package maintainer's name and email address in the DESCRIPTION file of the package which is accessed with the help function: help(package=coin) You see: Maintainer: Torsten Hothorn <[hidden email]> -- David On Jan 9, 2012, at 4:48 AM, Christoph Liedtke wrote: > Dear fellow R users, > > Keywords: Kruskal-Wallis, Post-Hoc, pair-wise comparisons, Nemenyi- > Damico-Wolfe-Dunn test, coin package, oneway_test > > I am using the "oneway_test" function in the R package "coin" and I > am obtaining results which I cannot believe are accurate. I do not > wish to waste anyone's time and so if the following problem is > rather trivial, I apologize, however I could not seem to resolve my > problem with an online search and I am fresh out of ideas. > > I have carried out a Kruskal-Wallis test to compare breeding > strategy variance of my study organisms (rank data, therefore non- > parametric, in oder of increasing degree of "terrestrialization", in > this case: adaptations to breeding on land as opposed to in aquatic > habitats) between habitat groups (I, II and III). Subsequently I > would like to do a "Post-Hoc test" or in other words a set of > corrected pair-wise comparisons to test the relationship between > individual groups. For this I would like to use the Nemenyi-Damico- > Wolfe-Dunn test in the "coin" package (aka oneway_test). However, > when I apply it to my data, I receive highly significant differences > between all of my groups, which when looking at my data, cannot be > true. I have posted one of my command blocks below containing my > data set as well as the script adapted from the coin package manual. > > library(coin) > library(multcomp) > > ###this is my data: > mydata <- data.frame(breeding = > c(4,4,4,4,1,1,1,1,8,8,8,8,9,7,7,4,4,4,6,1,1,1,1,1,1,4,1,4,4,1,1,4,4,1,1,1,1,6,6,6,6,6,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,11,7,1,1,1,1,1,4,4,4,4,1,1,1,1,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,7,7,7,7,7,7,7,7,7,7,5,5,5,5,5,4,4,6,6,6,6,6,6,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,3,3,3,1,4,1,7,7,7,7,7,7,12,12,12,12,12,12,1,4,7,4,1,1,1,1,1,1,1,1,4), > habitat = factor(c(rep("I", 68), rep("II", 89), > rep("III", 12)))) > > ###box plot to visualize data > boxplot(breeding~habitat,data=mydata,main="Boxplot of breeding > strategies",ylab="breeding strategy",col="gold",lty=1) > > > ### Kruskal-Wallis test > kruskal_test(breeding ~ habitat, data = mydata, distribution = > approximate(B = 9999)) > > ### Nemenyi-Damico-Wolfe-Dunn test (joint ranking) > NDWD <- oneway_test(breeding ~ habitat, data = mydata, > ytrafo = function(data) trafo(data, numeric_trafo = rank), > xtrafo = function(data) trafo(data, factor_trafo = function(x) > model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))), > teststat = "max", distribution = approximate(B = 900000)) > > ### global p-value > print(pvalue(NDWD)) > > ### sites-by-site p values at alpha = 0.01 (page 244) > print(pvalue(NDWD, method = "single-step")) > > > > > > I should be detecting some non-significance between groups I and III > at least, but the test comes back with extremely low p-values. > Where am I going wrong? > > Thank you very much for your help. > > With kind regards > > Christoph > > > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. |
Powered by Nabble | Edit this page |