Quantcast

Unexpected results using the oneway_test in the coin package

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Unexpected results using the oneway_test in the coin package

Christoph Liedtke
Dear fellow R users,

Keywords: Kruskal-Wallis, Post-Hoc, pair-wise comparisons, Nemenyi-Damico-Wolfe-Dunn test, coin package, oneway_test

I am using the "oneway_test" function in the R package "coin" and I am obtaining results which I cannot believe are accurate. I do not wish to waste anyone's time and so if the following problem is rather trivial, I apologize, however I could not seem to resolve my problem with an online search and I am fresh out of ideas.

I have carried out a Kruskal-Wallis test to compare breeding strategy variance of my study organisms (rank data, therefore non-parametric, in oder of increasing degree of "terrestrialization", in this case: adaptations to breeding on land as opposed to in aquatic habitats) between habitat groups (I, II and III). Subsequently I would like to do a "Post-Hoc test" or in other words a set of corrected pair-wise comparisons to test the relationship between individual groups. For this I would like to use the Nemenyi-Damico-Wolfe-Dunn test in the "coin" package (aka oneway_test). However, when I apply it to my data, I receive highly significant differences between all of my groups, which when looking at my data, cannot be true. I have posted one of my command blocks below containing my data set as well as the script adapted from the coin package manual.

library(coin)
library(multcomp)

###this is my data:
mydata <- data.frame(breeding = c(4,4,4,4,1,1,1,1,8,8,8,8,9,7,7,4,4,4,6,1,1,1,1,1,1,4,1,4,4,1,1,4,4,1,1,1,1,6,6,6,6,6,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,11,7,1,1,1,1,1,4,4,4,4,1,1,1,1,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,7,7,7,7,7,7,7,7,7,7,5,5,5,5,5,4,4,6,6,6,6,6,6,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,3,3,3,1,4,1,7,7,7,7,7,7,12,12,12,12,12,12,1,4,7,4,1,1,1,1,1,1,1,1,4),
                    habitat = factor(c(rep("I", 68), rep("II", 89),
                                    rep("III", 12))))

###box plot to visualize data
boxplot(breeding~habitat,data=mydata,main="Boxplot of breeding strategies",ylab="breeding strategy",col="gold",lty=1)


### Kruskal-Wallis test
kruskal_test(breeding ~ habitat, data = mydata, distribution = approximate(B = 9999))

### Nemenyi-Damico-Wolfe-Dunn test (joint ranking)
NDWD <- oneway_test(breeding ~ habitat, data = mydata,
        ytrafo = function(data) trafo(data, numeric_trafo = rank),
        xtrafo = function(data) trafo(data, factor_trafo = function(x)
            model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))),
        teststat = "max", distribution = approximate(B = 900000))

### global p-value
print(pvalue(NDWD))

### sites-by-site p values at alpha = 0.01 (page 244)
print(pvalue(NDWD, method = "single-step"))





I should be detecting some non-significance between groups I and III at least, but the test comes back with extremely low p-values.  Where am I going wrong?

Thank you very much for your help.

With kind regards

Christoph






        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Unexpected results using the oneway_test in the coin package

Mark Difford
On Jan 09, 2012 at 11:48am Christoph Liedtke wrote:

> I should be detecting some non-significance between groups I and III at least, but the test comes back with
> extremely low p-values.  Where am I going wrong?

Nowhere, I think. This does seem to be an error in coin. You should send your example to the maintainers of the package. Apart from the visual you have provided, other reasons for thinking that this is an error are the following.

First, if you redo the analysis excluding habitat II, then the contrast is not significant, as expected. Secondly, if you repeat the full analysis using package nparcomp then you get the results you are expecting, based on the graphical representation of the data. See the examples below.

## drop habitat == II
NDWD <- oneway_test(breeding ~ habitat, data = droplevels(subset(mydata, habitat != "II")),
        ytrafo = function(data) trafo(data, numeric_trafo = rank),
        xtrafo = function(data) trafo(data, factor_trafo = function(x)
            model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))),
        teststat = "max", distribution = approximate(B = 900000))

print(NDWD)
print(pvalue(NDWD, method = "single-step"))

## use nparcomp
library(nparcomp)
npar <- nparcomp(breeding ~ habitat, data = mydata, type = "Tukey")
npar

Regards, Mark.
Mark Difford (Ph.D.)
Research Associate
Botany Department
Nelson Mandela Metropolitan University
Port Elizabeth, South Africa
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Unexpected results using the oneway_test in the coin package

David Winsemius
In reply to this post by Christoph Liedtke
The usual way to pose questions about "unexpected behavior" is to look  
up the package maintainer's name and email address in the DESCRIPTION  
file of the package which is accessed with the help function:

help(package=coin)

You see:
Maintainer: Torsten Hothorn <[hidden email]>
--
David


On Jan 9, 2012, at 4:48 AM, Christoph Liedtke wrote:

> Dear fellow R users,
>
> Keywords: Kruskal-Wallis, Post-Hoc, pair-wise comparisons, Nemenyi-
> Damico-Wolfe-Dunn test, coin package, oneway_test
>
> I am using the "oneway_test" function in the R package "coin" and I  
> am obtaining results which I cannot believe are accurate. I do not  
> wish to waste anyone's time and so if the following problem is  
> rather trivial, I apologize, however I could not seem to resolve my  
> problem with an online search and I am fresh out of ideas.
>
> I have carried out a Kruskal-Wallis test to compare breeding  
> strategy variance of my study organisms (rank data, therefore non-
> parametric, in oder of increasing degree of "terrestrialization", in  
> this case: adaptations to breeding on land as opposed to in aquatic  
> habitats) between habitat groups (I, II and III). Subsequently I  
> would like to do a "Post-Hoc test" or in other words a set of  
> corrected pair-wise comparisons to test the relationship between  
> individual groups. For this I would like to use the Nemenyi-Damico-
> Wolfe-Dunn test in the "coin" package (aka oneway_test). However,  
> when I apply it to my data, I receive highly significant differences  
> between all of my groups, which when looking at my data, cannot be  
> true. I have posted one of my command blocks below containing my  
> data set as well as the script adapted from the coin package manual.
>
> library(coin)
> library(multcomp)
>
> ###this is my data:
> mydata <- data.frame(breeding =  
> c(4,4,4,4,1,1,1,1,8,8,8,8,9,7,7,4,4,4,6,1,1,1,1,1,1,4,1,4,4,1,1,4,4,1,1,1,1,6,6,6,6,6,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,11,7,1,1,1,1,1,4,4,4,4,1,1,1,1,8,8,8,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,7,7,7,7,7,7,7,7,7,7,5,5,5,5,5,4,4,6,6,6,6,6,6,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,3,3,3,1,4,1,7,7,7,7,7,7,12,12,12,12,12,12,1,4,7,4,1,1,1,1,1,1,1,1,4),
>                    habitat = factor(c(rep("I", 68), rep("II", 89),
>                                    rep("III", 12))))
>
> ###box plot to visualize data
> boxplot(breeding~habitat,data=mydata,main="Boxplot of breeding  
> strategies",ylab="breeding strategy",col="gold",lty=1)
>
>
> ### Kruskal-Wallis test
> kruskal_test(breeding ~ habitat, data = mydata, distribution =  
> approximate(B = 9999))
>
> ### Nemenyi-Damico-Wolfe-Dunn test (joint ranking)
> NDWD <- oneway_test(breeding ~ habitat, data = mydata,
>        ytrafo = function(data) trafo(data, numeric_trafo = rank),
>        xtrafo = function(data) trafo(data, factor_trafo = function(x)
>            model.matrix(~x - 1) %*% t(contrMat(table(x), "Tukey"))),
>        teststat = "max", distribution = approximate(B = 900000))
>
> ### global p-value
> print(pvalue(NDWD))
>
> ### sites-by-site p values at alpha = 0.01 (page 244)
> print(pvalue(NDWD, method = "single-step"))
>
>
>
>
>
> I should be detecting some non-significance between groups I and III  
> at least, but the test comes back with extremely low p-values.  
> Where am I going wrong?
>
> Thank you very much for your help.
>
> With kind regards
>
> Christoph
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Loading...