

Hi,
I became a little bit confused when working with the Wilcoxon test in R.
As far as I understand, there are mainly two versions:
1) wilcox.test {stats}, which is the default and uses an approximation,
especially when ties are involved
2) wilcox_test {coin}, which calculates the distribution _exactly_, even
with ties.
I have the following scenario:
#BeginCode
library(coin)  # provides wilcox_test() and pvalue()

# big example
size = 60
big1 = rnorm(size, 0, 1)
big2 = rnorm(size, 0.5, 1)

g1f = rep(1, size)
g2f = rep(2, size)
big = c(big1, big2)
data_frame = data.frame(big, gr = as.factor(c(g1f, g2f)))

wilcox_approx = wilcox.test(big1, big2)
wilcox_exact = wilcox_test(big ~ gr, data = data_frame, distribution = "exact")
#EndCode
I found here http://wwwstat.stanford.edu/~susan/courses/s141/hononpara.pdf
that wilcox.test (at least for the signed-rank test) relies on exact
p-values up to N = n1 + n2 = 50.
I can reproduce this when using e.g. size = 15. The p-values are then the
same, as I would expect, having read the info from the link.
#BeginCode
print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote = F)
print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote = F)
#EndCode
That said, if I set e.g. size = 60, the p-values of wilcox.test and
wilcox_test differ, as expected.
What's puzzling me particularly are the differing results when I try to
calculate the p-value manually for bigger sample sizes.
So, if we get the W statistic from wilcox.test:
#BeginCode
W_big = wilcox.test(big1, big2)$statistic
#EndCode
and "convert" it to a Z-score, like this:
#BeginCode
mu_big = (size^2)/2
sd_big = sqrt(size * size * (size + size + 1)/12)
N = size + size
sd_big_corr = sqrt((size * size) / (N * (N - 1)) * (N^3 - N) / 12)

Z_big = (W_big - mu_big)/sd_big
#EndCode
The Z-score (Z_big) is equal to the statistic of wilcox_test.
So far so good. And now comes the main problem.
If I follow the documentation correctly, the p-value for a given W
statistic is calculated using the normal approximation with the Z-score.
However, when I do that, I get a different result than what I would expect:
I would expect the p-value of wilcox.test to be equal to 2*pnorm(Z_big),
which is in fact _not_ the case. Please see:
#BeginCode
p_value_manual = 2 * pnorm(Z_big)

print("-- Resulting p-values --", quote = F)
print(paste("Wilcox approx p-value:", wilcox_approx$p.value), quote = F)
print(paste("Wilcox exact p-value:", pvalue(wilcox_exact)), quote = F)
print(paste("P-value manual:", p_value_manual), quote = F)
#EndCode
So how is the calculation of the p-value performed in wilcox.test when the
sample sizes are big? That might explain why the value differs from the one
calculated manually.
Best regards,
Cedric
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


After fixing the parentheses in your code so that it runs, it seems that the difference is that wilcox.test defaults to using a continuity correction and your manual calculation does not. Use wilcox.test(big1, big2, correct=FALSE).
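The effect of the continuity correction can be sketched outside R as well. A minimal illustration in Python (the statistic W = 2200 for two groups of 60 is a made-up example value, not taken from the thread's data):

```python
from math import sqrt
from statistics import NormalDist

def wilcox_approx_p(w, n1, n2, correct=True):
    """Two-sided normal-approximation p-value for the Mann-Whitney W.
    With correct=True, |W - mu| is shrunk by 0.5 before standardizing,
    mirroring the continuity correction wilcox.test applies by default."""
    mu = n1 * n2 / 2.0
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    diff = abs(w - mu)
    if correct:
        diff = max(diff - 0.5, 0.0)  # continuity correction
    return 2.0 * (1.0 - NormalDist().cdf(diff / sigma))

# hypothetical W for two groups of size 60:
p_plain = wilcox_approx_p(2200, 60, 60, correct=False)
p_corr = wilcox_approx_p(2200, 60, 60, correct=True)
# the corrected p-value is never smaller than the uncorrected one
```

This only sketches the normal-approximation branch; it does not reproduce the exact small-sample branch of wilcox.test.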
thomas
On Tue, 17 Aug 2010, Cedric Laczny wrote:
> [...]
Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle


Thanks for the hint.
I tested this generic example and saw the same behavior also in a special
example, which can be found below. This example does not involve the
continuity correction but exhibits the same "unexpected" behavior:
GDS_example = function()
{
    print("> GDS EXAMPLE <", quote = F)
    # For GDS1331!!!
    exp_all_200601_at = c(281.600, 209.300, 202.600, 263.700, 242.500,
        281.300, 329.100, 268.200, 311.700, 230.600, 280.600, 233.900, 270.200,
        274.000, 169.000, 154.500, 160.100, 196.900, 169.500, 102.800, 148.600,
        111.700, 117.000, 173.000, 79.100, 175.700, 149.500, 151.600, 78.700,
        113.900, 196.200)
    c1 = c(1:14)
    c2 = c(20:31)
    group1 = exp_all_200601_at[c1]
    group2 = exp_all_200601_at[c2]
    wilcox_approx = wilcox.test(group1, group2, correct = F)
    # Deviation estimate -> N.B. the TRUE deviation value can NOT be known, as
    # we would have to sample _ALL_ possible "tissue" etc.
    # wilcox.test(group1, group2, conf.int=T)$estimate
    g1f = rep(1, length(c1))
    g2f = rep(2, length(c2))
    gf = as.factor(c(g1f, g2f))
    exp_groups_200601_at = exp_all_200601_at[c(c1, c2)]
    exp_groups_data_frame = data.frame(exp_groups_200601_at, gf)
    wilcox_exact = wilcox_test(exp_groups_200601_at ~ gf,
        data = exp_groups_data_frame, conf.int = T, distribution = "exact")
    print(wilcox_approx, quote = F)
    print(wilcox_exact)
    print(paste("P-value manually:", 2 * (1 - pnorm(statistic(wilcox_exact)))),
        quote = F)
}
Best,
Cedric
On Tuesday, 17. August 2010 17:50:48 Thomas Lumley wrote:
> [...]


I was able to trace the unexpected behavior down to the following line:
SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
    sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y -
    1))))
My calculations of the Z-score for the normal approximation were based on
the standard deviation of the ranks _without_ ties. The above formula seems
to account for ties and thus yields a slightly different z-score. However,
the data seems to include at most one tie (being based on rnorm), so the
result should be the same as if it contained no tie (1^3 - 1 gives the same
result as 0^3 - 0, obviously ;) ), and thus I would expect the result to be
the same as when using the formula for the standard deviation without ties.
Interestingly, this z-score is also different from the one reported by
wilcox_test (exact calculation of the p-value).
I have not been able to find this formula in any textbook nearby or on any
website. Therefore I am wondering where it actually comes from?
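For reference, the tie term in that SIGMA line can be checked numerically. A small sketch in Python (illustrative only, not R's implementation; the sample vectors are made up):

```python
from collections import Counter
from math import sqrt

def sigma_with_ties(pooled, n1, n2):
    """SD of the Mann-Whitney W under H0, with the tie correction:
    the tie-group sizes play the role of NTIES = table(rank(pooled))."""
    nties = Counter(pooled).values()  # sizes of groups of equal values
    n = n1 + n2
    tie_term = sum(t**3 - t for t in nties) / (n * (n - 1))
    return sqrt((n1 * n2 / 12.0) * ((n + 1) - tie_term))

def sigma_no_ties(n1, n2):
    """The plain formula for ranks without ties."""
    return sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)

# with all values distinct, every tie group has size t = 1 and
# 1^3 - 1 = 0, so the two formulas agree exactly:
distinct = [0.3, 1.2, -0.7, 2.5, 0.9, -1.1]
s_tie = sigma_with_ties(distinct, 3, 3)
s_plain = sigma_no_ties(3, 3)
```

With actual ties present (repeated values in `pooled`), the tie-corrected SD comes out smaller than the plain one.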
Best,
Cedric
On Tuesday, 17. August 2010 13:48:10 Cedric Laczny wrote:
> [...]


Using the help page for wilcox.test and looking in the book mentioned there
(Hollander and Wolfe), I could find the formula in the book as well.
For people who might run into the same problem: the formula can also be
found here, at least if you have access via your university or institution:
www.springerlink.com/index/a757704707951858.pdf
For those of you who might have stumbled upon
http://reiter1.com/Glossar/MannWhitney%20Test.htm ,
if I haven't made any mistake in the calculations, the formula there is
equivalent to the one in R (Hollander and Wolfe).
In particular, if there are no ties, the formula reduces to the variance
(sigma squared) of the ranks _without_ ties.
Also please note, as mentioned in an earlier response to this thread, that
results will probably differ if the continuity correction is enabled.
When doing your own calculations or trying to compare results to the ones
you obtain in R, take care about the distribution that was used to
calculate the p-value: in R, if n1 or n2 (the respective group sizes) is
>= 50, a normal approximation is used via pnorm(z), whereas for smaller
group sizes pwilcox(w) is used, which is _exact_.
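To make the exact branch concrete: a Python sketch (illustrative; not the C code behind pwilcox) of the standard counting recurrence for the null distribution of W, with a two-sided p-value formed by doubling the smaller tail:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count(u, m, n):
    """Number of rank arrangements of m X's and n Y's with statistic U = u
    (the standard recurrence behind dwilcox/pwilcox)."""
    if u < 0 or u > m * n:
        return 0
    if m == 0 or n == 0:
        return 1 if u == 0 else 0
    return count(u - n, m - 1, n) + count(u, m, n - 1)

def exact_p_two_sided(w, m, n):
    """Exact two-sided p-value: twice the smaller tail, capped at 1."""
    total = comb(m + n, m)
    lower = sum(count(u, m, n) for u in range(0, w + 1)) / total
    upper = sum(count(u, m, n) for u in range(w, m * n + 1)) / total
    return min(1.0, 2.0 * min(lower, upper))

# e.g. W = 0 for two groups of 3 (complete separation):
p = exact_p_two_sided(0, 3, 3)  # 2 * 1/choose(6,3) = 0.1
```

With size = 15 in the example above, both group sizes are below 50, so wilcox.test takes exactly this kind of exact branch, which is why it agrees with wilcox_test there.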
Thank you all for your help!
Best,
Cedric
On Wednesday, 18. August 2010 11:55:52 Cedric Laczny wrote:
> [...]


On Aug 18, 2010, at 11:55 AM, Cedric Laczny wrote:
> I was able to trace down the unexpected behavior to the following line
> SIGMA <- sqrt((n.x * n.y/12) * ((n.x + n.y + 1) -
>     sum(NTIES^3 - NTIES)/((n.x + n.y) * (n.x + n.y -
>     1))))
> [...]
Note the definition of NTIES <- table(r), counting the number of observations tied for a particular rank, so it is all ones if and only if there are NO ties in the data.
(If you are in paper-and-pencil mode, these formulas are fairly easily worked out once you realize that you only need the mean and variance of the rank of a single observation: the covariances are C(R1,R2) = -V(R1)/(N-1) because of symmetry and the fact that the sum of all N ranks is fixed.)
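That covariance identity is easy to confirm by brute force. A Python sketch (illustrative) enumerating all equally likely rank assignments for N = 5:

```python
from itertools import permutations

def rank_moments(N):
    """Mean and variance of one rank, and the covariance of two distinct
    ranks, by enumerating all N! equally likely rank assignments."""
    perms = list(permutations(range(1, N + 1)))
    k = len(perms)
    m1 = sum(p[0] for p in perms) / k
    v1 = sum((p[0] - m1) ** 2 for p in perms) / k
    c12 = sum((p[0] - m1) * (p[1] - m1) for p in perms) / k
    return m1, v1, c12

m1, v1, c12 = rank_moments(5)
# theory: E(R1) = (N+1)/2 = 3, V(R1) = (N^2-1)/12 = 2,
#         C(R1,R2) = -V(R1)/(N-1) = -0.5
```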

Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: [hidden email] Priv: [hidden email]

