

Dear R users,
Why is the result of the Wilcoxon rank-sum test in R different from the one in SAS?
https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_npar1way_sect022.htm
The code is as follows:
sampleA <- c(1.94, 1.94, 2.92, 2.92, 2.92, 2.92, 3.27, 3.27, 3.27, 3.27,
             3.7, 3.7, 3.74)
sampleB <- c(3.27, 3.27, 3.27, 3.7, 3.7, 3.74)
wilcox.test(sampleA, sampleB, paired = FALSE)
Thanks in advance
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


On Fri, 21 Apr 2017, Tripoli Massimiliano wrote:
> Dear R users,
> Why is the result of the Wilcoxon rank-sum test in R different from the one in SAS?
>
> https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_npar1way_sect022.htm
>
> The code is as follows:
>
> sampleA <- c(1.94, 1.94, 2.92, 2.92, 2.92, 2.92, 3.27, 3.27, 3.27, 3.27,
>              3.7, 3.7, 3.74)
>
> sampleB <- c(3.27, 3.27, 3.27, 3.7, 3.7, 3.74)
> wilcox.test(sampleA, sampleB, paired = FALSE)
There are different ways to compute or approximate the asymptotic or
exact conditional distribution of the test statistic:
SAS reports an asymptotic normal approximation (apparently without
continuity correction) along with an asymptotic t approximation and the
exact conditional distribution.
Base R's stats::wilcox.test can report either the exact conditional
distribution (but only if there are no ties) or the asymptotic normal
distribution (with or without continuity correction). In small samples the
default is the former, but when there are ties (as in your case) a warning
is issued and the normal approximation is used instead.
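As a rough illustration of the asymptotic normal distribution mentioned above, the following base-R sketch computes the tie-corrected statistic by hand for the data in this thread. The variable names (n_a, tie_sizes, sigma and so on) are purely illustrative, and the formula is the usual tie-corrected variance of the rank sum rather than code copied from stats::wilcox.test.

```r
sampleA <- c(1.94, 1.94, 2.92, 2.92, 2.92, 2.92, 3.27, 3.27, 3.27, 3.27,
             3.7, 3.7, 3.74)
sampleB <- c(3.27, 3.27, 3.27, 3.7, 3.7, 3.74)

n_a <- length(sampleA)
n_b <- length(sampleB)
n   <- n_a + n_b

## midranks of the pooled sample; tied values share their average rank
r <- rank(c(sampleA, sampleB))

## rank sum of the first sample minus its minimum possible value
w <- sum(r[seq_len(n_a)]) - n_a * (n_a + 1) / 2

## null mean and tie-corrected null standard deviation of w
mu        <- n_a * n_b / 2
tie_sizes <- table(r)
sigma     <- sqrt(n_a * n_b / 12 *
                  ((n + 1) - sum(tie_sizes^3 - tie_sizes) / (n * (n - 1))))

## two-sided p-values without and with a continuity correction of 0.5
p_plain     <- 2 * pnorm(-abs(w - mu) / sigma)
p_corrected <- 2 * pnorm(-(abs(w - mu) - 0.5) / sigma)

round(c(W = w, p_plain = p_plain, p_corrected = p_corrected), 4)
```

Here p_plain should come out as the 0.0764 quoted in this thread, and the continuity-corrected value is somewhat larger.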
Furthermore, coin::wilcox_test can report either the asymptotic normal
distribution (without continuity correction) or the exact conditional
distribution (even in the presence of ties).
Thus:
## collect data in data.frame
d <- data.frame(
  y = c(sampleA, sampleB),
  x = factor(rep(0:1, c(length(sampleA), length(sampleB))))
)
## asymptotic normal distribution without continuity correction
## (p = 0.0764)
stats::wilcox.test(y ~ x, data = d, exact = FALSE, correct = FALSE)
coin::wilcox_test(y ~ x, data = d, distribution = "asymptotic")
## exact conditional distribution (p = 0.1054)
coin::wilcox_test(y ~ x, data = d, distribution = "exact")
These match SAS's results. The default result of stats::wilcox.test is
different as explained by the warning issued.
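For readers without the coin package, the exact conditional distribution can also be sketched by brute force in base R, since the samples are small enough to enumerate every split of the pooled midranks. This is only a minimal illustration, not how coin computes it. The names s_all and p_exact are made up for this example, and the two-sided p-value is taken here as the proportion of splits whose rank sum is at least as far from its expectation as the observed one.

```r
sampleA <- c(1.94, 1.94, 2.92, 2.92, 2.92, 2.92, 3.27, 3.27, 3.27, 3.27,
             3.7, 3.7, 3.74)
sampleB <- c(3.27, 3.27, 3.27, 3.7, 3.7, 3.74)

n_a <- length(sampleA)
n_b <- length(sampleB)

## midranks of the pooled sample (the first n_a entries belong to sampleA)
r <- rank(c(sampleA, sampleB))

## observed rank sum of the second sample and its null expectation
s_obs <- sum(r[n_a + seq_len(n_b)])
mu    <- n_b * (n_a + n_b + 1) / 2

## rank sums for every way of assigning n_b of the pooled ranks to the
## second sample: choose(19, 6) = 27132 equally likely subsets
idx   <- combn(n_a + n_b, n_b)
s_all <- colSums(matrix(r[idx], nrow = n_b))

## two-sided exact conditional p-value
p_exact <- mean(abs(s_all - mu) >= abs(s_obs - mu))
round(p_exact, 4)  # 0.1054
```

Because the midranks are multiples of 0.5, the comparisons are exact in floating point. With larger samples the enumeration grows too quickly to be practical, and a dedicated algorithm such as the one in coin is the better choice.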
hth,
Z


In reply to this post by Tripoli Massimiliano
Try setting the 'correct' argument to FALSE (similar to CORRECT=NO
option in the SAS documentation).
The p-values are then identical, although the W values are different.
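A small sketch of what the 'correct' argument changes, using the data from the original post. Setting exact = FALSE is an addition made here so that both calls use the normal approximation and differ only in the continuity correction. The object names with_cc and without_cc are illustrative.

```r
sampleA <- c(1.94, 1.94, 2.92, 2.92, 2.92, 2.92, 3.27, 3.27, 3.27, 3.27,
             3.7, 3.7, 3.74)
sampleB <- c(3.27, 3.27, 3.27, 3.7, 3.7, 3.74)

## exact = FALSE requests the normal approximation in both calls, so the
## only difference between them is the continuity correction
with_cc    <- wilcox.test(sampleA, sampleB, exact = FALSE, correct = TRUE)
without_cc <- wilcox.test(sampleA, sampleB, exact = FALSE, correct = FALSE)

## W is the same in both calls; only the p-value changes
c(W = unname(with_cc$statistic),
  p_with_cc = with_cc$p.value,
  p_without_cc = without_cc$p.value)
```

The uncorrected p-value should round to 0.0764, and the corrected one is slightly larger.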
Additionally, I cannot understand why you get a warning from R that it
cannot compute exact p-values because of ties, while the SAS
documentation states that "Because the sample size is small, the
large-sample normal approximation might not be adequate, and it is
appropriate to compute the exact test."
And yet, the p-value from R is identical to the two-sided p-values with
the normal approximation...?!
Another oddity is the legend of the SAS output, which does not
correspond to the data in the output itself (but corresponds to the R
values with correct=TRUE)!
Could the SAS documentation have some errors? I don't have SAS installed
so cannot test the code.
Ivan

Dr. Ivan Calandra
TraCEr, Laboratory for Traceology and Controlled Experiments
MONREPOS Archaeological Research Centre and
Museum for Human Behavioural Evolution
Schloss Monrepos
56567 Neuwied, Germany
+49 (0) 2631 9772243
https://www.researchgate.net/profile/Ivan_Calandra

On 21/04/2017 14:37, Tripoli Massimiliano wrote:
> [...]


Also, as far as I know just for historical consistency, the test statistic in R is the rank sum of the first group MINUS its minimum possible value: W = 110.5 - sum(1:13) = 19.5
pd
> On 21 Apr 2017, at 14:54, Achim Zeileis <[hidden email]> wrote:
>
> [...]

Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email] Priv: [hidden email]


On Fri, 21 Apr 2017, peter dalgaard wrote:
> Also, as far as I know just for historical consistency, the test
> statistic in R is the rank sum of the first group MINUS its minimum
> possible value: W = 110.5 - sum(1:13) = 19.5
Ah, yes, I meant to add that remark. And coin::wilcox_test always computes
a standardized test statistic as opposed to the (adjusted) rank sum. But
these are all "simple" transformations of the test statistic and hence do
not influence the p-values.
See also the "Note" in ?wilcox.test on the difference between so-called
Wilcoxon and Mann-Whitney statistics.
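The transformations discussed here can be traced in a few lines of base R. This is only a sketch with illustrative variable names. It shows the raw rank sums, the shifted W statistics that stats::wilcox.test reports, and that swapping the samples changes W but not the two-sided p-value.

```r
sampleA <- c(1.94, 1.94, 2.92, 2.92, 2.92, 2.92, 3.27, 3.27, 3.27, 3.27,
             3.7, 3.7, 3.74)
sampleB <- c(3.27, 3.27, 3.27, 3.7, 3.7, 3.74)

n_a <- length(sampleA)
n_b <- length(sampleB)
r   <- rank(c(sampleA, sampleB))           # pooled midranks

rank_sum_a <- sum(r[seq_len(n_a)])         # 110.5
rank_sum_b <- sum(r[n_a + seq_len(n_b)])   # 79.5

## W as reported by R: rank sum minus its minimum possible value
w_a <- rank_sum_a - sum(seq_len(n_a))      # 110.5 - 91 = 19.5
w_b <- rank_sum_b - sum(seq_len(n_b))      # 79.5 - 21 = 58.5

## the two W values mirror each other around n_a * n_b / 2
w_a + w_b == n_a * n_b                     # TRUE (78)

## swapping the samples changes W but not the two-sided p-value
p_ab <- wilcox.test(sampleA, sampleB, exact = FALSE, correct = FALSE)$p.value
p_ba <- wilcox.test(sampleB, sampleA, exact = FALSE, correct = FALSE)$p.value
all.equal(p_ab, p_ba)                      # TRUE
```

The rank sum of the smaller sample, 79.5, is presumably the statistic shown in the SAS output, although that has not been checked here.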

