Different results on running Wilcoxon Rank Sum test in R and SPSS


Different results on running Wilcoxon Rank Sum test in R and SPSS

R help mailing list-2
Hello,
On running the Wilcoxon rank sum test in R and SPSS, I am getting the following discrepancies, which I am unable to explain.
Q1. In the attached data set, I was trying to compare freq4w_n between those with drug_code 0 and those with drug_code 1. SPSS gives a p-value of 0.031, whereas R gives a p-value of 0.001779.
The code I used in R is as follows:
wilcox.test(freq4w_n, drug_code, conf.int = T)


Q2. Similarly, in the same data set, when trying to compare PFD_n between those with drug_code 0 and those with drug_code 1, SPSS gives a p-value of 0.038, whereas R gives a p-value < 2.2e-16.
The code I used in R is as follows:
wilcox.test(PFD_n, drug_code, mu = 0, alternative = "two.sided", correct = TRUE, paired = FALSE, conf.int = TRUE)


I have tried searching on Google and watching some YouTube tutorials, but I cannot find an answer. Any help will be really appreciated. Thank you!
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

Michael Dewey-3
Unfortunately your data did not come through. Try using dput() and then
pasting that into the body of your e-mail message.
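
For readers unfamiliar with dput(), here is a minimal sketch of the round trip it enables. The data frame `dat` and its values are made up for illustration and are not the poster's data.

```r
# A small, made-up data frame standing in for the real data
dat <- data.frame(drug_code = c(0, 1, 0, 1),
                  freq4w_n  = c(1, NA, 4, 0))

# dput() prints R code that recreates the object;
# that printed text is what gets pasted into the email body
dput(dat)

# A reader rebuilds the object by evaluating the pasted text
txt <- paste(capture.output(dput(dat)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(dat, rebuilt)
```

Because the text is plain R code, it survives mailing lists that strip attachments.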


--
Michael
http://www.dewey.myzen.co.uk/home.html


Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

R help mailing list-2
Thank you for the reply and suggestion, Michael!
I used dput(), and the output is below. In short, I have three columns: drug_code, freq4w_n and PFD_n. Each column has 132 values (including NAs). The problem with the Wilcoxon rank sum test is described in my first email.
Please let me know if you need any further clarification from my side. Thanks a lot for your time!
structure(list(drug_code = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0), freq4w_n = c(1, NA, NA, 0, NA, 4, NA, 10, NA, 0, 6, NA, NA, NA, NA, NA, 10, NA, 0, NA, NA, NA, NA, 0, NA, 0, NA, NA, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 12, 0, NA, 1, 2, 1, 2, 2, NA, 28, 0, NA, 4, NA, 1, NA, NA, NA, NA, NA, 0, 3, 1, NA, NA, NA, NA, 4, 28, NA, NA, 0, 2, 12, 0, NA, NA, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, NA, 6, 1, NA, NA, NA, 0, NA, NA, NA, 0, 0, NA, 0, NA, 2, 8, 3, NA, NA, NA, 0, NA, NA, NA, 9, NA, NA, NA, NA, NA, NA, NA, NA), PFD_n = c(27, NA, NA, 28, NA, 26, NA, 20, NA, 30, 24, NA, NA, NA, NA, NA, 18, NA, 28, NA, NA, NA, NA, 28, NA, 28, NA, NA, NA, 28, NA, 28, NA, NA, NA, NA, NA, NA, NA, NA, 28, 28, 16, 28, NA, 27, 26, 27, 26, 26, NA, 0, 30, NA, 24, NA, 27, NA, NA, NA, NA, NA, 28, 25, 27, NA, NA, NA, NA, 26, 0, NA, NA, 28, 26, 16, 28, NA, NA, NA, 28, NA, 28, NA, NA, NA, NA, NA, NA, NA, NA, NA, 25, NA, NA, NA, NA, NA, NA, 22, 27, NA, NA, NA, 28, NA, NA, NA, 28, 28, NA, 28, NA, 26, 20, 25, NA, NA, NA, 30, NA, NA, NA, 19, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -132L), class = c("tbl_df", "tbl", "data.frame"))

Yours sincerely,
Bharat Rawlley

Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

Michael Dewey-3
See comments inline

On 19/01/2021 10:46, bharat rawlley wrote:

> On 18/01/2021 17:26, bharat rawlley via R-help wrote:
>  > Hello,
>  > On running the Wilcoxon Rank Sum test in R and SPSS, I am getting the
> following discrepancies which I am unable to explain.
>  > Q1 In the attached data set, I was trying to compare freq4w_n in
> those with drug_code 0 vs 1. SPSS gives a P value 0.031 vs R gives a P
> value 0.001779.
>  > The code I used in R is as follows -
>  > wilcox.test(freq4w_n, drug_code, conf.int = T)

If I store your data in dat and then go

wilcox.test(freq4w_n ~ drug_code, dat)

I get a p-value of 0.031 agreeing with SPSS

The reason you are getting something different is that you are not
specifying the first two parameters to wilcox.test() correctly.
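
The difference between the two calling conventions can be sketched as follows. Called as wilcox.test(x, y), the function treats x and y as two separate samples, so passing the outcome and the 0/1 group code compares the outcome values with the codes themselves. The formula form splits the outcome by group instead. The data frame below and the names `score` and `group` are made up for illustration.

```r
# Made-up illustrative data: 'score' is the outcome, 'group' is a 0/1 code
dat <- data.frame(
  group = c(0, 0, 0, 0, 1, 1, 1, 1),
  score = c(1.1, 2.3, 3.8, 4.2, 5.5, 6.1, 7.4, 8.9)
)

# Two-vector form: compares the scores with the 0/1 codes themselves,
# which is not the intended comparison.
# suppressWarnings() hides the ties warning caused by the repeated codes.
wrong <- suppressWarnings(wilcox.test(dat$score, dat$group))

# Formula form: splits 'score' into two samples according to 'group'
right <- wilcox.test(score ~ group, data = dat)

# Equivalent to the formula form: split the scores by hand
by_hand <- wilcox.test(dat$score[dat$group == 0], dat$score[dat$group == 1])

c(wrong = wrong$p.value, right = right$p.value, by_hand = by_hand$p.value)
```

The last two p-values agree with each other, while the first answers a different and meaningless question.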

--
Michael
http://www.dewey.myzen.co.uk/home.html


Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

John Fox
Dear Bharat Rawlley,

What you tried to do appears to be nonsense. That is, you're treating
PFD_n and drug_code as if they were scores for two different groups.

I assume that what you really want to do is to treat PFD_n as a vector
of scores and drug_code as defining two groups. If that's correct, and
with your data in Data, you can try the following:

------snip ------

 > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE)

        Wilcoxon rank sum test with continuity correction

data:  PFD_n by drug_code
W = 197, p-value = 0.05563
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
  -2.000014e+00  5.037654e-05
sample estimates:
difference in location
              -1.000019

Warning messages:
1: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
   cannot compute exact p-value with ties
2: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
   cannot compute exact confidence intervals with ties

------snip ------

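The warnings arise because the ties rule out an exact test, so R falls
back on the normal approximation. Specifying exact=FALSE requests that
approximation explicitly, which gives the same results without the warnings: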

------snip ------

 > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE, exact=FALSE)

        Wilcoxon rank sum test with continuity correction

data:  PFD_n by drug_code
W = 197, p-value = 0.05563
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
  -2.000014e+00  5.037654e-05
sample estimates:
difference in location
              -1.000019

------snip ------

As it turns out, your data are highly discrete and have a lot of ties
(see in particular PFD_n = 28):

------snip ------

 > xtabs(~ PFD_n + drug_code, data=Data)

      drug_code
PFD_n  0  1
    0   2  0
    16  1  1
    18  0  1
    19  0  1
    20  2  0
    22  0  1
    24  2  0
    25  1  2
    26  5  2
    27  4  2
    28  5 13
    30  1  2

------snip ------

I'm no expert in nonparametric inference, but I doubt whether the
approximate p-value will be very accurate for data like these.
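
One way to sidestep the normal approximation with heavily tied data is a Monte Carlo permutation version of the rank sum test. The following is a minimal base-R sketch, not something proposed in this thread: `perm_ranksum` is a hypothetical helper, the data are made up, and the two-sided p-value is based on the absolute deviation of the rank sum from its null expectation.

```r
# Monte Carlo permutation version of the rank sum test (base R only).
# 'perm_ranksum' is a hypothetical helper written for this illustration.
perm_ranksum <- function(score, group, n_perm = 10000, seed = 1) {
  keep <- !is.na(score) & !is.na(group)  # drop incomplete cases
  score <- score[keep]
  group <- group[keep]
  r <- rank(score)                       # midranks handle the ties
  g1 <- group == group[1]                # membership of one of the two groups
  n1 <- sum(g1)
  centre <- n1 * (length(r) + 1) / 2     # expected rank sum under the null
  observed <- abs(sum(r[g1]) - centre)
  set.seed(seed)                         # note: resets the global RNG state
  permuted <- replicate(n_perm, abs(sum(sample(r, n1)) - centre))
  # Add-one correction so the estimate is never exactly zero
  (sum(permuted >= observed - 1e-8) + 1) / (n_perm + 1)
}

# Made-up, heavily tied data (not the poster's)
score <- c(28, 28, 28, 27, 30, 28, 26, 26, 25, 27, 20, 24, NA, 0)
group <- c( 1,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  1, 0)
p_perm <- perm_ranksum(score, group)
p_perm
```

The estimate varies slightly with `seed` and `n_perm`; increasing `n_perm` reduces the Monte Carlo error.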

I don't know why wilcox.test() (correctly used) and SPSS are giving you
slightly different results -- assuming that you're actually doing the
same thing in both cases. I couldn't help but notice that most of your
data are missing. Are you getting the same value of the test statistic
and different p-values, or is the test statistic different as well?

I hope this helps,
  John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/


Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

R help mailing list-2
Dear Professor John,
Thank you very much for your reply!
I agree with you that the non-parametric tests I mentioned in my previous email (Mood's median test and the median test) do not make sense in this situation, as they treat PFD_n and drug_code as different groups. As you correctly said, I want to use PFD_n as a vector of scores and drug_code to split it into two groups. This is exactly what the independent-samples median test does in SPSS. I wish to perform the same test in R but have been unable to do so.
Simply put, how can I perform the independent-samples median test in R just as it is performed in SPSS?

Secondly, regarding your question about the test statistic: I have not performed the Wilcoxon rank sum test in SPSS for the PFD_n and drug_code data. I said something to the contrary in my first email, and I apologize for that.
Thank you very much for your time!
Yours sincerely,
Bharat Rawlley

Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

John Fox
Dear Bharat Rawlley,

On 2021-01-20 1:45 p.m., bharat rawlley via R-help wrote:
>   Dear Professor John,
> Thank you very much for your reply!
> I agree with you that the non-parametric tests I mentioned in my previous email (Moods median test and Median test) do not make sense in this situation as they treat PFD_n and drug_code as different groups. As you correctly said, I want to use PFD_n as a vector of scores and drug_code to make two groups out of it. This is exactly what the Independent samples median test does in SPSS. I wish to perform the same test in R and am unable to do so.
> Simply put, I am asking how to perform the Independent samples median test in R just like it is performed in SPSS?

I'm afraid that I'm the wrong person to ask, since I haven't used SPSS
in perhaps 30 years and have no idea what it does to test for
differences in medians. A Google search for "independent samples median
test in R" turns up a number of hits.
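
For reference, a two-group median test in the style of Mood's test can be sketched in base R by cross-classifying each score as above or not above the pooled median and testing the resulting 2 x 2 table. This is an illustration under stated assumptions, and it may not match what SPSS computes: `median_test` is a hypothetical helper, values equal to the pooled median are counted as "not above", and Fisher's exact test is used in place of the usual chi-square because the made-up sample is small.

```r
# 'median_test' is a hypothetical helper, not a function from any package
median_test <- function(score, group) {
  keep <- !is.na(score) & !is.na(group)  # drop incomplete cases
  score <- score[keep]
  group <- group[keep]
  grand_median <- median(score)          # median of the pooled sample
  # Assumed convention: values equal to the pooled median count as "not above"
  above <- factor(score > grand_median, levels = c(FALSE, TRUE))
  tab <- table(above = above, group = group)
  # Fisher's exact test on the 2 x 2 table; chisq.test(tab) would be the
  # large-sample alternative
  list(table = tab, fisher = fisher.test(tab))
}

# Made-up data (not the poster's)
score <- c(12, 15, 11, 14, 13, 22, 25, 21, 24, NA)
group <- c( 0,  0,  0,  0,  0,  1,  1,  1,  1,  1)
res <- median_test(score, group)
res$table
res$fisher$p.value
```

How ties at the median are handled, and whether a continuity correction is applied, differ between implementations, so small discrepancies between packages are to be expected.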

>
> Secondly, for the question you are asking about the test statistic, I have not performed the Wilcoxon Rank sum test in SPSS for the PFD_n and drug_code data. I have said something to the contrary in my first email, I apologize for that.

For continuous data, the Wilcoxon test is, I believe, a reasonable
choice, but not when there are so many ties. If SPSS doesn't perform a
Wilcoxon test for a difference in medians, then there's of course no
reason to expect that the p-values would be the same.

Best,
  John

> Thank you very much for your time!
> Yours sincerelyBharat Rawlley    On Wednesday, 20 January, 2021, 04:47:21 am IST, John Fox <[hidden email]> wrote:
>  
>   Dear Bharat Rawlley,
>
> What you tried to do appears to be nonsense. That is, you're treating
> PFD_n and drug_code as if they were scores for two different groups.
>
> I assume that what you really want to do is to treat PFD_n as a vector
> of scores and drug_code as defining two groups. If that's correct, and
> with your data into Data, you can try the following:
>
> ------snip ------
>
>   > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE)
>
>      Wilcoxon rank sum test with continuity correction
>
> data:  PFD_n by drug_code
> W = 197, p-value = 0.05563
> alternative hypothesis: true location shift is not equal to 0
> 95 percent confidence interval:
>    -2.000014e+00  5.037654e-05
> sample estimates:
> difference in location
>                -1.000019
>
> Warning messages:
> 1: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
>    cannot compute exact p-value with ties
> 2: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
>    cannot compute exact confidence intervals with ties
>
> ------snip ------
>
> You can get an approximate confidence interval by specifying exact=FALSE:
>
> ------snip ------
>
>   > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE, exact=FALSE)
>
>      Wilcoxon rank sum test with continuity correction
>
> data:  PFD_n by drug_code
> W = 197, p-value = 0.05563
> alternative hypothesis: true location shift is not equal to 0
> 95 percent confidence interval:
>    -2.000014e+00  5.037654e-05
> sample estimates:
> difference in location
>                -1.000019
>
> ------snip ------
>
> As it turns out, your data are highly discrete and have a lot of ties
> (see in particular PFD_n = 28):
>
> ------snip ------
>
>   > xtabs(~ PFD_n + drug_code, data=Data)
>
>        drug_code
> PFD_n  0  1
>      0  2  0
>      16  1  1
>      18  0  1
>      19  0  1
>      20  2  0
>      22  0  1
>      24  2  0
>      25  1  2
>      26  5  2
>      27  4  2
>      28  5 13
>      30  1  2
>
> ------snip ------
>
> I'm no expert in nonparametric inference, but I doubt whether the
> approximate p-value will be very accurate for data like these.
>
> I don't know why wilcox.test() (correctly used) and SPSS are giving you
> slightly different results -- assuming that you're actually doing the
> same thing in both cases. I couldn't help but notice that most of your
> data are missing. Are you getting the same value of the test statistic
> and different p-values, or is the test statistic different as well?
>
> I hope this helps,
>    John
>
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
> On 2021-01-19 5:46 a.m., bharat rawlley via R-help wrote:
>>    Thank you for the reply and suggestion, Michael!
>> I used dput() and this is the output I can share with you. Simply explained, I have 3 columns namely, drug_code, freq4w_n and PFD_n. Each column has 132 values (including NA). The problem with the Wilcoxon Rank Sum test has been described in my first email.
>> Please do let me know if you need any further clarification from my side! Thanks a lot for your time!
>> structure(list(drug_code = c(0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0), freq4w_n = c(1, NA, NA, 0, NA, 4, NA, 10, NA, 0, 6, NA, NA, NA, NA, NA, 10, NA, 0, NA, NA, NA, NA, 0, NA, 0, NA, NA, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 12, 0, NA, 1, 2, 1, 2, 2, NA, 28, 0, NA, 4, NA, 1, NA, NA, NA, NA, NA, 0, 3, 1, NA, NA, NA, NA, 4, 28, NA, NA, 0, 2, 12, 0, NA, NA, NA, 0, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA, NA, 6, 1, NA, NA, NA, 0, NA, NA, NA, 0, 0, NA, 0, NA, 2, 8, 3, NA, NA, NA, 0, NA, NA, NA, 9, NA, NA, NA, NA, NA, NA, NA, NA), PFD_n = c(27, NA, NA, 28, NA, 26, NA, 20, NA, 30, 24, NA, NA, NA, NA, NA, 18, NA, 28, NA, NA, NA, NA, 28, NA, 28, NA, NA, NA, 28, NA, 28, NA, NA, NA, NA, NA, NA, NA, NA, 28, 28, 16, 28, NA, 27, 26, 27, 26, 26, NA, 0, 30, NA, 24, NA, 27, NA, NA, NA, NA, NA, 28, 25, 27, NA, NA, NA, NA, 26, 0, NA, NA, 28, 26, 16, 28, NA, NA, NA, 28, NA, 28, NA, NA, NA, NA, NA, NA, NA, NA, NA, 25, NA, NA, NA, NA, NA, NA, 22, 27, NA, NA, NA, 28, NA, NA, NA, 28, 28, NA, 28, NA, 26, 20, 25, NA, NA, NA, 30, NA, NA, NA, 19, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, -132L), class = c("tbl_df", "tbl", "data.frame"))
>>
>> Yours sincerely
>> Bharat Rawlley
>>
>> On Tuesday, 19 January, 2021, 03:53:27 pm IST, Michael Dewey <[hidden email]> wrote:
>>
>> Unfortunately your data did not come through. Try using dput() and then
>> pasting that into the body of your e-mail message.
>>
>> On 18/01/2021 17:26, bharat rawlley via R-help wrote:
>>> Hello,
>>> On running the Wilcoxon Rank Sum test in R and SPSS, I am getting the following discrepancies which I am unable to explain.
>>> Q1 In the attached data set, I was trying to compare freq4w_n in those with drug_code 0 vs 1. SPSS gives a P value 0.031 vs R gives a P value 0.001779.
>>> The code I used in R is as follows -
>>> wilcox.test(freq4w_n, drug_code, conf.int = T)
>>>
>>>
>>> Q2 Similarly, in the same data set, when trying to compare PFD_n in those with drug_code 0 vs 1, SPSS gives a P value 0.038 vs R gives a P value < 2.2e-16.
>>> The code I used in R is as follows -
>>> wilcox.test(PFD_n, drug_code, mu = 0, alternative = "two.sided", correct = TRUE, paired = FALSE, conf.int = TRUE)
>>>
>>>
>>> I have tried searching on Google and watching some Youtube tutorials, I cannot find an answer, Any help will be really appreciated, Thank you!

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

Re: Different results on running Wilcoxon Rank Sum test in R and SPSS

R help mailing list-2
Thank you for your time, Professor John! Much appreciated! 
Yours sincerely Bharat Rawlley 



On Thu, 21 Jan 2021 at 4:40 AM, John Fox <[hidden email]> wrote:

Dear Bharat Rawlley,

On 2021-01-20 1:45 p.m., bharat rawlley via R-help wrote:
>  Dear Professor John,
> Thank you very much for your reply!
> I agree with you that the non-parametric tests I mentioned in my previous email (Mood's median test and the median test) do not make sense in this situation, as I ran them treating PFD_n and drug_code as different groups. As you correctly said, I want to use PFD_n as a vector of scores and drug_code to split it into two groups. This is exactly what the Independent-Samples Median Test does in SPSS. I wish to perform the same test in R but have been unable to do so.
> Simply put, I am asking how to perform the Independent samples median test in R just like it is performed in SPSS?

I'm afraid that I'm the wrong person to ask, since I haven't used SPSS
in perhaps 30 years and have no idea what it does to test for
differences in medians. A Google search for "independent samples median
test in R" turns up a number of hits.
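
One possible reading of the SPSS "independent samples median test", which I have not checked against SPSS itself, is a Mood-type median test: split the scores at the pooled median, cross-tabulate against the group, and run a chi-square test on the 2 x 2 table. The sketch below does that in base R. The data frame is rebuilt from the xtabs(~ PFD_n + drug_code) counts quoted further down in this thread (complete cases only), and the names pfd, n0, n1 and tab are mine. The choice of "above the median" versus "at or below" and the use of Yates' correction are assumptions.

```r
# Data rebuilt from the xtabs counts quoted in this thread (complete cases only).
pfd <- c(0, 16, 18, 19, 20, 22, 24, 25, 26, 27, 28, 30)
n0  <- c(2, 1, 0, 0, 2, 0, 2, 1, 5, 4, 5, 1)    # counts for drug_code 0
n1  <- c(0, 1, 1, 1, 0, 1, 0, 2, 2, 2, 13, 2)   # counts for drug_code 1
Data <- data.frame(
  PFD_n     = c(rep(pfd, n0), rep(pfd, n1)),
  drug_code = rep(c(0, 1), times = c(sum(n0), sum(n1)))
)

# Split at the pooled (grand) median: strictly above vs at or below.
grand_median <- median(Data$PFD_n)   # 27 for these data
tab <- table(drug_code    = Data$drug_code,
             above_median = Data$PFD_n > grand_median)
tab

# Chi-square test on the 2 x 2 table, with and without Yates' correction.
res_yates <- chisq.test(tab, correct = TRUE)
res_plain <- chisq.test(tab, correct = FALSE)
res_yates$p.value   # about 0.038
res_plain$p.value   # about 0.018
```

The continuity-corrected p-value of about 0.038 agrees with the figure reported from SPSS for PFD_n, which makes this reading plausible, but it is not confirmation of what SPSS computes.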

>
> Secondly, regarding your question about the test statistic: I did not actually perform the Wilcoxon rank sum test in SPSS for the PFD_n and drug_code data. I said the contrary in my first email, and I apologize for that.

For continuous data, the Wilcoxon test is, I believe, a reasonable
choice, but not when there are so many ties. If SPSS doesn't perform a
Wilcoxon test for a difference in medians, then there's of course no
reason to expect that the p-values would be the same.
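
To see how heavily tied these data are, and to get a p-value that does not lean on the normal approximation, one option (not something proposed in the thread) is a Monte Carlo permutation version of the rank sum test. The sketch below is base R only. The data frame is again rebuilt from the xtabs counts quoted later in this thread, the helper w_stat and the seed are illustrative, and the two-sided p-value is defined here as the share of permutations at least as far from the null mean of W as the observed value, which is one convention among several.

```r
# Data rebuilt from the xtabs counts quoted in this thread (complete cases only).
pfd <- c(0, 16, 18, 19, 20, 22, 24, 25, 26, 27, 28, 30)
n0  <- c(2, 1, 0, 0, 2, 0, 2, 1, 5, 4, 5, 1)    # counts for drug_code 0
n1  <- c(0, 1, 1, 1, 0, 1, 0, 2, 2, 2, 13, 2)   # counts for drug_code 1
Data <- data.frame(
  PFD_n     = c(rep(pfd, n0), rep(pfd, n1)),
  drug_code = rep(c(0, 1), times = c(sum(n0), sum(n1)))
)

# Size of the largest tie group: 18 of the 48 scores equal 28.
max(table(Data$PFD_n))

# Midranks are fixed; only the group labels are shuffled.
r       <- rank(Data$PFD_n)
n_first <- sum(Data$drug_code == 0)

# W as reported by wilcox.test(): rank sum of the first group minus its minimum.
w_stat <- function(g) sum(r[g == 0]) - n_first * (n_first + 1) / 2

w_obs <- w_stat(Data$drug_code)   # 197, matching the wilcox.test() output

set.seed(2021)                    # arbitrary seed, for reproducibility only
w_perm <- replicate(9999, w_stat(sample(Data$drug_code)))

# Two-sided Monte Carlo p-value around the null mean n0 * n1 / 2.
center <- n_first * (nrow(Data) - n_first) / 2
p_perm <- (1 + sum(abs(w_perm - center) >= abs(w_obs - center))) /
  (length(w_perm) + 1)
p_perm
```

Comparing p_perm with the approximate p-value from wilcox.test() gives a rough idea of how much the approximation can be trusted for data like these.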

Best,
  John

> Thank you very much for your time!
> Yours sincerely
> Bharat Rawlley
>
> On Wednesday, 20 January, 2021, 04:47:21 am IST, John Fox <[hidden email]> wrote:

>  Dear Bharat Rawlley,
>
> What you tried to do appears to be nonsense. That is, you're treating
> PFD_n and drug_code as if they were scores for two different groups.
>
> I assume that what you really want to do is to treat PFD_n as a vector
> of scores and drug_code as defining two groups. If that's correct, and
> with your data read into a data frame named Data, you can try the following:
>
> ------snip ------
>
>  > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE)
>
>      Wilcoxon rank sum test with continuity correction
>
> data:  PFD_n by drug_code
> W = 197, p-value = 0.05563
> alternative hypothesis: true location shift is not equal to 0
> 95 percent confidence interval:
>    -2.000014e+00  5.037654e-05
> sample estimates:
> difference in location
>                -1.000019
>
> Warning messages:
> 1: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
>    cannot compute exact p-value with ties
> 2: In wilcox.test.default(x = c(27, 26, 20, 24, 28, 28, 27, 27, 26,  :
>    cannot compute exact confidence intervals with ties
>
> ------snip ------
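
The difference between the two calling conventions can be reproduced with a small sketch. It assumes the data frame can be rebuilt from the xtabs(~ PFD_n + drug_code) counts quoted in this message (complete cases only, so the two-vector result will not match the "< 2.2e-16" from the original post exactly); the names pfd, n0, n1, wrong and right are illustrative.

```r
# Data rebuilt from the xtabs counts quoted in this thread (complete cases only).
pfd <- c(0, 16, 18, 19, 20, 22, 24, 25, 26, 27, 28, 30)
n0  <- c(2, 1, 0, 0, 2, 0, 2, 1, 5, 4, 5, 1)    # counts for drug_code 0
n1  <- c(0, 1, 1, 1, 0, 1, 0, 2, 2, 2, 13, 2)   # counts for drug_code 1
Data <- data.frame(
  PFD_n     = c(rep(pfd, n0), rep(pfd, n1)),
  drug_code = rep(c(0, 1), times = c(sum(n0), sum(n1)))
)

# Two-vector form: treats the PFD_n scores and the 0/1 codes as two samples,
# so it compares values around 27 with values that are all 0 or 1.
wrong <- suppressWarnings(wilcox.test(Data$PFD_n, Data$drug_code))

# Formula form: compares PFD_n between the drug_code 0 and drug_code 1 groups.
right <- suppressWarnings(wilcox.test(PFD_n ~ drug_code, data = Data))

unname(right$statistic)   # 197
right$p.value             # about 0.0556
wrong$p.value             # tiny, and not a test of the drug effect
```

The warnings suppressed here are the "cannot compute exact p-value with ties" messages shown in the output above.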
>
> You can get an approximate confidence interval by specifying exact=FALSE:
>
> ------snip ------
>
>  > wilcox.test(PFD_n ~ drug_code, data=Data, conf.int=TRUE, exact=FALSE)
>
>      Wilcoxon rank sum test with continuity correction
>
> data:  PFD_n by drug_code
> W = 197, p-value = 0.05563
> alternative hypothesis: true location shift is not equal to 0
> 95 percent confidence interval:
>    -2.000014e+00  5.037654e-05
> sample estimates:
> difference in location
>                -1.000019
>
> ------snip ------
>
> As it turns out, your data are highly discrete and have a lot of ties
> (see in particular PFD_n = 28):
>
> ------snip ------
>
>  > xtabs(~ PFD_n + drug_code, data=Data)
>
>        drug_code
> PFD_n  0  1
>      0  2  0
>      16  1  1
>      18  0  1
>      19  0  1
>      20  2  0
>      22  0  1
>      24  2  0
>      25  1  2
>      26  5  2
>      27  4  2
>      28  5 13
>      30  1  2
>
> ------snip ------
>
> I'm no expert in nonparametric inference, but I doubt whether the
> approximate p-value will be very accurate for data like these.
>
> I don't know why wilcox.test() (correctly used) and SPSS are giving you
> slightly different results -- assuming that you're actually doing the
> same thing in both cases. I couldn't help but notice that most of your
> data are missing. Are you getting the same value of the test statistic
> and different p-values, or is the test statistic different as well?
>
> I hope this helps,
>    John
>
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
 
