ks.test() with 2 samples vs. 1 sample and distr. function

ks.test() with 2 samples vs. 1 sample and distr. function

Tonja Krueger
Dear all,
I have a question concerning the ks.test() function. I tried to reproduce the example given on the German Wikipedia page.
xi <- c(9.41,9.92,11.55,11.6,11.73,12,12.06,13.3)
I get the right results when I calculate: ks.test(xi,pnorm,11,1)
Now the question: shouldn't I obtain the same or a very similar result if I compare the sample with a sample calculated from the distribution?
p<- c(0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875, 0.9999)
x <- qnorm(p,11,1)
ks.test(xi,x)
Why don't I?
Thanks for helping me!
Tonja

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: ks.test() with 2 samples vs. 1 sample and distr. function

David Carlson
In the first example you are performing a one-sample test against a continuous cumulative distribution (in this case a normal distribution). In the second case you are performing a two-sample test. You drew your values for x non-randomly by specifying fixed intervals along a normal distribution, but ks.test() just sees that you have provided two samples, not one sample and values along a cumulative distribution.
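To see the difference described above, the two calls can be run side by side. This is only a minimal sketch using the data from the question; the variable names one_sample, two_sample, d_one and d_two are made up for illustration, and "pnorm" is passed as a character string, which ks.test() accepts as well as the function itself.

```r
# Data from the original question
xi <- c(9.41, 9.92, 11.55, 11.6, 11.73, 12, 12.06, 13.3)
p  <- c(0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875, 0.9999)
x  <- qnorm(p, mean = 11, sd = 1)

# One-sample test: xi against the continuous N(11, 1) distribution function
one_sample <- ks.test(xi, "pnorm", mean = 11, sd = 1)

# Two-sample test: x is a numeric vector, so it is treated as a second sample
two_sample <- ks.test(xi, x)

# The method label shows which variant ks.test() actually ran
print(one_sample$method)
print(two_sample$method)

# The test statistics themselves differ (hypothetical names d_one, d_two)
d_one <- unname(one_sample$statistic)
d_two <- unname(two_sample$statistic)
print(c(one_sample = d_one, two_sample = d_two))
```

The exact wording of the method label depends on the R version, so it is printed here rather than assumed.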

----------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352


-----Original Message-----
From: R-help [mailto:[hidden email]] On Behalf Of [hidden email]
Sent: Wednesday, November 15, 2017 3:47 AM
To: [hidden email]
Subject: [R] ks.test() with 2 samples vs. 1 sample an distr. function

Re: ks.test() with 2 samples vs. 1 sample and distr. function

Peter Dalgaard-2
I suspect that that reply just replicates the question.

There are two issues. The distribution of the test statistic is different, which may be unsurprising. However, the test statistic itself is also different, which may be a bit more subtle. It may help to plot(ecdf(xi)), and similarly for x. The 2-sample KS statistic is the maximum vertical distance between two step functions, so with 2x8 points it will be a multiple of .125. The 1-sample version is the maximum distance between a step function and a smooth curve.
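A rough sketch of that point, computing both statistics by hand for the data in the question. The names xs, F0, Fxi, Fx, grid, d_one and d_two are invented for this illustration, and the hand computation is meant as a check of the idea, not as a description of how ks.test() is implemented internally.

```r
xi <- c(9.41, 9.92, 11.55, 11.6, 11.73, 12, 12.06, 13.3)
p  <- c(0.125, 0.250, 0.375, 0.500, 0.625, 0.750, 0.875, 0.9999)
x  <- qnorm(p, mean = 11, sd = 1)

# 1-sample statistic: largest gap between the ECDF of xi (a step function)
# and the smooth N(11, 1) curve, checked just after and just before each jump
n  <- length(xi)
xs <- sort(xi)
F0 <- pnorm(xs, mean = 11, sd = 1)
d_one <- max(seq_len(n) / n - F0, F0 - (seq_len(n) - 1) / n)

# 2-sample statistic: largest gap between two step functions,
# evaluated at every observed point of either sample
Fxi  <- ecdf(xi)
Fx   <- ecdf(x)
grid <- sort(c(xi, x))
d_two <- max(abs(Fxi(grid) - Fx(grid)))

print(d_one)
print(d_two)
print(d_two / 0.125)  # a whole number, since both ECDFs jump in steps of 1/8

# Optional picture of the two situations (only drawn in an interactive session)
if (interactive()) {
  plot(Fxi, main = "ECDF of xi, ECDF of x, and N(11, 1)")
  plot(Fx, add = TRUE, col = "red")
  curve(pnorm(x, mean = 11, sd = 1), add = TRUE, col = "blue")
}
```

Both hand-computed values should agree with the statistic component returned by the corresponding ks.test() call.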

-pd

 

> On 15 Nov 2017, at 16:56 , David L Carlson <[hidden email]> wrote:
>
> In the first example you are performing a one-sample test against a continuous cumulative distribution (in this case a normal distribution). In the second case you are performing a two-sample test. You drew your values for x non-randomly by specifying fixed intervals along a normal distribution, but ks.test() just sees that you have provided two samples, not one sample and values along a cumulative distribution.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]
