P-values Kolmogorov–Smirnov test Classic List Threaded 8 messages Open this post in threaded view
|

P-values Kolmogorov–Smirnov test

 Hello, I am trying to perform a Kolmogorov–Smirnov test to assess the difference between a distribution and samples drawn proportionally to size of different sizes. I managed to compute the Kolmogorov–Smirnov distance but I am lost with the p-value. I have looked into the ks.test function unsuccessfully. Can anyone help me with computing p-values for a two-tailed test? Below a simplified version of my code. Thanks in advance. Gianluca library(spatstat) #reference distribution d_1 <- sort(rpois(1000, 500)) p_1 <- d_1/sum(d_1) m_1 <- data.frame(d_1, p_1) #data frame to store the values of the siumation d_stat <- data.frame(1:1000, NA, NA) names(d_stat) <- c("sample_size", "ks_distance", "p_value") #simulation for (i in 1:1000) {   #sample from the reference distribution   m_2 <-m_1[(sample(nrow(m_1), size=i, prob=p_1, replace=F)),]   m_2 <-m_2[order(m_2\$d_1),]   d_2 <- m_2\$d_1   p_2 <- m_2\$p_1   #weighted ecdf for the reference distribution and the sample   f_d_1 <- ewcdf(d_1, normalise=F)   f_d_2 <- ewcdf(d_2, 1/p_2, normalise=F, adjust=1/length(d_2))   #kolmogorov-smirnov distance   d_stat[i,2] <- max(abs(f_d_1(d_2) - f_d_2(d_2))) }         [[alternative HTML version deleted]] ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

Re: P-values Kolmogorov–Smirnov test

 Hello, I don't have the algorithms at hand but the KS statistic calculation is more complicated than your max/abs difference. Anyway, why not use ks.test? it's not that difficult: set.seed(1234) #reference distribution d_1 <- sort(rpois(1000, 500)) p_1 <- d_1/sum(d_1) m_1 <- data.frame(d_1, p_1) #data frame to store the values of the simulation d_stat <- data.frame(1:1000, NA, NA) names(d_stat) <- c("sample_size", "ks_distance", "p_value") #simulation for (i in 1:1000) {    #sample from the reference distribution    m_2 <-m_1[(sample(nrow(m_1), size=i, prob=p_1, replace=F)),]    d_2 <- m_2\$d_1    ht <- ks.test(d_1, d_2)    #kolmogorov-smirnov distance    d_stat[i, 2] <- ht\$statistic    d_stat[i, 3] <- ht\$p.value } hist(d_stat[, 2]) hist(d_stat[, 3]) Note that d_2 is not sorted, but the results are equal in the sense of function identical(), meaning they are *exactly* the same. Why shouldn't they? Hope this helps, Rui Barradas Às 17:06 de 05/09/19, Boo G. escreveu: > Hello, > > I am trying to perform a Kolmogorov–Smirnov test to assess the difference between a distribution and samples drawn proportionally to size of different sizes. I managed to compute the Kolmogorov–Smirnov distance but I am lost with the p-value. I have looked into the ks.test function unsuccessfully. Can anyone help me with computing p-values for a two-tailed test? > > Below a simplified version of my code. > > Thanks in advance. > Gianluca > > > library(spatstat) > > #reference distribution > d_1 <- sort(rpois(1000, 500)) > p_1 <- d_1/sum(d_1) > m_1 <- data.frame(d_1, p_1) > > #data frame to store the values of the siumation > d_stat <- data.frame(1:1000, NA, NA) > names(d_stat) <- c("sample_size", "ks_distance", "p_value") > > #simulation > for (i in 1:1000) { >    #sample from the reference distribution >    m_2 <-m_1[(sample(nrow(m_1), size=i, prob=p_1, replace=F)),] >    m_2 <-m_2[order(m_2\$d_1),] >    d_2 <- m_2\$d_1 >    p_2 <- m_2\$p_1 > >    #weighted ecdf for the reference distribution and the sample >    f_d_1 <- ewcdf(d_1, normalise=F) >    f_d_2 <- ewcdf(d_2, 1/p_2, normalise=F, adjust=1/length(d_2)) > >    #kolmogorov-smirnov distance >    d_stat[i,2] <- max(abs(f_d_1(d_2) - f_d_2(d_2))) > } > > > [[alternative HTML version deleted]] > > ______________________________________________ > [hidden email] mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ [hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

Re: P-values Kolmogorov–Smirnov test

Open this post in threaded view
|

Re: P-values Kolmogorov–Smirnov test

Open this post in threaded view
|

Re: P-values Kolmogorov–Smirnov test

Open this post in threaded view
|