# Difference between cor.test() and formula everyone says to use

4 messages

## Difference between cor.test() and formula everyone says to use

I'm trying to understand how cor.test() is calculating the p-value of a correlation. It gives a p-value based on t, but every text I've ever seen gives the calculation based on z.

For example:

    > data(cars)
    > with(cars[1:10, ], cor.test(speed, dist))

            Pearson's product-moment correlation

    data:  speed and dist
    t = 2.3893, df = 8, p-value = 0.04391
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0.02641348 0.90658582
    sample estimates:
          cor
    0.6453079

But when I use the regular formula:

    > r <- cor(cars[1:10, ])[1, 2]
    > r.z <- fisherz(r)
    > se <- 1/sqrt(10 - 3)
    > z <- r.z / se
    > (1 - pnorm(z))*2
    [1] 0.04237039

My p-value is different. The help file for cor.test doesn't (seem to) have any reference to this, and I can see in the source code that it is doing something different. I'm just not sure what.

Thanks,

Jeremy

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
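For readers who want to reproduce both numbers side by side, here is a minimal sketch in base R. It assumes that the `fisherz()` call in the post computes the usual Fisher transformation (the function is not part of base R and presumably comes from an add-on package), so `atanh()` is used in its place. The variable names are illustrative only.

```r
# First ten rows of the built-in cars data, as in the post
d <- cars[1:10, ]
n <- nrow(d)
r <- cor(d$speed, d$dist)

# t-based p-value: the statistic cor.test() reports, referred to t with n - 2 df
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_t <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# z-based p-value: Fisher's transformation atanh(r), standard error 1 / sqrt(n - 3)
z_stat <- atanh(r) * sqrt(n - 3)
p_z <- 2 * pnorm(abs(z_stat), lower.tail = FALSE)

# The two p-values differ slightly, as described in the post
c(t_based = p_t, z_based = p_z)
```

The two-sided tail is computed with `lower.tail = FALSE` and `abs()` so the same code also works for negative correlations, where `(1 - pnorm(z)) * 2` would exceed 1.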

## Re: Difference between cor.test() and formula everyone says to use

Hi Jeremy,

I don't know about references, but this is around. See for example: http://afni.nimh.nih.gov/sscc/gangc/tr.html

The relevant line in cor.test is:

    STATISTIC <- c(t = sqrt(df) * r/sqrt(1 - r^2))

You can convert *t*s to *r*s and vice versa.

Best,

Josh

On Fri, Oct 17, 2014 at 10:32 AM, Jeremy Miles <[hidden email]> wrote:

> I'm trying to understand how cor.test() is calculating the p-value of a correlation. [...]

--
Joshua F. Wiley
Ph.D. Student, UCLA Department of Psychology
http://joshuawiley.com/
Senior Analyst, Elkhart Group Ltd.
http://elkhartgroup.com
Office: 260.673.5518
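The conversion between t and r mentioned in this reply can be sketched as a pair of helper functions. The names `r_to_t` and `t_to_r` are hypothetical, not part of R; the forward direction copies the expression quoted from cor.test, and the inverse follows by solving that expression for r.

```r
# Forward: the statistic quoted from the cor.test source
r_to_t <- function(r, df) sqrt(df) * r / sqrt(1 - r^2)

# Inverse: solve t = sqrt(df) * r / sqrt(1 - r^2) for r
t_to_r <- function(t, df) t / sqrt(t^2 + df)

# Values from the output quoted in the thread: r = 0.6453079 with df = 8
t_val <- r_to_t(0.6453079, df = 8)
t_val                   # about 2.3893, the t reported by cor.test()
t_to_r(t_val, df = 8)   # recovers the original correlation
```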
## Re: Difference between cor.test() and formula everyone says to use

The statistic $\mathrm{ndf} \cdot r^2 / (1 - r^2)$ follows an $F(1, \mathrm{ndf})$ distribution when the true value is $\rho = 0$. Its signed square root is the t statistic that cor.test() reports, so the t-test is the correct test for $\rho = 0$.

Fisher's z is an asymptotically normal transformation for any value of $\rho$. Thus Fisher's z is better for testing $\rho = \rho_0$ or $\rho_1 = \rho_2$.

The two statistics will not give the same p-value at $\rho = 0$ because they are based on different assumptions: the t statistic uses the exact null distribution, while Fisher's z relies on a normal approximation.

Jeremy Miles <[hidden email]> wrote on 10/16/2014 07:32 PM:

> I'm trying to understand how cor.test() is calculating the p-value of a correlation. [...]
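The relationships described in this reply can be checked numerically. The following is only a sketch on the same ten rows of `cars`. The null value `rho0 = 0.3` is an arbitrary, hypothetical choice to illustrate a test of $\rho = \rho_0$, and the interval uses the standard Fisher z construction with standard error $1/\sqrt{n - 3}$.

```r
d <- cars[1:10, ]
n <- nrow(d)
ndf <- n - 2
r <- cor(d$speed, d$dist)

# Under rho = 0: ndf * r^2 / (1 - r^2) is F(1, ndf), i.e. the square of the t statistic
f_stat <- ndf * r^2 / (1 - r^2)
p_f <- pf(f_stat, df1 = 1, df2 = ndf, lower.tail = FALSE)
p_t <- 2 * pt(sqrt(f_stat), df = ndf, lower.tail = FALSE)
c(F_test = p_f, t_test = p_t)   # the two p-values agree

# Fisher z test of a non-zero null; rho0 = 0.3 is a hypothetical value
rho0 <- 0.3
z_stat <- (atanh(r) - atanh(rho0)) * sqrt(n - 3)
p_rho0 <- 2 * pnorm(abs(z_stat), lower.tail = FALSE)

# 95% interval built on the z scale, then transformed back to the r scale
ci <- tanh(atanh(r) + c(-1, 1) * qnorm(0.975) / sqrt(n - 3))
ci
```

The back-transformed interval reproduces the confidence limits in the cor.test output quoted in the thread (0.02641348 and 0.90658582), so the printed p-value and the printed interval rest on different approximations.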