return only pairwise correlations greater than given value

return only pairwise correlations greater than given value

Brad Schneid
Hello,

I would like to find out whether a function already exists that returns only the pairwise correlations beyond a certain threshold (e.g., below -0.90 or above 0.90).

Thank you.

Re: return only pairwise correlations greater than given value

Michael Weylandt
What exactly do you mean by "returns" them? More generally, I suppose,
what do you have in mind to do with this?

You could do something like this:

BigCorrelation <- function(X){
     return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.
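
To make the output of this approach concrete, here is a minimal sketch on the built-in mtcars data. The name big_cor_idx and the threshold argument are illustrative and not part of the suggestion above; masking with upper.tri() is an added assumption, used so that the diagonal and the mirrored duplicate pairs do not show up in the result.

```r
# Hypothetical helper: row/column indices of variable pairs whose
# absolute correlation exceeds a threshold.
big_cor_idx <- function(X, threshold = 0.9) {
  cm <- cor(X)
  # Blank out the diagonal and upper triangle so each pair appears once
  # and the trivial r = 1 entries are excluded.
  cm[upper.tri(cm, diag = TRUE)] <- NA
  # which() treats NA as FALSE, so only lower-triangle hits remain.
  which(abs(cm) > threshold, arr.ind = TRUE)
}

idx <- big_cor_idx(mtcars[, 2:5])
idx
```

The result is a two-column matrix ("row", "col") of positions in the correlation matrix, so the variable names still have to be looked up from its dimnames.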

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: return only pairwise correlations greater than given value

Brad Schneid
Thanks Michael,

I just started on the code below and realized I should ask, as this likely exists already.

What I'd like is for the function to return essentially what you just suggested, plus the names of the two variables (pasted together would be good, I suppose).

I hope that is clear. Obviously, I didn't get as far as adding the names to the output.

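# First attempt: returns the correlations but not yet the variable names
sig.cor <- function(dat, r, ...){

  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)

  # row/column indices of all cells with |correlation| >= r
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind = TRUE)
  # drop the diagonal (each variable correlated with itself)
  off.diag <- list.cv2[which(list.cv2[, "row"] != list.cv2[, "col"]), ]
  cor.r <- cv2[off.diag]
  # variable names for those cells; computed but not yet returned
  cor.names <- var.names[off.diag]

  return(cor.r)

}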


data(mtcars)
sig.cor(mtcars[,2:5], .90)


#> sig.cor(mtcars[,2:5], .90)
#[1] 0.9020329 0.9020329


# Ideally this would look like this:

cyl-disp
0.9020329





Re: return only pairwise correlations greater than given value

Brad Schneid
This is probably not the prettiest or most efficient function ever, but this seems to do what I wanted.


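spec.cor <- function(dat, r, ...){

        require("reshape")

        d1 <- data.frame(cor(dat))
        # long format: one row per cell of the correlation matrix
        d2 <- melt(d1)
        # add the row-variable name for each cell
        d2[,3] <- rep(rownames(d1), nrow(d2)/length(unique(d2[,1])))
        d2 <- d2[,c("variable", "V3", "value")]
        colnames(d2) <- c("V1", "V2", "value")
        # drop the diagonal, then keep |correlation| >= r
        d2 <- d2[with(d2, which(V1 != V2, arr.ind=TRUE)), ]
        d2 <- d2[which(d2[,3] >= r | d2[,3] <= -r, arr.ind=TRUE),]
        # sort the names within each pair so mirrored duplicates can be removed
        d2[,1:2] <- t(apply(d2[,1:2], MARGIN=1, function(x) sort(x)))
        d2 <- unique(d2)

        return(d2)
}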



data(mtcars)


> spec.cor(mtcars[,2:5], .6)
Using  as id variables
    V1   V2      value
2  cyl disp  0.9020329
3  cyl   hp  0.8324475
4  cyl drat -0.6999381
7 disp   hp  0.7909486
8 disp drat -0.7102139



I'm not sure how to stop melt() from printing the "Using  as id variables" message, but it doesn't really matter to me either.
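
One way to sidestep the melt() message entirely is to do the reshaping in base R. The sketch below is only an illustration: the name spec_cor_base is made up here, and it relies on as.data.frame(as.table(...)) listing the matrix cells in the same column-major order that lower.tri() uses. (If the text is emitted through message(), wrapping the call as suppressMessages(melt(d1)) should also silence it, though that is an assumption about how the reshape package prints it.)

```r
# Hypothetical base-R variant of spec.cor(); no reshape package needed.
spec_cor_base <- function(dat, r) {
  cm <- cor(dat)
  # One row per matrix cell, with columns Var1 (row name),
  # Var2 (column name) and Freq (the correlation).
  long <- as.data.frame(as.table(cm), stringsAsFactors = FALSE)
  names(long) <- c("V1", "V2", "value")
  # Keep the strict lower triangle: drops the diagonal and keeps
  # each unordered pair exactly once.
  long <- long[as.vector(lower.tri(cm)), ]
  # Keep only pairs at or beyond the threshold in absolute value.
  long[abs(long$value) >= r, ]
}

res <- spec_cor_base(mtcars[, 2:5], 0.6)
res
```

The pairs come out with the row variable in V1 and the column variable in V2, so the name order within a pair is reversed relative to the output shown above; the values are the same.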






Re: return only pairwise correlations greater than given value

Joshua Wiley-2
In reply to this post by Brad Schneid
Hi Brad,

You do not really need to reshape the correlation matrix.  This seems
to do what you want:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

spec.cor(mtcars[, 2:5], .6)
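
For the pasted-together names requested earlier in the thread (e.g. "cyl-disp"), the same lower-triangle approach can return a named vector instead of a data frame. This is a sketch only: named_cor is a made-up name, and the "-" separator and the column-then-row name order are assumptions chosen to match the desired output shown earlier.

```r
# Hypothetical variant: named numeric vector, one element per pair.
named_cor <- function(dat, r, ...) {
  # Extra arguments (e.g. method = "spearman") are passed on to cor().
  x <- cor(dat, ...)
  x[upper.tri(x, diag = TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  # Column variable first, then row variable, joined with "-".
  pair_names <- paste(colnames(x)[i[, "col"]], rownames(x)[i[, "row"]], sep = "-")
  setNames(x[i], pair_names)
}

named_cor(mtcars[, 2:5], 0.9)
named_cor(mtcars[, 2:5], 0.6, method = "spearman")
```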

Cheers,

Josh



--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/

Re: return only pairwise correlations greater than given value

Brad Schneid
Excellent; thanks Josh.
