aggregation-type question

carslaw
I seem to have a Friday afternoon block and can't see the easiest way of doing this.

Given a data frame like:

dat <- data.frame(x = runif(100), y = runif(100), group = rep(letters[1:10], each = 10))
> head(dat)
            x         y group
1 0.876751503 0.6518345     a
2 0.627067150 0.8801790     a
3 0.632465192 0.1768305     a
4 0.060359554 0.8835652     a
5 0.675868776 0.7721177     a
6 0.008465241 0.5046486     a

I want to work out cor(x, y) by group, so in this case ending up with 10 correlation coefficients, one per group.

I'm not seeing a straightforward solution and I'd appreciate your help.

Thanks

David

Re: aggregation-type question

Rui Barradas
Hello,

With the following, the first lapply() call will give you a correlation
matrix for each group, the second just the coefficient for each group.

dat <- read.table(text = "
          x         y group
1 0.876751503 0.6518345     a
2 0.627067150 0.8801790     a
3 0.632465192 0.1768305     a
4 0.060359554 0.8835652     a
5 0.675868776 0.7721177     a
6 0.008465241 0.5046486     a
", header = TRUE)
str(dat)

# one 2 x 2 correlation matrix per group
lapply(split(dat[, c("x", "y")], dat$group), cor)
# one correlation coefficient per group
lapply(split(dat[, c("x", "y")], dat$group), function(d) cor(d$x, d$y))
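
As a minimal sketch of the same split-and-apply idea on the full 100-row data frame from the original question (rather than the six-row excerpt above), sapply() can stand in for lapply() so that the result collapses to a named numeric vector. The seed value and the name `cors` are assumptions made only for this illustration.

```r
# Simulated data as in the original question; the seed is an arbitrary choice
# made here for reproducibility, not something from the thread.
set.seed(1)
dat <- data.frame(x = runif(100), y = runif(100),
                  group = rep(letters[1:10], each = 10))

# Split the x and y columns by group, then compute one coefficient per piece.
# sapply() simplifies the list of length-one results to a named vector.
cors <- sapply(split(dat[, c("x", "y")], dat$group),
               function(d) cor(d$x, d$y))
cors
```

The names of `cors` are the group labels, so a single group's coefficient can be looked up with, for example, cors["a"].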


Hope this helps,

Rui Barradas

On 08-02-2013 16:33, carslaw wrote:

> I seem to have a Friday afternoon block and can't see the easiest way of
> doing this.
>
> Given a data frame like:
>
> dat <- data.frame(x = runif(100), y = runif(100), group = rep(letters[1:10], each = 10))
> > head(dat)
>              x         y group
> 1 0.876751503 0.6518345     a
> 2 0.627067150 0.8801790     a
> 3 0.632465192 0.1768305     a
> 4 0.060359554 0.8835652     a
> 5 0.675868776 0.7721177     a
> 6 0.008465241 0.5046486     a
>
> I want to work out cor(x, y) by group, so in this case ending up with 10
> correlation coefficients by group.
>
> I'm not seeing a straightforward solution and I'd appreciate your help.
>
> Thanks
>
> David
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/aggregation-type-question-tp4657966.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: aggregation-type question

arun kirshna
This post has NOT been accepted by the mailing list yet.
In reply to this post by carslaw
Hi,
set.seed(25)
dat <- data.frame(x = runif(100), y = runif(100), group = rep(letters[1:10], each = 10))
library(plyr)
# one row per group, holding the within-group correlation of x and y
ddply(dat, .(group), summarise, Correl = cor(x, y))
#   group      Correl
#1      a -0.30616332
#2      b  0.30268334
#3      c  0.65611027
#4      d  0.01539136
#5      e -0.51470362
#6      f -0.32576373
#7      g -0.35340778
#8      h  0.82409944
#9      i  0.19392836
#10     j  0.04448608
# or, in base R: take the [2, 1] entry of each group's correlation matrix
res <- do.call(rbind, lapply(split(dat, dat$group), function(x) cor(x[, -3])[2, 1]))
head(res)
#         [,1]
#a -0.30616332
#b  0.30268334
#c  0.65611027
#d  0.01539136
#e -0.51470362
#f -0.32576373
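
For readers without plyr installed, a comparable one-row-per-group data frame can be sketched in base R with by(). This is only an illustration under stated assumptions: the names `res_by` and `out` are hypothetical, and the column name `Correl` is copied from the ddply() call above.

```r
# Same simulated data and seed as in the post above.
set.seed(25)
dat <- data.frame(x = runif(100), y = runif(100),
                  group = rep(letters[1:10], each = 10))

# by() applies the function to the rows belonging to each group in turn.
res_by <- by(dat[, c("x", "y")], dat$group, function(d) cor(d$x, d$y))

# Flatten the "by" object into a plain data frame, one row per group.
out <- data.frame(group = dimnames(res_by)[[1]],
                  Correl = as.numeric(res_by))
out
```

Because by() and split() order the groups the same way here, the rows of `out` line up with the split()-based result computed above.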
A.K.

Re: aggregation-type question

John Kane
In reply to this post by carslaw
I seem to be suffering from the same problem (the Friday one, not the cor one).

Have a look at http://stats.stackexchange.com/questions/4040/r-compute-correlation-by-group for something that looks like it will work.

John Kane
Kingston ON Canada


> -----Original Message-----
> From: [hidden email]
> Sent: Fri, 8 Feb 2013 08:33:45 -0800 (PST)
> To: [hidden email]
> Subject: [R] aggregation-type question
