group by in data.frame

zem
Hi all,

I have a small problem that is probably simple to solve, but I don't know exactly how.
Here is the "challenge":
I have a data.frame with n columns. I need to group by two of them and calculate the mean of a third. So far so good, that was easy; I used the aggregate function:
group<-aggregate(x[,1],list(x[,2],x[,3]),mean)
Now I need to copy the calculated mean into every row of the data.frame (as a new column), so that each row gets the mean of its own group.

It would be great if someone could help me.
Thanks in advance!
Re: group by in data.frame

Ivan Calandra
Hi,

I think ave() might do what you want:
df <- data.frame(a=rep(c("this","that"),5), b1=rnorm(10), b2=rnorm(10))
ave(df[,2], df[,1], FUN=mean)

For all columns, you could do this:
d <- lapply(df[,2:3], FUN=function(x)ave(x,df[,1],FUN=mean))
df2 <- cbind(df, d)
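
One thing to watch with the cbind() call above: the new columns keep the names b1 and b2, so df2 ends up with duplicated column names. A minimal sketch that renames them first; the "_mean" suffix and the set.seed() call are additions for illustration only:

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(a = rep(c("this", "that"), 5), b1 = rnorm(10), b2 = rnorm(10))

# Group mean of every numeric column, repeated on each row of its group
d <- lapply(df[, 2:3], function(col) ave(col, df$a, FUN = mean))

# Hypothetical naming scheme: append "_mean" so names do not clash with b1, b2
names(d) <- paste0(names(d), "_mean")
df2 <- cbind(df, d)
```

df2 then has the original columns followed by b1_mean and b2_mean, one value per row.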

HTH,
Ivan

On 2/25/2011 12:11, zem wrote:

> and now i have to copy the calculated mean value to every row of the
> date.frame (in a new column in the dataframe), ofcourse by copying should be
> the value  adequate to the group

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
[hidden email]

**********
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: group by in data.frame

zem
In reply to this post by zem
Hi Ivan,

thanks for your reply!
The problem is that the data frame has 20000 rows and about 2000 groups, and I don't have a column with the group names, because the groups depend on two other columns ...
Any other ideas? Or maybe I didn't understand what you posted ... :(
Re: group by in data.frame

zem
In reply to this post by zem
Thanks, I solved it ... my problem was that I had two columns to group by. I just pasted their values together so that I ended up with a single column to group by, and then it was easy ...
Here is the script that I used: http://tolstoy.newcastle.edu.au/R/help/06/07/30184.html
Ivan, thanks for the help too :)
Re: group by in data.frame

zem
In reply to this post by zem
Yeah, you are right.
I wanted to post a short example of what I want to do ... and in the meantime I solved the problem ...
But here it is:
I have something like this data frame:
c1<-c(1,2,3,2,2,3,1,2,2,2)
c2<-c(5,6,7,7,5,7,5,7,6,6)
c3<-rnorm(10)
x<-cbind(c1,c2,c3)
> x
      c1 c2          c3
 [1,]  1  5  0.08279036
 [2,]  2  6  0.59135988
 [3,]  3  7  1.45520468
 [4,]  2  7 -1.70094640
 [5,]  2  5  0.13065228
 [6,]  3  7 -1.12080980
 [7,]  1  5  0.42779354
 [8,]  2  7 -1.53111972
 [9,]  2  6  0.29299987
[10,]  2  6 -0.01602095

# with aggregate I get this:
> aggregate(x[,3],list(x[,1],x[,2]),mean)
  Group.1 Group.2          x
1       1       5  0.2552920
2       2       5  0.1306523
3       2       6  0.2894463
4       2       7 -1.6160331
5       3       7  0.1671974


The problem was that I was grouping by two columns, so I couldn't copy the result back to x.

The solution was to make another column with paste(x[,1],x[,2],sep="_")
and then use the solution from this link: http://tolstoy.newcastle.edu.au/R/help/06/07/30184.html
That solved my problem.

Ivan, many thanks for your support and quick responses! :)
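
A rough sketch of the paste() idea, not the script from the link: it uses ave() on the pasted key, and the names key and c3_mean are made up for illustration. The c3 values are copied from the printed output above so that the result can be compared with the aggregate() table.

```r
c1 <- c(1, 2, 3, 2, 2, 3, 1, 2, 2, 2)
c2 <- c(5, 6, 7, 7, 5, 7, 5, 7, 6, 6)
# c3 values copied from the printed matrix above instead of calling rnorm()
c3 <- c(0.08279036, 0.59135988, 1.45520468, -1.70094640, 0.13065228,
        -1.12080980, 0.42779354, -1.53111972, 0.29299987, -0.01602095)
x <- data.frame(c1, c2, c3)

# One key per row, e.g. "1_5"; rows with the same key are in the same group
x$key <- paste(x$c1, x$c2, sep = "_")

# ave() returns the group mean repeated for every row of the group
x$c3_mean <- ave(x$c3, x$key, FUN = mean)
```

Each row of x now carries the mean of its (c1, c2) group in c3_mean.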
Re: group by in data.frame

Ivan Calandra
Ok, now I think I've understood, but I'm not sure, since I think that my
ave() solution does work. I thought you had several numerical
variables and 1 factor; it is the other way around, but it is still possible:

# note that values are different because of rnorm()
c3_mean <- ave(x[,3], list(x[,1],x[,2]), FUN=mean)
cbind(x[,1:2], c3_mean)
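
To check that ave() lines up with the aggregate() table, one could look up each row's group in the aggregate() result and compare. A sketch only; set.seed() and the helper name row_in_agg are assumptions added so the run is repeatable and readable:

```r
set.seed(42)  # assumption: any seed works, it only makes the run repeatable
c1 <- c(1, 2, 3, 2, 2, 3, 1, 2, 2, 2)
c2 <- c(5, 6, 7, 7, 5, 7, 5, 7, 6, 6)
x <- cbind(c1, c2, c3 = rnorm(10))

# Per-row group means from ave(), grouping on both columns at once
c3_mean <- ave(x[, 3], x[, 1], x[, 2], FUN = mean)

# Group table from aggregate(), as in the original post
agg <- aggregate(x[, 3], list(c1 = x[, 1], c2 = x[, 2]), mean)

# Look up each row's group in the table and compare with ave()
row_in_agg <- match(paste(x[, 1], x[, 2]), paste(agg$c1, agg$c2))
all.equal(unname(c3_mean), agg$x[row_in_agg])
```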

Is that what you want?

Ivan


Re: group by in data.frame

David Winsemius
In reply to this post by zem

On Feb 25, 2011, at 10:14 AM, zem wrote:

> the solution was i made another column with paste(x[,1],x[,2],sep="_")
> and then i used the solution from this link:
> http://tolstoy.newcastle.edu.au/R/help/06/07/30184.html
> so i solved my problem

Right. That works and has the virtue that it is reasonably clear what  
is going on. Another approach, possibly even more clear and even more  
R-ish, is to use the interaction() function.

 > aggregate(x[,3], list(interaction(x[,1],x[,2]) ), mean)
   Group.1            x
1     1.5 -0.658932424
2     2.5  0.824756795
3     2.6  0.640471421
4     2.7 -0.008519716
5     3.7 -0.053233855
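
The labels 1.5, 2.5, ... in that output are the two group values joined with interaction()'s default separator "."; they are not numbers. (The means differ from zem's table, presumably because c3 was regenerated with rnorm().) A small sketch of what the factor looks like; sep = "_" and drop = TRUE are optional choices here, not something the aggregate() call needs:

```r
c1 <- c(1, 2, 3, 2, 2, 3, 1, 2, 2, 2)
c2 <- c(5, 6, 7, 7, 5, 7, 5, 7, 6, 6)

# Default: every combination of levels is kept, even ones that never occur
g_all <- interaction(c1, c2)
nlevels(g_all)   # 3 values of c1 times 3 values of c2

# drop = TRUE keeps only combinations present in the data
g <- interaction(c1, c2, sep = "_", drop = TRUE)
levels(g)
```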


David Winsemius, MD
West Hartford, CT

Re: group by in data.frame

djmuseR
In reply to this post by zem
Hi:

Here's another way:

c1<-c(1,2,3,2,2,3,1,2,2,2)
c2<-c(5,6,7,7,5,7,5,7,6,6)
c3<-rnorm(10)
x <- data.frame(c1 = factor(c1), c2 = factor(c2), c3)
x <- transform(x, mean = ave(c3, c1, c2, FUN = mean))

Yet another, with the function ddply() from the plyr package:
library(plyr)
ddply(x, .(c1, c2), transform, mean = mean(c3))
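
A base-R route that stays close to the original aggregate() attempt is to merge the group table back onto the data. This is a sketch only; the column name c3_mean and the set.seed() call are invented here, and note that merge() returns the rows sorted by the grouping columns rather than in their original order:

```r
c1 <- c(1, 2, 3, 2, 2, 3, 1, 2, 2, 2)
c2 <- c(5, 6, 7, 7, 5, 7, 5, 7, 6, 6)
set.seed(1)  # assumption: fixed seed for a repeatable example
x <- data.frame(c1, c2, c3 = rnorm(10))

# One row per (c1, c2) group with its mean
grp <- aggregate(c3 ~ c1 + c2, data = x, FUN = mean)
names(grp)[3] <- "c3_mean"

# Join the group means back to every row; rows come back sorted by c1, c2
x2 <- merge(x, grp, by = c("c1", "c2"))
```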

HTH,
Dennis



Re: group by in data.frame

zem
In reply to this post by zem
Hi guys,

many, many thanks for all the solutions! :-D They are working great!
I have another "little" question:
I would like to store these groups in a new column as a serial number, like in David's solution, but with integer values: 1, 2, 3, ...
I already do this with my first solution, but it creates too much temporary data that I don't really need ...
If you have any ideas, that would be really great!

Thanks a lot!
Re: group by in data.frame

zem
In reply to this post by zem
OK, I have it: match()

Thanks again, all! :)
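
For the record, one way match() can produce such integer group numbers, as a sketch. The names key, group_id and group_id2 are made up, and whether this is exactly what was used is not stated in the thread:

```r
c1 <- c(1, 2, 3, 2, 2, 3, 1, 2, 2, 2)
c2 <- c(5, 6, 7, 7, 5, 7, 5, 7, 6, 6)
x <- data.frame(c1, c2)

key <- paste(x$c1, x$c2, sep = "_")

# Position of each key among the distinct keys: groups are numbered
# 1, 2, 3, ... in order of first appearance
x$group_id <- match(key, unique(key))

# Alternative: number the groups in factor-level order instead
x$group_id2 <- as.integer(interaction(x$c1, x$c2, drop = TRUE))
```

The two columns label the same groups but number them differently, so either can serve as a group index; only the ordering of the numbers changes.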