Not sure how to use aggregate, colSums, by

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Not sure how to use aggregate, colSums, by

eric
I have a data frame called test shown below that i would like to summarize in a particular way :

I want to show the column sums (columns y ,f) grouped by country (column e1). However, I'm looking for the data to be split according to column e2. In other words, two tables of sum by country. One table for "con" and one table for "std" shown in column e2. Finally at the bottom of the two tables, I would like the overall sum /Totals for all the countries for the two columns (y,f).  The lay outs for the two tables I'm looking for are also shown below in case my description isn't completely clear

I would also like to be able to use the Totals of y and f for the two tables in other calculations.

I can get the two sets of totals with the following commands but not the sums by country.

colSums(test[test$e2=="std", c(3,4)])
colSums(test[test$e2=="con", c(3,4)])

I know there's an easy way to do this with a combination of colSums, by, aggregate but I can't seem to get it.

std         y       f

usa        sum   sum
france    sum   sum
can        sum   sum
italy       sum   sum
Totals    sum   sum

con       y       f

usa       sum   sum
france   sum   sum
can       sum   sum
italy      sum   sum
Totals    sum   sum

      e1  e2 y  f
1     usa std 1  1
2     usa std 1  2
3     can con 1  3
4  france con 1  4
5     can std 1  5
6   italy con 1  6
7     usa std 2  7
8     usa std 2  8
9     can con 2  9
10 france con 2 10
11    can std 2 11
12  italy con 2 12
13    usa std 3 13
14    usa std 3 14
15    can con 3 15
16 france con 3 16
17    can std 3 17
18  italy con 3 18
19    usa std 4 19
20    usa std 4 20
21    can con 4 21
22 france con 4 22
23    can std 4 23
24  italy con 4 24
25    usa std 5 25
26    usa std 5 26
27    can con 5 27
28 france con 5 28
29    can std 5 29
30  italy con 5 30
31    usa std 6 31
32    usa std 6 32
33    can con 6 33
34 france con 6 34
35    can std 6 35
36  italy con 6 36
Reply | Threaded
Open this post in threaded view
|

Re: Not sure how to use aggregate, colSums, by

Mikhail Titov-2
I hope this will help you get going

b <- sapply(unique(test$e2), function(x) {
    out <- aggregate(cbind(y,f)~e1, subset(test, e2==x),"sum")
    out <- rbind(out, data.frame(e1="total", y=sum(out$y), f=sum(out$f)))
    out <- list(out)
    names(out) <- x
    out
})


> b
$std
     e1  y   f
1   can 21 120
2   usa 42 198
3 total 63 318

$con
      e1  y   f
1    can 21 108
2 france 21 114
3  italy 21 126
4  total 63 348


On 08/14/2011 12:20 PM, eric wrote:

> I have a data frame called test shown below that i would like to summarize in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column e2.
> In other words, two tables of sum by country. One table for "con" and one
> table for "std" shown in column e2. Finally at the bottom of the two tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f).  The lay outs for the two tables I'm looking for are also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums, by,
> aggregate but I can't seem to get it.
>
> std         y       f
>
> usa        sum   sum
> france    sum   sum
> can        sum   sum
> italy       sum   sum
> Totals    sum   sum
>
> con       y       f
>
> usa       sum   sum
> france   sum   sum
> can       sum   sum
> italy      sum   sum
> Totals    sum   sum
>
>       e1  e2 y  f
> 1     usa std 1  1
> 2     usa std 1  2
> 3     can con 1  3
> 4  france con 1  4
> 5     can std 1  5
> 6   italy con 1  6
> 7     usa std 2  7
> 8     usa std 2  8
> 9     can con 2  9
> 10 france con 2 10
> 11    can std 2 11
> 12  italy con 2 12
> 13    usa std 3 13
> 14    usa std 3 14
> 15    can con 3 15
> 16 france con 3 16
> 17    can std 3 17
> 18  italy con 3 18
> 19    usa std 4 19
> 20    usa std 4 20
> 21    can con 4 21
> 22 france con 4 22
> 23    can std 4 23
> 24  italy con 4 24
> 25    usa std 5 25
> 26    usa std 5 26
> 27    can con 5 27
> 28 france con 5 28
> 29    can std 5 29
> 30  italy con 5 30
> 31    usa std 6 31
> 32    usa std 6 32
> 33    can con 6 33
> 34 france con 6 34
> 35    can std 6 35
> 36  italy con 6 36
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Not sure how to use aggregate, colSums, by

Jorge I Velez
In reply to this post by eric
Hi eric,

Try

lapply(with(x, split(x, e2)), function(l){
r <- with(l, aggregate(list(y, f), list(e1), sum))
colnames(r) <- c('e1', 'y', 'f')
 r
   })

HTH,
Jorge


On Sun, Aug 14, 2011 at 1:20 PM, eric <> wrote:

> I have a data frame called test shown below that i would like to summarize
> in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column e2.
> In other words, two tables of sum by country. One table for "con" and one
> table for "std" shown in column e2. Finally at the bottom of the two
> tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f).  The lay outs for the two tables I'm looking for are also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
> tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums, by,
> aggregate but I can't seem to get it.
>
> std         y       f
>
> usa        sum   sum
> france    sum   sum
> can        sum   sum
> italy       sum   sum
> Totals    sum   sum
>
> con       y       f
>
> usa       sum   sum
> france   sum   sum
> can       sum   sum
> italy      sum   sum
> Totals    sum   sum
>
>      e1  e2 y  f
> 1     usa std 1  1
> 2     usa std 1  2
> 3     can con 1  3
> 4  france con 1  4
> 5     can std 1  5
> 6   italy con 1  6
> 7     usa std 2  7
> 8     usa std 2  8
> 9     can con 2  9
> 10 france con 2 10
> 11    can std 2 11
> 12  italy con 2 12
> 13    usa std 3 13
> 14    usa std 3 14
> 15    can con 3 15
> 16 france con 3 16
> 17    can std 3 17
> 18  italy con 3 18
> 19    usa std 4 19
> 20    usa std 4 20
> 21    can con 4 21
> 22 france con 4 22
> 23    can std 4 23
> 24  italy con 4 24
> 25    usa std 5 25
> 26    usa std 5 26
> 27    can con 5 27
> 28 france con 5 28
> 29    can std 5 29
> 30  italy con 5 30
> 31    usa std 6 31
> 32    usa std 6 32
> 33    can con 6 33
> 34 france con 6 34
> 35    can std 6 35
> 36  italy con 6 36
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Not sure how to use aggregate, colSums, by

djmuseR
In reply to this post by eric
Hi:

Here's another approach using the reshape2 package. I called your data
frame dat in the code below.

library('reshape2')
mdat <- melt(dat, measure = c('y', 'f'))
acast(mdat, e1 ~ variable ~ e2, fun = sum, margins = 'e1')
, , con

        y   f
can    21 108
france 21 114
italy  21 126
usa     0   0
(all)  63 348

, , std

        y   f
can    21 120
france  0   0
italy   0   0
usa    42 198
(all)  63 318

This returns an array rather than a list.

HTH,
Dennis

On Sun, Aug 14, 2011 at 10:20 AM, eric <[hidden email]> wrote:

> I have a data frame called test shown below that i would like to summarize in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column e2.
> In other words, two tables of sum by country. One table for "con" and one
> table for "std" shown in column e2. Finally at the bottom of the two tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f).  The lay outs for the two tables I'm looking for are also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums, by,
> aggregate but I can't seem to get it.
>
> std         y       f
>
> usa        sum   sum
> france    sum   sum
> can        sum   sum
> italy       sum   sum
> Totals    sum   sum
>
> con       y       f
>
> usa       sum   sum
> france   sum   sum
> can       sum   sum
> italy      sum   sum
> Totals    sum   sum
>
>      e1  e2 y  f
> 1     usa std 1  1
> 2     usa std 1  2
> 3     can con 1  3
> 4  france con 1  4
> 5     can std 1  5
> 6   italy con 1  6
> 7     usa std 2  7
> 8     usa std 2  8
> 9     can con 2  9
> 10 france con 2 10
> 11    can std 2 11
> 12  italy con 2 12
> 13    usa std 3 13
> 14    usa std 3 14
> 15    can con 3 15
> 16 france con 3 16
> 17    can std 3 17
> 18  italy con 3 18
> 19    usa std 4 19
> 20    usa std 4 20
> 21    can con 4 21
> 22 france con 4 22
> 23    can std 4 23
> 24  italy con 4 24
> 25    usa std 5 25
> 26    usa std 5 26
> 27    can con 5 27
> 28 france con 5 28
> 29    can std 5 29
> 30  italy con 5 30
> 31    usa std 6 31
> 32    usa std 6 32
> 33    can con 6 33
> 34 france con 6 34
> 35    can std 6 35
> 36  italy con 6 36
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.