# Sum by column

5 messages
Open this post in threaded view
|

## Sum by column

 Dear List, I have a question of convenience, I am looking to sum the values of one column based on another column - a example may help explain better! ED ECOCODE 21.809467 AA0101 36.229566 PA1201 51.861284 PA1201 11.36232 PA1201 27.264634 PA1201 12.261986 PA1201 46.519313 PA1201 7.815376 PA1201 2.810428 PA1201 13.478372 PA1201 35.670182 PA1301 27.128715 AT0801 19.010294 AT1201 15.475368 AT1201 18.597983 AT0101 29.292615 AT0101 6.749846 AT0101 14.981488 AT0101 14.93511 AT0101 14.93511 AT0101 21.040785 AT0101 8.271615 AT0101 12.94232 AT0101 6.749846 AT0101 15.484412 AT0101 29.644494 AT0101 43.211212 AT0101 So for AA0101 it would be = 21.809467             AT1201 it would be = 19.010294+15.475368 etc I would then like to be able to output a table with ECOCODE in one column and the sum of ED in the other. This is stored in a dataframe called ecoregion, i understand people like having code to change but i have none as i am a relative beginner! Sorry in advance! Thanks Peter ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Sum by column

 Hi Peter, R has some fairly flexible ways of passing values of some variable (X) by another (the INDEX) to different FUNctions.  Here is an example using your data: ## your email data, in convenient form dat <- structure(list(ED = c(21.809467, 36.229566, 51.861284, 11.36232, 27.264634, 12.261986, 46.519313, 7.815376, 2.810428, 13.478372, 35.670182, 27.128715, 19.010294, 15.475368, 18.597983, 29.292615, 6.749846, 14.981488, 14.93511, 14.93511, 21.040785, 8.271615, 12.94232, 6.749846, 15.484412, 29.644494, 43.211212), ECOCODE = structure(c(1L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 3L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AA0101", "AT0101", "AT0801", "AT1201", "PA1201", "PA1301"), class = "factor")), .Names = c("ED", "ECOCODE"), class = "data.frame", row.names = c(NA, -27L)) ## look at the structure of the data str(dat) ## inside of "dat" (to avoid typing its name repeatedly) ## find the sum of ED at each level of ECOCODE with(dat, tapply(X = ED, INDEX = ECOCODE, FUN = sum, na.rm = TRUE)) ## should give something like    AA0101    AT0101    AT0801    AT1201    PA1201    PA1301  21.80947 236.83684  27.12871  34.48566 209.60328  35.67018 For documentation, look at: ?tapply ## similar in many ways though sometimes slightly more/less convenient ?by Hope that helps, Josh On Wed, Jan 12, 2011 at 2:38 AM, Peter Francis <[hidden email]> wrote: > Dear List, > > I have a question of convenience, > > I am looking to sum the values of one column based on another column - a example may help explain better! > > ED                      ECOCODE > 21.809467       AA0101 > 36.229566       PA1201 > 51.861284       PA1201 > 11.36232        PA1201 > 27.264634       PA1201 > 12.261986       PA1201 > 46.519313       PA1201 > 7.815376        PA1201 > 2.810428        PA1201 > 13.478372       PA1201 > 35.670182       PA1301 > 27.128715       AT0801 > 19.010294       AT1201 > 15.475368       AT1201 > 18.597983       AT0101 > 29.292615       AT0101 > 6.749846        AT0101 > 14.981488       AT0101 > 14.93511        AT0101 > 14.93511        AT0101 > 21.040785       AT0101 > 8.271615        AT0101 > 12.94232        AT0101 > 6.749846        AT0101 > 15.484412       AT0101 > 29.644494       AT0101 > 43.211212       AT0101 > > So for AA0101 it would be = 21.809467 >            AT1201 it would be = 19.010294+15.475368 > > etc > > I would then like to be able to output a table with ECOCODE in one column and the sum of ED in the other. > > This is stored in a dataframe called ecoregion, i understand people like having code to change but i have none as i am a relative beginner! Sorry in advance! > > Thanks > > Peter > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.
Open this post in threaded view
|

## Re: Sum by column

 In reply to this post by peterfrancis There are two functions you need to become familiar with: ?tapply ?ave If you wanted these summed values to be placed in another column of   the same dataframe, you would use ave. If you wanted a new structure   (somewhat shorter) you would use tapply with sum as the function. E. g: tapply(ecoregion\$ED, ecoregion\$ECOCODE, sum) -- David. On Jan 12, 2011, at 5:38 AM, Peter Francis wrote: > Dear List, > > I have a question of convenience, > > I am looking to sum the values of one column based on another column   > - a example may help explain better! > > ED ECOCODE > 21.809467 AA0101 > 36.229566 PA1201 > 51.861284 PA1201 > 11.36232 PA1201 > 27.264634 PA1201 > 12.261986 PA1201 > 46.519313 PA1201 > 7.815376 PA1201 > 2.810428 PA1201 > 13.478372 PA1201 > 35.670182 PA1301 > 27.128715 AT0801 > 19.010294 AT1201 > 15.475368 AT1201 > 18.597983 AT0101 > 29.292615 AT0101 > 6.749846 AT0101 > 14.981488 AT0101 > 14.93511 AT0101 > 14.93511 AT0101 > 21.040785 AT0101 > 8.271615 AT0101 > 12.94232 AT0101 > 6.749846 AT0101 > 15.484412 AT0101 > 29.644494 AT0101 > 43.211212 AT0101 > > So for AA0101 it would be = 21.809467 >    AT1201 it would be = 19.010294+15.475368 > > etc > > I would then like to be able to output a table with ECOCODE in one   > column and the sum of ED in the other. > > This is stored in a dataframe called ecoregion, i understand people   > like having code to change but i have none as i am a relative   > beginner! Sorry in advance! > > Thanks > > Peter > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide http://www.R-project.org/posting-guide.htmland provide commented, minimal, self-contained, reproducible code.