Get a percent variable based on group

Karine Charlebois
Dear all, I'd like to get a percentage variable based on a group, but without creating a new data frame.
For example:
data(iris)

iris$percent <-unlist(tapply(iris$Sepal.Length,iris$Species,function(x) x/sum(x, na.rm=TRUE)))

This does not work; I should have only three standard values, one each for setosa, versicolor, and virginica. How can I do this?

MANY THANKS,

Karine
     

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Get a percent variable based on group

arun kirshna


Hi,

Not sure if this is what you meant.
tapply(iris$Sepal.Length,iris$Species,FUN=function(x) sum(x)/sum(iris$Sepal.Length)*100)
 #  setosa versicolor  virginica
 # 28.55676   33.86195   37.58129
A.K.


Re: Get a percent variable based on group

David Winsemius
In reply to this post by Karine Charlebois

On Jan 15, 2013, at 6:30 PM, Karine Charlebois wrote:

> Dear all, I'd like to get a percentage variable based on a group, but without creating a new data frame.
> For example:
> data(iris)
>
> iris$percent <-unlist(tapply(iris$Sepal.Length,iris$Species,function(x) x/sum(x, na.rm=TRUE)))

A percentage is 100 times a fraction whose nominal value is unity. My guess is that you want each value expressed as a percentage of its group mean, in which case this would just be:

iris$percent <-ave(iris$Sepal.Length, iris$Species, FUN=function(x) 100*x/mean(x, na.rm=TRUE))

head(iris)

>
> This does not work, I should have only three standard values, respectively for setosa, versicolor, and virginica. How can I do this?

If you just want three values, then I do not see how these are percentages.

> tapply(iris$Sepal.Length,iris$Species,function(x) mean(x, na.rm=TRUE))
    setosa versicolor  virginica
     5.006      5.936      6.588


--
David Winsemius
Alameda, CA, USA

Re: Get a percent variable based on group

arun kirshna
In reply to this post by arun kirshna
Hi,
Is it this?


aggregate(iris$Sepal.Length,by=list(iris$Species),FUN=function(x) sum(x)/sum(iris$Sepal.Length)*100)
     Group.1        x
1     setosa 28.55676
2 versicolor 33.86195
3  virginica 37.58129

A.K.
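
If the three group-level values are wanted as a new column rather than as a summary table, one possible approach is to look each row's species up in the named result of tapply(). This is only a sketch under that reading of the question; the copy `d` and the column name `species_share` are made up for illustration.

```r
d <- iris  # work on a copy so the built-in data set is left untouched

# Share of the overall Sepal.Length total contributed by each species (3 values)
share <- tapply(d$Sepal.Length, d$Species,
                function(x) 100 * sum(x) / sum(d$Sepal.Length))

# Index the named result by each row's species to get one value per row;
# as.vector() drops the names and dim attributes left over from tapply()
d$species_share <- as.vector(share[as.character(d$Species)])

unique(d$species_share)  # one distinct value per species
```

Every row of a given species then carries the same value, so the column has the full length of the data but only three distinct entries.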





________________________________
From: Karine Charlebois <[hidden email]>
To: arun <[hidden email]>
Sent: Tuesday, January 15, 2013 10:22 PM
Subject: RE: [R] Get a percent variable based on group



For example, 
iris$percent <- unlist(tapply(iris$Sepal.Length,iris$Species,function(x) x/sum(iris$Sepal.Length, na.rm=TRUE)))

aggregate(iris$percent, by=list(iris$Species),
FUN=sum, na.rm=TRUE)

This last command should return 100% for each species, not the following values:
Group.1         x
1     setosa 0.2855676
2 versicolor 0.3386195
3  virginica 0.3758129


________________________________
From: [hidden email]
To: [hidden email]
Subject: RE: [R] Get a percent variable based on group
Date: Tue, 15 Jan 2013 22:13:27 -0500


No, it is not. I need a new column with these values.

Karine


Re: Get a percent variable based on group

Jeff Newmiller
In reply to this post by Karine Charlebois
As others have said, your goal is unclear to us. However, one guess I have not seen others make: if you are looking for a way to normalize within each group, perhaps you should look at

?ave

which typically creates a vector just as long as your data vector and grouping vector.
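
As a rough illustration of normalizing within each group, assuming the aim is for each row to hold its share of its own species total (so that the shares add up to 100 within every species), ave() could be used along these lines. The copy `d` is introduced here only so that iris itself is not modified.

```r
d <- iris  # work on a copy of the built-in data set

# ave() applies FUN within each Species group and returns a vector
# aligned with the original rows, so it can be assigned as a column
d$percent <- ave(d$Sepal.Length, d$Species,
                 FUN = function(x) 100 * x / sum(x, na.rm = TRUE))

# Check: the per-species sums should each be 100 (up to floating-point error)
tapply(d$percent, d$Species, sum)
```

Unlike tapply(), which returns one element per group, ave() keeps the result in row order, so no unlist() or reordering step is needed.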
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
