Specifying parameters for use in "plyr" / "ddply"


Dimitri Liakhovitski
Dear R-ers!

# I have a data frame with one factor and 2 numeric variables:
x <- data.frame(group = c("b", "b", "d", "d", "e", "e"),
                a = c(1, NA, 10, 20, 100, 200),
                b = c(5, 15, 20, NA, 10, 30))
x

# I want to divide each value of each variable by its group mean,
# using plyr and ddply. It works fine, for example, for variable "a":
library(plyr)
x2 <- ddply(x, "group", transform, a = a / mean(a, na.rm = TRUE))
x2

# Because I want to do the same for both variables (a and b), I want to
# put it into a function.
# So I am parametrising the grouping variable and the variable to transform.
# However, my code below is not working. I know that x[[variable]] is
# not correct, but what is the right way of doing it?
grouping.factor <- "group"
variable <- "a"
x2 <- ddply(x, grouping.factor, transform, x[[variable]] = x[[variable]] / mean(x[[variable]], na.rm = TRUE))


Or is there a more effective way of using ddply on a bunch of variables?
Thank you very much for your advice!

--
Dimitri Liakhovitski
Ninah.com
[hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Specifying parameters for use in "plyr" / "ddply"

Peter Ehlers
On 2010-04-07 8:37, Dimitri Liakhovitski wrote:

> Or is there a more effective way of using ddply on a bunch of variables?
> Thank you very much for your advice!
>
Yes, there is: colwise()

   f <- function(x) x / mean(x, na.rm = TRUE)
   ddply(x, "group", colwise(f, c("a", "b")))
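For the narrower question of passing the column name as a string, here is a minimal sketch of one way it might be done. The original call cannot work as written: `x[[variable]] = ...` is not a valid argument name in a function call, so R fails to parse it, and `x` would in any case refer to the whole data frame rather than the piece for one group. The helper names `scale_one` and `scale_many` are made up for illustration and are not part of plyr.

```r
library(plyr)

x <- data.frame(group = c("b", "b", "d", "d", "e", "e"),
                a = c(1, NA, 10, 20, 100, 200),
                b = c(5, 15, 20, NA, 10, 30))

# Divide a vector by its mean, ignoring missing values.
f <- function(v) v / mean(v, na.rm = TRUE)

# Hypothetical helper: rescale one column, named by a string, within each group.
# `piece` is the sub-data-frame that ddply passes in for one group, so
# piece[[variable]] picks the column out of that group only.
scale_one <- function(df, grouping.factor, variable) {
  ddply(df, grouping.factor, function(piece) {
    piece[[variable]] <- f(piece[[variable]])
    piece
  })
}

# Hypothetical helper: the colwise() approach with the grouping variable
# and the set of columns both passed in as character vectors.
scale_many <- function(df, grouping.factor, variables) {
  ddply(df, grouping.factor, colwise(f, variables))
}

x2 <- scale_one(x, "group", "a")            # only "a" is rescaled
x3 <- scale_many(x, "group", c("a", "b"))   # "a" and "b" are rescaled
```

Note that `scale_one` returns every column of the input with just the named one changed, whereas `scale_many` returns only the grouping column plus the columns listed in `variables`.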

--
Peter Ehlers
University of Calgary


Re: Specifying parameters for use in "plyr" / "ddply"

David Winsemius

On Apr 7, 2010, at 11:41 AM, Peter Ehlers wrote:

> On 2010-04-07 8:37, Dimitri Liakhovitski wrote:
>> Or is there a more effective way of using ddply on a bunch of variables?
>> Thank you very much for your advice!
>>
> Yes, there is: colwise()
>
>  f <- function(x) x / mean(x, na.rm = TRUE)
>  ddply(x, "group", colwise(f, c("a", "b")))
>

Which might also be a productive strategy for yesterday's question (for
which we have yet to be offered a reproducible data example, however):

https://stat.ethz.ch/pipermail/r-help/2010-April/234363.html
--

David Winsemius, MD
West Hartford, CT
