recode Variable in dependence of values of two other variables


Julia Moeller
Hi,

as an R beginner, I have a recoding problem and hope you can help me:

I am working on an SPSS dataset, which I loaded into R (load("C:/...)

I have two existing variables, "ID" and "X",
and one variable to be computed: meanX.dependID (= the mean of X over all
rows in which ID has the same value).

ID = subject ID. Since it is a longitudinal dataset, there are repeated
measurement points for each subject, each of which appears in a new row.
So each ID value appears in many rows (e.g. ID == 1 in rows 1:5, ID == 2
in rows 6:8, etc.).


Now: for all rows in which ID has a given value, meanX.dependID should
be the mean of X over those rows. How can I automate that without
having to specify the row numbers each time?

e.g.


ID    X    meanX.dependID
1    2    2.25
1    3    2.25
1    1    2.25
1    3    2.25
2    5    3.33
2    2    3.33
2    3    3.33
3    4    3.17
3    1    3.17
3    2    3.17
3    3    3.17
3    4    3.17
3    5    3.17


Thanks a lot! I hope this is the right place to post; if not, please tell me!
best,
Julia

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: recode Variable in dependence of values of two other variables

Mikhail Titov-2
?aggregate

# returns one row per ID, holding the mean of X for that ID
aggregate(X ~ ID, data = your.data.frame.goes.here, FUN = mean)

Mikhail
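
Note that aggregate() gives one row per ID rather than one value per original row. A minimal sketch of how the per-ID means could be attached back to every row, assuming the example data from the question; the names dat, means and result are made up for illustration:

```r
# Example data copied from the question ('dat' is a hypothetical name)
dat <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  X  = c(2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 4, 5)
)

# One row per ID with the group mean of X
means <- aggregate(X ~ ID, data = dat, FUN = mean)

# Rename the aggregated column so it does not clash with X in the merge
names(means)[names(means) == "X"] <- "meanX.dependID"

# Attach the group mean to every original row.
# merge() sorts the result by ID; do not rely on the row order within an ID.
result <- merge(dat, means, by = "ID")
```

The extra merge() step is the price of going through aggregate(); approaches that compute the group mean in place avoid it.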




Re: recode Variable in dependence of values of two other variables

djmuseR
In reply to this post by Julia Moeller
Hi:

Here are several equivalent ways to produce your desired output:

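# The examples assume a data frame df with columns id and x
# (R is case-sensitive; use ID and X if your columns are named that way).

# Base package: transform() with ave()
df <- transform(df, meanX.dependID = ave(x, id, FUN = mean))

# plyr package
library('plyr')
ddply(df, .(id), transform, meanX.dependID = mean(x))

# data.table package
library('data.table')
dt <- data.table(df, key = 'id')
dt[, list(x, meanX.dependID = mean(x)), by = 'id']

# doBy package
library('doBy')
transformBy(~ id, data = df, meanX.dependID = mean(x))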

HTH,
Dennis
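
For reference, a worked version of the base-R ave() approach on the example data from the question. This is only a sketch: it uses the uppercase column names ID and X from the question (R names are case-sensitive), and the data frame name dat is made up for illustration.

```r
# Example data copied from the question ('dat' is a hypothetical name)
dat <- data.frame(
  ID = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3),
  X  = c(2, 3, 1, 3, 5, 2, 3, 4, 1, 2, 3, 4, 5)
)

# ave() returns a vector as long as X, with each element replaced by
# the mean of X within its ID group; the row order is left unchanged
dat$meanX.dependID <- ave(dat$X, dat$ID, FUN = mean)

# Group means: ID 1 -> 2.25, ID 2 -> 10/3, ID 3 -> 19/6
print(dat)
```

Because ave() keeps the original row order and length, no row numbers have to be specified and no merge step is needed.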

On Fri, Aug 12, 2011 at 8:10 AM, Julia Moeller
<[hidden email]> wrote:

> [snip]
