|
How can I select the rows of a data set that have the maximum value in one variable, nested within another variable? It is a data frame in long format with repeated measures per subject.
I was not successful using aggregate, because one of the columns has character values (and/or possibly for another reason). I would like to transform something like this:

    subject time.ms V3
    1       1       stringA
    1       12      stringB
    1       22      stringC
    2       1       stringB
    2       14      stringC
    2       25      stringA
    ….

into something like this:

    subject time.ms V3
    1       22      stringC
    2       25      stringA
    …

Thank you very much for your help!
Miriam

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
|
Hi

> How could I select the rows of a dataset that have the maximum value in
> one variable and to do this nested in another variable. It is a dataframe
> in long format with repeated measures per subject.
> I was not successful using aggregate, because one of the columns has
> character values (and/or possibly because of another reason).

You could do it with aggregate and then select the matching rows from your data frame, but it is a perfect example for powerful list operations:

    > do.call("rbind", lapply(split(test, test$subject), function(x) x[which.max(x[,2]),]))
      subject time.ms      V3
    1       1      22 stringC
    2       2      25 stringA

split splits the data frame test by the subject variable into a list of sub data frames.
The function of x finds the position of the maximum value in the second column of each sub data frame and selects that row.
do.call takes the list and rbinds it into one final data frame.

Regards
Petr
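The one-liner in this reply can be unpacked into its three steps. The following is only a sketch: the data frame is rebuilt from the table in the original question, and the intermediate names `pieces`, `best` and `result` are made up for illustration. It selects the column by name (`time.ms`) rather than by position, which is an assumption about what is intended.

```r
# Sample data taken from the original question; `test` is the name used in the reply.
test <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2),
  time.ms = c(1, 12, 22, 1, 14, 25),
  V3      = c("stringA", "stringB", "stringC", "stringB", "stringC", "stringA"),
  stringsAsFactors = FALSE
)

# Step 1: split into a list with one sub data frame per subject.
pieces <- split(test, test$subject)

# Step 2: in each sub data frame, keep the row where time.ms is largest.
# which.max returns the position of the first maximum only.
best <- lapply(pieces, function(x) x[which.max(x$time.ms), ])

# Step 3: stack the one-row data frames into a single data frame.
result <- do.call("rbind", best)
result
```

Keeping the intermediate objects makes it easy to inspect `pieces[["1"]]` or `best[["2"]]` when the result is not what was expected.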
|
Hello,

Here's a solution using aggregate and merge. I've kept it in two steps for clarity.

    d <- read.table(text="
    subject time.ms V3
    1 1 stringA
    1 12 stringB
    1 22 stringC
    2 1 stringB
    2 14 stringC
    2 25 stringA
    ", header=TRUE)

    ag <- aggregate(time.ms~subject, data=d, max)
    merge(ag, d)

    # It also works if the maximum is not unique.
    # The extra row is added as a data frame: rbind(d, c(1, 22, "stringA"))
    # would coerce the numeric columns to character.
    d2 <- rbind(d, data.frame(subject=1, time.ms=22, V3="stringA"))
    ag2 <- aggregate(time.ms~subject, data=d2, max)
    merge(ag2, d2)

The split version would have to be slightly modified to use 'which' and 'max' separately, because which.max returns only the first maximum.

    do.call("rbind", lapply(split(d2, d2$subject), function(x) x[which(x[, 2] == max(x[, 2])), ]))

Hope this helps,

Rui Barradas

On 27-06-2012 09:30, Petr PIKAL wrote:
> do.call("rbind", lapply(split(test, test$subject), function(x) x[which.max(x[,2]),]))
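The difference between the two split versions comes down to how ties are handled. A small sketch with a made-up vector `x` (not from the thread) shows it; the names `first_only` and `all_ties` are illustrative only.

```r
# Hypothetical values with a tied maximum in positions 3 and 4.
x <- c(1, 12, 22, 22)

# which.max stops at the first maximum.
first_only <- which.max(x)

# Comparing against max(x) keeps every position that reaches the maximum.
all_ties <- which(x == max(x))

first_only  # 3
all_ties    # 3 4
```

Which behaviour is wanted depends on the data: if each subject should contribute exactly one row, `which.max` is enough; if tied rows must all be kept, the comparison against `max(x)` is the safer choice.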
|
In reply to this post by Miriam -2
Hi,

Try this:

    dat1 <- read.table(text="
    subject time.ms V3
    1 1 stringA
    1 12 stringB
    1 22 stringC
    2 1 stringB
    2 14 stringC
    2 25 stringA
    ", sep="", header=TRUE)
    dat2 <- aggregate(dat1$time.ms, list(dat1$subject), max)
    colnames(dat2) <- c("subject", "time.ms")
    merge(dat2, dat1)

      subject time.ms      V3
    1       1      22 stringC
    2       2      25 stringA

A.K.

----- Original Message -----
From: Miriam <[hidden email]>
To: [hidden email]
Cc:
Sent: Tuesday, June 26, 2012 5:21 PM
Subject: [R] selecting rows by maximum value of one variables in dataframe nested by another Variable
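For comparison, the aggregate-then-merge idea can also be written without the merge step by using `ave` from base R, which returns the group maximum aligned with the original rows. This is an alternative that nobody in the thread proposed, shown only as a sketch; the data frame is rebuilt by hand from the question, and `keep` and `res` are illustrative names.

```r
# Sample data from the original question, built directly instead of via read.table.
dat1 <- data.frame(
  subject = c(1, 1, 1, 2, 2, 2),
  time.ms = c(1, 12, 22, 1, 14, 25),
  V3      = c("stringA", "stringB", "stringC", "stringB", "stringC", "stringA"),
  stringsAsFactors = FALSE
)

# ave() applies max within each subject and returns a vector as long as
# time.ms, so every row can be compared with its own group maximum.
keep <- dat1$time.ms == ave(dat1$time.ms, dat1$subject, FUN = max)

# Logical row selection; tied maxima within a subject would all be kept.
res <- dat1[keep, ]
res
```

Unlike merge, this keeps the original row order and row names, which may or may not matter for later steps.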
