R-help:
I have a variable ("ID_list") containing about 1800 unique numbers, and a 143066x29 data frame. One of the columns ("ID") in my data frame contains a list of ids, many of which appear more than once. I'd like to find the subset of my data frame for which "ID" matches one of the numbers in "ID_list." I'm pretty sure I could write a function to do this--something like:

    dataSubset <- function(df, id_list) {
      tmp <- data.frame()                 # accumulates the matching rows
      for (i in id_list) {
        for (j in seq_len(nrow(df))) {
          if (i == df$ID[j]) {
            tmp <- rbind(tmp, df[j, ])    # append the row instead of overwriting tmp
          }
        }
      }
      tmp
    }

but this seems inefficient. As I understand it, the subset function won't really solve my problem, but it seems like there must be something out there that will, and that I'm forgetting. Does anyone know of a way to solve this problem in an efficient way? Thanks!

Kyle H. Ambert
Graduate Student, Department of Medical Informatics & Clinical Epidemiology
Oregon Health & Science University
[hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
In reply to this post by Kyle.-3
I don't know if I understand (a small example with R commands would help), but, assuming your data.frame is called 'df':

    subset(df, ID %in% ID_list)

Question: is ID_list a "list" or a vector, and are they really "numbers" or "factors"?
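To make the suggestion above concrete, here is a minimal sketch on made-up data. The toy values, the `value` column, and the names `keep`, `res_*`, `ID_as_list`, and `df_factor` are assumptions for illustration only; the real `df` and `ID_list` from the question are not shown in the thread. It also touches the list-versus-vector and number-versus-factor question, since those affect how the comparison should be written.

```r
# Toy stand-ins for the poster's objects; all values are invented.
df <- data.frame(
  ID    = c(101, 102, 103, 101, 104, 102),
  value = c(1.5, 2.0, 3.2, 4.1, 5.0, 6.3)
)
ID_list <- c(101, 104)

# Vectorised membership test: one TRUE/FALSE per row of df.
keep <- df$ID %in% ID_list

# Two ways to take the matching rows; repeated IDs are all kept.
res_subset  <- subset(df, ID %in% ID_list)
res_bracket <- df[keep, ]

# If ID_list is stored as a list, unlist() turns it into a plain vector first.
ID_as_list <- list(101, 104)
res_list <- df[df$ID %in% unlist(ID_as_list), ]

# If ID is a factor, compare on its labels; as.numeric() on a factor
# would return the internal integer codes, not the original IDs.
df_factor  <- transform(df, ID = factor(ID))
res_factor <- df_factor[as.character(df_factor$ID) %in% as.character(ID_list), ]
```

Unlike the nested loops in the question, this makes a single vectorised pass over the `ID` column, which should matter on a data frame with 143066 rows.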