# Delete observations with a frequency < x

4 messages

## Delete observations with a frequency < x

Hi,

I have two columns with data (both identifiers - it's an affiliation list) and I would like to delete the rows in which the observations in the second column have a frequency < 5 in the entire second column. Example:

    1     a
    1     b
    1     c
    2     a
    2     b
    2     d

Let's say I would like to delete the rows in which the observation in the second column has a frequency < 2 in the entire second column. This would result in:

    1     a
    1     b
    2     a
    2     b

How can I do this? Thanks in advance!

Mathijs

## Re: Delete observations with a frequency < x

Suppose this is your data frame:

    > df = data.frame(x=c(1,1,1,2,2,2),y=c('a','b','c','a','b','d'))
    > df
      x y
    1 1 a
    2 1 b
    3 1 c
    4 2 a
    5 2 b
    6 2 d
    > df[!table(df$y)[df$y] < 2,]
      x y
    1 1 a
    2 1 b
    4 2 a
    5 2 b

Note that this will only work properly if y is a factor or character variable. If y was numeric, you would need

    df[!table(df$y)[as.character(df$y)] < 2,]

- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  [hidden email]

On Thu, 9 Dec 2010, mathijsdevaan wrote:

> Hi,
>
> I have two columns with data (both identifiers - it's an affiliation list)
> and I would like to delete the rows in which the observations in the second
> column have a frequency < 5 in the entire second column. Example:
>
>     1     a
>     1     b
>     1     c
>     2     a
>     2     b
>     2     d
>
> Let's say, I would like to delete the rows in which the observation in the
> second column has a frequency < 2 in the entire second column. This would
> result in:
>
>     1     a
>     1     b
>     2     a
>     2     b
>
> How can I do this? Thanks in advance!
>
> Mathijs
> --
> View this message in context: http://r.789695.n4.nabble.com/Delete-observations-with-a-frequency-x-tp3081226p3081226.html
> Sent from the R help mailing list archive at Nabble.com.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
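To illustrate the note about a numeric y, here is a minimal sketch. The data frame `num`, its values, and the names `counts`, `per_row` and `kept` are made up for illustration and are not from the thread. The point is that `table()` labels its counts with character names, so a numeric column has to be converted with `as.character()` before it is used as a lookup key; otherwise the numbers are treated as positions.

```r
# Hypothetical data: same shape as the example above, but y is numeric
num <- data.frame(x = c(1, 1, 1, 2, 2, 2),
                  y = c(10, 20, 30, 10, 20, 40))

# One count per distinct value; the names are "10", "20", "30", "40"
counts <- table(num$y)

# Look each row's count up by name, not by position
per_row <- as.vector(counts[as.character(num$y)])

# Keep rows whose y value occurs at least twice
kept <- num[per_row >= 2, ]
kept
```

With this data, rows 1, 2, 4 and 5 (the ones with y equal to 10 or 20) should remain.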

## Re: Delete observations with a frequency < x

mathijsdevaan wrote on 12/09/2010 04:21:54 PM:

> I have two columns with data (both identifiers - it's an affiliation list)
> and I would like to delete the rows in which the observations in the second
> column have a frequency < 5 in the entire second column. Example:
>
>     1     a
>     1     b
>     1     c
>     2     a
>     2     b
>     2     d
>
> Let's say, I would like to delete the rows in which the observation in the
> second column has a frequency < 2 in the entire second column. This would
> result in:
>
>     1     a
>     1     b
>     2     a
>     2     b
>
> How can I do this? Thanks in advance!

It's not clear whether you want to delete rows where the value in the second column occurs fewer than 5 times or fewer than 2 times. I'll assume the latter.

    foo <- data.frame(k=rep(1:2, each=3), x=letters[c(1,2,3,1,2,4)])
    bar <- subset(foo, x %in% names(table(foo$x))[table(foo$x)>=2])

No doubt others can write this more succinctly.

--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
[hidden email]
541/754-4638
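As one possible alternative, not proposed by anyone in the thread, the per-row frequency can be computed directly with base R's `ave()`, which avoids looking counts up by name and so should behave the same for factor, character and numeric columns. The helper name `drop_rare` and its arguments are hypothetical; this is a sketch, and missing values in the column are not handled.

```r
foo <- data.frame(k = rep(1:2, each = 3), x = letters[c(1, 2, 3, 1, 2, 4)])

# Hypothetical helper: keep rows whose value in `column` occurs
# at least `min_count` times in the whole column
drop_rare <- function(data, column, min_count) {
  # For every row, the number of rows sharing that row's value
  freq <- ave(seq_len(nrow(data)), data[[column]], FUN = length)
  data[freq >= min_count, , drop = FALSE]
}

bar <- drop_rare(foo, "x", 2)
bar
```

Passing the threshold as an argument also covers the original "frequency < 5" case: `drop_rare(foo, "x", 5)` would return no rows for this small example.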