Subset based on multiple values

Amanduh320
I'm stuck on a seemingly simple problem. I'm trying to subset the data by several numbers and it cuts out half of the rows. Here is the sample code:

test2 <- as.matrix(c(1,1,1,1,3,3,7,7,7,7))
Count <- tapply(test2[,1], test2[,1], length)   # count for each value
spp <- unique(test2[,1])
Count1 <- as.data.frame(cbind(Count,spp))
Max <- max(Count)
Count1$sppMax <- ifelse(Count1$Count >= Max, Count1$spp, 0)  # only keep values that =Max
Count2 <- subset(Count1, sppMax > 0)    #get rid of values that are less than Max
AllMax <- unique(Count2$sppMax)
test3 <- subset(test2, test2[,1] == AllMax)

This works when there is only one value in AllMax. How can I make it work when there are multiple?

Re: Subset based on multiple values

Yasir Kaheil
I see that this is your input: c(1,1,1,1,3,3,7,7,7,7)
What do you want the output to be?
What could the multiple values of AllMax be?
Yasir Kaheil

Re: Subset based on multiple values

Amanduh320
I want the output to be 1,1,1,1,7,7,7,7
The multiple values of AllMax are 1 and 7.

I think I've figured it out, though. I added a loop at the end:

test <- as.matrix(c(1,1,1,1,3,3,7,7,7,7))
Count <- tapply(test[,1], test[,1], length)   # count for each value
spp <- unique(test[,1])
Count1 <- as.data.frame(cbind(spp,Count))
Max <- max(Count1$Count)
Count1$sppMax <- ifelse(Count1$Count >= Max, Count1$spp, 0)  # only keep values that =Max
Count2 <- subset(Count1, sppMax > 0)    #get rid of values that are less than Max
AllMax <- unique(Count2$sppMax)

test2 <- test[0, , drop = FALSE]   # empty matrix with the same columns as test
for (i in seq_along(AllMax)) {     # one pass of the loop for each value in AllMax
  tempset <- subset(test, test[,1] == AllMax[i])   # rows of test whose column 1 matches the current value of AllMax
  test2 <- rbind(test2, tempset)   # append to the empty matrix (first pass) or to the rows collected so far (later passes)
}
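
One thing to watch in the counting step above: tapply() returns its counts ordered by sorted value, while unique() returns values in order of first appearance. The two happen to line up for the sample data because it is already sorted. The sketch below uses a made-up unsorted input (x, spp_seen, wrong and right are illustrative names, not from the thread) to show what can go wrong, and one way to keep counts and values aligned by reading the values from the names of Count.

```r
# Hypothetical unsorted input: 7 is the most frequent value
x <- as.matrix(c(7, 7, 7, 7, 3, 3, 1, 1, 1))

Count <- tapply(x[, 1], x[, 1], length)  # ordered by sorted value: 1, 3, 7
spp_seen <- unique(x[, 1])               # ordered by first appearance: 7, 3, 1

# Pairing the two by position attaches the top count to the wrong value
wrong <- spp_seen[Count == max(Count)]

# Taking the values from the names of Count keeps them aligned
right <- as.numeric(names(Count))[Count == max(Count)]

print(wrong)
print(right)
```

With this input, wrong picks 1 and right picks 7, so replacing spp <- unique(test[,1]) with spp <- as.numeric(names(Count)) looks like the safer choice when the data may not be sorted.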

Re: Subset based on multiple values

andrija djurovic
In reply to this post by Amanduh320
Hi. Maybe this:

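ct <- table(test)                               # count of each distinct value
modes <- as.numeric(names(ct[ct == max(ct)]))   # value(s) with the highest count
test[test[,1] %in% modes, , drop = FALSE]       # keep matching rows; drop = FALSE keeps the matrix shape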

?

Andrija

On Wed, Jul 11, 2012 at 8:33 PM, Amanduh320 <[hidden email]> wrote:

> I'm stuck on a seemingly simple problem. I'm trying to subset the data by
> several numbers and it cuts out half of the rows. Here is the sample code:
>
> test <- as.matrix(c(1,1,1,1,3,3,7,7,7,7))
> Count <- tapply(test[,1], test[,1], length)   # count for each value
> spp <- unique(test[,1])
> Count1 <- as.data.frame(cbind(Count,spp))
> Max <- max(Count)
> Count1$sppMax <- ifelse(Count1$Count >= Max, Count1$spp, 0)  # only keep values that =Max
> Count2 <- subset(Count1, sppMax > 0)    #get rid of values that are less than Max
> AllMax <- unique(Count2$sppMax)
> test2 <- subset(test, test[,1] == AllMax)
>
>
> this works where there is only one value for AllMax, how to make it work
> when there are multiple?
>
> Thank you!
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Subset-based-on-multiple-values-tp4636159.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Re: Subset based on multiple values

Yasir Kaheil
In reply to this post by Amanduh320
Yes, just use %in% instead of == AllMax, and also use table() for the counts.
Yasir Kaheil
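
For anyone wondering why == drops half of the expected rows: when AllMax has more than one value, == recycles it along the column and compares element by element, whereas %in% tests each element for membership. A minimal sketch with the sample data from the thread (keep_eq and keep_in are illustrative names):

```r
test <- as.matrix(c(1, 1, 1, 1, 3, 3, 7, 7, 7, 7))
AllMax <- c(1, 7)

# == recycles AllMax along the column, so the comparison is
# against 1, 7, 1, 7, ... position by position
keep_eq <- test[, 1] == AllMax

# %in% asks, for each element, whether it appears anywhere in AllMax
keep_in <- test[, 1] %in% AllMax

print(sum(keep_eq))   # rows kept with ==
print(sum(keep_in))   # rows kept with %in%
print(subset(test, keep_in))
```

Here == keeps only 4 of the 8 rows that %in% keeps, which matches the "cuts out half of the rows" symptom in the original question.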

Re: Subset based on multiple values

arun kirshna
In reply to this post by Amanduh320
Hi,

Try this (a variant of Andrija's):
 testct <- table(test)
 subset(test, !is.na(match(test, as.integer(names(testct[testct %in% max(testct)])))))

     [,1]
[1,]    1
[2,]    1
[3,]    1
[4,]    1
[5,]    7
[6,]    7
[7,]    7
[8,]    7

A.K.
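
A short sketch of how the match() form above relates to %in%, and of one caveat with as.integer(). As far as I know, %in% is defined in terms of match(), so !is.na(match(x, y)) and x %in% y should give the same logical vector. The second half uses made-up non-integer data (y and yct are illustrative names) to show that as.integer() truncates the table names, so as.numeric() is probably the safer conversion in general.

```r
test <- as.matrix(c(1, 1, 1, 1, 3, 3, 7, 7, 7, 7))
testct <- table(test)
top <- as.numeric(names(testct[testct == max(testct)]))

# The two membership tests produce the same logical vector
via_match <- !is.na(match(test[, 1], top))
via_in <- test[, 1] %in% top
print(identical(via_match, via_in))

# Hypothetical non-integer data: 1.5 is the most frequent value
y <- c(1.5, 1.5, 2)
yct <- table(y)
top_int <- as.integer(names(yct[yct == max(yct)]))  # truncated toward zero
top_num <- as.numeric(names(yct[yct == max(yct)]))  # value preserved
print(top_int)
print(top_num)
```

With integer-valued data like the sample in this thread, both conversions give the same result.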


