dplyr filter function returns all the levels

R help mailing list-2
Hello everyone,

I have the following data frame:

        
    library(dplyr)
    dput(df)
    structure(list(Freq = c(19L, 19L, 18L, 15L, 14L, 13L, 13L, 12L, 
   11L, 11L, 11L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 9L), word1 = structure(c(3L, 
   11L, 5L, 6L, 11L, 3L, 7L, 10L, 8L, 11L, 13L, 1L, 1L, 4L, 5L, 
   9L, 12L, 14L, 2L), .Label = c("a", "art", "at", "by", "for", 
   "for ", "i", "is", "on", "said", "the", "there", "to", "when"
  ), class = "factor"), word2 = structure(c(13L, 12L, 13L, 13L, 
  6L, 13L, 5L, 8L, 11L, 4L, 2L, 3L, 10L, 13L, 13L, 13L, 1L, 9L, 
  7L), .Label = c("are", "be", "bit", "bottom", "dont", "end", 
  "hotel", "in", "it", "little", "one", "rest", "the"), class = "factor"), 
  word3 = structure(c(5L, 9L, 6L, 7L, 9L, 11L, 13L, 1L, 9L, 
  9L, 2L, 9L, 3L, 5L, 6L, 10L, 12L, 4L, 8L), .Label = c("a", 
  "able", "bit", "comes", "end", "first", "first ", "florence", 
  "of", "other", "same", "so", "want"), class = "factor"), 
  word4 = structure(c(5L, 9L, 10L, 8L, 9L, 10L, 11L, 7L, 9L, 
  9L, 11L, 1L, 5L, 5L, 6L, 2L, 4L, 11L, 3L), .Label = c("a", 
  "hand", "italy", "many", "of", "place", "statement", "states", 
  "the", "time", "to"), class = "factor")), class = "data.frame", row.names = c(NA, 
  -19L))


Is there a way to modify the following command so that the filter function doesn't return all the Levels?



   filter(df,(word1 == 'for' & word2 == 'the' & word3 == 'first')) $ word4
   [1] time  place
   Levels: a hand italy many of place statement states the time to



Thanks a lot! 
Elahe 

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: dplyr filter function returns all the levels

Jeff Newmiller
Don't use factors in the first place? Use character data.
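
As a minimal sketch of what using character data could look like in base R (the data frame `toy` and its values are hypothetical, not the poster's `df`), one option is to convert every factor column to character before subsetting. A character vector carries no levels, so nothing extra is printed:

```r
# Toy data (hypothetical, not the poster's df): one factor column.
toy <- data.frame(
  word3 = c("first", "end", "first"),
  word4 = factor(c("time", "of", "place")),
  stringsAsFactors = FALSE
)

# Find the factor columns and convert each of them to character.
is_fac <- vapply(toy, is.factor, logical(1))
toy[is_fac] <- lapply(toy[is_fac], as.character)

# Subsetting a character column returns plain strings, with no "Levels:" line.
res <- toy$word4[toy$word3 == "first"]
print(res)
```

When the data frame is built with `data.frame()` or read with `read.csv()`, passing `stringsAsFactors = FALSE` avoids creating the factors in the first place.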

On March 27, 2020 4:17:57 AM PDT, Elahe chalabi via R-help <[hidden email]> wrote:


--
Sent from my phone. Please excuse my brevity.


Re: dplyr filter function returns all the levels

Rui Barradas
Hello,

Here are two ways. The first keeps word4 as a factor and drops its unused
levels; the second coerces the factor columns to character first, as Jeff suggested.

df %>%
   filter(word1 == 'for' & word2 == 'the' & word3 == 'first') %>%
   pull(word4) %>%
   droplevels()


df %>%
   mutate_if(is.factor, as.character) %>%
   filter(word1 == 'for' & word2 == 'the' & word3 == 'first') %>%
   pull(word4)
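
For readers wondering why the levels show up at all: subsetting a factor keeps its full set of levels, whether or not the values still occur. The sketch below shows this on a toy factor in base R (the name `w4` and its values are made up for illustration), together with two ways of trimming the unused levels:

```r
# Toy factor (hypothetical values) with four levels.
w4 <- factor(c("time", "of", "place", "the"))

# Subsetting keeps the full level set, even for values no longer present.
sub <- w4[c(1, 3)]
print(levels(sub))

# droplevels() keeps only the levels that still occur in the data.
trimmed <- droplevels(sub)
print(levels(trimmed))

# The drop = TRUE argument of `[` for factors does the same in one step.
trimmed2 <- w4[c(1, 3), drop = TRUE]
print(levels(trimmed2))
```

Which of the two to use is mostly a matter of taste; `droplevels()` also accepts a whole data frame and trims every factor column in it.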



Hope this helps,

Rui Barradas

At 12:41 on 27/03/20, Jeff Newmiller wrote:

> Don't use factors in the first place? Use character data.
