select from data frame

select from data frame

R help mailing list-2
Dear All,

I wonder if you could please assist with the following:

df<-data.frame(ID=c(1,1,1,2,2,3,3,4,4,5,5),samples=c("A","B","C","A","C","A","D","C","B","A","C"))

From this data frame, the goal is to extract the value 3 from the ID column, on the logic that ID=3 is the only ID with no row pairing it with either "B" or "C" (or both) in the samples column.


I would much appreciate your help.

thanks,
 Andras

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: select from data frame

David Winsemius

> On Jul 15, 2017, at 4:01 AM, Andras Farkas via R-help <[hidden email]> wrote:
>
> Dear All,
>
> wonder if you could please assist with the following
>
> df<-data.frame(ID=c(1,1,1,2,2,3,3,4,4,5,5),samples=c("A","B","C","A","C","A","D","C","B","A","C"))
>
> from this data frame the goal is to extract the value of 3 from the ID column based on the logic that the ID=3 in the data frame has NO row that would pair 3 with either "B", AND/OR "C" in the samples column...
>

This returns a vector with one element per row, flagging the rows whose ID has neither of those values anywhere in its samples. Coercing to character was needed because leaving samples as the factor that data.frame() created generated an invalid factor level warning and gave useless results.

 with( df, ave( as.character(samples), ID, FUN=function(x) {!any(x %in% c("B","C"))}))
 [1] "FALSE" "FALSE" "FALSE" "FALSE" "FALSE" "TRUE"  "TRUE"  "FALSE" "FALSE"
[10] "FALSE" "FALSE"

You can then use it to extract and consolidate to a single value (although wrapping with as.logical was needed because `ave` returned character class values):

 unique( df$ID[ as.logical(   # fails without this since "FALSE" != FALSE
                    with( df,
                       ave( as.character(samples), ID, FUN=function(x) {!any(x %in% c("B","C"))})))
              ] )
#[1] 3
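
As a side note on the character result above: ave() hands back a vector of the same type as its first argument, which is why the logical values came out as strings. A minimal sketch of one alternative, assuming a single logical value per ID is acceptable, swaps in tapply() instead; this is not the approach used above, and the names no_bc and ids are made up for illustration.

```r
df <- data.frame(ID = c(1,1,1,2,2,3,3,4,4,5,5),
                 samples = c("A","B","C","A","C","A","D","C","B","A","C"))

# One logical value per ID: TRUE when the group contains neither "B" nor "C".
# as.character() keeps this working whether samples is a factor or character.
no_bc <- tapply(as.character(df$samples), df$ID,
                function(x) !any(x %in% c("B", "C")))

# tapply() labels its result by group, so the IDs are recovered from the names.
ids <- as.numeric(names(no_bc)[no_bc])
ids
#[1] 3
```

Because the result stays logical, no as.logical() wrapper is needed, at the cost of converting the group names back to numbers.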

The same sort of logic could also be constructed with a for-loop:

> for (x in unique(df$ID) ) { if ( !any( df$samples[df$ID==x] %in% c("B","C")) ) print(x) }
[1] 3

Be warned, though, that for-loops do not return values, so you might need to make an assignment inside the loop rather than just printing.
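
A minimal sketch of that assignment, assuming the result should end up in a vector; the name keep is hypothetical:

```r
df <- data.frame(ID = c(1,1,1,2,2,3,3,4,4,5,5),
                 samples = c("A","B","C","A","C","A","D","C","B","A","C"))

keep <- c()                      # accumulator for the qualifying IDs
for (x in unique(df$ID)) {
  # Append x only when none of its samples is "B" or "C".
  if (!any(df$samples[df$ID == x] %in% c("B", "C"))) {
    keep <- c(keep, x)
  }
}
keep
#[1] 3
```

Growing a vector with c() is fine for a handful of IDs; for many groups a preallocated logical vector would probably be the better choice.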

--

David Winsemius
Alameda, CA, USA

Re: select from data frame

Bert Gunter-2
If I understand correctly, no looping (ave(), for()) or type casting
(as.character()) is needed -- indexing and matching suffice:

> with(df, ID[!ID %in% unique(ID[samples %in% c("B","C") ])])
[1] 3 3
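
The result has one element per matching row, which is why 3 appears twice. If a single value per ID is wanted, one possible sketch is to wrap the result in unique(); the name hits is made up for illustration.

```r
df <- data.frame(ID = c(1,1,1,2,2,3,3,4,4,5,5),
                 samples = c("A","B","C","A","C","A","D","C","B","A","C"))

# IDs of the rows whose ID never occurs together with "B" or "C".
hits <- with(df, ID[!ID %in% ID[samples %in% c("B", "C")]])
hits
#[1] 3 3

# Collapse the repeated rows to one value per ID.
unique(hits)
#[1] 3
```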



Cheers,

Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: select from data frame

Bert Gunter-2
...
and here is a slightly cleaner and more transparent way of doing the
same thing (setdiff() does the matching)

> with(df, setdiff(ID,ID[samples %in% c("B","C") ]))
[1] 3
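
If the same selection is needed for other sample values, the setdiff() idea could be wrapped in a small helper. This is only a sketch: the function name ids_without and its arguments are hypothetical, and it assumes the columns are called ID and samples as in the example data.

```r
df <- data.frame(ID = c(1,1,1,2,2,3,3,4,4,5,5),
                 samples = c("A","B","C","A","C","A","D","C","B","A","C"))

# Hypothetical helper: IDs that never occur with any of `values` in samples.
ids_without <- function(data, values) {
  excluded <- data$ID[data$samples %in% values]  # IDs seen with a listed value
  setdiff(data$ID, excluded)                     # setdiff() also drops duplicates
}

ids_without(df, c("B", "C"))
#[1] 3
ids_without(df, "A")
#[1] 4
```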

-- Bert



Re: select from data frame

R help mailing list-2
Thank you, David and Bert. These solutions will work for me.

Andras
