how to use AND in grepl

R help mailing list-2
Hi all,

I have one factor variable in my df and I want to extract the names from it which contain both "t2" and "pd":

  'data.frame': 36919 obs. of 162 variables
   $TE                :int 38,41,11,52,48,75,.....
   $TR                :int 100,210,548,546,.....
   $Command          :factor W/2229 levels "_localize_PD","_localize_tre_t2","_abdomen_t1_seq","knee_pd_t1_localize","pd_local_abdomen_t2"...

I tried this, but I did not get any result:
   
  t2pd=subset(df,grepl("t2",Command) & grepl("pd",Command))


Does anyone know how to apply AND in grepl?

Thanks
Elahe

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: how to use AND in grepl

R help mailing list-2
Your code looks fine to me.  What did t2pd look like?

I tried reproducing the problem in R-3.2.4(Revised) and everything worked
(although the output of str() looked a bit different - perhaps you have an
old version of R)

> df <- data.frame(TE=1:10, TR=101:110,
Command=c("pd_local_abdomen_t2","knee_pd_t1_localize","PD_localize_tre_t2","t2_localize_PD")[rep(1:4,len=10)])
> str(df)
'data.frame':   10 obs. of  3 variables:
 $ TE     : int  1 2 3 4 5 6 7 8 9 10
 $ TR     : int  101 102 103 104 105 106 107 108 109 110
 $ Command: Factor w/ 4 levels "knee_pd_t1_localize",..: 2 1 3 4 2 1 3 4 2 1
> subset(df,grepl("t2",Command) & grepl("pd",Command))
  TE  TR             Command
1  1 101 pd_local_abdomen_t2
5  5 105 pd_local_abdomen_t2
9  9 109 pd_local_abdomen_t2
> subset(df,grepl("t2",Command,ignore.case=TRUE) &
grepl("pd",Command,ignore.case=TRUE))
  TE  TR             Command
1  1 101 pd_local_abdomen_t2
3  3 103  PD_localize_tre_t2
4  4 104      t2_localize_PD
5  5 105 pd_local_abdomen_t2
7  7 107  PD_localize_tre_t2
8  8 108      t2_localize_PD
9  9 109 pd_local_abdomen_t2
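
The same idea can be extended to any number of patterns. This is only a sketch: grepl_all is a hypothetical helper name (not a base R function), and the cmd vector reuses the four example names from above. It runs one grepl() per pattern and combines the results with `&`.

```r
# Hypothetical helper: TRUE where x matches every pattern in `patterns`.
grepl_all <- function(patterns, x, ignore.case = FALSE) {
  hits <- lapply(patterns, grepl, x = x, ignore.case = ignore.case)
  Reduce(`&`, hits)
}

cmd <- c("pd_local_abdomen_t2", "knee_pd_t1_localize",
         "PD_localize_tre_t2", "t2_localize_PD")

# Case-sensitive: only names containing lowercase "t2" and lowercase "pd".
case_sensitive <- grepl_all(c("t2", "pd"), cmd)

# Case-insensitive: "PD" counts as "pd" as well.
case_insensitive <- grepl_all(c("t2", "pd"), cmd, ignore.case = TRUE)

print(case_sensitive)
print(case_insensitive)
```

Since grepl() coerces its input with as.character(), the helper should behave the same on a factor column such as df$Command.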


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Apr 30, 2016 at 2:38 PM, ch.elahe via R-help <[hidden email]>
wrote:

> Hi all,
>
> I have one factor variable in my df and I want to extract the names from
> it which contain both "t2" and "pd":

Re: how to use AND in grepl

Tom Wright-9
In reply to this post by R help mailing list-2
subset(df,grepl("t2|pd",x$Command))
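
For reference, `|` inside a regular expression means alternation (OR), so a pattern like the one above keeps names that contain either string. A small sketch of the difference, using the five level names quoted in the original post (the vector cmd is made up for illustration):

```r
cmd <- c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
         "knee_pd_t1_localize", "pd_local_abdomen_t2")

# "t2" OR "pd": TRUE as soon as one of the two strings is present.
either <- grepl("t2|pd", cmd)

# "t2" AND "pd": both strings must be present in the same name.
both <- grepl("t2", cmd) & grepl("pd", cmd)

print(data.frame(cmd, either, both))
```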



Re: how to use AND in grepl

Tom Wright-9
Actually not sure my previous answer does what you wanted. Using your
approach:

 t2pd=subset(df,grepl("t2",df$Command) & grepl("pd",df$Command))

Should work.

I think the regex pattern you are looking for is:

 Subset(df,grepl("(.* t2.*pd.* )|(.* pd.* t2.*)",df$Command)

On Sat, Apr 30, 2016, 7:07 PM Tom Wright <[hidden email]> wrote:

> subset(df,grepl("t2|pd",x$Command))

Re: how to use AND in grepl

R help mailing list-2
Thanks for your reply tom. After using  Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command)  I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?
Thanks,
Elahe


On Saturday, April 30, 2016 7:35 PM, Tom Wright <[hidden email]> wrote:



Actually not sure my previous answer does what you wanted. Using your approach:
 t2pd=subset(df,grepl("t2",df$Command) & grepl("pd",df$Command))
Should work.
I think the regex pattern you are looking for is:
 Subset(df,grepl("(.* t2.*pd.* )|(.* pd.* t2.*)",df$Command)


Re: how to use AND in grepl

Peter Dalgaard-2

On 02 May 2016, at 12:43 , ch.elahe via R-help <[hidden email]> wrote:

> Thanks for your reply tom. After using  Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command)  I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?

Tom's code was missing a ")" but not where you put one. He probably also didn't intend to capitalize "subset".

-pd
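
To illustrate the two points above, here is a small sketch on the five level names from the original post (the data frame is a stand-in, not the real data). With the ")" closing grepl() too early, grepl() receives a pattern but no x, which is what the error message reports. Note also that the pattern as first posted contained spaces, and a regex matches those literally.

```r
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
                             "knee_pd_t1_localize", "pd_local_abdomen_t2"))
pat <- "(.*t2.*pd.*)|(.*pd.*t2.*)"

# Misplaced ")": grepl() is closed before it gets its second argument,
# so it has a pattern but nothing to search in. Capture the error text.
misplaced <- tryCatch(
  subset(df, grepl(pat), df$Command),
  error = function(e) conditionMessage(e)
)
print(misplaced)

# Intended call: lowercase subset(), with df$Command inside grepl().
fixed <- subset(df, grepl(pat, df$Command))
print(fixed)
```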


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]


Re: how to use AND in grepl

R help mailing list-2
Thanks Peter, you were right, the exact grepl is grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change anything in Command, when I check the size of it by sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))  the result is 0, but I am sure that the size is not 0. It seems that this AND does not work.
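
The thread does not establish why the count was 0 on the real data. One possibility, and it is only an assumption, is that the matching names use a different case than the pattern. A quick check along these lines can confirm or rule that out (the three names below are invented for the example):

```r
# Invented names in which "t2" never appears in lowercase.
cmd <- factor(c("PD_localize_tre_T2", "knee_pd_t1_localize", "T2_local_PD"))
pat <- "(.*t2.*pd.*)|(.*pd.*t2.*)"

# Case-sensitive count, as in the message above.
n_sensitive <- sum(grepl(pat, cmd))

# Same pattern, ignoring case.
n_insensitive <- sum(grepl(pat, cmd, ignore.case = TRUE))

print(c(sensitive = n_sensitive, insensitive = n_insensitive))
```

If the two counts differ on the real column, case is the likely culprit; if both are 0, the names probably do not contain both strings in the form expected.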
 

On Monday, May 2, 2016 5:05 AM, peter dalgaard <[hidden email]> wrote:

On 02 May 2016, at 12:43 , ch.elahe via R-help <[hidden email]> wrote:

> Thanks for your reply tom. After using  Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command)  I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?

Tom's code was missing a ")" but not where you put one. He probably also didn't intend to capitalize "subset".


-pd


Re: how to use AND in grepl

Tom Wright-9
Sorry for the missed braces earlier. I was typing on a phone, not the best
place to conjugate regular expressions.
Using the example you provided:

> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2",
"_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))

> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
[1] FALSE FALSE FALSE FALSE  TRUE

> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
              Command
5 pd_local_abdomen_t2
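
If a single pattern is preferred, an alternative (not used elsewhere in this thread) is a pair of lookahead assertions, which requires perl = TRUE. A sketch on the same five example names, compared against the two approaches already shown:

```r
cmd <- c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
         "knee_pd_t1_localize", "pd_local_abdomen_t2")

# Each (?=...) checks, from the start of the string, that its text occurs
# somewhere later, without consuming characters; both must succeed.
lookahead <- grepl("^(?=.*t2)(?=.*pd)", cmd, perl = TRUE)

# The two forms discussed in the thread, for comparison.
two_calls   <- grepl("t2", cmd) & grepl("pd", cmd)
alternation <- grepl("(.*t2.*pd.*)|(.*pd.*t2.*)", cmd)

print(rbind(lookahead, two_calls, alternation))
```

The lookahead form grows by one group per extra term, whereas the alternation form needs one branch per ordering of the terms.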


On Mon, May 2, 2016 at 7:42 AM, <[hidden email]> wrote:

> Thanks Peter, you were right, the exact grepl is
> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change
> anything in Command, when I check the size of it by
> sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))  the result is 0, but I
> am sure that the size is not 0. It seems that this AND does not work.

Re: how to use AND in grepl

R help mailing list-2
Yes it works, but let me explain what I am going to do. I extract all the names I want and then create a new column out of them for my plot. This is the whole thing I do:
  library(stringr)
  PD=subset(df,grepl("pd",Command)) # extract names in Command with only "pd"
  t2=subset(df,grepl("t2",Command)) # extract names with only "t2"
  PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) # extract names which contain both "pd" and "t2"
  v1=c('PD','t2','PDT2') # I create a vector with these conditions
  str_extract(df$Command,paste(v1,collapse='|')) # returning patterns, using stringr library

Here I see no pattern named PDT2; there are only PD and t2 patterns.
On Monday, May 2, 2016 8:18 AM, Tom Wright <[hidden email]> wrote:



Sorry for the missed braces earlier. I was typing on a phone, not the best place to conjugate regular expressions.
Using the example you provided:

> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))

> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
[1] FALSE FALSE FALSE FALSE  TRUE

> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
              Command
5 pd_local_abdomen_t2




Re: how to use AND in grepl

Tom Wright-9
The first thing I notice here is that your first two subset statements are
searching in an object named Command, not the column df$Command. I'm not at
all sure what you are trying to achieve with the str_extract process, but it
is looking for the exact string 'PDT2'; the vectors / data frames formed in
your previous commands are not being used at all.
Moving forward I think you need to pay attention to case: "PD" != "pd". Also
the set PDT2 is going to be a subset of both sets PD and t2; I don't think
this is what you are after.
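
A short sketch of the two points, on the five example names from earlier in the thread. On the first point, note that subset() evaluates its condition inside the data frame, so a bare Command there resolves to the column and gives the same rows as df$Command. On the second, the extraction below is a base R stand-in for the stringr call (an assumption made so the example runs without extra packages); it shows that only literal pieces of each name can come back.

```r
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
                             "knee_pd_t1_localize", "pd_local_abdomen_t2"))

# subset() looks up Command in df first, so both calls select the same rows.
same_rows <- identical(subset(df, grepl("pd", Command)),
                       subset(df, grepl("pd", df$Command)))
print(same_rows)

# "PD|t2|PDT2" is matched literally against each name; the data frames
# called PD, t2 and PDT2 play no part in it.
x <- as.character(df$Command)
m <- regexpr("PD|t2|PDT2", x)
hit <- as.integer(m) > 0                 # regexpr() returns -1 for no match
first_match <- rep(NA_character_, length(x))
first_match[hit] <- regmatches(x, m)     # matched substrings only
print(first_match)
```

No name contains the literal text "PDT2", so that label can never be returned this way, and lowercase "pd" is not matched by "PD".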

On Mon, May 2, 2016, 8:49 AM <[hidden email]> wrote:

> Yes it works, but let me explain what I am going to do. I extract all the
> names I want and then create a new column out of them for my plot. This is
> the whole thing I do:
>   library(stringr)
>   PD=subset(df,grepl("pd",Command)) # extract names in Command with only "pd"
>   t2=subset(df,grepl("t2",Command)) # extract names with only "t2"
>   PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) # extract names which contain both "pd" and "t2"
>   v1=c('PD','t2','PDT2') # I create a vector with these conditions
>   str_extract(df$Command,paste(v1,collapse='|')) # returning patterns, using stringr library
>
> Here I see no pattern named PDT2; there are only PD and t2 patterns.

Re: how to use AND in grepl

R help mailing list-2
I just changed all the names in Command to lowercase, then this str_extract works fine for "pd" and "t2", but not for "PDT2". Do you have any idea how I can bring PDT2  also in str_extract?  
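
One way to get a PDT2 label into a new column, sketched here under the assumption that each name should receive exactly one of the labels "PD", "t2" or "PDT2" (the column name Group is hypothetical), is to build the label from two logical tests rather than extracting it from the text:

```r
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
                             "knee_pd_t1_localize", "pd_local_abdomen_t2"))

# Lowercase once so that "PD" and "pd" are treated alike.
cmd    <- tolower(as.character(df$Command))
has_pd <- grepl("pd", cmd, fixed = TRUE)
has_t2 <- grepl("t2", cmd, fixed = TRUE)

# Test the combined case first so that names containing both strings
# are not absorbed by the "PD" or "t2" label.
df$Group <- ifelse(has_pd & has_t2, "PDT2",
            ifelse(has_pd, "PD",
            ifelse(has_t2, "t2", NA_character_)))
print(df)
```

Names that contain neither string get NA, which most plotting functions either drop or show as a separate group.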


On Monday, May 2, 2016 9:16 AM, Tom Wright <[hidden email]> wrote:



The first thing I notice here is that your first two subset statements are searching in an object named Command, not the column df$Command. I'm not at all sure what you are trying to achieve with the str_extract process, but it is looking for the exact string 'PDT2'; the vectors / data frames formed in your previous commands are not being used at all.
Moving forward I think you need to pay attention to case: "PD" != "pd". Also the set PDT2 is going to be a subset of both sets PD and t2; I don't think this is what you are after.

On Mon, May 2, 2016, 8:49 AM  <[hidden email]> wrote:

Yes it works, but let me explain what I am going to do. I extract all the names I want and then create a new column out of them for my plot. This is he whole thing I do:

>  PD=subset(df,grepl("pd",Command)) //extract names in Command with only "pd"
>  t2=subset(df,grepl("t2",Command)) //extract names with only "t2"
>  PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command) // extract names which contain both "pd" and "t2"
>  v1=c('PD','t2','PDT2')// I create a vector with these conditions
>  str_extract(df$Command,paste(v1,collaps='|')) //returning patterns, using stringr library
>
>here I see no pattern named PDT2 but there are only PD and t2 patterns.
>On Monday, May 2, 2016 8:18 AM, Tom Wright <[hidden email]> wrote:
>
>
>
>Sorry for the missed braces earlier. I was typing on a phone, not the best place to conjugate regular expressions.
>Using the example you provided:
>
>> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))
>
>> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
>[1] FALSE FALSE FALSE FALSE  TRUE
>
>> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
>              Command
>5 pd_local_abdomen_t2
>
>
>
>On Mon, May 2, 2016 at 7:42 AM, <[hidden email]> wrote:
>
>Thanks Peter, you were right, the exact grepl is grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change anything in Command, when I check the size of it by sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))  the result is 0, but I am sure that the size is not 0. It seems that this AND does not work.
>>
>>
>>
>>On Monday, May 2, 2016 5:05 AM, peter dalgaard <[hidden email]> wrote:
>>
>>On 02 May 2016, at 12:43 , ch.elahe via R-help <[hidden email]> wrote:
>>
>>> Thanks for your reply tom. After using  Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command)  I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?
>>
>>Tom's code was missing a ")" but not where you put one. He probably also didn't intend to capitalize "subset".
>>
>>
>>-pd
>>
>>> Thanks,
>>> Elahe
>>>
>>>
>>> On Saturday, April 30, 2016 7:35 PM, Tom Wright <[hidden email]> wrote:
>>>
>>>
>>>
>>> Actually not sure my previous answer does what you wanted. Using your approach:
>>> t2pd=subset(df,grepl("t2",df$Command) & grepl("pd",df$Command))
>>> Should work.
>>> I think the regex pattern you are looking for is:
>>> Subset(df,grepl("(.* t2.*pd.* )|(.* pd.* t2.*)",df$Command)
>>>
>>> On Sat, Apr 30, 2016, 7:07 PM Tom Wright <[hidden email]> wrote:
>>>
>>>> subset(df,grepl("t2|pd",df$Command))
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Apr 30, 2016 at 2:38 PM, ch.elahe via R-help <[hidden email]> wrote:
>>>>
>>>> Hi all,
>>>>>
>>>>> I have one factor variable in my df and I want to extract the names from it which contain both "t2" and "pd":
>>>>>
>>>>> 'data.frame': 36919 obs. of 162 variables
>>>>>  $TE                :int 38,41,11,52,48,75,.....
>>>>>  $TR                :int 100,210,548,546,.....
>>>>>  $Command          :factor W/2229 levels "_localize_PD","_localize_tre_t2","_abdomen_t1_seq","knee_pd_t1_localize","pd_local_abdomen_t2"...
>>>>>
>>>>> I have tried this but I did not get result:
>>>>>
>>>>> t2pd=subset(df,grepl("t2",Command) & grepl("pd",Command))
>>>>>
>>>>>
>>>>> does anyone know how to apply AND in grepl?
>>>>>
>>>>> Thanks
>>>>> Elahe
>>>>>
>>>
>>
>>--
>>Peter Dalgaard, Professor,
>>Center for Statistics, Copenhagen Business School
>>Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>Phone: (+45)38153501
>>Office: A 4.23
>>Email: [hidden email]  Priv: [hidden email]
>>
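
Editorial note: the thread so far uses two ways of writing "contains both t2 and pd". The sketch below checks, on the five factor levels quoted in the original question, that they select the same rows. The third form (look-aheads with perl = TRUE) does not appear in the thread and is shown only as an assumed alternative; the variable names are illustrative.

```r
# Sample values copied from the factor levels quoted in the question
cmd <- factor(c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
                "knee_pd_t1_localize", "pd_local_abdomen_t2"))

# 1. Two grepl() calls combined with the vectorised AND operator
both_and <- grepl("t2", cmd) & grepl("pd", cmd)

# 2. A single pattern that lists both possible orders
both_alt <- grepl("(.*t2.*pd.*)|(.*pd.*t2.*)", cmd)

# 3. Look-ahead assertions (not from the thread); these need perl = TRUE
both_look <- grepl("^(?=.*t2)(?=.*pd)", cmd, perl = TRUE)

print(both_and)
print(identical(both_and, both_alt))
print(identical(both_and, both_look))
```

All three are case-sensitive, so "_localize_PD" counts as not containing "pd" in every variant.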


Re: how to use AND in grepl

John McKown
On Mon, May 2, 2016 at 1:01 PM, ch.elahe via R-help <[hidden email]>
wrote:

> I just changed all the names in Command to lowercase, then this
> str_extract works fine for "pd" and "t2", but not for "PDT2". Do you have
> any idea how I can bring PDT2  also in str_extract?
>

Looking at ?grepl, I see the option: ignore.case=TRUE

 PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command,ignore.case=TRUE))

Perhaps this will do the trick.
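
Editorial note: a minimal sketch of what ignore.case changes, assuming a small data frame built from names that appear in this thread (the last one, "PD_localize_tre_t2", comes from the earlier reproduction). The object names pattern, strict and relaxed are illustrative only.

```r
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2",
                             "_abdomen_t1_seq", "knee_pd_t1_localize",
                             "pd_local_abdomen_t2", "PD_localize_tre_t2"))

pattern <- "(.*t2.*pd.*)|(.*pd.*t2.*)"

# Case-sensitive: an upper-case "PD" is not treated as "pd"
strict <- grepl(pattern, df$Command)

# Case-insensitive: "PD", "Pd", "pD" and "pd" all count
relaxed <- grepl(pattern, df$Command, ignore.case = TRUE)

print(sum(strict))
print(sum(relaxed))

# The same call used inside subset(), as in the suggestion above
PDT2 <- subset(df, grepl(pattern, Command, ignore.case = TRUE))
print(PDT2)
```

Lower-casing the column first with tolower() and then matching case-sensitively should give the same rows.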

--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.

Maranatha! <><
John McKown


Re: how to use AND in grepl

Tom Wright-9
In reply to this post by R help mailing list-2
Please try to read my earlier comments.
In the absence of a proper example with expected output I think what you
are trying to achieve is:

# create a sample dataframe
df <- data.frame(Command=c("_localize_PD", "_localize_tre_t2",
"_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))

# identify which rows in the dataframe set match the patterns
# note, the vectors PD, T2 and PDT2 are booleans indicating if a match was made
PD <- grepl("pd", df$Command)
T2 <- grepl('t2', df$Command)
PDT2 <- grepl("(.*t2.*pd.*)|(.*pd.*t2.*)", df$Command)

# create the new column to hold the new names
df$new_name <- NA

df[PD,'new_name'] <- 'pd'
df[T2,'new_name'] <- 't2'
df[PDT2,'new_name'] <- 'pdt2'


# note 1: the order of these commands is important; if the last command is
# run first, all matches will be overwritten by the single matches for 't2'
# and 'pd'.
# note 2: there is no match for row 1 as "PD" != "pd"; as suggested by John
# McKown, the ignore.case parameter for grepl can be used to change this
# behaviour.
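
Editorial note: the sketch below is one possible variant of the labelling step, not the code from the post above. It assumes the names have been lower-cased first (as the original poster reports doing) and uses nested ifelse() so the result does not depend on assignment order. It also shows why a pattern such as "pd|t2|pdt2" cannot return "pdt2": extraction only returns text that literally occurs in the string. The names cmd, has_pd, has_t2, new_name and first_hit are illustrative.

```r
cmd <- tolower(c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
                 "knee_pd_t1_localize", "pd_local_abdomen_t2"))

# One logical vector per keyword; fixed = TRUE treats them as plain text
has_pd <- grepl("pd", cmd, fixed = TRUE)
has_t2 <- grepl("t2", cmd, fixed = TRUE)

# Test the combined case first, then the single cases, otherwise NA
new_name <- ifelse(has_pd & has_t2, "pdt2",
            ifelse(has_pd, "pd",
            ifelse(has_t2, "t2", NA_character_)))
print(data.frame(Command = cmd, new_name = new_name))

# Extraction returns the first piece of text that matches. The literal
# string "pdt2" never occurs in this name, so only "pd" comes back.
x <- "pd_local_abdomen_t2"
first_hit <- regmatches(x, regexpr("pd|t2|pdt2", x))
print(first_hit)
```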

On Mon, May 2, 2016 at 11:01 AM, <[hidden email]> wrote:

> I just changed all the names in Command to lowercase, then this
> str_extract works fine for "pd" and "t2", but not for "PDT2". Do you have
> any idea how I can bring PDT2  also in str_extract?
>
>
> On Monday, May 2, 2016 9:16 AM, Tom Wright <[hidden email]> wrote:
>
>
>
> The first thing I notice here is that your first two subset statements are
> searching in an object named Command, not the column df$Command. I'm not at
> all sure what you are trying to achieve with the str_extract process, but it
> is looking for the exact string 'PDT2'; the vectors / dataframes formed in
> your previous commands are not being used at all.
> Moving forward, I think you need to pay attention to case: "PD" != "pd".
> Also, the set PDT2 is going to be a subset of both sets PD and t2; I don't
> think this is what you are after.
