|
|
Hi all,
I have one factor variable in my df and I want to extract the names from it which contain both "t2" and "pd":
'data.frame': 36919 obs. of 162 variables
$TE :int 38,41,11,52,48,75,.....
$TR :int 100,210,548,546,.....
$Command :factor W/2229 levels "_localize_PD","_localize_tre_t2","_abdomen_t1_seq","knee_pd_t1_localize","pd_local_abdomen_t2"...
I tried this but did not get a result:
t2pd=subset(df,grepl("t2",Command) & grepl("pd",Command))
Does anyone know how to apply AND in grepl?
Thanks
Elahe
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
|
|
Your code looks fine to me. What did t2pd look like?
I tried reproducing the problem in R-3.2.4(Revised) and everything worked
(although the output of str() looked a bit different - perhaps you have an
old version of R)
> df <- data.frame(TE=1:10, TR=101:110,
Command=c("pd_local_abdomen_t2","knee_pd_t1_localize","PD_localize_tre_t2","t2_localize_PD")[rep(1:4,len=10)])
> str(df)
'data.frame': 10 obs. of 3 variables:
$ TE : int 1 2 3 4 5 6 7 8 9 10
$ TR : int 101 102 103 104 105 106 107 108 109 110
$ Command: Factor w/ 4 levels "knee_pd_t1_localize",..: 2 1 3 4 2 1 3 4 2 1
> subset(df,grepl("t2",Command) & grepl("pd",Command))
TE TR Command
1 1 101 pd_local_abdomen_t2
5 5 105 pd_local_abdomen_t2
9 9 109 pd_local_abdomen_t2
> subset(df,grepl("t2",Command,ignore.case=TRUE) &
grepl("pd",Command,ignore.case=TRUE))
TE TR Command
1 1 101 pd_local_abdomen_t2
3 3 103 PD_localize_tre_t2
4 4 104 t2_localize_PD
5 5 105 pd_local_abdomen_t2
7 7 107 PD_localize_tre_t2
8 8 108 t2_localize_PD
9 9 109 pd_local_abdomen_t2
Bill Dunlap
TIBCO Software
wdunlap tibco.com
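The two-`grepl()` test shown above extends to any number of substrings. The following is only a minimal sketch, assuming a hypothetical helper called `contains_all` (it is not part of base R and was not posted in this thread) and reusing the sample data from the example above:

```r
# Hypothetical helper: TRUE where every pattern in `patterns` is found in `x`.
contains_all <- function(x, patterns, ignore.case = FALSE) {
  x <- as.character(x)  # accept factors as well as character vectors
  hits <- lapply(patterns, grepl, x = x, ignore.case = ignore.case)
  Reduce(`&`, hits)     # element-wise AND across the per-pattern results
}

df <- data.frame(
  TE = 1:10, TR = 101:110,
  Command = c("pd_local_abdomen_t2", "knee_pd_t1_localize",
              "PD_localize_tre_t2", "t2_localize_PD")[rep(1:4, length.out = 10)]
)

# Rows whose Command contains both "t2" and "pd", ignoring case
t2pd <- df[contains_all(df$Command, c("t2", "pd"), ignore.case = TRUE), ]
```

With `ignore.case = FALSE` the helper should select the same rows as the case-sensitive `subset()` call above.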
|
|
subset(df,grepl("t2|pd",x$Command))
|
|
Actually not sure my previous answer does what you wanted. Using your
approach:
t2pd=subset(df,grepl("t2",df$Command) & grepl("pd",df$Command))
Should work.
I think the regex pattern you are looking for is:
Subset(df,grepl("(.* t2.*pd.* )|(.* pd.* t2.*)",df$Command)
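For reference, a sketch of how the single-pattern idea might look once it parses: lower-case `subset`, a closing parenthesis after `df$Command`, and no spaces inside the pattern. Dropping the spaces is an assumption about the intent, since a literal space would otherwise have to occur in the name. The sample names are the ones listed in the original post.

```r
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2",
                             "_abdomen_t1_seq", "knee_pd_t1_localize",
                             "pd_local_abdomen_t2"))

# Single pattern: "t2 somewhere before pd" OR "pd somewhere before t2"
both_regex <- grepl("(.*t2.*pd.*)|(.*pd.*t2.*)", df$Command)

# Two simple patterns combined with a logical AND
both_and <- grepl("t2", df$Command) & grepl("pd", df$Command)

# On this data both formulations flag the same rows
stopifnot(identical(both_regex, both_and))

t2pd <- subset(df, both_regex)
```

The two-`grepl()` form is usually easier to read and to extend; the single pattern is mainly useful where only one regular expression can be supplied.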
|
|
Thanks for your reply tom. After using Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command) I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?
Thanks,
Elahe
|
|
On 02 May 2016, at 12:43 , ch.elahe via R-help < [hidden email]> wrote:
> Thanks for your reply tom. After using Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command) I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?
Tom's code was missing a ")" but not where you put one. He probably also didn't intend to capitalize "subset".
-pd
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email] Priv: [hidden email]
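To illustrate the point about the parenthesis, here is a small sketch (the vector name `cmd` is made up for the example). Closing the parenthesis right after the pattern means `grepl()` is called with the pattern only, which appears to be what triggers the reported error; moving it after the vector gives `grepl()` both arguments.

```r
pattern <- "(.*t2.*pd.*)|(.*pd.*t2.*)"
cmd <- c("_localize_tre_t2", "pd_local_abdomen_t2")  # illustrative names

# Parenthesis closed too early: grepl() receives no `x` argument,
# so the call fails; the error message is captured instead of stopping.
msg <- tryCatch(grepl(pattern), error = function(e) conditionMessage(e))

# Parenthesis in the intended place: `cmd` is grepl()'s second argument.
ok <- grepl(pattern, cmd)
```

Printing `msg` shows the message R produced for the failed call.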
|
|
Thanks Peter, you were right. The correct call is grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change anything in Command: when I check the number of matches with sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) the result is 0, although I am sure that it is not 0. It seems that this AND does not work.
|
|
Sorry for the missed braces earlier. I was typing on a phone, not the best
place to conjugate regular expressions.
Using the example you provided:
> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2",
"_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))
> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
[1] FALSE FALSE FALSE FALSE TRUE
> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
Command
5 pd_local_abdomen_t2
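An alternative that avoids spelling out both orderings is a pair of lookahead assertions. This is a sketch of a different technique from the one used above; it relies on `perl = TRUE`, because lookaheads are not available in R's default regular-expression engine.

```r
cmd <- c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq",
         "knee_pd_t1_localize", "pd_local_abdomen_t2")

# Each (?=...) lookahead is checked from the start of the string, so a
# name matches only if it contains "t2" and also contains "pd", in any order.
has_both <- grepl("^(?=.*t2)(?=.*pd)", cmd, perl = TRUE)
```

Adding a third required substring only needs one more `(?=.*...)` group, whereas the alternation approach would need every ordering listed.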
|
|
Yes it works, but let me explain what I am trying to do. I extract all the names I want and then create a new column out of them for my plot. This is the whole thing I do:
PD=subset(df,grepl("pd",Command)) # extract names in Command with only "pd"
t2=subset(df,grepl("t2",Command)) # extract names with only "t2"
PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) # extract names which contain both "pd" and "t2"
v1=c('PD','t2','PDT2') # I create a vector with these conditions
str_extract(df$Command,paste(v1,collapse='|')) # returning patterns, using the stringr library
Here I see no pattern named PDT2; there are only PD and t2 patterns.
|
|
The first thing I notice here is that your first two subset statements are
searching in an object named Command, not the column df$Command. I'm not at
all sure what you are trying to achieve with the str_extract process, but it
is looking for the exact string 'PDT2'; the vectors / data frames formed in
your previous commands are not being used at all.
Moving forward, I think you need to pay attention to case: "PD" != "pd". Also,
the set PDT2 is going to be a subset of both sets PD and t2; I don't think
this is what you are after.
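If the goal is a new column with one label per row, one possible approach is sketched below. It is a guess at the intent, not something stated in the thread: the column name `Group`, the label strings, and the choice to leave unmatched names as `NA` are all assumptions. Matching is case-insensitive, and the combined case is tested first so that it does not fall into the "PD" or "t2" groups.

```r
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2",
                             "_abdomen_t1_seq", "knee_pd_t1_localize",
                             "pd_local_abdomen_t2"))

has_pd <- grepl("pd", df$Command, ignore.case = TRUE)
has_t2 <- grepl("t2", df$Command, ignore.case = TRUE)

# Hypothetical label column: check "both" before the single-substring cases
df$Group <- ifelse(has_pd & has_t2, "PDT2",
            ifelse(has_pd, "PD",
            ifelse(has_t2, "t2", NA_character_)))
```

Because the labels come from the logical tests rather than from the text that matched, "PDT2" can appear even though no name contains that literal string.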
|
|
I just changed all the names in Command to lowercase, then this str_extract works fine for "pd" and "t2", but not for "PDT2". Do you have any idea how I can bring PDT2 also in str_extract?
On Monday, May 2, 2016 9:16 AM, Tom Wright < [hidden email]> wrote:
The first thing I notice here is that your first two subset statements are searching in an object named Command, not the column df$Command. I'm not at all sure what you are trying to achieve with the str_extract process but it is looking for the exact string 'PDT2' the vectors / dataframe formed in your previous commands are not being used at all.
Moving forward I think you need to pay attention to case "PD" != "pd". Also the set PDT2 is going to be a subset of both sets PD and t2, I don't think this is what you are after.
On Mon, May 2, 2016, 8:49 AM < [hidden email]> wrote:
Yes it works, but let me explain what I am going to do. I extract all the names I want and then create a new column out of them for my plot. This is he whole thing I do:
> PD=subset(df,grepl("pd",Command)) //extract names in Command with only "pd"
> t2=subset(df,grepl("t2",Command)) //extract names with only "t2"
> PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command) // extract names which contain both "pd" and "t2"
> v1=c('PD','t2','PDT2')// I create a vector with these conditions
> str_extract(df$Command,paste(v1,collaps='|')) //returning patterns, using stringr library
>
>here I see no pattern named PDT2 but there are only PD and t2 patterns.
>On Monday, May 2, 2016 8:18 AM, Tom Wright < [hidden email]> wrote:
>
>
>
>Sorry for the missed braces earlier. I was typing on a phone, not the best place to conjugate regular expressions.
>Using the example you provided:
>
>> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2", "_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))
>
>> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
>[1] FALSE FALSE FALSE FALSE TRUE
>
>> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
> Command
>5 pd_local_abdomen_t2
>
>
>
>On Mon, May 2, 2016 at 7:42 AM, < [hidden email]> wrote:
>
>Thanks Peter, you were right, the exact grepl is grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change anything in Command, when I check the size of it by sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) the result is 0, but I am sure that the size is not 0. It seems that this AND does not work.
>>
>>
>>
>>On Monday, May 2, 2016 5:05 AM, peter dalgaard < [hidden email]> wrote:
>>
>>On 02 May 2016, at 12:43 , ch.elahe via R-help < [hidden email]> wrote:
>>
>>> Thanks for your reply tom. After using Subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)"),df$Command) I get this error: Argument "x" is missing, with no default. Actually I don't know how to fix this. Do you have any idea?
>>
>>Tom's code was missing a ")" but not where you put one. He probably also didn't intend to capitalize "subset".
>>
>>
>>-pd
>>
>>> Thanks,
>>> Elahe
>>>
>>>
>>> On Saturday, April 30, 2016 7:35 PM, Tom Wright < [hidden email]> wrote:
>>>
>>>
>>>
>>> Actually not sure my previous answer does what you wanted. Using your approach:
>>> t2pd=subset(df,grepl("t2",df$Command) & grepl("pd",df$Command))
>>> Should work.
>>> I think the regex pattern you are looking for is:
>>> Subset(df,grepl("(.* t2.*pd.* )|(.* pd.* t2.*)",df$Command)
>>>
>>> On Sat, Apr 30, 2016, 7:07 PM Tom Wright < [hidden email]> wrote:
>>>
>>> subset(df,grepl("t2|pd",x$Command))
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Apr 30, 2016 at 2:38 PM, ch.elahe via R-help < [hidden email]> wrote:
>>>>
>>>> Hi all,
>>>>>
>>>>> I have one factor variable in my df and I want to extract the names from it which contain both "t2" and "pd":
>>>>>
>>>>> 'data.frame': 36919 obs. of 162 variables
>>>>> $TE :int 38,41,11,52,48,75,.....
>>>>> $TR :int 100,210,548,546,.....
>>>>> $Command :factor W/2229 levels "_localize_PD","_localize_tre_t2","_abdomen_t1_seq","knee_pd_t1_localize","pd_local_abdomen_t2"...
>>>>>
>>>>> I have tried this but I did not get result:
>>>>>
>>>>> t2pd=subset(df,grepl("t2",Command) & grepl("pd",Command))
>>>>>
>>>>>
>>>>> does anyone know how to apply AND in grepl?
>>>>>
>>>>> Thanks
>>>>> Elahe
>>>>>
>>
>>--
>>Peter Dalgaard, Professor,
>>Center for Statistics, Copenhagen Business School
>>Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>Phone: (+45)38153501
>>Office: A 4.23
>>Email: [hidden email] Priv: [hidden email]
>>
|
|
On Mon, May 2, 2016 at 1:01 PM, ch.elahe via R-help < [hidden email]>
wrote:
> I just changed all the names in Command to lowercase, then this
> str_extract works fine for "pd" and "t2", but not for "PDT2". Do you have
> any idea how I can bring PDT2 also in str_extract?
>
Looking at ?grepl, I see the option: ignore.case=TRUE
PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command,ignore.case=TRUE))
Perhaps this will do the trick.
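For anyone who wants to check this on a small case, here is a minimal sketch. The data frame is a made-up sample modelled on the Command levels quoted earlier in the thread (the "PD_localize_tre_t2" row is an assumption added to exercise the upper-case case), and the names both_and and both_regex are only illustrative. It compares the AND of two case-insensitive grepl() calls with the single alternation pattern:

```r
# Hypothetical sample data, modelled on the Command levels quoted in this thread
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2",
                             "knee_pd_t1_localize", "pd_local_abdomen_t2",
                             "PD_localize_tre_t2"),
                 stringsAsFactors = TRUE)

# AND of two case-insensitive matches (grepl() coerces the factor to character)
both_and <- grepl("t2", df$Command, ignore.case = TRUE) &
            grepl("pd", df$Command, ignore.case = TRUE)

# The same condition written as one alternation pattern
both_regex <- grepl("(t2.*pd)|(pd.*t2)", df$Command, ignore.case = TRUE)

# Both approaches should flag the same rows
identical(both_and, both_regex)

# Keep only the rows that contain both "pd" and "t2", in any case
PDT2 <- subset(df, both_and)
PDT2
```

The leading and trailing ".*" in the longer pattern are not needed, since grepl() already matches anywhere in the string.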
--
The unfacts, did we have them, are too imprecisely few to warrant our
certitude.
Maranatha! <><
John McKown
|
|
Please try to read my earlier comments.
In the absence of a proper example with expected output I think what you
are trying to achieve is:
# create a sample dataframe
df <- data.frame(Command=c("_localize_PD", "_localize_tre_t2",
"_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))
# identify which rows in the dataframe set match the patterns
# note, the vectors PD, T2 and PDT2 are booleans indicating if a match was
# made
PD <- grepl("pd", df$Command)
T2 <- grepl('t2', df$Command)
PDT2 <- grepl("(.*t2.*pd.*)|(.*pd.*t2.*)", df$Command)
# create the new column to hold the new names
df$new_name <- NA
df[PD,'new_name'] <- 'pd'
df[T2,'new_name'] <- 't2'
df[PDT2,'new_name'] <- 'pdt2'
# note 1: the order of these commands is important; if the last command is
# run first, all of its matches will be overwritten by the single matches
# for 't2' and 'pd'.
# note 2: there is no match for row 1 because "PD" != "pd"; as suggested by
# John McKown, the ignore.case parameter of grepl can be used to change
# this behaviour.
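As a rough sketch of how note 2 could be handled, the same labelling can be done case-insensitively, with the combined label derived from the logical AND of the two single matches instead of a separate regex. The names has_pd and has_t2 are only illustrative, and the sample data is the same five-row example as above:

```r
# Same sample data as in the example above
df <- data.frame(Command = c("_localize_PD", "_localize_tre_t2",
                             "_abdomen_t1_seq", "knee_pd_t1_localize",
                             "pd_local_abdomen_t2"))

# Case-insensitive single matches
has_pd <- grepl("pd", df$Command, ignore.case = TRUE)
has_t2 <- grepl("t2", df$Command, ignore.case = TRUE)

# Assign the single labels first, then overwrite rows that match both
df$new_name <- NA_character_
df$new_name[has_pd] <- "pd"
df$new_name[has_t2] <- "t2"
df$new_name[has_pd & has_t2] <- "pdt2"

df
```

With ignore.case = TRUE, row 1 ("_localize_PD") is labelled 'pd', and row 3 stays NA because it contains neither pattern.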
On Mon, May 2, 2016 at 11:01 AM, < [hidden email]> wrote:
> I just changed all the names in Command to lowercase, then this
> str_extract works fine for "pd" and "t2", but not for "PDT2". Do you have
> any idea how I can bring PDT2 also in str_extract?
>
>
> On Monday, May 2, 2016 9:16 AM, Tom Wright < [hidden email]> wrote:
>
>
>
> The first thing I notice here is that your first two subset statements are
> searching in an object named Command, not the column df$Command. I'm not at
> all sure what you are trying to achieve with the str_extract process, but it
> is looking for the exact string 'PDT2'; the vectors / dataframes formed in
> your previous commands are not being used at all.
> Moving forward I think you need to pay attention to case "PD" != "pd".
> Also the set PDT2 is going to be a subset of both sets PD and t2, I don't
> think this is what you are after.
>
> On Mon, May 2, 2016, 8:49 AM < [hidden email]> wrote:
>
> Yes it works, but let me explain what I am going to do. I extract all the
> names I want and then create a new column out of them for my plot. This is
> the whole thing I do:
> > PD=subset(df,grepl("pd",Command)) //extract names in Command with only
> "pd"
> > t2=subset(df,grepl("t2",Command)) //extract names with only "t2"
> > PDT2=subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command) // extract
> names which contain both "pd" and "t2"
> > v1=c('PD','t2','PDT2')// I create a vector with these conditions
> > str_extract(df$Command,paste(v1,collaps='|')) //returning patterns,
> using stringr library
> >
> >here I see no pattern named PDT2 but there are only PD and t2 patterns.
> >On Monday, May 2, 2016 8:18 AM, Tom Wright < [hidden email]> wrote:
> >
> >
> >
> >Sorry for the missed braces earlier. I was typing on a phone, not the
> best place to conjugate regular expressions.
> >Using the example you provided:
> >
> >> df=data.frame(Command=c("_localize_PD", "_localize_tre_t2",
> "_abdomen_t1_seq", "knee_pd_t1_localize", "pd_local_abdomen_t2"))
> >
> >> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)
> >[1] FALSE FALSE FALSE FALSE TRUE
> >
> >> subset(df,grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command))
> > Command
> >5 pd_local_abdomen_t2
> >
> >
> >
> >On Mon, May 2, 2016 at 7:42 AM, < [hidden email]> wrote:
> >
> >Thanks Peter, you were right, the exact grepl is
> grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command), but it does not change
> anything in Command, when I check the size of it by
> sum(grepl("(.*t2.*pd.*)|(.*pd.*t2.*)",df$Command)) the result is 0, but I
> am sure that the size is not 0. It seems that this AND does not work.
|
|