Wildcard for indexing?

Johannes Radinger
Hi,

I'd like to know whether it is possible to use wildcards (*) for indexing.
E.g. I have a vector of strings, and I'd like to select all elements
that start with A_ (A_*). I'd also need to combine that with logical operators:

"Select all elements of a vector that start with A (A*) OR that start with B (B*)"

That is probably quite easy. I looked into grep(), which I think might perform such tasks, but perhaps there is a more straightforward solution.

a <- c("A_A","A_B","C_A","BB","A_Asd")
a[a=="A_A"| a=="A_B"] # here I'd like an index but with wildcard

/johannes
--

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Wildcard for indexing?

Duncan Murdoch-2
On 14/02/2012 9:54 AM, Johannes Radinger wrote:

> Hi,
>
> I'd like to know if it is possible to use wildcards * for indexing...
> E.g. I have a vector of strings. Now I'd like to select all elements
> which start with A_*? I'd also need to combine that with logical operators:
>
> "Select all elements of a vector that start with A (A*) OR that start with B (B*)"
>
> Probably that is quite easy. I looked into grep() which I think might perform such tasks, but probably there is a more straightforward solution.
>
> a<- c("A_A","A_B","C_A","BB","A_Asd")
> a[a=="A_A"| a=="A_B"] # here I'd like an index but with wildcard

Try grepl():

a[grepl("^[AB]", a)]

is probably the simplest way for your example.
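
A minimal sketch of how this indexing works, using the example vector from the question (the names keep and selected are only illustrative): grepl() returns one TRUE/FALSE per element, and that logical vector is then used as the index.

```r
a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")

# One logical value per element: TRUE where the string starts with A or B
keep <- grepl("^[AB]", a)

# Use the logical vector as an index into a
selected <- a[keep]
print(selected)
```

The same pattern can be dropped into any place where a logical index is accepted, for example when subsetting the rows of a data frame.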

Duncan Murdoch

Re: Wildcard for indexing?

Michael Weylandt
In reply to this post by Johannes Radinger
I think the grep()-family (regular expressions) will be the easiest
way to do this, though it sounds like you might prefer grepl() which
returns a logical vector:

^[AB] # Starts with either an A or a B
^A_ # Starts with A_

a <- c("A_A","A_B","C_A","BB","A_Asd")
grepl("^[AB]", a)
grepl("^A_", a)
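
To illustrate the difference within the grep() family, here is a small sketch on the same vector (the names idx, vals and lgl are just for illustration): grep() returns the positions of the matches, grep(..., value = TRUE) returns the matching strings, and grepl() returns a logical vector of the same length as the input.

```r
a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")

idx  <- grep("^A_", a)                # integer positions of the matches
vals <- grep("^A_", a, value = TRUE)  # the matching elements themselves
lgl  <- grepl("^A_", a)               # TRUE/FALSE for every element

print(idx)
print(vals)
print(lgl)
```

Both a[idx] and a[lgl] select the same elements here; the logical form is usually easier to combine with other conditions.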

Michael

Re: Wildcard for indexing?

Sarah Goslee
In reply to this post by Johannes Radinger
Hi,

On Tue, Feb 14, 2012 at 9:54 AM, Johannes Radinger <[hidden email]> wrote:

> Hi,
>
> I'd like to know if it is possible to use wildcards * for indexing...
> E.g. I have a vector of strings. Now I'd like to select all elements
> which start with A_*? I'd also need to combine that with logical operators:
>
> "Select all elements of a vector that start with A (A*) OR that start with B (B*)"
>
> Probably that is quite easy. I looked into grep() which I think might perform such tasks, but probably there is a more straightforward solution.
>
> a <- c("A_A","A_B","C_A","BB","A_Asd")
> a[a=="A_A"| a=="A_B"] # here I'd like an index but with wildcard

Do you want elements that start with A or B, as you state above, or elements
that start with A_A or A_B as here?

Either way, this is a job for grepl(), and it is quite easy:

> a <- c("A_A","A_B","C_A","BB","A_Asd")
>
> grepl("^[AB]", a)
[1]  TRUE  TRUE FALSE  TRUE  TRUE
> grepl("^A_[AB]", a)
[1]  TRUE  TRUE FALSE FALSE  TRUE
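
If the glob-style wildcards from the original question are preferred over writing regular expressions by hand, one option is glob2rx() from the utils package, which converts a wildcard pattern into a regular expression. This is only a sketch of that alternative, not something the thread itself relies on; the variable names are illustrative.

```r
a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")

# Convert the wildcard pattern "A_*" into a regular expression
rx <- utils::glob2rx("A_*")
starts_A_ <- grepl(rx, a)
print(a[starts_A_])

# "starts with A (A*) OR starts with B (B*)", written with wildcards
either <- grepl(utils::glob2rx("A*"), a) | grepl(utils::glob2rx("B*"), a)
print(a[either])
```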

Sarah


--
Sarah Goslee
http://www.functionaldiversity.org

Re: Wildcard for indexing?

Johannes Radinger
In reply to this post by Michael Weylandt
Hi,

-------- Original Message --------
> Date: Tue, 14 Feb 2012 09:59:39 -0500
> From: "R. Michael Weylandt" <[hidden email]>
> To: Johannes Radinger <[hidden email]>
> CC: [hidden email]
> Subject: Re: [R] Wildcard for indexing?

> I think the grep()-family (regular expressions) will be the easiest
> way to do this, though it sounds like you might prefer grepl() which
> returns a logical vector:
>
> ^[AB] # Starts with either an A or a B
> ^A_ # Starting with A_
>
> a <- c("A_A","A_B","C_A","BB","A_Asd")
> grepl("^[AB]", a)
> grepl("^A_", a)

Yes, grepl() is what I am looking for.
Is there also something like an OR operator, e.g. if I want to
select elements that start with "as" OR "df"?

/johannes

Re: Wildcard for indexing?

Sarah Goslee
Hi,

You should probably do a bit of reading about regular expressions, but
here's one way:

On Tue, Feb 14, 2012 at 10:10 AM, Johannes Radinger <[hidden email]> wrote:

> Yes grepl() is what I am looking for.
> is there also something like an OR statement e.g. if I want to
> select for elements that start with "as" OR "df"?

> a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")
> grepl("^as|^df", a)
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE


Square brackets match any one of the enclosed characters, so they are good
for single-character alternatives. For more complex patterns, | is the OR
operator. ^ anchors the match to the beginning of the string.
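
A short sketch contrasting the two constructs on the vector above (the names by_char and by_prefix are illustrative): a bracket expression tests a single character position, while | chooses between whole sub-patterns.

```r
a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")

# Bracket expression: the first character is either "a" or "d"
by_char <- grepl("^[ad]", a)

# Alternation: the string starts with "as" or starts with "df"
by_prefix <- grepl("^as|^df", a)

print(a[by_char])
print(a[by_prefix])
```

Note that "^[ad]" also picks up "aa" and "dd", which is why multi-character prefixes need the | form.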

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

Re: Wildcard for indexing?

Duncan Murdoch-2
In reply to this post by Johannes Radinger
On 14/02/2012 10:10 AM, Johannes Radinger wrote:

> Yes grepl() is what I am looking for.
> is there also something like an OR statement e.g. if I want to
> select for elements that start with "as" OR "df"?

grepl("^(as|df)", a)

should work.  See ?regexp for all the possibilities, rules about
operator precedence, etc.  (Or just use grepl("^as", a) | grepl("^df", a).)
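
As a small illustration of why the parentheses (or a repeated ^) matter, here is a sketch using the example vector from Sarah's reply; the names grouped and ungrouped are illustrative. Alternation has the lowest precedence, so without grouping the ^ applies only to the first alternative.

```r
a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")

# ^ applies to both alternatives: starts with "as" or starts with "df"
grouped <- grepl("^(as|df)", a)

# ^ applies only to "as": starts with "as", or contains "df" anywhere
ungrouped <- grepl("^as|df", a)

print(a[grouped])
print(a[ungrouped])
```

The ungrouped pattern also matches "sdf", because "df" is allowed to appear anywhere in the string.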

Duncan Murdoch

Re: Wildcard for indexing?

Johannes Radinger
In reply to this post by Sarah Goslee


-------- Original Message --------
> Date: Tue, 14 Feb 2012 10:18:33 -0500
> From: Sarah Goslee <[hidden email]>
> To: Johannes Radinger <[hidden email]>
> CC: [hidden email]
> Subject: Re: [R] Wildcard for indexing?

> > a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")
> > grepl("^as|^df", a)
> [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE
>
>
> The square brackets match any of those characters, so are good
> for single characters. For more complex patterns, | is the or symbol.
> ^ marks the beginning.

Thank you so much, Sarah! I had tried the | symbol intuitively; there was just a problem with the quotation marks :(

Now everything is solved...

/johannes

Re: Wildcard for indexing?

glsnow
Note that you can also combine the results of grepl() with logical operators, like:

grepl('^as', a) | grepl('^df',a)

For the given example it is probably simplest to do it in the regular
expression as shown, but for some more complex cases (or including
other variables) the logic with the output may be simpler.
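
A sketch of such a more complex case, where the condition also involves a second variable. The numeric vector x and the threshold 2 are hypothetical and only serve to show the idea.

```r
a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")
x <- c(5, 1, 0, 7, 3, 2, 9, 4)   # hypothetical second variable, same length as a

# Starts with "as" or "df", AND the corresponding x is greater than 2
sel <- (grepl("^as", a) | grepl("^df", a)) & x > 2

print(a[sel])
```

A condition like this cannot be expressed inside the regular expression alone, which is where combining the logical output becomes the simpler route.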

--
Gregory (Greg) L. Snow Ph.D.
[hidden email]
