Hi,
I'd like to know if it is possible to use wildcards (*) for indexing. E.g., I have a vector of strings, and I'd like to select all elements that start with A_ (A_*). I'd also need to combine that with logical operators:

"Select all elements of a vector that start with A (A*) OR that start with B (B*)"

That is probably quite easy. I looked into grep(), which I think might perform such tasks, but there is probably a more straightforward solution.

    a <- c("A_A","A_B","C_A","BB","A_Asd")
    a[a=="A_A" | a=="A_B"] # here I'd like an index but with wildcard

/johannes

--
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
On 14/02/2012 9:54 AM, Johannes Radinger wrote:
> I'd like to know if it is possible to use wildcards * for indexing...
> [...]
> a <- c("A_A","A_B","C_A","BB","A_Asd")
> a[a=="A_A" | a=="A_B"] # here I'd like an index but with wildcard

Try grepl():

    a[grepl("^[AB]", a)]

is probably the simplest way for your example.

Duncan Murdoch
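To make the indexing step concrete, here is a minimal sketch that compares grepl() with the closely related grep() on the vector from the question. The variable names keep and idx are only illustrative.

```r
a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")

# grepl() returns one TRUE/FALSE per element, so it can be used directly as an index
keep <- grepl("^[AB]", a)
a[keep]

# grep() returns the integer positions of the matching elements instead
idx <- grep("^[AB]", a)
a[idx]

# grep(value = TRUE) returns the matching elements themselves
grep("^[AB]", a, value = TRUE)
```

One reason to prefer the logical form: a[!grepl(pattern, a)] still works when nothing matches, whereas a[-grep(pattern, a)] returns an empty vector in that case.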
In reply to this post by Johannes Radinger
I think the grep() family (regular expressions) will be the easiest way to do this, though it sounds like you might prefer grepl(), which returns a logical vector:

    ^[AB]  # starts with either an A or a B
    ^A_    # starts with A_

    a <- c("A_A","A_B","C_A","BB","A_Asd")
    grepl("^[AB]", a)
    grepl("^A_", a)

Michael
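If you would rather keep writing shell-style wildcards such as A_*, one option not mentioned in the thread is glob2rx() from the utils package, which translates a wildcard pattern into a regular expression that grepl() accepts. The sketch below assumes the example vector from the question; pattern_a and starts_a_or_b are illustrative names.

```r
a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")

# glob2rx() (utils package, attached by default) converts a wildcard
# pattern into an anchored regular expression
pattern_a <- glob2rx("A_*")
a[grepl(pattern_a, a)]

# Two wildcard patterns combined with the element-wise OR operator
starts_a_or_b <- grepl(glob2rx("A*"), a) | grepl(glob2rx("B*"), a)
a[starts_a_or_b]
```

This stays close to the "A* OR B*" wording of the original question, at the cost of one extra function call per pattern.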
In reply to this post by Johannes Radinger
Hi,
On Tue, Feb 14, 2012 at 9:54 AM, Johannes Radinger <[hidden email]> wrote:
> "Select all elements of a vector that start with A (A*) OR that start with B (B*)"
> [...]
> a <- c("A_A","A_B","C_A","BB","A_Asd")
> a[a=="A_A" | a=="A_B"] # here I'd like an index but with wildcard

Do you want elements that start with A or B, as you state above, or elements that start with A_A or A_B, as here? Either way, this is a job for grepl(), and it is quite easy:

    > a <- c("A_A","A_B","C_A","BB","A_Asd")
    > grepl("^[AB]", a)
    [1]  TRUE  TRUE FALSE  TRUE  TRUE
    > grepl("^A_[AB]", a)
    [1]  TRUE  TRUE FALSE FALSE  TRUE

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
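For plain "starts with" tests on literal text, base R later gained startsWith() (R 3.3.0 and newer, so it was not available when this thread was written). The sketch below shows it as an alternative, not as what the posters used. The vector b is a hypothetical example added to show how a literal prefix differs from a regular expression.

```r
a <- c("A_A", "A_B", "C_A", "BB", "A_Asd")

# startsWith() compares against a literal prefix; no regular expression involved
starts_ab <- startsWith(a, "A") | startsWith(a, "B")
a[starts_ab]

# Hypothetical vector: "." is a literal dot for startsWith(),
# but means "any character" in a regular expression
b <- c("a.b", "axb")
literal_match <- startsWith(b, "a.")
regex_match <- grepl("^a.", b)
```

Here literal_match is TRUE only for "a.b", while regex_match is TRUE for both elements, which is worth keeping in mind when prefixes contain characters that are special in regular expressions.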
In reply to this post by Michael Weylandt
Hi,
-------- Original Message --------
> Date: Tue, 14 Feb 2012 09:59:39 -0500
> From: "R. Michael Weylandt" <[hidden email]>
> To: Johannes Radinger <[hidden email]>
> CC: [hidden email]
> Subject: Re: [R] Wildcard for indexing?

> I think the grep()-family (regular expressions) will be the easiest
> way to do this, though it sounds like you might prefer grepl() which
> returns a logical vector:
> [...]

Yes, grepl() is what I am looking for. Is there also something like an OR statement, e.g. if I want to select elements that start with "as" OR "df"?

/johannes
Hi,
You should probably do a bit of reading about regular expressions, but here's one way:

On Tue, Feb 14, 2012 at 10:10 AM, Johannes Radinger <[hidden email]> wrote:
> Yes grepl() is what I am looking for.
> is there also something like an OR statement e.g. if I want to
> select for elements that start with "as" OR "df"?

    > a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")
    > grepl("^as|^df", a)
    [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE

Square brackets match any one of the characters inside them, so they are good for single characters. For more complex patterns, | is the "or" symbol, and ^ marks the beginning of the string.

Sarah
In reply to this post by Johannes Radinger
On 14/02/2012 10:10 AM, Johannes Radinger wrote:
> Yes grepl() is what I am looking for.
> is there also something like an OR statement e.g. if I want to
> select for elements that start with "as" OR "df"?

    grepl("^(as|df)", a)

should work. See ?regexp for all the possibilities, rules about operator precedence, etc. (Or just use grepl("^as", a) | grepl("^df", a).)

Duncan Murdoch
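The remark about operator precedence matters here because | binds more loosely than ^, so the anchor applies only to the alternative it is written next to. A small sketch using the example vector from earlier in the thread (anchored_both and anchored_first_only are illustrative names):

```r
a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")

# Parentheses make ^ apply to both alternatives
anchored_both <- grepl("^(as|df)", a)

# Without parentheses this reads as "starts with as" OR "contains df anywhere"
anchored_first_only <- grepl("^as|df", a)

a[anchored_both]
a[anchored_first_only]
```

The second pattern additionally picks up "sdf", because "df" is allowed to match anywhere in the string.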
In reply to this post by Sarah Goslee
-------- Original Message --------
> Date: Tue, 14 Feb 2012 10:18:33 -0500
> From: Sarah Goslee <[hidden email]>
> To: Johannes Radinger <[hidden email]>
> CC: [hidden email]
> Subject: Re: [R] Wildcard for indexing?

> The square brackets match any of those characters, so are good
> for single characters. For more complex patterns, | is the or symbol.
> ^ marks the beginning.

Thank you so much, Sarah! I tried that | symbol intuitively; there was just a problem with the quotation marks :(

Now everything is solved...

/johannes
Note that you can also do logical comparisons with the results of grepl(), like:

    grepl('^as', a) | grepl('^df', a)

For the given example it is probably simplest to do it in the regular expression as shown, but for some more complex cases (or when including other variables) applying the logic to the output may be simpler.

--
Gregory (Greg) L. Snow Ph.D.
[hidden email]
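As one possible illustration of "including other variables", the sketch below combines the pattern test with a condition on a second vector. The score vector is hypothetical and was not part of the thread; it only stands in for any variable of the same length as a.

```r
a <- c("as1", "bb", "as2", "cc", "df", "aa", "dd", "sdf")

# Hypothetical numeric variable aligned element by element with a
score <- c(5, 9, 1, 7, 8, 2, 6, 4)

# Starts with "as" or "df", AND has a score above 3
keep <- (grepl("^as", a) | grepl("^df", a)) & score > 3
a[keep]
```

A condition like score > 3 cannot be expressed inside the regular expression at all, which is where combining logical vectors is the more natural tool.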