Regexp bug or misunderstanding


Martin Møller Skarbiniks Pedersen
Hi,

   Have I found a bug in R, or have I misunderstood something about grep()?

   Case 1 gives the expected output.
   Case 2 does not give the expected output.
   I expected integer(0) for this case as well.

case 1:
grep("[:digit:]", "**ABAAbabaabackabaloneaban")
integer(0)

case 2:
grep("[:digit:]", "**ABAAbabaabackabaloneaband")
[1] 1

Regards
Martin

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Regexp bug or misunderstanding

Ista Zahn
I think you want "[[:digit:]]" instead of "[:digit:]"
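
A quick check of this suggestion, as a sketch using the two strings from the original post. The variable names case1 and case2 and the extra string with_digit are made up here for illustration and do not appear in the thread:

```r
case1 <- "**ABAAbabaabackabaloneaban"
case2 <- "**ABAAbabaabackabaloneaband"
with_digit <- "abalone7"   # hypothetical extra input that does contain a digit

# With the double brackets, neither of the original strings matches ...
grep("[[:digit:]]", case1)       # integer(0)
grep("[[:digit:]]", case2)       # integer(0)

# ... while a string that really contains a digit does.
grep("[[:digit:]]", with_digit)  # [1] 1
```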

--Ista

On Mon, Jul 2, 2018 at 8:52 AM, Martin Møller Skarbiniks Pedersen
<[hidden email]> wrote:

> Hi,
>
>    Have I found a bug in R, or have I misunderstood something about grep()?
>
>    Case 1 gives the expected output.
>    Case 2 does not give the expected output.
>    I expected integer(0) for this case as well.
>
> case 1:
> grep("[:digit:]", "**ABAAbabaabackabaloneaban")
> integer(0)
>
> case 2:
> grep("[:digit:]", "**ABAAbabaabackabaloneaband")
> [1] 1
>
> Regards
> Martin

Re: Regexp bug or misunderstanding

Dénes Tóth-2
In reply to this post by Martin Møller Skarbiniks Pedersen
Hi Martin,

I assume you want to check whether a particular character string
contains a digit. In this case you should use the following pattern:
"[[:digit:]]" instead of "[:digit:]".

From ?regex:
"A character class is a list of characters enclosed between [ and ]
which matches any single character in that list; unless the first
character of the list is the caret ^, when it matches any character not
in the list. ... Certain named classes of characters are predefined...
For example, [[:alnum:]] means [0-9A-Za-z]"

So if you simply use "[:digit:]" as a pattern, it matches any character
string that contains at least one of the following characters: ':', 'd',
'i', 'g', 't'. Your second test case contains 'd', whereas the first
contains none of these characters.
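
To make this visible, here is a minimal sketch with the default regex engine and the two strings from the original post. The names case1, case2, same and hit are only illustrative; "[:dgit]" is simply the same character list written without the repeated ':' and 'i':

```r
case1 <- "**ABAAbabaabackabaloneaban"
case2 <- "**ABAAbabaabackabaloneaband"

# Without the outer brackets the pattern is an ordinary character list,
# so it should behave exactly like "[:dgit]".
same <- identical(grepl("[:digit:]", c(case1, case2)),
                  grepl("[:dgit]",   c(case1, case2)))
same   # TRUE

# Extract the first character that the pattern matched in case 2.
hit <- regmatches(case2, regexpr("[:digit:]", case2))
hit    # "d"
```

The extracted character is the trailing 'd' of "aband", which is the only difference between the two test strings.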

HTH,
Denes



On 07/02/2018 02:52 PM, Martin Møller Skarbiniks Pedersen wrote:

> Hi,
>
>     Have I found a bug in R, or have I misunderstood something about grep()?
>
>     Case 1 gives the expected output.
>     Case 2 does not give the expected output.
>     I expected integer(0) for this case as well.
>
> case 1:
> grep("[:digit:]", "**ABAAbabaabackabaloneaban")
> integer(0)
>
> case 2:
> grep("[:digit:]", "**ABAAbabaabackabaloneaband")
> [1] 1
>
> Regards
> Martin