matching period with perl regular expression


Stephen J. Barr-2
Hello,

I have several strings where I am trying to eliminate the period and
everything after the period, using a regular expression. However, I am
having trouble getting this to work.

> x = "wa.w"
> gsub(x, "\..*", "", perl=TRUE)
[1] ""
Warning messages:
1: '\.' is an unrecognized escape in a character string
2: unrecognized escape removed from "\..*"

In perl, you can match a single period with \.
Is this not so even with perl=TRUE? I would like x to be equal to
> x = "wa"

What am I missing here?
-stephen
==========================================
Stephen J. Barr
University of Washington
WEB: www.econsteve.com

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: matching period with perl regular expression

Henrique Dallazuanna
Try this:

gsub("^(\\w*).*$", "\\1", x)
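
One thing to keep in mind with the pattern above: \\w* captures only the leading run of letters, digits and underscores, so it stops at the first non-word character of any kind, not just at a period. The sketch below compares it with a pattern that targets the period directly. The sample strings and variable names are made up for illustration.

```r
# Hypothetical sample strings; the second has a hyphen before the period.
x <- c("wa.w", "a-b.c")

# Keep only the leading run of word characters (letters, digits, underscore).
word_prefix <- gsub("^(\\w*).*$", "\\1", x)

# Remove the first period and everything after it.
before_period <- sub("\\..*", "", x)

print(word_prefix)
print(before_period)
```

For strings such as "wa.w" the two approaches agree; they differ only when something other than a word character appears before the period.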

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

Re: matching period with perl regular expression

Gabor Grothendieck
In reply to this post by Stephen J. Barr-2
R interprets a backslash inside a string literal as giving special meaning
to the next character, i.e. it strips off the backslash and sends the
following character to gsub, possibly reinterpreting it specially (for
example, \n is newline).  Thus a backslash will never get to gsub unless
you use a double backslash, so we can use "\\." to represent \.   Also note
that the regular expression "[.]" represents a literal dot and does not
require a backslash in the first place.  You don't need perl = TRUE for
simple regular expressions like this.
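
As a small illustration of the two points above (a sketch only; the variable names are made up for the example), the string "\\." holds just two characters by the time gsub sees it, and the bracketed form gives the same result without any backslash:

```r
x <- "wa.w"

# The R string "\\." contains two characters: a backslash and a period.
pattern <- "\\."
print(nchar(pattern))
cat(pattern, "\n")   # prints the characters the regex engine actually receives

# Escaped dot: strip the first period and everything after it.
escaped <- sub("\\..*", "", x)

# Bracketed dot: same effect, no backslash required.
bracketed <- sub("[.].*", "", x)

print(escaped)
print(bracketed)
```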

Re: matching period with perl regular expression

Ted.Harding-2
In reply to this post by Henrique Dallazuanna
On 13-May-09 23:47:41, Henrique Dallazuanna wrote:
> Try this:
>
> gsub("^(\\w*).*$", "\\1", x)

Even simpler:
  x
# [1] "wa.w"
  gsub("\\..*", "", x, perl = TRUE)
# [1] "wa"
  x <- "abcde.fghij.klmno"
  gsub("\\..*", "", x, perl = TRUE)
# [1] "abcde"

(and it doesn't matter whether 'perl' is TRUE or FALSE)
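
Since the original question mentions several strings, here is a hedged sketch applying the same pattern to a whole character vector at once and checking that the default and Perl-compatible engines agree. The vector contents and variable names are assumptions for illustration; the last element has no period and should come back unchanged.

```r
# Hypothetical input vector; "noperiod" contains no period at all.
strings <- c("wa.w", "abcde.fghij.klmno", "noperiod")

# sub() is enough here: ".*" already consumes everything after the first period.
default_engine <- sub("\\..*", "", strings)
perl_engine    <- sub("\\..*", "", strings, perl = TRUE)

print(default_engine)
print(identical(default_engine, perl_engine))
```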

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[hidden email]>
Fax-to-email: +44 (0)870 094 0861
Date: 14-May-09                                       Time: 00:58:07
------------------------------ XFMail ------------------------------

Re: matching period with perl regular expression

Bill.Venables
In reply to this post by Stephen J. Barr-2
You have the arguments out of order (gsub takes the pattern first, then the
replacement, then the string to search), and you need two backslashes:

> x <- "wa.w"
> gsub("\\..*", "", x)
[1] "wa"
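
To see why the original call returned an empty string, the following sketch reproduces the argument mix-up. It assumes, as the warning in the question indicates, that R reduced "\..*" to "..*"; recent versions of R reject that escape outright, so "..*" is written directly here. The variable names are made up for the example.

```r
x <- "wa.w"

# Positional arguments are matched as pattern, replacement, x.
# Here "wa.w" is used as the pattern and the empty string is the text searched,
# so there is nothing to match and the empty string comes back unchanged.
swapped <- gsub(x, "..*", "", perl = TRUE)

# Naming the arguments makes their order irrelevant.
named <- gsub(x = x, pattern = "\\..*", replacement = "")

print(swapped)
print(named)
```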

Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


Re: matching period with perl regular expression

Dieter Menne
In reply to this post by Gabor Grothendieck
Gabor Grothendieck <ggrothendieck <at> gmail.com> writes:

>
> R interprets backslash to give special meaning to the next character, i.e.
> it strips off the backslash and sends the following character to gsub
> possibly reinterpreting it specially (for example \n is newline).  Thus
> a backslash will never get to gsub unless you use a double backslash.

To quote Peter Dalgaard:

"The generic rule for backslashes is that you need twice as many as you thought"

Only make sure that you have the right stopping rule for recursive evaluation of
that.
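
One place where the doubling rule applies twice is matching a literal backslash: the regular expression needs \\ and each of those must itself be doubled in the R string. A minimal sketch, with a made-up Windows-style path as the example input:

```r
# Hypothetical input; the string itself contains single backslashes.
path <- "C:\\temp\\file.txt"

# Four backslashes in the source give two in the string,
# which the regex engine reads as one literal backslash.
print(nchar("\\\\"))

# Remove the first backslash and everything after it.
drive <- sub("\\\\.*", "", path)
print(drive)
```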

Dieter
