Problem with the str_replace function

phil-3
I have a problem with the str_replace() function in the stringr package.
Please refer to my reprex below.

I start with a vector of strings, called x. Some of the strings contain
apostrophes and brackets. I make a simple replacement as with x1, and
there is no problem. I make another simple replacement, x2, where the
pattern string has an apostrophe. Again no problem. Then I make a third
replacement, x3, where the pattern has opening and closing brackets and
the function still works fine. Finally I make a replacement where the
pattern has both an apostrophe and opening and closing brackets and the
replacement does not work. I tried to solve this by putting backslashes
before the apostrophe and/or the brackets, but that accomplished
nothing. I am stumped.

# Reprex for str_replace problem

library(stringr)

x <- c(
   "Clothing and footwear",
   "Women's clothing",
   "Women's footwear (excluding athletic)",
   "Clothing accessories (belts and so on)",
   "Clothing and footwear",
   "Women's clothing",
   "Women's footwear (excluding athletic)",
   "Clothing accessories (belts and so on)"
)
x
x1 <- str_replace(x,
   "Clothing and footwear",
   "Clothing and shoes"
)
x1
x2 <- str_replace(x,
   "Women's clothing",
   "Women's clothing goods"
)
x2
x3 <- str_replace(x,
   "Clothing accessories (belts and so on)",
   "Clothing accessories")
x3
x4 <- str_replace(x,
   "Women's footwear (excluding athletic)",
   "Women's footwear")
x4

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Problem with the str_replace function

Bert Gunter-2
I prefer using regular expressions directly, so this may not satisfy you:

> a <-"Women's footwear (excluding athletic)"
> b <- gsub("(.*) \\(.*$","\\1",a)
> b
[1] "Women's footwear"

There are, of course, other ways to do this with regexes or even substring().
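As one such alternative, here is a minimal sketch that applies the same idea to the vector x from the original post (shortened to its four distinct elements). It assumes that any parenthetical to be dropped sits at the end of the string. Note that it strips the trailing parenthetical from every element, not only from the "Women's footwear" entries.

```r
# Vector from the original post (the four distinct elements only).
x <- c(
  "Clothing and footwear",
  "Women's clothing",
  "Women's footwear (excluding athletic)",
  "Clothing accessories (belts and so on)"
)

# Same pattern as above, applied element-wise: keep everything before " (".
b1 <- gsub("(.*) \\(.*$", "\\1", x)

# Same result here without a capture group: delete from " (" to the end.
b2 <- sub(" \\(.*$", "", x)

b1
identical(b1, b2)
```

Elements without a parenthetical do not match the pattern and are returned unchanged.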

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Mar 16, 2021 at 5:36 PM <[hidden email]> wrote:

> I have a problem with the str_replace() function in the stringr package.
> Please refer to my reprex below.

Re: Problem with the str_replace function

Hervé Pagès-3
In reply to this post by phil-3
Hi,

stringr::str_replace() treats the 2nd argument ('pattern') as a regular
expression and some characters have a special meaning when they are used
in a regular expression. For example the dot plays the role of a
wildcard (i.e. it means "any character"):

   > str_replace("aaXcc", "a.c", "ZZ")
   [1] "aZZc"

If you want to treat a special character literally, you need to escape
it with a double backslash '\\':

   > str_replace(c("aaXcc", "aa.cc"), "a.c", "ZZ")
   [1] "aZZc" "aZZc"

   > str_replace(c("aaXcc", "aa.cc"), "a\\.c", "ZZ")
   [1] "aaXcc" "aZZc"

It turns out that parentheses are also special characters, so you also
need to escape them:

   > str_replace("aa(X)cc", "a(X)c", "ZZ")
   [1] "aa(X)cc"

   > str_replace("aa(X)cc", "a\\(X\\)c", "ZZ")
   [1] "aZZc"
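
To make the reason concrete, here is a short sketch of what unescaped parentheses do in a pattern: they delimit a capture group rather than matching literal "(" and ")" characters. The variable names r1, r2 and r3 are only for illustration.

```r
library(stringr)

# Unescaped parentheses form a group, so "a(X)c" matches the text "aXc" ...
r1 <- str_replace("aaXcc", "a(X)c", "ZZ")

# ... and therefore finds nothing to replace in a string containing "a(X)c".
r2 <- str_replace("aa(X)cc", "a(X)c", "ZZ")

# The text matched by the group can be reused in the replacement as "\\1".
r3 <- str_replace("aaXcc", "a(X)c", "[\\1]")

r1  # "aZZc"
r2  # "aa(X)cc"
r3  # "a[X]c"
```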

There are plenty of examples in the man page for str_replace() (see
'?str_replace'), including examples showing the use of parentheses in
the pattern.
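
Applied to the reprex from the original post, a minimal sketch could look like the following. It shows two options: escaping the parentheses, or wrapping the pattern in stringr's fixed() so that it is compared literally instead of being interpreted as a regular expression. The names x4a and x4b are illustrative, and x is shortened to its four distinct elements. The apostrophe is not a special character in a regular expression, so it needs no escaping.

```r
library(stringr)

x <- c(
  "Clothing and footwear",
  "Women's clothing",
  "Women's footwear (excluding athletic)",
  "Clothing accessories (belts and so on)"
)

# Option 1: escape the parentheses so they match literal "(" and ")".
x4a <- str_replace(x,
  "Women's footwear \\(excluding athletic\\)",
  "Women's footwear")

# Option 2: fixed() treats the whole pattern as a literal string.
x4b <- str_replace(x,
  fixed("Women's footwear (excluding athletic)"),
  "Women's footwear")

x4a
identical(x4a, x4b)
```

The same reasoning applies to the x3 pattern in the reprex, which also contains unescaped parentheses.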

Hope this helps,

H.


On 3/16/21 5:34 PM, [hidden email] wrote:

> I have a problem with the str_replace() function in the stringr package.
> Please refer to my reprex below.

--
Hervé Pagès

Bioconductor Core Team
[hidden email]

Re: Problem with the str_replace function

phil-3
Your help is much appreciated. I now understand what my problem was and
can move forward.

Philip


On 2021-03-17 01:19, Hervé Pagès wrote:

> Hi,
>
> stringr::str_replace() treats the 2nd argument ('pattern') as a
> regular expression and some characters have a special meaning when
> they are used in a regular expression. For example the dot plays the
> role of a wildcard (i.e. it means "any character"):
>
>   > str_replace("aaXcc", "a.c", "ZZ")
>   [1] "aZZc"