I have a problem with the str_replace() function in the stringr package.
Please refer to my reprex below.

I start with a vector of strings, called x. Some of the strings contain apostrophes and brackets. I make a simple replacement as with x1, and there is no problem. I make another simple replacement, x2, where the pattern string has an apostrophe. Again no problem. Then I make a third replacement, x3, where the pattern has opening and closing brackets and the function still works fine. Finally I make a replacement where the pattern has both an apostrophe and opening and closing brackets and the replacement does not work. I tried to solve this by putting backslashes before the apostrophe and/or the brackets, but that accomplished nothing. I am stumped.

# Reprex for str_replace problem

library(stringr)

x <- c(
  "Clothing and footwear",
  "Women's clothing",
  "Women's footwear (excluding athletic)",
  "Clothing accessories (belts and so on)",
  "Clothing and footwear",
  "Women's clothing",
  "Women's footwear (excluding athletic)",
  "Clothing accessories (belts and so on)"
)
x
x1 <- str_replace(x,
  "Clothing and footwear",
  "Clothing and shoes"
)
x1
x2 <- str_replace(x,
  "Women's clothing",
  "Women's clothing goods"
)
x2
x3 <- str_replace(x,
  "Clothing accessories (belts and so on)",
  "Clothing accessories")
x3
x4 <- str_replace(x,
  "Women's footwear (excluding athletic)",
  "Women's footwear")
x4

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
I prefer using regular expressions directly, so this may not satisfy you:

  > a <- "Women's footwear (excluding athletic)"
  > b <- gsub("(.*) \\(.*$", "\\1", a)
  > b
  [1] "Women's footwear"

There are, of course, other ways to do this with regexes or even substring().

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Mar 16, 2021 at 5:36 PM <[hidden email]> wrote:
> I have a problem with the str_replace() function in the stringr package.
> [...]
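As a rough sketch of how the base R route could be applied to a whole vector rather than a single string: the example below uses a shortened copy of `x` from the reprex, and the names `stripped` and `x4` are only illustrative. The `fixed = TRUE` variant is an addition that the reply above does not mention; it is a standard argument of `sub()` that switches off regular-expression interpretation.

```r
# Shortened copy of the vector from the original post.
x <- c(
  "Clothing and footwear",
  "Women's clothing",
  "Women's footwear (excluding athletic)",
  "Clothing accessories (belts and so on)"
)

# Regex approach: drop a trailing " (...)" group from every element.
# "\\(" and "\\)" match literal parentheses.
stripped <- sub(" \\(.*\\)$", "", x)

# Literal approach: fixed = TRUE treats the pattern as plain text,
# so the parentheses need no escaping.
x4 <- sub("Women's footwear (excluding athletic)", "Women's footwear",
          x, fixed = TRUE)

print(stripped)
print(x4)
```

The first form removes every parenthesised suffix at once, while the second changes only the one string that matches the literal pattern exactly.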
Hi,

stringr::str_replace() treats the 2nd argument ('pattern') as a regular expression, and some characters have a special meaning when they are used in a regular expression. For example, the dot plays the role of a wildcard (i.e. it means "any character"):

  > str_replace("aaXcc", "a.c", "ZZ")
  [1] "aZZc"

If you want to treat a special character literally, you need to escape it with a double backslash '\\':

  > str_replace(c("aaXcc", "aa.cc"), "a.c", "ZZ")
  [1] "aZZc" "aZZc"

  > str_replace(c("aaXcc", "aa.cc"), "a\\.c", "ZZ")
  [1] "aaXcc" "aZZc"

It turns out that parentheses are also special characters, so you also need to escape them:

  > str_replace("aa(X)cc", "a(X)c", "ZZ")
  [1] "aa(X)cc"

  > str_replace("aa(X)cc", "a\\(X\\)c", "ZZ")
  [1] "aZZc"

There are plenty of examples in the man page for str_replace() (see '?str_replace'), including examples showing the use of parentheses in the pattern.

Hope this helps,

H.

On 3/16/21 5:34 PM, [hidden email] wrote:
> I have a problem with the str_replace() function in the stringr package.
> [...]

--
Hervé Pagès
Bioconductor Core Team
[hidden email]
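Applied to the failing `x4` call from the reprex, the escaping advice might look like the sketch below. The vector is a shortened copy of the original `x`, and the names `x4_escaped`, `x4_fixed` and `x4_unescaped` are made up for illustration. The second option relies on stringr's `fixed()` helper, which the reply does not mention; it asks for a literal match, so nothing needs escaping.

```r
library(stringr)

# Shortened copy of the vector from the original post.
x <- c(
  "Women's clothing",
  "Women's footwear (excluding athletic)",
  "Clothing accessories (belts and so on)"
)

# Option 1: escape each parenthesis with a double backslash.
x4_escaped <- str_replace(
  x,
  "Women's footwear \\(excluding athletic\\)",
  "Women's footwear"
)

# Option 2: wrap the pattern in fixed() to match it as plain text.
x4_fixed <- str_replace(
  x,
  fixed("Women's footwear (excluding athletic)"),
  "Women's footwear"
)

# For comparison: unescaped, the parentheses form a capture group, so the
# pattern looks for "Women's footwear excluding athletic" and finds nothing.
x4_unescaped <- str_replace(
  x,
  "Women's footwear (excluding athletic)",
  "Women's footwear"
)

print(x4_escaped)
print(x4_fixed)
print(x4_unescaped)
```

The apostrophe plays no part here, since it has no special meaning in a regular expression. The same reasoning applies to any pattern that contains unescaped parentheses, so the `x3` pattern in the reprex is worth re-checking in the same way.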
Your help is much appreciated. I now understand what my problem was and
can move forward.

Philip

On 2021-03-17 01:19, Hervé Pagès wrote:
> Hi,
> [...]