Extracting text from a character string


Extracting text from a character string

Shawn Way
 I have a set of character strings like below:
 
> data3[1]
[1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
>
 
I am trying to extract the text 03-27-2002 and convert this into a date
for the same record.  I keep looking at the grep function, however I
cannot quite get it to work.
 
grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
 
Any hints?
 

-------------------------------------------------------------------------------
Shawn Way
14 Cambridge Center
Cambridge, MA 02142

Ph:617-679-4488

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Extracting text from a character string

Marc Schwartz
On Fri, 2007-03-09 at 15:23 -0500, Shawn Way wrote:

>  I have a set of character strings like below:
>  
> > data3[1]
> [1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
> >
>  
> I am trying to extract the text 03-27-2002 and convert this into a date
> for the same record.  I keep looking at the grep function, however I
> cannot quite get it to work.
>  
> grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
>  
> Any hints?


At least two different ways:

Vec <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"


1. Using substr(), if your source strings all have the same fixed format:

# Get the 11th through the 20th characters
> substr(Vec, 11, 20)
[1] "03-27-2002"


2. Using sub() for a more generalized approach:

# Use a back reference to return only the text matched by the
# pattern within the parentheses

> sub(".+([0-9]{2}-[0-9]{2}-[0-9]{4}).+", "\\1", Vec)
[1] "03-27-2002"
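
As a side note on why the original grep() call does not do the job: grep() with value=TRUE returns the whole element that matches, not the matched piece, and in current versions of R a single backslash in "\d" is rejected as an unrecognized escape, so it has to be written "\\d". A minimal sketch of one alternative, using regexpr() and regmatches() from base R (the variable names here are only illustrative):

```r
# Example string from the original post
Vec <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# Backslashes must be doubled inside an R string literal
pattern <- "\\d{2}-\\d{2}-\\d{4}"

# grep(value = TRUE) returns the entire matching element
whole <- grep(pattern, Vec, perl = TRUE, value = TRUE)

# regexpr() locates the first match in each element;
# regmatches() extracts the matched text
date_txt <- regmatches(Vec, regexpr(pattern, Vec, perl = TRUE))
print(date_txt)   # [1] "03-27-2002"
```

Note that regmatches() drops elements with no match, whereas sub() returns non-matching elements unchanged, so the two behave differently on a vector where some strings contain no date.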


See ?substr, ?sub and ?regex
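
For the second half of the question, converting the extracted text into a date, something along these lines should work. This is a sketch that assumes every record stores the date as month-day-year, as in the example; date_txt and date_val are just illustrative names:

```r
# Example string from the original post
Vec <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# Extract the date text with the back-reference approach
date_txt <- sub(".+([0-9]{2}-[0-9]{2}-[0-9]{4}).+", "\\1", Vec)

# %m = month, %d = day of month, %Y = four-digit year
date_val <- as.Date(date_txt, format = "%m-%d-%Y")
print(date_val)   # [1] "2002-03-27"
```

Both sub() and as.Date() are vectorized, so the same two calls can be applied to the full data3 vector at once. See ?as.Date and ?strptime for the format codes.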

HTH,

Marc Schwartz
