convert 'character' vector containing mixed formats to 'Date'

Liviu Andronic
Dear all
I have a 'character' vector containing mixed formats (thanks Excel!)
and I'd like to translate it into a default "%Y-%m-%d" Date vector.
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")


In the above you will see that some dates are of format="%d/%m/%Y",
others of format="%Y%m%d", and some are NA values. Can you suggest a
straightforward way of transforming these to a uniform 'character' or
'Date' vector? I tried the following, but it outputs very
strange results:
> x
 [1] "1/3/2005"   "13/04/2004" "2/5/2005"   "2/5/2005"   "7/5/2007"   "22/04/2004"
 [7] "21/04/2005" "20080430"   "13/05/2003" "20080529"   NA           NA
[13] "19/05/1999" "17/05/2000" "17/05/2000"
> sum(xa <- grepl('/', x))
[1] 11
> sum(xb  <- grepl('200', substr(x, 1,4)))
[1] 2
> sum(xc <- is.na(x))
[1] 2
> x[xa] <- as.Date(x[xa], format="%d/%m/%Y")
> x[xb] <- as.Date(x[xb], format="%Y%m%d")
> x
 [1] "12843" "12521" "12905" "12905" "13640" "12530" "12894" "13999" "12185" "14028"
[11] NA      NA      "10730" "11094" "11094"


The culprit is likely that the 'x' vector is 'character' throughout,
but I'm not sure how to work around it. For example, I couldn't figure
out how to create an empty 'Date' vector. Regards
Liviu
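
The "strange results" can be reproduced in isolation. The following is a minimal sketch in base R; the two-element vector `v` and the names `d` and `back` are made up for illustration and are not from the post. Assigning a Date into a character vector stores the Date's underlying day count (days since 1970-01-01) as text, which is where values like "12843" come from.

```r
# A Date is stored internally as the number of days since 1970-01-01.
d <- as.Date("1/3/2005", format = "%d/%m/%Y")
unclass(d)   # 12843

# Subassigning it into a character vector keeps the vector 'character',
# so the Date is coerced to its day count and stored as a string.
v <- c("1/3/2005", "20080430")   # illustrative vector, not the full data
v[1] <- d
v            # "12843" "20080430"

# The day count can be mapped back to a Date with an explicit origin.
back <- as.Date(as.numeric(v[1]), origin = "1970-01-01")
back         # "2005-03-01"
```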


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: convert 'character' vector containing mixed formats to 'Date'

Liviu Andronic
On Thu, Jun 21, 2012 at 2:48 PM, Liviu Andronic <[hidden email]> wrote:
> The culprit is likely that the 'x' vector is 'character' throughout,
> but I'm not sure how to work around it. For example, I couldn't figure
> out how to create an empty 'Date' vector. Regards
>

I think I managed to crack this by myself. I only needed to add an
as.character() call:
> x[xa] <- as.character(as.Date(x[xa], format="%d/%m/%Y"))
> x[xb] <- as.character(as.Date(x[xb], format="%Y%m%d"))
> x
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07" "2004-04-22"
 [7] "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29" NA           NA
[13] "1999-05-19" "2000-05-17" "2000-05-17"


Liviu
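
If a real 'Date' vector is wanted rather than uniform text, the ISO-formatted strings produced this way can be passed through as.Date() once more. The following is a small sketch on a shortened, illustrative vector. The "^[0-9]{8}$" pattern is an assumption used here to spot the compact format, in place of the '200' prefix test from the original post.

```r
# Shortened example data (illustrative subset of the original vector)
x <- c("1/3/2005", "20080430", NA, "19/05/1999")

xa <- grepl("/", x)              # slash-separated entries
xb <- grepl("^[0-9]{8}$", x)     # assumed test for the compact YYYYMMDD form

# Normalise both formats to ISO text, as in the post
x[xa] <- as.character(as.Date(x[xa], format = "%d/%m/%Y"))
x[xb] <- as.character(as.Date(x[xb], format = "%Y%m%d"))

# ISO strings parse with the default format, NA stays NA
dates <- as.Date(x)
class(dates)   # "Date"
```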


Re: convert 'character' vector containing mixed formats to 'Date'

PIKAL Petr
In reply to this post by Liviu Andronic
Hi


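You can use another as.Date with origin specified, since ifelse() returns
the underlying day counts rather than Dates.

# 'ind' was not defined in the post; assumed here to flag the slash-formatted entries
ind <- grepl("/", x)
as.Date(ifelse(ind, as.Date(x, format="%d/%m/%Y"),
               as.Date(x, format="%Y%m%d")), origin="1970-01-01")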
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07"
 [6] "2004-04-22" "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29"
[11] NA           NA           "1999-05-19" "2000-05-17" "2000-05-17"
 
Regards
Petr
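
The reason the outer as.Date() needs an origin can be seen by inspecting what ifelse() returns. The following is a minimal sketch on a shortened vector; `ind` is assumed to mark the slash-formatted entries, and `raw` and `fixed` are illustrative names.

```r
x <- c("1/3/2005", "20080430", NA)   # illustrative subset
ind <- grepl("/", x)                 # assumed definition of 'ind'

# ifelse() builds its result from the logical test vector, so the
# Date class of the 'yes' and 'no' arguments is lost.
raw <- ifelse(ind,
              as.Date(x, format = "%d/%m/%Y"),
              as.Date(x, format = "%Y%m%d"))
class(raw)   # "numeric"
raw          # 12843 13999 NA

# Re-attach the Date class by naming the epoch explicitly.
fixed <- as.Date(raw, origin = "1970-01-01")
fixed        # "2005-03-01" "2008-04-30" NA
```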


Re: convert 'character' vector containing mixed formats to 'Date'

Duncan Murdoch-2
In reply to this post by Liviu Andronic
On 12-06-21 8:48 AM, Liviu Andronic wrote:

> The culprit is likely that the 'x' vector is 'character' throughout,
> but I'm not sure how to work around it. For example, I couldn't figure
> out how to create an empty 'Date' vector. Regards

You probably don't want the vector to be empty, so something like this
would work:

y <- as.Date(rep(NA, 15))

Then things like

y[xa] <- as.Date(x[xa], format="%d/%m/%Y")

etc. should work.

Duncan Murdoch
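
Spelled out in full, the preallocation approach might look like the sketch below. It uses length(x) instead of the literal 15, and the definition of `xb` (everything that is neither slash-formatted nor NA) is an assumption made for this example.

```r
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")

xa <- grepl("/", x)          # entries in %d/%m/%Y
xb <- !xa & !is.na(x)        # assumed: the remaining non-NA entries are %Y%m%d

# An all-NA Date vector of the right length; because 'y' already has
# class Date, subassignment keeps the values as Dates.
y <- as.Date(rep(NA, length(x)))
y[xa] <- as.Date(x[xa], format = "%d/%m/%Y")
y[xb] <- as.Date(x[xb], format = "%Y%m%d")
y
```

The original `x` is left untouched, so the character input and the Date result can be compared side by side.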


Re: convert 'character' vector containing mixed formats to 'Date'

arun kirshna
In reply to this post by Liviu Andronic
Hi,

You could also try it with strptime to get the result:

x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
      "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
      NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")
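x <- as.character(na.omit(x))   # drop the NA entries (later positions shift)
x1 <- strptime(x, "%d/%m/%Y")   # NA where the slash format does not match
x2 <- strptime(x, "%Y%m%d")     # NA where the compact format does not match
x1[is.na(x1)] <- c(x2[8], x2[10])
x1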
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07"
 [6] "2004-04-22" "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29"
[11] "1999-05-19" "2000-05-17" "2000-05-17"

A.K.
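
Note that na.omit() removes the two NA entries, so the result has 13 elements and no longer lines up with the original 15. A variant that keeps the NA positions is sketched below, assuming it is acceptable to convert the strptime() results to 'Date' right away. The tz = "UTC" argument and the name `expected` are additions for this example, not part of the post.

```r
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")

# Parse every element with each format; non-matching entries become NA.
x1 <- as.Date(strptime(x, "%d/%m/%Y", tz = "UTC"))
x2 <- as.Date(strptime(x, "%Y%m%d", tz = "UTC"))

# Fill the gaps of the first parse from the second, position by position.
# Entries that are NA in the input stay NA in both parses.
x1[is.na(x1)] <- x2[is.na(x1)]
x1
```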





Re: convert 'character' vector containing mixed formats to 'Date'

arun kirshna
Hi,
# Instead of
x1[is.na(x1)] <- c(x2[8], x2[10])
# you can use a more general form:
x1[is.na(x1)] <- x2[!is.na(x2)]
x1
 [1] "2005-03-01" "2004-04-13" "2005-05-02" "2005-05-02" "2007-05-07"
 [6] "2004-04-22" "2005-04-21" "2008-04-30" "2003-05-13" "2008-05-29"
[11] "1999-05-19" "2000-05-17" "2000-05-17"


A.K.
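
The same idea extends to any number of candidate formats. The helper below is only a sketch: `parse_mixed_dates` and its `formats` argument are hypothetical names invented for this example, not an existing R function. Each format is tried in turn on the entries that have not yet parsed.

```r
# Hypothetical helper: try each format in order, keep the first that parses.
parse_mixed_dates <- function(x, formats = c("%d/%m/%Y", "%Y%m%d")) {
  out <- as.Date(rep(NA, length(x)))        # all-NA Date vector
  for (fmt in formats) {
    todo <- is.na(out) & !is.na(x)          # entries still unparsed
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

x <- c("1/3/2005", "20080430", NA, "19/05/1999", "20080529")
res <- parse_mixed_dates(x)
res
```

The order of `formats` matters when a string could match more than one format, so the more specific format should come first.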



