transforming dates

reichmaj
R-Help Forum

I have a data set that contains a date field, but the dates are in two
formats:

11/7/2016     dd/mm/yyyy
14-07-16      dd-mm-yy

How would I go about correcting this problem? Should I separate the dates,
format them, and then recombine?

Sincerely,

Jeff Reichman

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: transforming dates

Bert Gunter-2
Well, one way to do it is via regexes -- no splitting and recombining
needed.
Note: This will convert a factor into a character vector.

> z <- c("11/7/2016", "14-07-16")
> ## in the replacement, "/\\1" is a literal "/" followed by backreference \1
> z <- gsub("-([[:digit:]]{2})-([[:digit:]]{2})", "/\\1/20\\2", z)
> z
[1] "11/7/2016"  "14/07/2016"

I leave it to you as an exercise to either convert 7 to 07 or vice-versa if
you want to do this.
Note, if you have spaces sprinkled inconsistently around your separators,
you'll have to work a bit harder with your regex.
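
For readers who want actual Date objects rather than cleaned-up strings, here is a minimal sketch in base R. It swaps in a slightly different normalisation from the gsub() call above (replace every "-" with "/", then expand a trailing two-digit year), and it assumes, as that call does, that every two-digit year belongs to the 2000s. The names z_norm and dates are made up for illustration.

```r
z <- c("11/7/2016", "14-07-16")

# Use "/" as the only separator
z_norm <- gsub("-", "/", z, fixed = TRUE)

# Expand a trailing two-digit year: "/16" at the end becomes "/2016".
# Four-digit years are untouched because "/" must directly precede the two digits.
z_norm <- sub("/([[:digit:]]{2})$", "/20\\1", z_norm)

# Every string is now day/month/four-digit-year, so one format is enough
dates <- as.Date(z_norm, format = "%d/%m/%Y")
format(dates, "%d/%m/%Y")
```

Once the values are Dates, the single-digit month question goes away: format() pads day and month on output.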

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



Re: transforming dates

Rui Barradas
In reply to this post by reichmaj
Hello,

I believe the simplest is to use package lubridate. Its functions try
several formats until either one is right or none fits the data.

x <- c('11/7/2016', '14-07-16')
lubridate::dmy(x)
#[1] "2016-07-11" "2016-07-14"


The order dmy must be the same for all vector elements; if it is not, the
elements in a different order fail to parse:

y <- c('11/7/2016', '14-07-16', '2016/7/11')
lubridate::dmy(y)
#[1] "2016-07-11" "2016-07-14" NA
#Warning message:
# 1 failed to parse.
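
If lubridate is not available, or the vector mixes day-first and year-first strings as y does, one base R workaround is to choose a format per element from the shape of the string. This is only a sketch: parse_mixed() is a hypothetical helper, and it assumes that a leading four-digit field means year/month/day and that everything else is day/month/year.

```r
parse_mixed <- function(x) {
  # Strip whitespace, then use "/" as the only separator
  x <- gsub("-", "/", gsub("[[:space:]]", "", x), fixed = TRUE)

  # Pick a strptime format from the shape of each string
  fmt <- ifelse(grepl("^[0-9]{4}/", x), "%Y/%m/%d",
         ifelse(grepl("/[0-9]{2}$", x), "%d/%m/%y", "%d/%m/%Y"))

  # Parse each group of strings with its own format
  out <- rep(as.Date(NA), length(x))
  for (f in unique(fmt)) {
    idx <- fmt == f
    out[idx] <- as.Date(x[idx], format = f)
  }
  out
}

y <- c('11/7/2016', '14-07-16', '2016/7/11')
parse_mixed(y)
```

The shape test cannot tell dd/mm from mm/dd; that still has to come from knowledge of the data.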


Hope this helps,

Rui Barradas


Re: transforming dates

Bert Gunter-2
Rui is right -- lubridate's functionality and robustness are better -- but
just for fun, here is a simple function, poorly named reformat(), that
splits up the dates, cleans them up and standardizes them a bit, and
spits them back out with a sep character of your choice (your original
split-and-recombine suggestion). Lubridate probably does something similar
but more sophisticated, but maybe it's worthwhile to see how one can do it
using basic functionality. It requires only a few short lines of code.

reformat <- function(z, sep = "-"){
   z <- gsub(" ","",z) ## remove blanks
   ## break up dates into 3 component pieces and convert to matrix
   z <- matrix(unlist(strsplit(z, "-|/")), nrow = 3)
   ## add "0" in front of single digit in dd and mm
   ## add "20" in front  of "yy"
   for(i in 1:2) z[i, ] <- gsub("\\<([[:digit:]])\\>","0\\1",z[i, ])
   z[3, ] <- sub("\\<([[:digit:]]{2})\\>","20\\1",z[3, ])
   ## combine back into single string separated by sep
   paste(z[1, ],z[2, ],z[3, ], sep = sep)
}

## Test it
> z <- c(" 1 / 22 /2015"," 1 -5 -15","11/7/2016", "14-07-16")

> reformat(z)
[1] "01-22-2015" "01-05-2015" "11-07-2016" "14-07-2016"

> reformat(z,"/")
[1] "01/22/2015" "01/05/2015" "11/07/2016" "14/07/2016"
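
A variant of the same split-and-recombine idea, sketched under the same assumptions (three numeric components per date, two-digit years in the 2000s). The name reformat2() is hypothetical. It pads with sprintf() instead of regexes, and it stops if any entry does not split into exactly three pieces, a case in which the matrix() call in reformat() could misalign the components.

```r
reformat2 <- function(z, sep = "-") {
  # Remove whitespace and split each date on "-" or "/"
  parts <- strsplit(gsub("[[:space:]]", "", z), "[-/]")

  # Refuse input that does not have exactly three components per date
  stopifnot(all(lengths(parts) == 3L))

  # One column per date; rows are the first, second and third component
  m <- matrix(as.integer(unlist(parts)), nrow = 3)

  # Assumption: a year below 100 is a two-digit year in the 2000s
  yr <- ifelse(m[3, ] < 100L, m[3, ] + 2000L, m[3, ])

  # Zero-pad the first two components and rejoin with sep
  sprintf("%02d%s%02d%s%04d", m[1, ], sep, m[2, ], sep, yr)
}

z <- c(" 1 / 22 /2015", " 1 -5 -15", "11/7/2016", "14-07-16")
reformat2(z)
reformat2(z, "/")
```

Like reformat(), this only tidies the strings; it does not check that the first component is a valid day, so " 1 / 22 /2015" passes through as "01-22-2015".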

Bert Gunter

Re: transforming dates

David Winsemius

If one wants to investigate existing efforts at automatic date _and_
time reformatting, then do not forget Dirk's anytime package:


https://cran.r-project.org/web/packages/anytime/index.html


--

David.

Re: transforming dates

Bert Gunter-2
Yes, indeed.

Thanks, David.

Cheers,
Bert


Re: transforming dates

Peter Dalgaard-2
In reply to this post by David Winsemius


> On 3 Nov 2019, at 21:22 , David Winsemius <[hidden email]> wrote:
>
>
> On 11/3/19 11:51 AM, Bert Gunter wrote:
    =======

Hey, that's my birthday! Err, no it isn't... ;-)

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email]  Priv: [hidden email]


Re: transforming dates

Spencer Graves-4


On 2019-11-03 17:04, Peter Dalgaard wrote:
>
>> On 3 Nov 2019, at 21:22 , David Winsemius <[hidden email]> wrote:
>>
>>
>> On 11/3/19 11:51 AM, Bert Gunter wrote:
> =======
>
> Hey, that's my birthday! Err, no it isn't... ;-)
>

       Is that November 11 of 2019 or March 19 of 2011 or 11 March 2019?


The English still use stones as a unit of mass, and most of the
US still steadfastly refuses to seriously consider metrication or ISO
8601. I know an architect in the US who has worked on several
different projects every year for the past 40 years, only one of which
has been in metric units.


Binary, octal or hex is superior to decimal, except for the fact
that most humans have 10 digits on hands and feet. And decimal is
vastly superior to arithmetic in mixed bases, e.g., adding miles, rods,
yards, feet, inches, and 64ths.


       Spencer Graves


Re: transforming dates

Spencer Graves-4
In reply to this post by Peter Dalgaard-2


       Is that November 3 of 2019 or March 19 of 2011 or 11 March 2019? 
[please excuse the typo in the earlier response]
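
The three readings can be checked directly in base R. This is just an illustration of the ambiguity in the quoted timestamp "11/3/19"; the variable names are made up.

```r
s <- "11/3/19"

mdy <- as.Date(s, format = "%m/%d/%y")  # month/day/year: November 3 of 2019
dmy <- as.Date(s, format = "%d/%m/%y")  # day/month/year: 11 March 2019
ymd <- as.Date(s, format = "%y/%m/%d")  # year/month/day: March 19 of 2011

# Written out in ISO 8601 order, the three dates can no longer be confused
format(c(mdy, dmy, ymd), "%Y-%m-%d")
```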



       Spencer Graves


transforming dates

Stefano Sofia
In reply to this post by reichmaj
Hello.
I am not sure whether this hint is useful, but I share it anyway: what about using the as.POSIXct function?

as.POSIXct(mydate, format = "%d-%m-%Y") or as.POSIXct(mydate, format = "%d/%m/%Y")
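
One thing to watch with this route, shown here as a small sketch on the two example strings from the original question: the format has to match both the separator and the year width of each string. In particular "%Y" will read the two digits "16" as the year 16, so two-digit years need "%y" instead. The variable names and the tz = "UTC" setting are assumptions made for the illustration.

```r
mydate <- c("11/7/2016", "14-07-16")

# "%Y" accepts "16" as the literal year 16, not 2016
wrong <- as.Date(mydate[2], format = "%d-%m-%Y")
as.POSIXlt(wrong)$year + 1900   # 16

# "%y" is the code for a two-digit year
right <- as.Date(mydate[2], format = "%d-%m-%y")
format(right)                   # "2016-07-14"

# The separator in the format must match as well: "/" against "-" gives NA
as.Date(mydate[1], format = "%d-%m-%Y")

# as.POSIXct takes the same codes; tz is fixed only to keep the result reproducible
as.POSIXct(mydate[1], format = "%d/%m/%Y", tz = "UTC")
```

So with mixed input, each format string should be applied only to the elements it actually fits.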

Hope this helps
Stefano

         (oo)
--oOO--( )--OOo----------------
Stefano Sofia PhD
Civil Protection - Marche Region
Meteo Section
Snow Section
Via del Colle Ameno 5
60126 Torrette di Ancona, Ancona
Uff: 071 806 7743
E-mail: [hidden email]
---Oo---------oO----------------
