Bug in POSIXct string representation?

Festl, Andreas
Dear all,

I have just identified the following issue, which I believe could be a bug in R:

Let me illustrate:

First, enable the display of fractional seconds and check that it works:
> options(digits.secs = 6, digits = 6)
> as.character(as.POSIXct("2018-08-31 14:15:16.123456"))
[1] "2018-08-31 14:15:16.123456"

Now create a sequence of POSIXct values with a step width of 0.1 s:
> test <- as.POSIXct("2018-08-31 14:15:16.000000")
> test_seq <- seq(test, test + 1, by = 1/10)

Calling format with the fractional-seconds conversion specification (%OS) gives the intended result (even though there is a small representation error). Note that %T already includes the seconds, so the time is spelled out as %H:%M:%OS here:
> format(test_seq, "%F %H:%M:%OS")
 [1] "2018-08-31 14:15:16.000000" "2018-08-31 14:15:16.099999" "2018-08-31 14:15:16.200000" "2018-08-31 14:15:16.299999"
 [5] "2018-08-31 14:15:16.400000" "2018-08-31 14:15:16.500000" "2018-08-31 14:15:16.599999" "2018-08-31 14:15:16.700000"
 [9] "2018-08-31 14:15:16.799999" "2018-08-31 14:15:16.900000" "2018-08-31 14:15:17.000000"

However, if I use as.character, the fractional seconds are apparently cut off after one digit, resulting in incorrect representations:
> as.character(test_seq)
 [1] "2018-08-31 14:15:16.0" "2018-08-31 14:15:16.0" "2018-08-31 14:15:16.2" "2018-08-31 14:15:16.2" "2018-08-31 14:15:16.4" "2018-08-31 14:15:16.5"
 [7] "2018-08-31 14:15:16.5" "2018-08-31 14:15:16.7" "2018-08-31 14:15:16.7" "2018-08-31 14:15:16.9" "2018-08-31 14:15:17.0"

It seems to me that R correctly decides that there is only one significant digit after the decimal point, but then truncates at that digit instead of rounding. Because of the representation error, a value stored as 16.099999... is therefore printed as 16.0 rather than 16.1.
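
A minimal sketch of the effect described above, using plain numerics so the two behaviors can be compared side by side. It assumes UTC to stay independent of the local time zone (the session above used the local zone), and the names t0, x, frac, truncated and rounded are only illustrative:

```r
# Assumption: UTC, so the sketch does not depend on the session's time zone.
t0 <- as.POSIXct("2018-08-31 14:15:16", tz = "UTC")
x  <- seq(t0, t0 + 1, by = 1/10)

# Fractional offset of each stored double from the starting whole second.
frac <- as.numeric(x) - as.numeric(t0)

# Cutting off after one decimal place versus rounding to one decimal place.
truncated <- floor(frac * 10) / 10
rounded   <- round(frac, 1)

# Show the stored offsets with more digits than the default print uses.
print(sprintf("%.9f", frac))
print(data.frame(truncated, rounded))
```

On a platform with IEEE 754 doubles, the truncated column should follow the same 0.0, 0.0, 0.2, 0.2, 0.4, ... pattern as the as.character output, while the rounded column gives the intended tenths.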

BR,
  Andreas

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Bug in POSIXct string representation?

Joshua Ulrich
Hi Andreas,

On Thu, Aug 9, 2018 at 2:26 AM, Festl, Andreas <[hidden email]> wrote:

> It seems to me, that R correctly decides that there is only one significant digit after the decimal point, but then incorrectly (due to representation error) just cuts off after the first digit.
>
This is known behavior in how POSIXt objects are printed, and has
been discussed before on R-help:
https://stat.ethz.ch/pipermail/r-help/2015-June/429600.html

Basically, the behavior results from fractional seconds being
truncated rather than rounded, combined with the floating point
representation error you noticed.  Truncation is also the behavior
when printing whole seconds (shown here with digits.secs unset):
> format(as.POSIXct("2018-08-31 14:15:16.9"))  # 16s, not 17s
[1] "2018-08-31 14:15:16"

So it would not be consistent to round fractional seconds, unless you
kept track of the rounding error relative to the desired resolution.

There are more details in the R-help thread and the StackOverflow Q&A
it references.
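
One common workaround, sketched here under stated assumptions rather than as the recommended fix, is to shift each value by half a unit of the last displayed place before formatting, so that the truncation done by %OSn acts like rounding. The helper format_rounded and its arguments are hypothetical and not part of base R:

```r
# Hypothetical helper (not part of base R): format a POSIXct vector rounded,
# rather than truncated, to `digits` fractional-second places.
format_rounded <- function(x, digits = 1L, tz = "UTC") {
  stopifnot(inherits(x, "POSIXct"), digits >= 1L, digits <= 6L)
  # Shift by half a unit of the last displayed place, so that the
  # truncation performed by %OSn behaves like round-half-up.
  shifted <- x + 0.5 * 10^(-digits)
  format(shifted, paste0("%Y-%m-%d %H:%M:%OS", digits), tz = tz)
}

# Assumption: UTC, to keep the example independent of the local time zone.
t0 <- as.POSIXct("2018-08-31 14:15:16", tz = "UTC")
x  <- seq(t0, t0 + 1, by = 1/10)
res <- format_rounded(x)
print(res)
```

This only makes sense when the data are known to sit on the chosen resolution (tenths here). For arbitrary timestamps it changes the semantics from truncation to rounding, which is exactly the inconsistency with whole-second printing mentioned above.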

Best,
Josh
