I noticed the following peculiarity with `serialize()' when `ascii = TRUE' is
used. In today's (svn r37299) R-devel, I get

> set.seed(10)
> x <- rnorm(10)
>
> a <- serialize(x, con = NULL, ascii = TRUE)
> b <- unserialize(a)
>
> identical(x, b)  ## FALSE
[1] FALSE
> x - b
 [1] -3.469447e-18  2.775558e-17 -4.440892e-16  0.000000e+00  5.551115e-17
 [6] -5.551115e-17 -4.440892e-16  0.000000e+00  2.220446e-16 -5.551115e-17

I expected `x' and `b' to be identical, which is what I get when `ascii = FALSE':

> a <- serialize(x, con = NULL, ascii = FALSE)
> b <- unserialize(a)
>
> identical(x, b)  ## TRUE
[1] TRUE

The same phenomenon occurs with `.saveRDS(ascii = TRUE)',

> .saveRDS(x, file = "asdf", ascii = TRUE)
> d <- .readRDS("asdf")
>
> identical(x, d)  ## FALSE
[1] FALSE

Has anyone noticed this before? I didn't see anything in the docs for
`serialize()' that would indicate this behavior should be expected.

I'm on Linux Fedora Core 4.

-roger
-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

It is known (happens with save() too and did in earlier save formats).
Nothing particularly clever is done (the format is "%.16g\n") and similarly
as.character/parse are not inverses.

Perhaps more relevant is

> b/x -1
 [1]  0.000000e+00 -1.110223e-16  2.220446e-16  0.000000e+00  0.000000e+00
 [6]  2.220446e-16  4.440892e-16  0.000000e+00  2.220446e-16  0.000000e+00

so the error (on my system) is about what you would expect from
floating-point computations.

There is a comment in serialize.c

/* 16: full precision; 17 gives 999, 000 &c */

which suggests that the format is optimized for size not maximal
possible accuracy.

Really all you have said is `floating point operations are subject to
rounding error'.

On Wed, 8 Feb 2006, Roger D. Peng wrote:

> I noticed the following peculiarity with `serialize()' when `ascii = TRUE' is
> used. [...]

-- 
Brian D. Ripley, [hidden email]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

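The "%.16g" explanation above can be tried out directly, without going through serialize(). What follows is only a sketch under stated assumptions: roundtrip() is a hypothetical helper written for this illustration (it is not part of R, and it is not the code in serialize.c); it simply formats doubles as decimal text with sprintf() and parses them back with as.numeric(). Whether identical() comes out TRUE or FALSE can depend on the platform, so no result is claimed for it.

```r
# Hypothetical helper (not part of R): write doubles as decimal text with a
# given number of significant digits, then parse the text back.
roundtrip <- function(x, digits = 16) {
  fmt <- sprintf("%%.%dg", digits)   # builds a format such as "%.16g"
  as.numeric(sprintf(fmt, x))
}

set.seed(10)
x <- rnorm(10)
b <- roundtrip(x)

identical(x, b)        # can be FALSE: the last bit may not survive the text form
all.equal(x, b)        # TRUE: the differences are far below the default tolerance
rel <- abs(b / x - 1)  # relative error, the same quantity as b/x - 1
max(rel) / .Machine$double.eps   # size of the error in units of machine epsilon

# The size trade-off hinted at by the serialize.c comment:
sprintf("%.16g", 0.1)  # "0.1"
sprintf("%.17g", 0.1)  # "0.10000000000000001"
```

For comparisons after a text round trip like this, all.equal() (or an explicit tolerance) is the usual choice, since identical() demands bit-for-bit equality.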
Okay, I just wasn't sure of the source of the changes. In retrospect, character
and other vectors did serialize/unserialize to the original objects.

-roger

Prof Brian Ripley wrote:
> It is known (happens with save() too and did in earlier save formats).
> [...]

-- 
Roger D. Peng  |  http://www.biostat.jhsph.edu/~rpeng/
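The observation that character and other non-double vectors come back unchanged can be checked with a short sketch along these lines. The helper name via_ascii() and the sample vectors are made up for this illustration; only serialize(), unserialize() and identical() are actual R functions.

```r
# Hypothetical helper: push an object through the ASCII serialization format
# and read it back again.
via_ascii <- function(obj) {
  unserialize(serialize(obj, connection = NULL, ascii = TRUE))
}

chr <- c("a", "b", NA, "serialize")
int <- c(1L, NA, -5L, .Machine$integer.max)
lgl <- c(TRUE, FALSE, NA)

# None of these types is written through a floating-point text format,
# so each object is recovered exactly.
identical(chr, via_ascii(chr))   # TRUE
identical(int, via_ascii(int))   # TRUE
identical(lgl, via_ascii(lgl))   # TRUE
```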