v3 serialization of compact_intseq altrep should write modified data


R devel mailing list
Experimenting with altrep objects and v3 serialization, I discovered a
possible bug.  Calling DATAPTR on a compact_intseq object returns a
pointer to the expanded integer sequence in memory.  If you modify
this data, the object values appear to be changed.  However, if the
compact_intseq object is then serialized (with version=3), only the
original integer sequence info is written.

For example, suppose I have compiled and loaded the following C code:
  SEXP set_intseq_data(SEXP x)
  {
      void* ptr = DATAPTR(x);
      ((int*)ptr)[3] = 1234;
      return R_NilValue;
  }

I see the following behavior in R 3.5.1:
  > x <- 1:10
  > x
   [1]  1  2  3  4  5  6  7  8  9 10
  > .Call("set_intseq_data", x)
  NULL
  > x
   [1]    1    2    3 1234    5    6    7    8    9   10
  > save(x, file="temp.rda", version=3)
  > load(file="temp.rda")
  > x
   [1]  1  2  3  4  5  6  7  8  9 10
  >

I would have expected the modified vector data to be serialized to the
file, and be restored when it is loaded.
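
One possible way to picture this behavior is the toy model below, written in plain C with no dependency on R. All names in it (toy_intseq, toy_dataptr, toy_serialize, toy_unserialize) are hypothetical, and it is only an assumption that the real compact_intseq code behaves analogously: the object keeps its compact description (start and length), expands it into a buffer on demand, and the serializer writes the compact description only.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy compact integer sequence: start, start+1, ..., start+n-1. */
typedef struct {
    int start;
    int n;
    int *expanded;   /* NULL until a data pointer is requested */
} toy_intseq;

/* Hypothetical analogue of DATAPTR: expand the sequence on first use
 * and hand out a pointer to the expanded buffer. */
static int *toy_dataptr(toy_intseq *s)
{
    if (s->expanded == NULL) {
        s->expanded = malloc((size_t)s->n * sizeof(int));
        assert(s->expanded != NULL);
        for (int i = 0; i < s->n; i++)
            s->expanded[i] = s->start + i;
    }
    return s->expanded;
}

/* Hypothetical compact serializer: writes only {start, n} and never
 * looks at the expanded buffer. */
static void toy_serialize(const toy_intseq *s, int out[2])
{
    out[0] = s->start;
    out[1] = s->n;
}

/* Rebuild a sequence from the two serialized fields. */
static toy_intseq toy_unserialize(const int in[2])
{
    toy_intseq s = { in[0], in[1], NULL };
    return s;
}
```

In this model, a write through the pointer returned by toy_dataptr is visible on later reads of the same object, but a round trip through toy_serialize and toy_unserialize rebuilds the sequence from start and n, so the write is lost.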

  ~~ Michael Sannella


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: v3 serialization of compact_intseq altrep should write modified data

Tierney, Luke
Try this C code:

#include <Rinternals.h>

SEXP set_intseq_data(SEXP x)
{
    /* Refuse to write into an object that may be referenced elsewhere. */
    if (MAYBE_SHARED(x))
        error("Oops, not supposed to do this!");
    void *ptr = DATAPTR(x);
    ((int *)ptr)[3] = 1234;
    return R_NilValue;
}

Lots of things will break if you modify objects that have been marked
as immutable (and hence where MAYBE_SHARED returns TRUE).

For now the implementation of compact sequences marks them as
immutable and so assumes the expanded version will not be changed.
That implementation detail might be changed at some point but C code
should not make assumptions.
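
A common way to stay safe is to copy before writing whenever the object may be shared. The sketch below illustrates that idea in plain C under stated assumptions: toy_vec, its shared flag, toy_duplicate, and toy_set are hypothetical stand-ins, not R API. In R's C API the corresponding pieces would be the MAYBE_SHARED check and duplicate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy vector with a flag that plays the role of "may be shared". */
typedef struct {
    int *data;
    int n;
    int shared;   /* nonzero: other references may exist, treat as read-only */
} toy_vec;

/* Return a private, writable copy of v. */
static toy_vec toy_duplicate(const toy_vec *v)
{
    toy_vec c = { malloc((size_t)v->n * sizeof(int)), v->n, 0 };
    assert(c.data != NULL);
    memcpy(c.data, v->data, (size_t)v->n * sizeof(int));
    return c;
}

/* Set element i to value. If v may be shared, the write goes to a fresh
 * copy and the original buffer is left untouched. The returned vector is
 * the one that holds the result. */
static toy_vec toy_set(toy_vec v, int i, int value)
{
    assert(i >= 0 && i < v.n);
    if (v.shared)
        v = toy_duplicate(&v);
    v.data[i] = value;
    return v;
}
```

The caller uses the returned vector rather than the argument, so code that still holds the original sees unchanged data.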

Best,

luke

On Mon, 22 Oct 2018, Michael Sannella via R-devel wrote:

> Experimenting with altrep objects and v3 serialization, I discovered a
> possible bug.  Calling DATAPTR on a compact_intseq object returns a
> pointer to the expanded integer sequence in memory.  If you modify
> this data, the object values appear to be changed.  However, if the
> compact_intseq object is then serialized (with version=3), only the
> original integer sequence info is written.
>
> For example, suppose I have compiled and loaded the following C code:
>  SEXP set_intseq_data(SEXP x)
>  {
>      void* ptr = DATAPTR(x);
>      ((int*)ptr)[3] = 1234;
>      return R_NilValue;
>  }
>
> I see the following behavior in R 3.5.1:
>  > x <- 1:10
>  > x
>   [1]  1  2  3  4  5  6  7  8  9 10
>  > .Call("set_intseq_data", x)
>  NULL
>  > x
>   [1]    1    2    3 1234    5    6    7    8    9   10
>  > save(x, file="temp.rda", version=3)
>  > load(file="temp.rda")
>  > x
>   [1]  1  2  3  4  5  6  7  8  9 10
>  >
>
> I would have expected the modified vector data to be serialized to the
> file, and be restored when it is loaded.
>
>  ~~ Michael Sannella

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
