ALTREP: Bug reports

ALTREP: Bug reports

Wang Jiefei
Hello,

I have encountered two bugs when using ALTREP APIs.

1. STDVEC_DATAPTR

The Rinternals.h header contains this declaration under an ALTREP comment:

/* ALTREP support */
void *(STDVEC_DATAPTR)(SEXP x);


However, this comment might not be accurate. The easiest way to verify it is to
define a C++ function:

void C_testFunc(SEXP a)
{
    STDVEC_DATAPTR(a);
}


and call it in R via

> a=1:10
> C_testFunc(a)
Error in C_testFunc(a) : cannot get STDVEC_DATAPTR from ALTREP object


We can inspect the internal type and call an ALTREP check function to see
whether it is an ALTREP:

> .Internal(inspect(a))
@0x000000001b5a3310 13 INTSXP g0c0 [NAM(7)]  1 : 10 (compact)
> #This is a wrapper of ALTREP
> is.altrep(a)
[1] TRUE


I've also defined an ALTREP type and it did not work either. I guess this
might be a bug? Or did I miss something?

2. Wrapper objects in ALTREP

If the duplicate function is defined to return the object itself:

SEXP vector_dulplicate(SEXP x, Rboolean deep) {
    return(x);
}

In R, an ALTREP object will then behave like an environment (pass-by-reference).
However, if we do something like this (pseudocode):

n=100
x=runif(n)
alt1=createAltrep(x)
alt2=alt1
alt2[1]=10
.Internal(inspect(alt1))
.Internal(inspect(alt2))


The result would be:

> .Internal(inspect(alt1))
@0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
> .Internal(inspect(alt2 ))
@0x00000000156a33e0 14 REALSXP g0c0 [NAM(7)]  wrapper [srt=-2147483648,no_na=0]
  @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]


It seems like the object alt2 automatically gets wrapped by R. Although at
the R level it seems fine because there are no differences between alt1 and
alt2, if we define a C function as:

SEXP C_peekSharedMemory(SEXP x) {
    return(R_altrep_data1(x));
}


and call it in R to get the internal data structure of an ALTREP object:

C_peekSharedMemory(alt1)
C_peekSharedMemory(alt2)


The first call correctly returns the internal data structure, but the second
returns the ALTREP object that the wrapper wraps, since the wrapper itself is
an ALTREP. This behavior is unexpected. Since the duplicate function returns
the object itself, I would expect alt1 and alt2 to be the same object.
Even if they are not the same, calling the same function should
at least return the same result. In addition, it seems that R does not
always wrap an ALTREP object: if we change n from 100 to 10 and inspect the
internals again, alt2 is not wrapped. This makes the problem even more
difficult, since we cannot predict when the wrapper will appear.

Here is the source code for the wrapper:
https://github.com/wch/r-source/blob/trunk/src/main/altclasses.c#L1399

Here is a working example if one can build the sharedObject package from
https://github.com/Jiefei-Wang/sharedObject

n=100
x=runif(n)
so1=sharedObject(x,copyOnWrite = FALSE)
so2=so1
so2[1]=10
.Internal(inspect(so1))
.Internal(inspect(so2))


Here is my session info:

R version 3.6.0 alpha (2019-04-08 r76348)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] sharedObject_0.0.99

loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0    Rcpp_1.0.1


Best,
Jiefei

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [External] ALTREP: Bug reports

Tierney, Luke
On Thu, 16 May 2019, 介非王 wrote:

> Hello,
>
> I have encountered two bugs when using ALTREP APIs.
>
> 1. STDVEC_DATAPTR
>
> From RInternal.h file it has a comment:
>
> /* ALTREP support */
> void *(STDVEC_DATAPTR)(SEXP x);
>
>
> However, this comment might not be true, the easiest way to verify it is to
> define a C++ function:
>
> void C_testFunc(SEXP a)
> {
>     STDVEC_DATAPTR(a);
> }
>
>
> and call it in R via
>
> > a=1:10
> > C_testFunc(a)
> Error in C_testFunc(a) : cannot get STDVEC_DATAPTR from ALTREP object
>
>
> We can inspect the internal type and call ALTREP function to check if it
> is an ALTREP:
>
> > .Internal(inspect(a))
> @0x000000001b5a3310 13 INTSXP g0c0 [NAM(7)]  1 : 10 (compact)
> > #This is a wrapper of ALTREP
> > is.altrep(a)
> [1] TRUE
>
>
> I've also defined an ALTREP type and it did not work either. I guess this
> might be a bug? Or did I miss something?

STDVEC_DATAPTR returns the data pointer of a standard (non-ALTREP)
vector.  It should not be necessary to use it in package code; if you
call it on an ALTREP you are likely to get a segfault.

>
> 2. Wrapper objects in ALTREP
>
> If the duplicate function is defined to return the object itself:

Don't do that. Mutable objects don't work. Look at the vignette in
https://github.com/ALTREP-examples/Rpkg-mutable for more on this.

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

Re: ALTREP: Bug reports

Gabriel Becker-2
In reply to this post by Wang Jiefei
Hi Jiefei,

Thanks for trying out the ALTREP stuff and letting us know how it is going.
That said, I don't think either of these is a bug per se, but rather a
misunderstanding of the API. Details inline.



On Thu, May 16, 2019 at 11:57 AM 介非王 <[hidden email]> wrote:

> Hello,
>
> I have encountered two bugs when using ALTREP APIs.
>
> 1. STDVEC_DATAPTR
>
> From RInternal.h file it has a comment:
>
> /* ALTREP support */
> void *(STDVEC_DATAPTR)(SEXP x);
>
>
> However, this comment might not be true, the easiest way to verify it is to
> define a C++ function:
>
> void C_testFunc(SEXP a)
> {
>     STDVEC_DATAPTR(a);
> }
>
>
> and call it in R via
>
> > a=1:10
> > C_testFunc(a)
> Error in C_testFunc(a) : cannot get STDVEC_DATAPTR from ALTREP object
>
>
The STDVEC here refers to the SEXP not being an ALTREP. Anything that
starts with STDVEC should never receive an ALTREP, i.e., it should only be
called after non-ALTREPness has been confirmed by the surrounding/preceding
code. So this is expected behavior.
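
The rule above can be pictured with a small toy model in plain C. Nothing
here is R's real API: toy_vec, toy_stdvec_dataptr and toy_integer_elt are
hypothetical names invented for illustration, and the toy accessor returns
NULL where the real STDVEC_DATAPTR raises an R error. The sketch only shows
why an accessor that assumes standard storage has nothing to hand back for
an ALTREP, while a dispatching accessor works for both kinds of vector.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an integer vector; this is NOT R's real SEXP layout. */
typedef struct toy_vec {
    int is_altrep;   /* 1 if elements are computed by a method */
    int *std_data;   /* payload of a standard vector, NULL for the toy ALTREP */
    int start;       /* first value of the toy compact sequence */
    size_t length;
} toy_vec;

/* Analogue of a STDVEC_* accessor: only meaningful for standard vectors.
   Returns NULL for the toy ALTREP instead of raising an error. */
int *toy_stdvec_dataptr(const toy_vec *x)
{
    assert(x != NULL);
    if (x->is_altrep)
        return NULL;
    return x->std_data;
}

/* Analogue of a dispatching element accessor: valid for both kinds. */
int toy_integer_elt(const toy_vec *x, size_t i)
{
    assert(x != NULL && i < x->length);
    if (x->is_altrep)
        return x->start + (int) i;   /* computed, never stored */
    return x->std_data[i];
}
```

In real package code the dispatching role is played by the documented
accessors such as INTEGER_ELT() or INTEGER(), so there is normally no reason
to reach for the STDVEC_* variants.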




>
> We can inspect the internal type and call ALTREP function to check if it
> is an ALTREP:
>
> > .Internal(inspect(a))
> @0x000000001b5a3310 13 INTSXP g0c0 [NAM(7)]  1 : 10 (compact)
> > #This is a wrapper of ALTREP
> > is.altrep(a)
> [1] TRUE
>
>
> I've also defined an ALTREP type and it did not work either. I guess this
> might be a bug? Or did I miss something?
>
> 2. Wrapper objects in ALTREP
>
> If the duplicate function is defined to return the object itself:
>
> SEXP vector_dulplicate(SEXP x, Rboolean deep) {
>     return(x);
> }
>

So this is a violation of the contract. <youraltrep>_duplicate *must* do
an actual duplication. Returning the object unduplicated when duplicate is
called is going to have all sorts of unintended negative consequences. R's
internals rely on the fact that a SEXP that has been passed to DUPLICATE
has been duplicated and is safe to modify in place.
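
To make the consequence concrete, here is a minimal sketch in plain C rather
than against the R API. toy_real, toy_duplicate, toy_duplicate_broken and
toy_free are hypothetical names made up for this illustration. The caller
plays the role of R's internals: it asks for a duplicate and then writes into
the result in place, trusting that the original is untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a numeric vector; NOT R's real SEXP. */
typedef struct toy_real {
    double *data;
    size_t length;
} toy_real;

/* Honours the contract: new storage, copied contents.
   Returns NULL if allocation fails. */
toy_real *toy_duplicate(toy_real *x)
{
    assert(x != NULL);
    toy_real *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    copy->length = x->length;
    copy->data = NULL;
    if (x->length > 0) {
        copy->data = malloc(x->length * sizeof *copy->data);
        if (copy->data == NULL) {
            free(copy);
            return NULL;
        }
        memcpy(copy->data, x->data, x->length * sizeof *copy->data);
    }
    return copy;
}

/* Breaks the contract in the same way as the method in the original post:
   the "copy" is the original, so in-place writes leak into it. */
toy_real *toy_duplicate_broken(toy_real *x)
{
    return x;
}

/* Releases an object created by toy_duplicate. */
void toy_free(toy_real *x)
{
    if (x != NULL) {
        free(x->data);
        free(x);
    }
}
```

With toy_duplicate, writing into the copy leaves the original alone. With
toy_duplicate_broken, the same write silently changes the original, which is
the kind of aliasing the contract exists to rule out.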



> In R an ALTREP object will behave like an environment (pass-by-reference).
> However, if we do something like(pseudo code):
>
> n=100
> x=runif(n)
> alt1=createAltrep(x)
> alt2=alt1
> alt2[1]=10
> .Internal(inspect(alt1))
> .Internal(inspect(alt2))
>
>
> The result would be:
>
> > .Internal(inspect(alt1))
> @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
> > .Internal(inspect(alt2 ))
> @0x00000000156a33e0 14 REALSXP g0c0 [NAM(7)]  wrapper [srt=-2147483648,no_na=0]
>   @0x00000000156f4d18 14 REALSXP g0c0 [NAM(7)]
>
>
> It seems like the object alt2 automatically gets wrapped by R. Although at
> the R level it seems fine because there are no differences between alt1 and
> alt2, if we define a C function as:
>

So I'm not sure what is happening here, because it depends on what your
createAltrep function does. R automatically creates wrappers in some cases,
but not in nearly all (or, currently, even very many) of them.

>
> SEXP C_peekSharedMemory(SEXP x) {
>     return(R_altrep_data1(x));
> }
>
>
> and call it in R to get the internal data structure of an ALTREP object.
>
> C_peekSharedMemory(alt1)
> C_peekSharedMemory(alt2)
>
>
> The first one correctly returns its internal data structure, but the second
> one returns the ALTREP object it wraps since the wrapper itself is an
> ALTREP. This behavior is unexpected.


I disagree. R_altrep_data1 returns whatever THAT altrep SEXP stores in its
"data1" part. There is no recursion/descent going on, and there shouldn't
be.
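
The "no recursion" point can be sketched with a toy model in plain C. The
names toy_sexp, toy_altrep_data1 and toy_unwrap_all are hypothetical and this
is not how R lays out objects. It only illustrates that a data1 accessor
hands back whatever the object it was given stores, and that any descent
through nested ALTREPs has to be written explicitly by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an R object; NOT R's real SEXP. */
typedef struct toy_sexp {
    int is_altrep;            /* 1 for a toy ALTREP, 0 for plain data */
    struct toy_sexp *data1;   /* what THIS object stores in its data1 slot */
    const char *label;        /* for debugging only */
} toy_sexp;

/* One level only: returns the data1 slot of the object passed in. */
toy_sexp *toy_altrep_data1(const toy_sexp *x)
{
    assert(x != NULL && x->is_altrep);
    return x->data1;
}

/* Explicit descent: follows data1 until a non-ALTREP object is reached.
   If steps is non-NULL it receives the number of levels peeled off. */
toy_sexp *toy_unwrap_all(toy_sexp *x, int *steps)
{
    int n = 0;
    while (x != NULL && x->is_altrep) {
        x = toy_altrep_data1(x);
        n++;
    }
    if (steps != NULL)
        *steps = n;
    return x;
}
```

For an object stored directly, one step reaches the payload. For a wrapper
around that object, the first step yields the wrapped ALTREP and a second
step is needed to reach the payload.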


> Since the dulplicate function returns
> the object itself, I will expect alt1 and alt2 should be the same object.
>

Again, this is a violation of the core assumptions of ALTREP that is not
allowed, so I'd argue that any behavior this causes is largely irrelevant
(and a small part of the much larger set of problems that not duplicating
when R told you to duplicate will cause).







> Even if they are essentially not the same, calling the same function should
> at least return the same result. Other than that, It seems like R does not
> always wrap an ALTREP object. If we change n from 100 to 10 and check the
> internal again, alt2 will not get wrapped.


Right, so this is a misunderstanding (which may be the fault of sparse
documentation on our part): wrapper is one particular ALTREP class; it's
not a fundamental aspect of ALTREPs themselves. Most ALTREP objects do not
have wrappers. See, e.g.,

> .Internal(inspect(1:4))
@7fb727d6be50 13 INTSXP g0c0 [NAM(3)]  1 : 4 (compact)


That's an ALTREP with no wrapper (a compact sequence). The wrapper ALTREP
class is for attaching metadata (known sortedness, known lack of NAs) to R
vectors. Its primary use currently is on the return value of sort().
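
As a rough illustration of why such metadata is worth carrying around, here
is a toy sketch in plain C. toy_meta_vec and toy_any_na are hypothetical
names, missing values are modelled simply as NaN, and none of this is R's
actual wrapper implementation. The only point is that a "known to contain no
NAs" flag lets a check return without scanning the data.

```c
#include <assert.h>
#include <stddef.h>

/* Toy vector plus one piece of metadata; NOT R's wrapper class. */
typedef struct toy_meta_vec {
    const double *data;
    size_t length;
    int no_na;   /* 1 = known to contain no missing values, 0 = unknown */
} toy_meta_vec;

/* Returns 1 if any element is missing (NaN in this toy model), else 0.
   If scanned is non-NULL it receives the number of elements examined. */
int toy_any_na(const toy_meta_vec *x, size_t *scanned)
{
    assert(x != NULL);
    size_t looked_at = 0;
    int found = 0;
    if (!x->no_na) {
        /* Nothing is known in advance, so fall back to a linear scan. */
        while (looked_at < x->length && !found) {
            double v = x->data[looked_at];
            looked_at++;
            if (v != v)   /* true only for NaN */
                found = 1;
        }
    }
    if (scanned != NULL)
        *scanned = looked_at;
    return found;
}
```

On a vector flagged no_na the function examines zero elements; on the same
data without the flag it has to look at every element before answering.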


> This makes the problem even more
> difficult since we cannot predict when would the wrapper appear.
>

As currently factored, it's not intended that you would, or would need to,
predict when a wrapper will appear. The C API and any R functions
transparently treat wrapped and non-wrapped objects the same, and any
code you write should go through these API entry points so that it
does the same.
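
A toy sketch in plain C of what "going through the entry points" buys you.
toy_dbl, toy_xlength, toy_real_elt and toy_sum are hypothetical names and the
forwarding logic is a simplification, not R's implementation. Code written
only against the accessors gives the same answer whether or not the object
it receives happens to be wrapped.

```c
#include <assert.h>
#include <stddef.h>

/* Toy numeric object; NOT R's real SEXP. A wrapper stores no data of its
   own and forwards to the object it wraps. */
typedef struct toy_dbl {
    const struct toy_dbl *wrapped;   /* non-NULL only for a wrapper */
    const double *data;              /* payload of a non-wrapper */
    size_t length;                   /* length of a non-wrapper */
} toy_dbl;

/* Accessor: length, forwarded through any wrappers. */
size_t toy_xlength(const toy_dbl *x)
{
    assert(x != NULL);
    return x->wrapped != NULL ? toy_xlength(x->wrapped) : x->length;
}

/* Accessor: element i, forwarded through any wrappers. */
double toy_real_elt(const toy_dbl *x, size_t i)
{
    assert(x != NULL);
    if (x->wrapped != NULL)
        return toy_real_elt(x->wrapped, i);
    assert(i < x->length);
    return x->data[i];
}

/* Client code: uses only the accessors, so it never needs to know
   whether x is wrapped. */
double toy_sum(const toy_dbl *x)
{
    double total = 0.0;
    size_t n = toy_xlength(x);
    for (size_t i = 0; i < n; i++)
        total += toy_real_elt(x, i);
    return total;
}
```

By contrast, client code that pokes at the data field directly would see
NULL on the wrapper, which mirrors the surprise described earlier in the
thread.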

Does that help?

Best,
~G



Re: ALTREP: Bug reports

Wang Jiefei
Hello Luke and Gabriel,

Thank you very much for your quick responses. The explanation of STDVEC is
very helpful and I appreciate it! For the wrapper, I have a few new
questions.


1. As Luke said, a mutable object is not possible. However, I noticed that
the duplicate function takes an extra argument, *deep*. I've searched
all the available documentation for ALTREP but did not find any
explanation of it. Could you please give some detail on it?


2.

>> The first one correctly returns its internal data structure, but the second
>> one returns the ALTREP object it wraps since the wrapper itself is an
>> ALTREP. This behavior is unexpected.
>
> I disagree. R_altrep_data1 returns whatever THAT altrep SEXP stores in its
> "data1" part. There is no recursion/descent going on, and there shouldn't
> be.


This might be a bug, since in the R 3.6 release it returns the wrapped ALTREP
instead of the data of that ALTREP. I'm not sure whether it has been fixed in
3.7. Here is a simple example:

SEXP C_peekSharedMemory(SEXP x) {
    while (ALTREP(x)) {
        Rprintf("getting data 1\n");
        x = R_altrep_data1(x);
    }
    return(x);
}


If calling R_altrep_data1 returned the internal data directly, we would
see only one message. Following my last example:

> .Internal(inspect(so1))
@0x0000000005e7fbb0 14 REALSXP g0c0 [MARK,NAM(7)]  Share object of type double
> .Internal(inspect(so2))
@0x0000000005fc5ac0 14 REALSXP g0c0 [MARK,NAM(7)]  wrapper [srt=-2147483648,no_na=0]
  @0x0000000005e7fbb0 14 REALSXP g0c0 [MARK,NAM(7)]  Share object of type double
> sm1=peekSharedMemory(so1)
getting data 1
> sm2=peekSharedMemory(so2)
getting data 1
getting data 1


We see that so2 calls R_altrep_data1 twice to get the internal data. This is
very unexpected.

Thank you very much for your help again!

Best,
Jiefei





Re: ALTREP: Bug reports

Wang Jiefei
Hi,

Sorry for flooding the mailbox. Please ignore the second question; I
misunderstood Gabriel's answer.

Best,
Jiefei



Re: ALTREP: Bug reports

Gabriel Becker-2
In reply to this post by Wang Jiefei
Jiefei,

Inline.

On Thu, May 16, 2019 at 2:30 PM 介非王 <[hidden email]> wrote:

> Hello Luke and Gabriel,
>
> Thank you very much for your quick responses. The explanation of STDVEC is
> very helpful and I appreciate it! For the wrapper, I have a few new
> questions.
>
>
> 1. Like Luke said a mutable object is not possible. However, I noticed
> that there is one extra argument *deep* in the function duplicate. I've
> googled all the available documentation for ALTREP but I did not find any
> explanation of it. Could you please give some detail on it?
>

Deep applies to compound/nested structures, most easily illustrated by
a list in R (a VECSXP in C): do the elements need to be duplicated as
well (deep == TRUE), or *only* the "container" SEXP (deep == FALSE)?

Consider an R list:

x = 1:5
y = 2:20
z = c(TRUE, FALSE)
w = "hi there"
lst = list(a = x, b = y, c = z)
lst2 = lst  # NAMED == 2, more than one symbol pointing to the same SEXP

And we want to modify lst like so:

lst[[2]] = w

We need to duplicate the "container SEXP", i.e., the VECSXP, so that lst's
SEXP and lst2's SEXP point to different SEXPs in their second element, but
we don't need to duplicate any SEXPs that represent the data in any of the
elements (the SEXPs bound to symbols x, y, z, and w), because none of those
were modified.

Thus, if deep == FALSE, those element SEXPs are NOT duplicated, just the
top-level one is. If deep == TRUE, then the element SEXPs are duplicated too,
because R decided it needed that to happen for some reason.
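
To make the shallow/deep distinction concrete, here is a minimal sketch in plain C. It is a toy model only: the Elem and Container types and the elem_new, elem_dup and container_dup functions are hypothetical stand-ins for an element SEXP, a VECSXP and a duplicate routine, and none of them are part of R's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for SEXPs; these types are hypothetical, not R's API. */
typedef struct { int len; int *data; } Elem;          /* like an INTSXP */
typedef struct { int len; Elem **slots; } Container;  /* like a VECSXP  */

/* Allocate an element holding a private copy of vals[0..len-1]. */
static Elem *elem_new(const int *vals, int len) {
    Elem *e = malloc(sizeof *e);
    assert(e != NULL);
    e->len = len;
    e->data = malloc((size_t)len * sizeof *e->data);
    assert(e->data != NULL);
    memcpy(e->data, vals, (size_t)len * sizeof *e->data);
    return e;
}

/* Full copy of one element. */
static Elem *elem_dup(const Elem *e) {
    return elem_new(e->data, e->len);
}

/* deep == 0: copy only the container; the new slots point at the
 *            same elements as the original.
 * deep != 0: copy the container and every element it holds. */
static Container *container_dup(const Container *c, int deep) {
    Container *out = malloc(sizeof *out);
    assert(out != NULL);
    out->len = c->len;
    out->slots = malloc((size_t)c->len * sizeof *out->slots);
    assert(out->slots != NULL);
    for (int i = 0; i < c->len; i++)
        out->slots[i] = deep ? elem_dup(c->slots[i]) : c->slots[i];
    return out;
}
```

With a shallow copy, replacing one slot of the copy (the analogue of lst[[2]] = w) leaves the original container's slot untouched while the other elements stay shared; with a deep copy, no element is shared at all.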

In terms of implementing an ALTREP class, you can either a) just ignore
deep and *always* do a deep (i.e., full) duplication of everything in your
ALTREP class, or b) pay attention to it: always create a new ALTREP, but,
*ONLY in cases where deep==FALSE*, potentially skip duplicating the SEXPs
that make up its alternative representation, provided you are careful to
make sure that duplication happens at a later time if necessary.

I'd strongly suggest starting with option (a) just to have something
working and completely safe, then considering whether it's important enough
to you to look into (b).
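
As a rough illustration of the two options, here is a toy model in plain C. Everything in it is an assumption made for the sketch: the Backing and AltVec types, the reference count, and the altvec_* functions are hypothetical and are not R's ALTREP API. It also reads "duplication happens at a later time" as copy-on-write, which is only one possible interpretation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types; none of this is R's ALTREP API. */
typedef struct { int refs; size_t len; double *vals; } Backing;
typedef struct { Backing *data; } AltVec;

/* Allocate backing storage holding a private copy of vals. */
static Backing *backing_new(const double *vals, size_t len) {
    Backing *b = malloc(sizeof *b);
    assert(b != NULL);
    b->refs = 1;
    b->len = len;
    b->vals = malloc(len * sizeof *b->vals);
    assert(b->vals != NULL);
    memcpy(b->vals, vals, len * sizeof *b->vals);
    return b;
}

static AltVec *altvec_new(const double *vals, size_t len) {
    AltVec *v = malloc(sizeof *v);
    assert(v != NULL);
    v->data = backing_new(vals, len);
    return v;
}

/* Option (a): ignore `deep` and always copy everything. */
static AltVec *altvec_dup_a(const AltVec *v, int deep) {
    (void)deep;
    return altvec_new(v->data->vals, v->data->len);
}

/* Option (b): always return a NEW AltVec, but share the backing
 * storage when deep == 0 and record that it is now shared. */
static AltVec *altvec_dup_b(const AltVec *v, int deep) {
    if (deep)
        return altvec_new(v->data->vals, v->data->len);
    AltVec *out = malloc(sizeof *out);
    assert(out != NULL);
    out->data = v->data;
    out->data->refs++;
    return out;
}

/* The deferred duplication: take a private copy before writing
 * whenever the backing storage is still shared. */
static void altvec_set(AltVec *v, size_t i, double x) {
    if (v->data->refs > 1) {
        Backing *own = backing_new(v->data->vals, v->data->len);
        v->data->refs--;
        v->data = own;
    }
    v->data->vals[i] = x;
}
```

In this model both options return a distinct object, which is the part that is never optional. Option (b) only postpones the copy of the shared storage until the first write through either object.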

Does that make sense?

Best,
~G






Re: ALTREP: Bug reports

Wang Jiefei
Thank you very much for your answer. If I understand it correctly, for an
ALTREP class a non-deep copy creates a new ALTREP object that still refers
to the same underlying SEXP as the old ALTREP object. Is that correct?
But since both share the same underlying SEXP, will changing a value
through the old ALTREP object also change the value seen through the new
ALTREP object? Or do you mean we need to decide which SEXPs have to be
copied even when *deep==FALSE*? I wrote a small test:

x = runif(10)
so1 = sharedObject(x)
so2 = so1
so2[1] = 10


The last line of the code calls the duplicate function with
*deep==FALSE*, which does not sound correct to me if we then don't do a
deep copy of the SEXP.

Best,
Jiefei

