ALTREP wrappers and factors

ALTREP wrappers and factors

Bemis, Kylie
Hello,

I’m experimenting with ALTREP and was wondering if there is a preferred way to create an ALTREP wrapper vector without using .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an .Internal() function.

I was trying to create a factor that used an ALTREP integer, but attempting to set the class and levels attributes always ended up duplicating and materializing the integer vector. Using the wrapper avoided this issue.

Here is my initial ALTREP integer vector:

> fc0 <- factor(c("a", "a", "b"))
>
> y <- matter::as.matter(as.integer(fc0))
> y <- matter:::as.altrep(y)
>
> .Internal(inspect(y))
@7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)

Here is what I get without a wrapper:

> fc1 <- structure(y, class="factor", levels=levels(fc0))
> .Internal(inspect(fc1))
@7fb0cae66408 13 INTSXP g0c2 [OBJ,NAM(2),ATT] (len=3, tl=0) 1,1,2
ATTRIB:
  @7fb0ce771868 02 LISTSXP g0c0 []
    TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
    @7fb0c9fcbe90 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
      @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
    TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
    @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
      @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
      @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"

Here is what I get with a wrapper:

> fc2 <- structure(.Internal(wrap_meta(y, 0, 0)), class="factor", levels=levels(fc0))
> .Internal(inspect(fc2))
@7fb0ce764630 13 INTSXP g0c0 [OBJ,NAM(2),ATT]  wrapper [srt=0,no_na=0]
  @7fb0ce78c0f0 13 INTSXP g0c0 [NAM(7)] matter vector (mode=3, len=3, mem=0)
ATTRIB:
  @7fb0ce764668 02 LISTSXP g0c0 []
    TAG: @7fb0c80043d0 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "class" (has value)
    @7fb0c9fcb010 16 STRSXP g0c1 [NAM(7)] (len=1, tl=0)
      @7fb0c80841a0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "factor"
    TAG: @7fb0c8004050 01 SYMSXP g1c0 [MARK,NAM(7),LCK,gp=0x4000] "levels" (has value)
    @7fb0d1dd58c8 16 STRSXP g0c2 [MARK,NAM(7)] (len=2, tl=0)
      @7fb0c81bf4c0 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "a"
      @7fb0c90ba728 09 CHARSXP g1c1 [MARK,gp=0x61] [ASCII] [cached] "b"

Is there a way to do this that doesn’t rely on .Internal() and won’t produce R CMD check warnings?

~~~
Kylie Ariel Bemis
Khoury College of Computer Sciences
Northeastern University
https://kuwisdelu.github.io

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: ALTREP wrappers and factors

Wang Jiefei
Hi Kylie,

I don't think a wrapper can completely solve your problem. The duplication
occurs because your variable y has a reference count greater than 1 (see
the highlighted NAM(7) below), so even with a wrapper, any change to the
values of the wrapper can still trigger a duplication.

> .Internal(inspect(y))
@7fb0ce78c0f0 13 INTSXP g0c0 *[NAM(7)]* matter vector (mode=3, len=3, mem=0)


My guess is that the *matter:::as.altrep* function assigns *y* to a local
variable, which increases its reference count. For example:

*This would not cause a duplication*

> a=c(1,2,3)
> .Internal(inspect(a))
@0x000000002384f530 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> attr(a,"dim")=c(1,3)
> .Internal(inspect(a))
@0x000000002384f530 14 REALSXP g0c3 [NAM(1),ATT] (len=3, tl=0) 1,2,3
ATTRIB:
  @0x0000000023864b58 02 LISTSXP g0c0 []
    TAG: @0x00000000044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" (has value)
    @0x000000002384cb48 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 1,3

*This would cause a duplication, even though the function test does nothing.*

> test<-function(x) x1=x
> a=c(1,2,3)
> .Internal(inspect(a))
@0x000000002384f260 14 REALSXP g0c3 [NAM(1)] (len=3, tl=0) 1,2,3
> test(a)
> .Internal(inspect(a))
@0x000000002384f260 14 REALSXP g0c3 [NAM(7)] (len=3, tl=0) 1,2,3
> attr(a,"dim")=c(1,3)
> .Internal(inspect(a))
@0x000000002384f0d0 14 REALSXP g0c3 [NAM(1),ATT] (len=3, tl=0) 1,2,3
ATTRIB:
  @0x00000000238666c0 02 LISTSXP g0c0 []
    TAG: @0x00000000044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" (has value)
    @0x000000002384c6e8 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 1,3
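
The sharing rule can also be seen without reading NAM() fields. What follows
is only a minimal sketch in base R, with illustrative variable names. A second
binding to the same vector is an unambiguous case of a shared object, so
setting an attribute through one name has to leave the other name untouched.
Whether a call like test(a) leaves the count elevated may depend on the R
version, which is why the sketch uses a plain second binding instead.

```r
# Illustrative sketch (base R only): a vector reachable through two names
# must not be modified in place.
a <- c(1, 2, 3)
b <- a                      # b and a now refer to the same vector

attr(a, "dim") <- c(1, 3)   # R duplicates before attaching the attribute

dim(a)                      # the copy bound to a carries the attribute
dim(b)                      # b still has no dim attribute
```

For a three-element double vector the copy is cheap; for a file-backed ALTREP
vector the same rule is what forces materialization.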


If that is the case and you are 100% sure the reference count of your
variable *y* should be 1, my solution is to call *SET_NAMED* in C++ to reset
the reference count. Note that you need to unbind your local variable
before you reset the count. To return an unbound SEXP, the C++ call
should be placed at the end of your *matter:::as.altrep* function. I don't
know if there is a simpler way to do that, and I'd be happy to hear other
opinions.


Also, I notice that you are using ALTREP to create a wrapper for your
*matter_vec* class. I'm an author of the AltWrapper package, which lets you
define an ALTREP class purely at the R level. It can attach attributes to an
ALTREP object when the object is created, and the result has the correct
reference count. The simplest example would be

*CODE*
```
library(AltWrapper)
inspectFunc <- function(x) cat("Altrep object\n")
lengthFunc <- function(x) return(length(x))
getPtrFunc <- function(x, writeable) return(x)

setAltClass(className = "test", classType = "real")
setAltMethod(className = "test", inspect = inspectFunc)
setAltMethod(className = "test", getLength = lengthFunc)
setAltMethod(className = "test", getDataptr = getPtrFunc)

A = runif(6)
A_alt = makeAltrep(className = "test", x = A, attributes = list(dim = c(2, 3)))
```
*RESULT*
```
> .Internal(inspect(A_alt))
@0x000000002385ac00 14 REALSXP g0c0 [NAM(1),ATT] Altrep object
ATTRIB:
  @0x000000002385a8b8 02 LISTSXP g0c0 []
    TAG: @0x00000000044b1a90 01 SYMSXP g1c0 [MARK,LCK,gp=0x4000] "dim" (has value)
    @0x000000002384d590 13 INTSXP g0c1 [NAM(7)] (len=2, tl=0) 2,3
> A_alt
          [,1]     [,2]      [,3]
[1,] 0.9430458 0.548670 0.4148741
[2,] 0.9550899 0.251857 0.6077540
```
I'd be happy to talk more about it if you are interested in the package.
It is available at
https://github.com/Jiefei-Wang/AltWrapper

Best,
Jiefei



Re: ALTREP wrappers and factors

Gabriel Becker-2
Hi Jiefei and Kylie,

Great to see people engaging with the ALTREP framework and identifying
places we may need more tooling. Comments inline.

On Thu, Jul 18, 2019 at 12:22 PM King Jiefei <[hidden email]> wrote:

>
> If that is the case and you are 100% sure the reference number should be 1
> for your variable *y*, my solution is to call *SET_NAMED *in C++ to reset
> the reference number. Note that you need to unbind your local variable
> before you reset the number. To return an unbound SEXP,  the C++ function
> should be placed at the end of your *matter:::as.altrep *function. I don't
> know if there is any simpler way to do that and I'll be happy to see any
> opinion.
>

So as far as I know, manually setting the NAMED value on any SEXP the
garbage collector is aware of is a direct violation of the C-API contract
and not something that package code should ever be doing.

It's not at all clear to me that you can *ever* be 100% sure that the
reference count should be 1 when it is not currently 1 for an R object
that exists at the R level (as opposed to only in pure C code). Sure, maybe
the object is created within the body of your R function instead of being
passed in, but what if someone is debugging your function and assigns the
value to the global environment using <<- for later inspection? Now you
have an invalidly low NAMED value, i.e. you have a segfault coming. I know
of no way for you to prevent this or even know it has happened.



> On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <[hidden email]>
> wrote:
>
> > Hello,
> >
> > I’m experimenting with ALTREP and was wondering if there is a preferred
> > way to create an ALTREP wrapper vector without using
> > .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an
> > .Internal() function.
>

So there is the .doSortWrap function in base (and its currently inexplicably
identical clone .doWrap), an R-level function that calls down to
.Internal(wrap_meta(...)). You can use it, but it doesn't look general
enough for what I think you need (it was written for things that have just
been sorted, thus the name). Specifically, it's not able to indicate that
things are of unknown sortedness as currently written. If matter vectors
are guaranteed to be sorted for some reason, though, you can use this. I'll
talk to Luke about whether we want to generalize this. It would be easy to
have it support the full space of metadata for wrappers and be a
general-purpose wrapper-maker, but that isn't what it is right now.

At the C level, it looks like we do make R_tryWrap available (it appears in
Rinternals.h, and not within a USE_RINTERNALS section), so you can call that
from your own C(++) code. This creates a wrapper that has no metadata on it
(or rather, it has metadata, but the metadata indicates that no special info
is known about the vector).


Re: ALTREP wrappers and factors

Bemis, Kylie
Using R_tryWrap() at the C-level works perfectly and does what I need. Thanks, Gabe!

Yes, my reference count is maxed (I assume) because I am using MARK_NOT_MUTABLE().

Which makes me think I may want to return a wrapped matter/ALTREP object by default, so the user can set the names() and dim(), etc., without triggering a potentially-costly duplication. The data payload is intended to be immutable, but the attributes aren’t.

Decoupling the attributes and other metadata from the data payload seems like a good thing to have generally.

Are there any potential drawbacks of using R_tryWrap() that I should know about, besides an additional method dispatch happening somewhere?

Thanks again!


Re: [External] Re: ALTREP wrappers and factors

Tierney, Luke
In reply to this post by Gabriel Becker-2
On Fri, 19 Jul 2019, Gabriel Becker wrote:

> Hi Jiefei and Kylie,
>
> Great to see people engaging with the ALTREP framework and identifying
> places we may need more tooling. Comments inline.
>
> On Thu, Jul 18, 2019 at 12:22 PM King Jiefei <[hidden email]> wrote:
>
>>
>> If that is the case and you are 100% sure the reference number should be 1
>> for your variable *y*, my solution is to call *SET_NAMED *in C++ to reset
>> the reference number. Note that you need to unbind your local variable
>> before you reset the number. To return an unbound SEXP,  the C++ function
>> should be placed at the end of your *matter:::as.altrep *function. I don't
>> know if there is any simpler way to do that and I'll be happy to see any
>> opinion.
>>
>
> So as far as I know, manually setting the NAMED value on any SEXP the
> garbage collector is aware of is a direct violation of C-API contract and
> not something that package code should ever be doing.
>
> Its not at all clear to me that you can *ever* be 100% sure that the
> reference number should be 1 when it is not currently one for an R object
> that exists at the R-level (as opposed to only in pure C code). Sure, maybe
> the object is created within the body of your R function instead of being
> passed in, but what if someone is debugging your function and assigns the
> value to the global environment using <<-  for later inspection; now  you
> have an invalidly low NAMED value, ie you have a segfault coming. I know of
> no way for you to prevent this or even know it has happened.

SET_NAMED should NEVER be used in a package. In fact it will hopefully
disappear at some point not too far in the future.

>> On Thu, Jul 18, 2019 at 3:28 AM Bemis, Kylie <[hidden email]>
>> wrote:
>>
>>> Hello,
>>>
>>> I’m experimenting with ALTREP and was wondering if there is a preferred
>>> way to create an ALTREP wrapper vector without using
>>> .Internal(wrap_meta(…)), which R CMD check doesn’t like since it uses an
>>> .Internal() function.
>>
>
> So there is the .doSortWrap  (and its currently inexplicably identical
> clone .doWrap) function in base, which is an R level function that calls
> down to .Internal(wrap_meta(...)), which you can use, but it doesn't look
> general enough for what  I think you need (it was written for things that
> have just been sorted, thus the name). Specifically, its not able to
> indicate that things are of unknown sortedness as currently written.  If
> matter vectors are guaranteed to be sorted for some reason, though, you can
> use this. I'll talk to Luke about whether we want to generalize this, it
> would be easy to have this support the full space of metadata for wrappers
> and be a general purpose wrapper-maker, but that isn't what it is right now.
>
> At the C-level, it looks like we do make R_tryWrap available (it appears in
> Rinternals.h, and not within a USE_RINTERNALS section),so you can call that
> from your own C(++) code. This creates a wrapper that has no metadata on it
> (or rather it has metadata but  the metadata indicates that no special info
> is known about the vector).

At this point we are not ready to cast in stone an interface to
creating wrappers from R.  The C R_tryWrap could be used, but it is
still subject to change.

You might try your example with a larger vector. In R 3.6.x
structure() should produce a wrapper for length 100 or more.
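
A minimal sketch of that experiment, assuming a plain integer vector as a
stand-in for the matter-backed ALTREP vector (the names n and codes are made
up for illustration). Whether the inspect output shows a wrapper depends on
the R version and on the length threshold, so no output is reproduced here.

```r
# Stand-in for the ALTREP payload: an ordinary integer vector, longer than
# the length mentioned above. (Assumption: this is not a matter vector.)
n <- 1000L
codes <- rep(1:2, length.out = n)

# Attach the factor attributes via structure(), as in the original example.
f <- structure(codes, class = "factor", levels = c("a", "b"))

# Look for a "wrapper" line at the top of the output.
.Internal(inspect(f))

# The original vector carries no attributes either way.
attributes(codes)
```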

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu