ALTREP "wrapper" classes needs an Extract_subset method

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

ALTREP "wrapper" classes needs an Extract_subset method

Davis Vaughan
Hi all,

I believe I have found a bug (or perhaps just an oversight) with the ALTREP
wrapper classes. The short form of this is that I believe that the wrapper
classes need to override the default ALTREP `Extract_subset_method()` with
a method that calls `Extract_subset()` on the "wrapped" object. I have a
patch prepared here:

https://github.com/DavisVaughan/r-source/pull/1

There is currently no call to `R_set_altvec_Extract_subset_method()` in the
wrapper class init functions. This means that the default ALTREP method of
`altvec_Extract_subset_default()` is called, which simply returns `NULL`.

Consider what that means for an ALTREP compact integer seq that has been
"wrapped". The default subsetting code will eventually call
`ExtractSubset()`. That checks if the object is ALTREP, and calls the
ALTREP Extract_subset() method if so. If the return value from that is
NULL, then it will fallback to repeatedly calling `INTEGER_ELT()` to do the
subsetting. See below for the relevant section:

https://github.com/wch/r-source/blob/d1c0c6b921fc6a0cbe82c4354c6ec6ceb7f320aa/src/main/subset.c#L103

This wrapped compact integer seq is an ALTREP object, so `ALTREP(x)`
returns true. But then it just calls the default method of returning NULL
rather than calling the compact integer seq `Extract_subset()` method! This
still "works" because it falls back to `INTEGER_ELT()` for which there is
a `wrapper_integer_Elt()` method defined that will use the underlying
compact seq's `integer_Elt()` method, but it is less efficient than it
could be.

My rough benchmarks show that in R 3.6.0 the subset benchmarks at the
bottom of this message take 4ms on the compact seq, and 5ms on the wrapped
compact seq. With a patch that I have prepared, both take 4ms.

The other reason I bring this up is that it can result in bugs with some
ALTREP objects. I was working on one where it makes sense to have an
`Extract_subset()` method but not an `Elt()` method. When it gets
"wrapped", my `Extract_subset()` method is bypassed as shown above, and the
`Elt()` method is incorrectly called (which throws an error because it is
not meaningful for me).

If you all agree these changes should be made, I can submit the patch.

Thanks,
Davis

# Ensure we have enough elements for "wrapping" to kick in
x <- 1:65

# select the 1st element a large number of times
index <- rep(1L, 1e6) + 0L

# ALTREP - but not wrapped
# .Internal(inspect(x))

bench::mark(x[index], iterations = 1000)

# Wrap it by adding a dummy attribute
attributes(x) <- list(foo = "bar")

# ALTREP - wrapped + compact seq
# .Internal(inspect(x))

bench::mark(x[index], iterations = 1000)

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: [External] ALTREP "wrapper" classes needs an Extract_subset method

Tierney, Luke
It's not a bug as there will always be cases where ALTREP will need to
fall back. But it does look like something that would be good to
address. So please file it as a wishlist item and I'll look at a patch
if you have one.

As to your issue at the end, it seems to me that you should probably
have another look at your design. Supporting getting a subset,
including one of size one, but not getting an element seems odd, and I
suspect will get you in trouble somewhere else before long..

Best,

luke

On Mon, 3 Feb 2020, Davis Vaughan wrote:

> Hi all,
>
> I believe I have found a bug (or perhaps just an oversight) with the ALTREP
> wrapper classes. The short form of this is that I believe that the wrapper
> classes need to override the default ALTREP `Extract_subset_method()` with
> a method that calls `Extract_subset()` on the "wrapped" object. I have a
> patch prepared here:
>
> https://github.com/DavisVaughan/r-source/pull/1
>
> There is currently no call to `R_set_altvec_Extract_subset_method()` in the
> wrapper class init functions. This means that the default ALTREP method of
> `altvec_Extract_subset_default()` is called, which simply returns `NULL`.
>
> Consider what that means for an ALTREP compact integer seq that has been
> "wrapped". The default subsetting code will eventually call
> `ExtractSubset()`. That checks if the object is ALTREP, and calls the
> ALTREP Extract_subset() method if so. If the return value from that is
> NULL, then it will fallback to repeatedly calling `INTEGER_ELT()` to do the
> subsetting. See below for the relevant section:
>
> https://github.com/wch/r-source/blob/d1c0c6b921fc6a0cbe82c4354c6ec6ceb7f320aa/src/main/subset.c#L103
>
> This wrapped compact integer seq is an ALTREP object, so `ALTREP(x)`
> returns true. But then it just calls the default method of returning NULL
> rather than calling the compact integer seq `Extract_subset()` method! This
> still "works" because it falls back to `INTEGER_ELT()` for which there is
> a `wrapper_integer_Elt()` method defined that will use the underlying
> compact seq's `integer_Elt()` method, but it is less efficient than it
> could be.
>
> My rough benchmarks show that in R 3.6.0 the subset benchmarks at the
> bottom of this message take 4ms on the compact seq, and 5ms on the wrapped
> compact seq. With a patch that I have prepared, both take 4ms.
>
> The other reason I bring this up is that it can result in bugs with some
> ALTREP objects. I was working on one where it makes sense to have an
> `Extract_subset()` method but not an `Elt()` method. When it gets
> "wrapped", my `Extract_subset()` method is bypassed as shown above, and the
> `Elt()` method is incorrectly called (which throws an error because it is
> not meaningful for me).
>
> If you all agree these changes should be made, I can submit the patch.
>
> Thanks,
> Davis
>
> # Ensure we have enough elements for "wrapping" to kick in
> x <- 1:65
>
> # select the 1st element a large number of times
> index <- rep(1L, 1e6) + 0L
>
> # ALTREP - but not wrapped
> # .Internal(inspect(x))
>
> bench::mark(x[index], iterations = 1000)
>
> # Wrap it by adding a dummy attribute
> attributes(x) <- list(foo = "bar")
>
> # ALTREP - wrapped + compact seq
> # .Internal(inspect(x))
>
> bench::mark(x[index], iterations = 1000)
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel