Small performance bug in [.Date

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

Small performance bug in [.Date

hadley wickham
Hi all,

I think there's an unnecessary line in [.Date which has a considerable
impact on performance when subsetting large dates:

x <- Sys.Date() + 1:1e6

microbenchmark::microbenchmark(x[1])
#> Unit: microseconds
#>  expr     min       lq     mean   median       uq      max neval
#>  x[1] 920.651 1039.346 3624.833 2294.404 3786.881 41176.38   100

`[.Date` <- function(x, ..., drop = TRUE) {
    cl <- oldClass(x)
    # class(x) <- NULL
    val <- NextMethod("[")
    class(val) <- cl
    val
}
microbenchmark::microbenchmark(x[1])
#> Unit: microseconds
#>  expr   min     lq     mean median    uq      max neval
#>  x[1] 2.738 3.0225 28.40893  3.269 3.513 2470.068   100

Setting the class of x to NULL is problematic because it forces a
copy, and I'm pretty sure it's unnecessary as NextMethod() does not
consult the class of x, but instead uses .Class.

Hadley

--
http://hadley.nz

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Small performance bug in [.Date

Martin Maechler
>>>>> Hadley Wickham <[hidden email]>
>>>>>     on Mon, 20 Nov 2017 12:50:24 -0600 writes:

    > Hi all,
    > I think there's an unnecessary line in [.Date which has a considerable
    > impact on performance when subsetting large dates:

    > x <- Sys.Date() + 1:1e6

    > microbenchmark::microbenchmark(x[1])
    > #> Unit: microseconds
    > #>  expr     min       lq     mean   median       uq      max neval
    > #>  x[1] 920.651 1039.346 3624.833 2294.404 3786.881 41176.38   100

    > `[.Date` <- function(x, ..., drop = TRUE) {
    > cl <- oldClass(x)
    > # class(x) <- NULL
    > val <- NextMethod("[")
    > class(val) <- cl
    > val
    > }
    > microbenchmark::microbenchmark(x[1])
    > #> Unit: microseconds
    > #>  expr   min     lq     mean median    uq      max neval
    > #>  x[1] 2.738 3.0225 28.40893  3.269 3.513 2470.068   100

    > Setting the class of x to NULL is problematic because it forces a
    > copy, and I'm pretty sure it's unnecessary as NextMethod() does not
    > consult the class of x, but instead uses .Class.

Yes, at least so it looks in  src/main/objects.c

Also, we had a very similar change a while ago :
------------------------------------------------------------------------
r65926 | luke | 2014-06-12 15:54:38 +0200 (Thu, 12. Jun 2014) | 2 Zeilen
GeƤnderte Pfade:
   M src/library/base/R/datetime.R

Commented out class(x) <- NULL in [.POSIXct and [[.POSICct.
------------------------------------------------------------------------

and we never seemed to have followed up in a systematic manner
finding other places where this happens and could be
eliminated.  I see about half a dozen examples in
base/R/dates.R  alone and am trying to find more in other places.

[maybe this used to be necessary for very early different
 versions of NextMethod() which were not yet optimized using  .Class etc]

Thank you very much,
Hadley!

Martin

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Small performance bug in [.Date

hadley wickham
> Yes, at least so it looks in  src/main/objects.c
>
> Also, we had a very similar change a while ago :
> ------------------------------------------------------------------------
> r65926 | luke | 2014-06-12 15:54:38 +0200 (Thu, 12. Jun 2014) | 2 Zeilen
> GeƤnderte Pfade:
>    M src/library/base/R/datetime.R
>
> Commented out class(x) <- NULL in [.POSIXct and [[.POSICct.
> ------------------------------------------------------------------------
>
> and we never seemed to have followed up in a systematic manner
> finding other places where this happens and could be
> eliminated.  I see about half a dozen examples in
> base/R/dates.R  alone and am trying to find more in other places.
>
> [maybe this used to be necessary for very early different
>  versions of NextMethod() which were not yet optimized using  .Class etc]

Thanks for making the fix!

Hadley

--
http://hadley.nz

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel