head.matrix can return 1000s of columns -- limit to n or add new argument?


Re: class(<matrix>) |--> c("matrix", "array") -- and S3 dispatch

Dirk Eddelbuettel

On 21 November 2019 at 17:57, Martin Maechler wrote:
| (if you use a version of R-devel, with svn rev >= 77446; which
|  you may get as a binary for Windows in about one day; everyone
|  else needs to compile from the sources .. or wait a bit, maybe
|  also not much longer than one day, for a docker image) :

FYI: rocker/drd [1] and rocker/r-devel both have rev 77455 now (as they are
both on a weekend auto-rebuild schedule).  The former is smaller; both should
work to test this. Quick demo below [2].

Dirk

[1] This comes from 'drd == daily r-devel' but we do not build it daily.
[2] Quick demo follows

edd@rob:~$ docker run --rm -ti rocker/r-devel bash
root@a30e4a5c89ba:/# RD

R Under development (unstable) (2019-11-23 r77455) -- "Unsuffered Consequences"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !") # ==> future R behavior
> class(m <- diag(1))
[1] "matrix" "array"
>

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: class(<matrix>) |--> c("matrix", "array") -- and S3 dispatch

Jan Gorecki
In case anyone needs a daily R-devel build, I have one scheduled on GitLab.
It is currently based on Ubuntu 16.04, with R built using:
--with-recommended-packages --enable-strict-barrier
--disable-long-double
Predefined Makevars for building pkgs using: -g -O2 -Wall -pedantic
-fstack-protector-strong -D_FORTIFY_SOURCE=2

$ docker run --rm -ti registry.gitlab.com/jangorecki/dockerfiles/r-devel:latest
R Under development (unstable) (2019-11-23 r77455) -- "Unsuffered Consequences"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !")
> class(m <- diag(1))
[1] "matrix" "array"



Re: head.matrix can return 1000s of columns ..

Martin Maechler
In reply to this post by Gabriel Becker-2
>>>>> Gabriel Becker
>>>>>     on Sat, 2 Nov 2019 12:40:16 -0700 writes:

    [....................]

In the meantime,  Gabe has worked quite a bit and provided a
patch proposal  at R's bugzilla,  PR#17652,
i.e., here
      https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17652

A few days ago, I committed a (slightly simplified) version
of that to R-devel (svn rev 77462)
with NEWS entry

    * head(x, n) and tail() default and other S3 methods notably for
      _vector_ n, e.g. to get a "corner" of a matrix, also extended for
      arrays of higher dimension, thanks to the patch proposal by Gabe
      Becker in PR#16764.

 (which contains a *wrong* PR number that I've corrected in the
  meantime)
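
For illustration, a minimal sketch of what this NEWS entry describes,
assuming R >= 4.0.0 (the release this change was targeted at).  The matrix
m and the variable names are made up for the example:

```r
## Assumes R >= 4.0.0, where head() and tail() accept a vector 'n'.
m <- matrix(1:40, nrow = 5)          # 5 x 8 integer matrix (illustrative data)

top_left     <- head(m, c(2, 3))     # first 2 rows, first 3 columns
bottom_right <- tail(m, c(2, 3))     # last 2 rows, last 3 columns
narrow       <- head(m, c(NA, 3))    # NA means "all" here: all rows, 3 columns

dim(top_left)
dim(bottom_right)
dim(narrow)
```

A plain head(m) on such a matrix still returns all 8 columns; it is the
vector form of n that limits the columns as well.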

A day or so later, the CRAN team alerted me to the fact that this
change breaks the checks of some CRAN packages, about 30 by now,
it seems.

There were at least two principal reasons, one of which was the
fact that data frame subsetting has been somewhat surprising in R,
without being documented as such, *and* some packages have
inadvertently made use of this peculiarity -- which was
inadvertently changed by r77462.

In short,   head(<data frame>)  kept extraneous attributes
because indeed
                d[i, ]
keeps those attributes ... for data frames.

I will amend the  head() and tail() methods to remain backward
compatible (as much as sensible) for now,  but here's what I've
found about subsetting, i.e., behavior of the (partly C code
internal)  `[`  methods in R :

1)  For a data frame d,  d[i, ]  differs  from  d[i,j],
    as the former keeps (extra) attributes,
2)  For a matrix, neither form of indexing keeps (extra) attributes.

Here's some simple reproducible R code exhibiting the claim:

##==== Data frame subsetting (vs. matrix, array)  "with extra attributes": =====
## data frame w/ a (non-standard) attribute:
str(treeS <- structure(trees, foo = "bar"))

chkMat <- function(M) {
    stopifnot(nzchar(Mfoo <- attr(M, "foo")),
              length(d <- dim(M)) == 2,
              (n <- d[1]) >= 6, d[2] >= 3)
    ## n = nrow(M)
    stopifnot(exprs = { # attribute is kept
        if(inherits(M, "data.frame")) {
            identical(  attr(M[    1:3 , ] , "foo") , "bar") &&
            identical(  attr(M[(n-2):n , ] , "foo") , "bar")
        } else { ## matrix
            is.null  (  attr(M[    1:3 , ] , "foo")) &&
            is.null  (  attr(M[(n-2):n , ] , "foo"))
        }
        ## OTOH,  [i,j]-indexing of data frames *does* drop "other" attributes:
        inherits(print(t.ij <- M[(n-2):n, 2:3] ), class(M))
        ## now, the "foo" attribute of  M[i,j] is gone!
        is.null(attr(t.ij, "foo"))
    })
}

chkMat(treeS)
chkMat(as.matrix(treeS))

-------

And (to repeat), currently  head(d, n)  is the same as   d[1:n , ]
when n >= 1,  length(n) == 1  and this equality is relied upon
by CRAN package code out there .. and hence I'll keep it with
the "generalized" head() & tail() in R-devel.

Martin


Re: class(<matrix>) |--> c("matrix", "array") -- and S3 dispatch

Hervé Pagès-2
In reply to this post by Martin Maechler
Dear Martin,

What's the ETA for _R_CLASS_MATRIX_ARRAY_=TRUE to become the new
unconditional behavior in R devel? Thanks!

H.


On 11/21/19 08:57, Martin Maechler wrote:

>
> TLDR: This is quite technical, still somewhat important:
>       1)  R 4.0.0 will become a bit more coherent: a matrix is an array
>       2)  Your package (or one you use) may be affected.
>
>
>>>>>> Martin Maechler
>>>>>>      on Fri, 15 Nov 2019 17:31:15 +0100 writes:
>
>>>>>> Pages, Herve
>>>>>>      on Thu, 14 Nov 2019 19:13:47 +0000 writes:
>
>      >> On 11/14/19 05:47, Hadley Wickham wrote:
>      >>> On Sun, Nov 10, 2019 at 2:37 AM Martin Maechler ... wrote:
>
>      [................]
>      
>      >>>>> Note again that both "matrix" and "array" are special [see ?class] as
>      >>>>> being of  __implicit class__  and I am considering that this
>      >>>>> implicit class behavior for these two should be slightly
>      >>>>> changed ....
>      >>>>>
>      >>>>> And indeed I think you are right on spot and this would mean
>      >>>>> that indeed the implicit class
>      >>>>> "matrix" should rather become c("matrix", "array").
>      >>>>
>      >>>> I've made up my mind (and not been contradicted by my fellow R
>      >>>> corers) to try to go there for  R 4.0.0   next April.
>
>      >>> I can't seem to find the previous thread, so would you mind being a
>      >>> bit more explicit here? Do you mean adding "array" to the implicit
>      >>> class?
>
>      >> It's late in Europe ;-)
>
>      >> That's my understanding. I think the plan is to have class(matrix())
>      >> return c("matrix", "array"). No class attributes added to matrix or
>      >> array objects.
>
>      >> That's all that is needed to have inherits(matrix(), "array") return TRUE
>      >> (instead of FALSE at the moment) and S3 dispatch pick up the foo.array
>      >> method when foo(matrix()) is called and there is no foo.matrix method.
>
>      > Thank you, Hervé!  That's exactly the plan.
>
> BUT it's wrong what I (and Peter and Hervé and ....) had assumed:
>
> If I just change the class
>       (as I already did a few days ago, but you must activate the change
>        via environment variable, see below),
>
> S3 dispatch does *NOT* at all pick it up:
> "matrix" (and "array") are even more special here (see below),
> and from Hadley's questions, in hindsight I now see that he's been aware
> of that and I hereby apologize to Hadley for not having thought
> and looked more, when he asked ..
>
> Half an hour ago, I've done another source code commit (svn r77446),
> to "R-devel" only, of course, and the R-devel NEWS now starts as
>
> ------------------------------------------------------------
>
> CHANGES IN R-devel:
>
>    USER-VISIBLE CHANGES:
>
>      •  .... intention that the next non-patch release should be 4.0.0.
>
>      • R now builds by default against a PCRE2 library ........
>        ...................
>        ...................
>
>      • For now only active when environment variable
>        _R_CLASS_MATRIX_ARRAY_ is set to non-empty, but planned to be the
>        new unconditional behavior when R 4.0.0 is released:
>
>        Newly, matrix objects also inherit from class "array", namely,
>        e.g., class(diag(1)) is c("matrix", "array") which invalidates
>        code (wrongly) assuming that length(class(obj)) == 1, a wrong
>        assumption that is less frequently fulfilled now.  (Currently
>        only after setting _R_CLASS_MATRIX_ARRAY_ to non-empty.)
>
>        S3 methods for "array", i.e., <someFun>.array(), are now also
>        dispatched for matrix objects.
>
> ------------------------------------------------------------
> (where only the very last 1.5 lines paragraph is new.)
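
To make the length(class(obj)) == 1 point concrete, here is a small sketch.
The helper names is_mat_fragile() and is_mat_robust() are hypothetical and
only serve the example; they are not part of R:

```r
m <- diag(2)
class(m)    # one element in R <= 3.6.x, two elements in R >= 4.0.0

## Fragile: silently assumes that class(x) has exactly one element.
is_mat_fragile <- function(x) identical(class(x), "matrix")

## Robust: does not depend on the length of class(x);
## inherits(x, "matrix") would do equally well.
is_mat_robust <- function(x) is.matrix(x)

is_mat_fragile(m)   # the answer depends on the R version
is_mat_robust(m)    # TRUE in old and new R
```

The same reasoning applies to code such as if (class(x) == "matrix"),
where the comparison yields a length-2 logical once the change is active.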
>
> Note the following
> (if you use a version of R-devel, with svn rev >= 77446; which
>   you may get as a binary for Windows in about one day; everyone
>   else needs to compile from the sources .. or wait a bit, maybe
>   also not much longer than one day, for a docker image) :
>
>
> > Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_") # ==> current R behavior
> > class(m <- diag(1))
> [1] "matrix"
> > Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !") # ==> future R behavior
> > class(m)
> [1] "matrix" "array"
> >
> > foo <- function(x) UseMethod("foo")
> > foo.array <- function(x) "made in foo.array()"
> > foo(m)
> [1] "made in foo.array()"
> > Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_")# ==> current R behavior
> > foo(m)
> Error in UseMethod("foo") :
>    no applicable method for 'foo' applied to an object of class "c('matrix', 'double', 'numeric')"
>
> > Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = TRUE) # ==> future R behavior
> > foo(m)
> [1] "made in foo.array()"
> > foo.A <- foo.array ; rm(foo.array)
> > foo(m)
> Error in UseMethod("foo") :
>    no applicable method for 'foo' applied to an object of class "c('matrix', 'array', 'double', 'numeric')"
> >
>
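
The same dispatch behavior can be sketched without the environment
variable, assuming an R version where the change is unconditional
(R >= 4.0.0).  foo.default() and foo.matrix() are added here purely for
illustration and do not appear in the transcript above:

```r
foo <- function(x) UseMethod("foo")
foo.array   <- function(x) "made in foo.array()"
foo.default <- function(x) "made in foo.default()"

m <- diag(1)
first <- foo(m)   # R >= 4.0.0: the array method; R <= 3.6.x: the default
first

## A more specific matrix method, once defined, takes precedence:
foo.matrix <- function(x) "made in foo.matrix()"
second <- foo(m)
second

## A genuine 3-d array reaches foo.array() in old and new R alike:
third <- foo(array(1:8, c(2, 2, 2)))
third
```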
> So, with my commit 77446, the  _R_CLASS_MATRIX_ARRAY_
> environment variable also changes the
>
>     "S3 dispatch determining class"
>
> mentioned as 'class' in the error message (of the two cases, old
> and new) above,  which in R <= 3.6.x for a numeric matrix is
>
>      c('matrix', 'double', 'numeric')
>
> and from R 4.0.0 on  will be
>
>      c('matrix', 'array', 'double', 'numeric')
>
> Note that this is *not* (in R <= 3.6.x, nor very probably in R 4.0.0)
> the same as  R's  class().
> Hadley calls this long class vector the  'implicit class' -- which
> is a good term but somewhat conflicting with R's (i.e. R-core's)
> "definition" used in the  ?class  help page (for ca. 11 years).
>
> R's internal C code has a nice function R_data_class2()
> which computes this 'S3-dispatch-class' character (vector) for
> any R object, and R_data_class2() is indeed called from (the
> underlying C function of)  R's UseMethod().
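
A small sketch of what dispatching on this longer class vector means in
practice.  The generic kind() and its methods are hypothetical names chosen
for the example:

```r
kind <- function(x) UseMethod("kind")
kind.integer <- function(x) "integer method"
kind.numeric <- function(x) "numeric method"
kind.default <- function(x) "default method"

r_int <- kind(1:2)            # an integer method exists and is used
r_dbl <- kind(pi)             # no kind.double(), so kind.numeric() is used
r_mat <- kind(matrix(pi, 2))  # no matrix/array/double methods: numeric again
r_chr <- kind("A")            # nothing matches, so the default is used

c(r_int, r_dbl, r_mat, r_chr)
```

None of these objects carries a class attribute; UseMethod() walks the
implicit dispatch class from left to right until a method is found.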
>
> Using the above fact of an error message,
> I wrote a nice (quite well tested) function  my.class2()  which
> returns this S3_dispatch_class() also in current versions of R:
>
> my.class2 <- function(x) { # use a fn name not used by any sane ..
>      foo.7.3.343 <- function(x) UseMethod("foo.7.3.343")
>      msg <- tryCatch(foo.7.3.343(x), error=function(e) e$message)
>      clm <- sub('"$', '', sub(".* of class \"", '', msg))
>      if(is.language(x) || is.function(x))
>          clm
>      else {
>          cl <- str2lang(clm)
>          if(is.symbol(cl)) as.character(cl) else eval(cl)
>      }
> }
>
> ## str2lang() needs R >= 3.6.0:
> if(getRversion() < "3.6.0") ## substitute for str2lang(), good enough here:
>      str2lang <- function(s) parse(text = s, keep.source=FALSE)[[1]]
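
The string handling inside my.class2() can be looked at in isolation.  In
this sketch the error message is hard-coded (copied from the transcript
above) rather than caught from UseMethod(), and parse_dispatch_class() is
a hypothetical helper name:

```r
parse_dispatch_class <- function(msg) {
    ## strip everything up to and including  of class "  and the final quote
    clm <- sub('"$', '', sub(".* of class \"", '', msg))
    cl <- parse(text = clm, keep.source = FALSE)[[1]]
    ## a single class parses to a symbol, several classes to a call to c()
    if (is.symbol(cl)) as.character(cl) else eval(cl)
}

msg_many <- paste0("no applicable method for 'foo' applied to an object of class ",
                   "\"c('matrix', 'double', 'numeric')\"")
msg_one  <- paste0("no applicable method for 'foo' applied to an object of class ",
                   "\"character\"")

many <- parse_dispatch_class(msg_many)
one  <- parse_dispatch_class(msg_one)
many
one
```

Because this relies on the English wording of the error message, it is a
diagnostic trick rather than something to build production code on.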
>
>    
> Now you can look at such things yourself:
>
> ## --------------------- the "interesting" cases : ---
> ## integer and double
> my.class2( pi) # == c("double",  "numeric")
> my.class2(1:2) # == c("integer", "numeric")
> ## matrix and array [also combined with int / double ] :
> my.class2(matrix(1L, 2,3))   # == c(matrixCL, "integer", "numeric")  <<<
> my.class2(matrix(pi, 2,3))   # == c(matrixCL,  "double", "numeric")  <<<
> my.class2(array("A", 2:3))   # == c(matrixCL,  "character")          <<<
> my.class2(array(1:24, 2:4))   # == c("array",  "integer", "numeric")
> my.class2(array( pi , 2:4))   # == c("array",   "double", "numeric")
> my.class2(array(TRUE, 2:4))   # == c("array", "logical")
> my.class2(array(letters, 2:4)) # == c("array", "character")
> my.class2(array(1:24 + 1i, 2)) # == c("array", "complex")
>
> ## other cases
> my.class2(NA) # == class(NA) : "logical"
> my.class2("A") # == class("B"): "character"
> my.class2(as.raw(0:2)) # == "raw"
> my.class2(1 + 2i) # == "complex"
> my.class2(USJudgeRatings)#== "data.frame"
> my.class2(class) # == "function" # also for a primitive
> my.class2(globalenv()) # == "environment"
> my.class2(quote(sin(x)))# == "call"
> my.class2(quote(sin) )  # == "name"
> my.class2(quote({})) # == class(*) == "{"
> my.class2(quote((.))) # == class(*) == "("
>
> -----------------------------------------------------
>
> note that of course, the lines marked "<<<" above, contain
> 'matrixCL'  which is "matrix" in "old" (i.e. current) R,
>    and is c("matrix", "array") in "new" (i.e. future) R.
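
For readers who want to run the comparisons above, matrixCL can be defined
in a version-dependent way.  This is a sketch that assumes the environment
variable is not set in an R-devel build from the transition period:

```r
## "matrix" in R <= 3.6.x, c("matrix", "array") from R 4.0.0 on
matrixCL <- if (getRversion() >= "4.0.0") c("matrix", "array") else "matrix"

ok_int <- identical(class(matrix(1L, 2, 3)), matrixCL)
ok_chr <- identical(class(array("A", 2:3)),  matrixCL)  # a 2-d array is a matrix
cl_3d  <- class(array(1:24, 2:4))                       # a 3-d array

c(ok_int, ok_chr)
cl_3d
```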
>
> Last but not least: It's quite trivial (only few words need to
> be added to the sources; more to the documentation)  to add an R
> function to base R which provides the same as my.class2() above,
> (but much more efficiently, not via catching error messages !!),
> and my current proposal for that function's name is  .class2()
> {it should start with a dot ("."), as it's not for the simple
>   minded average useR ... and you know how I'm happy with
>   function names that do not need one single [Shift] key ...}
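
A function of that name is available in base R from version 4.0.0 on, so
the following sketch assumes R >= 4.0.0 and simply queries it for a few of
the objects listed above:

```r
## Assumes R >= 4.0.0, where base::.class2() is available.
cl_mat <- .class2(matrix(1L, 2, 3))
cl_arr <- .class2(array(1:24, 2:4))
cl_dbl <- .class2(pi)

cl_mat
cl_arr
cl_dbl

## class() itself reports only the first part of the story:
class(matrix(1L, 2, 3))
```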
>
> The current plan contains
>
> 1)  Notify CRAN package maintainers (ca 140) whose packages no
>      longer pass R CMD check  when the feature is turned on
>      (via setting the environment variable) in R-devel.
>
> 2a) (Some) CRAN team members set _R_CLASS_MATRIX_ARRAY_ (to non-empty),
>      as part of the incoming checks, at least for all new CRAN submissions
>
> 2b) set the  _R_CLASS_MATRIX_ARRAY_ (to non-empty), as part of
>      ' R CMD check --as-cran <pkg>'
>
> 3)  Before the end of 2019, change the R sources (for R-devel)
>      such that it behaves as it behaves currently when the environment
>      variable is set *AND* abolish this environment variable from
>      the sources.  {read on to learn *why*}
>
> Consequently (to 3), R 4.0.0 will behave as indicated, unconditionally.
>
> Note that (as I've shown above in the first example set) this is
> set up in such a manner that you can change the environment
> variable during a *running* R session, and observe the effect immediately.
> This, however, leads to some slowdown of quite a bit of R
> code, because the environment variable actually has to be
> checked quite often (easily dozens of times for simple R calls).
>
> For that reason, we want to do "3)" as quickly as possible.
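
The cost argument can be felt with a rough R-level analogue.  This is only
a sketch: the real check happens in R's C code, and read_flag() is a
hypothetical helper, not how R implements it:

```r
## Read the environment variable on every call (analogue of the transition code)
read_flag <- function() nzchar(Sys.getenv("_R_CLASS_MATRIX_ARRAY_"))

n <- 1e4
t_each <- system.time(for (i in seq_len(n)) read_flag())

## Read it once and reuse the cached value (analogue of fixed behavior)
flag <- read_flag()
t_cached <- system.time(for (i in seq_len(n)) flag)

t_each[["elapsed"]]
t_cached[["elapsed"]]
```

Timings vary by machine, so no particular numbers are claimed here; the
point is only that a repeated lookup does work that a fixed behavior avoids.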
>
> Please do not hesitate to ask or comment
> -- here, not on Twitter, please --  noting that I'll be
> basically offline for an extended weekend within 24h, now.
>
> I hope this will eventually lead to cleanup and clarity in
> R, and hence should be worth the pain of broken
> back-compatibility and having to adapt your (almost always only
> sub-optimally written ;-)) R code,
> see also my Blog   http://bit.ly/R_blog_class_think_2x
>
> Martin Maechler
> ETH Zurich and R Core team
>

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

Re: class(<matrix>) |--> c("matrix", "array") -- and S3 dispatch

Martin Maechler
>>>>> Pages, Herve
>>>>>     on Tue, 21 Jan 2020 17:33:01 +0000 writes:

    > Dear Martin,
    > What's the ETA for _R_CLASS_MATRIX_ARRAY_=TRUE to become the new
    > unconditional behavior in R devel? Thanks!

    > H.

Thank you, Hervé, for asking / reminding.

It has been made so now, 3 days ago (svn r77714).

Martin





______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: class(<matrix>) |--> c("matrix", "arrary") -- and S3 dispatch

Hervé Pagès-2
On 1/27/20 23:51, Martin Maechler wrote:

>>>>>> Pages, Herve
>>>>>>      on Tue, 21 Jan 2020 17:33:01 +0000 writes:
>
>      > Dear Martin,
>      > What's the ETA for _R_CLASS_MATRIX_ARRAY_=TRUE to become the new
>      > unconditional behavior in R devel? Thanks!
>
>      > H.
>
> Thank you, Hervé, for asking / reminding.
>
> It has been made so now, 3 days ago (svn r77714).

Yep, I've seen that. Already deployed on the Bioconductor build
machines. Thanks!

H.

>
> Martin
>
>
>
>
>      > On 11/21/19 08:57, Martin Maechler wrote:
>      >>
>      >> TLDR: This is quite technical, still somewhat important:
>      >> 1)  R 4.0.0 will become a bit more coherent: a matrix is an array
>      >> 2)  Your package (or one you use) may be affected.
>      >>
>      >>
>      >>>>>>> Martin Maechler
>      >>>>>>> on Fri, 15 Nov 2019 17:31:15 +0100 writes:
>      >>
>      >>>>>>> Pages, Herve
>      >>>>>>> on Thu, 14 Nov 2019 19:13:47 +0000 writes:
>      >>
>      >> >> On 11/14/19 05:47, Hadley Wickham wrote:
>      >> >>> On Sun, Nov 10, 2019 at 2:37 AM Martin Maechler ... wrote:
>      >>
>      >> [................]
>      >>
>      >> >>>>> Note again that both "matrix" and "array" are special [see ?class] as
>      >> >>>>> being of  __implicit class__  and I am considering that this
>      >> >>>>> implicit class behavior for these two should be slightly
>      >> >>>>> changed ....
>      >> >>>>>
>      >> >>>>> And indeed I think you are right on spot and this would mean
>      >> >>>>> that indeed the implicit class
>      >> >>>>> "matrix" should rather become c("matrix", "array").
>      >> >>>>
>      >> >>>> I've made up my mind (and not been contradicted by my fellow R
>      >> >>>> corers) to try to go there for  R 4.0.0   next April.
>      >>
>      >> >>> I can't seem to find the previous thread, so would you mind being a
>      >> >>> bit more explicit here? Do you mean adding "array" to the implicit
>      >> >>> class?
>      >>
>      >> >> It's late in Europe ;-)
>      >>
>      >> >> That's my understanding. I think the plan is to have class(matrix())
>      >> >> return c("matrix", "array"). No class attributes added to matrix or
>      >> >> array objects.
>      >>
>      >> >> It's all that is needed to have inherits(matrix(), "array") return TRUE
>      >> >> (instead of FALSE at the moment) and S3 dispatch pick up the foo.array
>      >> >> method when foo(matrix()) is called and there is no foo.matrix method.
>      >>
>      >> > Thank you, Hervé!  That's exactly the plan.
>      >>
>      >> BUT it's wrong what I (and Peter and Hervé and ....) had assumed:
>      >>
>      >> If I just change the class
>      >> (as I already did a few days ago, but you must activate the change
>      >> via environment variable, see below),
>      >>
>      >> S3 dispatch does *NOT* at all pick it up:
>      >> "matrix" (and "array") are even more special here (see below),
>      >> and from Hadley's questions, in hindsight I now see that he's been aware
>      >> of that and I hereby apologize to Hadley for not having thought
>      >> and looked more, when he asked ..
>      >>
>      >> Half an hour ago, I've done another source code commit (svn r77446),
>      >> to "R-devel" only, of course, and the R-devel NEWS now starts as
>      >>
>      >> ------------------------------------------------------------
>      >>
>      >> CHANGES IN R-devel:
>      >>
>      >> USER-VISIBLE CHANGES:
>      >>
>      >> •  .... intention that the next non-patch release should be 4.0.0.
>      >>
>      >> • R now builds by default against a PCRE2 library ........
>      >> ...................
>      >> ...................
>      >>
>      >> • For now only active when environment variable
>      >> _R_CLASS_MATRIX_ARRAY_ is set to non-empty, but planned to be the
>      >> new unconditional behavior when R 4.0.0 is released:
>      >>
>      >> Newly, matrix objects also inherit from class "array", namely,
>      >> e.g., class(diag(1)) is c("matrix", "array") which invalidates
>      >> code (wrongly) assuming that length(class(obj)) == 1, a wrong
>      >> assumption that is less frequently fulfilled now.  (Currently
>      >> only after setting _R_CLASS_MATRIX_ARRAY_ to non-empty.)
>      >>
>      >> S3 methods for "array", i.e., <someFun>.array(), are now also
>      >> dispatched for matrix objects.
>      >>
>      >> ------------------------------------------------------------
>      >> (where only the very last paragraph, 1.5 lines long, is new.)
>      >>
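A minimal sketch, in R, of what the NEWS entry above means for package code. The helper names `is_mat_fragile` and `is_mat` are hypothetical and do not come from the thread; the point is only that a test built on `class(x) == "matrix"` depends on the length of `class(x)`, while `inherits()` and `is.matrix()` do not.

```r
m <- diag(2)

# Fragile (hypothetical helper): compares the whole class vector with "matrix".
# When class(m) has length 2 the result has length 2 as well, which if()
# warns about in older R versions and rejects with an error in R >= 4.2.0.
is_mat_fragile <- function(x) class(x) == "matrix"

# Robust (hypothetical helper): independent of the length of class(x).
is_mat <- function(x) inherits(x, "matrix")

length(is_mat_fragile(m))  # always the same as length(class(m))
is_mat(m)                  # TRUE with the old and the new class vector
is.matrix(m)               # also TRUE; tests the dim attribute, not the class
```

Either `inherits(x, "matrix")` or `is.matrix(x)` should behave the same whether or not the environment variable is set.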
>      >> Note the following
>      >> (if you use a version of R-devel, with svn rev >= 77446; which
>      >> you may get as a binary for Windows in about one day; everyone
>      >> else needs to compile from the sources .. or wait a bit, maybe
>      >> also not much longer than one day, for a docker image) :
>      >>
>      >>
>      >>> Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_") # ==> current R behavior
>      >>> class(m <- diag(1))
>      >> [1] "matrix"
>      >>> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !") # ==> future R behavior
>      >>> class(m)
>      >> [1] "matrix" "array"
>      >>>
>      >>> foo <- function(x) UseMethod("foo")
>      >>> foo.array <- function(x) "made in foo.array()"
>      >>> foo(m)
>      >> [1] "made in foo.array()"
>      >>> Sys.unsetenv("_R_CLASS_MATRIX_ARRAY_")# ==> current R behavior
>      >>> foo(m)
>      >> Error in UseMethod("foo") :
>      >> no applicable method for 'foo' applied to an object of class "c('matrix', 'double', 'numeric')"
>      >>
>      >>> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = TRUE) # ==> future R behavior
>      >>> foo(m)
>      >> [1] "made in foo.array()"
>      >>> foo.A <- foo.array ; rm(foo.array)
>      >>> foo(m)
>      >> Error in UseMethod("foo") :
>      >> no applicable method for 'foo' applied to an object of class "c('matrix', 'array', 'double', 'numeric')"
>      >>>
>      >>
>      >> So, with my commit 77446, the  _R_CLASS_MATRIX_ARRAY_
>      >> environment variable also changes the
>      >>
>      >> "S3 dispatch determining class"
>      >>
>      >> mentioned as 'class' in the error message (of the two cases, old
>      >> and new) above,  which in R <= 3.6.x for a numeric matrix is
>      >>
>      >> c('matrix', 'double', 'numeric')
>      >>
>      >> and from R 4.0.0 on  will be
>      >>
>      >> c('matrix', 'array', 'double', 'numeric')
>      >>
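The dispatch order described above can be sketched as follows. The generic `describe` and its methods are hypothetical names chosen for illustration; the expected behaviour for the matrix case is stated only as an assumption that follows from the two dispatch vectors quoted above.

```r
# Hypothetical generic with methods for "array", "numeric" and a default.
describe <- function(x) UseMethod("describe")
describe.array   <- function(x) "array method"
describe.numeric <- function(x) "numeric method"
describe.default <- function(x) "default method"

describe(array(1:24, 2:4))  # 3-d array: "array" comes first in its dispatch vector
describe(1:3)               # plain integer vector: falls through to "numeric"
describe("a")               # character: no matching method, so the default is used

# For a numeric matrix there is no describe.matrix(), so dispatch moves on:
# with c('matrix', 'array', 'double', 'numeric') it should reach describe.array(),
# with c('matrix', 'double', 'numeric') it should reach describe.numeric().
describe(diag(2))
```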
>      >> Note that this is *not* (in R <= 3.6.x, nor very probably in R 4.0.0)
>      >> the same as  R's  class().
>      >> Hadley calls this long class vector the  'implicit class' -- which
>      >> is a good term but somewhat conflicting with R's (i.e. R-core's)
>      >> "definition" used in the  ?class  help page (for ca. 11 years).
>      >>
>      >> R's internal C code has a nice function, R_data_class2(),
>      >> which computes this 'S3-dispatch-class' character (vector) for
>      >> any R object, and R_data_class2() is indeed called from (the
>      >> underlying C function of)  R's UseMethod().
>      >>
>      >> Using the above fact of an error message,
>      >> I wrote a nice (quite well tested) function  my.class2()  which
>      >> returns this S3_dispatch_class() also in current versions of R:
>      >>
>      >> my.class2 <- function(x) { # use a fn name not used by any sane ..
>      >>     foo.7.3.343 <- function(x) UseMethod("foo.7.3.343")
>      >>     msg <- tryCatch(foo.7.3.343(x), error=function(e) e$message)
>      >>     clm <- sub('"$', '', sub(".* of class \"", '', msg))
>      >>     if(is.language(x) || is.function(x))
>      >>         clm
>      >>     else {
>      >>         cl <- str2lang(clm)
>      >>         if(is.symbol(cl)) as.character(cl) else eval(cl)
>      >>     }
>      >> }
>      >>
>      >> ## str2lang() needs R >= 3.6.0:
>      >> if(getRversion() < "3.6.0") ## substitute for str2lang(), good enough here:
>      >>     str2lang <- function(s) parse(text = s, keep.source=FALSE)[[1]]
>      >>
>      >>
>      >> Now you can look at such things yourself:
>      >>
>      >> ## --------------------- the "interesting" cases : ---
>      >> ## integer and double
>      >> my.class2( pi) # == c("double",  "numeric")
>      >> my.class2(1:2) # == c("integer", "numeric")
>      >> ## matrix and array [also combined with int / double ] :
>      >> my.class2(matrix(1L, 2,3))   # == c(matrixCL, "integer", "numeric")  <<<
>      >> my.class2(matrix(pi, 2,3))   # == c(matrixCL,  "double", "numeric")  <<<
>      >> my.class2(array("A", 2:3))   # == c(matrixCL,  "character")          <<<
>      >> my.class2(array(1:24, 2:4))   # == c("array",  "integer", "numeric")
>      >> my.class2(array( pi , 2:4))   # == c("array",   "double", "numeric")
>      >> my.class2(array(TRUE, 2:4))   # == c("array", "logical")
>      >> my.class2(array(letters, 2:4)) # == c("array", "character")
>      >> my.class2(array(1:24 + 1i, 2)) # == c("array", "complex")
>      >>
>      >> ## other cases
>      >> my.class2(NA) # == class(NA) : "logical"
>      >> my.class2("A") # == class("A"): "character"
>      >> my.class2(as.raw(0:2)) # == "raw"
>      >> my.class2(1 + 2i) # == "complex"
>      >> my.class2(USJudgeRatings)#== "data.frame"
>      >> my.class2(class) # == "function" # also for a primitive
>      >> my.class2(globalenv()) # == "environment"
>      >> my.class2(quote(sin(x)))# == "call"
>      >> my.class2(quote(sin) )  # == "name"
>      >> my.class2(quote({})) # == class(*) == "{"
>      >> my.class2(quote((.))) # == class(*) == "("
>      >>
>      >> -----------------------------------------------------
>      >>
>      >> note that of course, the lines marked "<<<" above, contain
>      >> 'matrixCL'  which is "matrix" in "old" (i.e. current) R,
>      >> and is c("matrix", "array") in "new" (i.e. future) R.
>      >>
>      >> Last but not least: It's quite trivial (only few words need to
>      >> be added to the sources; more to the documentation)  to add an R
>      >> function to base R which provides the same as my.class2() above,
>      >> (but much more efficiently, not via catching error messages !!),
>      >> and my current proposal for that function's name is  .class2()
>      >> {it should start with a dot ("."), as it's not for the simple
>      >> minded average useR ... and you know how I'm happy with
>      >> function names that do not need one single [Shift] key ...}
>      >>
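As far as I know, the proposal above ended up in base R as `.class2()` starting with R 4.0.0; the sketch below assumes that and guards the call accordingly. The wrapper name `dispatch_class` is hypothetical and only there so the sketch also loads on older R versions.

```r
# Hypothetical wrapper: assumes base::.class2() exists (R >= 4.0.0).
dispatch_class <- function(x) {
  if (!exists(".class2", envir = baseenv()))
    stop("this sketch assumes base::.class2(), i.e. R >= 4.0.0")
  .class2(x)
}

if (getRversion() >= "4.0.0") {
  m <- matrix(1L, 2, 3)
  print(class(m))           # the class vector as reported to the user
  print(dispatch_class(m))  # the longer vector used for S3 dispatch
}
```

On older versions, the my.class2() function quoted above remains the way to obtain the same vector.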
>      >> The current plan contains
>      >>
>      >> 1)  Notify CRAN package maintainers (ca 140) whose packages no
>      >> longer pass R CMD check  when the feature is turned on
>      >> (via setting the environment variable) in R-devel.
>      >>
>      >> 2a) (Some) CRAN team members set _R_CLASS_MATRIX_ARRAY_ (to non-empty),
>      >> as part of the incoming checks, at least for all new CRAN submissions
>      >>
>      >> 2b) set the  _R_CLASS_MATRIX_ARRAY_ (to non-empty), as part of
>      >> ' R CMD check --as-cran <pkg>'
>      >>
>      >> 3)  Before the end of 2019, change the R sources (for R-devel)
>      >> such that it behaves as it behaves currently when the environment
>      >> variable is set *AND* abolish this environment variable from
>      >> the sources.  {read on to learn *why*}
>      >>
>      >> Consequently (to 3), R 4.0.0 will behave as indicated, unconditionally.
>      >>
>      >> Note that (as I've shown above in the first example set) this is
>      >> set up in such a manner that you can change the environment
>      >> variable during a *running* R session, and observe the effect immediately.
>      >> This, however, leads to some slowdown of quite a bit of the R
>      >> code, because the environment variable actually has to be
>      >> checked quite often (easily dozens of times for simple R calls).
>      >>
>      >> For that reason, we want to do "3)" as quickly as possible.
>      >>
>      >> Please do not hesitate to ask or comment
>      >> -- here, not on Twitter, please --  noting that I'll be
>      >> basically offline for an extended weekend within 24h, now.
>      >>
>      >> I hope this will eventually lead to cleanup and clarity in
>      >> R, and hence should be worth the pain of broken
>      >> back-compatibility and having to adapt your (almost always only
>      >> sub-optimally written ;-)) R code,
>      >> see also my Blog   http://bit.ly/R_blog_class_think_2x
>      >>
>      >> Martin Maechler
>      >> ETH Zurich and R Core team
>      >>
>
>      > --
>      > Hervé Pagès
>
>      > Program in Computational Biology
>      > Division of Public Health Sciences
>      > Fred Hutchinson Cancer Research Center
>      > 1100 Fairview Ave. N, M1-B514
>      > P.O. Box 19024
>      > Seattle, WA 98109-1024
>
>      > E-mail: [hidden email]
>      > Phone:  (206) 667-5791
>      > Fax:    (206) 667-1319
>

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel