as.vector() broken on a matrix or array of type "list"

as.vector() broken on a matrix or array of type "list"

Hervé Pagès-2
Hi,

Unlike on an atomic matrix, as.vector() doesn't drop the "dim"
attribute of a matrix or array of type "list":

   m <- matrix(list(), nrow=2, ncol=3)
   m
   #      [,1] [,2] [,3]
   # [1,] NULL NULL NULL
   # [2,] NULL NULL NULL

   as.vector(m)
   #      [,1] [,2] [,3]
   # [1,] NULL NULL NULL
   # [2,] NULL NULL NULL

   is.vector(as.vector(m))
   # [1] FALSE
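
For contrast, here is a minimal sketch that puts the atomic case next to the list case. The variable names are illustrative only, and the list behavior shown in the comments is the one reported above; it may differ in other R versions.

```r
# Atomic matrix: as.vector() drops "dim", so the result is a plain vector.
a <- matrix(1:6, nrow = 2, ncol = 3)
a_vec <- as.vector(a)
is.null(dim(a_vec))   # TRUE
is.vector(a_vec)      # TRUE

# List matrix: as reported above, as.vector() leaves "dim" in place,
# so is.vector() on the result is FALSE.
l <- matrix(list(), nrow = 2, ncol = 3)
l_vec <- as.vector(l)
dim(l_vec)
is.vector(l_vec)
```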

Thanks,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: as.vector() broken on a matrix or array of type "list"

Martin Maechler
>>>>> Hervé Pagès
>>>>>     on Tue, 25 Sep 2018 23:27:19 -0700 writes:

    > Hi, Unlike on an atomic matrix, as.vector() doesn't drop
    > the "dim" attribute of a matrix or array of type "list":


>    m <- matrix(list(), nrow=2, ncol=3)
>    m
>    #      [,1] [,2] [,3]
>    # [1,] NULL NULL NULL
>    # [2,] NULL NULL NULL

>
>    as.vector(m)
>    #      [,1] [,2] [,3]
>    # [1,] NULL NULL NULL
>    # [2,] NULL NULL NULL

as documented and as always, including (probably all) versions of S and S-plus.

>    is.vector(as.vector(m))
>    # [1] FALSE

As bad as that looks, that's also "known" and has been the case
forever as well...

I agree that the semantics of as.vector(.)  are not what you
would expect, and probably neither what we would do when
creating R today. *)
The help page {the same for as.vector() and is.vector()}
mentions that as.vector() behavior more than once, notably at
the end of 'Details' and its 'Note's....
... with one exception where you have a strong point, and the documentation
is incomplete at least -- under the heading

 Methods for 'as.vector()':

   ....... follow the conventions of the default method.  In particular

   ...
   ...
   ...

   • ‘is.vector(as.vector(x, m), m)’ should be true for any mode ‘m’,
      including the default ‘"any"’.

and you are right that this is not fulfilled in the case the
list has a 'dim' attribute.  
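
The convention quoted above can be checked directly. The following is only a sketch: obeys_convention() is a hypothetical helper, not part of base R, and the result for the list matrix is the one reported in this thread.

```r
# Hypothetical helper: TRUE when the documented convention
# is.vector(as.vector(x, m), m) holds for x and mode m.
obeys_convention <- function(x, mode = "any") {
  is.vector(as.vector(x, mode), mode)
}

atomic_m <- matrix(1:6, nrow = 2, ncol = 3)
list_m   <- matrix(list(), nrow = 2, ncol = 3)

obeys_convention(atomic_m)               # TRUE
obeys_convention(atomic_m, "character")  # TRUE
obeys_convention(list_m)                 # FALSE, per the report in this thread
```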

But I don't think we "can" change as.vector(.) for that case
(where it is a no-op).
Rather, is.vector(.) should possibly return TRUE instead of FALSE -- with
the reasoning (I think most experienced R programmers would
agree) that the foremost property of 'm' is to be
 - a list() {with a dim attribute and matrix-like indexing possibility}
   rather than
 - a 'matrix' {where every matrix entry is a list()}.

At the moment my gut feeling would propose to only update the
documentation, adding that one case as "an exception for historic reasons".

Martin

-----
*) {Possibly such an R we would create today would be much closer to
    julia, where every function is generic / a multi-dispatch method
    "a la S4" .... and still be blazingly fast, thanks to JIT
    compilation, method caching and more smart things.}
But as you know, one of the strengths of (base) R is its stability
and reliability.  You can only use something as "the language
of applied statistics and data science" and rely on published
code still working 10 years later if the language is not
changed/redesigned from scratch every few years ((as some ... are)).


Re: as.vector() broken on a matrix or array of type "list"

Hervé Pagès-2
Hi Martin,

On 09/26/2018 12:41 AM, Martin Maechler wrote:

>>>>>> Hervé Pagès
>>>>>>      on Tue, 25 Sep 2018 23:27:19 -0700 writes:
>
>      > Hi, Unlike on an atomic matrix, as.vector() doesn't drop
>      > the "dim" attribute of a matrix or array of type "list":
>
>
>>     m <- matrix(list(), nrow=2, ncol=3)
>>     m
>>     #      [,1] [,2] [,3]
>>     # [1,] NULL NULL NULL
>>     # [2,] NULL NULL NULL
>
>>
>>     as.vector(m)
>>     #      [,1] [,2] [,3]
>>     # [1,] NULL NULL NULL
>>     # [2,] NULL NULL NULL
>
> as documented and as always, including (probably all) versions of S and S-plus.
>
>>     is.vector(as.vector(m))
>>     # [1] FALSE
>
> As bad as that looks, that's also "known" and has been the case
> forever as well...
>
> I agree that the semantics of as.vector(.)  are not what you
> would expect, and probably neither what we would do when
> creating R today. *)
> The help page {the same for as.vector() and is.vector()}
> mentions that as.vector() behavior more than once, notably at
> the end of 'Details' and its 'Note's....
> ... with one exception where you have a strong point, and the documentation
> is incomplete at least -- under the heading
>
>   Methods for 'as.vector()':
>
>     ....... follow the conventions of the default method.  In particular
>
>     ...
>     ...
>     ...
>
>     • ‘is.vector(as.vector(x, m), m)’ should be true for any mode ‘m’,
>        including the default ‘"any"’.
>
> and you are right that this is not fulfilled in the case the
> list has a 'dim' attribute.
>
> But I don't think we "can" change as.vector(.) for that case
> (where it is a no-op).
> Rather  possibly is.vector(.) should not return FALSE but TRUE -- with
> the reasoning (I think most experienced R programmers would
> agree) that the foremost property of 'm' is to be
>   - a list() {with a dim attribute and matrix-like indexing possibility}
>     rather than
>   - a 'matrix' {where every matrix entry is a list()}.

Note that this change would break any existing code that uses
is.vector() to distinguish between an array (atomic or of mode
"list") and a non-array. Arguably is.array() is the better tool
for that, but I'm sure there is a lot of code out there that uses
is.vector().
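
A minimal sketch of the array/non-array distinction using is.array(), which only looks at the "dim" attribute (variable names are illustrative):

```r
m_list <- matrix(list(), nrow = 2, ncol = 3)  # array of mode "list"
m_int  <- matrix(1:6, nrow = 2, ncol = 3)     # atomic array
plain  <- list(1, "a")                        # plain list, no "dim"

is.array(m_list)   # TRUE
is.array(m_int)    # TRUE
is.array(plain)    # FALSE
is.vector(plain)   # TRUE
```

Code that tests is.array() is unaffected by what is.vector() returns for a list with a "dim" attribute.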

The root of the problem is that as.vector() doesn't drop attributes
that is.vector() sees as "vector breakers", i.e. as breaking the vector
nature of an object. For example, is.vector() considers the "dim"
attribute to be a vector breaker but as.vector() doesn't drop it.

So yes, in order to bring is.vector() and as.vector() into agreement you
can change either one, or both. My gut feeling though is
that it would be less disruptive to not change what is.vector() thinks
about the "dim" attribute and to make sure that as.vector() **always**
drops it (together with "dimnames" if present). How much existing code
could there be that calls as.vector() on an array and expects the "dim"
attribute to be dropped **except** when the mode() of the array is
"list"? It is more likely that existing code that calls as.vector()
on an array doesn't expect such an exception and so is broken. This was
actually the case for my code ;-)
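
Until the two functions agree, code that needs a plain vector can drop "dim" itself. This is only a sketch of one workaround: as_plain_vector() is a hypothetical helper, not part of base R and not a proposed patch.

```r
# Hypothetical helper: drop "dim" (and with it "dimnames") from any
# array, whether atomic or of mode "list". Objects without "dim" are
# returned untouched so that their "names" survive.
as_plain_vector <- function(x) {
  if (!is.null(dim(x))) {
    dim(x) <- NULL   # assigning NULL also removes "dimnames"
  }
  x
}

m <- matrix(list(), nrow = 2, ncol = 3,
            dimnames = list(c("a", "b"), NULL))
v <- as_plain_vector(m)
is.vector(v)         # TRUE
is.null(dim(v))      # TRUE
length(v)            # 6
```

The helper is applied to the array directly rather than wrapping as.vector(), so its result does not depend on how as.vector() treats lists.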

Thanks,
H.

>
> At the moment my gut feeling would propose to only update the
> documentation, adding that one case as "an exception for historic reasons".
>
> Martin
>
> -----
> *) {Possibly such an R we would create today would be much closer to
>      julia, where every function is generic / a multi-dispatch method
>      "a la S4" .... and still be blazingly fast, thanks to JIT
>      compilation, method caching and more smart things.}
> But as you know, one of the strengths of (base) R is its stability
> and reliability.  You can only use something as "the language
> of applied statistics and data science" and rely on published
> code still working 10 years later if the language is not
> changed/redesigned from scratch every few years ((as some ... are)).
>
>
>

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319
