## apply with zero-row matrix

Forgive me if this has been asked many times before, but I couldn't find anything on the mailing lists.

I'd expect apply(m, 1, foo) not to call `foo` if m is a matrix with zero rows. In fact:

    m <- matrix(NA, 0, 5)
    apply(m, 1, function (x) {cat("Called...\n"); print(x)})
    ## Called...
    ## [1] FALSE FALSE FALSE FALSE FALSE

Similarly for apply(m, 2, ...) if m has no columns. Is there a reason for this? Could it be documented?

David
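A minimal sketch that makes the observation above countable, assuming a recent version of base R. `count_calls` is a hypothetical helper introduced only for illustration; it is not part of R.

```r
# Hypothetical helper: count how often apply() invokes FUN when
# iterating over the rows (MARGIN = 1) of a matrix.
count_calls <- function(m) {
  n_calls <- 0L
  apply(m, 1, function(x) {
    n_calls <<- n_calls + 1L  # updates the counter in count_calls()'s frame
    length(x)
  })
  n_calls
}

# Zero rows: FUN is still invoked.
calls_zero_rows  <- count_calls(matrix(NA, nrow = 0, ncol = 5))

# Three rows: FUN is invoked once per row.
calls_three_rows <- count_calls(matrix(NA, nrow = 3, ncol = 5))
```

Comparing `calls_zero_rows` with `calls_three_rows` shows that "number of calls equals number of rows" holds for the non-empty matrix but not for the empty one.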
## Re: apply with zero-row matrix

>>>>> David Hugh-Jones
>>>>>     on Mon, 30 Jul 2018 05:33:19 +0100 writes:

    > I'd expect apply(m, 1, foo) not to call `foo` if m is a
    > matrix with zero rows.
    [...]
    > Similarly for apply(m, 2,...) if m has no columns.  Is
    > there a reason for this?

Yes:

The reverse is really true for almost all basic R functions: they *are* called and give an "empty" result automatically when the main argument is empty.

What you basically propose is to add an extra

    if()
        return()

to all R functions. While that makes sense for high-level R functions that do a lot of things, this would really be a bad idea in general: it would make all of these basic functions larger (more to maintain) and slightly slower for all non-zero cases, just to make them slightly faster for the rare zero-length case.

Martin Maechler
ETH Zurich and R core Team
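To illustrate the point about basic functions and empty arguments, here is a small sketch using three base R functions chosen only as examples; the variable names are illustrative.

```r
# Each call below receives a zero-length argument, runs normally, and
# returns an "empty" (or neutral) result; the caller needs no special case.
empty_num <- numeric(0)
empty_chr <- character(0)

total   <- sum(empty_num)      # neutral element of addition
roots   <- sqrt(empty_num)     # zero-length numeric vector
shouted <- toupper(empty_chr)  # zero-length character vector
```

None of these calls needs a guard around it for the empty case, which is the general convention the reply above refers to.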
## Re: apply with zero-row matrix

Hi Martin,

Fair enough for R functions in general. But the behaviour of apply violates the expectation that apply(m, 1, fun) calls fun n times when m has n rows. That seems pretty basic.

Also, I understand from your argument why it makes sense to call apply and return a special result (presumably NULL) for an empty argument; but why should apply call fun?

Cheers
David
## Re: apply with zero-row matrix

>>>>> David Hugh-Jones
>>>>>     on Mon, 30 Jul 2018 10:12:24 +0100 writes:

    > Hi Martin, Fair enough for R functions in general. But the
    > behaviour of apply violates the expectation that apply(m,
    > 1, fun) calls fun n times when m has n rows.  That seems
    > pretty basic.

Well, that expectation is obviously wrong ;-)  see below

    > Also, I understand from your argument why it makes sense
    > to call apply and return a special result (presumably
    > NULL) for an empty argument; but why should apply call fun?

The reason is seen e.g. in

    > apply(matrix(,0,3), 2, quantile)
         [,1] [,2] [,3]
    0%     NA   NA   NA
    25%    NA   NA   NA
    50%    NA   NA   NA
    75%    NA   NA   NA
    100%   NA   NA   NA
    >

and that is documented (+/-) in the first paragraph of the 'Value:' section of help(apply):

 > Value:
 >
 >      If each call to 'FUN' returns a vector of length 'n', then 'apply'
 >      returns an array of dimension 'c(n, dim(X)[MARGIN])' if 'n > 1'.
 >      If 'n' equals '1', 'apply' returns a vector if 'MARGIN' has length
 >      1 and an array of dimension 'dim(X)[MARGIN]' otherwise.  If 'n' is
 >      '0', the result has length 0 but not necessarily the 'correct'
 >      dimension.

To determine 'n', the function *is* called once even when length(X) == 0.

It may indeed be helpful to add this explicitly to the help page ( /src/library/base/man/apply.Rd ).
Can you propose a wording (in *.Rd if possible)?

With regards,
Martin
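The following sketch probes what FUN actually receives in the two zero-row situations discussed in this thread. It reflects the behaviour of apply as I understand it in recent R versions and is not a statement about its internals; `probe_apply` is a hypothetical helper, and the summary function stands in for `quantile` so that the length 'n' of each result is fixed at 2.

```r
# Hypothetical helper: run apply() with a probe that logs the length of
# every slice it receives and returns a fixed-length (n = 2) summary.
probe_apply <- function(X, MARGIN) {
  seen <- integer(0)
  probe <- function(x) {
    seen <<- c(seen, length(x))             # record the slice length
    c(n = length(x), n_na = sum(is.na(x)))  # always a length-2 result
  }
  value <- apply(X, MARGIN, probe)
  list(value = value, slice_lengths = seen)
}

m <- matrix(NA, nrow = 0, ncol = 3)

# Over columns: three columns exist, each of them an empty slice.
by_col <- probe_apply(m, 2)

# Over rows: no rows exist, yet FUN is still called with a placeholder.
by_row <- probe_apply(m, 1)
```

With `MARGIN = 2`, `by_col$slice_lengths` records one zero-length slice per column and `dim(by_col$value)` is `c(n, dim(X)[MARGIN])`, matching the quoted 'Value:' paragraph. With `MARGIN = 1`, `by_row$slice_lengths` records a single slice of length 3, and the result has length 0.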
## Re: apply with zero-row matrix

Hi David,

Besides Martin's point, there is also the issue that in a lot of cases you would still like to have the right class returned.

Right now these are the return values:

    > apply(matrix(NA_integer_,0,5), 1, class)
    character(0)
    > apply(matrix(NA_integer_,0,5), 1, identity)
    integer(0)
    > apply(matrix(NA,0,5), 1, identity)
    logical(0)

Under your proposal, these would all return NULL, so I think there is value in running FUN at least once (say, if you want to check that FUN always returns the right class).

And from a philosophical point of view, R is mostly a functional programming language; I think if you want side effects, a for-loop would look better.

Best regards,
Emil Bode
Data-analyst
DANS: Netherlands Institute for Permanent Access to Digital Research Resources
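The two points in this message can be sketched as follows, assuming a recent base R; the variable names are illustrative only.

```r
# The three calls from the message above: the element type of the
# zero-length result follows from what FUN returned.
cls <- apply(matrix(NA_integer_, 0, 5), 1, class)
int <- apply(matrix(NA_integer_, 0, 5), 1, identity)
lgl <- apply(matrix(NA, 0, 5), 1, identity)

# For side effects, an explicit loop over seq_len(nrow(m)) runs its body
# once per row, and therefore zero times for a zero-row matrix.
m <- matrix(NA, 0, 5)
visited <- 0L
for (i in seq_len(nrow(m))) {
  visited <- visited + 1L
  print(m[i, ])
}
```

Note that `seq_len(nrow(m))` is used rather than `1:nrow(m)`: for a zero-row matrix the latter evaluates to `c(1, 0)`, so the loop body would run twice and index rows that do not exist.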
## Re: apply with zero-row matrix

Try pmap and related functions in purrr:

    pmap(as.data.frame(m), ~ { cat("Called...\n"); print(c(...)) })
    ## list()

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
## Re: apply with zero-row matrix

Interesting discussion. I'm not wholly convinced by Martin's and Emil's arguments. The behaviour seems to violate an obvious expectation (fun is called once per row) to satisfy a subtle one (result has a guaranteed dimension and type).

In any case, here's a suggested chunk of Rd to go at the end of "Value":

    If \code{dim(X)[MARGIN]} is zero, then \code{FUN} is called once, with an
    argument of the appropriate dimensions. The argument's type is the same as
    \code{typeof(X)}, and the argument values are those returned by
    \code{vector(typeof(X))}. For example, if \code{X} is numeric, the argument
    will be a vector (or matrix or array) of zeroes. The type and length of the
    value returned by \code{FUN} are used to determine the type of the result.

And at the end of "Details":

    \code{FUN} is always called at least once, see below.

David
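A short sketch for checking the proposed wording against actual behaviour, assuming a recent base R. `placeholder_arg` is a hypothetical helper that captures the argument FUN receives when the margin has extent zero.

```r
# Hypothetical helper: return the argument that FUN receives when apply()
# iterates over the rows of a zero-row matrix X.
placeholder_arg <- function(X) {
  seen <- NULL
  apply(X, 1, function(x) {
    seen <<- x  # remember what FUN was called with
    x
  })
  seen
}

arg_dbl <- placeholder_arg(matrix(numeric(0),   nrow = 0, ncol = 3))
arg_chr <- placeholder_arg(matrix(character(0), nrow = 0, ncol = 2))
arg_lgl <- placeholder_arg(matrix(NA,           nrow = 0, ncol = 5))
```

Each captured argument has the type of the input matrix, the length of one row, and the default "zero" value for that type (`0`, `""`, or `FALSE`), which is consistent with the `FALSE FALSE FALSE FALSE FALSE` printed in the first message of this thread.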