

I'm a little surprised that the following doesn't trigger an error or a warning.
matrix (1:256, 8, 8)
The help file says that the main argument is recycled, if it's too short.
But doesn't say what happens if it's too long.
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel


>>>>> Abby Spurdle (/əˈbi/)
>>>>> on Mon, 1 Feb 2021 19:50:32 +1300 writes:
> I'm a little surprised that the following doesn't trigger an error or a warning.
> matrix (1:256, 8, 8)
> The help file says that the main argument is recycled, if it's too short.
> But doesn't say what happens if it's too long.
It's somewhat subtler than one may assume :
> matrix(1:9, 2,3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
Warning message:
In matrix(1:9, 2, 3) :
data length [9] is not a submultiple or multiple of the number of rows [2]
> matrix(1:8, 2,3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
Warning message:
In matrix(1:8, 2, 3) :
data length [8] is not a submultiple or multiple of the number of columns [3]
> matrix(1:12, 2,3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
>
So it looks to me the current behavior is quite on purpose.
Are you sure it's not documented at all when reading the docs
carefully? (I did *not*, just now).
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel


So, does that mean that a clean result is contingent on the length of
the data being a multiple of both the number of rows and columns?
However, this rule is not straightforward.
> #EXAMPLE 1
> #what I would expect
> matrix (1:12, 0, 0)
<0 x 0 matrix>
Warning message:
In matrix(1:12, 0, 0) : data length exceeds size of matrix
> #EXAMPLE 2
> #don't like this
> matrix (numeric (), 2, 3)
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA NA NA
The first example is what I would expect, but is inconsistent with the
previous examples.
(Because zero is a valid multiple of twelve).
I dislike the second example with recycling of a zerolength vector.
This *is* covered in the help file, but also seems inconsistent with
the previous examples.
(Because two and three are not valid multiples of zero).
Also, I can't think of any reason why someone would want to construct
a matrix with extra data, and then discard part of it.
And even if there was, then why not allow an arbitrarily longer length?
On Mon, Feb 1, 2021 at 10:08 PM Martin Maechler
< [hidden email]> wrote:
>
> >>>>> Abby Spurdle (/əˈbi/)
> >>>>> on Mon, 1 Feb 2021 19:50:32 +1300 writes:
>
> > I'm a little surprised that the following doesn't trigger an error or a warning.
> > matrix (1:256, 8, 8)
>
> > The help file says that the main argument is recycled, if it's too short.
> > But doesn't say what happens if it's too long.
>
> It's somewhat subtler than one may assume :
>
> > matrix(1:9, 2,3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
> Warning message:
> In matrix(1:9, 2, 3) :
> data length [9] is not a submultiple or multiple of the number of rows [2]
>
> > matrix(1:8, 2,3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
> Warning message:
> In matrix(1:8, 2, 3) :
> data length [8] is not a submultiple or multiple of the number of columns [3]
>
> > matrix(1:12, 2,3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
> >
>
> So it looks to me the current behavior is quite on purpose.
> Are you sure it's not documented at all when reading the docs
> carefully? (I did *not*, just now).
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel


FWIW, I paste below a possible change to the warnings generating part of the do_matrix function in R/src/main/array.c that adds the kind of warning that Abby is asking for, and that IMHO would more often help users find bugs in their code than interfere with intended behaviour.
> matrix (1:6, nrow = 2, ncol = 3)
> matrix (1:12, nrow = 2, ncol = 3)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
Warning message:
In matrix(1:12, nrow = 2, ncol = 3) :
data length incompatible with size of matrix
> matrix (1:7, nrow = 2, ncol = 3)
Warning messages:
1: In matrix(1:7, nrow = 2, ncol = 3) :
data length [7] is not a submultiple or multiple of the number of rows [2]
2: In matrix(1:7, nrow = 2, ncol = 3) :
data length incompatible with size of matrix
> matrix (1:8, nrow = 2, ncol = 3)
Warning messages:
1: In matrix(1:8, nrow = 2, ncol = 3) :
data length [8] is not a submultiple or multiple of the number of columns [3]
2: In matrix(1:8, nrow = 2, ncol = 3) :
data length incompatible with size of matrix
> matrix (1:6, nrow = 0, ncol = 0)
<0 x 0 matrix>
> matrix (numeric(0), nrow = 2, ncol = 3)
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA NA NA
> matrix(1:2, ncol = 8)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 2 1 2 1 2 1 2
It would be nice to combine the new warning with that about “...not a submultiple or multiple…” into a single warning, if appropriate (as in two of the examples above), but that would require bigger surgery way above my payscale.
Kind regards
Wolfgang Huber
Index: array.c
===================================================================
 array.c (revision 79951)
+++ array.c (working copy)
@@ 133,18 +133,19 @@
nc = (int) ceil((double) lendat / (double) nr);
}
 if(lendat > 0) {
+ if (lendat > 1) {
R_xlen_t nrc = (R_xlen_t) nr * nc;
 if (lendat > 1 && nrc % lendat != 0) {
+ if ((nrc % lendat) != 0) {
if (((lendat > nr) && (lendat / nr) * nr != lendat) 
((lendat < nr) && (nr / lendat) * lendat != nr))
warning(_("data length [%d] is not a submultiple or multiple of the number of rows [%d]"), lendat, nr);
else if (((lendat > nc) && (lendat / nc) * nc != lendat) 
((lendat < nc) && (nc / lendat) * lendat != nc))
 warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
 }
 else if ((lendat > 1) && (nrc == 0)){
+ warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
+ if (nrc == 0)
warning(_("data length exceeds size of matrix"));
+ if (nrc != lendat)
+ warning(_("data length incompatible with size of matrix"));
}
}

// And here, for easy checking that part of the code in the new form:
if (lendat > 1) {
R_xlen_t nrc = (R_xlen_t) nr * nc;
if ((nrc % lendat) != 0) {
if (((lendat > nr) && (lendat / nr) * nr != lendat) 
((lendat < nr) && (nr / lendat) * lendat != nr))
warning(_("data length [%d] is not a submultiple or multiple of the number of rows [%d]"), lendat, nr);
else if (((lendat > nc) && (lendat / nc) * nc != lendat) 
((lendat < nc) && (nc / lendat) * lendat != nc))
warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
if (nrc == 0)
warning(_("data length exceeds size of matrix"));
if (nrc != lendat)
warning(_("data length incompatible with size of matrix"));
}
}
> Il giorno 2feb2021, alle ore 00:27, Abby Spurdle (/əˈbi/) < [hidden email]> ha scritto:
>
> So, does that mean that a clean result is contingent on the length of
> the data being a multiple of both the number of rows and columns?
>
> However, this rule is not straightforward.
>
>> #EXAMPLE 1
>> #what I would expect
>> matrix (1:12, 0, 0)
> <0 x 0 matrix>
> Warning message:
> In matrix(1:12, 0, 0) : data length exceeds size of matrix
>
>> #EXAMPLE 2
>> #don't like this
>> matrix (numeric (), 2, 3)
> [,1] [,2] [,3]
> [1,] NA NA NA
> [2,] NA NA NA
>
> The first example is what I would expect, but is inconsistent with the
> previous examples.
> (Because zero is a valid multiple of twelve).
>
> I dislike the second example with recycling of a zerolength vector.
> This *is* covered in the help file, but also seems inconsistent with
> the previous examples.
> (Because two and three are not valid multiples of zero).
>
> Also, I can't think of any reason why someone would want to construct
> a matrix with extra data, and then discard part of it.
> And even if there was, then why not allow an arbitrarily longer length?
>
>
> On Mon, Feb 1, 2021 at 10:08 PM Martin Maechler
> < [hidden email]> wrote:
>>
>>>>>>> Abby Spurdle (/əˈbi/)
>>>>>>> on Mon, 1 Feb 2021 19:50:32 +1300 writes:
>>
>>> I'm a little surprised that the following doesn't trigger an error or a warning.
>>> matrix (1:256, 8, 8)
>>
>>> The help file says that the main argument is recycled, if it's too short.
>>> But doesn't say what happens if it's too long.
>>
>> It's somewhat subtler than one may assume :
>>
>>> matrix(1:9, 2,3)
>> [,1] [,2] [,3]
>> [1,] 1 3 5
>> [2,] 2 4 6
>> Warning message:
>> In matrix(1:9, 2, 3) :
>> data length [9] is not a submultiple or multiple of the number of rows [2]
>>
>>> matrix(1:8, 2,3)
>> [,1] [,2] [,3]
>> [1,] 1 3 5
>> [2,] 2 4 6
>> Warning message:
>> In matrix(1:8, 2, 3) :
>> data length [8] is not a submultiple or multiple of the number of columns [3]
>>
>>> matrix(1:12, 2,3)
>> [,1] [,2] [,3]
>> [1,] 1 3 5
>> [2,] 2 4 6
>>>
>>
>> So it looks to me the current behavior is quite on purpose.
>> Are you sure it's not documented at all when reading the docs
>> carefully? (I did *not*, just now).
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rdevel______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel


>>>>> Wolfgang Huber
>>>>> on Sat, 6 Feb 2021 19:49:11 +0100 writes:
> FWIW, I paste below a possible change to the warnings generating part of the do_matrix function in R/src/main/array.c that adds the kind of warning that Abby is asking for, and that IMHO would more often help users find bugs in their code than interfere with intended behaviour.
Thank you, Wolfgang.
Honestly, I had originally not wanted to get into this.
Functions that have been in use for longer than R exists (namely
in S / S+) without the need for changes is not something we'd
typically easily consider for a change.
{well, in the very old times, 0extent matrices did not exist,
so there were *some* changes during matrix()'s history ..}
But I think Abby and you have a point here.... so I have been
looking at your patch .. and from code reading had wondered
about another behavior which *was* not quite consistent,
and you eliminated completely with your patch:
op < options(warn=1)
for(n in 0:2) { cat(n,":\n") ; print(matrix(seq_len(n), 0, 0)) }
options(op)
shows (in released R)
0 :
<0 x 0 matrix>
1 :
<0 x 0 matrix>
2 :
Warning in matrix(seq_len(n), 0, 0) :
data length exceeds size of matrix
<0 x 0 matrix>
>
and it is really seems not logical that in matrix(x, 0,0),
'x' of size 1 and size (>=) 2 are treated differently.
For consistency, size 1 "should" also warn (but read on!)
After your patch, theres' no warning in any case .. which is
consistent, within the matrix(x, 0,0) situation but not consistent
with your proposal to warn more often when matrix(x, n,k) is
"obviously" using inconsistent dimensions.
When trying to let matrix(1, 0,0) also warn,
I quickly found that this produces many new warnings in our (R
base packages) examples and tests, notably from construction
such as
matrix(NA, 0, 3)
or
matrix(NA_character_, 0, 4)
which would all have to be rewritten as
matrix(logical() , 0, 3)
or matrix(character(), 0, 4)
The latter *is* more strictly selfconsistent, but do we really
want to impose such strictness ?
My conclusion: Not at this moment
OTOH, I'd readd the warning for length(x) > 1 which has
been there in the current code (but not your patch).
{{No need to send another patch, I've changed too many small
things already, for me to be useful}}
This still needs adaption of one of the regression tests of R
itself, and needs (at least) one of the tests of the Matrix package
(warning turned into error, from options(warn = 2)).
I'm willing to go that route, but I'm sure this will entail some
work by other package authors, too (and hence CRAN maintainers etc).
Opinions?
>> matrix (1:6, nrow = 2, ncol = 3)
>> matrix (1:12, nrow = 2, ncol = 3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
> Warning message:
> In matrix(1:12, nrow = 2, ncol = 3) :
> data length incompatible with size of matrix
>> matrix (1:7, nrow = 2, ncol = 3)
> Warning messages:
> 1: In matrix(1:7, nrow = 2, ncol = 3) :
> data length [7] is not a submultiple or multiple of the number of rows [2]
> 2: In matrix(1:7, nrow = 2, ncol = 3) :
> data length incompatible with size of matrix
>> matrix (1:8, nrow = 2, ncol = 3)
> Warning messages:
> 1: In matrix(1:8, nrow = 2, ncol = 3) :
> data length [8] is not a submultiple or multiple of the number of columns [3]
> 2: In matrix(1:8, nrow = 2, ncol = 3) :
> data length incompatible with size of matrix
>> matrix (1:6, nrow = 0, ncol = 0)
> <0 x 0 matrix>
>> matrix (numeric(0), nrow = 2, ncol = 3)
> [,1] [,2] [,3]
> [1,] NA NA NA
> [2,] NA NA NA
>> matrix(1:2, ncol = 8)
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
> [1,] 1 2 1 2 1 2 1 2
> It would be nice to combine the new warning with that about “...not a submultiple or multiple…” into a single warning, if appropriate (as in two of the examples above), but that would require bigger surgery way above my payscale.
I agree that in those cases it should only show one warning, and
to keep things simple I'd say it should just be the 2nd of those
above or more precisely (I have to check if that's correct)
"data length larger than size of matrix"
Martin
> Kind regards
> Wolfgang Huber
> Index: array.c
> ===================================================================
>  array.c (revision 79951)
> +++ array.c (working copy)
> @@ 133,18 +133,19 @@
> nc = (int) ceil((double) lendat / (double) nr);
> }
>  if(lendat > 0) {
> + if (lendat > 1) {
> R_xlen_t nrc = (R_xlen_t) nr * nc;
>  if (lendat > 1 && nrc % lendat != 0) {
> + if ((nrc % lendat) != 0) {
> if (((lendat > nr) && (lendat / nr) * nr != lendat) 
> ((lendat < nr) && (nr / lendat) * lendat != nr))
> warning(_("data length [%d] is not a submultiple or multiple of the number of rows [%d]"), lendat, nr);
> else if (((lendat > nc) && (lendat / nc) * nc != lendat) 
> ((lendat < nc) && (nc / lendat) * lendat != nc))
>  warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
>  }
>  else if ((lendat > 1) && (nrc == 0)){
> + warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
> + if (nrc == 0)
> warning(_("data length exceeds size of matrix"));
> + if (nrc != lendat)
> + warning(_("data length incompatible with size of matrix"));
> }
> }
> 
> // And here, for easy checking that part of the code in the new form:
> if (lendat > 1) {
> R_xlen_t nrc = (R_xlen_t) nr * nc;
> if ((nrc % lendat) != 0) {
> if (((lendat > nr) && (lendat / nr) * nr != lendat) 
> ((lendat < nr) && (nr / lendat) * lendat != nr))
> warning(_("data length [%d] is not a submultiple or multiple of the number of rows [%d]"), lendat, nr);
> else if (((lendat > nc) && (lendat / nc) * nc != lendat) 
> ((lendat < nc) && (nc / lendat) * lendat != nc))
> warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
> if (nrc == 0)
> warning(_("data length exceeds size of matrix"));
> if (nrc != lendat)
> warning(_("data length incompatible with size of matrix"));
> }
> }
>> Il giorno 2feb2021, alle ore 00:27, Abby Spurdle (/əˈbi/) < [hidden email]> ha scritto:
>>
>> So, does that mean that a clean result is contingent on the length of
>> the data being a multiple of both the number of rows and columns?
>>
>> However, this rule is not straightforward.
>>
>>> #EXAMPLE 1
>>> #what I would expect
>>> matrix (1:12, 0, 0)
>> <0 x 0 matrix>
>> Warning message:
>> In matrix(1:12, 0, 0) : data length exceeds size of matrix
>>
>>> #EXAMPLE 2
>>> #don't like this
>>> matrix (numeric (), 2, 3)
>> [,1] [,2] [,3]
>> [1,] NA NA NA
>> [2,] NA NA NA
>>
>> The first example is what I would expect, but is inconsistent with the
>> previous examples.
>> (Because zero is a valid multiple of twelve).
>>
>> I dislike the second example with recycling of a zerolength vector.
>> This *is* covered in the help file, but also seems inconsistent with
>> the previous examples.
>> (Because two and three are not valid multiples of zero).
>>
>> Also, I can't think of any reason why someone would want to construct
>> a matrix with extra data, and then discard part of it.
>> And even if there was, then why not allow an arbitrarily longer length?
>>
>>
>> On Mon, Feb 1, 2021 at 10:08 PM Martin Maechler
>> < [hidden email]> wrote:
>>>
>>>>>>>> Abby Spurdle (/əˈbi/)
>>>>>>>> on Mon, 1 Feb 2021 19:50:32 +1300 writes:
>>>
>>>> I'm a little surprised that the following doesn't trigger an error or a warning.
>>>> matrix (1:256, 8, 8)
>>>
>>>> The help file says that the main argument is recycled, if it's too short.
>>>> But doesn't say what happens if it's too long.
>>>
>>> It's somewhat subtler than one may assume :
>>>
>>>> matrix(1:9, 2,3)
>>> [,1] [,2] [,3]
>>> [1,] 1 3 5
>>> [2,] 2 4 6
>>> Warning message:
>>> In matrix(1:9, 2, 3) :
>>> data length [9] is not a submultiple or multiple of the number of rows [2]
>>>
>>>> matrix(1:8, 2,3)
>>> [,1] [,2] [,3]
>>> [1,] 1 3 5
>>> [2,] 2 4 6
>>> Warning message:
>>> In matrix(1:8, 2, 3) :
>>> data length [8] is not a submultiple or multiple of the number of columns [3]
>>>
>>>> matrix(1:12, 2,3)
>>> [,1] [,2] [,3]
>>> [1,] 1 3 5
>>> [2,] 2 4 6
>>>>
>>>
>>> So it looks to me the current behavior is quite on purpose.
>>> Are you sure it's not documented at all when reading the docs
>>> carefully? (I did *not*, just now).
>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/rdevel______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel


Hi Martin
Thank you! I very much understand your reservations and know it was a bit cheeky to poke.
I agree that in those cases where my (naive) patch results in two warnings, keeping only the new one would better.
No strong opinion about the case where either ncol or nrow is 0. Maybe a compromise would be to live with the inconsistency of only warning if length(data)>1, in order to allow legacy code such as matrix(NA, 0, 3). Assuming that the main objective is not consistency but more robust users’ R code.
Btw if one were to refactor this properly (which I do not propose!), shouldn’t `matrix` just be a wrapper to `array`, whose added value is the inference of missing `nrow` and `ncol` values and dealing with `byrow`?
Kind regards, and thanks again
Wolfgang
> Il giorno 9feb2021, alle ore 11:58, Martin Maechler < [hidden email]> ha scritto:
>
>>>>>> Wolfgang Huber
>>>>>> on Sat, 6 Feb 2021 19:49:11 +0100 writes:
>
>> FWIW, I paste below a possible change to the warnings generating part of the do_matrix function in R/src/main/array.c that adds the kind of warning that Abby is asking for, and that IMHO would more often help users find bugs in their code than interfere with intended behaviour.
>
> Thank you, Wolfgang.
> Honestly, I had originally not wanted to get into this.
>
> Functions that have been in use for longer than R exists (namely
> in S / S+) without the need for changes is not something we'd
> typically easily consider for a change.
> {well, in the very old times, 0extent matrices did not exist,
> so there were *some* changes during matrix()'s history ..}
>
> But I think Abby and you have a point here.... so I have been
> looking at your patch .. and from code reading had wondered
> about another behavior which *was* not quite consistent,
> and you eliminated completely with your patch:
>
> op < options(warn=1)
> for(n in 0:2) { cat(n,":\n") ; print(matrix(seq_len(n), 0, 0)) }
> options(op)
>
> shows (in released R)
>
> 0 :
> <0 x 0 matrix>
> 1 :
> <0 x 0 matrix>
> 2 :
> Warning in matrix(seq_len(n), 0, 0) :
> data length exceeds size of matrix
> <0 x 0 matrix>
>>
>
> and it is really seems not logical that in matrix(x, 0,0),
> 'x' of size 1 and size (>=) 2 are treated differently.
> For consistency, size 1 "should" also warn (but read on!)
>
> After your patch, theres' no warning in any case .. which is
> consistent, within the matrix(x, 0,0) situation but not consistent
> with your proposal to warn more often when matrix(x, n,k) is
> "obviously" using inconsistent dimensions.
>
> When trying to let matrix(1, 0,0) also warn,
> I quickly found that this produces many new warnings in our (R
> base packages) examples and tests, notably from construction
> such as
>
> matrix(NA, 0, 3)
> or
> matrix(NA_character_, 0, 4)
>
> which would all have to be rewritten as
>
> matrix(logical() , 0, 3)
> or matrix(character(), 0, 4)
>
> The latter *is* more strictly selfconsistent, but do we really
> want to impose such strictness ?
>
> My conclusion: Not at this moment
>
> OTOH, I'd readd the warning for length(x) > 1 which has
> been there in the current code (but not your patch).
> {{No need to send another patch, I've changed too many small
> things already, for me to be useful}}
>
> This still needs adaption of one of the regression tests of R
> itself, and needs (at least) one of the tests of the Matrix package
> (warning turned into error, from options(warn = 2)).
>
> I'm willing to go that route, but I'm sure this will entail some
> work by other package authors, too (and hence CRAN maintainers etc).
>
> Opinions?
>
>
>>> matrix (1:6, nrow = 2, ncol = 3)
>
>>> matrix (1:12, nrow = 2, ncol = 3)
>> [,1] [,2] [,3]
>> [1,] 1 3 5
>> [2,] 2 4 6
>> Warning message:
>> In matrix(1:12, nrow = 2, ncol = 3) :
>> data length incompatible with size of matrix
>
>>> matrix (1:7, nrow = 2, ncol = 3)
>> Warning messages:
>> 1: In matrix(1:7, nrow = 2, ncol = 3) :
>> data length [7] is not a submultiple or multiple of the number of rows [2]
>> 2: In matrix(1:7, nrow = 2, ncol = 3) :
>> data length incompatible with size of matrix
>
>>> matrix (1:8, nrow = 2, ncol = 3)
>> Warning messages:
>> 1: In matrix(1:8, nrow = 2, ncol = 3) :
>> data length [8] is not a submultiple or multiple of the number of columns [3]
>> 2: In matrix(1:8, nrow = 2, ncol = 3) :
>> data length incompatible with size of matrix
>
>>> matrix (1:6, nrow = 0, ncol = 0)
>> <0 x 0 matrix>
>>> matrix (numeric(0), nrow = 2, ncol = 3)
>> [,1] [,2] [,3]
>> [1,] NA NA NA
>> [2,] NA NA NA
>
>>> matrix(1:2, ncol = 8)
>> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
>> [1,] 1 2 1 2 1 2 1 2
>
>> It would be nice to combine the new warning with that about “...not a submultiple or multiple…” into a single warning, if appropriate (as in two of the examples above), but that would require bigger surgery way above my payscale.
>
> I agree that in those cases it should only show one warning, and
> to keep things simple I'd say it should just be the 2nd of those
> above or more precisely (I have to check if that's correct)
>
> "data length larger than size of matrix"
>
> Martin
>
>> Kind regards
>> Wolfgang Huber
>
>
>> Index: array.c
>> ===================================================================
>>  array.c (revision 79951)
>> +++ array.c (working copy)
>> @@ 133,18 +133,19 @@
>> nc = (int) ceil((double) lendat / (double) nr);
>> }
>
>>  if(lendat > 0) {
>> + if (lendat > 1) {
>> R_xlen_t nrc = (R_xlen_t) nr * nc;
>>  if (lendat > 1 && nrc % lendat != 0) {
>> + if ((nrc % lendat) != 0) {
>> if (((lendat > nr) && (lendat / nr) * nr != lendat) 
>> ((lendat < nr) && (nr / lendat) * lendat != nr))
>> warning(_("data length [%d] is not a submultiple or multiple of the number of rows [%d]"), lendat, nr);
>> else if (((lendat > nc) && (lendat / nc) * nc != lendat) 
>> ((lendat < nc) && (nc / lendat) * lendat != nc))
>>  warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
>>  }
>>  else if ((lendat > 1) && (nrc == 0)){
>> + warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
>> + if (nrc == 0)
>> warning(_("data length exceeds size of matrix"));
>> + if (nrc != lendat)
>> + warning(_("data length incompatible with size of matrix"));
>> }
>> }
>
>
>> 
>> // And here, for easy checking that part of the code in the new form:
>> if (lendat > 1) {
>> R_xlen_t nrc = (R_xlen_t) nr * nc;
>> if ((nrc % lendat) != 0) {
>> if (((lendat > nr) && (lendat / nr) * nr != lendat) 
>> ((lendat < nr) && (nr / lendat) * lendat != nr))
>> warning(_("data length [%d] is not a submultiple or multiple of the number of rows [%d]"), lendat, nr);
>> else if (((lendat > nc) && (lendat / nc) * nc != lendat) 
>> ((lendat < nc) && (nc / lendat) * lendat != nc))
>> warning(_("data length [%d] is not a submultiple or multiple of the number of columns [%d]"), lendat, nc);
>> if (nrc == 0)
>> warning(_("data length exceeds size of matrix"));
>> if (nrc != lendat)
>> warning(_("data length incompatible with size of matrix"));
>> }
>> }
>
>>> Il giorno 2feb2021, alle ore 00:27, Abby Spurdle (/əˈbi/) < [hidden email]> ha scritto:
>>>
>>> So, does that mean that a clean result is contingent on the length of
>>> the data being a multiple of both the number of rows and columns?
>>>
>>> However, this rule is not straightforward.
>>>
>>>> #EXAMPLE 1
>>>> #what I would expect
>>>> matrix (1:12, 0, 0)
>>> <0 x 0 matrix>
>>> Warning message:
>>> In matrix(1:12, 0, 0) : data length exceeds size of matrix
>>>
>>>> #EXAMPLE 2
>>>> #don't like this
>>>> matrix (numeric (), 2, 3)
>>> [,1] [,2] [,3]
>>> [1,] NA NA NA
>>> [2,] NA NA NA
>>>
>>> The first example is what I would expect, but is inconsistent with the
>>> previous examples.
>>> (Because zero is a valid multiple of twelve).
>>>
>>> I dislike the second example with recycling of a zerolength vector.
>>> This *is* covered in the help file, but also seems inconsistent with
>>> the previous examples.
>>> (Because two and three are not valid multiples of zero).
>>>
>>> Also, I can't think of any reason why someone would want to construct
>>> a matrix with extra data, and then discard part of it.
>>> And even if there was, then why not allow an arbitrarily longer length?
>>>
>>>
>>> On Mon, Feb 1, 2021 at 10:08 PM Martin Maechler
>>> < [hidden email]> wrote:
>>>>
>>>>>>>>> Abby Spurdle (/əˈbi/)
>>>>>>>>> on Mon, 1 Feb 2021 19:50:32 +1300 writes:
>>>>
>>>>> I'm a little surprised that the following doesn't trigger an error or a warning.
>>>>> matrix (1:256, 8, 8)
>>>>
>>>>> The help file says that the main argument is recycled, if it's too short.
>>>>> But doesn't say what happens if it's too long.
>>>>
>>>> It's somewhat subtler than one may assume :
>>>>
>>>>> matrix(1:9, 2,3)
>>>> [,1] [,2] [,3]
>>>> [1,] 1 3 5
>>>> [2,] 2 4 6
>>>> Warning message:
>>>> In matrix(1:9, 2, 3) :
>>>> data length [9] is not a submultiple or multiple of the number of rows [2]
>>>>
>>>>> matrix(1:8, 2,3)
>>>> [,1] [,2] [,3]
>>>> [1,] 1 3 5
>>>> [2,] 2 4 6
>>>> Warning message:
>>>> In matrix(1:8, 2, 3) :
>>>> data length [8] is not a submultiple or multiple of the number of columns [3]
>>>>
>>>>> matrix(1:12, 2,3)
>>>> [,1] [,2] [,3]
>>>> [1,] 1 3 5
>>>> [2,] 2 4 6
>>>>>
>>>>
>>>> So it looks to me the current behavior is quite on purpose.
>>>> Are you sure it's not documented at all when reading the docs
>>>> carefully? (I did *not*, just now).
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/rdevel>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rdevel______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel


Hi Martin,
It kind of does make sense to issue the warning when **recycling** (and
this is consistent with what happens with recycling in general):
> matrix(1:4, 6, 6)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 3 1 3 1 3
[2,] 2 4 2 4 2 4
[3,] 3 1 3 1 3 1
[4,] 4 2 4 2 4 2
[5,] 1 3 1 3 1 3
[6,] 2 4 2 4 2 4
> matrix(1:4, 5, 6)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 1 2
[2,] 2 3 4 1 2 3
[3,] 3 4 1 2 3 4
[4,] 4 1 2 3 4 1
[5,] 1 2 3 4 1 2
Warning message:
In matrix(1:4, 5, 6) :
data length [4] is not a submultiple or multiple of the number of
rows [5]
(Note that the warning is misleading. matrix() is happy to take data
with a length that is not a submultiple of the number of rows or cols
as long as it's a submultiple of the length of the matrix.)
However I'm not sure that **truncating** the data is desirable behavior:
> matrix(1:6, 1, 3)
[,1] [,2] [,3]
[1,] 1 2 3
> matrix(1:6, 1, 5)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
Warning message:
In matrix(1:6, 1, 5) :
data length [6] is not a submultiple or multiple of the number of
columns [5]
Maybe you get a warning sometimes, if you are lucky, but still.
Finally note that you never get any warning with array():
> array(1:4, c(5, 6))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 1 2
[2,] 2 3 4 1 2 3
[3,] 3 4 1 2 3 4
[4,] 4 1 2 3 4 1
[5,] 1 2 3 4 1 2
> array(1:6, c(1, 5))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
Cheers,
H.
On 2/1/21 1:08 AM, Martin Maechler wrote:
>>>>>> Abby Spurdle (/əˈbi/)
>>>>>> on Mon, 1 Feb 2021 19:50:32 +1300 writes:
>
> > I'm a little surprised that the following doesn't trigger an error or a warning.
> > matrix (1:256, 8, 8)
>
> > The help file says that the main argument is recycled, if it's too short.
> > But doesn't say what happens if it's too long.
>
> It's somewhat subtler than one may assume :
>
>> matrix(1:9, 2,3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
> Warning message:
> In matrix(1:9, 2, 3) :
> data length [9] is not a submultiple or multiple of the number of rows [2]
>
>> matrix(1:8, 2,3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
> Warning message:
> In matrix(1:8, 2, 3) :
> data length [8] is not a submultiple or multiple of the number of columns [3]
>
>> matrix(1:12, 2,3)
> [,1] [,2] [,3]
> [1,] 1 3 5
> [2,] 2 4 6
>>
>
> So it looks to me the current behavior is quite on purpose.
> Are you sure it's not documented at all when reading the docs
> carefully? (I did *not*, just now).
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/rdevel>

Hervé Pagès
Bioconductor Core Team
[hidden email]
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/rdevel

