ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Henrik Bengtsson-5
# Issue

'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
using R 3.5.1),

> c(TRUE, TRUE) || FALSE
[1] TRUE
> c(TRUE, FALSE) || FALSE
[1] TRUE
> c(TRUE, NA) || FALSE
[1] TRUE
> c(FALSE, TRUE) || FALSE
[1] FALSE

This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
same) and it also applies to 'x && y'.

Note also how the above truncation of 'x' is completely silent -
there's neither an error nor a warning being produced.


# Discussion/Suggestion

Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
mistake.  Either the code is written assuming 'x' and 'y' are scalars,
or there is a coding error and vectorized versions 'x | y' and 'x & y'
were intended.  Should 'x || y' always be considered a mistake if
'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
or an error?  For instance,

> x <- c(TRUE, TRUE)
> y <- FALSE
> x || y

Error in x || y : applying scalar operator || to non-scalar elements
Execution halted

What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
'x || y' returns 'NA' in such cases, e.g.

> logical(0) || c(FALSE, NA)
[1] NA
> logical(0) || logical(0)
[1] NA
> logical(0) && logical(0)
[1] NA

I don't know the background for this behavior, but I'm sure there is
an argument behind that one.  Maybe it's simply that '||' and '&&'
should always return a scalar logical and neither TRUE nor FALSE can
be returned.
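
As a rough model of the R 3.5.1 results above (a sketch only, not the
actual implementation in the R sources), each operand can be thought of
as reduced to a single three-valued logical first, with a zero-length
operand mapping to NA.  The helper names as_scalar3() and old_or() are
made up for illustration, and the sketch ignores short-circuiting.

```r
# Hypothetical helper: reduce an operand to one three-valued logical,
# taking the first element and treating length zero as NA.
as_scalar3 <- function(v) {
  if (length(v) == 0L) NA else as.logical(v[[1L]])
}

# Hypothetical model of 'x || y' on the reduced operands; the vectorised
# '|' applied to two length-one values gives the three-valued result.
old_or <- function(x, y) as_scalar3(x) | as_scalar3(y)

old_or(logical(0), c(FALSE, NA))  # NA
old_or(logical(0), logical(0))    # NA
old_or(logical(0), TRUE)          # TRUE, since NA | TRUE is TRUE
```

Under this model NA is only returned when the other operand cannot
decide the result on its own.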

/Henrik

PS. This is in the same vein as
https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
- in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
_R_CHECK_LENGTH_1_CONDITION_=true

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Emil
I have to disagree, I think one of the advantages of '||' (or &&) is the lazy evaluation, i.e. you can use the first condition to "not care" about the second (and stop errors from being thrown).
So if I want to check if x is a length-one numeric with a value between 0 and 1, I can do 'class(x)=='numeric' && length(x)==1 && x>0 && x<1'.
In your proposal, having x=c(1,2) would throw an error or multiple warnings.
Also code that relies on the second argument not being evaluated would break, as we need to evaluate y in order to know length(y)
There may be some benefit in checking for length(x) only, though that could also cause some false positives (e.g. 'x==-1 || length(x)==0' would be a bit ugly, but not necessarily wrong, same for someone too lazy to write x[1] instead of x).

And I don’t really see the advantage. The casting to length one is (I think), a feature, not a bug. If I have/need a length one x, and a length one y, why not use '|' and '&'? I have to admit I only use them in if-statements, and if I need an error to be thrown when x and y are not length one, I can use the shorter versions and then the if throws a warning (or an error for a length-0 or NA result).

I get it that for someone just starting in R, the differences between | and || can be confusing, but I guess that's just the price to pay for having a vectorized language.

Best regards,
Emil Bode
 



Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Joris FA Meys
In reply to this post by Henrik Bengtsson-5
I have to agree with Emil here. && and || are short circuited like in C and
C++. That means that

TRUE || c(TRUE, FALSE)
FALSE && c(TRUE, FALSE)

cannot give an error because the second part is never evaluated. Throwing a
warning or error for

c(TRUE, FALSE) || TRUE

would mean that the operator gives a different result depending on the
order of the operands, breaking the symmetry. That would also be undesirable.

Regarding logical(0): per the documentation, it is indeed so that ||, &&
and isTRUE always return a length-one logical vector. Hence the NA.

On a sidenote: there is no such thing as a scalar in R. What you call
scalar, is really a length-one vector. That seems like a detail, but is
important in understanding why this admittedly confusing behaviour actually
makes sense within the framework of R imho. I do understand your objections
and suggestions, but it would boil down to removing short circuited
operators from R.

My 2 cents.
Cheers
Joris




Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Dénes Tóth-2
In reply to this post by Emil
Hi,

I absolutely second Henrik's suggestion.

On 08/30/2018 01:09 PM, Emil Bode wrote:
> I have to disagree, I think one of the advantages of '||' (or &&) is the lazy evaluation, i.e. you can use the first condition to "not care" about the second (and stop errors from being thrown).

I do not think Henrik's proposal implies that both arguments of `||` or
`&&` should be evaluated before the evaluation of the condition. It
implies that if an argument is evaluated, and its length does not equal
one, it should return an error instead of the silent truncation of the
argument.
So your argument is orthogonal to the issue.
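
To illustrate with Emil's own example (a sketch; the function name
is_unit_value() is made up, and is.numeric() is swapped in for
class(x) == 'numeric'): each later test is only evaluated when all
earlier ones are TRUE, so the comparisons never see a vector whose
length differs from one, and a length check on evaluated arguments
would not fire here.

```r
# Hypothetical validator: the type and length tests guard the comparisons.
is_unit_value <- function(x) {
  is.numeric(x) && length(x) == 1L && x > 0 && x < 1
}

is_unit_value(0.5)      # TRUE
is_unit_value(c(1, 2))  # FALSE; stops at length(x) == 1L
is_unit_value("a")      # FALSE; stops at is.numeric(x)
```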

> So if I want to check if x is a length-one numeric with a value between 0 and 1, I can do 'class(x)=='numeric' && length(x)==1 && x>0 && x<1'.
> In your proposal, having x=c(1,2) would throw an error or multiple warnings.
> Also code that relies on the second argument not being evaluated would break, as we need to evaluate y in order to know length(y)
> There may be some benefit in checking for length(x) only, though that could also cause some false positives (e.g. 'x==-1 || length(x)==0' would be a bit ugly, but not necessarily wrong, same for someone too lazy to write x[1] instead of x).
>
> And I don’t really see the advantage. The casting to length one is (I think), a feature, not a bug. If I have/need a length one x, and a length one y, why not use '|' and '&'? I have to admit I only use them in if-statements, and if I need an error to be thrown when x and y are not length one, I can use the shorter versions and then the if throws a warning (or an error for a length-0 or NA result).
>
> I get it that for someone just starting in R, the differences between | and || can be confusing, but I guess that's just the price to pay for having a vectorized language.

I have used R for about 10 years, and regularly use `||` and `&&` for
their standard purpose (the same as in most programming languages, that
is, not evaluating all arguments when that is not required to decide
whether the condition is TRUE). I cannot recall a single case where I
wanted to use them to test whether the *first* elements of vectors
fulfill the given condition.

However, I regularly write `||` or `&&` by mistake when I actually want
to write `|` or `&`, and have no chance to spot the error because of the
silent truncation of the arguments.
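
A small sketch of the two intentions, with made-up data: the vectorised
`|` gives one result per element, and when a single TRUE/FALSE is needed
for a condition, the vector can be reduced explicitly with any() or
all() rather than relying on truncation.

```r
x <- c(-1, 0.5, 2)
y <- c(10, 20, 30)

# Vectorised: one result per element, suitable for subsetting.
out_of_range <- x < 0 | x > 1
y[out_of_range]

# Scalar: reduce the vector explicitly before using it in a condition.
if (any(out_of_range)) message("some values are outside [0, 1]")
```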


Regards,
Denes




Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Dénes Tóth-2
In reply to this post by Joris FA Meys


On 08/30/2018 01:56 PM, Joris Meys wrote:

> I have to agree with Emil here. && and || are short circuited like in C and
> C++. That means that
>
> TRUE || c(TRUE, FALSE)
> FALSE && c(TRUE, FALSE)
>
> cannot give an error because the second part is never evaluated. Throwing a
> warning or error for
>
> c(TRUE, FALSE) || TRUE
>
> would mean that the operator gives a different result depending on the
> order of the objects, breaking the symmetry. Also that would be undesirable.

Note that `||` and `&&` have never been symmetric:

TRUE || stop() # returns TRUE
stop() || TRUE # returns an error
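
The same asymmetry can be made visible with a side effect. This is only
a sketch; rhs() and the flag rhs_evaluated are names invented for the
illustration.

```r
rhs_evaluated <- FALSE
rhs <- function() {
  rhs_evaluated <<- TRUE  # record that the right-hand side was evaluated
  TRUE
}

res1 <- TRUE || rhs()     # left side decides; rhs() is not called
rhs_evaluated             # FALSE

res2 <- FALSE || rhs()    # left side does not decide; rhs() is called
rhs_evaluated             # TRUE

# With the operands swapped, the error is raised before '||' can
# short-circuit.
res3 <- tryCatch(stop("boom") || TRUE,
                 error = function(e) conditionMessage(e))
res3                      # "boom"
```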



Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Joris FA Meys
On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth <[hidden email]> wrote:

> Note that `||` and `&&` have never been symmetric:
>
> TRUE || stop() # returns TRUE
> stop() || TRUE # returns an error
>
>
Fair point. So the suggestion would be to check whether x is of length 1
and whether y is of length 1 only when needed. I.e.

c(TRUE,FALSE) || TRUE

would give an error and

TRUE || c(TRUE, FALSE)

would pass.

Thought about it a bit more, and I can't come up with a use case where the
first line must pass. So if the short circuiting remains and the extra
check only gives a small performance penalty, adding the error could indeed
make some bugs more obvious.
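
What "check only when needed" could look like at the R level is sketched
below. This is not a patch to `||` itself: scalar3() and strict_or() are
hypothetical names, and the sketch assumes that operands of length zero
keep giving NA and that anything as.logical() accepts is allowed.

```r
# Hypothetical helper: reject operands longer than one, map length zero to NA.
scalar3 <- function(v, name) {
  if (length(v) > 1L) {
    stop(sprintf("'%s' has length %d in a scalar logical operation",
                 name, length(v)), call. = FALSE)
  }
  if (length(v) == 0L) NA else as.logical(v)
}

# Hypothetical strict version of 'x || y'. R evaluates function arguments
# lazily, so 'y' is only forced (and only checked) when 'x' does not decide.
strict_or <- function(x, y) {
  lhs <- scalar3(x, "x")
  if (isTRUE(lhs)) return(TRUE)
  lhs | scalar3(y, "y")
}

strict_or(TRUE, c(TRUE, FALSE))  # TRUE; the right-hand side is never checked
strict_or(logical(0), FALSE)     # NA
tryCatch(strict_or(c(TRUE, FALSE), TRUE),
         error = function(e) conditionMessage(e))  # the length error message
```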

Cheers
Joris


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Emil
Okay, I thought you always wanted to check the length, but if we can only check what's evaluated I mostly agree.

I still think there's not much wrong with how length-0 logicals are treated, as the return of NA in cases where the value matters is enough warning I think, and I can imagine some code like my previous example 'x==-1 || length(x)==0', which wouldn't need a warning.

But we could do a check for length being >1

Greetings, Emil



Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

hadley wickham
In reply to this post by Henrik Bengtsson-5
I think this is an excellent idea as it eliminates a situation which
is almost certainly user error. Making it an error would break a small
amount of existing code (even if for the better), so perhaps it should
start as a warning, but be optionally upgraded to an error. It would
be nice to have a fixed date (R version) in the future when the
default will change to error.

In an ideal world, I think the following four cases should all return
the same error:

if (logical()) 1
#> Error in if (logical()) 1: argument is of length zero
if (c(TRUE, TRUE)) 1
#> Warning in if (c(TRUE, TRUE)) 1: the condition has length > 1 and only the
#> first element will be used
#> [1] 1

logical() || TRUE
#> [1] TRUE
c(TRUE, TRUE) || TRUE
#> [1] TRUE

i.e. I think that `if`, `&&`, and `||` should all check that their
input is a logical (or numeric) vector of length 1.
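
As a sketch, the check described here could be written as a predicate
like the following; is_scalar_condition() is a made-up name, not an
existing function.

```r
# Hypothetical predicate: TRUE only for a length-one logical or numeric vector.
is_scalar_condition <- function(x) {
  (is.logical(x) || is.numeric(x)) && length(x) == 1L
}

is_scalar_condition(TRUE)           # TRUE
is_scalar_condition(logical())      # FALSE
is_scalar_condition(c(TRUE, TRUE))  # FALSE
is_scalar_condition("TRUE")         # FALSE
```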

Hadley


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

R devel mailing list
Should the following two functions always give the same result,
except for possible differences in the 'call' component of the warning
or error message?

  f0 <- function(x, y) x || y
  f1 <- function(x, y) if (x) { TRUE } else { if (y) { TRUE } else { FALSE } }

And the same for the 'and' version?

  g0 <- function(x, y) x && y
  g1 <- function(x, y) if (x) { if (y) { TRUE } else { FALSE } } else { FALSE }

The proposal is to make them act the same when length(x) or length(y) is
not 1.  Should they also act the same when x or y is NA?  'if (x)' currently
stops if is.na(x) and 'x||y' does not.  Or should we continue with 'if'
restricted to bi-valued logical and '||' and '&&' handling tri-valued logic?



Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Martin Maechler
In reply to this post by Joris FA Meys
>>>>> Joris Meys
>>>>>     on Thu, 30 Aug 2018 14:48:01 +0200 writes:

    > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
    > <[hidden email]> wrote:
    >> Note that `||` and `&&` have never been symmetric:
    >>
    >> TRUE || stop() # returns TRUE
    >> stop() || TRUE # returns an error
    >>
    >>
    > Fair point. So the suggestion would be to check whether x
    > is of length 1 and whether y is of length 1 only when
    > needed. I.e.

    > c(TRUE,FALSE) || TRUE

    > would give an error and

    > TRUE || c(TRUE, FALSE)

    > would pass.

    > Thought about it a bit more, and I can't come up with a
    > use case where the first line must pass. So if the short
    > circuiting remains and the extra check only gives a small
    > performance penalty, adding the error could indeed make
    > some bugs more obvious.

I agree "in theory".
Thank you, Henrik, for bringing it up!

In practice I think we should start having a warning signalled.
I have checked the source code in the meantime, and the check
is really very cheap
{ because it can/should be done after checking isNumber(): so
  then we know we have an atomic and can use XLENGTH() }


The 0-length case I don't think we should change as I do find
NA (is logical!) to be an appropriate logical answer.

Martin Maechler
ETH Zurich and R Core team.

    > Cheers Joris

    > --
    > Joris Meys Statistical consultant


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Rui Barradas
In reply to this post by R devel mailing list
Hello,

Inline.

Às 16:44 de 30/08/2018, William Dunlap via R-devel escreveu:

> Should the following two functions always give the same result,
> except for possible differences in the 'call' component of the warning
> or error message?
>
>    f0 <- function(x, y) x || y
>    f1 <- function(x, y) if (x) { TRUE } else { if (y) { TRUE } else { FALSE } }
>
> And the same for the 'and' version?
>
>    g0 <- function(x, y) x && y
>    g1 <- function(x, y) if (x) { if (y) { TRUE } else { FALSE } } else { FALSE }
>
> The proposal is to make them act the same when length(x) or length(y) is
> not 1.
> Should they also act the same when x or y is NA?  'if (x)' currently stops
> if is.na(x) and 'x||y' does not.  Or should we continue with 'if' restricted
> to bi-valued logical and '||' and '&&' handling tri-valued logic?

I expect R to continue to do


f0(FALSE, NA)    # [1] NA
f0(NA, FALSE)    # [1] NA

g0(TRUE, NA)    # [1] NA
g0(NA, TRUE)    # [1] NA

f1(FALSE, NA)
#Error in if (y) { : missing value where TRUE/FALSE needed
f1(NA, FALSE)
#Error in if (x) { : missing value where TRUE/FALSE needed

g1(TRUE, NA)
#Error in if (y) { : missing value where TRUE/FALSE needed
g1(NA, TRUE)
#Error in if (x) { : missing value where TRUE/FALSE needed



Please don't change this.
There's more to the logical operators than the operands' lengths. The
length issue needs to be fixed, but that doesn't mean a radical change should
happen. The same goes for 'if'. There the problem is completely different:
there's more to 'if' than '||' and '&&'. Any change should be made with
extra care. (Which I'm sure it will be, as always.)

Rui Barradas


>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Aug 30, 2018 at 7:16 AM, Hadley Wickham <[hidden email]> wrote:
>
>> I think this is an excellent idea as it eliminates a situation which
>> is almost certainly user error. Making it an error would break a small
>> amount of existing code (even if for the better), so perhaps it should
>> start as a warning, but be optionally upgraded to an error. It would
>> be nice to have a fixed date (R version) in the future when the
>> default will change to error.
>>
>> In an ideal world, I think the following four cases should all return
>> the same error:
>>
>> if (logical()) 1
>> #> Error in if (logical()) 1: argument is of length zero
>> if (c(TRUE, TRUE)) 1
>> #> Warning in if (c(TRUE, TRUE)) 1: the condition has length > 1 and only
>> the
>> #> first element will be used
>> #> [1] 1
>>
>> logical() || TRUE
>> #> [1] TRUE
>> c(TRUE, TRUE) || TRUE
>> #> [1] TRUE
>>
>> i.e. I think that `if`, `&&`, and `||` should all check that their
>> input is a logical (or numeric) vector of length 1.
>>
>> Hadley


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Joris FA Meys
In reply to this post by Martin Maechler
On Thu, Aug 30, 2018 at 5:58 PM Martin Maechler <[hidden email]>
wrote:

>
> I agree "in theory".
> Thank you, Henrik, for bringing it up!
>
> In practice I think we should start having a warning signalled.
>

I agree. I don't know of anyone who would rely on the automatic selection of
the first value, but better safe than sorry.


> I have checked the source code in the mean time, and the check
> is really very cheap
> { because it can/should be done after checking isNumber(): so
>   then we know we have an atomic and can use XLENGTH() }
>
>
That was my idea as well after going through the source code. I didn't want
to state it as I don't know enough of the code base and couldn't see if
there were complications I missed. Thank you for confirming!

Cheers
Joris
--
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

hadley wickham
In reply to this post by Martin Maechler
On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
<[hidden email]> wrote:

> The 0-length case I don't think we should change as I do find
> NA (is logical!) to be an appropriate logical answer.

Can you explain your reasoning a bit more here? I'd like to understand
the general principle, because from my perspective it's more
parsimonious to say that the inputs to || and && must be length 1,
rather than to say that inputs could be length 0 or length 1, and in
the length 0 case they are replaced with NA.

Hadley

--
http://hadley.nz


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Emil

On 30/08/2018, 20:15, "R-devel on behalf of Hadley Wickham" <[hidden email] on behalf of [hidden email]> wrote:

    Can you explain your reasoning a bit more here? I'd like to understand
    the general principle, because from my perspective it's more
    parsimonious to say that the inputs to || and && must be length 1,
    rather than to say that inputs could be length 0 or length 1, and in
    the length 0 case they are replaced with NA.
   
    Hadley
   
I would say that a value of NA would cause warnings later on that are easy to track down, so returning NA is far less likely to cause problems than an unintended TRUE or FALSE. And I guess there would be some code reliant on 'logical(0) || TRUE' returning TRUE, which wouldn't necessarily be a mistake.

But I think it's hard to predict exactly how people are using functions. I personally can't imagine a situation where I'd use || or && outside an if-statement, so I'd rather keep the current behaviour, because I'm not sure whether I rely on logical(0) || TRUE somewhere in my code (even though that would be ugly code, it's not wrong per se).
But I could always rewrite it, so I believe it's more a question of how much would have to be rewritten. Maybe implement it first in devel, to see how many people would complain?

Emil Bode


   


Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Henrik Bengtsson-5
Thanks all for a great discussion.

I think we can introduce assertions for length(x) <= 1 (and produce a
warning/error if not) without changing the value of these &&/||
expressions.

In R 3.4.0, '_R_CHECK_LENGTH_1_CONDITION_=true' was introduced to turn
the warning "the condition has length > 1 and only the first element
will be used" in cases like 'if (c(TRUE, TRUE)) 42' into an error.  The
idea is to later make '_R_CHECK_LENGTH_1_CONDITION_=true' the new
default.  I guess that someday this will always produce an error.

Similarly, the test for this &&/|| issue could be controlled by
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=warn' and
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=err' and possibly have
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=true' default to 'warn' and later
'err'.

Changing the behavior of cases where length(x) == 0 is more likely to
break *some* code out there, and might require a separate
discussion/set of validations.  It's not unlikely that someone
actually relied on this to resolve to NA.  BTW, since it hasn't been
explicitly said, it's "logical" that we have TRUE && logical(0)
resolving to NA, because it currently behaves as TRUE[1] &&
logical(0)[1], which resolves to TRUE && NA => NA.  If a decision on
the zero-length case would delay fixing the length(x) > 1 case, I
would postpone the decision on the former.
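
The indexing argument above can be checked directly. The sketch below writes the '[1]' step out by hand, so it relies only on standard out-of-bounds indexing of a logical vector and not on how a particular R version treats zero-length operands of '&&' and '||' themselves. The variable names are illustrative.

```r
# Indexing past the end of a zero-length logical vector yields NA.
empty <- logical(0)
first <- empty[1]
print(first)           # NA
print(length(first))   # 1

# With the first-element step made explicit, the usual three-valued
# logic of && and || applies to the resulting NA.
print(TRUE  && first)  # NA    (TRUE && NA)
print(FALSE && first)  # FALSE (FALSE && NA)
print(TRUE  || first)  # TRUE  (TRUE || NA)
print(FALSE || first)  # NA    (FALSE || NA)
```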

/Henrik



Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Hugh Parsonage
In reply to this post by Emil
I would add my support for the change.

From a cursory survey of existing code, I'd say that usage like `class(x)
== 'y'` -- rather than inherits(x, 'y') or is.y -- is probably going to be
the major source of new warnings. So perhaps in the NEWS item it could be
noted as a clue for developers encountering nascent warnings.

Of course `if (class(x) == 'y')` already throws a warning, just not `if
(class(x) == 'y' && TRUE)`.
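
To illustrate why this pattern is a likely trigger: for an object with more than one class, `class(x) == 'y'` returns one logical per class rather than a scalar. A minimal sketch, using a made-up object whose class names ("child", "parent") are purely hypothetical:

```r
# A toy object with two classes; the class names are invented for this example.
x <- structure(list(), class = c("child", "parent"))

cmp <- class(x) == "parent"
print(cmp)                    # a length-2 logical: FALSE, TRUE
print(length(cmp))            # 2

# inherits() always returns a single TRUE/FALSE, whatever the class order.
print(inherits(x, "parent"))  # TRUE
print(inherits(x, "child"))   # TRUE
```

Under the first-element behaviour described in this thread, only `cmp[1]` (here FALSE) would be used by `&&`, even though `x` does inherit from "parent".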



On Thu, 30 Aug 2018 at 21:09 Emil Bode <[hidden email]> wrote:

> I have to disagree, I think one of the advantages of '||' (or &&) is the
> lazy evaluation, i.e. you can use the first condition to "not care" about
> the second (and stop errors from being thrown).
> So if I want to check if x is a length-one numeric with a value
> between 0 and 1, I can do 'class(x)=='numeric' && length(x)==1 && x>0 &&
> x<1'.
> In your proposal, having x=c(1,2) would throw an error or multiple
> warnings.
> Also code that relies on the second argument not being evaluated would
> break, as we need to evaluate y in order to know length(y)
> There may be some benefit in checking for length(x) only, though that
> could also cause some false positives (e.g. 'x==-1 || length(x)==0' would
> be a bit ugly, but not necessarily wrong, same for someone too lazy to
> write x[1] instead of x).
>
> And I don’t really see the advantage. The casting to length one is (I
> think), a feature, not a bug. If I have/need a length one x, and a length
> one y, why not use '|' and '&'? I have to admit I only use them in
> if-statements, and if I need an error to be thrown when x and y are not
> length one, I can use the shorter versions and then the if throws a warning
> (or an error for a length-0 or NA result).
>
> I get it that for someone just starting in R, the differences between |
> and || can be confusing, but I guess that's just the price to pay for
> having a vectorized language.
>
> Best regards,
> Emil Bode
