ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

16 messages

# Issue

'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
using R 3.5.1),

> c(TRUE, TRUE) || FALSE
[1] TRUE
> c(TRUE, FALSE) || FALSE
[1] TRUE
> c(TRUE, NA) || FALSE
[1] TRUE
> c(FALSE, TRUE) || FALSE
[1] FALSE

This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
same) and it also applies to 'x && y'.

Note also how the above truncation of 'x' is completely silent -
there's neither an error nor a warning being produced.


# Discussion/Suggestion

Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
mistake.  Either the code is written assuming 'x' and 'y' are scalars,
or there is a coding error and vectorized versions 'x | y' and 'x & y'
were intended.  Should 'x || y' always be considered a mistake if
'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
or an error?  For instance,

> x <- c(TRUE, TRUE)
> y <- FALSE
> x || y

Error in x || y : applying scalar operator || to non-scalar elements
Execution halted

What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
'x || y' returns 'NA' in such cases, e.g.

> logical(0) || c(FALSE, NA)
[1] NA
> logical(0) || logical(0)
[1] NA
> logical(0) && logical(0)
[1] NA

I don't know the background for this behavior, but I'm sure there is
an argument behind that one.  Maybe it's simply that '||' and '&&'
should always return a scalar logical and neither TRUE nor FALSE can
be returned.

/Henrik

PS. This is in the same vein as
https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
- in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
_R_CHECK_LENGTH_1_CONDITION_=true

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
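As a small illustration of the two intentions described in the post (a sketch, not part of the original message): if element-wise logic was meant, '|' gives one result per element; if a single condition was meant, the vector can be reduced explicitly before '||' is applied. The variable names are made up for the example, and the code never hands a longer vector to '||', so it does not depend on how a particular R version treats that case.

```r
x <- c(FALSE, TRUE)
y <- FALSE

# Element-wise OR: one result per element of x (y is recycled).
elementwise <- x | y

# Single condition: say explicitly how x collapses to length one.
any_true   <- any(x) || y
all_true   <- all(x) || y
first_only <- x[1] || y   # the silent 'x[1] || y' reading described in the post

print(elementwise)
print(c(any = any_true, all = all_true, first = first_only))
```

The three scalar variants give different answers for the same 'x', which is why a silent choice of the first element can hide a bug.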

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

I have to disagree, I think one of the advantages of '||' (or &&) is
the lazy evaluation, i.e. you can use the first condition to "not care"
about the second (and stop errors from being thrown).
So if I want to check if x is a length-one numeric with a value between
0 and 1, I can do 'class(x)=='numeric' && length(x)==1 && x>0 && x<1'.
In your proposal, having x=c(1,2) would throw an error or multiple
warnings.
Also code that relies on the second argument not being evaluated would
break, as we need to evaluate y in order to know length(y).

There may be some benefit in checking for length(x) only, though that
could also cause some false positives (e.g. 'x==-1 || length(x)==0'
would be a bit ugly, but not necessarily wrong, same for someone too
lazy to write x[1] instead of x).

And I don’t really see the advantage. The casting to length one is (I
think), a feature, not a bug. If I have/need a length one x, and a
length one y, why not use '|' and '&'? I have to admit I only use them
in if-statements, and if I need an error to be thrown when x and y are
not length one, I can use the shorter versions and then the if throws a
warning (or an error for a length-0 or NA result).

I get it that for someone just starting in R, the differences between |
and || can be confusing, but I guess that's just the price to pay for
having a vectorized language.

Best regards,
Emil Bode
Data-analyst
DANS: Netherlands Institute for Permanent Access to Digital Research Resources
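The guard pattern Emil describes can be sketched as follows. This is an illustration under assumptions, not his code: the helper name is_unit_fraction is hypothetical, is.numeric() stands in for his 'class(x)=='numeric'' test, and the !is.na(x) check is an addition so that the result is always TRUE or FALSE.

```r
# Hypothetical helper: TRUE only for a single number strictly between 0 and 1.
# Each test guards the ones to its right: when an earlier test is FALSE,
# '&&' stops and the later comparisons are never evaluated.
is_unit_fraction <- function(x) {
  is.numeric(x) && length(x) == 1L && !is.na(x) && x > 0 && x < 1
}

print(is_unit_fraction(0.5))
print(is_unit_fraction(c(1, 2)))   # stops at the length test
print(is_unit_fraction("a"))       # stops at the type test
```

Because the length test comes before the comparisons, 'x > 0' is only ever evaluated on a length-one value, so a length check inside '&&' that applies only to evaluated operands would leave this code unaffected.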

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by Henrik Bengtsson-5

I have to agree with Emil here. && and || are short circuited like in C
and C++. That means that

TRUE || c(TRUE, FALSE)
FALSE && c(TRUE, FALSE)

cannot give an error because the second part is never evaluated.
Throwing a warning or error for

c(TRUE, FALSE) || TRUE

would mean that the operator gives a different result depending on the
order of the objects, breaking the symmetry. Also that would be
undesirable.

Regarding logical(0): per the documentation, it is indeed so that ||,
&& and isTRUE always return a length-one logical vector. Hence the NA.

On a sidenote: there is no such thing as a scalar in R. What you call
scalar, is really a length-one vector. That seems like a detail, but is
important in understanding why this admittedly confusing behaviour
actually makes sense within the framework of R imho. I do understand
your objections and suggestions, but it would boil down to removing
short circuited operators from R.

My 2 cents.
Cheers
Joris

--
Joris Meys
Statistical consultant
Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by Joris FA Meys

On 08/30/2018 01:56 PM, Joris Meys wrote:
> I have to agree with Emil here. && and || are short circuited like in C and
> C++. That means that
>
> TRUE || c(TRUE, FALSE)
> FALSE && c(TRUE, FALSE)
>
> cannot give an error because the second part is never evaluated. Throwing a
> warning or error for
>
> c(TRUE, FALSE) || TRUE
>
> would mean that the operator gives a different result depending on the
> order of the objects, breaking the symmetry. Also that would be undesirable.

Note that `||` and `&&` have never been symmetric:

TRUE || stop() # returns TRUE
stop() || TRUE # returns an error
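The asymmetry can be checked without halting a script by wrapping the failing call in tryCatch(). This is a small sketch; the variable names and error messages are made up for the illustration.

```r
# The RHS is never evaluated because the LHS already decides the result.
lhs_decides <- TRUE || stop("not reached")

# The LHS is evaluated first, so its error is raised before TRUE is seen.
lhs_fails <- tryCatch(
  stop("LHS failed") || TRUE,
  error = function(e) conditionMessage(e)
)

# Same idea for &&: FALSE on the left skips the right-hand side.
and_skips <- FALSE && stop("not reached")

print(lhs_decides)
print(lhs_fails)
print(and_skips)
```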

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth <[hidden email]> wrote:
> Note that `||` and `&&` have never been symmetric:
>
> TRUE || stop() # returns TRUE
> stop() || TRUE # returns an error

Fair point. So the suggestion would be to check whether x is of length
1 and whether y is of length 1 only when needed. I.e.

c(TRUE,FALSE) || TRUE

would give an error and

TRUE || c(TRUE, FALSE)

would pass.

Thought about it a bit more, and I can't come up with a use case where
the first line must pass. So if the short circuiting remains and the
extra check only gives a small performance penalty, adding the error
could indeed make some bugs more obvious.

Cheers
Joris
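The "check only what is evaluated" idea can be sketched at user level with an ordinary function, since R evaluates function arguments lazily. The name strict_or and its error messages are hypothetical, and this is only an approximation of what a change inside '||' itself might do. Note that this sketch also rejects length-0 input, which goes further than some participants in the thread would want.

```r
# Hypothetical user-level version of '||' with a length check that is
# applied only to the operands that actually get evaluated.
strict_or <- function(x, y) {
  if (length(x) != 1L) stop("'x' must have length 1, not ", length(x))
  if (isTRUE(as.logical(x))) return(TRUE)   # short circuit: y stays unevaluated
  if (length(y) != 1L) stop("'y' must have length 1, not ", length(y))
  as.logical(x) || as.logical(y)
}

print(strict_or(TRUE, c(TRUE, FALSE)))   # passes: y is never looked at
print(tryCatch(strict_or(c(TRUE, FALSE), TRUE),
               error = function(e) conditionMessage(e)))
```

The first call passes and the second fails, matching the two cases in the message above.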

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Okay, I thought you always wanted to check the length, but if we can
only check what's evaluated I mostly agree.

I still think there's not much wrong with how length-0 logicals are
treated, as the return of NA in cases where the value matters is enough
warning I think, and I can imagine some code like my previous example
'x==-1 || length(x)==0', which wouldn't need a warning.
But we could do a check for length being >1.

Greetings, Emil

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by Henrik Bengtsson-5

I think this is an excellent idea as it eliminates a situation which
is almost certainly user error. Making it an error would break a small
amount of existing code (even if for the better), so perhaps it should
start as a warning, but be optionally upgraded to an error. It would
be nice to have a fixed date (R version) in the future when the
default will change to error.

In an ideal world, I think the following four cases should all return
the same error:

if (logical()) 1
#> Error in if (logical()) 1: argument is of length zero
if (c(TRUE, TRUE)) 1
#> Warning in if (c(TRUE, TRUE)) 1: the condition has length > 1 and only the
#> first element will be used
#> [1] 1

logical() || TRUE
#> [1] TRUE
c(TRUE, TRUE) || TRUE
#> [1] TRUE

i.e. I think that `if`, `&&`, and `||` should all check that their
input is a logical (or numeric) vector of length 1.

Hadley

--
http://hadley.nz

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

Should the following two functions always give the same result,
except for possible differences in the 'call' component of the warning
or error message?:

  f0 <- function(x, y) x || y
  f1 <- function(x, y) if (x) { TRUE } else { if (y) {TRUE } else { FALSE } }

And the same for the 'and' version?

  g0 <- function(x, y) x && y
  g1 <- function(x, y) if (x) { if (y) { TRUE } else { FALSE } } else { FALSE }

The proposal is to make them act the same when length(x) or length(y)
is not 1. Should they also act the same when x or y is NA?  'if (x)'
currently stops if is.na(x) and 'x||y' does not.  Or should we continue
with 'if' restricted to bi-valued logical and '||' and '&&' handling
tri-valued logic?

Bill Dunlap
TIBCO Software
wdunlap tibco.com

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by Joris FA Meys

>>>>> Joris Meys
>>>>>     on Thu, 30 Aug 2018 14:48:01 +0200 writes:

    > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
    > <[hidden email]> wrote:
    >> Note that `||` and `&&` have never been symmetric:
    >>
    >> TRUE || stop() # returns TRUE
    >> stop() || TRUE # returns an error

    > Fair point. So the suggestion would be to check whether x
    > is of length 1 and whether y is of length 1 only when
    > needed. I.e.

    > c(TRUE,FALSE) || TRUE

    > would give an error and

    > TRUE || c(TRUE, FALSE)

    > would pass.

    > Thought about it a bit more, and I can't come up with a
    > use case where the first line must pass. So if the short
    > circuiting remains and the extra check only gives a small
    > performance penalty, adding the error could indeed make
    > some bugs more obvious.

I agree "in theory".
Thank you, Henrik, for bringing it up!

In practice I think we should start having a warning signalled.
I have checked the source code in the mean time, and the check
is really very cheap
{ because it can/should be done after checking isNumber(): so
  then we know we have an atomic and can use XLENGTH() }

The 0-length case I don't think we should change as I do find
NA (is logical!) to be an appropriate logical answer.

Martin Maechler
ETH Zurich and R Core team.

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by R devel mailing list

Hello,

Inline.

At 16:44 on 30/08/2018, William Dunlap via R-devel wrote:
> Should the following two functions should always give the same result,
> except for possible differences in the 'call' component of the warning
> or error message?:
>
>    f0 <- function(x, y) x || y
>    f1 <- function(x, y) if (x) { TRUE } else { if (y) {TRUE } else { FALSE } }
>
> And the same for the 'and' version?
>
>    g0 <- function(x, y) x && y
>    g1 <- function(x, y) if (x) { if (y) { TRUE } else { FALSE } } else { FALSE }
>
> The proposal is to make them act the same when length(x) or length(y) is
> not 1.
> Should they also act the same when x or y is NA?  'if (x)' currently stops
> if is.na(x)
> and 'x||y' does not.  Or should we continue with 'if' restricted to
> bi-valued
> logical and '||' and '&&' handling tri-valued logic?

I expect R to continue to do

f0(FALSE, NA)    # [1] NA
f0(NA, FALSE)    # [1] NA
g0(TRUE, NA)    # [1] NA
g0(NA, TRUE)    # [1] NA

f1(FALSE, NA)
#Error in if (y) { : missing value where TRUE/FALSE needed
f1(NA, FALSE)
#Error in if (x) { : missing value where TRUE/FALSE needed
g1(TRUE, NA)
#Error in if (y) { : missing value where TRUE/FALSE needed
g1(NA, TRUE)
#Error in if (x) { : missing value where TRUE/FALSE needed

Please don't change this. There's more to the logical operators than
the operands' lengths. That issue needs to be fixed but it doesn't mean
a radical change should happen.
And the same goes for 'if'. Here the problem is completely different,
there's more to 'if' than '||' and '&&'. Any change should be done with
increased care. (Which I'm sure will, as always.)

Rui Barradas
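The difference between the three-valued '||' and the two-valued 'if' can be reproduced in one script by catching the error. The sketch below reuses f0 and f1 from the message being quoted; the helper value_or_error is hypothetical and exists only so the script keeps running after the error.

```r
f0 <- function(x, y) x || y
f1 <- function(x, y) if (x) { TRUE } else { if (y) { TRUE } else { FALSE } }

# Hypothetical helper: return the value, or the error message if the call fails.
value_or_error <- function(expr) {
  tryCatch(expr, error = function(e) paste("Error:", conditionMessage(e)))
}

or_result <- value_or_error(f0(FALSE, NA))   # '||' propagates the NA
if_result <- value_or_error(f1(FALSE, NA))   # 'if' refuses NA as a condition

print(or_result)
print(if_result)
```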

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by Martin Maechler

On Thu, Aug 30, 2018 at 5:58 PM Martin Maechler <[hidden email]> wrote:
> I agree "in theory".
> Thank you, Henrik, for bringing it up!
>
> In practice I think we should start having a warning signalled.

I agree. I wouldn't know who would count on the automatic selection of
the first value, but better safe than sorry.

> I have checked the source code in the mean time, and the check
> is really very cheap
> { because it can/should be done after checking isNumber(): so
>   then we know we have an atomic and can use XLENGTH() }

That was my idea as well after going through the source code. I didn't
want to state it as I don't know enough of the code base and couldn't
see if there were complications I missed. Thank you for confirming!

Cheers
Joris

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

In reply to this post by Martin Maechler

On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler <[hidden email]> wrote:
>
> >>>>> Joris Meys
> >>>>>     on Thu, 30 Aug 2018 14:48:01 +0200 writes:
>
>     > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
>     > <[hidden email]> wrote:
>     >> Note that `||` and `&&` have never been symmetric:
>     >>
>     >> TRUE || stop() # returns TRUE
>     >> stop() || TRUE # returns an error
>     >>
>     > Fair point. So the suggestion would be to check whether x
>     > is of length 1 and whether y is of length 1 only when
>     > needed. I.e.
>
>     > c(TRUE,FALSE) || TRUE
>
>     > would give an error and
>
>     > TRUE || c(TRUE, FALSE)
>
>     > would pass.
>
>     > Thought about it a bit more, and I can't come up with a
>     > use case where the first line must pass. So if the short
>     > circuiting remains and the extra check only gives a small
>     > performance penalty, adding the error could indeed make
>     > some bugs more obvious.
>
> I agree "in theory".
> Thank you, Henrik, for bringing it up!
>
> In practice I think we should start having a warning signalled.
> I have checked the source code in the mean time, and the check
> is really very cheap
> { because it can/should be done after checking isNumber(): so
>   then we know we have an atomic and can use XLENGTH() }
>
>
> The 0-length case I don't think we should change as I do find
> NA (is logical!) to be an appropriate logical answer.

Can you explain your reasoning a bit more here? I'd like to understand the general principle, because from my perspective it's more parsimonious to say that the inputs to || and && must be length 1, rather than to say that inputs could be length 0 or length 1, and in the length 0 case they are replaced with NA.

Hadley

--
http://hadley.nz
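The "check each side only when needed" behaviour quoted above can be sketched in plain R. The helper below, `strict_or()`, is hypothetical: it is not part of base R and is not the change to the C sources discussed in this thread. It assumes logical inputs and relies on R's lazy evaluation of function arguments to keep the short-circuiting.

```r
# Hypothetical helper (not base R): a scalar OR that checks the length of
# each operand only when that operand is actually needed.
strict_or <- function(x, y) {
  if (length(x) != 1L) {
    stop("LHS of strict_or() must have length 1, got length ", length(x))
  }
  # Short-circuit: y is an unevaluated promise and is never forced here.
  if (isTRUE(x)) {
    return(TRUE)
  }
  # The RHS is needed now, so force it and check its length.
  if (length(y) != 1L) {
    stop("RHS of strict_or() must have length 1, got length ", length(y))
  }
  x || y
}

strict_or(TRUE, stop("never evaluated"))  # RHS is never forced
strict_or(TRUE, c(TRUE, FALSE))           # passes: RHS is not needed
try(strict_or(c(TRUE, FALSE), TRUE))      # error: LHS has length 2
```

With this ordering a non-scalar RHS is only reported when the LHS did not already decide the result, which mirrors the asymmetry between the two example calls in the quoted text.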

Re: ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

On 30/08/2018, 20:15, "R-devel on behalf of Hadley Wickham" <[hidden email] on behalf of [hidden email]> wrote:

    On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
    <[hidden email]> wrote:
    > [...]
    > The 0-length case I don't think we should change as I do find
    > NA (is logical!) to be an appropriate logical answer.

    Can you explain your reasoning a bit more here? I'd like to understand
    the general principle, because from my perspective it's more
    parsimonious to say that the inputs to || and && must be length 1,
    rather than to say that inputs could be length 0 or length 1, and in
    the length 0 case they are replaced with NA.

    Hadley

I would say the value NA would cause warnings later on that are easy to track down, so a return of NA is far less likely to cause problems than an unintended TRUE or FALSE. And I guess there would be some code reliant on 'logical(0) || TRUE' returning TRUE, which wouldn't necessarily be a mistake.

But I think it's hard to predict exactly how people are using functions. I personally can't imagine a situation where I'd use || or && outside an if-statement, so I'd rather keep the current behaviour, because I'm not sure whether I'm reliant on logical(0) || TRUE somewhere in my code (even though that would be ugly code, it's not wrong per se). But I could always rewrite it, so I believe it's more a question of how much would have to be rewritten.

Maybe implement it first in devel, to see how many people would complain?

Emil Bode
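A small illustration of the two zero-length cases mentioned above, as a sketch only. It assumes the zero-length behaviour reported in this thread for R 3.5.1; the variable names are made up for the example.

```r
# Zero-length LHS with FALSE on the right: the thread reports NA here.
cond <- logical(0) || FALSE

# An NA condition is rejected by if(), so the problem surfaces right away
# instead of silently taking one branch.
msg <- tryCatch(
  if (cond) "yes" else "no",
  error = function(e) conditionMessage(e)
)
print(msg)

# With TRUE on the right, the result is TRUE and the empty LHS goes unnoticed.
masked <- logical(0) || TRUE
print(masked)
```

The first case fails loudly at the point of use, while the second is the kind of code that would need rewriting if zero-length operands became an error.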