brief update on the pipe operator in R-devel

luke-tierney
It turns out that allowing a bare function expression on the
right-hand side (RHS) of a pipe creates opportunities for confusion
and mistakes that are too risky. So we will be dropping support for
this from the pipe operator.

The case of a RHS call that wants to receive the LHS result in an
argument other than the first can be handled with just implicit first
argument passing along the lines of

     mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()

It was hoped that allowing a bare function expression would make this
more convenient, but it has issues as outlined below. We are exploring
some alternatives, and will hopefully settle on one soon after the
holidays.

The basic problem, pointed out in a comment on Twitter, is that in
expressions of the form

     1 |> \(x) x + 1 -> y
     1 |> \(x) x + 1 |> \(y) x + y

everything after the \(x) is parsed as part of the body of the
function.  So these are parsed along the lines of

     1 |> \(x) { x + 1 -> y }
     1 |> \(x) { x + 1 |> \(y) x + y }

In the first case the result is assigned to a (useless) local
variable.  Someone writing this is more likely to have intended to
assign the result to a global variable, as this would:

     (1 |> \(x) x + 1) -> y
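
The difference between the two parses can be checked without the pipe. This is a minimal sketch in base R only: ordinary function calls stand in for the bare-function pipe form, and `demo_env` is a made-up scratch environment playing the role of the workspace.

```r
# Hypothetical scratch environment standing in for the user's workspace.
demo_env <- new.env()

# Parse 1: the assignment is part of the function body, so 'y' exists only
# inside the call and is gone once the call returns.
evalq((function(x) { x + 1 -> y })(1), demo_env)
local_only <- !exists("y", envir = demo_env, inherits = FALSE)

# Parse 2: the call is grouped first, so 'y' is assigned in the workspace.
evalq((function(x) x + 1)(1) -> y, demo_env)
workspace_y <- get("y", envir = demo_env)

print(local_only)    # TRUE: nothing was assigned in the workspace
print(workspace_y)   # 2
```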

In the second case the 'x' in 'x + y' refers to the local variable 'x'
in the first RHS function. Someone writing this is more likely to have
meant

     (1 |> \(x) x + 1) |> \(y) x + y

with 'x' in 'x + y' now referring to a global variable:

     > x <- 2
     > 1 |> \(x) x + 1 |> \(y) x + y
     [1] 3
     > (1 |> \(x) x + 1) |> \(y) x + y
     [1] 4
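
The same two results can be reproduced with plain function calls, which may make the scoping easier to see. This is only a sketch of the two parses written out by hand; the names `nested` and `grouped` are invented for the illustration.

```r
x <- 2  # the global 'x' from the transcript above

# How the unparenthesized chain is parsed: the second function sits inside
# the body of the first, so its 'x' is the first function's argument (1).
nested <- (function(x) (function(y) x + y)(x + 1))(1)

# What was probably intended: the second function is created at top level,
# so its 'x' is the global one (2).
grouped <- (function(y) x + y)((function(x) x + 1)(1))

print(c(nested = nested, grouped = grouped))  # 3 and 4
```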

These issues arise with any approach in R that allows a bare function
expression on the RHS of a pipe operation. They also arise in other
languages with pipe operators. For example, here is the last example
in Julia:

     julia> x = 2
     2
     julia> 1 |> x -> x + 1 |> y -> x + y
     3
     julia> ( 1 |> x -> x + 1 ) |> y -> x + y
     4

Even though proper use of parentheses can work around these issues,
the likelihood of making mistakes that are hard to track down is too
high. So we will disallow the use of bare function expressions on the
right-hand side of a pipe.

Best,

luke

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [External] brief update on the pipe operator in R-devel

luke-tierney
After some discussions we've settled on a syntax of the form

     mtcars |> subset(cyl == 4) |> d => lm(mpg ~ disp, data = d)

to handle cases where the pipe lhs needs to be passed to an argument
other than the first of the function called on the rhs. This seems
to be a reasonable balance between making these non-standard cases
easy to see but still easy to write. This is now committed to R-devel.
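
The syntax above needs that R-devel build. As a rough sketch of the same computation for readers without it, assuming R >= 4.1 for `|>` and `\(d)`, the anonymous-function form from the earlier message can be compared with the plain nested call. It does not use `=>` itself, and `fit_pipe` and `fit_call` are names made up here.

```r
# Anonymous-function form from the first message (requires R >= 4.1).
fit_pipe <- mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()

# The plain nested call that computes the same model.
fit_call <- lm(mpg ~ disp, data = subset(mtcars, cyl == 4))

# Both fits use the same data and formula, so the coefficients agree.
print(all.equal(coef(fit_pipe), coef(fit_call)))
```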

Best,

luke

On Tue, 22 Dec 2020, [hidden email] wrote:

> It turns out that allowing a bare function expression on the
> right-hand side (RHS) of a pipe creates opportunities for confusion
> and mistakes that are too risky. So we will be dropping support for
> this from the pipe operator.

Re: [External] brief update on the pipe operator in R-devel

Iñaki Ucar
On Tue, 12 Jan 2021 at 20:23, <[hidden email]> wrote:
>
> After some discussions we've settled on a syntax of the form
>
>      mtcars |> subset(cyl == 4) |> d => lm(mpg ~ disp, data = d)
>
> to handle cases where the pipe lhs needs to be passed to an argument
> other than the first of the function called on the rhs. This seems
> to be a reasonable balance between making these non-standard cases
> easy to see but still easy to write. This is now committed to R-devel.

Interesting. Is the use of "d =>" restricted to pipelines? In other
words, I think that it shouldn't be equivalent to "function(d)", i.e.,
that this:

x <- d => lm(mpg ~ disp, data = d)

shouldn't work.

--
Iñaki Úcar


Re: [External] brief update on the pipe operator in R-devel

Dirk Eddelbuettel

On 12 January 2021 at 20:38, Iñaki Ucar wrote:
| Interesting. Is the use of "d =>" restricted to pipelines? In other
| words, I think that it shouldn't be equivalent to "function(d)", i.e.,
| that this:
|
| x <- d => lm(mpg ~ disp, data = d)
|
| shouldn't work.

Looks like your wish was already granted:

  > mtcars |> subset(cyl == 4) |> d => lm(mpg ~ disp, data = d)
 
  Call:
  lm(formula = mpg ~ disp, data = subset(mtcars, cyl == 4))
 
  Coefficients:
  (Intercept)         disp  
       40.872       -0.135  
 
  > d => lm(mpg ~ disp, data = d)
  Error in `=>`(d, lm(mpg ~ disp, data = d)) : could not find function "=>"
  > x <- d => lm(mpg ~ disp, data = d)
  Error in `=>`(d, lm(mpg ~ disp, data = d)) : could not find function "=>"
  >
 
Dirk

--
https://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: [External] brief update on the pipe operator in R-devel

Bill Dunlap-2
'=>' can be defined as a function.  E.g., it could be the logical "implies"
function:
    > `=>` <- function(x, y) !x | y
    > TRUE => FALSE
    [1] FALSE
    > FALSE => TRUE
    [1] TRUE
It might be nice then to have deparse() display it as an infix operator
instead of the current prefix:
    > deparse(quote(p => q))
    [1] "`=>`(p, q)"
There was a user who recently wrote asking for an infix operator like -> or
=> that would deparse nicely for use in some sort of model specification.

When used with |>, the parser will turn the |> and => into an ordinary
looking function call so deparsing is irrelevant.
    > deparse(quote(x |> tmp => f(7,arg2=tmp)))
    [1] "f(7, arg2 = x)"
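
For comparison, here is a sketch of what already deparses as infix in released R, assuming R >= 4.1 for the native pipe. The operator name `%=>%` is made up for the illustration; any `%any%` operator behaves the same way.

```r
# User-defined %any% operators are deparsed in infix form.
`%=>%` <- function(x, y) !x | y   # hypothetical name for logical "implies"

print(TRUE %=>% FALSE)            # FALSE
print(FALSE %=>% TRUE)            # TRUE
print(deparse(quote(p %=>% q)))   # "p %=>% q"

# The native pipe is rewritten at parse time, so the quoted expression is
# already an ordinary call with the LHS as first argument.
print(deparse(quote(x |> f(7, arg2 = 1))))   # "f(x, 7, arg2 = 1)"
```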

-Bill

On Tue, Jan 12, 2021 at 12:01 PM Dirk Eddelbuettel <[hidden email]> wrote:

> Looks like your wish was already granted:
>   > d => lm(mpg ~ disp, data = d)
>   Error in `=>`(d, lm(mpg ~ disp, data = d)) : could not find function "=>"

Re: [External] brief update on the pipe operator in R-devel

Duncan Murdoch-2
On 12/01/2021 3:52 p.m., Bill Dunlap wrote:

> '=>' can be defined as a function.  E.g., it could be the logical "implies"
> function:
>      > `=>` <- function(x, y) !x | y
>      > TRUE => FALSE
>      [1] FALSE
>      > FALSE => TRUE
>      [1] TRUE
> It might be nice then to have deparse() display it as an infix operator
> instead of the current prefix:
>      > deparse(quote(p => q))
>      [1] "`=>`(p, q)"
> There was a user who recently wrote asking for an infix operator like -> or
> => that would deparse nicely for use in some sort of model specification.

The precedence of it as an operator is determined by what makes sense in
the pipe construction.  Currently precedence appears to be


:: :::           access variables in a namespace
$ @              component / slot extraction
[ [[             indexing
^                exponentiation (right to left)
- +              unary minus and plus
:                sequence operator
%any%            special operators (including %% and %/%)
* /              multiply, divide
+ -              (binary) add, subtract
< > <= >= == !=  ordering and comparison
!                negation
& &&             and
| ||             or
=>               PIPE BIND
|>               PIPE
~                as in formulae
-> ->>           rightwards assignment
<- <<-           assignment (right to left)
=                assignment (right to left)
?                help (unary and binary)
(Most of this is taken from ?Syntax, but I added the new operators in
based on the gram.y file).  So

A & B => C & D

would appear to be parsed as

(A & B) => (C & D)

I think this also makes sense; do you?
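
This kind of grouping can be inspected in any R version by quoting an expression and looking at its pieces. The sketch below uses `~` as a stand-in, since it sits next to the new operators in the table and also binds more loosely than `&`; it says nothing about `=>` itself, which needs the R-devel build.

```r
# '~' binds more loosely than '&', so it ends up at the top of the parse tree.
e <- quote(A & B ~ C & D)

top_op <- as.character(e[[1]])   # the outermost operator
lhs    <- deparse(e[[2]])        # everything grouped to its left
rhs    <- deparse(e[[3]])        # everything grouped to its right

print(top_op)   # "~"
print(lhs)      # "A & B"
print(rhs)      # "C & D"
```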

Duncan Murdoch



Re: [External] brief update on the pipe operator in R-devel

Bill Dunlap-2
I agree that the precedence looks reasonable.  E.g.,

> str.language(quote(A > 0 & A<=B & B <= C => A <= C & 0 < C))
language: `=>`(A > 0 & A <= B & B <= C, A <= C ...
  symbol: =>
  language: A > 0 & A <= B & B <= C
    symbol: &
    language: A > 0 & A <= B
      symbol: &
      language: A > 0
        symbol: >
        symbol: A
        double: 0
      language: A <= B
        symbol: <=
        symbol: A
        symbol: B
    language: B <= C
      symbol: <=
      symbol: B
      symbol: C
  language: A <= C & 0 < C
    symbol: &
    language: A <= C
      symbol: <=
      symbol: A
      symbol: C
    language: 0 < C
      symbol: <
      double: 0
      symbol: C
> str.language(quote(data |> tmp1 => f1(x, arg1=tmp1) |> f2(y) |> tmp3 => f3(z, arg3=tmp3)))
language: f3(z, arg3 = f2(f1(x, arg1 = data), y))
  symbol: f3
  symbol: z
  language: arg3 = f2(f1(x, arg1 = data), y)
    symbol: f2
    language: f1(x, arg1 = data)
      symbol: f1
      symbol: x
      symbol: arg1 = data
    symbol: y

Where str.language is

str.language <- function(expr, name = "", indent = 0)
{
    # Truncate a deparsed string to width.cutoff characters, marking the cut with " ..."
    trim... <- function(string, width.cutoff) {
        if (nchar(string) > width.cutoff) {
            string <- sprintf("%.*s ...", width.cutoff-4, string)
        }
        string
    }
    # Print one line per node: indentation, type, optional argument name, deparsed code
    cat(sep="", rep("  ", indent), typeof(expr), ": ",
        if(length(name)==1 && nzchar(name)) { paste0(name, " = ") },
        trim...(deparse1(expr, width.cutoff=40), width.cutoff=40),
        "\n")
    # Recurse into calls and other recursive objects, one indent level deeper
    if (is.recursive(expr)) {
        if (!is.list(expr)) {
            expr <- as.list(expr)
        }
        nms <- names(expr)
        for (i in seq_along(expr)) {
            str.language(expr[[i]], name=nms[[i]], indent = indent + 1)
        }
    }
    invisible(expr)
}

On Tue, Jan 12, 2021 at 1:16 PM Duncan Murdoch <[hidden email]>
wrote:

> A & B => C & D
>
> would appear to be parsed as
>
> (A & B) => (C & D)
>
> I think this also makes sense; do you?
>
> Duncan Murdoch

Re: brief update on the pipe operator in R-devel

Gabor Grothendieck
In reply to this post by luke-tierney
These are documented but still seem like serious deficiencies:

> f <- function(x, y) x + 10*y
> 3 |> x => f(x, x)
Error in f(x, x) : pipe placeholder may only appear once

> 3 |> x => f(1+x, 1)
Error in f(1 + x, 1) :
  pipe placeholder must only appear as a top-level argument in the RHS call

Also note:

 ?"=>"
No documentation for ‘=>’ in specified packages and libraries:
you could try ‘??=>’

On Tue, Dec 22, 2020 at 5:28 PM <[hidden email]> wrote:

>
> It turns out that allowing a bare function expression on the
> right-hand side (RHS) of a pipe creates opportunities for confusion
> and mistakes that are too risky. So we will be dropping support for
> this from the pipe operator.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: brief update on the pipe operator in R-devel

R devel mailing list
Gabor,

Although it might be nice if all imagined cases worked, there are many ways to work around and get the results you want.

You may want to consider that it is easier to recognize the symbol you use (x in the examples) if it is alone, used exactly once, and in the list of function arguments.  If you want the x used multiple times, you can make a function that accepts the x once and then invokes another function and reuses the x as often as needed. Similarly for 1+x.
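
A minimal sketch of that workaround, assuming R >= 4.1 for `|>` and `\(x)`, and reusing the `f` from Gabor's message. The result names `twice` and `nested` are invented here.

```r
f <- function(x, y) x + 10 * y

# Wrap the RHS in a function that receives the piped value once and may then
# use it as often, and inside whatever expressions, as needed.
twice  <- 3 |> (\(x) f(x, x))()       # f(3, 3)     = 3 + 30
nested <- 3 |> (\(x) f(1 + x, 1))()   # f(1 + 3, 1) = 4 + 10

print(c(twice = twice, nested = nested))  # 33 and 14
```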

I do not know if the above choice was made to make it easier and faster to apply the above, or to avoid possible bad edge cases. Have you tested other ideas like:

        3 |> x => f(x=5)
Or
        3 |> x => f(x, y=x)

I mean ones where a default is supplied, not that it makes much sense here?

I am thinking of the concept of substitution as is often done for text or symbols. Often the substitution is done for the first instance found unless you specify you want a global change. In your examples, if only the first use of x would be replaced, the second naked x being left alone would be an error. If all instances were changed, what anomalies might happen? Giving a vector of length 1 containing the number 3 seems harmless enough to duplicate. But the pipeline can send all kinds of interesting data structures through including data.frames and arbitrary objects.


-----Original Message-----
From: R-devel <[hidden email]> On Behalf Of Gabor Grothendieck
Sent: Friday, January 15, 2021 7:28 AM
To: Tierney, Luke <[hidden email]>
Cc: [hidden email]
Subject: Re: [Rd] brief update on the pipe operator in R-devel

These are documented but still seem like serious deficiencies:

> f <- function(x, y) x + 10*y
> 3 |> x => f(x, x)
Error in f(x, x) : pipe placeholder may only appear once

> 3 |> x => f(1+x, 1)
Error in f(1 + x, 1) :
  pipe placeholder must only appear as a top-level argument in the RHS call

Also note:

 ?"=>"
No documentation for ‘=>’ in specified packages and libraries:
you could try ‘??=>’

Re: brief update on the pipe operator in R-devel

Bill Dunlap-2
If
    3 |> x => f(x, y=x)
were allowed then I think that
     runif(1) |> x => f(x, y=x)
would be parsed as
     f(runif(1), y=runif(1))
so runif(1) would be evaluated twice, leading to incorrect results from f().
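
The double evaluation can be made visible by counting calls. This is a sketch only, assuming R >= 4.1. The helper `draw()` and the counter `calls` are made up for the illustration, and `f` here is a stand-in that returns the difference of its arguments.

```r
calls <- 0
draw <- function() { calls <<- calls + 1; runif(1) }  # hypothetical helper
f <- function(x, y) x - y

# Textual substitution: the LHS expression appears, and is evaluated, twice.
substituted <- f(draw(), y = draw())
calls_after_substitution <- calls

# Binding to a function argument: the LHS is evaluated once and reused.
bound <- draw() |> (\(x) f(x, y = x))()
calls_after_binding <- calls - calls_after_substitution

print(c(calls_after_substitution, calls_after_binding))  # 2 and 1
print(bound)                                             # 0: same draw used twice
```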

-Bill

On Fri, Jan 15, 2021 at 2:16 PM Avi Gross via R-devel <[hidden email]>
wrote:

> Gabor,
>
> Although it might be nice if all imagined cases worked, there are many
> ways to work around and get the results you want.
>
> You may want to consider that it is easier to recognize the symbol you use
> (x in the examples) if it is alone and used only exactly once and in the
> list of function arguments.  If you want the x used multiple times, you can
> make a function that accepts the x once and then invokes another function
> and reuses the x as often as needed. Similarly for 1+x.
>
> I do not know if the above choice was made to make it easier and faster to
> apply the above, or to avoid possible bad edge cases. Have you tested other
> ideas like:
>
>         3 |> x => f(x=5)
> Or
>         3 |> x => f(x, y=x)
>
> I mean ones where a default is supplied, not that it makes much sense here?
> > in the first RHS function. Someone writing this is more likely to have
> > meant
> >
> >      (1 |> \(x) x + 1) |> \(y) x + y
> >
> > with 'x' in 'x + y' now referring to a global variable:
> >
> >      > x <- 2
> >      > 1 |> \(x) x + 1 |> \(y) x + y
> >      [1] 3
> >      > (1 |> \(x) x + 1) |> \(y) x + y
> >      [1] 4
> >
> > These issues arise with any approach in R that allows a bare function
> > expression on the RHS of a pipe operation. It also arises in other
> > languages with pipe operators. For example, here is the last example
> > in Julia:
> >
> >      julia> x = 2
> >      2
> >      julia> 1 |> x -> x + 1 |> y -> x + y
> >      3
> >      julia> ( 1 |> x -> x + 1 ) |> y -> x + y
> >      4
> >
> > Even though proper use of parentheses can work around these issues,
> > the likelihood of making mistakes that are hard to track down is too
> > high. So we will disallow the use of bare function expressions on the
> > right hand side of a pipe.
> >
> > Best,
> >
> > luke
> >
> > --
> > Luke Tierney
> > Ralph E. Wareham Professor of Mathematical Sciences
> > University of Iowa                  Phone:             319-335-3386
> > Department of Statistics and        Fax:               319-335-3017
> >     Actuarial Science
> > 241 Schaeffer Hall                  email:   [hidden email]
> > Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel