RFC: (in-principle) native unquoting for standard evaluation


Jonathan Carroll-2
(please be gentle, it's my first time)

I am interested in discussions (possibly reiterating past threads --
searching didn't turn up much) on the possibility of supporting standard
evaluation unquoting at the language level. This has been brought up in a
recent similar thread here [1] and on Twitter [2] where I proposed the
following desired (in-principle) syntax

    f <- function(col1, col2, new_col_name) {
        mtcars %>% mutate(@new_col_name = @col1 + @col2)
    }

or closer to home

    x <- 1:10; y <- "x"
    data.frame(z = @y)

where @ would be defined as a unary prefix operator which substitutes the
quoted variable name in-place, to allow more flexibility of NSE functions
within a programming context. This mechanism exists within MySQL [3] (and
likely other languages) and could potentially be extremely useful. Several
alternatives have been incorporated into packages (most recently work
on tidyeval) none of which appear to fully match the simplicity of the
above, and some of which cut a forceful path through the syntax tree.
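For comparison, here is a minimal sketch of how the second example can be approximated in base R today, without new syntax. It assumes bquote() and as.name() are an acceptable stand-in for the proposed @; it is not the proposed mechanism itself.

```r
x <- 1:10
y <- "x"

# as.name() turns the string "x" into the symbol x; bquote() substitutes it
# where .() appears, leaving the rest of the call unevaluated.
call <- bquote(data.frame(z = .(as.name(y))))
print(call)            # data.frame(z = x)

# Evaluating the constructed call is the step that @ would make implicit.
result <- eval(call)
```

The cost relative to the proposal is that the whole call has to be built and evaluated explicitly, rather than unquoting a single argument in place.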

The exact syntax isn't my concern at the moment (@ vs unquote() or other,
though the first requires user-supplied native prefix support within the
language, as per [1]) and neither is the exact way in which this would be
achieved (well above my pay grade). The practicality of @ being on the LHS
of `=` is also of a lesser concern (likely greater complexity) than the RHS.

I hear there exists (justified) reluctance to add new syntax to the
language, but I think this has sufficient merit (and a growing number of
workarounds) to warrant continued discussion.

With kindest regards,

- Jonathan.

[1] https://stat.ethz.ch/pipermail/r-devel/2017-March/073894.html
[2] https://twitter.com/carroll_jono/status/842142292253196290
[3] https://dev.mysql.com/doc/refman/5.7/en/user-variables.html

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: RFC: (in-principle) native unquoting for standard evaluation

Michael Lawrence-3
Interesting idea. Lazy and non-standard evaluation is going to happen; the
language needs a way to contain it.

I'll extend the proposal so that prefixing a formal argument with @ in
function() marks the argument as auto-quoting, so it arrives as a language
object without use of substitute(). Kind of like how '*' in C declares a
pointer and dereferences one.

    subset <- function(x, @subset, ...) { }

This should make it easier to implement such functions, simplify
compilation, and allow detection of potential quoting errors through static
analysis.
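A minimal sketch, in current R, of the substitute() step that an auto-quoting formal would make implicit. The name capture_subset is hypothetical, and all.vars() is used only to hint at the kind of static inspection a language object allows.

```r
# Hypothetical helper: returns the unevaluated `subset` argument.
capture_subset <- function(x, subset) {
  # substitute() retrieves the expression behind the promise without
  # evaluating it; a formal declared as `@subset` would arrive this way.
  substitute(subset)
}

expr <- capture_subset(mtcars, mpg > 30 & cyl == 4)
is.call(expr)      # TRUE: a language object, not a logical vector
all.vars(expr)     # the variable names the expression refers to
```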

Michael


Re: RFC: (in-principle) native unquoting for standard evaluation

Jonathan Carroll-2
I love the pointer analogy. Presumably the additional complication of scope
breaks this however. * itself would have been a nice operator for this were
it not prone to ambiguity (`a * *b` vs `a**b`, from which @ does not
suffer).

Would this extension require that function authors explicitly enable
auto-quoting support? I somewhat envisioned functions seeing the resolved
unquoted object (within their calling scope) so that they could retain
their standard definitions when not using @. In my mutate example, mutate
itself could simply be the NSE version, so

    mutate(mtcars, z = mpg)

would work as normal, but

    x = "mpg"
    mutate(mtcars, z = @x)

would produce the same result (x may be changing within a loop or be
defined through a formal argument). Here, @x would resolve to `mpg` and
mutate would retain the duty of resolving that to mtcars$mpg as per normal.

A separate SE version would not be required (as arguments could be set
programmatically), but an additional flexibility could be @ acting on a
string rather than an object for direct unquoting

    mutate(mtcars, z = @"mpg")

for when the name is known but NSE isn't desired (which would also assist
with the whole utils::globalVariables() vs CRAN checks concern).
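A rough sketch of the loop case described above, written in base R with transform() standing in for mutate() so that no package is needed. The helper name add_z is hypothetical, and bquote() with as.name() plays the part of the proposed @.

```r
# Hypothetical helper: add a column `z` copied from the column whose name
# is held in the string `x`, leaving transform() itself as the NSE function.
add_z <- function(data, x) {
  call <- bquote(transform(data, z = .(as.name(x))))
  eval(call)
}

# `x` changes on each iteration; transform() still resolves the symbol
# against the data frame as it normally would.
results <- list()
for (x in c("mpg", "hp")) {
  results[[x]] <- add_z(mtcars, x)
}
```

The literal-string form @"mpg" would correspond to as.name("mpg") in this sketch.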

Having a formal argument forcefully auto-unquote would prevent standard
usage unless there was a way to also disable it. Unless I'm missing an
angle (which I very likely am) wouldn't it be better to have the user
supply an @-prefixed argument and retain the connection to the calling
scope?

Apologies if I have any of that confused or there are better approaches. I
merely have a desire for this to work and am learning as much as possible
about "how" as I go.

Your comments are greatly appreciated.

- Jonathan.


Re: RFC: (in-principle) native unquoting for standard evaluation

Michael Lawrence-3
Not sure I totally understand what you wrote, but my proposal is somewhat
independent of the unquoting during the call (your proposal). Authors would
be free to either use auto-quoting or continue to rely on the substitute()
mechanism. Lazy evaluation wouldn't go away.



Re: RFC: (in-principle) native unquoting for standard evaluation

Gabriel Becker
In reply to this post by Jonathan Carroll-2
Jonathan,

Nice proposal.

I think these two uses for unary @ (your initial @ unary operator and
Michael's extension for use inside function declarations) synergize really
well. It could easily be that function owners can declare a parameter to
always quote, and function callers can mark their specific arguments to
behave in the way you describe. It would make @ mean two pretty different
things in these two contexts, but they aren't^ mixable, so I think that would
be ok. This also has a strong precedent in the * operator in C, where int
*a creates a pointer, and then *a + 1 uses the dereferenced value.

^ I think they're only not mixable provided that the `function` function
itself does not support your (Jonathan's) version of the operator, i.e.,
the ability to use variables' values to declare parameter names or default
values within the function declaration. (Actually I think it could be
supported for default values, just not parameter names, if we wanted to.) I
think that's reasonable though. I don't think we would need to support that.

One big question is whether you can do function(x, y, @...). The definition
of mutate() using Michael's extension of your proposal would require this.
This would be in keeping with the principle of the proposal, I think,
though it might (or might not) make the implementation more complicated.
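To make the @... question concrete, here is a minimal sketch of how quoted dots are captured in current R. The name mutate2 is hypothetical and this is not how dplyr implements mutate(); the eval(substitute(alist(...))) idiom stands in for what @... would hand over directly.

```r
# Hypothetical simplified mutate(): each named argument in ... is captured
# unevaluated, then evaluated with the data frame's columns in scope.
mutate2 <- function(.data, ...) {
  exprs <- eval(substitute(alist(...)))   # list of unevaluated expressions
  for (nm in names(exprs)) {
    # Columns are looked up first, then the caller's environment.
    .data[[nm]] <- eval(exprs[[nm]], .data, parent.frame())
  }
  .data
}

# Later expressions can refer to columns created by earlier ones.
out <- mutate2(mtcars, z = mpg * 2, w = z + 1)
```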

I wonder if it makes sense to have a formal ability to declare where the
NSE will take place in the function definition, perhaps (completely
spitballing) a unary ^ operator, so a simplified subset could literally be
defined as

    subset2 = function(^x, @cond) x[cond,]

Perhaps that's getting too clever, but it could be cool. Note it would be
optional. And we might even want a different operator for that,
since it changes what the @ modifier of the parameter does (your code gets
the result of the expression being evaluated in the ^ context, rather than
the language object). This would be, I imagine, immensely useful when
attempting to compile code that is NSE, even beyond labeling it as such via
the @ in function declarations.
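For reference, a sketch of what the subset2 definition above has to spell out in current R. The two commented steps are the ones the hypothetical @ and ^ markers would make declarative; the variable limit is introduced only for illustration.

```r
subset2 <- function(x, cond) {
  cond_expr <- substitute(cond)                # role of the hypothetical @cond
  rows <- eval(cond_expr, x, parent.frame())   # role of the hypothetical ^x
  x[rows, ]
}

# Columns come from the data frame; anything else (here `limit`) is found
# in the caller's environment.
limit <- 30
out <- subset2(mtcars, mpg > limit)
```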

Best,
~G



--
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research


Re: RFC: (in-principle) native unquoting for standard evaluation

hadley wickham
In reply to this post by Jonathan Carroll-2
What would you propose for the unquote-splice operator?

Hadley


--
http://hadley.nz


Re: RFC: (in-principle) native unquoting for standard evaluation

hadley wickham
In reply to this post by Michael Lawrence-3
Would this return a quosure? (i.e. a single sided formula that captures
both expression and environment). That's the data structure we've adopted
in tidyeval as it already has some built in support.
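A small base R sketch of why a one-sided formula can serve this purpose: it carries its expression together with the environment it was created in. This shows only the behaviour of base formulas, not the tidyeval API; make_quoted and offset are hypothetical names.

```r
make_quoted <- function() {
  offset <- 100
  ~ offset + 1          # one-sided formula created inside the function
}

q    <- make_quoted()
expr <- q[[2]]          # the captured expression: offset + 1
env  <- environment(q)  # the environment in which the formula was created

offset <- 0
in_captured_env <- eval(expr, env)   # uses offset = 100 from make_quoted()
in_current_env  <- eval(expr)        # uses offset = 0 from the current scope
```

A bare language object, as returned by substitute(), carries no such environment, which is the distinction the question is pointing at.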

Hadley


--
http://hadley.nz


Re: RFC: (in-principle) native unquoting for standard evaluation

Jonathan Carroll-2
Firstly, credit where due: the lazyeval NSE vignette [1] covers so many of
the angles that this proposal needs to address and is extremely well
written (even if it has been superseded). The @ prefix I'm proposing is a
drop-in replacement for `uq()` (as used in that vignette) but for which the
`f_eval()` and `~` steps would not be required by the author/user.

This is proposed as an admittedly naive suggestion which fails to account
for the subtleties raised in [1] such as unquoting of multiple arguments
and scope selection. I am hoping that the discussion can cover how best to
address those matters.

The significant hurdles (apart from implementation which I cannot speak to)
that are dealt with in lazyeval (and presumably tidyeval) seem to be:

- a prefix can be attached to only a single object, so the extra_args
example from [1] would not be possible. I'm not certain why the unquoting
of the variable would not still be possible with the form

    variable = "x"
    mean(@variable, na.rm = TRUE, trim = 0.9)

since I'm proposing that the call need not be a formula (I may be way off
on this interpretation).

- I am proposing that the new syntax be able to achieve the example

    f <- function(col1, col2, new_col_name) {
        mtcars %>% mutate(@new_col_name = @col1 + @col2)
    }

but this is ambiguous if there is, say, an object "mpg" within that
function scope. [1] handles this with .env and .data pronouns but this
doesn't seem possible with just a prefix. One solution may be to have @@
and @ representing these two options.
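Both hurdles can be sketched in base R, assuming bquote(), as.call() and eval() as stand-ins for the proposed syntax. The names extra_args, lookup, from_data and from_env are illustrative only, and the second part shows the ambiguity rather than a resolution of it.

```r
# 1. One symbol versus several arguments.
variable   <- "x"
extra_args <- list(na.rm = TRUE, trim = 0.9)
x <- c(1, 5, NA, 100)

# Unquoting a single name: one placeholder, one replacement.
single <- bquote(mean(.(as.name(variable)), na.rm = TRUE, trim = 0.9))

# Splicing: a list of arguments is spread into the call, which a prefix
# attached to one object cannot express.
spliced <- as.call(c(list(as.name("mean"), as.name(variable)), extra_args))

# 2. Data column versus local variable of the same name.
lookup <- function(col) {
  mpg  <- "local value"    # shadows the column name inside the function
  expr <- as.name(col)
  list(
    from_data = eval(expr, mtcars),          # data frame searched first
    from_env  = eval(expr, environment())    # function scope only
  )
}
res <- lookup("mpg")
```

The two elements of res correspond to the two lookups that a @ and @@ pair would need to distinguish.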

I appreciate the significant work that has gone into the tidyverse packages
which use NSE and my intention is not to downplay any of that. I would just
like to be able to use the language more efficiently, so native access to
the unquoting seems like a step forward.

Kindest regards,

- Jonathan.

[1] https://cran.r-project.org/web/packages/lazyeval/vignettes/lazyeval.html

On Sun, Mar 19, 2017 at 1:09 PM, Hadley Wickham <[hidden email]> wrote:

> Would this return a quosure? (i.e. a single sided formula that captures
> both expression and environment). That's the data structure we've adopted
> in tidyeval as it already has some built in support.
>
> Hadley
>
> On Friday, March 17, 2017, Michael Lawrence <[hidden email]>
> wrote:
>
>> Interesting idea. Lazy and non-standard evaluation is going to happen; the
>> language needs a way to contain it.
>>
>> I'll extend the proposal so that prefixing a formal argument with @ in
>> function() marks the argument as auto-quoting, so it arrives as a language
>> object without use of substitute(). Kind of like how '*' in C declares a
>> pointer and dereferences one.
>>
>> subset <- function(x, @subset, ...) { }
>>
>> This should make it easier to implement such functions, simplify
>> compilation, and allow detection of potential quoting errors through
>> static
>> analysis.
>>
>> Michael
>
>
> --
> http://hadley.nz
>


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: RFC: (in-principle) native unquoting for standard evaluation

Michael Lawrence-3
In reply to this post by hadley wickham
Yes, it would bind the language object to the environment, like an
R-level promise (but "promise" of course refers specifically to just
_lazy_ evaluation).

For the uqs() thing, expanding calls like that is somewhat orthogonal
to NSE. It would be nice in general to be able to write something like
mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
extra_args)). If we had that then uqs() would just be the combination
of unquote and expansion, i.e., mean(x, @extra_args...). The "..."
postfix would not work since it's still a valid symbol name, but we
could come up with something.
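
As a point of reference, the do.call() idiom mentioned above looks like this in current R. This is a small sketch with made-up data; `extra_args` follows the naming used in the thread, and the values are assumptions chosen only for illustration.

```r
x <- c(1, 2, 3, NA, 50)
extra_args <- list(na.rm = TRUE, trim = 0.2)

# Expansion by evaluation: do.call() spreads the list over the arguments.
spliced <- do.call(mean, c(list(x), extra_args))

# The call that the expansion is meant to be equivalent to.
direct <- mean(x, na.rm = TRUE, trim = 0.2)

# Expansion without evaluation: build the call object itself, then eval it.
expanded_call <- as.call(c(list(quote(mean), quote(x)), extra_args))
from_call <- eval(expanded_call)
```

The second form keeps `x` as a symbol in the call, which is closer to what an unquote-plus-expansion operator would need to produce.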

Michael


On Sat, Mar 18, 2017 at 7:39 PM, Hadley Wickham <[hidden email]> wrote:

> Would this return a quosure? (i.e. a single sided formula that captures both
> expression and environment). That's the data structure we've adopted in
> tidyeval as it already has some built in support.
>
> Hadley
>
>
> On Friday, March 17, 2017, Michael Lawrence <[hidden email]>
> wrote:
>>
>> Interesting idea. Lazy and non-standard evaluation is going to happen; the
>> language needs a way to contain it.
>>
>> I'll extend the proposal so that prefixing a formal argument with @ in
>> function() marks the argument as auto-quoting, so it arrives as a language
>> object without use of substitute(). Kind of like how '*' in C declares a
>> pointer and dereferences one.
>>
>> subset <- function(x, @subset, ...) { }
>>
>> This should make it easier to implement such functions, simplify
>> compilation, and allow detection of potential quoting errors through
>> static
>> analysis.
>>
>> Michael
>
>
>
> --
> http://hadley.nz


Re: RFC: (in-principle) native unquoting for standard evaluation

Radford Neal
In reply to this post by Jonathan Carroll-2
Michael Lawrence (as last in long series of posters)...

> Yes, it would bind the language object to the environment, like an
> R-level promise (but "promise" of course refers specifically to just
> _lazy_ evaluation).
>
> For the uqs() thing, expanding calls like that is somewhat orthogonal
> to NSE. It would be nice in general to be able to write something like
> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
> extra_args)). If we had that then uqs() would just be the combination
> of unquote and expansion, i.e., mean(x, @extra_args...). The "..."
> postfix would not work since it's still a valid symbol name, but we
> could come up with something.


I've been trying to follow this proposal, though without tracking down
all the tweets, etc. that are referenced.  I suspect I'm not the only
reader who isn't clear exactly what is being proposed.  I think a
detailed, self-contained proposal would be useful.

One thing I'm not clear on is whether the proposal would add anything
semantically beyond what the present "eval" and "substitute" functions
can do fairly easily.  If not, is there really any need for a slightly
more concise syntax?  Is it expected that the new syntax would be used
lots by ordinary users, or is it only for the convenience of people
who are writing fairly esoteric functions (which might then be used by
many)?  If the latter, it seems undesirable to me.

There is an opportunity cost to grabbing the presently-unused unary @
operator for this, in that it might otherwise be used for some other
extension.  For example, see the last five slides in my talk at
http://www.cs.utoronto.ca/~radford/ftp/R-lang-ext.pdf for a different
proposal for a new unary @ operator.  I'm not necessarily advocating
that particular use (my ideas in this respect are still undergoing
revisions), but the overall point is that there may well be several
good uses of a unary @ operator (and there aren't many other good
characters to use for a unary operator besides @).  It is unclear to
me that the current proposal is the highest-value use of @.

   Radford Neal


Re: RFC: (in-principle) native unquoting for standard evaluation

hadley wickham
On Mon, Mar 20, 2017 at 7:36 AM, Radford Neal <[hidden email]> wrote:

> Michael Lawrence (as last in long series of posters)...
>
>> Yes, it would bind the language object to the environment, like an
>> R-level promise (but "promise" of course refers specifically to just
>> _lazy_ evaluation).
>>
>> For the uqs() thing, expanding calls like that is somewhat orthogonal
>> to NSE. It would be nice in general to be able to write something like
>> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
>> extra_args)). If we had that then uqs() would just be the combination
>> of unquote and expansion, i.e., mean(x, @extra_args...). The "..."
>> postfix would not work since it's still a valid symbol name, but we
>> could come up with something.
>
>
> I've been trying to follow this proposal, though without tracking down
> all the tweets, etc. that are referenced.  I suspect I'm not the only
> reader who isn't clear exactly what is being proposed.  I think a
> detailed, self-contained proposal would be useful.

We have a working implementation (which I'm calling tidyeval) in
https://github.com/hadley/rlang, but we have yet to write it up. We'll
spend some time documenting since it seems to be of broader interest.

> One thing I'm not clear on is whether the proposal would add anything
> semantically beyond what the present "eval" and "substitute" functions
> can do fairly easily.  If not, is there really any need for a slightly
> more concise syntax?  Is it expected that the new syntax would be used
> lots by ordinary users, or is it only for the convenience of people
> who are writing fairly esoteric functions (which might then be used by
> many)?  If the latter, it seems undesirable to me.

I accidentally responded off list to Michael, but I think there are
three legs to "tidy" style of NSE:

1) capturing a quosure from a promise

2) quasiquotation (unquote + unquote-splice)

3) pronouns, so you can be explicit about where a variable should be
looked up (.data vs .env)

These are largely orthogonal, but I don't think you can solve the most
important NSE problems without all three. Just having 1) in base R
would be a big step forward.
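
To make leg 1) concrete, here is a rough base R sketch of what a quosure carries. It uses a one-sided formula created explicitly, which is an assumption of this example: capturing the same pair from a promise is exactly the part base R does not offer directly. `make_quosure()` is a hypothetical name.

```r
# A one-sided formula records both an expression and the environment
# in which it was created.
make_quosure <- function() {
  y <- 2        # local variable, not visible from the caller
  ~ y * 10      # expression y * 10 plus this function's frame
}

q <- make_quosure()
q_expr <- q[[2]]              # the captured expression
q_env <- environment(q)       # the captured environment
q_value <- eval(q_expr, q_env)
```

Evaluating `q_expr` anywhere else would fail to find `y`; pairing it with `q_env` is what makes the captured expression safe to pass around.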

> There is an opportunity cost to grabbing the presently-unused unary @
> operator for this, in that it might otherwise be used for some other
> extension.  For example, see the last five slides in my talk at
> http://www.cs.utoronto.ca/~radford/ftp/R-lang-ext.pdf for a different
> proposal for a new unary @ operator.  I'm not necessarily advocating
> that particular use (my ideas in this respect are still undergoing
> revisions), but the overall point is that there may well be several
> good uses of a unary @ operator (and there aren't many other good
> characters to use for a unary operator besides @).  It is unclear to
> me that the current proposal is the highest-value use of @.

A further extension would be to allow binary @ in function argument
names; then the LHS could be an arbitrary string used as an extension
mechanism.

Hadley

--
http://hadley.nz


Re: RFC: (in-principle) native unquoting for standard evaluation

hadley wickham
On Mon, Mar 20, 2017 at 8:00 AM, Hadley Wickham <[hidden email]> wrote:

> On Mon, Mar 20, 2017 at 7:36 AM, Radford Neal <[hidden email]> wrote:
>> Michael Lawrence (as last in long series of posters)...
>>
>>> Yes, it would bind the language object to the environment, like an
>>> R-level promise (but "promise" of course refers specifically to just
>>> _lazy_ evaluation).
>>>
>>> For the uqs() thing, expanding calls like that is somewhat orthogonal
>>> to NSE. It would be nice in general to be able to write something like
>>> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
>>> extra_args)). If we had that then uqs() would just be the combination
>>> of unquote and expansion, i.e., mean(x, @extra_args...). The "..."
>>> postfix would not work since it's still a valid symbol name, but we
>>> could come up with something.
>>
>>
>> I've been trying to follow this proposal, though without tracking down
>> all the tweets, etc. that are referenced.  I suspect I'm not the only
>> reader who isn't clear exactly what is being proposed.  I think a
>> detailed, self-contained proposal would be useful.
>
> We have a working implementation (which I'm calling tidyeval) in
> https://github.com/hadley/rlang, but we have yet to write it up. We'll
> spend some time documenting since it seems to be of broader interest.

First pass at programming dplyr vignette (including details about tidyeval) at
http://rpubs.com/hadley/dplyr-programming

Hadley

--
http://hadley.nz


RFC: (in-principle) native unquoting for standard evaluation

Lionel Henry
In reply to this post by Radford Neal

RN> There is an opportunity cost to grabbing the presently-unused unary @
RN> operator for this

I don't think this is the case because the parser has to interpret `@`
in formal argument lists in a different way than in function calls.
Besides, it'd make sense to set up these annotations with a binary
`@`. There are already two main ways of passing arguments in R: by
value and by expression. Providing an explicit annotation for passing
by expression would standardise the semantics of these functions and,
as Michael suggests, would help static analysis. So passing by name
could be just another argument-passing method:

    function(expr@ x, value@ y = 10L, name@ z = rnorm(1)) {
      list(x, y, z, z)
    }

The parser would record the argument metadata in the formals list.
This metadata could be consulted by static analysis tools and a
selected subset of those tags (`expr` and `name`) would have an effect
on the evaluation mechanism.
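
For readers who want to see the three passing modes side by side, they can be emulated in current R roughly as follows. This is a sketch only: the function names `by_value()`, `by_expr()`, `by_name()` and `bump()` are hypothetical, and the annotation syntax above is not used.

```r
# By value: the ordinary case; the promise is forced at most once.
by_value <- function(y = 10L) c(y, y)

# By expression: the argument arrives as a language object.
by_expr <- function(x) substitute(x)

# By name: the expression is re-evaluated in the caller on every use.
by_name <- function(z) {
  z_expr <- substitute(z)
  caller <- parent.frame()
  c(eval(z_expr, caller), eval(z_expr, caller))
}

# A side-effecting helper that makes the number of evaluations visible.
counter <- 0
bump <- function() {
  counter <<- counter + 1
  counter
}

twice <- by_name(bump())    # bump() runs two times
once <- by_value(bump())    # bump() runs one time, value reused
captured <- by_expr(a + b)  # nothing is evaluated
```

The difference between `twice` and `once` is the semantic distinction that a `name` tag would have to encode in the formals list.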


RN> One thing I'm not clear on is whether the proposal would add anything
RN> semantically beyond what the present "eval" and "substitute" functions
RN> can do fairly easily.

Quasiquotation makes it possible to program with functions that take
arguments by expression. There is no easy way to do that with
eval() and substitute() alone. R has always been an interface language
and as such, its main advantage is to provide DSLs for data analysis
tasks. Specification of statistical models with a formula, overscoping
data frame columns with subset() and transform(), etc. This is why
providing an easier means of programming with these functions seems to
have more value than call-by-name semantics. Unquoting will be used
extensively in ggplot2 and dplyr, two popular R packages.  Please see
the vignette posted by Hadley for some introductory examples.
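
A small illustration of the difficulty, using only base R. The helper names (`keep_rows()`, `wrapper_naive()`, `wrapper_fixed()`) and the data are assumptions made up for this sketch; `bquote()` plays the role of unquoting.

```r
df <- data.frame(id = 1:5, score = c(1, 2, 3, 4, 5))

# A function that takes its second argument by expression.
keep_rows <- function(data, cond) {
  cond_expr <- substitute(cond)
  data[eval(cond_expr, data, parent.frame()), , drop = FALSE]
}

# Naive wrapper: keep_rows() captures the symbol `rows`, not `score > 2`,
# so the condition ends up evaluated outside the data frame.
wrapper_naive <- function(data, rows) keep_rows(data, rows)
naive <- tryCatch(wrapper_naive(df, score > 2), error = function(e) e)

# Workaround: capture the expression, then unquote it into a new call.
wrapper_fixed <- function(data, rows) {
  rows_expr <- substitute(rows)
  eval(bquote(keep_rows(data, .(rows_expr))))
}
fixed <- wrapper_fixed(df, score > 2)
```

The workaround has to rebuild and re-evaluate the whole call by hand, which is the boilerplate a native unquoting notation would remove.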

In any case, the unquoting notation would be orthogonal to function
arguments annotations because actuals and formals are parsed
differently. While formals annotations would be recorded by the parser
in the formals list, `@` in an actual argument list would be parsed as
a function call, like the rest of R operators.

Lionel


RFC: (in-principle) native unquoting for standard evaluation

Lionel Henry
In reply to this post by Michael Lawrence-3
ML> For the uqs() thing, expanding calls like that is somewhat orthogonal
ML> to NSE. It would be nice in general to be able to write something like
ML> mean(x, extra_args...) without resorting to do.call(mean, c(list(x),
ML> extra_args)).

This is not completely true because splicing is necessarily linked to
the principle of unquoting (evaluating). You cannot splice something
that you don't know the value of, you have to evaluate the promise of
the splicing operand. In other words, you cannot splice at the parser
level, only at the interpreter level, and the splicing operation has
to be part of the call tree. This implies the important limitation
that you cannot splice a list in a call to a function taking named
arguments, you can only splice when capturing dots. On the plus side,
it seems more R-like to implement it as a regular function call since
all syntactic operations in R are function calls.

Since splicing is conceptually linked to unquoting, I think it would
make sense to have a derivative operator, e.g. @@. In that case it
would simply take its argument by expression and could thus be defined
as:

     `@@` <- `~`

It'd be used like this:

     # Equivalent to as.list(mtcars)
     list(@@ mtcars)

     # Returns a list of symbols
     list(@@ lapply(letters, as.symbol))

To make it work we'd have two functions for capturing dots that would
understand arguments wrapped in an `@@` quosure. dotsValues(...)
would expand spliced arguments and then evaluate them, while
dotsExprs(...)  would expand and return a list of quosures. Dotted
primitive functions like list() or c() would also need to preprocess
the dots with a C function.
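
A rough approximation of the value-capturing variant can be written in plain R. Note the assumptions: `splice()` and `dots_values()` are hypothetical names, and the marker here is an ordinary classed object created by evaluation rather than an `@@` quosure, so this only mimics the expansion step.

```r
# Mark a list so that the dots-capturing function knows to expand it.
splice <- function(x) {
  structure(list(value = as.list(x)), class = "splice_marker")
}

# Capture the dots by value and expand any spliced arguments in place.
dots_values <- function(...) {
  args <- list(...)
  out <- list()
  for (i in seq_along(args)) {
    arg <- args[[i]]
    if (inherits(arg, "splice_marker")) {
      out <- c(out, arg$value)   # expand: one element per list entry
    } else {
      out <- c(out, args[i])     # single bracket keeps the argument name
    }
  }
  out
}

res <- dots_values(1, splice(list(a = 2, b = 3)), z = 4)
```

Because the expansion happens inside the function that captures the dots, it illustrates why splicing of this kind only works for dotted arguments and not for named formals.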

Another reason not to use `...` as syntax for splicing is that it may
be better to reserve it for forwarding operations. I think one other
syntax update that would be worthwhile to consider is forwarding of
named arguments. This would allow labelling of arguments to work
transparently across wrappers:

     my_plot <- function(x) plot(1:10, ...(x))

     # The y axis is correctly labelled as 11:20 in the plot
     my_plot(11:20)

This would also allow forwarding named arguments to functions that
take their arguments by expression, just as we forward dots.
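
The labelling problem, and the kind of workaround needed today, can be sketched without any plotting. `label_of()`, `wrapper_plain()` and `wrapper_forward()` are hypothetical names; `label_of()` stands in for the way plot() derives its axis labels.

```r
# Stand-in for a function that labels its argument by expression.
label_of <- function(x) deparse(substitute(x))

# Plain wrapper: the inner function only sees the symbol `x`.
wrapper_plain <- function(x) label_of(x)

# Manual forwarding: rebuild the inner call with the caller's expression
# and evaluate it where the wrapper was called.
wrapper_forward <- function(x) {
  eval(substitute(label_of(x)), parent.frame())
}

plain_label <- wrapper_plain(11:20)
forwarded_label <- wrapper_forward(11:20)
```

With the plain wrapper the label degrades to the formal's name, whereas the manual version recovers the original expression at the cost of an explicit substitute() and eval() pair.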

Lionel
