[WISH / PATCH] possibility to split string literals across multiple lines

Andreas Kersting
Hi,

I would really like to have a way to split long string literals across
multiple lines in R.

Currently, if a string literal spans multiple lines, there is no way to
inhibit the introduction of newline characters:

 > "aaa
+ bbb"
[1] "aaa\nbbb"


If a line ends with a backslash, the backslash is simply ignored and the newline is kept:

 > "aaa\
+ bbb"
[1] "aaa\nbbb"
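
The two transcripts above can be reproduced non-interactively. This is only a sketch: it assumes that parse(text = ...) treats the source text the same way the console does, and the variable names are made up for illustration.

```r
# Source text of a literal that spans two lines: "aaa<newline>bbb"
src_plain <- "\"aaa\nbbb\""
plain <- eval(parse(text = src_plain))

# Same literal with a backslash directly before the newline: "aaa\<newline>bbb"
src_bs <- "\"aaa\\\nbbb\""
with_backslash <- eval(parse(text = src_bs))

# The raw newline is kept as part of the string ...
identical(plain, "aaa\nbbb")
# ... and the trailing backslash makes no difference to the result.
identical(with_backslash, plain)
```

Both checks are expected to hold for the parser behaviour described in this post; a patched parser as proposed below would make the second one fail.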


We could use this fact to implement string splitting in a fairly
backward-compatible way, since such trailing backslashes have no effect
today and are therefore presumably rarely used. The attached patch
makes the parser ignore a newline character directly following a backslash:

 > "aaa\
+ bbb"
[1] "aaabbb"


I would also prefer leading blanks (spaces and tabs) on the continuation
line to be ignored, to allow for proper indentation:

 >   "aaa \
+    bbb"
[1] "aaa bbb"

 >   "aaa\
+    \ bbb"
[1] "aaa bbb"

This is also implemented by this patch.


An alternative approach could be to have something like

("aaa "
"bbb")

or

("aaa ",
"bbb")

be interpreted as "aaa bbb".

I don't know the ins and outs of the parser of R (hence: please very
carefully review the attached patch), but I guess this would be more
work to implement!?


What do you think? Is there anybody else who is missing this feature in
the first place?

Regards,
Andreas

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Attachment: patch.diff (3K)

Re: [WISH / PATCH] possibility to split string literals across multiple lines

Duncan Murdoch-2
On 14/06/2017 5:58 AM, Andreas Kersting wrote:
> Hi,
>
> I would really like to have a way to split long string literals across
> multiple lines in R.

I don't understand why you require the string to be a literal.  Why not
construct the long string in an expression like

  paste0("aaa",
         "bbb")

?  Surely the execution time of the paste0 call is negligible.

Duncan Murdoch


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Gábor Csárdi
On Wed, Jun 14, 2017 at 12:12 PM, Duncan Murdoch
<[hidden email]> wrote:
> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>
>> Hi,
>>
>> I would really like to have a way to split long string literals across
>> multiple lines in R.

You can also look at the glue package; it supports continuation and a lot more:

glue("
    A formatted string \\
    can also be on a \\
    single line
    ")
#> A formatted string can also be on a single line

Gabor


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Andreas Kersting
In reply to this post by Duncan Murdoch-2
On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <[hidden email]> wrote:

> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
> > Hi,
> >
> > I would really like to have a way to split long string literals across
> > multiple lines in R.
>
> I don't understand why you require the string to be a literal.  Why not
> construct the long string in an expression like
>
>   paste0("aaa",
>          "bbb")
>
> ?  Surely the execution time of the paste0 call is negligible.
>
> Duncan Murdoch

Actually "execution time" is precisely one of the reasons why I would like to see this feature as - depending on the context (e.g. in a tight loop) - the execution time of paste0 (or probably also glue, thanks Gabor) is not necessarily insignificant.

The other reason is style: I think it is cleaner if we can construct such a long string literal without the need for a function call.

Andreas


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Mark van der Loo
Having some line-breaking character for string literals would have benefits,
as string literals could then be constructed at parse time rather than at
run time. I have run into this myself a few times as well. One way to at
least emulate something like that is the following.

`%+%` <- function(x,y) paste0(x,y)

"hello" %+%
  " pretty" %+%
  " world"


-Mark




Re: [WISH / PATCH] possibility to split string literals across multiple lines

Joris FA Meys
Mark, that's actually a fair statement, although your extra operator
doesn't cause construction at parse time. You still call paste0(), but just
add an extra layer on top of it.
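
A small sketch of that difference, using quote() to look at what the parser produces before anything is evaluated. The names literal_expr and operator_expr are made up for this illustration.

```r
`%+%` <- function(x, y) paste0(x, y)

# quote() returns the parsed but unevaluated expression.
literal_expr  <- quote("hello world")
operator_expr <- quote("hello" %+% " world")

# A plain literal is already a character string after parsing.
is.character(literal_expr)
# The operator form is still a call to `%+%`; paste0() runs only on eval().
is.call(operator_expr)
# Evaluating the call yields the same string as the literal.
identical(eval(operator_expr), literal_expr)
```

So the operator changes how the source looks, not when the string is built.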

I also doubt that even in gigantic loops the benefit is going to be
significant. Take the following example:

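library(compiler)  # cmpfun() lives in the compiler package

# Builds the string with paste0() on every call.
atestfun <- function(x){
  y <- paste0("a very long",
         "string for testing")
  grep(x, y)
}
# Uses a single string literal (with an embedded newline) instead.
atestfun2 <- function(x){
  y <- "a very long
string for testing"
  grep(x,y)
}
# Byte-compiled versions of both functions.
cfun <- cmpfun(atestfun)
cfun2 <- cmpfun(atestfun2)

require(rbenchmark)
benchmark(atestfun("a"),
          atestfun2("a"),
          cfun("a"),
          cfun2("a"),
          replications = 100000)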

Which gives after 100,000 replications:

            test replications elapsed relative
1  atestfun("a")       100000    0.83    1.339
2 atestfun2("a")       100000    0.62    1.000
3      cfun("a")       100000    0.81    1.306
4     cfun2("a")       100000    0.62    1.000

The patch can in principle make similar code marginally faster, but I'm not
convinced it is going to make any real difference except in some very
specific and exotic cases. What is more, calling a function like the ones
in the example inside a loop is the only situation I can come up with where
this might be a problem. If you just construct the string inside the loop,
there are two possibilities:

- the string does not need to change, and then you had better construct it
outside of the loop
- the string does need to change, and then you need paste() or paste0()
anyway
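
The two cases can be sketched as follows. This is only an illustration; the loop bodies and the variable names (target, hits, labels) are invented and not taken from the benchmark above.

```r
# Case 1: the string is constant, so build it once before the loop.
target <- paste0("a very long ",
                 "string for testing")
hits <- integer(0)
for (x in c("long", "short", "test")) {
  # grep() only reads the prebuilt string; no paste0() call per iteration.
  hits <- c(hits, length(grep(x, target, fixed = TRUE)))
}

# Case 2: the string depends on the loop variable, so paste0() is needed
# inside the loop regardless of how literals can be split in the source.
labels <- character(0)
for (i in 1:3) {
  labels <- c(labels, paste0("item ", i))
}
```

In neither case would a parse-time continuation syntax remove work from the loop body.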

I'm not against incorporating the patch, as it would eliminate a few
keystrokes. It's a neat idea, but I don't expect any other noticeable
advantage from it.

my humble 2 cents
Cheers
Joris




--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel :  +32 (0)9 264 61 79

Re: [WISH / PATCH] possibility to split string literals across multiple lines

Mark van der Loo
I know it doesn't cause construction at parse time, and it was also not
what I said. What I meant was that it makes the syntax at least look a
little as if you have a line-breaking character within string literals.


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Joris FA Meys
Hi Mark,

I got you. I just pointed out the obvious to illustrate why your emulation
didn't eliminate the need for the real thing. I didn't mean to imply you
weren't aware of this, even though it may have seemed so. Sometimes I'm not
100% aware of the subtleties of the English language. This seems to be one
of those cases.

Kind regards
Joris


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Duncan Murdoch-2
In reply to this post by Andreas Kersting
On 14/06/2017 6:45 AM, Andreas Kersting wrote:

> On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <[hidden email]> wrote:
>
>> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>> Hi,
>>>
>>> I would really like to have a way to split long string literals across
>>> multiple lines in R.
>>
>> I don't understand why you require the string to be a literal.  Why not
>> construct the long string in an expression like
>>
>>   paste0("aaa",
>>          "bbb")
>>
>> ?  Surely the execution time of the paste0 call is negligible.
>>
>> Duncan Murdoch
>
> Actually "execution time" is precisely one of the reasons why I would like to see this feature as - depending on the context (e.g. in a tight loop) - the execution time of paste0 (or probably also glue, thanks Gabor) is not necessarily insignificant.

You also need to consider implementation time.  This is not just changes
to R itself; trailing backslashes *are* used in some packages (e.g.
geoparser), so those packages would need to be identified and modified
and resubmitted to CRAN.

Core changes to existing behaviour need really strong arguments, and I'm
just not seeing those here.
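
As a rough illustration of the trade-off in this exchange (a sketch only; the query text and the function name count_matches are made up for the example): where the cost of paste0() matters in a tight loop, the string can be built once outside the loop and reused, which needs no parser change.

```r
# Build the long string once, outside any loop. The SQL text is a
# hypothetical example and is not taken from the thread.
QUERY <- paste0(
  "SELECT id, name ",
  "FROM users ",
  "WHERE active = 1"
)

# Hypothetical helper: counts how many elements of `lines` equal QUERY.
# paste0() is not called inside the loop; the prebuilt QUERY is reused.
count_matches <- function(lines) {
  n <- 0L
  for (line in lines) {
    if (identical(line, QUERY)) n <- n + 1L
  }
  n
}

count_matches(c(QUERY, "something else", QUERY))
```

This does not address the stylistic wish for a pure literal; it only shows that the run-time cost can usually be moved out of the hot path.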

Duncan Murdoch

> The other reason is style: I think it is cleaner if we can construct such a long string literal without the need for a function call.
>
> Andreas


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Simon Urbanek
In reply to this post by Andreas Kersting
As I recall this has been discussed at least a few times (unfortunately I'm traveling so can't check the references), but the justification was never satisfactory.

Personally, I wouldn't mind string continuation supported since it makes for more readable code (I had one of my packages raise a NOTE in examples because there is no way in R to split a long hash into multiple lines), but I would be strongly against random removal of whitespaces as it's counter-intuitive, misleading and makes it impossible to continue spaces on the next line. None of the languages that I can think of with multiline strings do that as that's way too dangerous.

Cheers,
Simon




Re: [WISH / PATCH] possibility to split string literals across multiple lines

hadley wickham
On Wed, Jun 14, 2017 at 8:48 AM, Simon Urbanek
<[hidden email]> wrote:
> As I recall this has been discussed at least a few times (unfortunately I'm traveling so can't check the references), but the justification was never satisfactory.
>
> Personally, I wouldn't mind string continuation supported since it makes for more readable code (I had one of my packages raise a NOTE in examples because there is no way in R to split a long hash into multiple lines), but I would be strongly against random removal of whitespaces as it's counter-intuitive, misleading and makes it impossible to continue spaces on the next line. None of the languages that I can think of with multiline strings do that as that's way too dangerous.

Julia does, but uses triple quotes:
https://docs.julialang.org/en/stable/manual/strings/#triple-quoted-string-literals

Hadley

--
http://hadley.nz


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Serguei Sokol
In reply to this post by Andreas Kersting
On 14/06/2017 at 12:58, Andreas Kersting wrote:
> Hi,
>
> I would really like to have a way to split long string literals across multiple lines in R.
>
> ...
> An alternative approach could be to have something like
>
> ("aaa "
> "bbb")
This is C-style and if the core-team decides to implement it,
it could be useful and intuitive.


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Andreas Kersting
In reply to this post by Duncan Murdoch-2

-------- Original Message --------
From: Duncan Murdoch [mailto:[hidden email]]
Sent: Wednesday, Jun 14, 2017 1:36 PM GMT
To: Andreas Kersting
Cc: r-devel
Subject: [Rd] [WISH / PATCH] possibility to split string literals across
multiple lines

> On 14/06/2017 6:45 AM, Andreas Kersting wrote:
>> On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch
>> <[hidden email]> wrote:
>>
>>> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>>> Hi,
>>>>
>>>> I would really like to have a way to split long string literals across
>>>> multiple lines in R.
>>>
>>> I don't understand why you require the string to be a literal.  Why not
>>> construct the long string in an expression like
>>>
>>>   paste0("aaa",
>>>          "bbb")
>>>
>>> ?  Surely the execution time of the paste0 call is negligible.
>>>
>>> Duncan Murdoch
>>
>> Actually "execution time" is precisely one of the reasons why I would
>> like to see this feature as - depending on the context (e.g. in a
>> tight loop) - the execution time of paste0 (or probably also glue,
>> thanks Gabor) is not necessarily insignificant.
>
> You also need to consider implementation time.  This is not just changes
> to R itself; trailing backslashes *are* used in some packages (e.g.
> geoparser), so those packages would need to be identified and modified
> and resubmitted to CRAN.

I am totally with you on this "runtime vs. implementation-time"-issue.
That is why I proposed the patch as I did: It seemed to require only
minor changes to base R and I didn't see how it could be incompatible
with existing code.

Actually, I still cannot see how a package could have meaningfully *used*
backslashes immediately followed by newlines up to now, since those
backslashes were just ignored by the parser (and changes to the function
StringValue only affect the parser, don't they?). Of course I cannot
rule out the possibility that there is code like
var <- "aaa\
bbb"
around, but it would rely on the undocumented(?) behaviour that
"backslash newline" is a valid escape sequence and that it is treated as
"newline".

Maybe it's a good idea to show some more examples of how the patched parser
behaves. There should only be a difference from the current implementation
if a string literal spans multiple lines and a line ends in an odd
number of backslashes (see last example):

 > "aaa\\
+ bbb"
[1] "aaa\\\nbbb"

 > "aaa\\nbbb"
[1] "aaa\\nbbb"

 > "aaa\\\nbbb"
[1] "aaa\\\nbbb"

 > "aaa\\"
[1] "aaa\\"

 > "aaa\\\n"
[1] "aaa\\\n"

 > "aaa\\\\"
[1] "aaa\\\\"

 > "aaa\\\\\n"
[1] "aaa\\\\\n"

 > "aaa\\\\
+ bbb"
[1] "aaa\\\\\nbbb"

 > "aaa\\\
+ bbb"
[1] "aaa\\bbb"
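
The rule illustrated by these examples can be approximated in plain R on the raw source text of a literal. This is only a sketch of the proposed semantics, not the patch itself (which changes the parser's StringValue function in C); the helper name join_continuations is made up here, and it rewrites source characters without processing any other escapes.

```r
# Hypothetical helper: given the raw source characters of a string literal,
# drop a backslash-newline pair plus any blanks (spaces, tabs) that follow
# it, but only when that backslash is not itself escaped, i.e. when the
# line ends in an odd number of backslashes.
join_continuations <- function(src) {
  gsub("(?<!\\\\)((?:\\\\\\\\)*)\\\\\n[ \t]*", "\\1", src, perl = TRUE)
}

# One trailing backslash: the newline and the indentation are removed.
one <- join_continuations("aaa \\\n   bbb")

# Two trailing backslashes (an escaped backslash): the text is left unchanged.
two <- join_continuations("aaa\\\\\nbbb")

# Three trailing backslashes: the escaped backslash stays, the lines are joined.
three <- join_continuations("aaa\\\\\\\nbbb")

print(c(one, two, three))
```

The lookbehind plus the group of backslash pairs is what encodes the odd/even distinction; a plain search for backslash-newline would also join lines that end in an escaped backslash.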

Andreas

> Core changes to existing behaviour need really strong arguments, and I'm
> just not seeing those here.
>
> Duncan Murdoch
>
>> The other reason is style: I think it is cleaner if we can construct
>> such a long string literal without the need for a function call.
>>
>> Andreas


Re: [WISH / PATCH] possibility to split string literals across multiple lines

R devel mailing list
In reply to this post by Andreas Kersting
If you are changing the parser (which is a major change) you
might consider treating strings in the C/C++ way:
   const char *s = "A"
                   "B";
means the same as
   const char *s = "AB";

I am not a big fan of that syntax but it is widely used.

A backslash at the end of the line leads to errors when you accidentally
put a space after the backslash and the editor doesn't flag it.

Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [WISH / PATCH] possibility to split string literals across multiple lines

Gábor Csárdi
I don't think it is reasonable to change the parser this way. This is
currently valid R code:

a <- "foo"
"bar"

and with the new syntax, it is also valid, but with a different
meaning. Or you can even consider

a <- "foo"
bar %>% func() %>% print()

etc.
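
The ambiguity can be checked with parse(), which shows how the current parser reads each form. A small sketch (the variable names are only for illustration):

```r
# Two literals on consecutive lines are two complete expressions today,
# so implicit concatenation would silently change what this code means.
exprs <- parse(text = 'a <- "foo"\n"bar"')
length(exprs)   # 2

# Inside parentheses a newline does not end the expression, so adjacent
# literals are currently rejected by the parser rather than given a meaning.
in_parens <- tryCatch(
  parse(text = '("aaa "\n"bbb")'),
  error = function(e) "syntax error"
)
in_parens       # "syntax error"
```

The second case suggests that the parenthesised variant from the original post would not change the meaning of code that parses today, whereas the unparenthesised variant would.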

I like the idea of string literals, but the C/C++ way clearly does not
work. The Python/Julia way might, i.e.:

"""this is a
multi-line
literal"""

Gabor



Re: [WISH / PATCH] possibility to split string literals across multiple lines

Andreas Kersting
In reply to this post by hadley wickham
-------- Original Message --------
From: Hadley Wickham [mailto:[hidden email]]
Sent: Wednesday, Jun 14, 2017 2:51 PM GMT
To: Simon Urbanek
Cc: Andreas Kersting; [hidden email]
Subject: [Rd] [WISH / PATCH] possibility to split string literals across
multiple lines

> On Wed, Jun 14, 2017 at 8:48 AM, Simon Urbanek
> <[hidden email]> wrote:
>> As I recall this has been discussed at least a few times (unfortunately I'm traveling so can't check the references), but the justification was never satisfactory.
>>
>> Personally, I wouldn't mind string continuation supported since it makes for more readable code (I had one of my packages raise a NOTE in examples because there is no way in R to split a long hash into multiple lines), but I would be strongly against random removal of whitespaces as it's counter-intuitive, misleading and makes it impossible to continue spaces on the next line. None of the languages that I can think of with multiline strings do that as that's way too dangerous.
>
> Julia does, but uses triple quotes:
> https://docs.julialang.org/en/stable/manual/strings/#triple-quoted-string-literals
>
> Hadley
>

If we consider bash a programming language: Here documents
(http://tldp.org/LDP/abs/html/here-docs.html) can have leading tabs be
removed (see Example 19-4).


Re: [WISH / PATCH] possibility to split string literals across multiple lines

luke-tierney
In reply to this post by Gábor Csárdi
On Wed, 14 Jun 2017, Gábor Csárdi wrote:

> I don't think it is reasonable to change the parser this way. This is
> currently valid R code:
>
> a <- "foo"
> "bar"
>
> and with the new syntax, it is also valid, but with a different
> meaning. Or you can even consider
>
> a <- "foo"
> bar %>% func() %>% print()
>
> etc.
>
> I like the idea of string literals, but the C/C++ way clearly does not
> work. The Python/Julia way might, i.e.:
>
> """this is a
> multi-line
> literal"""

This does look like a promising option; some more careful checking
would be needed to make sure there aren't cases where currently
working code would be broken.

Another Python idea worth considering is the raw string notation
r"xyx" that does not process escape sequences -- this would make
writing things like regular expressions easier.
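
For comparison, R later gained a raw string syntax of its own: from R 4.0.0 on, r"(...)" is a character constant in which backslashes are not treated as escapes. A brief sketch, assuming R >= 4.0.0 (the pattern is just an illustrative regular expression):

```r
# The same regular expression written with escapes and as a raw string.
escaped  <- "\\d+\\.\\d+"
raw_form <- r"(\d+\.\d+)"
identical(escaped, raw_form)                     # TRUE

# Both forms match a decimal number and reject plain letters.
grepl(raw_form, c("3.14", "abc"), perl = TRUE)   # TRUE FALSE
```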

Best,

luke


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   [hidden email]
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu

Re: [WISH / PATCH] possibility to split string literals across multiple lines

hadley wickham
>> I don't think it is reasonable to change the parser this way. This is
>> currently valid R code:
>>
>> a <- "foo"
>> "bar"
>>
>> and with the new syntax, it is also valid, but with a different
>> meaning. Or you can even consider
>>
>> a <- "foo"
>> bar %>% func() %>% print()
>>
>> etc.
>>
>> I like the idea of string literals, but the C/C++ way clearly does not
>> work. The Python/Julia way might, i.e.:
>>
>> """this is a
>> multi-line
>> literal"""
>
>
> This does look like a promising option; some more careful checking
> would be needed to make sure there aren't cases where currently
> working code would be broken.
>
> Another Python idea worth considering is the raw string notation
> r"xyx" that does not process escape sequences -- this would make
> writing things like regular expressions easier.

If this is something you would consider, we'd be happy to put together
a patch for review.

Hadley


--
http://hadley.nz


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Radford Neal
In reply to this post by Andreas Kersting
> On Wed, 14 Jun 2017, Gábor Csárdi wrote:
>
> > I like the idea of string literals, but the C/C++ way clearly does not
> > work. The Python/Julia way might, i.e.:
> >
> > """this is a
> > multi-line
> > literal"""
>
> [hidden email]:

> This does look like a promising option; some more careful checking
> would be needed to make sure there aren't cases where currently
> working code would be broken.

I don't see how this proposal solves any problem of interest.

String literals can already be as long as you like.  The problem is
that they will get wrapped around in an editor (or not all be
visible), destroying the nice formatting of your program.

With the proposed extension, you can write long string literals with
short lines only if they were long only because they consisted of
multiple lines.  Getting a string literal that's 79 characters long
with no newlines (a perfectly good error message, for example) to fit
in your 80-character-wide editing window would still be impossible.

Furthermore, these Python-style literals have to have their second
and later lines start at the left edge, destroying the indentation
of your program (supposing you actually wanted to use one).

In contrast, C-style concatenation (by the parser) of consecutive
string literals works just fine for what you'd want to do in a
program.  The only thing they wouldn't do that the Python-style
literals would do is allow you to put big blocks of literal text in
your program, without having to put quotes around each line.  But
shouldn't such text really be stored in a separate file that gets
read, rather than in the program source?
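
Both use cases in this message can be approximated in current R: run-time concatenation with paste0() stands in for the parser-level concatenation described above, and a large block of text can be read from a file. A minimal sketch, with a made-up error message and a temporary file standing in for a real text file:

```r
# A long single-line message, written on short source lines by concatenating
# the pieces at run time (the message text is a made-up example).
msg <- paste0(
  "argument 'x' must be a numeric vector of length one, ",
  "not a character string"
)
grepl("\n", msg, fixed = TRUE)   # FALSE: no newline was introduced

# A large block of literal text kept in a separate file and read when needed.
path <- tempfile(fileext = ".txt")
writeLines(c("First line of a long text block.", "Second line."), path)
block <- paste(readLines(path), collapse = "\n")
unlink(path)
```

Unlike parser-level concatenation, paste0() is evaluated each time the expression runs, which is the cost discussed earlier in the thread.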

   Radford Neal


Re: [WISH / PATCH] possibility to split string literals across multiple lines

Gábor Csárdi
On Fri, Jun 16, 2017 at 7:04 PM, Radford Neal <[hidden email]> wrote:

>> On Wed, 14 Jun 2017, Gábor Csárdi wrote:
>>
>> > I like the idea of string literals, but the C/C++ way clearly does not
>> > work. The Python/Julia way might, i.e.:
>> >
>> > """this is a
>> > multi-line
>> > literal"""
>>
>> [hidden email]:
>
>> This does look like a promising option; some more careful checking
>> would be needed to make sure there aren't cases where currently
>> working code would be broken.
>
> I don't see how this proposal solves any problem of interest.
>
> String literals can already be as long as you like.  The problem is
> that they will get wrapped around in an editor (or not all be
> visible), destroying the nice formatting of your program.

From the Python docs:

String literals can span multiple lines. One way is using
triple-quotes: """...""" or '''...'''. End of lines are automatically
included in the string, but it’s possible to prevent this by adding a
\ at the end of the line.

[...]

Gabor
