paste(character(0), collapse="", recycle0=FALSE) should be ""

paste(character(0), collapse="", recycle0=FALSE) should be ""

R devel mailing list
Without 'collapse', 'paste' pastes (concatenates) its arguments elementwise (separated by 'sep', " " by default). New in R devel and R patched, specifying recycle0 = FALSE makes mixing zero-length and nonzero-length arguments result in length zero. The result of paste(n, "th", sep = "", recycle0 = FALSE) always has the same length as 'n'. Previously, the result was still as long as the longest argument, with the zero-length argument treated like "". If all of the arguments have length zero, 'recycle0' doesn't matter.

As far as I understand, 'paste' with 'collapse' as a character string is supposed to put together elements of a vector into a single character string. I think 'recycle0' shouldn't change it.

In current R devel and R patched, paste(character(0), collapse = "", recycle0 = FALSE) is character(0). I think it should be "", like paste(character(0), collapse="").

paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 = FALSE)
is
"4th, 5th".
paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 = FALSE)
is
"4th".
I think
paste(c(        ), "th", sep = "", collapse = ", ", recycle0 = FALSE)
should be
"",
not character(0).

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Martin Maechler
>>>>> suharto anggono--- via R-devel
>>>>>     on Fri, 1 May 2020 03:05:37 +0000 (UTC) writes:

    > Without 'collapse', 'paste' pastes (concatenates) its arguments elementwise (separated by 'sep', " " by default). New in R devel and R patched, specifying recycle0 = FALSE makes mixing zero-length and nonzero-length arguments result in length zero.

That's not intended.
(It's what should only happen with the new (non-default) recycle0=TRUE )

    > The result of paste(n, "th", sep = "", recycle0 = FALSE) always has the same length as 'n'. Previously, the result was still as long as the longest argument, with the zero-length argument treated like "". If all of the arguments have length zero, 'recycle0' doesn't matter.

    > As far as I understand, 'paste' with 'collapse' as a character string is supposed to put together elements of a vector into a single character string. I think 'recycle0' shouldn't change it.

Well, not quite:  only  'recycle0=FALSE'  shouldn't change it
.. maybe this is what you meant anyway.

    > In current R devel and R patched, paste(character(0), collapse = "", recycle0 = FALSE) is character(0). I think it should be "", like paste(character(0), collapse="").

Definitely:  The intent of the new 'recycle0' argument is to
provide a non-default possibility for paste(...., recycle0=TRUE) to behave more
like "arithmetic" functions where the recycling rules ensure that
if one argument has length 0 then the result has length 0:
i.e.,   paste(a,b,c,d,  recycle0=TRUE)      should recycle the same as
              a+b+c+d                       does recycle

Indeed, the default 'recycle0=FALSE'  should correspond to previous (R <= 4.0.0)
behavior entirely.
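
A minimal sketch of the stated intent, assuming an R version whose paste() accepts 'recycle0' (R 4.0.1 or later, as far as I know). The variable names are made up for illustration, and 'collapse' is deliberately left out because its interaction with recycle0 = TRUE is exactly what this thread is debating:

```r
# Assumption: paste() has the 'recycle0' argument (R 4.0.1 or later).
a <- c("4", "5")
empty <- character(0)

# Arithmetic recycling: a zero-length operand gives a zero-length result.
len_arith <- length(c(4, 5) + numeric(0))

# paste() with recycle0 = TRUE is meant to follow the same rule.
len_true <- length(paste(a, empty, sep = "", recycle0 = TRUE))

# Default recycle0 = FALSE: the zero-length argument is treated as "".
res_false <- paste(a, empty, sep = "", recycle0 = FALSE)

print(c(len_arith, len_true, length(res_false)))
print(res_false)
```

Only the no-'collapse' case is shown, so the sketch does not depend on how the collapse question is eventually settled.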

BUT from all I see, R-devel's and R-patched's versions of paste()
do behave as they should.  Also, what you claim here is not true:

    > paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 = FALSE)
    > is
    > "4th, 5th".
    > paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 = FALSE)
    > is
    > "4th".
    > I think
    > paste(c(        ), "th", sep = "", collapse = ", ", recycle0 = FALSE)
    > should be
    > "",
    > not character(0).

Rather, what I see is what the comments of the following code
lines claim (according to the intention of 'recycle0', contrary
to some of your claims above):


paste(character(0), collapse = "", recycle0 = FALSE) # is "", like
paste(character(0), collapse = "")
paste(character(0), collapse = "", recycle0 =  TRUE) # is character(0)

paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 = FALSE) # is "4th, 5th"
paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 = FALSE) # is "4th"
paste(c(        ), "th", sep = "", collapse = ", ", recycle0 = FALSE) # is "th"
##
paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 =  TRUE) # is "4th, 5th"
paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 =  TRUE) # is "4th"
paste(c(        ), "th", sep = "", collapse = ", ", recycle0 =  TRUE) # is character(0)


There must be a lapsus / misunderstanding somewhere.
I don't see any problem in the new behavior for now.

Best regards,
Martin

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

R devel mailing list
I was wrong: I didn't actually try the code and didn't read the documentation carefully. I thought that zero-length arguments being recycled to "" happens when recycle0 = TRUE. It is actually the opposite.

Everywhere in my previous message, recycle0 = FALSE should be recycle0 = TRUE.

I really think that 'paste' with 'collapse' specified (as a character string) should always result in a single character string, no matter what value of 'recycle0'.

paste(character(0), collapse = "", recycle0 = TRUE) # character(0), but should be ""

paste(character(0), recycle0 = FALSE)
is the same as
paste(character(0), recycle0 = TRUE) .
'recycle0' doesn't matter there.
Why should
paste(character(0), collapse = "", recycle0 = FALSE)
be different from
paste(character(0), collapse = "", recycle0 = TRUE) ?

paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 =  TRUE) # "4th, 5th"
paste(c("4"    ), "th", sep = "", collapse = ", ", recycle0 =  TRUE) # "4th"
paste(c(        ), "th", sep = "", collapse = ", ", recycle0 =  TRUE) # character(0), but should be ""


In reply to the post by Martin Maechler of Saturday, 2 May 2020, 10:09:21 pm GMT+7.

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

R devel mailing list
In reply to this post by R devel mailing list
I agree: paste(collapse="something", ...) should always return a single
character string, regardless of the value of recycle0.  This would be
similar to the case where there are no non-NULL arguments to paste: collapse="."
gives a single empty string and collapse=NULL gives a zero-length character
vector.
> paste()
character(0)
> paste(collapse=", ")
[1] ""

Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès-2
Totally agree with that.

H.

In reply to the post by William Dunlap of 5/15/20 10:34.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker-2
Hi all,

This makes sense to me, but I would think that recycle0 and collapse should
actually be incompatible and paste should throw an error if recycle0 were
TRUE and collapse were declared in the same call. I don't think the value
of recycle0 should be silently ignored if it is actively specified.

~G

In reply to the post by Hervé Pagès of Fri, May 15, 2020 at 11:05 AM.

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès-2
There is still the situation where **both** 'sep' and 'collapse' are
specified:

   > paste(integer(0), "nth", sep="", collapse=",")
   [1] "nth"

In that case 'recycle0' should **not** be ignored i.e.

   paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)

should return the empty string (and not character(0) like it does at the
moment).

In other words, 'recycle0' should only control the first operation (the
operation controlled by 'sep'). Which makes plenty of sense: the 1st
operation is binary (or n-ary) while the collapse operation is unary.
There is no concept of recycling in the context of unary operations.

H.
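
The two-operation view described above can be sketched by running the steps separately. This is only an illustration, assuming an R version whose paste() accepts 'recycle0' (R 4.0.1 or later, as far as I know); 'step1' and 'step2' are made-up names:

```r
# Assumption: paste() has the 'recycle0' argument (R 4.0.1 or later).

# Step 1: the n-ary operation controlled by 'sep', where recycling applies.
step1 <- paste(integer(0), "nth", sep = "", recycle0 = TRUE)

# Step 2: the unary collapse operation, applied to the step-1 result alone.
step2 <- paste(step1, collapse = ",")

print(length(step1))
print(step2)
```

Done this way, step 1 yields a zero-length vector and step 2 turns it into a single empty string, which is the result argued for in this message.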

In reply to the post by Gabriel Becker of 5/15/20 11:25.

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Martin Maechler
>>>>> Hervé Pagès
>>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:

    > There is still the situation where **both** 'sep' and 'collapse' are
    > specified:

    >> paste(integer(0), "nth", sep="", collapse=",")
    > [1] "nth"

    > In that case 'recycle0' should **not** be ignored i.e.

    > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)

    > should return the empty string (and not character(0) like it does at the
    > moment).

    > In other words, 'recycle0' should only control the first operation (the
    > operation controlled by 'sep'). Which makes plenty of sense: the 1st
    > operation is binary (or n-ary) while the collapse operation is unary.
    > There is no concept of recycling in the context of unary operations.

Interesting, ..., and sounding somewhat convincing.

    > On 5/15/20 11:25, Gabriel Becker wrote:
    >> Hi all,
    >>
    >> This makes sense to me, but I would think that recycle0 and collapse
    >> should actually be incompatible and paste should throw an error if
    >> recycle0 were TRUE and collapse were declared in the same call. I don't
    >> think the value of recycle0 should be silently ignored if it is actively
    >> specified.
    >>
    >> ~G

Just to summarize what I think we should know and agree on (or be
"disproven") and where this comes from ...

1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
   (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
   hence  paste() / paste0() behave completely back-compatible
   if recycle0 is kept to FALSE.

2) recycle0 = TRUE is meant to give different behavior, notably
   0-length arguments (among '...') should result in 0-length results.

   The above does not specify what this means in detail, see 3)

3) The current R 4.0.0 implementation (for which I'm primarily responsible)
   and help(paste)  are in accordance.
   Notably the help page (Arguments -> 'recycle0' ; Details 1st para ; Examples)
   says and shows how the 4.0.0 implementation has been meant to work.

4) Several provenly smart members of the R community argue that
   both the implementation and the documentation of 'recycle0 =
   TRUE'  should be changed to be more logical / coherent / sensical ..

Is the above all correct in your view?

Assuming yes,  I read basically two proposals, both agreeing
that  recycle0 = TRUE  should only ever apply to the action of 'sep'
but not the action of 'collapse'.

1) Bill and Hervé (I think) propose that 'recycle0' should have
   no effect whenever  'collapse = <string>'

2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
   should be declared incompatible and raise an error. If going in that
   direction, I could also see them giving a warning (and
   continuing as if recycle0 = FALSE).
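
The second proposal can be pictured with a small wrapper. This is only a sketch: 'strictPaste' is a hypothetical name, not anything in base R, and it assumes an R version whose paste() accepts 'recycle0' (R 4.0.1 or later, as far as I know).

```r
# Hypothetical wrapper (not part of base R) sketching the "incompatible"
# proposal: refuse the combination of a collapse string and recycle0 = TRUE.
strictPaste <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
  if (isTRUE(recycle0) && !is.null(collapse)) {
    stop("'collapse' and 'recycle0 = TRUE' cannot be used together")
  }
  # Assumption: paste() accepts 'recycle0' in the R version in use.
  paste(..., sep = sep, collapse = collapse, recycle0 = recycle0)
}

ok <- strictPaste("X", 1:2, sep = "", collapse = ", ")
bad <- tryCatch(
  strictPaste("X", integer(0), sep = "", collapse = ", ", recycle0 = TRUE),
  error = function(e) e
)

print(ok)
print(conditionMessage(bad))
```

The warning variant would replace stop() with warning() and then call paste() with recycle0 = FALSE.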

I have not yet made up my mind but would tend to agree with "you guys",
but I think that other R Core members should chime in, too.

Martin

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

R devel mailing list
> 1) Bill and Hervé (I think) propose that 'recycle0' should have
>   no effect whenever  'collapse = <string>'

I think that collapse=<string> should make paste() return a single string,
regardless of the value of recycle0.  E.g., I would like to see

> paste0("X",seq_len(3),collapse=", ", recycle0=TRUE)
[1] "X1, X2, X3"
> paste0("X",seq_len(0),collapse=", ", recycle0=TRUE)
[1] ""

Currently the latter gives character(0).

paste's collapse argument has traditionally acted after all the other
arguments were dealt with, as in the following not extensively tested
function.

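# Two-stage paste: do the element-wise ('sep') paste first, then
# collapse the result only if 'collapse' was supplied.
altPaste <- function (..., collapse = NULL) {
    tmp <- paste(...)   # 'sep' and 'recycle0', if given, are passed on via '...'
    if (!is.null(collapse)) {
        paste(tmp, collapse=collapse)
    } else {
        tmp
    }
}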

E.g., in post-R-4.0.0 R-devel
> altPaste("X", seq_len(3), sep="", collapse=", ")
[1] "X1, X2, X3"
> altPaste("X", seq_len(0), sep="", collapse=", ")
[1] "X"
> altPaste("X", seq_len(0), sep="", collapse=", ", recycle0=TRUE)
[1] ""

I think it would be good if the above function continued to act the same as
paste itself.
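
One way to check that equivalence is sketched below. The helper names two_stage_paste and agrees are hypothetical, introduced only for illustration; the sketch leaves recycle0 at its default, so it should not depend on which post-4.0.0 behavior is installed.

```r
# Hypothetical helper restating the two-stage idea: element-wise paste
# first, then an optional collapse of the intermediate result.
two_stage_paste <- function(..., collapse = NULL) {
    tmp <- paste(...)
    if (is.null(collapse)) tmp else paste(tmp, collapse = collapse)
}

# TRUE when the built-in paste() and the two-stage version agree exactly.
agrees <- function(..., collapse = NULL) {
    identical(paste(..., collapse = collapse),
              two_stage_paste(..., collapse = collapse))
}

agrees("X", seq_len(3), sep = "", collapse = ", ")  # non-empty input
agrees("X", seq_len(0), sep = "", collapse = ", ")  # zero-length input
agrees("X", seq_len(3), sep = "")                   # no collapse at all
```

Any call for which agrees() returned FALSE would be a case where paste() no longer behaves like "paste, then collapse".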

Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker-2
In reply to this post by Martin Maechler
Hi Martin et al,



On Thu, May 21, 2020 at 9:42 AM Martin Maechler <[hidden email]>
wrote:

> >>>>> Hervé Pagès
> >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
>
>     > There is still the situation where **both** 'sep' and 'collapse' are
>     > specified:
>
>     >> paste(integer(0), "nth", sep="", collapse=",")
>     > [1] "nth"
>
>     > In that case 'recycle0' should **not** be ignored i.e.
>
>     > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
>
>     > should return the empty string (and not character(0) like it does
>     > at the moment).
>
>     > In other words, 'recycle0' should only control the first operation
>     > (the operation controlled by 'sep'). Which makes plenty of sense: the
>     > 1st operation is binary (or n-ary) while the collapse operation is
>     > unary. There is no concept of recycling in the context of unary
>     > operations.
>
> Interesting, ..., and sounding somewhat convincing.
>
>     > On 5/15/20 11:25, Gabriel Becker wrote:
>     >> Hi all,
>     >>
>     >> This makes sense to me, but I would think that recycle0 and
>     >> collapse should actually be incompatible and paste should throw an
>     >> error if recycle0 were TRUE and collapse were declared in the same
>     >> call. I don't think the value of recycle0 should be silently
>     >> ignored if it is actively specified.
>     >>
>     >> ~G
>
> Just to summarize what I think we should know and agree on (or be
> "disproven") and where this comes from ...
>
> 1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
>    (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
>    hence  paste() / paste0() behave completely back-compatible
>    if recycle0 is kept to FALSE.
>
> 2) recycle0 = TRUE is meant to give different behavior, notably
>    0-length arguments (among '...') should result in 0-length results.
>
>    The above does not specify what this means in detail, see 3)
>
> 3) The current R 4.0.0 implementation (for which I'm primarily responsible)
>    and help(paste)  are in accordance.
>    Notably the help page (Arguments -> 'recycle0' ; Details 1st para ;
>    Examples) says and shows how the 4.0.0 implementation has been meant
>    to work.
>
> 4) Several provenly smart members of the R community argue that
>    both the implementation and the documentation of 'recycle0 =
>    TRUE'  should be changed to be more logical / coherent / sensical ..
>
> Is the above all correct in your view?
>
> Assuming yes,  I read basically two proposals, both agreeing
> that  recycle0 = TRUE  should only ever apply to the action of 'sep'
> but not the action of 'collapse'.
>
> 1) Bill and Hervé (I think) propose that 'recycle0' should have
>    no effect whenever  'collapse = <string>'
>
> 2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
>    should be declared incompatible and error. If going in that
>    direction, I could also see them giving a warning (and
>    continuing as if recycle0 = FALSE).
>

Herve makes a good point about when sep and collapse are both set. That
said, if the user explicitly sets recycle0, personally, I don't think it
should be silently ignored under any configuration of other arguments.

If all of the arguments are to go into effect, the question then becomes
one of ordering, I think.

Consider

paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
recycle0=TRUE)

Currently that returns character(0), because the logic is essentially (in
pseudo-code)

collapse(paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ",
recycle0=TRUE), collapse = ",", recycle0=TRUE)

     -> collapse(character(0), collapse = ",", recycle0=TRUE)

-> character(0)

Now Bill Dunlap argued, fairly convincingly I think, that paste(...,
collapse=<string>) should *always* return a character vector of length
exactly one. Under that rule, though, the call with recycle0=TRUE would
return "" via the progression

paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
recycle0=TRUE)

     -> collapse(character(0), collapse = ",")

-> ""


because recycle0 is still applied to the sep-based operation which occurs
before collapse, thus leaving a vector of length 0 to collapse.
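
The progression can be reproduced by running the two stages by hand. This is only a sketch, assuming R >= 4.0.0 so that paste() accepts recycle0; the names stage1 and stage2 are made up for illustration.

```r
# Stage 1: the sep-based, element-wise paste. With recycle0 = TRUE the
# zero-length argument (NULL) makes the whole result zero-length.
stage1 <- paste(c("a", "b"), NULL, c("c", "d"), sep = " ", recycle0 = TRUE)
print(stage1)   # character(0)

# Stage 2: a plain collapse of whatever stage 1 produced. Collapsing a
# zero-length character vector gives a single empty string.
stage2 <- paste(stage1, collapse = ",")
print(stage2)   # ""
```

Because each stage is a separate call here, the result does not depend on how a single paste() call combines collapse and recycle0 internally.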

That is consistent but seems unlikely to be what the user wanted, imho. I
think if it does this there should be at least a warning when paste
collapses to "" this way, if it is allowed at all (i.e., if mixing
collapse=<string> and recycle0=TRUE is not simply made an error).
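
A rough sketch of what the warning variant could look like as a user-level wrapper follows. The function paste_warn0 is hypothetical and not part of base R; it assumes R >= 4.0.0 and is only meant to make the proposal concrete, not to suggest how paste() itself would be changed.

```r
# Hypothetical wrapper (not base R): run the sep stage, then warn if
# recycle0 = TRUE left nothing to collapse before returning "".
paste_warn0 <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
    stage1 <- paste(..., sep = sep, recycle0 = recycle0)
    if (is.null(collapse)) {
        return(stage1)   # no collapse requested: nothing to warn about
    }
    if (isTRUE(recycle0) && length(stage1) == 0L) {
        warning("zero-length input collapsed to \"\" (recycle0 = TRUE)")
    }
    paste(stage1, collapse = collapse)
}

# Ordinary use is unaffected and gives no warning.
paste_warn0("a", 1:2, sep = "", collapse = ",")
```

Replacing warning() with stop() would give the stricter variant in which the combination is an error whenever it produces an empty result.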

I would like to hear others' thoughts as well though. @Pages, Herve
<[hidden email]> @William Dunlap <[hidden email]> is "" what you
envision as the desired and useful behavior there?

Best,
~G



> I have not yet made up my mind but would tend to agree with "you guys",
> but I think that other R Core members should chime in, too.
>
> Martin
>

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès-2
I think that

    paste(c("a", "b"), NULL, c("c", "d"), sep = " ", collapse = ",",
          recycle0=TRUE)

should just return an empty string, and I don't see why it needs to emit a
warning or raise an error. To me it does exactly what the user is asking
for, which is to change how the 3 arguments are recycled **before** the
'sep' operation.

The 'recycle0' argument has no business in the 'collapse' operation
(which comes after the 'sep' operation): this operation still behaves
as it always has.

That's all there is to it.
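
As a small illustration of that reading (a sketch assuming R >= 4.0.0; the variable names are made up), the 'sep' step is the only one whose length changes with recycle0, while a separate collapse step always yields exactly one string:

```r
# 'sep' step with the default recycling: as long as the longest argument.
sep_default <- paste(c("a", "b"), NULL, c("c", "d"), sep = " ")
# 'sep' step with recycle0 = TRUE: a zero-length argument wins.
sep_recycle0 <- paste(c("a", "b"), NULL, c("c", "d"), sep = " ",
                      recycle0 = TRUE)

length(sep_default)    # 2
length(sep_recycle0)   # 0

# 'collapse' step: unary, so there is nothing to recycle; the result
# has length 1 whatever the 'sep' step produced.
length(paste(sep_default, collapse = ","))    # 1
length(paste(sep_recycle0, collapse = ","))   # 1
```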

H.



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

R devel mailing list
I agree with Hervé: 'collapse' is processed last, so a non-NULL 'collapse'
always leads to a single character string being returned, the same as
paste(collapse="").  See the altPaste function I posted yesterday.
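
The two-stage reading described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper name paste_two_stage is hypothetical, it is not the altPaste function mentioned above, and it handles zero-length arguments itself rather than relying on any particular R version's treatment of 'recycle0'.

```r
# Hypothetical helper for illustration only (not base R, not altPaste).
# Stage 1 is the element-wise 'sep' operation; stage 2 is the unary
# 'collapse' operation, which never looks at 'recycle0'.
paste_two_stage <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
  args <- list(...)
  # Stage 1: with recycle0 = TRUE, any zero-length argument gives a
  # zero-length result; otherwise fall back to ordinary paste().
  if (isTRUE(recycle0) && any(lengths(args) == 0L)) {
    pieces <- character(0)
  } else {
    pieces <- do.call(paste, c(args, list(sep = sep)))
  }
  # Stage 2: a non-NULL 'collapse' always yields exactly one string.
  if (is.null(collapse)) pieces else paste(pieces, collapse = collapse)
}

paste_two_stage(c("a", "b"), NULL, c("c", "d"), collapse = ",", recycle0 = TRUE)
# [1] ""
paste_two_stage(integer(0), "nth", sep = "", collapse = ",")
# [1] "nth"
paste_two_stage(integer(0), "nth", sep = "", recycle0 = TRUE)
# character(0)
```

Under this reading 'recycle0' only decides what stage 1 hands over; stage 2 then behaves like paste(x, collapse = ...) always has.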

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Fri, May 22, 2020 at 9:12 AM Hervé Pagès <[hidden email]> wrote:

> I think that
>
>     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
>           recycle0=TRUE)
>
> should just return an empty string, and I don't see why it needs to emit a
> warning or raise an error. To me it does exactly what the user is asking
> for, which is to change how the 3 arguments are recycled **before** the
> 'sep' operation.
>
> The 'recycle0' argument has no business in the 'collapse' operation
> (which comes after the 'sep' operation): this operation still behaves
> as it always has.
>
> That's all there is to it.
>
> H.
>
>
> On 5/22/20 03:00, Gabriel Becker wrote:
> > Hi Martin et al,
> >
> > On Thu, May 21, 2020 at 9:42 AM Martin Maechler <[hidden email]> wrote:
> >
> >      >>>>> Hervé Pagès
> >      >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
> >
> >          > There is still the situation where **both** 'sep' and 'collapse'
> >          > are specified:
> >
> >          >> paste(integer(0), "nth", sep="", collapse=",")
> >          > [1] "nth"
> >
> >          > In that case 'recycle0' should **not** be ignored, i.e.
> >
> >          > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
> >
> >          > should return the empty string (and not character(0) like it
> >          > does at the moment).
> >
> >          > In other words, 'recycle0' should only control the first
> >          > operation (the operation controlled by 'sep'). Which makes plenty
> >          > of sense: the 1st operation is binary (or n-ary) while the
> >          > collapse operation is unary. There is no concept of recycling in
> >          > the context of unary operations.
> >
> >     Interesting, ..., and sounding somewhat convincing.
> >
> >          > On 5/15/20 11:25, Gabriel Becker wrote:
> >          >> Hi all,
> >          >>
> >          >> This makes sense to me, but I would think that recycle0 and
> >          >> collapse should actually be incompatible and paste should throw
> >          >> an error if recycle0 were TRUE and collapse were declared in the
> >          >> same call. I don't think the value of recycle0 should be
> >          >> silently ignored if it is actively specified.
> >          >>
> >          >> ~G
> >
> >     Just to summarize what I think we should know and agree on (or be
> >     "disproven") and where this comes from ...
> >
> >     1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by
> >         default (recycle0 = FALSE) should (and *does* AFAIK) not change
> >         anything, hence paste() / paste0() remain completely
> >         back-compatible if recycle0 is kept at FALSE.
> >
> >     2) recycle0 = TRUE is meant to give different behavior, notably
> >         0-length arguments (among '...') should result in 0-length results.
> >
> >         The above does not specify what this means in detail, see 3)
> >
> >     3) The current R 4.0.0 implementation (for which I'm primarily
> >         responsible) and help(paste) are in accordance.
> >         Notably the help page (Arguments -> 'recycle0' ; Details 1st
> >         para ; Examples) says and shows how the 4.0.0 implementation has
> >         been meant to work.
> >
> >     4) Several provenly smart members of the R community argue that
> >         both the implementation and the documentation of 'recycle0 =
> >         TRUE' should be changed to be more logical / coherent / sensical ..
> >
> >     Is the above all correct in your view?
> >
> >     Assuming yes, I read basically two proposals, both agreeing
> >     that recycle0 = TRUE should only ever apply to the action of 'sep'
> >     but not the action of 'collapse'.
> >
> >     1) Bill and Hervé (I think) propose that 'recycle0' should have
> >         no effect whenever 'collapse = <string>'
> >
> >     2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
> >         should be declared incompatible and error. If going in that
> >         direction, I could also see them giving a warning (and
> >         continuing as if recycle0 = FALSE).
> >
> >
> > Herve makes a good point about when sep and collapse are both set. That
> > said, if the user explicitly sets recycle0, I personally don't think it
> > should be silently ignored under any configuration of other arguments.
> >
> > If all of the arguments are to go into effect, the question then becomes
> > one of ordering, I think.
> >
> > Consider
> >
> >     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
> >           recycle0=TRUE)
> >
> > Currently that returns character(0), because the logic is
> > essentially (in pseudo-code)
> >
> >     collapse(paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ",
> >                    recycle0=TRUE), collapse = ",", recycle0=TRUE)
> >
> >     -> collapse(character(0), collapse = ",", recycle0=TRUE)
> >
> >     -> character(0)
> >
> > Now Bill Dunlap argued, fairly convincingly I think, that paste(...,
> > collapse=<string>) should /always/ return a character vector of length
> > exactly one. With recycle0, though, it will return "" via the progression
> >
> >     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",",
> >           recycle0=TRUE)
> >
> >     -> collapse(character(0), collapse = ",")
> >
> >     -> ""
> >
> >
> > because recycle0 is still applied to the sep-based operation which
> > occurs before collapse, thus leaving a vector of length 0 to collapse.
> >
> > That is consistent but seems unlikely to be what the user wanted, imho.
> > I think if it does this there should be at least a warning when paste
> > collapses to "" this way, if it is allowed at all (i.e. if mixing
> > collapse=<string> and recycle0=TRUE is not simply made an error).
> >
> > I would like to hear others' thoughts as well though. @Pages, Herve
> > @William Dunlap: is "" what you envision as the desired and
> > useful behavior there?
> >
> > Best,
> > ~G
> >
> >
> >
> >     I have not yet made up my mind but would tend to agree with "you guys",
> >     but I think that other R Core members should chime in, too.
> >
> >     Martin
> >
> >          >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès <[hidden email]> wrote:
> >          >>
> >          >> Totally agree with that.
> >          >>
> >          >> H.
> >          >>
> >          >> On 5/15/20 10:34, William Dunlap via R-devel wrote:
> >          >> > I agree: paste(collapse="something", ...) should always return
> >          >> > a single character string, regardless of the value of recycle0.
> >          >> > This would be similar to when there are no non-NULL arguments
> >          >> > to paste; collapse="." gives a single empty string and
> >          >> > collapse=NULL gives a zero long character vector.
> >          >> >> paste()
> >          >> > character(0)
> >          >> >> paste(collapse=", ")
> >          >> > [1] ""
> >          >> >
> >          >> > Bill Dunlap
> >          >> > TIBCO Software
> >          >> > wdunlap tibco.com

Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker-2
I understand that this is consistent but it also strikes me as an enormous
'gotcha' of a magnitude that 'we' are trying to avoid/smooth over at this
point in user-facing R space.

For the record I'm not suggesting it should return something other than "",
and in particular I'm not arguing that any call to paste *that does not
return an error* with non-NULL collapse should return a character vector of
length one.

Rather I'm pointing out that it could (perhaps should, imo) simply be an
error, which is also consistent, in the strict sense, with
previous behavior in that it is the developer simply declining to extend
the recycle0 argument to the full parameter space (there is no rule that
says we must do so; arguments whose use is incompatible with other
arguments can be reasonable and called for).
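
To make the "declare them incompatible" option concrete, here is a minimal sketch. The wrapper name paste_strict and the error message are assumptions made up for illustration; base R's paste() does not behave this way, and the wrapper handles zero-length arguments itself so that it does not depend on how a given R version implements 'recycle0'.

```r
# Hypothetical wrapper (not base R): refuse the combination of
# recycle0 = TRUE with a non-NULL 'collapse' instead of picking a result.
paste_strict <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
  if (isTRUE(recycle0) && !is.null(collapse))
    stop("'recycle0 = TRUE' cannot be combined with a non-NULL 'collapse'")
  # recycle0 = TRUE without 'collapse': zero-length in, zero-length out.
  if (isTRUE(recycle0) && any(lengths(list(...)) == 0L))
    return(character(0))
  # Everything else is ordinary paste().
  paste(..., sep = sep, collapse = collapse)
}

paste_strict(integer(0), "th", sep = "", recycle0 = TRUE)
# character(0)
paste_strict(c("4", "5"), "th", sep = "", collapse = ", ")
# [1] "4th, 5th"
# paste_strict(integer(0), "th", collapse = ", ", recycle0 = TRUE)  # error
```

The trade-off is the one under discussion: an error forces the caller to say which behavior is wanted, at the cost of not supporting the full argument space.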

I don't feel super strongly that returning "" in this and similar
cases is horrible and should never happen, but I'd bet dollars to donuts that
to the extent that behavior occurs it will be a disproportionately major
source of bugs, and I think that's at least worth considering in addition to
pure consistency.

~G

On Fri, May 22, 2020 at 9:50 AM William Dunlap <[hidden email]> wrote:

> I agree with Hervé: 'collapse' is processed last, so a non-NULL 'collapse'
> always leads to a single character string being returned, the same as
> paste(collapse="").  See the altPaste function I posted yesterday.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès-2
Gabe,

It's the current behavior of paste() that is a major source of bugs:

   ## Add "rs" prefix to SNP ids and collapse them in a
   ## comma-separated string.
   collapse_snp_ids <- function(snp_ids)
       paste("rs", snp_ids, sep="", collapse=",")

   snp_groups <- list(
     group1=c(55, 22, 200),
     group2=integer(0),
     group3=c(99, 550)
   )

   vapply(snp_groups, collapse_snp_ids, character(1))
   #            group1            group2            group3
   # "rs55,rs22,rs200"              "rs"      "rs99,rs550"

This has hit me so many times!

Now with 'recycle0=TRUE', we finally have the opportunity to make it do
the right thing. Let's not miss that opportunity.
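
Until the behavior discussed in this thread is settled, the zero-length case can be guarded explicitly. The following is only a minimal sketch: the helper name `collapse_snp_ids_safe` is hypothetical and not part of the original example, and it assumes an empty group should map to the empty string. It avoids 'recycle0' on purpose so that it does not depend on the R version.

```r
## Hypothetical variant of collapse_snp_ids() that special-cases empty input,
## so that an empty group yields "" instead of the stray prefix "rs".
collapse_snp_ids_safe <- function(snp_ids) {
    if (length(snp_ids) == 0L)
        return("")
    paste0("rs", snp_ids, collapse = ",")
}

snp_groups <- list(
    group1 = c(55, 22, 200),
    group2 = integer(0),
    group3 = c(99, 550)
)

## One string per group; the empty group now gives an empty string.
res <- vapply(snp_groups, collapse_snp_ids_safe, character(1))
print(res)
```

The explicit length check is what 'recycle0=TRUE' combined with 'collapse' would make unnecessary if the proposal in this thread is adopted.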

Cheers,
H.


On 5/22/20 11:26, Gabriel Becker wrote:

> I understand that this is consistent but it also strikes me as an
> enormous 'gotcha' of a magnitude that 'we' are trying to avoid/smooth
> over at this point in user-facing R space.
>
> For the record I'm not suggesting it should return something other than
> "", and in particular I'm not arguing that any call to paste /that does
> not return an error/ with non-NULL collapse should return a character
> vector of length one.
>
> Rather I'm pointing out that it could (perhaps should, imo) simply be an
> error, which is also consistent, in the strict sense, with
> previous behavior in that it is the developer simply declining to extend
> the recycle0 argument to the full parameter space (there is no rule that
> says we must do so, arguments whose use is incompatible with other
> arguments can be reasonable and called for).
>
> I don't feel super strongly that returning "" in this and similar
> cases is horrible and should never happen, but I'd bet dollars to donuts
> that to the extent that behavior occurs it will be a disproportionately
> major source of bugs, and I think that's at least worth considering in
> addition to pure consistency.
>
> ~G
>
> On Fri, May 22, 2020 at 9:50 AM William Dunlap <[hidden email]> wrote:
>
>     I agree with Herve, processing collapse happens last so
>     collapse=non-NULL always leads to a single character string being
>     returned, the same as paste(collapse="").  See the altPaste function
>     I posted yesterday.
>
>     Bill Dunlap
>     TIBCO Software
>     wdunlap tibco.com
>
>
>     On Fri, May 22, 2020 at 9:12 AM Hervé Pagès <[hidden email]> wrote:
>
>         I think that
>
>              paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",", recycle0=TRUE)
>
>         should just return an empty string and don't see why it needs to emit a
>         warning or raise an error. To me it does exactly what the user is asking
>         for, which is to change how the 3 arguments are recycled **before** the
>         'sep' operation.
>
>         The 'recycle0' argument has no business in the 'collapse' operation
>         (which comes after the 'sep' operation): this operation still behaves
>         like it always had.
>
>         That's all there is to it.
>
>         H.
>
>
>         On 5/22/20 03:00, Gabriel Becker wrote:
>          > Hi Martin et al,
>          >
>          >
>          >
>          > On Thu, May 21, 2020 at 9:42 AM Martin Maechler <[hidden email]> wrote:
>          >
>          >      >>>>> Hervé Pagès
>          >      >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
>          >
>          >          > There is still the situation where **both** 'sep' and 'collapse' are
>          >          > specified:
>          >
>          >          >> paste(integer(0), "nth", sep="", collapse=",")
>          >          > [1] "nth"
>          >
>          >          > In that case 'recycle0' should **not** be ignored i.e.
>          >
>          >          > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
>          >
>          >          > should return the empty string (and not character(0) like it does at the
>          >          > moment).
>          >
>          >          > In other words, 'recycle0' should only control the first operation (the
>          >          > operation controlled by 'sep'). Which makes plenty of sense: the 1st
>          >          > operation is binary (or n-ary) while the collapse operation is unary.
>          >          > There is no concept of recycling in the context of unary operations.
>          >
>          >     Interesting, ..., and sounding somewhat convincing.
>          >
>          >          > On 5/15/20 11:25, Gabriel Becker wrote:
>          >          >> Hi all,
>          >          >>
>          >          >> This makes sense to me, but I would think that recycle0 and collapse
>          >          >> should actually be incompatible and paste should throw an error if
>          >          >> recycle0 were TRUE and collapse were declared in the same call. I don't
>          >          >> think the value of recycle0 should be silently ignored if it is actively
>          >          >> specified.
>          >          >>
>          >          >> ~G
>          >
>          >     Just to summarize what I think we should know and agree (or be
>          >     "disproven") and where this comes from ...
>          >
>          >     1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
>          >         (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
>          >         hence  paste() / paste0() behave completely back-compatible
>          >         if recycle0 is kept to FALSE.
>          >
>          >     2) recycle0 = TRUE is meant to give different behavior, notably
>          >         0-length arguments (among '...') should result in 0-length results.
>          >
>          >         The above does not specify what this means in detail, see 3)
>          >
>          >     3) The current R 4.0.0 implementation (for which I'm primarily responsible)
>          >         and help(paste)  are in accordance.
>          >         Notably the help page (Arguments -> 'recycle0' ; Details 1st para ; Examples)
>          >         says and shows how the 4.0.0 implementation has been meant to work.
>          >
>          >     4) Several provenly smart members of the R community argue that
>          >         both the implementation and the documentation of 'recycle0 =
>          >         TRUE'  should be changed to be more logical / coherent / sensical ..
>          >
>          >     Is the above all correct in your view?
>          >
>          >     Assuming yes,  I read basically two proposals, both agreeing
>          >     that  recycle0 = TRUE  should only ever apply to the action of 'sep'
>          >     but not the action of 'collapse'.
>          >
>          >     1) Bill and Hervé (I think) propose that 'recycle0' should have
>          >         no effect whenever  'collapse = <string>'
>          >
>          >     2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
>          >         should be declared incompatible and error. If going in that
>          >         direction, I could also see them to give a warning (and
>          >         continue as if recycle0 = FALSE).
>          >
>          >
>          > Herve makes a good point about when sep and collapse are both set. That
>          > said, if the user explicitly sets recycle0, personally, I don't think it
>          > should be silently ignored under any configuration of other arguments.
>          >
>          > If all of the arguments are to go into effect, the question then becomes
>          > one of ordering, I think.
>          >
>          > Consider
>          >
>          >     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",", recycle0=TRUE)
>          >
>          > Currently that returns character(0), because the logic is
>          > essentially (in pseudo-code)
>          >
>          >     collapse(paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ",
>          >     recycle0=TRUE), collapse = ", ", recycle0=TRUE)
>          >
>          >       -> collapse(character(0), collapse = ", ", recycle0=TRUE)
>          >
>          >     -> character(0)
>          >
>          > Now Bill Dunlap argued, fairly convincingly I think, that paste(...,
>          > collapse=<string>) should /always/ return a character vector of length
>          > exactly one. With recycle0, though,  it will return "" via the progression
>          >
>          >     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",", recycle0=TRUE)
>          >
>          >       -> collapse(character(0), collapse = ", ")
>          >
>          >     -> ""
>          >
>          >
>          > because recycle0 is still applied to the sep-based operation which
>          > occurs before collapse, thus leaving a vector of length 0 to collapse.
>          >
>          > That is consistent but seems unlikely to be what the user wanted, imho.
>          > I think if it does this there should be at least a warning when paste
>          > collapses to "" this way, if it is allowed at all (i.e. if mixing
>          > collapse=<string> and recycle0=TRUE is not simply made an error).
>          >
>          > I would like to hear others' thoughts as well though. @Pages, Herve
>          > @William Dunlap is "" what you envision as the desired and
>          > useful behavior there?
>          >
>          > Best,
>          > ~G
>          >
>          >
>          >
>          >     I have not yet made my mind up but would tend to agree with "you guys",
>          >     but I think that other R Core members should chime in, too.
>          >
>          >     Martin
>          >
>          >          >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès <[hidden email]> wrote:
>          >          >>
>          >          >> Totally agree with that.
>          >          >>
>          >          >> H.
>          >          >>
>          >          >> On 5/15/20 10:34, William Dunlap via R-devel wrote:
>          >          >> > I agree: paste(collapse="something", ...) should always return a single
>          >          >> > character string, regardless of the value of recycle0. This would be
>          >          >> > similar to when there are no non-NULL arguments to paste; collapse="."
>          >          >> > gives a single empty string and collapse=NULL gives a zero long character
>          >          >> > vector.
>          >          >> >> paste()
>          >          >> > character(0)
>          >          >> >> paste(collapse=", ")
>          >          >> > [1] ""
>          >          >> >
>          >          >> > Bill Dunlap
>          >          >> > TIBCO Software
>          >          >> > wdunlap tibco.com
>          >          >> >
>          >          >> >
>          >          >> > On Thu, Apr 30, 2020 at 9:56 PM suharto_anggono--- via R-devel <[hidden email]> wrote:
>          >          >> >
>          >          >> >> Without 'collapse', 'paste' pastes (concatenates) its arguments
>          >          >> >> elementwise (separated by 'sep', " " by default). New in R devel and R
>          >          >> >> patched, specifying recycle0 = FALSE makes mixing zero-length and
>          >          >> >> nonzero-length arguments results in length zero. The result of paste(n,
>          >          >> >> "th", sep = "", recycle0 = FALSE) always have the same length as 'n'.
>          >          >> >> Previously, the result is still as long as the longest argument, with the
>          >          >> >> zero-length argument like "". If all of the arguments have length zero,
>          >          >> >> 'recycle0' doesn't matter.
>          >          >> >>
>          >          >> >> As far as I understand, 'paste' with 'collapse' as a character string is
>          >          >> >> supposed to put together elements of a vector into a single character
>          >          >> >> string. I think 'recycle0' shouldn't change it.
>          >          >> >>
>          >          >> >> In current R devel and R patched, paste(character(0), collapse = "",
>          >          >> >> recycle0 = FALSE) is character(0). I think it should be "", like
>          >          >> >> paste(character(0), collapse="").
>          >          >> >>
>          >          >> >> paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 = FALSE)
>          >          >> >> is
>          >          >> >> "4th, 5th".
>          >          >> >> paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 = FALSE)
>          >          >> >> is
>          >          >> >> "4th".
>          >          >> >> I think
>          >          >> >> paste(c(        ), "th", sep = "", collapse = ", ", recycle0 = FALSE)
>          >          >> >> should be
>          >          >> >> "",
>          >          >> >> not character(0).
>          >          >> >>

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

R devel mailing list
> On Friday, May 22, 2020, 6:16:45 PM EDT, Hervé Pagès <[hidden email]> wrote:
>
> Gabe,
>
> It's the current behavior of paste() that is a major source of bugs:
>
>   ## Add "rs" prefix to SNP ids and collapse them in a
>   ## comma-separated string.
>   collapse_snp_ids <- function(snp_ids)
>       paste("rs", snp_ids, sep="", collapse=",")
>
>   snp_groups <- list(
>     group1=c(55, 22, 200),
>     group2=integer(0),
>     group3=c(99, 550)
>   )
>
>   vapply(snp_groups, collapse_snp_ids, character(1))
>   #            group1            group2            group3
>   # "rs55,rs22,rs200"              "rs"      "rs99,rs550"
>
> This has hit me so many times!
>
> Now with 'recycle0=TRUE', we finally have the opportunity to make it do
> the right thing. Let's not miss that opportunity.
>
> Cheers,
> H.

FWIW what convinces me is consistency with other aggregating functions applied
to zero length inputs:

sum(numeric(0))
## [1] 0

>
>
> On 5/22/20 11:26, Gabriel Becker wrote:
> > I understand that this is consistent but it also strikes me as an
> > enormous 'gotcha' of a magnitude that 'we' are trying to avoid/smooth
> > over at this point in user-facing R space.
> >
> > For the record I'm not suggesting it should return something other than
> > "", and in particular I'm not arguing that any call to paste /that does
> > not return an error/ with non-NULL collapse should return a character
> > vector of length one.


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès-2
On 5/22/20 18:12, brodie gaslam wrote:
>
> FWIW what convinces me is consistency with other aggregating functions applied
> to zero length inputs:
>
> sum(numeric(0))
> ## [1] 0

Right.

And 1 is the identity element of multiplication:

 > prod(numeric(0))
[1] 1

And the empty string is the identity element of string aggregation by
concatenation.
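
The analogy can be checked directly at the console. This is a small sketch using only base R; the variable names are illustrative. It relies on the long-standing behavior of paste() with 'collapse' on a single zero-length vector, which does not involve 'recycle0'.

```r
## Aggregating a zero-length input returns the identity element of the operation.
empty_sum    <- sum(numeric(0))                     # 0, identity of addition
empty_prod   <- prod(numeric(0))                    # 1, identity of multiplication
empty_concat <- paste(character(0), collapse = "")  # "", identity of concatenation

## Because "" is the identity, collapsing in pieces and then concatenating the
## pieces agrees with collapsing everything at once, even if a piece is empty.
a <- c("x", "y")
b <- character(0)
all_at_once <- paste(c(a, b), collapse = "")
in_pieces   <- paste0(paste(a, collapse = ""), paste(b, collapse = ""))
print(identical(all_at_once, in_pieces))
```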

H.


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker-2
In reply to this post by Hervé Pagès-2
Herve (et al.),

On Fri, May 22, 2020 at 3:16 PM Hervé Pagès <[hidden email]> wrote:

> Gabe,
>
> It's the current behavior of paste() that is a major source of bugs:
>
>    ## Add "rs" prefix to SNP ids and collapse them in a
>    ## comma-separated string.
>    collapse_snp_ids <- function(snp_ids)
>        paste("rs", snp_ids, sep="", collapse=",")
>
>    snp_groups <- list(
>      group1=c(55, 22, 200),
>      group2=integer(0),
>      group3=c(99, 550)
>    )
>
>    vapply(snp_groups, collapse_snp_ids, character(1))
>    #            group1            group2            group3
>    # "rs55,rs22,rs200"              "rs"      "rs99,rs550"
>
> This has hit me so many times!
>
> Now with 'recycle0=TRUE', we finally have the opportunity to make it do
> the right thing. Let's not miss that opportunity.
>

I see what you're saying, but I don't know. Maybe my intuition is just
different, but when I collapse multiple character vectors together, I
expect all the characters from each of those vectors to be in the resulting
collapsed one. In your example it's a string literal to be added
elementwise as a prefix, but what if it is another vector of length > 1?
Wouldn't it be strange that all those values are wiped and absent from the
resulting string? Maybe it's just me. For example, for paste(x, y, z, sep = "",
collapse = ", ", recycle0 = TRUE), if length(y) is 0, it literally makes no
difference what x and z are.
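
The scenario can be spelled out as two explicit steps. This is a sketch only: it assumes R >= 4.0.0 (where 'recycle0' exists), the names `step1` and `step2` are illustrative, and the collapse is done in a separate call so the result does not depend on how a given R version combines 'recycle0' with 'collapse'.

```r
x <- c("a", "b")
y <- character(0)
z <- c("c", "d")

## Step 1: the elementwise 'sep' operation. With recycle0 = TRUE a zero-length
## argument makes the whole result zero-length, whatever x and z contain.
step1 <- paste(x, y, z, sep = "", recycle0 = TRUE)

## Step 2: the unary 'collapse' operation applied to that zero-length vector.
step2 <- paste(step1, collapse = ", ")

print(step1)
print(step2)
```

Under this two-step reading, the values of x and z are dropped in step 1, which is the point of contention: the final "" carries no trace of them.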

I seem to be being largely outvoted anyway though, so we will see what
Martin and others who may pop up might think, but I raised the points I
wanted to raise so we'll see where things ultimately fall.

~G



>
> Cheers,
> H.
>
>
> On 5/22/20 11:26, Gabriel Becker wrote:
> > I understand that this is consistent but it also strikes me as an
> > enormous 'gotcha' of a magnitude that 'we' are trying to avoid/smooth
> > over at this point in user-facing R space.
> >
> > For the record I'm not suggesting it should return something other than
> > "", and in particular I'm not arguing that any call to paste /that does
> > not return an error/ with non-NULL collapse should return a character
> > vector of length one.
> >
> > Rather I'm pointing out that it could (perhaps should, imo) simply be an
> > error, which is also consistent, in the strict sense, with
> > previous behavior in that it is the developer simply declining to extend
> > the recycle0 argument to the full parameter space (there is no rule that
> > says we must do so, arguments whose use is incompatible with other
> > arguments can be reasonable and called for).
> >
> > I don't feel super strongly that returning "" in this and similar
> > cases is horrible and should never happen, but I'd bet dollars to donuts
> > that to the extent that behavior occurs it will be a disproportionately
> > major source of bugs, and I think that's at least worth considering in
> > addition to pure consistency.
> >
> > ~G
> >
> > On Fri, May 22, 2020 at 9:50 AM William Dunlap <[hidden email]> wrote:
> >
> >     I agree with Herve, processing collapse happens last so
> >     collapse=non-NULL always leads to a single character string being
> >     returned, the same as paste(collapse="").  See the altPaste function
> >     I posted yesterday.
> >
> >     Bill Dunlap
> >     TIBCO Software
> >     wdunlap tibco.com
> >
> >
> >
> >     On Fri, May 22, 2020 at 9:12 AM Hervé Pagès <[hidden email]> wrote:
> >
> >         I think that
> >
> >              paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ", collapse = ",", recycle0=TRUE)
> >
> >         should just return an empty string and don't see why it needs to emit a
> >         warning or raise an error. To me it does exactly what the user is asking
> >         for, which is to change how the 3 arguments are recycled **before** the
> >         'sep' operation.
> >
> >         The 'recycle0' argument has no business in the 'collapse' operation
> >         (which comes after the 'sep' operation): this operation still behaves
> >         like it always had.
> >
> >         That's all there is to it.
> >
> >         H.
> >
> >
> >         On 5/22/20 03:00, Gabriel Becker wrote:
> >          > Hi Martin et al,
> >          >
> >          >
> >          >
> >          > On Thu, May 21, 2020 at 9:42 AM Martin Maechler <[hidden email]> wrote:
> >          >
> >          >      >>>>> Hervé Pagès
> >          >      >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
> >          >
> >          >          > There is still the situation where **both** 'sep' and 'collapse' are
> >          >          > specified:
> >          >
> >          >          >> paste(integer(0), "nth", sep="", collapse=",")
> >          >          > [1] "nth"
> >          >
> >          >          > In that case 'recycle0' should **not** be ignored i.e.
> >          >
> >          >          > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
> >          >
> >          >          > should return the empty string (and not character(0) like it does at the
> >          >          > moment).
> >          >
> >          >          > In other words, 'recycle0' should only control the first operation (the
> >          >          > operation controlled by 'sep'). Which makes plenty of sense: the 1st
> >          >          > operation is binary (or n-ary) while the collapse operation is unary.
> >          >          > There is no concept of recycling in the context of unary operations.
> >          >
> >          >     Interesting, ..., and sounding somewhat convincing.
> >          >
> >          >          > On 5/15/20 11:25, Gabriel Becker wrote:
> >          >          >> Hi all,
> >          >          >>
> >          >          >> This makes sense to me, but I would think that recycle0 and collapse
> >          >          >> should actually be incompatible and paste should throw an error if
> >          >          >> recycle0 were TRUE and collapse were declared in the same call. I don't
> >          >          >> think the value of recycle0 should be silently ignored if it is actively
> >          >          >> specified.
> >          >          >>
> >          >          >> ~G
> >          >
> >          >     Just to summarize what I think we should know and agree
> >         (or
> >          >     be "disproven") and where this comes from ...
> >          >
> >          >     1) recycle0 is a new R 4.0.0 option in paste() / paste0()
> >         which by
> >          >     default
> >          >         (recycle0 = FALSE) should (and *does* AFAIK) not
> >         change anything,
> >          >         hence  paste() / paste0() behave completely
> >         back-compatible
> >          >         if recycle0 is kept to FALSE.
> >          >
> >          >     2) recycle0 = TRUE is meant to give different behavior,
> >         notably
> >          >         0-length arguments (among '...') should result in
> >         0-length results.
> >          >
> >          >         The above does not specify what this means in detail,
> >         see 3)
> >          >
> >          >     3) The current R 4.0.0 implementation (for which I'm
> >         primarily
> >          >     responsible)
> >          >         and help(paste)  are in accordance.
> >          >         Notably the help page (Arguments -> 'recycle0' ;
> >         Details 1st
> >          >     para ; Examples)
> >          >         says and shows how the 4.0.0 implementation has been
> >         meant to work.
> >          >
> >          >     4) Several provenly smart members of the R community
> >         argue that
> >          >         both the implementation and the documentation of
> >         'recycle0 =
> >          >         TRUE'  should be changed to be more logical /
> >         coherent / sensical ..
> >          >
> >          >     Is the above all correct in your view?
> >          >
> >          >     Assuming yes,  I read basically two proposals, both
> agreeing
> >          >     that  recycle0 = TRUE  should only ever apply to the
> >         action of 'sep'
> >          >     but not the action of 'collapse'.
> >          >
> >          >     1) Bill and Hervé (I think) propose that 'recycle0'
> >         should have
> >          >         no effect whenever  'collapse = <string>'
> >          >
> >          >     2) Gabe proposes that 'collapse = <string>' and 'recycle0
> >         = TRUE'
> >          >         should be declared incompatible and error. If going
> >         in that
> >          >         direction, I could also see them to give a warning
> (and
> >          >         continue as if recycle0 = FALSE).
> >          >
> >          >
> >          > Herve makes a good point about when sep and collapse are both
> >         set. That
> >          > said, if the user explicitly sets recycle0, personally, I
> >         don't think it
> >          > should be silently ignored under any configuration of other
> >         arguments.
> >          >
> >          > If all of the arguments are to go into effect, the question
> >         then becomes
> >          > one of ordering, I think.
> >          >
> >          > Consider
> >          >
> >          >     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ",
> >         collapse = ",",
> >          >     recycle0=TRUE)
> >          >
> >          > Currently that returns character(0), because the logic is
> >          > essentially (in pseudo-code)
> >          >
> >          >     collapse(paste(c("a", "b"), NULL, c("c",  "d"),  sep = "
> ",
> >          >     recycle0=TRUE), collapse = ", ", recycle0=TRUE)
> >          >
> >          >       -> collapse(character(0), collapse = ", ", recycle0=TRUE)
> >          >
> >          >     -> character(0)
> >          >
> >          > Now Bill Dunlap argued, fairly convincingly I think, that
> >         paste(...,
> >          > collapse=<string>) should /always/ return a character vector
> >         of length
> >          > exactly one. With recycle0, though,  it will return "" via
> >         the progression
> >          >
> >          >     paste(c("a", "b"), NULL, c("c",  "d"),  sep = " ",
> >         collapse = ",",
> >          >     recycle0=TRUE)
> >          >
> >          >       -> collapse(character(0), collapse = ", ")
> >          >
> >          >     -> ""
> >          >
> >          >
> >          > because recycle0 is still applied to the sep-based operation
> >         which
> >          > occurs before collapse, thus leaving a vector of length 0 to
> >         collapse.
> >          >
> >          > That is consistent but seems unlikely to be what the user
> >         wanted, imho.
> >          > I think if it does this there should be at least a warning
> >         when paste
> >          > collapses to "" this way, if it is allowed at all (ie if
> mixing
> >          > collapse=<string> and recycle0=TRUE is not simply made an
> error).
> >          >
> >          > I would like to hear others' thoughts as well though. @Pages,
> >         Herve
> >          > <mailto:[hidden email] <mailto:[hidden email]>>
> >         @William Dunlap
> >          > <mailto:[hidden email] <mailto:[hidden email]>> is ""
> >         what you envision as the desired and
> >          > useful behavior there?
> >          >
> >          > Best,
> >          > ~G
> >          >
> >          >
> >          >
> >          >     I have not yet my mind up but would tend to agree to "you
> >         guys",
> >          >     but I think that other R Core members should chime in,
> too.
> >          >
> >          >     Martin
> >          >
> >          >          >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès
> >          >     <[hidden email] <mailto:[hidden email]>
> >         <mailto:[hidden email] <mailto:[hidden email]>>
> >          >          >> <mailto:[hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>>>
> >          >     wrote:
> >          >          >>
> >          >          >> Totally agree with that.
> >          >          >>
> >          >          >> H.
> >          >          >>
> >          >          >> On 5/15/20 10:34, William Dunlap via R-devel
> wrote:
> >          >          >> > I agree: paste(collapse="something", ...)
> >         should always
> >          >     return a
> >          >          >> single
> >          >          >> > character string, regardless of the value of
> >         recycle0.
> >          >     This would be
> >          >          >> > similar to when there are no non-NULL arguments
> >         to paste;
> >          >          >> collapse="."
> >          >          >> > gives a single empty string and collapse=NULL
> >         gives a zero
> >          >     long
> >          >          >> character
> >          >          >> > vector.
> >          >          >> >> paste()
> >          >          >> > character(0)
> >          >          >> >> paste(collapse=", ")
> >          >          >> > [1] ""
> >          >          >> >
> >          >          >> > Bill Dunlap
> >          >          >> > TIBCO Software
> >          >          >> > wdunlap tibco.com
> >          >          >> >
> >          >          >> >
> >          >          >> > On Thu, Apr 30, 2020 at 9:56 PM
> >         suharto_anggono--- via
> >          >     R-devel <
> >          >          >> > [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>
> >          >     <mailto:[hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>>> wrote:
> >          >          >> >
> >          >          >> >> Without 'collapse', 'paste' pastes
> >         (concatenates) its
> >          >     arguments
> >          >          >> >> elementwise (separated by 'sep', " " by
> >         default). New in
> >          >     R devel
> >          >          >> and R
> >          >          >> >> patched, specifying recycle0 = FALSE makes
> mixing
> >          >     zero-length and
> >          >          >> >> nonzero-length arguments results in length
> >         zero. The
> >          >     result of
> >          >          >> paste(n,
> >          >          >> >> "th", sep = "", recycle0 = FALSE) always have
> >         the same
> >          >     length as
> >          >          >> 'n'.
> >          >          >> >> Previously, the result is still as long as the
> >         longest
> >          >     argument,
> >          >          >> with the
> >          >          >> >> zero-length argument like "". If all og the
> >         arguments have
> >          >          >> length zero,
> >          >          >> >> 'recycle0' doesn't matter.
> >          >          >> >>
> >          >          >> >> As far as I understand, 'paste' with
> >         'collapse' as a
> >          >     character
> >          >          >> string is
> >          >          >> >> supposed to put together elements of a vector
> >         into a single
> >          >          >> character
> >          >          >> >> string. I think 'recycle0' shouldn't change it.
> >          >          >> >>
> >          >          >> >> In current R devel and R patched,
> >         paste(character(0),
> >          >     collapse = "",
> >          >          >> >> recycle0 = FALSE) is character(0). I think it
> >         should be
> >          >     "", like
> >          >          >> >> paste(character(0), collapse="").
> >          >          >> >>
> >          >          >> >> paste(c("4", "5"), "th", sep = "", collapse =
> >         ", ",
> >          >     recycle0 =
> >          >          >> FALSE)
> >          >          >> >> is
> >          >          >> >> "4th, 5th".
> >          >          >> >> paste(c("4"     ), "th", sep = "", collapse =
> >         ", ",
> >          >     recycle0 =
> >          >          >> FALSE)
> >          >          >> >> is
> >          >          >> >> "4th".
> >          >          >> >> I think
> >          >          >> >> paste(c(        ), "th", sep = "", collapse =
> >         ", ",
> >          >     recycle0 =
> >          >          >> FALSE)
> >          >          >> >> should be
> >          >          >> >> "",
> >          >          >> >> not character(0).
> >          >          >> >>
> >          >          >> >> ______________________________________________
> >          >          >> >> [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>
> >          >     <mailto:[hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>>
> >          >     mailing list
> >          >          >> >>
> >          >          >>
> >          >
> >          >          >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >          >          >> >>
> >          >          >> >
> >          >          >> >       [[alternative HTML version deleted]]
> >          >          >> >
> >          >          >> > ______________________________________________
> >          >          >> > [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>
> >          >     <mailto:[hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>>
> >          >     mailing list
> >          >          >> >
> >          >          >>
> >          >
> >          >          >> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >          >          >> >
> >          >          >>
> >          >          >> --
> >          >          >> Hervé Pagès
> >          >          >>
> >          >          >> Program in Computational Biology
> >          >          >> Division of Public Health Sciences
> >          >          >> Fred Hutchinson Cancer Research Center
> >          >          >> 1100 Fairview Ave. N, M1-B514
> >          >          >> P.O. Box 19024
> >          >          >> Seattle, WA 98109-1024
> >          >          >>
> >          >          >> E-mail: [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>
> >          >     <mailto:[hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>>
> >          >          >> Phone:  (206) 667-5791
> >          >          >> Fax:    (206) 667-1319
> >          >          >>
> >          >          >> ______________________________________________
> >          >          >> [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>
> >          >     <mailto:[hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>>
> >          >     mailing list
> >          >          >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >          >          >>
> >          >
> >          >          > --
> >          >          > Hervé Pagès
> >          >
> >          >          > Program in Computational Biology
> >          >          > Division of Public Health Sciences
> >          >          > Fred Hutchinson Cancer Research Center
> >          >          > 1100 Fairview Ave. N, M1-B514
> >          >          > P.O. Box 19024
> >          >          > Seattle, WA 98109-1024
> >          >
> >          >          > E-mail: [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>>
> >          >          > Phone:  (206) 667-5791
> >          >          > Fax:    (206) 667-1319
> >          >
> >          >          > ______________________________________________
> >          >          > [hidden email]
> >         <mailto:[hidden email]> <mailto:[hidden email]
> >         <mailto:[hidden email]>> mailing list
> >          >          > https://stat.ethz.ch/mailman/listinfo/r-devel
> >          >
> >
> >         --
> >         Hervé Pagès
> >
> >         Program in Computational Biology
> >         Division of Public Health Sciences
> >         Fred Hutchinson Cancer Research Center
> >         1100 Fairview Ave. N, M1-B514
> >         P.O. Box 19024
> >         Seattle, WA 98109-1024
> >
> >         E-mail: [hidden email] <mailto:[hidden email]>
> >         Phone:  (206) 667-5791
> >         Fax:    (206) 667-1319
> >
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: [hidden email]
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker-2
In reply to this post by R devel mailing list
Brodie,

A good point, but more analogous to what I'm concerned with is

> sum(5, numeric(0))

[1] 5


Not 0 (the analogue of Herve's desired behavior).
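
The two sum() results contrasted in this exchange can be checked side by side with the default paste() behaviour they are being compared with. This is only a sketch in base R with the default recycle0 = FALSE; pairing each sum() call with a paste() call is an illustration of the analogy made here, not something either function documents, and the variable names are made up for the example.

```r
# Aggregating an empty input yields the "neutral" value.
empty_sum   <- sum(numeric(0))                      # 0
empty_paste <- paste(character(0), collapse = ",")  # ""

# A zero-length argument next to a non-empty one is skipped by sum() ...
mixed_sum <- sum(5, numeric(0))                     # 5

# ... and, with the default recycle0 = FALSE, paste() treats it like "".
mixed_paste <- paste("rs", integer(0), sep = "", collapse = ",")  # "rs"
```

The question under discussion is which of the two rows the recycle0 = TRUE case should resemble when 'collapse' is also given.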

Best,
~G

PS Brodie sorry for the double.

On Fri, May 22, 2020 at 6:12 PM brodie gaslam <[hidden email]>
wrote:

> > On Friday, May 22, 2020, 6:16:45 PM EDT, Hervé Pagès <
> [hidden email]> wrote:
> >
> > Gabe,
> >
> > It's the current behavior of paste() that is a major source of bugs:
> >
> >   ## Add "rs" prefix to SNP ids and collapse them in a
> >   ## comma-separated string.
> >   collapse_snp_ids <- function(snp_ids)
> >       paste("rs", snp_ids, sep="", collapse=",")
> >
> >   snp_groups <- list(
> >     group1=c(55, 22, 200),
> >     group2=integer(0),
> >     group3=c(99, 550)
> >   )
> >
> >   vapply(snp_groups, collapse_snp_ids, character(1))
> >   #            group1            group2            group3
> >   # "rs55,rs22,rs200"              "rs"      "rs99,rs550"
> >
> > This has hit me so many times!
> >
> > Now with 'recycle0=TRUE', we finally have the opportunity to make it do
> > the right thing. Let's not miss that opportunity.
> >
> > Cheers,
> > H.
>
> FWIW what convinces me is consistency with other aggregating functions
> applied
> to zero length inputs:
>
> sum(numeric(0))
> ## [1] 0
>
> >
> >
> > On 5/22/20 11:26, Gabriel Becker wrote:
> > > I understand that this is consistent but it also strikes me as an
> > > enormous 'gotcha' of a magnitude that 'we' are trying to avoid/smooth
> > > over at this point in user-facing R space.
> > >
> > > For the record I'm not suggesting it should return something other than
> > > "", and in particular I'm not arguing that any call to paste /that does
> > > not return an error/ with non-NULL collapse should return a character
> > > vector of length one.
>


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Hervé Pagès-2
In reply to this post by Gabriel Becker-2
On 5/23/20 17:45, Gabriel Becker wrote:
> Maybe my intuition is just
> different but when I collapse multiple character vectors together, I
> expect all the characters from each of those vectors to be in the
> resulting collapsed one.

Yes I'd expect that too. But the **collapse** operation in paste() has
never been about collapsing **multiple** character vectors together.
What it does is collapse the **single** character vector that comes out
of the 'sep' operation.

So

   paste(x, y, z, sep="", collapse=",")

is analogous to

   sum(x + y + z)

The element-wise addition is analogous to the 'sep' operation.
The sum() operation is analogous to the 'collapse' operation.
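
The two-step reading described above can be spelled out in base R. This is a minimal sketch with made-up inputs (x, y, z, a, b, d and the result names are illustrative only), using the default recycle0 = FALSE throughout:

```r
x <- c("a", "b"); y <- c("c", "d"); z <- c("e", "f")

# Step 1: the n-ary, element-wise 'sep' operation gives one character vector.
pasted <- paste(x, y, z, sep = "")          # "ace" "bdf"

# Step 2: the unary 'collapse' operation reduces that vector to one string.
collapsed <- paste(pasted, collapse = ",")  # "ace,bdf"

# The single call gives the same result as the two steps.
one_call <- paste(x, y, z, sep = "", collapse = ",")
stopifnot(identical(collapsed, one_call))

# Numeric counterpart: element-wise addition, then sum().
a <- c(1, 2); b <- c(3, 4); d <- c(5, 6)
total <- sum(a + b + d)                     # 21
```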

H.


Re: paste(character(0), collapse="", recycle0=FALSE) should be ""

Gabriel Becker-2
On Sat, May 23, 2020 at 9:59 PM Hervé Pagès <[hidden email]> wrote:

> On 5/23/20 17:45, Gabriel Becker wrote:
> > Maybe my intuition is just
> > different but when I collapse multiple character vectors together, I
> > expect all the characters from each of those vectors to be in the
> > resulting collapsed one.
>
> Yes I'd expect that too. But the **collapse** operation in paste() has
> never been about collapsing **multiple** character vectors together.
> What it does is collapse the **single** character vector that comes out
> of the 'sep' operation.
>

I understand what it does; I broke it down the same way in my post earlier
in the thread. The fact remains that it is a single function, which
significantly muddies the waters. So you can say

paste0(x,y, collapse=",", recycle0=TRUE)

is not a collapse operation on multiple vectors, and of course there's a
sense in which you're not wrong (again I understand what these functions
do), but it sure looks like one in the invocation, doesn't it?

Honestly, the thing that this whole discussion has shown me most clearly is
that, imho, collapse (accepting ONLY one data vector) and paste (accepting
multiple) should never have been a single function to begin with. But that
ship sailed long, long ago.




> So
>
>    paste(x, y, z, sep="", collapse=",")
>
> is analogous to
>
>    sum(x + y + z)
>

Honestly, I'd be significantly more comfortable if

1:10 + integer(0) + 5

were an error too.

At least I'm consistent right?
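
For reference, the arithmetic case mentioned above is not an error in base R: a zero-length operand silently makes the whole result zero-length. The sketch below shows that, next to paste() without 'collapse'. It assumes an R version whose paste() has the recycle0 argument; the variable names are illustrative.

```r
# Arithmetic: any zero-length operand gives a zero-length result,
# with no warning and no error.
res <- 1:10 + integer(0) + 5
length(res)                                              # 0

# paste() without 'collapse': the default treats the zero-length
# argument like "", while recycle0 = TRUE follows the arithmetic rule.
p_default <- paste("a", character(0))                    # "a "
p_recycle <- paste("a", character(0), recycle0 = TRUE)   # character(0)
```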

~G
