

In the help page of ?tapply it says that the first argument (X) is "an
atomic object, typically a vector."
However, tapply seems to be able to handle list objects. For example:
###################
l <- as.list(1:10)
is.atomic(l) # FALSE
index <- c(rep(1,5),rep(2,5))
tapply(l,index,unlist)
> tapply(l,index,unlist)
$`1`
[1] 1 2 3 4 5
$`2`
[1] 6 7 8 9 10
###################
Hence, does this mean a list is an atomic object (which I thought it wasn't), or
does the help for tapply need updating?
(Or is there some third option I'm missing?)
Thanks.
Contact
Details:
Contact me: [hidden email] 
Read me: www.talgalili.com (Hebrew)  www.biostatistics.co.il (Hebrew) 
www.rstatistics.com (English)

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Did you ever receive a reply to this?
Note that for your example:
> tapply(l,index,sum)
Error in FUN(X[[i]], ...) : invalid 'type' (list) of argument
A list is definitely not atomic (is.recursive(l) returns TRUE).
So it looks like a "quirk" that FUN = unlist doesn't raise an error.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
 Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Sat, Feb 4, 2017 at 4:17 AM, Tal Galili < [hidden email]> wrote:
> In the help page of ?tapply it says that the first argument (X) is "an
> atomic object, typically a vector."


Hi,
tapply() will work on any object 'X' that has a length and supports
single-bracket subsetting. These objects are sometimes called
"vector-like" objects. Atomic vectors, lists, S4 objects with a "length"
and "[" method, etc. are examples of "vector-like" objects.
So instead of saying
X: an atomic object, typically a vector.
I think it would be more accurate if the man page said something
like
X: a vector-like object that supports subsetting with `[`, typically
an atomic vector.
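To make this concrete, here is a minimal sketch of a list going through tapply(). It is only an illustration, not something taken from the tapply() sources; the names x, g and counts are made up for the example.

```r
# A list with mixed element types: not atomic, but it has a length
# and supports single-bracket subsetting.
x <- list(1, "a", TRUE, 2, "b", FALSE)
g <- c("num", "chr", "lgl", "num", "chr", "lgl")

stopifnot(!is.atomic(x), length(x) == length(g))

# length() accepts a list, so FUN works on every subset tapply() builds.
counts <- tapply(x, g, length)
print(counts)
```

The call succeeds because FUN (here length) is happy to receive a list; whether X is atomic does not enter into it.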
H.
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1B514
P.O. Box 19024
Seattle, WA 981091024
Email: [hidden email]
Phone: (206) 6675791
Fax: (206) 6671319


Hervé:
Kindly explain this, then:
> l <- as.list(1:10)
> is.atomic(l) # FALSE
[1] FALSE
> index <- c(rep(1,5),rep(2,5))
>
>
> tapply(l,index,unlist)
$`1`
[1] 1 2 3 4 5
$`2`
[1] 6 7 8 9 10
>
> ## But
>
> tapply(l,index, sum)
Error in FUN(X[[i]], ...) : invalid 'type' (list) of argument
Cheers,
Bert
On Tue, Feb 14, 2017 at 5:10 PM, Hervé Pagès < [hidden email]> wrote:
> Hi,
>
> tapply() will work on any object 'X' that has a length and supports
> single-bracket subsetting. These objects are sometimes called
> "vector-like" objects. Atomic vectors, lists, S4 objects with a "length"
> and "[" method, etc. are examples of "vector-like" objects.


The problem with Bert's second example is that sum doesn't work on a list.
The tapply worked correctly.
> unlist(l[1:5])
[1] 1 2 3 4 5
> sum(l[1:5])
Error in sum(l[1:5]) : invalid 'type' (list) of argument
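If group sums of the list are what is wanted, one possible workaround (just a sketch, with the wrapper written for this example) is to hand tapply() a function that flattens each subset before summing:

```r
l <- as.list(1:10)
index <- c(rep(1, 5), rep(2, 5))

# Each subset arrives as a list, so unlist() it before calling sum().
sums <- tapply(l, index, function(g) sum(unlist(g)))
print(sums)
```

This assumes every element of the list is numeric; with mixed element types unlist() would coerce to a common type and sum() could still fail.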
On Tue, Feb 14, 2017 at 8:28 PM, Bert Gunter < [hidden email]> wrote:
> Hervé:
>
> Kindly explain this, then:


Right. More precisely, the function passed through the FUN argument must
work on the subsets of X generated internally by tapply(). You can
actually see these subsets by passing the identity function:
X <- letters[1:10]
INDEX <- c(rep(1,5),rep(2,5))
tapply(X, INDEX, FUN=identity)
# $`1`
# [1] "a" "b" "c" "d" "e"
#
# $`2`
# [1] "f" "g" "h" "i" "j"
Doing this shows you how tapply() splits the vector-like object X into
a list of subsets. If you replace the identity function with a function
that cannot be applied to these subsets, then you get an error:
tapply(X, INDEX, FUN=sum)
# Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
As you can see, here we get an error even though X is an atomic vector.
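The split-then-apply idea can also be spelled out by hand. The following is only a rough sketch of the idea, not the actual tapply() implementation (which also handles multiple grouping factors and simplification); the variable names are made up for the example.

```r
X <- letters[1:10]
INDEX <- c(rep(1, 5), rep(2, 5))

# Step 1: split X into one subset per level of INDEX.
groups <- split(X, INDEX)

# Step 2: apply FUN to each subset.
by_hand <- lapply(groups, toupper)

# The same thing in one call.
via_tapply <- tapply(X, INDEX, toupper)

# Both routes give the same per-group results.
stopifnot(identical(by_hand[[1]], via_tapply[[1]]),
          identical(by_hand[[2]], via_tapply[[2]]))
```

Seen this way, the only requirement on FUN is that it accepts whatever split() produces for each group.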
H.
On 02/14/2017 05:41 PM, Richard M. Heiberger wrote:
> The problem with Bert's second example is that sum doesn't work on a list.
> The tapply worked correctly.
>
>> unlist(l[1:5])
> [1] 1 2 3 4 5
>
>> sum(l[1:5])
> Error in sum(l[1:5]) : invalid 'type' (list) of argument



Yes, exactly.
So my point is that this:
"X: a vector-like object that supports subsetting with `[`, typically
an atomic vector."
is incorrect, or at least a bit opaque, without further emphasizing
that FUN must accept the result of "[". With atomic vectors, the error
that you produced was obvious, but with lists, I believe not so. I
appreciate the desire for brevity, but I think clarity should be the
primary goal. Maybe it *is* just me, but I think a few extra words of
explanation here would not go amiss.
But, anyway, thanks for the clarification.
Cheers,
Bert
On Tue, Feb 14, 2017 at 6:04 PM, Hervé Pagès < [hidden email]> wrote:
> Right. More precisely the function passed thru the FUN argument must
> work on the subsets of X generated internally by tapply(). You can
> actually see these subsets by passing the identity function:


On 02/14/2017 06:39 PM, Bert Gunter wrote:
> Yes, exactly.
>
> So my point is that this:
>
> "X: a vector-like object that supports subsetting with `[`, typically
> an atomic vector."
>
> is incorrect, or at least a bit opaque, without further emphasizing
> that FUN must accept the result of "[".
Maybe this kind of detail belongs in the description of the FUN
argument. However, please note that the man pages for lapply() and the
other *apply() functions don't emphasize that the supplied FUN must be
a function that accepts the things it is applied to either,
and nobody seems to make a big deal of it. Maybe because it's obvious?
> With atomic vectors, the error
> that you produced was obvious, but with lists, I believe not so.
Well, it's the same error. Maybe what's not obvious is that in both
cases the error is coming from sum(), not from tapply() itself.
sum() is complaining that it receives something that it doesn't
know how to handle. The clue is in how the error message starts:
Error in FUN(X[[i]], ...):
Maybe one could argue this is a little bit cryptic. Note the difference
when the error is coming from tapply() itself:
> X <- letters[1:9]
> INDEX <- c(rep(1,5),rep(2,5))
> tapply(X, INDEX, FUN=identity)
Error in tapply(X, INDEX, FUN = identity) :
arguments must have same length
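One way to check where an error originates without reading the message by eye is to catch the condition and look at its call. This is a small sketch; where_from() is a helper invented for this example, not part of base R.

```r
# Return the (deparsed) call that raised an error, or NA if none occurred.
where_from <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) deparse(conditionCall(e))[1])
}

INDEX <- c(rep(1, 5), rep(2, 5))

# FUN cannot handle character subsets: the error is raised inside FUN.
from_fun <- where_from(tapply(letters[1:10], INDEX, FUN = sum))

# X and INDEX differ in length: the error is raised by tapply() itself.
from_tapply <- where_from(tapply(letters[1:9], INDEX, FUN = identity))

print(from_fun)
print(from_tapply)
```

The first call is reported as coming from FUN, the second from tapply(), which mirrors the two "Error in ..." prefixes above.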
H.
> I appreciate the desire for brevity, but I think clarity should be the
> primary goal. Maybe it *is* just me, but I think a few extra words of
> explanation here would not go amiss.
>
> But, anyway, thanks for the clarification.


It seems like this should be consistent with split(), since that's
what actually powers the behaviour.
Reading the description for split leads to this rather interesting example:
tapply(mtcars, 1:11, I)
Hadley
http://hadley.nz


You could also call this "interesting example" a bug.
Clearly not enough code reuse in the implementation of tapply().
Instead of the current 25 lines of code, it could be a simple
wrapper around split() and sapply(), e.g. something like:

  tapply2 <- function(X, INDEX, FUN=NULL, ..., simplify=TRUE)
  {
    f <- make_factor_from_INDEX(INDEX)  # same as tapply(INDEX=INDEX, FUN=NULL)
    sapply(split(X, f), FUN, ..., simplify=simplify, USE.NAMES=FALSE)
  }

and then be guaranteed to behave consistently with split() and
sapply(). Also the make_factor_from_INDEX() step maybe could be
shared with what aggregate.data.frame() does internally with its
'by' argument.
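For anyone who wants to try the idea, here is a runnable sketch of that wrapper. make_factor_from_INDEX() is a hypothetical helper, not a base R function; the version below assumes that a single grouping vector is enough, and collapses a list of factors with interaction(). FUN is required here, unlike in the outline above. It is not a drop-in replacement: it does not reproduce tapply()'s array result for several grouping factors, nor its handling of empty groups.

```r
# Hypothetical helper (not in base R): reduce INDEX to one grouping factor.
make_factor_from_INDEX <- function(INDEX) {
  if (is.list(INDEX)) interaction(INDEX, drop = TRUE) else as.factor(INDEX)
}

# Sketch of a tapply()-like wrapper built only from split() and sapply().
tapply2 <- function(X, INDEX, FUN, ..., simplify = TRUE) {
  f <- make_factor_from_INDEX(INDEX)
  sapply(split(X, f), FUN, ..., simplify = simplify, USE.NAMES = FALSE)
}

index <- c(rep(1, 5), rep(2, 5))

# Atomic input: one sum per group.
sums <- tapply2(1:10, index, sum)

# List input, as in the original post.
pieces <- tapply2(as.list(1:10), index, unlist, simplify = FALSE)

print(sums)
print(pieces)
```

Because everything goes through split(), whatever split() accepts is accepted here too, which is the consistency argument made above.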
Still a mystery to me why the power of code sharing/reuse is so
often underestimated :/
H.

Hervé Pagès


>>>>> Hervé Pagès < [hidden email]>
>>>>> on Tue, 14 Feb 2017 17:10:05 -0800 writes:
> Hi, tapply() will work on any object 'X' that has a length
> and supports single-bracket subsetting. These objects are
> sometimes called "vector-like" objects. Atomic vectors,
> lists, S4 objects with a "length" and "[" method,
> etc... are examples of "vector-like" objects.
> So instead of saying
>   X: an atomic object, typically a vector.
> I think it would be more accurate if the man page was
> saying something like
>   X: a vector-like object that supports subsetting with
>      `[`, typically an atomic vector.
Thank you, Hervé!
Actually (someone else mentioned ?)
only length(X) and split(X, <group>) need to work,
and as split() itself is an S3 generic function, X can be even
more general... well depending on how exactly you understand
"vector-like".
So I would go with
  X: an R object for which a ‘split’ method exists. Typically
     vector-like, allowing subsetting with ‘[’.
Martin
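A small check of the "length(X) and split(X, <group>)" point, using only base R. The list is the example from the original post; the Date vector is an extra illustration (not from the thread) of a classed object that has its own split() method.

```r
index <- c(rep(1, 5), rep(2, 5))

# A list is not atomic, but it has a length and can be split.
l <- as.list(1:10)
by_group <- tapply(l, index, unlist)

# A Date vector is an S3 object; split() dispatches to split.Date.
d <- as.Date("2017-02-01") + 0:3
latest <- tapply(d, c("a", "a", "b", "b"), function(x) format(max(x)))

print(is.atomic(l))
print(by_group)
print(latest)
```

format() is applied inside FUN because tapply() simplifies scalar results to a plain vector, which would otherwise drop the Date class.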


On Mon, Feb 20, 2017 at 7:31 AM, Martin Maechler
< [hidden email]> wrote:
> Thank you, Hervé!
>
> Actually (someone else mentioned ?)
> only length(X) and split(X, <group>) need to work,
> and as split() itself is an S3 generic function, X can be even
> more general... well depending on how exactly you understand
> "vector-like".
>
> So I would go with
>
>   X: an R object for which a ‘split’ method exists. Typically
>      vector-like, allowing subsetting with ‘[’.
I think technically tapply() should be using NROW() to check that X and
INDEX are compatible. That would make it more compatible with split()
semantics.
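A sketch of what such a check might look like. index_matches() is a hypothetical helper written for illustration, not something in base R or in tapply() itself; it just compares each INDEX length with NROW(X).

```r
# Hypothetical helper (not in base R): does every grouping vector in INDEX
# have one entry per "row" of X? NROW() falls back to length() for vectors.
index_matches <- function(X, INDEX) {
  if (!is.list(INDEX)) INDEX <- list(INDEX)
  all(lengths(INDEX) == NROW(X))
}

ok_vector  <- index_matches(1:10, rep(1:2, each = 5))
ok_rows    <- index_matches(mtcars, mtcars$cyl)
ok_columns <- index_matches(mtcars, 1:11)

print(c(vector = ok_vector, rows = ok_rows, columns = ok_columns))
```

With this rule a per-row index on a data frame is accepted and the per-column index 1:11 is rejected, which matches how split() groups a data frame.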
Hadley

http://hadley.nz

