In the help page of ?tapply it says that the first argument (X) is "an
atomic object, typically a vector."

However, tapply seems to be able to handle list objects. For example:

###################

l <- as.list(1:10)
is.atomic(l) # FALSE
index <- c(rep(1,5),rep(2,5))
tapply(l,index,unlist)

> tapply(l,index,unlist)
$`1`
[1] 1 2 3 4 5

$`2`
[1] 6 7 8 9 10

###################

Hence, does this mean a list is an atomic object (which I thought it wasn't), or does the help for tapply need updating? (Or is there some third option I'm missing?)

Thanks.

----------------Contact Details:----------------
Contact me: [hidden email]
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Did you ever receive a reply to this?
Note that for your example:

> tapply(l,index,sum)
Error in FUN(X[[i]], ...) : invalid 'type' (list) of argument

A list is definitely not atomic (is.recursive(l)). So it looks like a "quirk" that FUN = unlist doesn't raise an error.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Feb 4, 2017 at 4:17 AM, Tal Galili <[hidden email]> wrote:
> In the help page of ?tapply it says that the first argument (X) is "an
> atomic object, typically a vector."
> [...]
In reply to this post by Tal Galili
Hi,
tapply() will work on any object 'X' that has a length and supports single-bracket subsetting. These objects are sometimes called "vector-like" objects. Atomic vectors, lists, S4 objects with a "length" and "[" method, etc... are examples of "vector-like" objects.

So instead of saying

  X: an atomic object, typically a vector.

I think it would be more accurate if the man page was saying something like

  X: a vector-like object that supports subsetting with `[`, typically
     an atomic vector.

H.

On 02/04/2017 04:17 AM, Tal Galili wrote:
> In the help page of ?tapply it says that the first argument (X) is "an
> atomic object, typically a vector."
> [...]

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: [hidden email]
Phone: (206) 667-5791
Fax: (206) 667-1319
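The description above can be sketched in a few lines. This is only an illustration for the case of a single grouping vector, not the actual source of tapply(): the object is split by the index and the function is then applied to each piece. The names pieces, by_hand and via_tapply are made up for the example.

```r
l <- as.list(1:10)                 # a list, so is.atomic(l) is FALSE
index <- c(rep(1, 5), rep(2, 5))

# Step 1: split() handles a plain list just as it handles an atomic vector.
pieces <- split(l, index)          # a list of two sub-lists

# Step 2: apply the function to each piece.
by_hand <- lapply(pieces, unlist)

# For comparison, the same computation through tapply().
via_tapply <- tapply(l, index, unlist)
```

Both routes give the integers 1 to 5 for group 1 and 6 to 10 for group 2. Nothing in either step requires X to be atomic.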
Hervé:
Kindly explain this, then:

> l <- as.list(1:10)
> is.atomic(l) # FALSE
[1] FALSE
> index <- c(rep(1,5),rep(2,5))
>
> tapply(l,index,unlist)
$`1`
[1] 1 2 3 4 5

$`2`
[1] 6 7 8 9 10

> ## But
>
> tapply(l,index, sum)
Error in FUN(X[[i]], ...) : invalid 'type' (list) of argument

Cheers,
Bert

On Tue, Feb 14, 2017 at 5:10 PM, Hervé Pagès <[hidden email]> wrote:
> tapply() will work on any object 'X' that has a length and supports
> single-bracket subsetting.
> [...]
The problem with Bert's second example is that sum doesn't work on a list.
The tapply worked correctly.

> unlist(l[1:5])
[1] 1 2 3 4 5

> sum(l[1:5])
Error in sum(l[1:5]) : invalid 'type' (list) of argument

On Tue, Feb 14, 2017 at 8:28 PM, Bert Gunter <[hidden email]> wrote:
> Hervé:
>
> Kindly explain this, then:
> [...]
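If a per-group sum over the list is what is wanted, one possible workaround is to pass a function that flattens each sub-list first. This is a sketch only; sum_of_list and group_sums are hypothetical names, and it assumes every element of the list is numeric.

```r
l <- as.list(1:10)
index <- c(rep(1, 5), rep(2, 5))

# Each group arrives in FUN as a sub-list, so flatten it before summing.
sum_of_list <- function(piece) sum(unlist(piece))

group_sums <- tapply(l, index, sum_of_list)
```

Here group_sums holds 15 for group 1 and 40 for group 2. The list X itself is not the obstacle; FUN just has to accept what it is given.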
Right. More precisely the function passed thru the FUN argument must
work on the subsets of X generated internally by tapply(). You can actually see these subsets by passing the identity function:

  X <- letters[1:10]
  INDEX <- c(rep(1,5),rep(2,5))
  tapply(X, INDEX, FUN=identity)
  # $`1`
  # [1] "a" "b" "c" "d" "e"
  #
  # $`2`
  # [1] "f" "g" "h" "i" "j"

Doing this shows you how tapply() splits the vector-like object X into a list of subsets. If you replace the identity function with a function that cannot be applied to these subsets, then you get an error:

  tapply(X, INDEX, FUN=sum)
  # Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument

As you can see, here we get an error even though X is an atomic vector.

H.

On 02/14/2017 05:41 PM, Richard M. Heiberger wrote:
> The problem with Bert's second example is that sum doesn't work on a list.
> The tapply worked correctly.
> [...]
Yes, exactly.
So my point is that this:

"X: a vector-like object that supports subsetting with `[`, typically an atomic vector."

is incorrect, or at least a bit opaque, without further emphasizing that FUN must accept the result of "[". With atomic vectors, the error that you produced was obvious, but with lists, I believe not so.

I appreciate the desire for brevity, but I think clarity should be the primary goal. Maybe it *is* just me, but I think a few extra words of explanation here would not go amiss.

But, anyway, thanks for the clarification.

Cheers,
Bert

On Tue, Feb 14, 2017 at 6:04 PM, Hervé Pagès <[hidden email]> wrote:
> Right. More precisely the function passed thru the FUN argument must
> work on the subsets of X generated internally by tapply().
> [...]
On 02/14/2017 06:39 PM, Bert Gunter wrote:
> Yes, exactly.
>
> So my point is that this:
>
> "X: a vector-like object that supports subsetting with `[`, typically
> an atomic vector."
>
> is incorrect, or at least a bit opaque, without further emphasizing
> that FUN must accept the result of "[".

Maybe this kind of detail belongs in the description of the FUN argument. However, please note that the man pages for lapply() and the other *apply() functions don't emphasize that the supplied FUN must accept the things it is applied to either, and nobody seems to make a big deal of it. Maybe because it's obvious?

> With atomic vectors, the error
> that you produced was obvious, but with lists, I believe not so.

Well, it's the same error. Maybe what's not obvious is that in both cases the error is coming from sum(), not from tapply() itself. sum() is complaining that it receives something that it doesn't know how to handle. The clue is in how the error message starts:

  Error in FUN(X[[i]], ...):

Maybe one could argue this is a little bit cryptic. Note the difference when the error is coming from tapply() itself:

  > X <- letters[1:9]
  > INDEX <- c(rep(1,5),rep(2,5))
  > tapply(X, INDEX, FUN=identity)
  Error in tapply(X, INDEX, FUN = identity) :
    arguments must have same length

H.
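The distinction between the two error sources can also be checked programmatically. The following is a minimal sketch that catches both errors and looks at the call recorded in each condition object; err_fun, err_tapply, call_fun and call_tapply are names invented for the illustration.

```r
index <- c(rep(1, 5), rep(2, 5))

# Error raised inside FUN: sum() cannot handle character subsets.
err_fun <- tryCatch(tapply(letters[1:10], index, sum), error = identity)

# Error raised by tapply() itself: X and INDEX differ in length.
err_tapply <- tryCatch(tapply(letters[1:9], index, identity), error = identity)

# The call stored in each condition is what is printed after "Error in".
call_fun    <- deparse(conditionCall(err_fun))[1]
call_tapply <- deparse(conditionCall(err_tapply))[1]
```

call_fun starts with FUN( while call_tapply starts with tapply(, which matches the two messages quoted above.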
In reply to this post by Hervé Pagès-2
It seems like this should be consistent with split(), since that's
what actually powers the behaviour.

Reading the description for split leads to this rather interesting example:

    tapply(mtcars, 1:11, I)

Hadley

On Tue, Feb 14, 2017 at 7:10 PM, Hervé Pagès <[hidden email]> wrote:
> Hi,
>
> tapply() will work on any object 'X' that has a length and supports
> single-bracket subsetting. These objects are sometimes called
> "vector-like" objects. Atomic vectors, lists, S4 objects with a "length"
> and "[" method, etc... are examples of "vector-like" objects.
>
> So instead of saying
>
>   X: an atomic object, typically a vector.
>
> I think it would be more accurate if the man page was saying something
> like
>
>   X: a vector-like object that supports subsetting with `[`, typically
>      an atomic vector.
>
> H.

--
http://hadley.nz
|
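A minimal sketch of why the mtcars example is surprising, assuming (as the message above suggests) that tapply() compares length(X) with the length of INDEX and then hands X to split(). It only uses base R and the built-in mtcars data set; the variable names are made up for illustration. A data frame has the length of its column list, yet split() groups it by rows:

```r
# mtcars ships with R (datasets package): 32 rows, 11 columns.
n_cols <- length(mtcars)  # a data frame is a list of columns
n_rows <- nrow(mtcars)

# split() on a data frame dispatches to split.data.frame(),
# which groups rows, not columns.
by_cyl <- split(mtcars, mtcars$cyl)

print(c(columns = n_cols, rows = n_rows))
print(vapply(by_cyl, nrow, integer(1)))
```

So an INDEX of length 11 matches length(mtcars) while the pieces that come back from split() are groups of rows, which is presumably what makes the example "interesting".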
You could also call this "interesting example" a bug.

Clearly not enough code reuse in the implementation of tapply(). Instead
of the current 25 lines of code, it could be a simple wrapper around
split() and sapply(), e.g. something like:

    tapply2 <- function(X, INDEX, FUN=NULL, ..., simplify=TRUE)
    {
        f <- make_factor_from_INDEX(INDEX)  # same as tapply(INDEX=INDEX, FUN=NULL)
        sapply(split(X, f), FUN, ..., simplify=simplify, USE.NAMES=FALSE)
    }

and then be guaranteed to behave consistently with split() and sapply().

Also, the make_factor_from_INDEX() step could maybe be shared with what
aggregate.data.frame() does internally with its 'by' argument.

Still a mystery to me why the power of code sharing/reuse is so often
underestimated :-/

H.

On 02/15/2017 11:32 AM, Hadley Wickham wrote:
> It seems like this should be consistent with split(), since that's
> what actually powers the behaviour.
>
> Reading the description for split leads to this rather interesting example:
>
>     tapply(mtcars, 1:11, I)
>
> Hadley

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
|
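For readers who want to try the wrapper idea, here is one possible runnable version. It is only a sketch: make_factor_from_INDEX() does not exist in base R, and the implementation below is an assumption made for illustration (it is not taken from the tapply() source). Unlike the proposal above, FUN is required here, since sapply() cannot be called with FUN=NULL.

```r
# Hypothetical helper: NOT part of base R, and not necessarily what
# tapply() does internally. It turns INDEX (one grouping vector or a
# list of them) into a single factor.
make_factor_from_INDEX <- function(INDEX) {
  if (!is.list(INDEX)) INDEX <- list(INDEX)
  INDEX <- lapply(INDEX, as.factor)
  if (length(INDEX) == 1L) INDEX[[1L]] else interaction(INDEX)
}

# Wrapper along the lines proposed above; FUN is required in this sketch.
tapply2 <- function(X, INDEX, FUN, ..., simplify = TRUE) {
  f <- make_factor_from_INDEX(INDEX)
  sapply(split(X, f), FUN, ..., simplify = simplify, USE.NAMES = FALSE)
}

l <- as.list(1:10)
index <- c(rep(1, 5), rep(2, 5))

by_group <- tapply2(l, index, unlist, simplify = FALSE)  # list X
sums <- tapply2(1:10, index, sum)                        # atomic X

print(by_group)
print(sums)
```

One caveat with this design: sapply() simplifies more eagerly than tapply() (equal-length results become a matrix), so with simplify=TRUE the shape of the result can differ from what tapply() returns.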
In reply to this post by Hervé Pagès-2
>>>>> Hervé Pagès <[hidden email]>
>>>>>     on Tue, 14 Feb 2017 17:10:05 -0800 writes:

    > Hi, tapply() will work on any object 'X' that has a length
    > and supports single-bracket subsetting. These objects are
    > sometimes called "vector-like" objects. Atomic vectors,
    > lists, S4 objects with a "length" and "[" method,
    > etc... are examples of "vector-like" objects.

    > So instead of saying

    > X: an atomic object, typically a vector.

    > I think it would be more accurate if the man page was
    > saying something like

    > X: a vector-like object that supports subsetting with
    > `[`, typically an atomic vector.

Thank you, Hervé!

Actually (as someone else mentioned?), only length(X) and
split(X, <group>) need to work, and since split() itself is an S3 generic
function, X can be even more general... well, depending on how exactly
you understand "vector-like".

So I would go with

   X: an R object for which a ‘split’ method exists. Typically
      vector-like, allowing subsetting with ‘[’.

Martin
|
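As a small illustration of the point that split() is an S3 generic, the sketch below uses a Date vector, for which base R provides a split method. The variable names are made up for this example, and the snippet only shows that tapply() goes through for one classed object; it is not a statement about every class.

```r
d <- as.Date("2017-02-01") + 0:5
g <- rep(c("a", "b"), each = 3)

# Base R registers an S3 split method for class "Date".
has_date_method <-
  !is.null(utils::getS3method("split", "Date", optional = TRUE))

# Each piece returned by split() keeps its "Date" class ...
pieces <- split(d, g)

# ... so FUN sees Date objects when tapply() is given a Date vector.
latest <- tapply(d, g, function(x) format(max(x)))

print(has_date_method)
print(latest)
```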
On Mon, Feb 20, 2017 at 7:31 AM, Martin Maechler
<[hidden email]> wrote:
> Actually (someone else mentioned ?)
> only length(X) and split(X, <group>) need to work,
> and as split() itself is an S3 generic function, X can be even
> more general... well depending on how exactly you understand
> "vector-like".
>
> So I would go with
>
>    X: an R object for which a ‘split’ method exists. Typically
>       vector-like, allowing subsetting with ‘[’.

I think technically tapply() should be using NROW() to check that X and
INDEX are compatible. That would make it more compatible with split()
semantics.

Hadley

--
http://hadley.nz
|
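A rough sketch of what an NROW()-based compatibility check could look like. NROW() is a base R function; compatible_with_index() is a name invented for this illustration and is not part of tapply() or base R.

```r
# Hypothetical helper (not in base R): compare X and INDEX via NROW(),
# which gives the number of rows for a data frame and the length for a
# plain vector or list.
compatible_with_index <- function(X, INDEX) {
  if (!is.list(INDEX)) INDEX <- list(INDEX)
  all(vapply(INDEX, NROW, integer(1)) == NROW(X))
}

same_length <- length(mtcars) == length(1:11)               # length-based view
rows_vs_11  <- compatible_with_index(mtcars, 1:11)          # row-based view
rows_vs_cyl <- compatible_with_index(mtcars, mtcars$cyl)
list_ok     <- compatible_with_index(as.list(1:10), rep(1:2, each = 5))

print(c(same_length = same_length, rows_vs_11 = rows_vs_11,
        rows_vs_cyl = rows_vs_cyl, list_ok = list_ok))
```

The sketch only considers vectors, lists and data frames. For a matrix, NROW() counts rows while split() (through its default method) works element by element, so that case would need separate thought.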