# Aggregate behaviour inconsistent (?) when FUN=table

4 messages

## Aggregate behaviour inconsistent (?) when FUN=table

Dear R users,

When I use aggregate with table as FUN, I get what I would call strange behaviour if it involves numerical vectors and one "level" of a vector is not present for every level of the "by" variable:

    > df <- data.frame(A=c(1,1,1,1,0,0,0,0),B=c(1,0,1,0,0,0,1,0),C=c(1,0,1,0,0,1,1,1))
    > aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
      Group.1 A.0 A.1    B
    1       0   1   2    3
    2       1   3   2 2, 3
    > table(df$C,df$B)

        0 1
      0 3 0
      1 2 3

As you can see, a comma appears in the B column of the aggregate output, whereas when I call table directly I obtain the same result as if B were defined as a factor (I suppose this comes from the fact that "non-factor arguments a are coerced via factor", according to the Details section of the table help). I find this completely normal if I remember that aggregate first splits the data into subsets and then computes the table for each subset. But then I don't understand why it works differently with character vectors. Indeed, if I use character vectors, I get the same result as with factors:

    > df <- data.frame(A=factor(c("1","1","1","1","0","0","0","0")),B=factor(c("1","0","1","0","0","0","1","0")),C=factor(c("1","0","1","0","0","1","1","1")))
    > aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
      Group.1 A.0 A.1 B.0 B.1
    1       0   1   2   3   0
    2       1   3   2   2   3
    > df <- data.frame(A=factor(c(1,1,1,1,0,0,0,0)),B=factor(c(1,0,1,0,0,0,1,0)),C=factor(c(1,0,1,0,0,1,1,1)))
    > aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
      Group.1 A.0 A.1 B.0 B.1
    1       0   1   2   3   0
    2       1   3   2   2   3

Would it be possible to clarify this behaviour in the aggregate help, since the result does not quite match what one would expect from the table help? Or would it be possible to get the same results regardless of the vector type?
This post was rejected on the R-devel mailing list, so I am asking my question here as suggested.

Best regards,
Alain Guillet

-- 
Alain Guillet, Statistician and Computer Scientist
SMCS - IMMAQ - Université catholique de Louvain, http://www.uclouvain.be/smcs
Office c.316, Voie du Roman Pays, 20 (bte L1.04.01), B-1348 Louvain-la-Neuve, Belgium
Tel: +32 10 47 30 50
Access: http://www.uclouvain.be/323631.html

______________________________________________

[hidden email] mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
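
A minimal sketch of what seems to produce the comma, using the data frame from the post. Because table is applied to each subset separately, the B table has one cell for C == 0 and two cells for C == 1, which appears to be why aggregate keeps B as a list column instead of simplifying it to a matrix. The helper `count_levels` and its fixed `levels = c(0, 1)` are assumptions made for illustration and are not part of the original post.

```r
# Data from the original post: numeric 0/1 columns.
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# table() sees each subset on its own, so a value that is absent from a
# subset gets no cell: the per-group tables for B differ in length.
per_group <- lapply(split(df$B, df$C), table)
lengths(per_group)   # 1 for C == 0, 2 for C == 1

# Hypothetical helper: fix the levels before tabulating, so that every
# group yields one count per level, zeros included.
count_levels <- function(x, levels = c(0, 1)) {
  table(factor(x, levels = levels))
}

res <- aggregate(df[1:2], list(C = df$C), count_levels)
res
```

With the levels fixed up front, every group returns a table of the same length, so the numeric columns should behave like the factor columns in the second and third examples.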

## Re: Aggregate behaviour inconsistent (?) when FUN=table

The normal input to a factory that builds cars is car parts. Feeding whole trucks into such a factory is likely to yield odd-looking results.

Both aggregate and table do similar kinds of things, but yield differently constructed outputs. The output of the table function is not well suited to be used as the aggregated value that aggregate compiles into a data frame, so having aggregate call table will yield surprises.

I am having some difficulty deciphering what you are trying to accomplish with all this, so I will guess that you are trying to reproduce the information output by

    table( df$C, df$B )

so

    aggregate( df$A, df[ , c( "C", "B" ) ], length )

but if that isn't what you want, then perhaps you can clarify what result you want to see and we can help you get there.

-- 
Sent from my phone. Please excuse my brevity.

On February 6, 2018 12:20:03 AM PST, Alain Guillet <[hidden email]> wrote:

> Dear R users,
>
> When I use aggregate with table as FUN, I get what I would call a strange behaviour if it involves numerical vectors and one "level" of it is not present for every "levels" of the "by" variable: [...]
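
A short sketch comparing the two calls suggested in the reply, using the data frame from the original post. The column name `n` and the object names `counts` and `tab` are chosen here for illustration only.

```r
# Data from the original post.
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# Count the rows for each (C, B) combination that occurs in the data.
# FUN returns a single number per group, so the result is a plain
# data frame with one row per observed combination.
counts <- aggregate(df$A, df[, c("C", "B")], length)
names(counts)[3] <- "n"
counts

# The same counts taken from table() and reshaped to long format.
# This version also lists the empty combination C = 0, B = 1.
tab <- as.data.frame(table(C = df$C, B = df$B), responseName = "n")
tab
```

The aggregate call returns only the combinations that are present, while the table-based version keeps the zero cell, so which one fits depends on whether empty combinations matter for the analysis.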