by Function Result Factor Levels

Dario Strbenac-2
Good day,

How can I get a data.frame of the factor levels used to obtain each element of the result of the by function? For example,

> result <- by(warpbreaks[, 1], warpbreaks[, -1], summary)
> result
wool: A
tension: L
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  25.00   26.00   51.00   44.56   54.00   70.00
 ...

I'd like to obtain a data.frame with two columns, wool and tension, giving the level of each factor that corresponds to each element of result.

--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: by Function Result Factor Levels

William Dunlap
Do you want something like the following?

> library(dplyr, quietly=TRUE, warn.conflicts=FALSE)
> warpbreaks %>% group_by(wool, tension) %>% summarize(Min=min(breaks), Median=median(breaks), Max=max(breaks))
Source: local data frame [6 x 5]
Groups: wool

  wool tension Min Median Max
1    A       L  25     51  70
2    A       M  12     21  36
3    A       H  10     24  43
4    B       L  14     29  44
5    B       M  16     28  42
6    B       H  13     17  28
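
If the by object itself needs to be kept, the level combinations can probably also be read off its dimnames in base R. The following is only a sketch: the name levels_df is made up for illustration, and it assumes the by result carries named dimnames (wool and tension), as the printed output in the question suggests.

```r
# Sketch: recover the factor-level combination behind each element of a by() result.
result <- by(warpbreaks[, 1], warpbreaks[, -1], summary)

# dimnames(result) is a named list with one entry per grouping factor.
# expand.grid() varies the first factor fastest, which matches the order
# in which the elements of the by array are stored.
levels_df <- expand.grid(dimnames(result))  # levels_df is a hypothetical name

# Row i of levels_df describes element result[[i]].
levels_df
```

Under that assumption, row i of levels_df gives the wool and tension levels for result[[i]].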

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Sep 15, 2015 at 7:00 PM, Dario Strbenac
<[hidden email]> wrote:

> How is it possible to get a data.frame of factor levels used for obtaining each element of the result of the by function? [...]

Re: by Function Result Factor Levels

Dario Strbenac-2
Good day,

Yes, exactly. I found that aggregate is another alternative that doesn't require a package dependency, although the column names are less convenient: each summary column is prefixed with x.

> aggregate(warpbreaks[, 1], warpbreaks[, 2:3], function(breaks) c(Min = min(breaks), Med = median(breaks), Max = max(breaks)))
  wool tension x.Min x.Med x.Max
1    A       L    25    51    70
2    B       L    14    29    44
3    A       M    12    21    36
4    B       M    16    28    42
5    A       H    10    24    43
6    B       H    13    17    28

--------------------------------------
Dario Strbenac
PhD Student
University of Sydney
Camperdown NSW 2050
Australia

Re: by Function Result Factor Levels

David Carlson
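Actually, x is the name of a single column: because your function returns a vector of three values, aggregate stores them as one matrix column named x, which prints as x.Min, x.Med and x.Max: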

> tbl <- aggregate(warpbreaks[, 1], warpbreaks[, 2:3], function(breaks) c(Min = min(breaks),
+  Med = median(breaks), Max = max(breaks)))
> str(tbl)
'data.frame':   6 obs. of  3 variables:
 $ wool   : Factor w/ 2 levels "A","B": 1 2 1 2 1 2
 $ tension: Factor w/ 3 levels "L","M","H": 1 1 2 2 3 3
 $ x      : num [1:6, 1:3] 25 14 12 16 10 13 51 29 21 28 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr  "Min" "Med" "Max"

You have two options. One is to convert the matrix to three separate columns:

> tbl2 <- data.frame(tbl[, 1:2], tbl$x)
> str(tbl2)
'data.frame':   6 obs. of  5 variables:
 $ wool   : Factor w/ 2 levels "A","B": 1 2 1 2 1 2
 $ tension: Factor w/ 3 levels "L","M","H": 1 1 2 2 3 3
 $ Min    : num  25 14 12 16 10 13
 $ Med    : num  51 29 21 28 24 17
 $ Max    : num  70 44 36 42 43 28
> tbl2
  wool tension Min Med Max
1    A       L  25  51  70
2    B       L  14  29  44
3    A       M  12  21  36
4    B       M  16  28  42
5    A       H  10  24  43
6    B       H  13  17  28
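
Note that aggregate lists the groups with wool varying fastest, whereas the dplyr output earlier in the thread is sorted by wool and then tension. If that row order matters, one possible sketch is to reorder the flattened table; the name tbl2_sorted is an assumption introduced here for illustration.

```r
# Rebuild the flattened table from this message.
tbl <- aggregate(warpbreaks[, 1], warpbreaks[, 2:3],
                 function(breaks) c(Min = min(breaks), Med = median(breaks), Max = max(breaks)))
tbl2 <- data.frame(tbl[, 1:2], tbl$x)

# order() on factors sorts by level position, so tension stays L, M, H
# rather than being sorted alphabetically.
tbl2_sorted <- tbl2[order(tbl2$wool, tbl2$tension), ]  # hypothetical name
rownames(tbl2_sorted) <- NULL                          # renumber rows 1..6
tbl2_sorted
```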

The other is to change the name of x to something more informative:

> names(tbl)[3] <- "breaks"
> str(tbl)
'data.frame':   6 obs. of  3 variables:
 $ wool   : Factor w/ 2 levels "A","B": 1 2 1 2 1 2
 $ tension: Factor w/ 3 levels "L","M","H": 1 1 2 2 3 3
 $ breaks : num [1:6, 1:3] 25 14 12 16 10 13 51 29 21 28 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr  "Min" "Med" "Max"
> tbl
  wool tension breaks.Min breaks.Med breaks.Max
1    A       L         25         51         70
2    B       L         14         29         44
3    A       M         12         21         36
4    B       M         16         28         42
5    A       H         10         24         43
6    B       H         13         17         28
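
The renaming step can probably be skipped by using the formula interface of aggregate, which names the result column after the response variable. This is a sketch rather than part of the original exchange; tbl_f is a made-up name.

```r
# Formula interface: the left-hand side (breaks) names the summary column,
# so no x prefix appears and no renaming is needed afterwards.
tbl_f <- aggregate(breaks ~ wool + tension, data = warpbreaks,
                   FUN = function(b) c(Min = min(b), Med = median(b), Max = max(b)))

# breaks is still a single matrix column, as in the str() output above.
names(tbl_f)
is.matrix(tbl_f$breaks)
```

As with the renamed tbl, the matrix column can be split into separate columns with data.frame(tbl_f[, 1:2], tbl_f$breaks) if plain Min, Med and Max columns are preferred.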

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:[hidden email]] On Behalf Of Dario Strbenac
Sent: Wednesday, September 16, 2015 1:00 AM
To: William Dunlap
Cc: [hidden email]
Subject: Re: [R] by Function Result Factor Levels

Yes, exactly. I found that aggregate is another alternative which doesn't require a package dependency [...]