Looking for a sort of tapply() to data frames

Looking for a sort of tapply() to data frames

January Weiner-2
Hi,

I read about the by() function, but it does not seem to do the job I
need. Here is the problem:

Say - I have a data frame, with three columns.  The first one contains
strings that describe the data points, with repeats (for example, days
of a week).  The other two contain numbers. Something like this:

Day val1 val2
Tue 1    2
Tue 2    8
Tue 3    5
Wed 1    2
Wed 1    8
etc.

Now I would like to have a data frame with averages for each day:

Day val1 val2
Tue 2    5
Wed 1    5
etc.
I know I can do tapply(DF$val2, DF$Day, mean) to get the means for
val2. But I would like to have a data frame as the result (as in reality I
have many more columns).

Further question: where can I find a good, advanced introduction to R
data types? R's help() function just kills my brain, and the tutorials
are very limited.

My kind regards,

January Weiner

--
------------ January Weiner 3  ---------------------+---------------
Division of Bioinformatics, University of Muenster  |  Schloßplatz 4
(+49)(251)8321634                                   |  D48149 Münster
http://www.uni-muenster.de/Biologie.Botanik/ebb/    |  Germany

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Looking for a sort of tapply() to data frames

Thomas Lumley
On Wed, 14 Dec 2005, January Weiner wrote:

> Hi,
>
> I read about the by() function, but it does not seem to do the job I
> need. Here is the problem:

by() will work, you just need to use the right function in it.

You want

by(df[,-1], df$Day, function.that.means.each.column)

so all you need to do is write  function.that.means.each.column()
In this case there is a built-in function, colMeans, so you don't even
have to write it.

More generally (eg the approach would work for medians as well)

by(df[,1], df$Day, function(today) apply(today, 2, mean))

Finally, you could just use aggregate().

  -thomas

Thomas Lumley Assoc. Professor, Biostatistics
[hidden email] University of Washington, Seattle

Re: Looking for a sort of tapply() to data frames

January Weiner-2
Hello again,

On 12/14/05, Thomas Lumley <[hidden email]> wrote:
> You want
>
> by(df[,-1], df$Day, function.that.means.each.column)

OK, slowly :-) I don't understand it.

- why df[,-1] and not df? don't we lose the df$Day entries?

(by the way, why does typeof(df) show "list"? I thought that
read.table() returns a data frame?)

> so all you need to do is write  function.that.means.each.column()
> In this case there is a built-in function, colMeans, so you don't even
> have to write it.

Hmmmmm, I tried it and it did not work. That is, it works - but not as
intended :-).

Fake example:

> df <- data.frame(Day=c("Tue","Tue","Tue", "Wed", "Wed"), val1=seq(1,5), val2=3*seq(1,5))
> df
  Day val1 val2
1 Tue    1    3
2 Tue    2    6
3 Tue    3    9
4 Wed    4   12
5 Wed    5   15
> ddf <- by(df[,-1], df$Day, colMeans)
> ddf
df$Day: Tue
val1 val2
   2    6
------------------------------------------------------------
df$Day: Wed
val1 val2
 4.5 13.5
> ddf$Day
NULL
> ddf$val1
NULL

In real data, instead of "days", I have around 6000 items, so I need
them to be in one column called "Days" (or whatever).  OK. So correct
me if I understand wrongly what is happening here:

by() divides df in data frame subsets and applies a function
(colMeans) to each of them.  The result of colMeans ... manual says
that colMeans returns the following:

     A numeric or complex array of suitable size, or a vector if the
     result is one-dimensional.  The 'dimnames' (or 'names' for a
     vector result) are taken from the original array.

...which doesn't tell me much.  typeof(colMeans(...)) tells me
"double" but I think it lies. OK, let's assume it is a vector (should
be, I assume the result is one-dimensional, as I can hardly imagine a
multidimensional result).

So in the end I have a list with as many columns as I have days, and
in each column I have a vector with N named elements, where N is the
number of variables in the original data frame bar one.  But what I
would like to have is a data frame with exactly the same column names,
and rows being just a summary.  And I have no clue how to convert one
into the other :-)
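
For readers following along, here is a small sketch of what the two objects discussed above look like when inspected. It reuses the fake `df` from this message; the variable names `cm` and `ddf` are only for illustration. As far as I can tell, typeof() is not lying: it reports how the values are stored, while the names are carried as an attribute on an ordinary numeric vector.

```r
# Fake data from the message above.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

# colMeans() on the numeric columns: a plain named numeric vector.
cm <- colMeans(df[, -1])
typeof(cm)      # storage type of the values
is.vector(cm)   # a vector whose only attribute is its names
names(cm)       # the original column names

# by() returns an object of class "by": one result per group,
# labelled by the grouping values rather than by column names.
ddf <- by(df[, -1], df$Day, colMeans)
class(ddf)
dimnames(ddf)[[1]]   # the group labels
ddf[["Tue"]]         # the named vector of column means for one group
```

This is why ddf$Day and ddf$val1 come back as NULL: the by object is indexed by group label, and the column names live one level down, inside each group's result.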

> More generally (eg the approach would work for medians as well)
>
> by(df[,1], df$Day, function(today) apply(today, 2, mean))

Huh? why is it df[,1] now? I think I'm completely lost.

> Finally, you could just use aggregate().

Probably, yes.  As soon as I figure out how to use it, that is :-) (an
hour later: OK, I got it! yippee!)  However what I really needed was
something like this:

ddf <- by(df[,-1], df$Day, function(z) { return(cor(z$val1,z$val2)) ; } )

(but I still don't know how to convert it to a friendly data frame...)

Thanks for the answers!

January


Re: Looking for a sort of tapply() to data frames

Gabor Grothendieck
On 12/15/05, January Weiner <[hidden email]> wrote:

> Hello again,
>
> On 12/14/05, Thomas Lumley <[hidden email]> wrote:
> > You want
> >
> > by(df[,-1], df$Day, function.that.means.each.column)
>
> OK, slowly :-) I don't understand it.
>
> - why df[,-1] and not df? don't we lose the df$Day entries?

You don't get them as a column but you get them as the
component labels.

   by(df, df$Day, function(x) colMeans(x[,-1]))

If you convert it to a data frame you get them as the rownames:

  do.call("rbind", by(df, df$Day, function(x) colMeans(x[,-1])))
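
A short sketch of the step after that, assuming the fake `df` from earlier in the thread: the rbind result is a matrix whose row names are the days, and one way (not the only one) to get an ordinary Day column back is to build a data frame from those row names. The names `m` and `res` are made up for this example.

```r
# Fake data from earlier in the thread.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

# One row of column means per day; the days end up as row names.
m <- do.call("rbind", by(df, df$Day, function(x) colMeans(x[, -1])))

# Turn the row names into a regular column so that res$Day works.
res <- data.frame(Day = rownames(m), m, row.names = NULL)
print(res)
```

After this, res$Day, res$val1 and res$val2 behave like the columns of the original data frame.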

>
> (by the way, why does typeof(df) show "list"? I thought that
> read.table() returns a data frame?)

I think you want class(df), which shows it's a data frame.
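
A quick sketch of the difference, using a tiny made-up data frame: typeof() reports the underlying storage, and a data frame is stored as a list of columns, while class() reports the attribute that makes R treat it as a data frame.

```r
# Tiny illustrative data frame (values are arbitrary).
df <- data.frame(Day = c("Tue", "Wed"), val1 = c(1, 2))

typeof(df)         # "list": how the object is stored internally
class(df)          # "data.frame": how R dispatches on it
is.list(df)        # TRUE, a data frame is a list of equal-length columns
is.data.frame(df)  # TRUE
```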

> > More generally (eg the approach would work for medians as well)
> >
> > by(df[,1], df$Day, function(today) apply(today, 2, mean))
>
> Huh? why is it df[,1] now? I think I'm completely lost.

  df[,1] and df$Day both refer to the same first column.
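
One reading of the question, offered as an assumption rather than something stated in the thread: since df[,1] is just the Day column and df[,-1] is everything except it, the df[,1] in the quoted line looks like a typo for df[,-1]. A sketch with the fake `df`, using median to show the more general apply() form (the name `meds` is made up here):

```r
# Fake data from earlier in the thread.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

identical(df[, 1], df$Day)   # the first column is the Day column
names(df[, -1])              # everything except the first column

# Assumed intended form: pass the numeric columns, group by Day,
# and apply a summary function to each column of each group.
meds <- by(df[, -1], df$Day, function(today) apply(today, 2, median))
meds[["Tue"]]
meds[["Wed"]]
```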

>
> > Finally, you could just use aggregate().
>
> Probably, yes.  As soon as I figure out how to use it, that is :-) (an

   aggregate(df[,-1], df[,1,drop = FALSE], mean)

or

   aggregate(df[,-1], list(Day = df$Day), mean)

The second arg of aggregate must be a list which is why we used
drop = FALSE in the first instance and an explicit list in the second.
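
As a sketch of what the two calls return, again with the fake `df` from the thread (the names `a1` and `a2` are made up for the example): both give a data frame with one row per day and a Day column, which is the shape asked for in the original question.

```r
# Fake data from earlier in the thread.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

# Grouping given as a one-column data frame (which is a list).
a1 <- aggregate(df[, -1], df[, 1, drop = FALSE], mean)

# Grouping given as an explicit named list.
a2 <- aggregate(df[, -1], list(Day = df$Day), mean)

print(a1)
a2$val1   # columns are accessible by name
```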

Another alternative is to use summaryBy from the doBy package found
at http://genetics.agrsci.dk/~sorenh/misc/ :

   library(doBy)
   summaryBy(cbind(val1, val2) ~ Day, data = df)


> hour later: OK, I got it! yippee!)  However what I really needed was
> something like this:
>
> ddf <- by(df[,-1], df$Day, function(z) { return(cor(z$val1,z$val2)) ; } )
>
> (but I still don't know how to convert it to a friendly data frame...)
>

   do.call("rbind", ddf)


Re: Looking for a sort of tapply() to data frames

January Weiner-2
Hi,

On 12/15/05, Gabor Grothendieck <[hidden email]> wrote:
> You don't get them as a column but you get them as the
> component labels.
>
>    by(df, df$Day, function(x) colMeans(x[,-1]))
>
> If you convert it to a data frame you get them as the rownames:
>
>   do.call("rbind", by(df, df$Day, function(x) colMeans(x[,-1])))

Thanks! that helps a lot.  But I still run into problems with this.
Sorry for bothering you with newbie questions, if my problems are
trivial, point me to a suitable guide (I did read the introductory
materials on R).

First: it works for colMeans, but it does not work for a function like this:

do.call("rbind", by(df, df$Day, function(x) cor(df$val1, df$val2)))

it says "Error in do.call(....) : second argument must be a list". I
do not understand this, as the second argument is "b" of the class
"by", as it was in the case of colMeans, so it did not change...?

Second: in case of colMeans (where it works) it returns a matrix, and
I have trouble getting it back to a data.frame, so I can access
blah$Day.  Instead, I have something like this:

> do.call("rbind",b)
    V2 V3 V4 V5       V7
Tue 19 15  2  0 1.538462
Wed  5  3  6  1 1.285714

...and I do not know how to access, for example, values for "Tue",
except with [1,] -- which is somewhat problematic.  For example, I
would like to display the 3 days for which V7 is highest.  How can I
do that?

> I think you want class(df) which shows its a data frame.

Oops. Sorry, I didn't guess it from the manual :-)

>    aggregate(df[,-1], df[,1,drop = FALSE], mean)

But why is df[,1,drop=FALSE] a list?  I don't get it...

>    aggregate(df[,-1], list(Day = df$Day), mean)

Yeah, I figured out that one.

> Another alternative is to use summaryBy from the doBy package found
> at http://genetics.agrsci.dk/~sorenh/misc/ :
>
>    library(doBy)
>    summaryBy(cbind(val1, val2) ~ Day, data = df)

I think I am not confident enough with the basic data types in R, I
need to understand them before I go over to specialized packages :-)

Again, thanks a lot,
January


Re: Looking for a sort of tapply() to data frames

Gabor Grothendieck
On 12/16/05, January Weiner <[hidden email]> wrote:

> Hi,
>
> On 12/15/05, Gabor Grothendieck <[hidden email]> wrote:
> > You don't get them as a column but you get them as the
> > component labels.
> >
> >    by(df, df$Day, function(x) colMeans(x[,-1]))
> >
> > If you convert it to a data frame you get them as the rownames:
> >
> >   do.call("rbind", by(df, df$Day, function(x) colMeans(x[,-1])))
>
> Thanks! that helps a lot.  But I still run into problems with this.
> Sorry for bothering you with newbie questions, if my problems are
> trivial, point me to a suitable guide (I did read the introductory
> materials on R).
>
> First: it works for colMeans, but it does not work for a function like this:
>
> do.call("rbind", by(df, df$Day, function(x) cor(df$val1, df$val2)))

There are a number of problems:

1. the function does not depend on x and therefore will return the
same result for each day group.

2. although ?by says it returns a list, it apparently simplifies the result,
contrary to the documentation, in certain cases.  Try this:

do.call("rbind", as.list(by(df, df$Day, function(x) cor(x$val1, x$val2))))

or this:

do.call("rbind", by(df, df$Day, function(x) list(cor = cor(x$val1, x$val2))))


3. In your sample data val1 is constant for Wed so you won't be able
to get a correlation.  That's the source of the warning that you get
when running the line in #2.
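
A sketch of points 1 and 2 put together, assuming the fake `df` from earlier in the thread (where, unlike the original sample, val1 varies within each day, so both correlations are defined). Because each group yields a single number, by() hands back a simplified array rather than a list; instead of going through do.call(), this version builds the data frame directly from the group labels and values. The names `cors` and `res` are made up for the example.

```r
# Fake data from earlier in the thread.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

# Use x (the per-day subset), not df, inside the function.
cors <- by(df, df$Day, function(x) cor(x$val1, x$val2))

is.list(cors)   # FALSE: scalar results were simplified to an array

# Build a data frame from the group labels and the values.
res <- data.frame(Day = dimnames(cors)[[1]],
                  cor = as.numeric(cors))
print(res)
```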

>
> it says "Error in do.call(....) : second argument must be a list". I
> do not understand this, as the second argument is "b" of the class
> "by", as it was in the case of colMeans, so it did not change...?
>
> Second: in case of colMeans (where it works) it returns a matrix, and
> I have trouble getting it back to a data.frame, so I can access
> blah$Day.  Instead, I have something like this:

Try blah[,"Day"] which works with both matrices and data frames.

>
> > do.call("rbind",b)
>    V2 V3 V4 V5       V7
> Tue 19 15  2  0 1.538462
> Wed  5  3  6  1 1.285714


Another possibility is to coerce it to a data frame:

as.data.frame(do.call("rbind", b))

or change your function to return a list.
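
A sketch of the coercion, which also touches the still-open question about the days with the highest V7. The matrix below is rebuilt by hand from the two rows quoted above, so `b_mat`, `blah` and `top` are illustrative names only. Note that in this result the day labels are row names, not a column called Day, so as far as I can tell a Day column has to be added explicitly before blah$Day or blah[,"Day"] will find anything.

```r
# Stand-in for do.call("rbind", b), rebuilt from the two quoted rows.
b_mat <- rbind(Tue = c(V2 = 19, V3 = 15, V4 = 2, V5 = 0, V7 = 1.538462),
               Wed = c(V2 = 5,  V3 = 3,  V4 = 6, V5 = 1, V7 = 1.285714))

blah <- as.data.frame(b_mat)
blah$Day <- rownames(blah)   # make the day labels a real column

blah["Tue", ]                # select a row by its label instead of [1, ]

# The (up to) 3 days with the highest V7: sort descending, keep the top.
top <- head(blah[order(blah$V7, decreasing = TRUE), ], 3)
top$Day
```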

>
> ...and I do not know how to access, for example, values for "Tue",
> except with [1,] -- which is somewhat problematic.  For example, I
> would like to display the 3 days for which V7 is highest.  How can I
> do that?
>
> > I think you want class(df) which shows its a data frame.
>
> Oops. Sorry, I didn't guess it from the manual :-)
>
> >    aggregate(df[,-1], df[,1,drop = FALSE], mean)
>
> But why is df[,1,drop=FALSE] a list?  I don't get it...

Because df[,1,drop = FALSE] is a one column data frame and data frames
are lists.  Had we not specified drop = FALSE, the result would have been
simplified automatically to a plain vector (a non-list), since only one
column was selected.  We do not want that simplification here.
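
A small sketch of the difference, using the fake `df` from earlier in the thread (the name `one_col` is made up for the example):

```r
# Fake data from earlier in the thread.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

one_col <- df[, 1, drop = FALSE]
is.data.frame(one_col)   # TRUE: still a data frame with one column
is.list(one_col)         # TRUE: so it is acceptable as a list of groupings
names(one_col)           # "Day", which becomes the grouping column name

is.list(df[, 1])         # FALSE: the default drops to a plain vector
```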


Re: Looking for a sort of tapply() to data frames

Frank Harrell
You might want to look at the summarize function in the Hmisc package.

Frank

--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


Re: Looking for a sort of tapply() to data frames

Gabor Grothendieck
In reply to this post by Gabor Grothendieck
One other point.  The cor example could be done using tapply like
this:

tapply(rownames(df), df$Day, function(r) cor(df[r,"val1"], df[r, "val2"]))
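
A sketch of that tapply() call run on the fake `df` from earlier in the thread, with one extra step (my addition, not part of the message) that puts the result into a data frame. The names `cors` and `res` are made up for the example.

```r
# Fake data from earlier in the thread.
df <- data.frame(Day  = c("Tue", "Tue", "Tue", "Wed", "Wed"),
                 val1 = seq(1, 5),
                 val2 = 3 * seq(1, 5))

# Group the row names by day, then look the rows up inside the function.
cors <- tapply(rownames(df), df$Day,
               function(r) cor(df[r, "val1"], df[r, "val2"]))

# One value per day, labelled by day; collect into a data frame.
res <- data.frame(Day = names(cors), cor = as.vector(cors))
print(res)
```

The trick here is that tapply() only splits a single vector, so the row names are split and used as an index back into the full data frame.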

