Hmisc + summarize + quantile: Why only quantiles for first variable in data frame?

Kim Milferstedt
Hi,

I'm working on a data set that contains a couple of factors and a
number of dependent variables. For each of these dependent variables
I would like to calculate the mean, standard deviation and quantiles.
With the function FUN below I get all the means and standard deviations
that I want, but quantiles are only calculated for the first of the
dependent variables (column 8 in the summarize command). What do I have
to do differently to get quantiles for every variable?

Thanks,

Kim

library(Hmisc)   # provides llist() and summarize()

sgldm2 <- read.table("E:/analysistemp/060412_test_data2.txt", header = TRUE)
attach(sgldm2)
names(sgldm2)

FUN <- function(x) c(Mean     = mean(x, na.rm = TRUE),
                     STDEV    = sd(x, na.rm = TRUE),
                     Quantile = quantile(x, probs = c(0.25, 0.50, 0.75),
                                         na.rm = TRUE))
ordering <- llist(time_h_f, Distance_f)

resALL <- summarize(sgldm2[, 8:10], ordering, FUN)

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: Hmisc + summarize + quantile: Why only quantiles for first variable in data frame?

Frank Harrell
Please read the documentation and see the examples.  The first argument
to summarize is a matrix or vector; if it is a matrix, FUN must use
matrix operations if you want column-by-column results.
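
As an illustration of the point above (not taken from the thread), here
is a minimal base-R sketch of what "matrix operations" could mean. The
toy matrix m and the names FUN_vec and FUN_mat are hypothetical. In
current versions of R, a vector-style function called on a matrix pools
all columns; wrapping it in apply(x, 2, ...) returns one set of
statistics per column.

```r
# Toy data standing in for three dependent variables (invented values).
set.seed(1)
m <- cbind(a = rnorm(20), b = rnorm(20, mean = 5), c = rnorm(20, mean = 10))

# Vector-style summary, same shape as the FUN in the original post.
FUN_vec <- function(x) c(Mean     = mean(x, na.rm = TRUE),
                         STDEV    = sd(x, na.rm = TRUE),
                         Quantile = quantile(x, probs = c(0.25, 0.50, 0.75),
                                             na.rm = TRUE))

# Matrix-aware wrapper: run the vector summary on each column in turn.
FUN_mat <- function(x) apply(x, 2, FUN_vec)

pooled  <- FUN_vec(m)   # 5 numbers: all columns treated as one long vector
per_col <- FUN_mat(m)   # 5 x 3 matrix: one column of statistics per variable
print(per_col)
```

The per-column result has one row per statistic and one column per
variable, which is the layout the question is asking for.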

FH

--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University

Re: Hmisc + summarize + quantile: Why only quantiles for first variable in data frame?

Kim Milferstedt
Hi Frank Harrell,

Thanks for the response. I understand your comment, but I wasn't able
to find (or recognize) an answer on how to tell FUN explicitly to
use matrix operations. Would you be so kind as to give me an example?

Thanks so much,

Kim


__________________________________________

Kim Milferstedt
University of Illinois at Urbana-Champaign
Department of Civil and Environmental Engineering
4125 Newmark Civil Engineering Building
205 North Mathews Avenue MC 250
Urbana, IL 61801
USA
phone: (001) 217 333-9663
fax: (001) 217 333-6968
email: [hidden email]
http://cee.uiuc.edu/research/morgenroth/index.asp
___________________________________________

Re: Hmisc + summarize + quantile: Why only quantiles for first variable in data frame?

Frank Harrell
See http://biostat.mc.vanderbilt.edu/SasByMeansExample plus an example
in the help file for summary.formula in Hmisc, which uses the apply
function.  summary.formula and summarize use FUN in similar ways
(summary.formula unfortunately calls it 'fun').
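
A rough sketch of how the apply idea might be used with summarize
follows. It is not copied from the linked page or from the Hmisc help
files. The data frame d, the columns y1 to y3 and the helper stats1 are
invented stand-ins for the poster's data, and flattening the per-column
results into one named vector is an assumption about a convenient
return shape, not something stated in this thread. The Hmisc call only
runs if the package is installed.

```r
# Hypothetical stand-in for the poster's data (all values invented).
set.seed(42)
d <- data.frame(time_h_f   = factor(rep(c("0", "24"), each = 20)),
                Distance_f = factor(rep(c("near", "far"), times = 20)),
                y1 = rnorm(40),
                y2 = rnorm(40, mean = 5),
                y3 = rnorm(40, mean = 10))

# Summary of one numeric vector: mean, SD and the three quartiles.
stats1 <- function(v) {
  q <- quantile(v, probs = c(0.25, 0.50, 0.75), na.rm = TRUE)
  c(Mean = mean(v, na.rm = TRUE), STDEV = sd(v, na.rm = TRUE),
    Q25 = q[[1]], Q50 = q[[2]], Q75 = q[[3]])
}

# Matrix-aware FUN: apply stats1 to every column, then flatten the
# 5 x p result into one named vector (y1.Mean, y1.STDEV, ..., y3.Q75).
FUN <- function(x) {
  s <- apply(as.matrix(x), 2, stats1)
  out <- as.vector(s)
  names(out) <- paste(rep(colnames(s), each = nrow(s)), rownames(s), sep = ".")
  out
}

X <- as.matrix(d[, c("y1", "y2", "y3")])  # a matrix rather than a data frame
overall <- FUN(X)                          # 15 named values for the full data

# Only attempt the Hmisc call when the package is available.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  by <- list(time_h_f = d$time_h_f, Distance_f = d$Distance_f)
  resALL <- Hmisc::summarize(X, by, FUN)
  print(resALL)
}
```

Passing as.matrix(...) instead of the data frame follows the earlier
remark that the first argument to summarize is a matrix or vector.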

Frank
