by output into data frame

David Perlman
I could do this in various hacky ways, but what's the right way?

I have a nice application of the by function, which does what I want.  The output looks like this:

> auc_stress
lab.samples.stress$subid: 2
  cortisol amylase
1   919.05  6834.8
---------------------------------------------------------------------------------------------------------------------------
lab.samples.stress$subid: 3
   cortisol  amylase
11   728.25 24422.05

etc.

What I want is a data frame roughly like this:

subid  cortisol.auc  amylase.auc
2      919.05        6834.8
3      728.25        24422.05

etc.

What is a nice way to make that happen?



Here is the code and data that I am using, which should run directly if you copy and paste it:


sanity.check<-read.csv("http://brainimaging.waisman.wisc.edu/~perlman/testdata.csv", header=TRUE, sep = ",")
lab.samples <- subset(sanity.check,Sample!='before bed' & Sample!='morning after')
lab.samples$Sample<-factor(lab.samples$Sample)
lab.samples.stress<-subset(lab.samples,challenge=='stress')
lab.samples.control<-subset(lab.samples,challenge=='control')

auc_ground <- function(sub_df) {
        print(sub_df)
        auc<-sub_df[1,]*0
        timedif<-c(60,10,10,10,10,10,10)
        for (i in 1:(nrow(sub_df)-1) ) {
                print(c(i,i+1))
                #print(c(values[i],values[i+1]))
                pair_area<-(sub_df[i,]+sub_df[i+1,])*timedif[i]/2
                auc<-auc+pair_area
        }
        auc
}

auc_stress<-by(lab.samples.stress[c('cortisol','amylase')], lab.samples.stress$subid, auc_ground, simplify=TRUE)
auc_control<-by(lab.samples.control[c('cortisol','amylase')], lab.samples.control$subid, auc_ground, simplify=TRUE)


Thanks for your help!

P.S. Sorry if this question has been answered before; it is nearly impossible to get useful Google results for search terms like "by"... too common a word.


-dave----------------------------------------------------------------------
A neuroscientist is at the video arcade, when someone makes him a $1000 bet
on Pac-Man. He smiles, gets out his screwdriver and takes apart the Pac-Man
game. Everyone says "What are you doing?" The neuroscientist says "Well,
since we all know that Pac-Man is based on electric signals traveling
through these circuits, obviously I can understand it better than the other
guy by going straight to the source!"

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: by output into data frame

Jorge I Velez
Hi David,

Thank you for the reproducible example!

Try

> do.call(rbind, auc_stress)
  cortisol  amylase
2   919.05  6834.80
3   728.25 24422.05
4  2106.00 25908.35
6   636.40 12209.75
7  1925.95  4749.25

> do.call(rbind, auc_control)
  cortisol  amylase
2   604.90  2458.00
4   587.65 29954.55
6   493.60 13833.80
7  1211.00  4932.35

HTH,
Jorge.-
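
The do.call(rbind, ...) result above keeps the subject ids as row names rather than as a subid column, which is what the original question asked for. A minimal base R sketch of that last step follows. The toy data frame and the sum() summary are made-up stand-ins for lab.samples.stress and auc_ground, not the thread's real data.

```r
# Toy data standing in for lab.samples.stress (values are made up)
toy <- data.frame(
  subid    = c(2, 2, 2, 3, 3, 3),
  cortisol = c(1, 2, 3, 4, 5, 6),
  amylase  = c(10, 20, 30, 40, 50, 60)
)

# by() returns one one-row data frame per subject;
# sum() is only a placeholder for the real per-subject summary
sums <- by(toy[c("cortisol", "amylase")], toy$subid, function(d) {
  data.frame(cortisol.auc = sum(d$cortisol), amylase.auc = sum(d$amylase))
})

# Stack the pieces; the group labels become the row names
out <- do.call(rbind, sums)

# Promote the row names to an explicit subid column, then drop them
out <- cbind(subid = as.numeric(rownames(out)), out)
rownames(out) <- NULL
print(out)
```

as.numeric() is appropriate here only because the ids are numbers; for character ids, use rownames(out) directly.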


On Mon, Mar 19, 2012 at 5:44 PM, David Perlman wrote:

> I could do this in various hacky ways, but what's the right way?
> [...]

Re: by output into data frame

Peter Ehlers
On 2012-03-19 15:00, Jorge I Velez wrote:

> Try
>
> > do.call(rbind, auc_stress)
> [...]


Or with the plyr package:

library(plyr)
ldply(auc_stress)

Peter Ehlers


Re: by output into data frame

Peter Meilstrup
In reply to this post by David Perlman
Thanks for providing a reproducible example.

Using the plyr package you can write your whole computation more compactly:

library(plyr)
library(caTools) #for trapz

auc <- ddply(lab.samples, .(challenge, subid), function(df) {
  # sample times: the first gap is 60, later gaps are 10 (same as timedif)
  df$time <- c(0, seq(60, by = 10, length.out = nrow(df) - 1))
  summarize(df,
            cortisol = trapz(time, cortisol),
            amylase  = trapz(time, amylase))
})
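
For readers without plyr or caTools installed, the same split/integrate/combine idea can be sketched in base R. This is an illustration only: trapezoid() is a hypothetical helper written here (not caTools::trapz), and the toy data frame is made up rather than taken from testdata.csv.

```r
# Trapezoidal rule: interval widths times the mean of adjacent heights
trapezoid <- function(x, y) {
  n <- length(y)
  sum(diff(x) * (y[-n] + y[-1]) / 2)
}

# Made-up measurements: two subjects, three samples each
toy <- data.frame(
  challenge = "stress",
  subid     = rep(c(2, 3), each = 3),
  cortisol  = c(1, 3, 5, 2, 2, 2),
  amylase   = c(10, 10, 40, 0, 20, 0)
)

# Split by challenge and subject, integrate each piece, then stack the rows
pieces <- split(toy, list(toy$challenge, toy$subid), drop = TRUE)
auc <- do.call(rbind, lapply(pieces, function(df) {
  # Sampling times as in the thread: first gap 60, later gaps 10
  time <- c(0, seq(60, by = 10, length.out = nrow(df) - 1))
  data.frame(challenge = df$challenge[1],
             subid     = df$subid[1],
             cortisol  = trapezoid(time, df$cortisol),
             amylase   = trapezoid(time, df$amylase))
}))
rownames(auc) <- NULL
print(auc)
```

Because the grouping columns are written into each piece explicitly, the result already has challenge and subid as ordinary columns and needs no row-name cleanup beyond the reset at the end.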
