loop of quartile groups


Charles Determan Jr
Greetings R users,

My goal is to generate quartile groups of each variable in my data set.  I
would like each experiment to have its designated group added as a
subsequent column.  I can accomplish this individually with the following
code:

brks <- with(data_variables, cut2(var2, g=4))

# I don't want the actual numbers, I need a numbered group
data$test1 <- factor(brks, labels=1:4)


However, I cannot get a loop to work nor can I get a loop to add the
columns with an appropriate name (ex. quartile_variable).  I have tried
multiple different ways but can't seem to get it to work.  I think it would
begin something like this:


    for(i in 11:ncol(survival_data_variables)){
        brks=as.data.frame(with(survival_data_variables,
            cut2(survival_data_variables[,i], g=4)))


Any assistance would be sincerely appreciated.  I would like the final data
set to have the following layout:


ID   var1   var2   var3   var4   quartile var1   quartile var2   quartile var3   quartile var4


Here is a subset of my data to work with:

structure(list(ID = c(11112L, 11811L, 12412L, 12510L, 13111L,
20209L, 20612L, 20711L, 21510L, 22012L), var1 = c(106, 107,
116, 67, 76, 146, 89, 62, 65, 116), var2 = c(0, 0, 201,
558, 526, 555, 576, 0, 531, 649), var3 = c(70.67, 81.33,
93.67, 84.33, 52, 74, 114, 101, 80.33, 91.33), var4 = c(136,
139, 142, 138, 140, 140, 136, 139, 140, 139)), .Names = c("ID",
"var1", "var2", "var3", "var4"), row.names = c(NA,
10L), class = "data.frame")


Regards,
Charles


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: loop of quartile groups

Rui Barradas
Hello,

There's no function cut2() in base R, but it's not very difficult to write one.
I've named your data example 'dat', it saves keystrokes.
Try the following.

dat <- structure(...etc...)

cut2 <- function(x, g = 4){
     # base-R stand-in: split x into g groups at its empirical quantiles
     cut(x, breaks = quantile(x, probs = seq(0, 1, length.out = g + 1)),
         include.lowest = TRUE)
}

fun <- function(x) {
     ct <- cut2(x, g = 4)
     factor(ct, levels = levels(ct), labels=1:4)
}

tmp <- sapply(dat[-1], fun)
colnames(tmp) <- paste("quartile", colnames(tmp), sep="_")
dat2 <- cbind(dat, tmp)
str(dat2)
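
One thing to watch in the sapply() step: when the function returns factors, sapply() simplifies the result to a character matrix, so the group columns are not stored as factors. The sketch below keeps them as factors by using lapply() instead. It is only an illustration under some assumptions: quartile_group() is a hypothetical helper name, the data frame is the sample posted in the question, and the groups are cut at base-R quantile() breakpoints, which need not match what a packaged cut2() would return.

```r
# Sample data from the original post
dat <- data.frame(
  ID   = c(11112L, 11811L, 12412L, 12510L, 13111L,
           20209L, 20612L, 20711L, 21510L, 22012L),
  var1 = c(106, 107, 116, 67, 76, 146, 89, 62, 65, 116),
  var2 = c(0, 0, 201, 558, 526, 555, 576, 0, 531, 649),
  var3 = c(70.67, 81.33, 93.67, 84.33, 52, 74, 114, 101, 80.33, 91.33),
  var4 = c(136, 139, 142, 138, 140, 140, 136, 139, 140, 139)
)

# Hypothetical helper: label each value 1..4 by the quartile interval it falls in.
# cut() stops with an error if two quantile breakpoints coincide (heavy ties).
quartile_group <- function(x) {
  brks <- quantile(x, probs = seq(0, 1, by = 0.25), na.rm = TRUE)
  cut(x, breaks = brks, include.lowest = TRUE, labels = 1:4)
}

# lapply() returns a list of factors; sapply() would collapse them to character
tmp <- lapply(dat[-1], quartile_group)
names(tmp) <- paste("quartile", names(tmp), sep = "_")
dat2 <- cbind(dat, as.data.frame(tmp))
str(dat2)
```

If integer codes are preferred over factors, as.integer() can be applied to each new column afterwards.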


Hope this helps,

Rui Barradas
On 17-10-2012 15:23, Charles Determan Jr wrote:

> Greetings R users,
>
> My goal is to generate quartile groups of each variable in my data set.  I
> would like each experiment to have its designated group added as a
> subsequent column.
> [...]


Re: loop of quartile groups

David Winsemius
In reply to this post by Charles Determan Jr

On Oct 17, 2012, at 7:23 AM, Charles Determan Jr wrote:

> Greetings R users,
>
> My goal is to generate quartile groups of each variable in my data set.  I
> would like each experiment to have its designated group added as a
> subsequent column.
> [...]

library(Hmisc)  # the package that provides cut2
dat$quart_var1 <- factor(cut2(dat$var1, g=4), labels=1:4)
dat
#------------------
       ID var1 var2   var3 var4 quart_var1
1  11112  106    0  70.67  136          3
2  11811  107    0  81.33  139          3
3  12412  116  201  93.67  142          3
4  12510   67  558  84.33  138          1
5  13111   76  526  52.00  140          2
6  20209  146  555  74.00  140          4
7  20612   89  576 114.00  136          2
8  20711   62    0 101.00  139          1
9  21510   65  531  80.33  140          1
10 22012  116  649  91.33  139          3

That won't generalize well, so you could work with this version of the same operation:

dat[paste("quart", names(dat)[2], sep="_")] <-
    factor(cut2(dat[[names(dat)[2]]], g=4), labels=1:4)
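
Building the column name from names(dat) is what makes this usable in a loop. A rough sketch of such a loop follows, with some assumptions stated up front: it uses a hypothetical base-R helper, quartile_group(), built on cut() and quantile() so that it runs without Hmisc, and the data frame is the sample from the question. Group assignments near the boundaries can differ from those produced by cut2(), so the helper should be swapped for the cut2() call if exact agreement matters.

```r
# Sample data from the original post
dat <- data.frame(
  ID   = c(11112L, 11811L, 12412L, 12510L, 13111L,
           20209L, 20612L, 20711L, 21510L, 22012L),
  var1 = c(106, 107, 116, 67, 76, 146, 89, 62, 65, 116),
  var2 = c(0, 0, 201, 558, 526, 555, 576, 0, 531, 649),
  var3 = c(70.67, 81.33, 93.67, 84.33, 52, 74, 114, 101, 80.33, 91.33),
  var4 = c(136, 139, 142, 138, 140, 140, 136, 139, 140, 139)
)

# Hypothetical stand-in for factor(cut2(x, g = 4), labels = 1:4), base R only.
# Errors if quantile breakpoints are not unique (many tied values).
quartile_group <- function(x, g = 4) {
  brks <- quantile(x, probs = seq(0, 1, length.out = g + 1), na.rm = TRUE)
  cut(x, breaks = brks, include.lowest = TRUE, labels = seq_len(g))
}

# Loop over every column except the identifier, adding one "quart_<name>" column each
for (v in setdiff(names(dat), "ID")) {
  dat[[paste("quart", v, sep = "_")]] <- quartile_group(dat[[v]])
}

names(dat)
```

The variable names are fixed before the loop starts, so the newly added quart_ columns are not themselves binned on later iterations.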

--
David Winsemius, MD
Alameda, CA, USA
