Post for R


Post for R

R help mailing list-2

Hello, 
I want to split a data frame into 1000 groups based on the values of two columns (the max value and the second max value). First, I made two lists, L1 and L2. L1 is the data frame divided into 100 groups based on the range of the max value, and L2 is the data frame divided into 10 groups based on the second max value. Now I want to form the combinations of L1 and L2: I want to loop over L1 and, for each element of L1, split it into 10 groups based on L2. I tried to write the code, but it does not work.

L1<-split(df,cut(df$max,seq(0,1,by=0.01)))L2<-split(df,cut(df$submax,seq(0,0.2,by=0.02)))
Z<-list()G<-list()for (i in length(L1)){  Z=data.frame(L1[i])  G <- split(Z$submax,"0.02")  print(G)  }
Thanks so much!Carrie
        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Post for R

Hasan Diwan-2
Carrie,
I would suggest a few things before posting your code here:
- Include the output of dput(df) so that others can recreate your data.
- Format it properly. As it stands it won't parse, because you're missing
newlines/semicolons between statements, e.g. Z <- list(); G <- list(); for (i in
length(L1)) { Z = data.frame(L1[i]); G <- split(Z$submax, "0.02"); print(G) }
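
As a rough illustration of the dput() suggestion, here is a minimal sketch. The data frame is a made-up stand-in (the real data was never posted), and the names dfr, small and rebuilt are only placeholders.

```r
# Hypothetical stand-in for the poster's data frame (assumption, not real data)
set.seed(1)
dfr <- data.frame(max = runif(50), submax = runif(50, 0, 0.2))

# Print a copy-and-paste representation of the first few rows
small <- head(dfr, 5)
dput(small)

# Anyone on the list can rebuild the same object from that printed text
txt <- paste(capture.output(dput(small)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(small, rebuilt)
```

Pasting the dput() output into the message body, rather than describing the data in words, lets readers run the code exactly as the poster did.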
-- H

On 31 May 2017 at 19:48, carrie wang via R-help <[hidden email]>
wrote:

> [...]






Re: Post for R

Jim Lemon-4
Hi Carrie,
You may have a problem with this if some subsets are empty:

# cut each subset's own submax so the factor has one value per row of that subset
L3<-lapply(split(df,cut(df$max,seq(0,1,by=0.01))),
 function(d) split(d,cut(d$submax,seq(0,0.2,by=0.02))))
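
To make the point about empty subsets concrete, here is a small sketch with made-up values (not the poster's data). By default split() returns one element per factor level, including zero-row data frames; drop = TRUE, or filtering afterwards, removes them.

```r
# Made-up values for illustration only
dfr <- data.frame(max = c(0.05, 0.15, 0.95), submax = c(0.01, 0.03, 0.19))
f <- cut(dfr$max, seq(0, 1, by = 0.1))

# Default: one element per level, including zero-row data frames
all.groups <- split(dfr, f)
length(all.groups)

# drop = TRUE keeps only the levels that actually occur
used.groups <- split(dfr, f, drop = TRUE)
length(used.groups)

# Alternatively, filter out the empty pieces afterwards
nonempty <- Filter(function(d) nrow(d) > 0, all.groups)
length(nonempty)
```

Whether to keep or drop the empty pieces depends on what is done with the groups later; code that assumes every piece has at least one row is the usual source of trouble.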

Jim

On Thu, Jun 1, 2017 at 12:48 PM, carrie wang via R-help
<[hidden email]> wrote:

> [...]


Re: Post for R

David Carlson
As Hasan notes, your code is scrambled because you sent your message in HTML format. This list converts all mail to plain text to make it readable on a wider variety of computers and operating systems around the world. One consequence is that the HTML code for a newline gets ignored. The attached .png file shows how to send plain text messages using Yahoo Mail. Also use a subject line that is a bit more descriptive, e.g. "Splitting a data frame on two variables".

Without knowing more about your data we cannot know if your method of dividing the data into groups is sound. As Jim points out, some of your groups could be empty depending on how the variables max and submax are coded. Using str(df) would help, but we really need a sample data set to try possible approaches. That data set should not be your entire data and can be entirely made up if your data is proprietary.
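
As a sketch of how one might check for empty groups before splitting, the following uses entirely made-up data. It assumes max lies in (0, 1] and submax in (0, 0.2], as the break points in the original post suggest; the column names and the number of rows are placeholders.

```r
# Entirely made-up data (assumption: max in (0, 1], submax in (0, 0.2])
set.seed(123)
n <- 200
dfr <- data.frame(id = seq_len(n),
                  max = runif(n, 0.5, 1),
                  submax = runif(n, 0, 0.2))

# Inspect the structure, as suggested for the real data
str(dfr)

# Cross-tabulate the two sets of bins to see how many rows fall in each cell
counts <- table(cut(dfr$max, seq(0, 1, by = 0.01)),
                cut(dfr$submax, seq(0, 0.2, by = 0.02)))
dim(counts)
sum(counts == 0)   # number of empty max/submax combinations
```

With 1000 possible combinations and only a few hundred rows, most cells are necessarily empty, which is worth knowing before building anything on top of the split.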

Your loop does not work because it overwrites Z and G on each step. It is also completely unnecessary if I understand what you are trying to do. Here's a simple reproducible example that you can try. It uses only 24 cases divided into 4 max groups and 2 submax groups, but it should show you how you might handle your data. I'll divide the groups into equal ranges, not an equal number of observations, but it is not clear which you want. As a result some groups could be empty:

# Set a random seed so you will get the same numbers
set.seed(42)
# Create a simple data set with 3 variables
# Name it dfr instead of df, which is the name of a function in R
# R will normally figure out which you mean unless you make an error,
# in which case you will get a cryptic error message about "a closure"
dfr <- data.frame(var=LETTERS[1:24], max=sample.int(100, 24),
    submax=sample.int(100, 24))
# Cut the two variables into 4 and 2 equal-width groups
L1 <- cut(dfr$max, 4)
L2 <- cut(dfr$submax, 2)
# Split the data on the combinations of the two grouping factors.
# Passing both factors as a list makes split() use their interaction,
# so each row is assigned by its own max and submax group
dfr.sp <- split(dfr, list(L1, L2))
length(dfr.sp)
# We get 8 groups, 4x2 (empty combinations are kept as zero-row data frames)
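
If a loop is still wanted, the following is a minimal sketch of one that runs, using made-up data and hypothetical names (max.breaks, sub.breaks, by.max, result). It addresses three things in the original loop: for (i in length(L1)) visits only the single value length(L1), L1[i] returns a one-element list rather than the data frame, and split(Z$submax, "0.02") groups by a constant string, so everything lands in one group.

```r
# Made-up data for illustration
set.seed(42)
dfr <- data.frame(var = LETTERS[1:24], max = sample.int(100, 24),
                  submax = sample.int(100, 24))

# Fix the break points once so every subset is cut the same way
max.breaks <- seq(0, 100, by = 25)
sub.breaks <- seq(0, 100, by = 50)

by.max <- split(dfr, cut(dfr$max, max.breaks))
result <- vector("list", length(by.max))
names(result) <- names(by.max)
for (i in seq_along(by.max)) {        # seq_along() visits every element
  piece <- by.max[[i]]                # [[ ]] extracts the data frame itself
  result[[i]] <- split(piece, cut(piece$submax, sub.breaks))
}

# Same total row count as the one-step split on both factors
one.step <- split(dfr, list(cut(dfr$max, max.breaks),
                            cut(dfr$submax, sub.breaks)))
sum(sapply(result, function(g) sum(sapply(g, nrow))))
sum(sapply(one.step, nrow))
```

The loop gives a nested list (max group, then submax group), while the one-step split gives a flat list with combined names; which shape is more convenient depends on the later processing.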

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:[hidden email]] On Behalf Of Jim Lemon
Sent: Thursday, June 1, 2017 2:53 AM
To: carrie wang <[hidden email]>; r-help mailing list <[hidden email]>
Subject: Re: [R] Post for R

[...]

Attachment: YahooPlainText.png (75K)