Split a list


Juliet Ndukum
I have a list of data frames, i.e. each list element is a data frame with three columns and a differing number of rows. The third column takes on only two values. I wish to split the list into two sublists based on the value of the third column of the list element.
Second issue, also with lists: I would like to reduce each sublist based on the range of the second column, i.e. keep a list element only if the range of its second column is greater than, for example, twenty.

Could someone help me with code to implement these two tasks? Thanks in advance for your help,
JN


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Split a list

Weidong Gu-2
It would be nice if you could provide a sample. However, if the data
in the list have the same colnames, you can combine them by

df<-do.call('rbind',your_list_data_frame)

Then you can do what you want on the dataframe instead of a list
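A minimal sketch of that approach, assuming every data frame in the list has the same column names. The data, the column names `x`, `y`, `g`, and the added `src` column are made up for illustration and are not from the original post:

```r
# Hypothetical list of data frames with identical column names
your_list_data_frame <- list(
  a = data.frame(x = 1:3, y = c(1, 15, 30), g = "first"),
  b = data.frame(x = 1:2, y = c(5, 10), g = "second"),
  c = data.frame(x = 1:4, y = c(0, 8, 16, 24), g = "first")
)

# Record which list element each row came from before stacking,
# because rbind() otherwise loses the element boundaries
tagged <- Map(function(d, nm) cbind(d, src = nm),
              your_list_data_frame, names(your_list_data_frame))
df <- do.call("rbind", tagged)

# Q1: split the stacked rows by the value of the third column
by_group <- split(df, df$g)

# Q2: width of the second column per original element, then keep
# only the rows of elements whose width exceeds 20
rng <- tapply(df$y, df$src, function(v) diff(range(v)))
keep <- names(rng)[rng > 20]
df_kept <- df[df$src %in% keep, ]
```

The `src` column is the price of working on one stacked data frame: without it there is no way to compute a per-element range after the rbind.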

HTH

Weidong Gu



Re: Split a list

Michael Weylandt
In reply to this post by Juliet Ndukum
Comments inline:

On Fri, Oct 14, 2011 at 9:06 AM, Juliet Ndukum <[hidden email]> wrote:
> I have a list of dataframes i.e. each list element is a dataframe with three columns and differing number of rows. The third column takes on only two values. I wish to split the list into two sublists based on the value of the third column of the list element.

Assuming the third column always takes a constant value for each
data.frame, something like this will work:

lst <- list(a = data.frame(1:3, 4:6, 7), b = data.frame(1:3, 4:6, 8),
c = data.frame(1:3, 4:6, 7))
idx <- sapply(lst, function(x) x[1,3] == 7)

lst1 = lst[idx]
lst2 = lst[!idx]
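If the two values are not known in advance, the same idea can be sketched with split(), which groups list elements by a key. This is only an illustration under the same assumption (the third column is constant within each data frame); the names `key` and `sublists` are made up here:

```r
lst <- list(a = data.frame(1:3, 4:6, 7), b = data.frame(1:3, 4:6, 8),
            c = data.frame(1:3, 4:6, 7))

# Value of the third column for each data frame, taken from its first row
key <- sapply(lst, function(x) x[1, 3])

# One sublist per distinct value, named after that value
sublists <- split(lst, key)

names(sublists)         # the distinct third-column values, as strings
names(sublists[["7"]])  # the elements whose third column is 7
```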


> Second issue with lists as well. I would like to reduce each of the sublist based on the range of the second column, i.e. if the range of the second column is greater than twenty for example keep the list element.

Similar line of argument:

# range() returns c(min, max); diff() turns that into a single width
idx <- sapply(lst, function(x) diff(range(x[, 2])) > 20)
lst1 = lst[idx]
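Note that range() returns two numbers (minimum and maximum), so the test needs one number per element, e.g. the difference between them. A small sketch with made-up data, using Filter() as an alternative to building the index by hand; `wide_enough` is a hypothetical helper name:

```r
lst <- list(a = data.frame(x = 1:3, y = c(1, 15, 30)),
            b = data.frame(x = 1:3, y = c(4, 5, 6)))

# TRUE when the second column spans more than 20 units
wide_enough <- function(d) diff(range(d[, 2], na.rm = TRUE)) > 20

# Keep only the list elements for which the predicate is TRUE
kept <- Filter(wide_enough, lst)
names(kept)
```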


Re: Split a list

David Winsemius
In reply to this post by Weidong Gu-2

On Oct 14, 2011, at 9:26 AM, Weidong Gu wrote:

> It would be nice if you could provide a sample.

That is certainly true.

> However, if the data
> in the list have the same colnames, you can combine them by
>
> df<-do.call('rbind',your_list_data_frame)
>
> Then you can do what you want on the dataframe instead of a list
>
> HTH
>
> Weidong Gu
>
>
> On Fri, Oct 14, 2011 at 9:06 AM, Juliet Ndukum <[hidden email]>  
> wrote:
>> I have a list of dataframes i.e. each list element is a dataframe  
>> with three columns and differing number of rows. The third column  
>> takes on only two values. I wish to split the list into two  
>> sublists based on the value of the third column of the list element.

Perhaps something like:
list_of_firsts <- lapply(dflist, function(x) x[x[, 3] == "first", ])
list_of_seconds <- lapply(dflist, function(x) x[x[, 3] == "second", ])

Or with subset (though without that missing example it is more difficult
to show the true value of subset), where x is one data frame from dflist:

subset(x, x[, 3] == "first")


>> Second issue with lists as well. I would like to reduce each of the  
>> sublist based on the range of the second column, i.e. if the range  
>> of the second column is greater than twenty for example keep the  
>> list element.

Same as above with an inequality sign.



David Winsemius, MD
West Hartford, CT


Re: Split a list

djmuseR
In reply to this post by Juliet Ndukum
Hi:

Following the lead of others, here's a reproducible example that I
believe achieves what you want.

# Q1:
L <- lapply(1:3, function(n)
    data.frame(x = rnorm(6), y = rnorm(6), g = rep(1:2, each = 3)))

# Using David's suggestion:
L1 <- lapply(L, function(d) subset(d, g == 1L))
L2 <- lapply(L, function(d) subset(d, g == 2L))

# Q2:
# Let range > 2 to retain in this small example:
# Find the range of the second column of each list component:
sapply(L, function(x) diff(range(x[, 2], na.rm = TRUE)))

# The code retains the data frame if the range of the second
# column is > 2, otherwise it is set to NULL:
lapply(L, function(d) if(diff(range(d[, 2], na.rm = TRUE)) > 2) d else NULL)

# If you want to collapse the result into a data frame, the base R
# approach would be
do.call('rbind', lapply(L, function(d)
    if(diff(range(d[, 2], na.rm = TRUE)) > 2) d else NULL))

# An equivalent way to do all of this in the plyr package is:
library('plyr')
L1 <- llply(L, function(d) subset(d, g == 1L))
L2 <- llply(L, function(d) subset(d, g == 2L))

ldply(L, function(d) if(diff(range(d[, 2], na.rm = TRUE)) > 2) d else NULL)

There are advantages to naming the list components if this is what you
have in mind, since both ldply() and the rbind from do.call() will
output indicators of which component data frame each observation
belongs to; ldply() uses an .id variable to designate the list component
name, whereas do.call(rbind, ...) uses rownames to distinguish
observations. For this example, try

names(L) <- paste('d', 1:3, sep = '')

and run the code above again to see the difference.
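For the base R side, the difference can be sketched as follows. The small deterministic list here is made up for illustration (it is not the rnorm() example above), and only the do.call(rbind, ...) behaviour is shown:

```r
# Hypothetical list of three small data frames
L <- lapply(1:3, function(n) data.frame(x = 1:2, y = c(0, n), g = 1:2))

# Unnamed list: the combined rows are simply numbered
unnamed <- do.call("rbind", L)
rownames(unnamed)

# Named list: each rowname is prefixed with its component name
names(L) <- paste("d", 1:3, sep = "")
named <- do.call("rbind", L)
rownames(named)
```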

HTH,
Dennis
