Using apply function to merge list of data frames

Using apply function to merge list of data frames

Naresh Gurbuxani
I have a list whose components are data frames.  My goal is to construct a data frame by merging all the list components.  Is it possible to achieve this using apply and without a for loop, as used below?

Thanks,
Naresh

# Example data: two data frames sharing the same weekly dates
mylist <- list(A = data.frame(date = seq.Date(as.Date('2018-01-01'), by = 'week',
                                  length.out = 5), ret = rnorm(5)),
               B = data.frame(date = seq.Date(as.Date('2018-01-01'), by = 'week',
                                  length.out = 5), ret = rnorm(5)))

# Start from a data frame holding only the key column
mydf <- data.frame(date = seq.Date(as.Date('2018-01-01'), by = 'week', length.out = 5))

# Merge each component in turn, renaming 'ret' to 'ret.A', 'ret.B', ...
for (ch in names(mylist)) {
    tempdf <- mylist[[ch]]
    names(tempdf)[2] <- paste(names(tempdf)[2], ch, sep = '.')
    mydf <- merge(mydf, tempdf, by = 'date')
}
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Using apply function to merge list of data frames

Berend Hasselman


> On 25 Jul 2018, at 08:17, Naresh Gurbuxani <[hidden email]> wrote:
> [...]

See if these help.

This R-help thread:

https://stat.ethz.ch/pipermail/r-help/2018-May/454249.html

and this Stack Overflow question:

https://stackoverflow.com/questions/4512465/what-is-the-most-efficient-way-to-cast-a-list-as-a-data-frame?rq=1

Berend

Re: Using apply function to merge list of data frames

S Ellison-2
In reply to this post by Naresh Gurbuxani
Short answer: do.call()

do.call("rbind", df.list)
will rbind all of the data frames in df.list.

You may have to tidy up row names afterwards, and you will need to make sure that the data frames all have the same column names and each column has the same class, or you'll get unexpected results.
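
As a minimal sketch of the do.call() approach, assuming a small made-up df.list (the data and the extra source column are illustrative, not from the original post): note that this stacks the rows of the data frames on top of each other; it does not join them by date.

```r
# Hypothetical example data: two data frames with identical columns
df.list <- list(
  A = data.frame(date = as.Date('2018-01-01') + 7 * 0:4, ret = 1:5 / 10),
  B = data.frame(date = as.Date('2018-01-01') + 7 * 0:4, ret = 6:10 / 10)
)

# Stack all data frames row-wise in a single rbind() call
stacked <- do.call("rbind", df.list)

# Row names are built from the list names ("A.1", "A.2", ...); reset them
rownames(stacked) <- NULL

# Record which list component each row came from (illustrative extra column)
stacked$source <- rep(names(df.list), times = sapply(df.list, nrow))
```

Because every data frame here has the same column names and classes, rbind() has nothing to reconcile; with mismatched columns the call would fail or coerce.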

S Ellison

Re: Using apply function to merge list of data frames

Jeff Newmiller
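Er, rbind is not merge. do.call expects the function you specify to handle all the elements of the list in a single invocation, whereas Reduce works with a two-argument function, applying it pairwise along the list. Reduce has no ... argument through which extras such as by could be passed on, so wrap merge in an anonymous function:

Reduce(function(x, y) merge(x, y, by = 'date'), df.list)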
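
A sketch of how folding the list with Reduce might look on data shaped like the original question, assuming three components (C is added here only for illustration). The ret columns are renamed first, as in the original for loop, so that merge does not have to invent suffixes for clashing names.

```r
# Hypothetical example data with the same shape as in the question
set.seed(1)
dates <- seq.Date(as.Date('2018-01-01'), by = 'week', length.out = 5)
mylist <- list(A = data.frame(date = dates, ret = rnorm(5)),
               B = data.frame(date = dates, ret = rnorm(5)),
               C = data.frame(date = dates, ret = rnorm(5)))

# Rename 'ret' to 'ret.A', 'ret.B', ... so merged columns stay distinguishable
renamed <- Map(function(df, nm) {
  names(df)[names(df) == 'ret'] <- paste('ret', nm, sep = '.')
  df
}, mylist, names(mylist))

# Fold the list pairwise with a two-argument wrapper around merge()
mydf <- Reduce(function(x, y) merge(x, y, by = 'date'), renamed)
```

The anonymous function is what carries by = 'date' into each merge() call; Reduce() itself only ever passes two arguments, the running result and the next list element.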

For clarity: apply and the like have for loops inside them, so the primary benefit is a compact and easy to read invocation.

Do not assume that this syntax will perform appreciably differently from the for loop solution. In particular, merge is potentially a very slow operation, so if your real data frames all have identical keys, as in your example, using cbind could improve performance significantly. Also, both your for loop and Reduce allocate memory as they go, which can lead to memory thrashing with large data sets. If this is an issue for you, then you might want to roll your own preallocating for loop or use a function like bind_cols that has that feature [1][2].

[1] http://r4ds.had.co.nz/iteration.html
[2] https://dplyr.tidyverse.org/reference/bind.html
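
To illustrate the cbind suggestion, here is a rough base-R sketch. It assumes every data frame has exactly the same date column in the same order (checked explicitly below) and uses made-up data; it is one possible way to do it, not a benchmarked recommendation.

```r
# Hypothetical example data: identical keys in every component
set.seed(1)
dates <- seq.Date(as.Date('2018-01-01'), by = 'week', length.out = 5)
mylist <- list(A = data.frame(date = dates, ret = rnorm(5)),
               B = data.frame(date = dates, ret = rnorm(5)))

# Verify the assumption: all date columns match the first one exactly
same_key <- all(vapply(mylist,
                       function(df) identical(df$date, mylist[[1]]$date),
                       logical(1)))
stopifnot(same_key)

# Pull out each 'ret' column and name it after its list component
rets <- lapply(mylist, `[[`, 'ret')
names(rets) <- paste('ret', names(mylist), sep = '.')

# Bind the shared key and the value columns side by side, with no join
mydf <- cbind(data.frame(date = mylist[[1]]$date), as.data.frame(rets))
```

If the keys differ in any way (order, missing dates), the stopifnot() check fails and merge is the safer choice.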


On July 27, 2018 4:45:31 AM PDT, S Ellison <[hidden email]> wrote:
> [...]
