Creating one df from 85 df present in a list

aureta
Hi, I am trying to join several data frames side by side (cbind or merge,
NOT rbind). They have different numbers of rows and are all stored in one
list. I am using the code extract shown below. The function merge() works
well with two data frames but not with more than two, and I have 85 data
frames in the list to join this way. Could you please let me know how to
get all 85 merged? Thanks.

fusion_de_tablas = merge(
  red_tablas_por_punto[["1 - Bv.Artigas y la Rambla (Terminal CUTCSA)"]],
  red_tablas_por_punto[["10 - Avenida Millán 2515 (Hospital Vilardebó)"]],
  red_tablas_por_punto[["100 - Fauquet 6358 (Hospital Saint Bois)"]],
  by = 'toma_de_muestras', all = TRUE)

--
*Alejandro *

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Creating one df from 85 df present in a list

Bert Gunter-2
?do.call  -- takes a list of arguments to a function
... as in
do.call(merge, yourlist)  ## or similar perhaps
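
A minimal sketch of how do.call relates to merge, assuming three small
made-up data frames (df1, df2, df3) that share an "id" column; none of
these names come from the original post. do.call(f, list(p, q)) builds
the call f(p, q). Since merge() joins two data frames at a time (its x
and y arguments), a longer list can be folded pairwise with Reduce:

```r
# Hypothetical example data: three tables sharing the key column "id".
df1 <- data.frame(id = 1:3, a = c(10, 20, 30))
df2 <- data.frame(id = 2:4, b = c("x", "y", "z"))
df3 <- data.frame(id = 3:5, c = c(TRUE, FALSE, TRUE))

# do.call() spreads the list elements out as the arguments of one call,
# so this is the same as merge(df1, df2, by = "id", all = TRUE).
two <- do.call(merge, list(df1, df2, by = "id", all = TRUE))

# For more than two tables, merge the running result with the next
# table, one pair at a time.
all_three <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE),
                    list(df1, df2, df3))
all_three
```

With all = TRUE every id that occurs in any table is kept, and the cells
a table has no value for are filled with NA.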


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Re: Creating one df from 85 df present in a list

Rasmus Liland-3
On 2020-06-10 13:14 -0700, Bert Gunter wrote:

> On Wed, Jun 10, 2020 at 11:48 AM Alejandro Ureta wrote:
> >
> > hi, I am trying to fuse (cbind, merge...
> > NOT rbind) several dataframes with
> > different numbers of rows, all df
> > included in a list, and using the code
> > extract shown below. The function merge()
> > works well with two df but not more than
> > two...I have 85 dataframes to join in
> > this way (85 df in the list)....could you
> > please let me know how to get all 85 df
> > merged ?,,,,, thanks
> >
> > fusion_de_tablas = merge(red_tablas_por_punto[["1 - Bv.Artigas y la Rambla
> > (Terminal CUTCSA)"]],
> > red_tablas_por_punto[["10 - Avenida Millán 2515 (Hospital Vilardebó)"]],
> > red_tablas_por_punto[["100 - Fauquet 6358 (Hospital Saint Bois)"]],
> > by= 'toma_de_muestras', all = T )
>
> ?do.call  -- takes a list of arguments to a function
> ... as in
> do.call(merge, yourlist)  ## or similar perhaps

Dear Alejandro,

It would be easier to help you if you
provided an example of what fusion_de_tablas
should look like.

Here is a small example of combining some
odd-sized data frames in which some columns
are shared and some are named differently.

        red_tablas_por_punto <-
          list(
            "1 - Bv.Artigas y la Rambla (Terminal CUTCSA)" =
              data.frame("a"=1:3,
                         "b"=4:6,
                         "c"=4:6,
                         'toma_de_muestras'=1),
            "10 - Avenida Millán 2515 (Hospital Vilardebó)" =
              data.frame("d"=4:8,
                         "b"=8:12,
                         'toma_de_muestras'=7),
            "100 - Fauquet 6358 (Hospital Saint Bois)" =
              data.frame("e"=100:101,
                         "a"=85:86,
                         'toma_de_muestras'=4)
          )
        unified.df <- lapply(names(red_tablas_por_punto),
          function(tabla, cn) {
            x <- red_tablas_por_punto[[tabla]]
            x[,cn[!(cn %in% colnames(x))]] <- NA
            x <- x[,cn]
            x$tabla <- tabla
            return(x)
          }, cn=unique(unlist(lapply(red_tablas_por_punto, colnames))))
        unified.df <- do.call(rbind, unified.df)
        unified.df

which yields

            a  b  c toma_de_muestras  d   e                                         tabla
        1   1  4  4                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
        2   2  5  5                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
        3   3  6  6                1 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
        4  NA  8 NA                7  4  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        5  NA  9 NA                7  5  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        6  NA 10 NA                7  6  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        7  NA 11 NA                7  7  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        8  NA 12 NA                7  8  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        9  85 NA NA                4 NA 100      100 - Fauquet 6358 (Hospital Saint Bois)
        10 86 NA NA                4 NA 101      100 - Fauquet 6358 (Hospital Saint Bois)

I also found [1] that you could use merge,
as you tried, together with Reduce:

        Reduce(function(x, y)
          merge(x, y, by='toma_de_muestras', all=TRUE),
          red_tablas_por_punto)

which yields (with toma_de_muestras set to
10001:10010 instead of the constants above)

           toma_de_muestras a.x b.x  c  d b.y   e a.y
        1             10001   1   4  4 NA  NA  NA  NA
        2             10002   2   5  5 NA  NA  NA  NA
        3             10003   3   6  6 NA  NA  NA  NA
        4             10004  NA  NA NA  4   8  NA  NA
        5             10005  NA  NA NA  5   9  NA  NA
        6             10006  NA  NA NA  6  10  NA  NA
        7             10007  NA  NA NA  7  11  NA  NA
        8             10008  NA  NA NA  8  12  NA  NA
        9             10009  NA  NA NA NA  NA 100  85
        10            10010  NA  NA NA NA  NA 101  86

where the partly shared “a” and “b” columns
are not unified but split into a.x/a.y and
b.x/b.y ...  thus, I like my initial
step-by-step apply-based solution better ...
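
If a wide, merge-style result is what is wanted, one possible way around
the .x/.y suffixes is to give every non-key column a table-specific
prefix before merging. This is only a sketch under assumptions: the
list `tablas`, its short names p1 to p3, and the helper variables
`clave`, `prefijadas` and `ancho` are made up for illustration; only the
key name toma_de_muestras is taken from the thread.

```r
# Hypothetical stand-in for the list of tables, with overlapping keys.
tablas <- list(
  p1 = data.frame(toma_de_muestras = 1:3, a = 1:3, b = 4:6),
  p2 = data.frame(toma_de_muestras = 2:4, b = 7:9),
  p3 = data.frame(toma_de_muestras = 3:5, a = 10:12, b = 13:15)
)
clave <- "toma_de_muestras"

# Prefix every column except the key with the name of its table, so
# that no two tables share a non-key column name.
prefijadas <- Map(function(df, nombre) {
  otras <- names(df) != clave
  names(df)[otras] <- paste(nombre, names(df)[otras], sep = ".")
  df
}, tablas, names(tablas))

# Full outer join of all tables on the key, one pair at a time.
ancho <- Reduce(function(x, y) merge(x, y, by = clave, all = TRUE),
                prefijadas)
names(ancho)
```

Each measurement column then records which table it came from, at the
price of one column per table and variable.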

Best,
Rasmus

[1] https://stackoverflow.com/questions/22644780/merging-multiple-csv-files-in-r-using-do-call

Re: Creating one df from 85 df present in a list

Rasmus Liland-3
On 2020-06-13 01:54 +0200, Rasmus Liland wrote:
> Dear Alejandro,

Sorry, I programmed and wrote that email at
the same time, changed “toma_de_muestras”
and perhaps other things, then continued
programming, so this version might make more
sense ...

Firstly, it would be easier to help you if
you provided an example of what
fusion_de_tablas should look like.

In this first example, I create a small list
of oddly shaped data.frames which might look
like your 85-element-long list.  Then I
determine the unique column names.  Lastly,
I lapply my way through the list again to
fill in NA in the columns that are missing,
so that rbind, called through do.call,
receives data.frames with identical columns ...

        red_tablas_por_punto <-
          list(
            "1 - Bv.Artigas y la Rambla (Terminal CUTCSA)" =
              data.frame("a"=1:3,
                         "b"=4:6,
                         "c"=4:6,
                         'toma_de_muestras'=10001:10003),
            "10 - Avenida Millán 2515 (Hospital Vilardebó)" =
              data.frame("d"=4:8,
                         "b"=8:12,
                         'toma_de_muestras'=10004:10008),
            "100 - Fauquet 6358 (Hospital Saint Bois)" =
              data.frame("e"=100:101,
                         "a"=85:86,
                         'toma_de_muestras'=10009:10010)
          )
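        unified.df <- lapply(names(red_tablas_por_punto),
          function(tabla, cn) {
            x <- red_tablas_por_punto[[tabla]]
            # add the columns this table lacks, filled with NA
            x[,cn[!(cn %in% colnames(x))]] <- NA
            # put the columns in the same order in every table
            x <- x[,cn]
            # remember which table each row came from
            x$tabla <- tabla
            return(x)
          }, cn=unique(unlist(lapply(red_tablas_por_punto, colnames))))
        # stack the now identically shaped tables
        unified.df <- do.call(rbind, unified.df)
        unified.df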

yields this:

            a  b  c toma_de_muestras  d   e                                         tabla
        1   1  4  4            10001 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
        2   2  5  5            10002 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
        3   3  6  6            10003 NA  NA  1 - Bv.Artigas y la Rambla (Terminal CUTCSA)
        4  NA  8 NA            10004  4  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        5  NA  9 NA            10005  5  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        6  NA 10 NA            10006  6  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        7  NA 11 NA            10007  7  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        8  NA 12 NA            10008  8  NA 10 - Avenida Millán 2515 (Hospital Vilardebó)
        9  85 NA NA            10009 NA 100      100 - Fauquet 6358 (Hospital Saint Bois)
        10 86 NA NA            10010 NA 101      100 - Fauquet 6358 (Hospital Saint Bois)

... right, so you could also use merge with
Reduce, as in that Stack Overflow answer [1],
which might have been what you were looking
for anyway:

        Reduce(function(x, y)
          merge(x, y, by='toma_de_muestras', all=TRUE),
          red_tablas_por_punto)

yields this:

           toma_de_muestras a.x b.x  c  d b.y   e a.y
        1             10001   1   4  4 NA  NA  NA  NA
        2             10002   2   5  5 NA  NA  NA  NA
        3             10003   3   6  6 NA  NA  NA  NA
        4             10004  NA  NA NA  4   8  NA  NA
        5             10005  NA  NA NA  5   9  NA  NA
        6             10006  NA  NA NA  6  10  NA  NA
        7             10007  NA  NA NA  7  11  NA  NA
        8             10008  NA  NA NA  8  12  NA  NA
        9             10009  NA  NA NA NA  NA 100  85
        10            10010  NA  NA NA NA  NA 101  86
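
The a.x/a.y and b.x/b.y columns above appear because only
toma_de_muestras is used as the key. A possible variant, sketched here
as an assumption rather than as part of the answer above, is to leave
`by` at its default, in which case merge() joins on every column name
the two tables share. The name `merged_all` is made up for this sketch.

```r
# Same example list as above, rebuilt so the block runs on its own.
red_tablas_por_punto <- list(
  "1 - Bv.Artigas y la Rambla (Terminal CUTCSA)" =
    data.frame(a = 1:3, b = 4:6, c = 4:6,
               toma_de_muestras = 10001:10003),
  "10 - Avenida Millán 2515 (Hospital Vilardebó)" =
    data.frame(d = 4:8, b = 8:12,
               toma_de_muestras = 10004:10008),
  "100 - Fauquet 6358 (Hospital Saint Bois)" =
    data.frame(e = 100:101, a = 85:86,
               toma_de_muestras = 10009:10010)
)

# No `by` argument: each pairwise merge uses all shared column names
# as the key, so shared columns such as a and b stay single columns.
merged_all <- Reduce(function(x, y) merge(x, y, all = TRUE),
                     red_tablas_por_punto)
names(merged_all)
```

Note that this changes the meaning of the join: two rows with the same
toma_de_muestras but different values in a shared column are kept as
separate rows instead of being combined into one.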

Best,
Rasmus

[1] https://stackoverflow.com/questions/22644780/merging-multiple-csv-files-in-r-using-do-call
