Import Multiple csv files and merge into one Master file

XINLI LI
Dear R Group:

    How can I import multiple CSV files and merge them into one dataset?

Thanks and Regards,

   Xing

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: Import Multiple csv files and merge into one Master file

Joshua Wiley-2
Hi Xing,

This depends somewhat on what you mean by "merge", and how many files
you are talking about.  Supposing you are dealing with few enough
files you can do it manually:

dat1 <- read.csv("yourfile1.csv")
dat2 <- read.csv("yourfile2.csv")
...
datn <- read.csv("yourfilen.csv")

If each file contains unique variables for the same rows, in the same order:

complete.dat <- cbind(dat1, dat2, ... , datn)

If each one is just a continuation row-wise (same columns in every file):

complete.dat <- rbind(dat1, dat2, ... , datn)

If they are all sort of related but in no consistent way, and do not need to be:

complete.dat <- list(dat1, dat2, ... , datn)
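
To make the difference between these three options concrete, here is a small sketch on toy data. The data frames a and b and their columns are made up for illustration; they are not from the original question.

```r
# Two toy data frames with the same columns (hypothetical data)
a <- data.frame(id = 1:2, x = c(10, 20))
b <- data.frame(id = 3:4, x = c(30, 40))

# rbind(): stacks rows; the column names must match
stacked <- rbind(a, b)
dim(stacked)   # rows and columns of the stacked result

# cbind(): puts columns side by side; the row counts must match
wide <- cbind(a, y = c("u", "v"))
dim(wide)

# list(): keeps each data frame separate and unchanged
both <- list(a, b)
length(both)
```

Note that cbind() does no matching on any key; it simply pairs row 1 with row 1, row 2 with row 2, and so on.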

If you need some fancier merging than just columnwise or rowwise, look
at merge().  For documentation on these features see:

?data.frame # to find out more about data frames (which is what read.csv returns)
?list # for details about what a list is
?read.table
?read.csv # just a special wrapper for read.table
?cbind # for column binding
?rbind # for row binding
?merge # for more specialized merging
example(merge) # for examples using merge()
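
As a rough illustration of the merge() case, the sketch below joins two data frames on a shared key. The names scores, groups, and the "id" column are assumptions chosen for the example, not something from the original question.

```r
# Hypothetical data: two tables sharing a key column "id"
scores <- data.frame(id = c(1, 2, 3), score = c(90, 85, 70))
groups <- data.frame(id = c(2, 3, 4), group = c("a", "b", "a"))

# Default merge: keeps only the ids present in both tables
inner <- merge(scores, groups, by = "id")

# all = TRUE: keeps every id from either table, with NA where data is missing
outer <- merge(scores, groups, by = "id", all = TRUE)
```

Unlike cbind(), merge() matches rows by the key, so the two inputs do not need the same number of rows or the same row order.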

HTH,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

Re: Import Multiple csv files and merge into one Master file

Erik Iverson-3
In reply to this post by XINLI LI
See the R Data Import/Export manual for the first step:

http://cran.r-project.org/doc/manuals/R-data.html

?read.table should help you out.  You might use
?lapply along with read.table to read in multiple files.

Then, use ?merge, possibly in tandem with the ?Reduce
function, depending on how many data.frames you're
dealing with.
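
A minimal sketch of the lapply() plus Reduce() idea, assuming every file shares a key column. The temporary directory, the file names, and the "id" column are all made up for illustration.

```r
# Write three small example files to a fresh temporary directory (hypothetical data)
tmp <- tempfile("merge_demo")
dir.create(tmp)
write.csv(data.frame(id = 1:3, height = c(150, 160, 170)),
          file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:3, weight = c(50, 60, 70)),
          file.path(tmp, "b.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:3, age = c(30, 40, 50)),
          file.path(tmp, "c.csv"), row.names = FALSE)

# Read every csv file in the directory into a list of data frames
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
frames <- lapply(files, read.csv)

# Reduce() applies merge() pairwise: merge(merge(a, b), c)
merged <- Reduce(function(x, y) merge(x, y, by = "id"), frames)
```

With many large files, repeated merging can get slow, so this is best suited to a moderate number of data frames.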

A more specific question will probably elicit more specific
answers.


Re: Import Multiple csv files and merge into one Master file

Joshua Wiley-2
In reply to this post by Joshua Wiley-2
Hi Xinli,

You will probably have to tweak this for it to work for you, but
it at least gives you an idea.

First, put all your files in one directory.  Then use the
list.files() function to read into R the names of the files in that
directory (this is easier than typing all 100-something names).  Setting
pattern = "\\.csv$" keeps only the csv files, and full.names = TRUE
includes the directory in each name, so read.csv() can find the files
even if that directory is not your working directory.  Now you can use
lapply() to 'apply' the function read.csv() to each file name.  In my
example, I set header = TRUE just to show you how you can pass
arguments to the function you are calling with lapply().  This
results in a list in which each element is the result of one call to
read.csv().  Finally, since all the columns are the same, we can just
rbind() every data frame together.  That is accomplished by the outer
function, do.call(), whose first argument is "rbind" and whose second is
the output of lapply().

# full.names = TRUE keeps the directory in each name; pattern keeps only .csv files
filenames <- list.files(path = "~/", pattern = "\\.csv$", full.names = TRUE)
complete.dat <- do.call("rbind", lapply(filenames, read.csv, header = TRUE))
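
For a version that can be run as-is, here is a self-contained sketch of the same idea. It first writes a few example files, then checks that the columns agree before stacking, and records which file each row came from. The directory, the file names, and the "source" column are assumptions added for illustration.

```r
# Create a few example files with identical columns (hypothetical data)
demo_dir <- tempfile("csv_demo")
dir.create(demo_dir)
for (i in 1:3) {
  d <- data.frame(x = c(i, i + 0.5), y = c("a", "b"))
  write.csv(d, file.path(demo_dir, sprintf("part%d.csv", i)), row.names = FALSE)
}

# Full paths of the csv files, then one data frame per file
filenames <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
frames <- lapply(filenames, read.csv, header = TRUE)

# Check that every file has the same column names before stacking
same_cols <- vapply(frames,
                    function(f) identical(names(f), names(frames[[1]])),
                    logical(1))
stopifnot(all(same_cols))

# Add the source file name to each data frame, then stack them
frames <- Map(function(f, nm) cbind(f, source = basename(nm)), frames, filenames)
master <- do.call(rbind, frames)
```

Keeping the file name in a column is optional, but it makes it easier to trace a suspicious row back to the file it came from.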

Hope that helps,

Josh

On Fri, Oct 8, 2010 at 4:32 AM, XINLI LI <[hidden email]> wrote:

> Hi Joshua:
>
>     Thank you very much for your help. I have more than 100 files. Do
> you have a better way to merge the files into one dataset? They all have
> the same columns.
>
>    Thanks,
>
>    xinli

Re: Import Multiple csv files and merge into one Master file

djmuseR
Hi:

An alternative approach with a few less keystrokes, using the plyr package:

library(plyr)

# Create some example data files, populate them, and write them out using
# write.csv():
fnames <- paste('file0', 1:9, '.csv', sep = '')
for (i in seq_along(fnames)) {
    d <- data.frame(x = rnorm(3), y = rpois(3, 10), z = round(runif(3), 3))
    write.csv(d, fnames[i], row.names = FALSE, quote = FALSE)
}

This generates the files file01.csv through file09.csv. Each file has three
rows and three variables. Now use ldply() to read them back in with read.csv():

u <- ldply(fnames, read.csv)
> u
             x  y     z
1  -0.77367688 17 0.496
2  -0.79069791 11 0.323
3   0.69257133  9 0.229
4   0.55484202 14 0.428
5  -0.67254503  4 0.702
6   0.15010483 10 0.802
...  [27 rows in all]

ldply() is one way to circumvent the do.call(rbind, lapply(...)) mantra when
the final output is to be a data frame.

HTH,
Dennis
