readLines with space-delimiter?


readLines with space-delimiter?

Seth
Hi,
I am reading a large space-delimited text file into R (41 columns and many rows) and need to run each row's values through another R object and then write the results to another text file.  So far, readLines and writeLines seem to be the best bet.  I've gotten the data exchange working, except that each row is read in as one 'chunk': all of the row's values sit inside a single quoted string ("41 numbers").  I need to split these on the spaces between them.  What is the simplest way to do this?

Code so far.

datin<-file("C:\\rforest\\data\\aoidry_predictors_85.txt", open="rt")
datout<-file("C:\\rforest\\prob85.txt",open="wt")
x<-readLines(datin,n=1)
writeLines(x,con=datout)

Thanks,
Seth

Re: readLines with space-delimiter?

jholtman
Have you considered 'scan' or 'read.table'?  This is what is mostly used in
these situations.  Read the chapter in the Intro to R on reading in data.

On Tue, May 4, 2010 at 8:10 PM, Seth <[hidden email]> wrote:

>
> Hi,
> I am reading a large space-delimited text file into R (41 columns and many
> rows) and need to do run each row's values through another R object and
> then
> write to another text file.  So, far using readLines and writeLines seems
> to
> be the best bet.  I've gotten the data exchange working except each row is
> read in as one 'chunk', meaning the row has all values between two quotes
> ("41 numbers").  I need to split these based upon the spaces between them.
> What is the simplest means of doing this?
>
> Code so far.
>
> datin<-file("C:\\rforest\\data\\aoidry_predictors_85.txt", open="rt")
> datout<-file("C:\\rforest\\prob85.txt",open="wt")
> x<-readLines(datin,n=1)
> writeLines(x,con=datout)
>
> Thanks,
> Seth
> --
> View this message in context:
> http://r.789695.n4.nabble.com/readLines-with-space-delimiter-tp2130255p2130255.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: readLines with space-delimiter?

Seth
Thanks.  I wasn't aware that scan or read.table allowed you to read in a single line, process it, output results, and then read in the next line.  This is what I need to do because the data set is too large to hold in RAM.  I did manage to do this with readLines and overcome the space-delimiter issue.
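
Seth did not post the code he ended up with. As a minimal sketch of one way it could be done, assuming every column is numeric and fields are separated by one or more spaces, each line can be split with strsplit() and converted with as.numeric(). The temporary file names and the process_row() function are hypothetical stand-ins, and a tiny demo file is written first so the snippet runs on its own.

```r
# Hypothetical demo input: three rows of space-delimited numbers.
infile  <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6", "7 8 9"), infile)

# Hypothetical stand-in for "run each row's values through another R object".
process_row <- function(v) sum(v)

datin  <- file(infile,  open = "rt")
datout <- file(outfile, open = "wt")

repeat {
  x <- readLines(datin, n = 1)   # one line per call; the open connection keeps its position
  if (length(x) == 0) break      # character(0) signals end of file
  # Split on runs of spaces, then convert the pieces to numbers.
  vals <- as.numeric(strsplit(trimws(x), " +")[[1]])
  writeLines(as.character(process_row(vals)), con = datout)
}

close(datin)
close(datout)

result <- readLines(outfile)
```

Only one row is held in memory at a time, which is the point of the approach, though reading larger blocks per call is usually much faster.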

Re: readLines with space-delimiter?

Duncan Murdoch-2
Seth wrote:
> Thanks.  I wasn't aware that scan or read.table allowed you to read in a
> single line, process it, output results, and then read in the next line.
> This is what I need to do because the data set is too large to hold in RAM.
> I did manage to do this with readLines and overcome the space-delimiter
> issue.
>  

You can read a fixed number of lines, process them, and then repeat.
The key is to open the file as a connection before calling read.table,
and not to close it between reads.  But you were probably already
doing that with readLines.
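
A minimal sketch of the chunked approach described here, assuming all columns are numeric and there is no header row. The demo file, the chunk size, and the row-sum "processing" step are hypothetical placeholders. read.table() raises an error when no lines are left on the connection, so the loop treats that error as end of file.

```r
# Hypothetical demo input: 5 rows, 3 space-delimited numeric columns.
infile <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5 6", "7 8 9", "10 11 12", "13 14 15"), infile)

con <- file(infile, open = "rt")  # open once; each read continues where the last one stopped
chunk_size <- 2                   # assumed value; choose whatever fits in memory
row_sums <- numeric(0)

repeat {
  chunk <- tryCatch(
    read.table(con, nrows = chunk_size, header = FALSE,
               colClasses = "numeric"),
    error = function(e) NULL      # no lines left on the connection
  )
  if (is.null(chunk)) break
  row_sums <- c(row_sums, rowSums(chunk))  # placeholder for the real processing
  if (nrow(chunk) < chunk_size) break      # short chunk means end of file
}

close(con)
```

Passing the file name instead of the open connection would make read.table() start from the top of the file on every call, which is why the connection is opened once outside the loop.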

Duncan Murdoch


Re: readLines with space-delimiter?

kman-4
In reply to this post by Seth
Dear Seth,

If this were my project, I would likely use something besides readLines().
Have you looked into read.table() or scan()? They'll separate based on your
delimiter on input so you do not need to do post processing.

#example file
txt<-cbind(c("A","cat","ran","over","the"),c("brown","fox.","","",""))
dfile<-file("test.txt","wt") # "wt" rather than "at", so re-running does not append duplicate lines
writeLines(paste(txt[,1], collapse=" "), dfile)
writeLines(paste(txt[,2], collapse=" "), dfile)
close(dfile)

#via scan()
scan("test.txt", sep=" ", what="") # character vector; empty fields are kept as ""
scan("test.txt", sep="\n", what="") # each line (as in readLines())
scan("test.txt", what="") # white space delimited

#untested: read the next 10 lines from the open connection datin
x<-scan(datin, what="", nlines=10)

If your forty-something columns have known data types other than
character, pass what=list(...) with one element of the right type per
column (or NULL to skip a column), and scan() will set the types as it
reads.

txt2<-cbind(c("A","cat","ran","over",5),c("the","brown","fox.","",7)) # all character after coercion
dfile<-file("test2.txt","wt") # "wt" rather than "at", so re-running does not append duplicate lines
writeLines(paste(txt2[,1], collapse=" "),dfile)
writeLines(paste(txt2[,2], collapse=" "),dfile)
close(dfile)

whatl<-list("","","","",0)
names(whatl)<-c("char1","char2","char3","another","numbers")
data.frame(scan("test2.txt", sep=" ", what=whatl))
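
The NULL option mentioned above is not exercised by the example, so here is a small sketch of it. The demo file and the field names (id, ignored, value) are made up for illustration. A skipped field comes back as a NULL list element, so it is dropped before building the data frame.

```r
# Hypothetical demo file: a character column, a column to skip, and a numeric column.
demo <- tempfile(fileext = ".txt")
writeLines(c("a skip1 1.5", "b skip2 2.5"), demo)

# NULL in the 'what' list tells scan() to skip that field.
what_spec <- list(id = "", ignored = NULL, value = 0)
fields <- scan(demo, what = what_spec, quiet = TRUE)

# Keep only the fields that were actually read.
kept <- fields[!vapply(fields, is.null, logical(1))]
df <- data.frame(kept, stringsAsFactors = FALSE)
```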

The rwiki has a page with more detail on scan().
http://rwiki.sciviews.org/doku.php?id=large_scale_data:lsdioi_scangritty

Sincerely,
KeithC.


Re: readLines with space-delimiter?

kman-4
In reply to this post by Seth
One line at a time! That must have taken forever!

Do you happen to know how many lines you could read in at once & still have
enough room to work with?

Sincerely,
KeithC.

-----Original Message-----
From: Seth [mailto:[hidden email]]
Sent: Tuesday, May 04, 2010 11:05 PM
To: [hidden email]
Subject: Re: [R] readLines with space-delimiter?


Thanks.  I wasn't aware that scan or read.table allowed you to read in a
single line, process it, output results, and then read in the next line.
This is what I need to do because the data set is too large to hold in RAM.
I did manage to do this with readLines and overcome the space-delimiter
issue.