read an irregular text file into a data frame

j.joshua thomas
I am using R 2.4.1 to read a text file that has the data structure shown below.

When I read the file into R using

tData<-read.table("c:\\test.txt")

it gives an error about irregular columns in the data set. However, I need
to use data of the type shown below.

Is there any alternative in R?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

0010 0028 0061 0088
0010 0042 0084
0004 0010 0055
0010 0018 0040 0042
0010 0046 0059
0010 0016 0042 0055
0010 0012 0018 0054
0010 0034 0042 0102
0081
0001 0076 0085
0080 0086
0017 0032 0081
0004 0010 0055
0010 0042 0061 0080
0010 0017 0078 0084
0006 0010 0040 0042
0075 0080
0005 0028 0032
0006 0010 0040 0061
--
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: read a irregular text file data into dataframe()

Stephen Tucker
I don't know of any canned function to do this, but you can write your own
function (see below) to:

(1) open a file connection
(2) count the number of fields on each line
(3) create an empty matrix with one row per line and as many columns as the
longest line of your data
(4) rewind to the beginning of the file
(5) scan line by line and fill the matrix
(6) close the file connection
(7) convert the matrix to a data frame
(8) use the function type.convert to automatically convert numerical columns
to mode numeric (scan(), as I've specified it, reads everything in as mode
character, which changes the holding matrix's mode from its default of
logical to character).
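Step (2) relies on count.fields(), which reports how many
whitespace-separated fields each line contains. A minimal sketch on a few
lines of the sample data follows; the temporary file is only a stand-in for
the real path and is an assumption of this example.

```r
# Write four lines of the sample data to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("0010 0028 0061 0088",
             "0010 0042 0084",
             "0081",
             "0080 0086"), tmp)

# One field count per line; unequal counts are what read.table() objects to
nFields <- count.fields(tmp)
print(nFields)        # one count per line: 4 3 1 2

# The widest line determines how many columns the holding matrix needs
print(max(nFields))   # 4

unlink(tmp)
```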

The function below will work for your example data set. To make it more
general, you can add arguments such as 'what' to scan() and 'sep' to both
count.fields() and scan(); depending on whether you have column names, you
can modify it accordingly as well.

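# this is the function

read.irregular <- function(filenm) {
  # open a text-mode connection so the file can be read twice
  fileID <- file(filenm, open = "rt")
  # number of fields on each line
  nFields <- count.fields(fileID)
  # holding matrix filled with NA: one row per line, one column per field
  mat <- matrix(nrow = length(nFields), ncol = max(nFields))
  # rewind to the beginning of the file
  invisible(seek(fileID, where = 0, origin = "start", rw = "read"))
  for (i in seq_len(nrow(mat))) {
    # read one line as character and store it in row i
    mat[i, seq_len(nFields[i])] <- scan(fileID, what = "", nlines = 1, quiet = TRUE)
  }
  close(fileID)
  # keep the columns as character so type.convert can decide their type
  df <- as.data.frame(mat, stringsAsFactors = FALSE)
  df[] <- lapply(df, type.convert, as.is = TRUE)
  return(df)
}

# call function with this line
df <- read.irregular("c:\\test.txt")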
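
The generalisation mentioned above (a configurable separator) could be
sketched as follows. This is not the function above: it swaps in readLines()
and strsplit() for count.fields() and scan(), so no rewind of the connection
is needed. The name read.irregular2 and its 'sep' argument (a regular
expression) are made up for illustration, and it assumes a reasonably recent
R, since trimws() and lengths() are not available in R 2.4.1.

```r
read.irregular2 <- function(filenm, sep = "[[:space:]]+") {
  # read the whole file once and drop blank lines
  lines <- trimws(readLines(filenm))
  lines <- lines[nzchar(lines)]
  # split every line into its fields; 'sep' is treated as a regular expression
  fields <- strsplit(lines, sep)
  nFields <- lengths(fields)
  # character matrix padded with NA for the short lines
  mat <- matrix(NA_character_, nrow = length(fields), ncol = max(nFields))
  for (i in seq_along(fields)) {
    mat[i, seq_len(nFields[i])] <- fields[[i]]
  }
  df <- as.data.frame(mat, stringsAsFactors = FALSE)
  # convert columns that look numeric; note this drops leading zeros
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

# usage on a temporary stand-in for the real file
tmp <- tempfile(fileext = ".txt")
writeLines(c("0010 0028 0061 0088", "0010 0042 0084", "0081"), tmp)
d <- read.irregular2(tmp)
print(d)
unlink(tmp)
```

If the leading zeros matter (for example, when the values are codes rather
than numbers), the type.convert step can simply be left out so the columns
stay character.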

Hope this helps.


Re: read a irregular text file data into dataframe()

Petr Klasterecky
In reply to this post by j.joshua thomas
read.table("c:\\test.txt",fill=TRUE)
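
A short sketch of what fill=TRUE does, using a temporary file as a stand-in
for c:\\test.txt (the file contents and the variable names tChar and nCols
are assumptions of this example). Two points are worth knowing: read.table()
decides the number of columns from the first five lines unless col.names is
supplied, and by default the values are converted to numbers, which drops
the leading zeros.

```r
tmp <- tempfile(fileext = ".txt")
writeLines(c("0010 0028 0061 0088",
             "0010 0042 0084",
             "0081",
             "0080 0086"), tmp)

# fill = TRUE pads the short rows; numeric columns get NA in the gaps
tData <- read.table(tmp, fill = TRUE)
print(tData)

# To keep the leading zeros, read everything as character. Supplying
# col.names sized from count.fields() also guards against a longest line
# that appears after the first five lines.
nCols <- max(count.fields(tmp))
tChar <- read.table(tmp, fill = TRUE, colClasses = "character",
                    col.names = paste0("V", seq_len(nCols)))
print(tChar)

unlink(tmp)
```

In the character version the padded cells come back as empty strings rather
than NA, so they may need recoding before further analysis.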

Petr

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
