Help with read.csv

Giovanni Petris

Hello,

I have a file that looks like this:

Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,
1/1/2006,1,3393.9,3412,76.65,105.04,,
1/1/2006,2,3173.3,3202,69.20,67.67,,
1/1/2006,3,3040.0,3051,69.20,77.67,,
1/1/2006,4,2998.2,2979,67.32,69.10,,
1/1/2006,5,3005.8,2958,65.20,68.34,,

where the ',' is the separator and I tried to read it into R, but...

> y <- read.csv("Data/Data_tmp.csv", header = FALSE, skip = 1,
+               colClasses = c("character", "int", rep("double", 4)),
+               col.names = c("Date","Hour","DA_DMD","DMD","DA_RTP", "RTP"),
+               flush = TRUE)
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  more columns than column names

count.fields() gives me 8 fields per line, so I tried other variations,
like the following, with two fictitious extra fields, but...

> y <- read.csv("Data/Data_tmp.csv", header = FALSE, skip = 1,
+               colClasses = c("character", "int", rep("double", 6)),
+               col.names = c("Date","Hour","DA_DMD","DMD","DA_RTP",
+               "RTP", "XXX", "YYY"))
Error in methods::as(data[[i]], colClasses[i]) :
  no method or default for coercing "character" to "int"

Could anybody please tell me what I am doing wrong and how I could read
my data into R?

Thanks in advance,
Giovanni






--

Giovanni Petris  <[hidden email]>
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Help with read.csv

David Wolfskill
I copied the data locally, then I read it using:

> y <- read.csv("Data_tmp.csv", header = TRUE, colClasses = c("character", "integer", rep("double", 4), "NULL", "NULL"))

which yields:

> y
      Date Hour DA_DMD  DMD DA_RTP    RTP
1 1/1/2006    1 3393.9 3412  76.65 105.04
2 1/1/2006    2 3173.3 3202  69.20  67.67
3 1/1/2006    3 3040.0 3051  69.20  77.67
4 1/1/2006    4 2998.2 2979  67.32  69.10
5 1/1/2006    5 3005.8 2958  65.20  68.34
>

Is that what you had in mind?

Issues I tried to address:
* The class is "integer", not "int".
* I used "NULL" class to refer to columns that are to be skipped.
* I made use of the headers, rather than skipping them & re-coding their
  content in the read.csv() invocation.
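
For reference, the three points above can be put together in one self-contained sketch. It is only an illustration: the data are pasted inline through the text argument of read.csv() instead of being read from Data_tmp.csv, and "numeric" is used where the call above has "double" (an assumption of this sketch, chosen because "numeric" is the spelling listed in ?read.table).

```r
# Sample data from the original post, pasted inline (one string per line).
csv_lines <- c(
  "Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,",
  "1/1/2006,1,3393.9,3412,76.65,105.04,,",
  "1/1/2006,2,3173.3,3202,69.20,67.67,,",
  "1/1/2006,3,3040.0,3051,69.20,77.67,,",
  "1/1/2006,4,2998.2,2979,67.32,69.10,,",
  "1/1/2006,5,3005.8,2958,65.20,68.34,,"
)

# Each line has eight fields, so colClasses needs eight entries.
# "NULL" (the string, not the NULL object) skips the two empty trailing columns.
y <- read.csv(text = csv_lines, header = TRUE,
              colClasses = c("character", "integer", rep("numeric", 4),
                             "NULL", "NULL"))

# Show the resulting column names and types.
str(y)
```

Because the skipped columns are never created, the unnamed trailing fields in the header row need no placeholder names.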

Peace,
david
--
David H. Wolfskill [hidden email]
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.


Re: Help with read.csv

jholtman
In reply to this post by Giovanni Petris
Easiest is to use 'header = TRUE' and use the data from the file as
the header (remove the skip=1).  Let the system determine what it
should be and then you can change it later.
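
A minimal sketch of that approach, assuming the data are pasted inline instead of read from the file. The cleanup steps afterwards (dropping all-NA columns, converting Date with a month/day/year format) are illustrative choices, not something prescribed in this thread.

```r
# Sample data from the original post, pasted inline (one string per line).
csv_lines <- c(
  "Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,",
  "1/1/2006,1,3393.9,3412,76.65,105.04,,",
  "1/1/2006,2,3173.3,3202,69.20,67.67,,",
  "1/1/2006,3,3040.0,3051,69.20,77.67,,",
  "1/1/2006,4,2998.2,2979,67.32,69.10,,",
  "1/1/2006,5,3005.8,2958,65.20,68.34,,"
)

# Take the names from the header row and let read.csv() guess the types.
y <- read.csv(text = csv_lines, header = TRUE, stringsAsFactors = FALSE)
str(y)  # inspect the guesses, including the two empty trailing columns

# Change things afterwards: drop the columns that contain only NA ...
all_na <- vapply(y, function(col) all(is.na(col)), logical(1))
y <- y[, !all_na]

# ... and convert the date strings (month/day/year is assumed here).
y$Date <- as.Date(y$Date, format = "%m/%d/%Y")
```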

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


Re: Help with read.csv

Gabor Grothendieck
In reply to this post by Giovanni Petris
This works for me:

Lines <- "Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,
1/1/2006,1,3393.9,3412,76.65,105.04,,
1/1/2006,2,3173.3,3202,69.20,67.67,,
1/1/2006,3,3040.0,3051,69.20,77.67,,
1/1/2006,4,2998.2,2979,67.32,69.10,,
1/1/2006,5,3005.8,2958,65.20,68.34,,"

read.csv(textConnection(Lines))

as does this:

read.table(textConnection(Lines),  skip = 1, sep = ",",
 col.names = c("Date","Hour","DA_DMD","DMD","DA_RTP", "RTP", "junk1", "junk2"),
 colClasses = c("character", "integer", rep("double", 6)))
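
As a cross-check on the second variant, here is a hedged sketch that first counts the fields with count.fields() (as mentioned in the original post) and then drops the two placeholder columns after reading. The names junk1 and junk2 come from the call above; using "numeric" in place of "double" and the final cleanup step are assumptions of this sketch.

```r
Lines <- "Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,
1/1/2006,1,3393.9,3412,76.65,105.04,,
1/1/2006,2,3173.3,3202,69.20,67.67,,
1/1/2006,3,3040.0,3051,69.20,77.67,,
1/1/2006,4,2998.2,2979,67.32,69.10,,
1/1/2006,5,3005.8,2958,65.20,68.34,,"

# Count the comma-separated fields on each line. The two trailing commas
# add two empty fields, which is why six column names were not enough.
con <- textConnection(Lines)
n_fields <- count.fields(con, sep = ",")
close(con)
print(n_fields)

# Read with placeholder names for the two empty fields ...
y <- read.table(text = Lines, skip = 1, sep = ",",
  col.names = c("Date", "Hour", "DA_DMD", "DMD", "DA_RTP", "RTP",
                "junk1", "junk2"),
  colClasses = c("character", "integer", rep("numeric", 6)))

# ... then remove the placeholders, which hold only NA.
y[c("junk1", "junk2")] <- NULL
str(y)
```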

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: Help with read.csv

Phil Spector
In reply to this post by Giovanni Petris
Giovanni -
    If you change "int" (which has no meaning in R) to
"integer" in your second example, it should work.
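
To illustrate the point, here is a small sketch that catches the failure for "int" and then repeats the same call with "integer". The data are pasted inline (an assumption; the original call reads Data/Data_tmp.csv), read_with() is a hypothetical helper introduced only for this illustration, and the exact error text may differ between R versions, so it is printed but not relied on.

```r
# Sample data from the original post, pasted inline (one string per line).
csv_lines <- c(
  "Date,Hour,DA_DMD,DMD,DA_RTP,RTP,,",
  "1/1/2006,1,3393.9,3412,76.65,105.04,,",
  "1/1/2006,2,3173.3,3202,69.20,67.67,,",
  "1/1/2006,3,3040.0,3051,69.20,77.67,,",
  "1/1/2006,4,2998.2,2979,67.32,69.10,,",
  "1/1/2006,5,3005.8,2958,65.20,68.34,,"
)

# Hypothetical helper: the second call from the original post, with the
# class of the Hour column passed in as an argument.
read_with <- function(hour_class) {
  read.csv(text = csv_lines, header = FALSE, skip = 1,
           colClasses = c("character", hour_class, rep("double", 6)),
           col.names = c("Date", "Hour", "DA_DMD", "DMD", "DA_RTP",
                         "RTP", "XXX", "YYY"))
}

# "int" is not a class name R knows, so the coercion fails.
bad <- tryCatch(read_with("int"), error = function(e) e)
print(conditionMessage(bad))

# "integer" is the correct name; the same call now succeeds.
good <- read_with("integer")
str(good)
```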

  - Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  [hidden email]


Re: Help with read.csv

Giovanni Petris
In reply to this post by David Wolfskill
Thanks to everybody who answered with suggestions (David Wolfskill,
Stephen Sefick, Jim Holtman, Gabor Grothendieck, and Phil Spector).

Besides the obvious end-of-the-day mixup ("int" in lieu of "integer"), I
was not aware of the existence of a "NULL" class, which proved pretty
useful in this case.

Thanks again for the help from this great list!

Best,
Giovanni
