Read

Val-17
Hi all,
I am trying to read a csv file, but I have a problem with the row names.
After reading, the name of the first column is "row.names" and
all other column names are shifted to the right. The last
column becomes all NAs (as an extra column).

My sample data looks as follows,
filename = dat.csv
The first row has a missing value at columns 3 and 5. The last row has
a missing value at columns 1 and 5
x1,x2,x3,x4,x5
12,13,,14,,
22,23,24,25,26
,33,34,34,
To read the file I used this

dsh <- read.csv(file="dat.csv", sep=",", row.names=NULL, fill=TRUE, header=TRUE,
                comment.char="", quote="", stringsAsFactors=FALSE)

The output from the above is
dsh

 row.names x1 x2 x3 x4 x5
1        12 13 NA 14 NA  NA
2        22 23 24 25 26  NA
3             33 34 34 NA  NA

The name of the first column is row.names and all values of the last column are NAs


However, the desired output should be
 x1 x2 x3 x4 x5
 12 13 NA 14 NA
 22 23 24 25 26
 NA 33 34 34 NA


How can I fix this?
Thank you in advance

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Read

Jeff Newmiller
Your file has 5 commas in the first data row, but only 4 in the header. R
interprets this to mean your first column is intended to be row names (has
no corresponding column label) rather than data. (Row names are "outside"
the data frame... use str(dsh) to get a better picture.)
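
To see the mismatch directly, here is a minimal sketch that recreates the sample file and counts the fields on each line. The temporary file and the variable names path and n_fields are only for illustration and are not part of the original post. Under these assumptions the header should report five fields and the first data line six, which is what makes R treat the first column as unlabeled.

```r
# Recreate the sample file from the original post in a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("x1,x2,x3,x4,x5",
             "12,13,,14,,",
             "22,23,24,25,26",
             ",33,34,34,"), path)

# Number of comma-separated fields on each line, header first.
n_fields <- count.fields(path, sep = ",")
n_fields

# With row.names = NULL the unlabeled first column is kept as data
# under the name "row.names" instead of becoming the row names.
dsh <- read.csv(path, row.names = NULL)
names(dsh)
str(dsh)
```

Running count.fields on a suspect file before read.csv is a cheap way to find out which lines carry the extra separator.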

Basically, your file does not conform to consistent practices for csv
files of having the same number of commas in every row. If at all possible
I would eliminate the extra comma. If you have many of these broken files,
you might need to read the data in pieces... e.g.

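# Read the data rows only; the extra trailing comma yields a sixth, empty column
dsh <- read.csv( "dat.csv", header=FALSE, skip=1 )
# Drop that last column
dsh <- dsh[ , -length( dsh ) ]
# Read the header separately to recover the column names
dshh <- read.csv( "dat.csv", header=TRUE, nrows=1 )
names( dsh ) <- names( dshh )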
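
If eliminating the extra comma is an option, it can also be done in R before parsing. The following is only a sketch under some assumptions: the file is small enough to read into memory, no field contains a quoted comma, and the surplus separator is always a single trailing comma. The names path, lines, n_fields and extra are made up for this example.

```r
# Recreate the sample file from the original post in a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("x1,x2,x3,x4,x5",
             "12,13,,14,,",
             "22,23,24,25,26",
             ",33,34,34,"), path)

lines <- readLines(path)

# Fields per line = number of commas + 1 (assumes no quoted commas).
n_fields <- lengths(regmatches(lines, gregexpr(",", lines, fixed = TRUE))) + 1L

# Lines with more fields than the header get one trailing comma removed.
extra <- n_fields > n_fields[1]
lines[extra] <- sub(",$", "", lines[extra])

# Parse the repaired text; empty fields become NA.
dsh <- read.csv(text = paste(lines, collapse = "\n"))
dsh
```

Unlike dropping the last column after reading, this only touches the lines that actually have the surplus comma, so a file where every row is already well formed passes through unchanged.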


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<[hidden email]>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


Re: Read

Rui Barradas
Hello,

I've just tested Jeff's solution, it works but the second code line
should be

dsh <- sh[ , -length( sh ) ]


(dsh doesn't exist yet.)

Hope this helps,

Rui Barradas


Re: Read

Ista Zahn
In reply to this post by Val-17
readr::read_csv produces the desired result by default:

readr::read_csv("x1,x2,x3,x4,x5
12,13,,14,,
22,23,24,25,26
,33,34,34,")

Best,
Ista

Re: Read

Val-17
In reply to this post by Jeff Newmiller
Thank you Jeff and all.

My data is very messy, and the trick Jeff suggested is a nice way to handle it.
