Re: read.csv, worrying behaviour?

R devel mailing list
Dear all

I've been using R for around 16 years now, and I've only just become aware of a behaviour of read.csv that I find worrying, which is why I'm contacting this list. A simplified example of the behaviour is as follows.

I created a "test.csv" file containing the following lines:

a,b,c,d,e,f,g
1,2,3,4

And then read it into R using:

> d = read.csv("test.csv")
> d
  a b c d  e  f  g
1 1 2 3 4 NA NA NA

I was surprised that this did not issue a warning. I can understand why the following csv would not issue a warning:

a,b,c,d,e,f,g
1,2,3,4,,,

But when the trailing commas are missing altogether, as in the first example, I would have expected one. Thoughts from others would be welcome.

Kind regards

Ben


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Benjamin M. Taylor, MSci, MSc, PhD
Lead Data Scientist
Blackpool Teaching Hospitals NHS Foundation Trust
Home 15
Whinney Heys Road
Blackpool
FY3 8NR

Scholar: https://scholar.google.co.uk/citations?user=6Hf0CJkAAAAJ&hl=en
Github: https://github.com/bentaylor1
Gitlab: https://gitlab.com/ben_taylor
ORCID: http://orcid.org/0000-0001-8667-4089


______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: read.csv, worrying behaviour?

Kevin Coombes
I believe this is documented behavior. The 'read.csv' function is a
front-end to 'read.table' with different default values. In this
particular case, read.csv sets fill = TRUE, which means that it is
supposed to fill incomplete lines with NAs. It also sets header = TRUE,
which is presumably what it is using to determine the expected length of
a row.
   -- Kevin
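
For anyone who wants the short row to be flagged rather than padded, here is a minimal sketch of two options. It assumes the two-line file from the original post, recreated in a temporary file. As far as I read ?read.table, the number of columns is taken from the first five lines of input (or from col.names if that is longer), and with fill = FALSE a row that is too short is an error. The helper read_csv_checked is a hypothetical name of my own, not part of base R; it uses count.fields() to compare each line's field count with that of the first line.

```r
# Recreate the file from the original post in a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c,d,e,f,g", "1,2,3,4"), path)

# Default read.csv (fill = TRUE): the short row is padded with NA, no warning.
d <- read.csv(path)

# Option 1: fill = FALSE turns the short row into an error.
# tryCatch() is used here only so the script keeps running and we can
# inspect the error message as a character string.
res <- tryCatch(
  read.csv(path, fill = FALSE),
  error = function(e) conditionMessage(e)
)

# Option 2: check field counts first, warn, then read as usual.
# Hypothetical helper, not part of base R. The quote and comment.char
# arguments are set to match the read.csv defaults.
read_csv_checked <- function(file, ...) {
  n <- count.fields(file, sep = ",", quote = "\"", comment.char = "")
  # count.fields() gives NA for lines inside a multi-line quoted field,
  # so those are skipped rather than reported.
  ragged <- which(!is.na(n) & n != n[1])
  if (length(ragged) > 0) {
    warning("lines whose field count differs from line 1: ",
            paste(ragged, collapse = ", "))
  }
  read.csv(file, ...)
}

# Capture the warning message so it can be inspected.
w <- tryCatch(
  read_csv_checked(path),
  warning = function(w) conditionMessage(w)
)
```

Option 1 is the stricter choice, since nothing is read at all when a row is short. Option 2 keeps the padded data frame but at least reports which lines of the file were affected.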

On 2/25/2021 4:11 AM, TAYLOR, Benjamin (BLACKPOOL TEACHING HOSPITALS NHS
FOUNDATION TRUST) via R-devel wrote:

> Dear all
>
> I've been using R for around 16 years now and I've only just become aware of a behaviour of read.csv that I find worrying which is why I'm contacting this list. A simplified example of the behaviour is as follows
>
> I created a "test.csv" file containing the following lines:
>
> a,b,c,d,e,f,g
> 1,2,3,4
>
> And then read it into R using:
>
>> d = read.csv("test.csv")
>> d
>    a b c d  e  f  g
> 1 1 2 3 4 NA NA NA
>
> I was surprised that this did not issue a warning. I can understand why the following csv would not issue a warning:
>
> a,b,c,d,e,f,g
> 1,2,3,4,,,
>
> But the missing commas in the first example? Thoughts from others would be welcome.
>
> Kind regards
>
> Ben
