read.table and NaN

read.table and NaN

R help mailing list-2
Hi,

Is there a way to make read.table consider NaN as a string of characters rather than the internal NaN? Changing the na.strings argument does not seem to have any effect on how R interprets the NaN string (while it does for the NA string).

con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '', stringsAsFactors = FALSE)
close.connection(con)
tmp
class(tmp[,1])
class(tmp[,2])


______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: read.table and NaN

Bert Gunter-2
Like this?

> con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
> tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '',
+                   stringsAsFactors = FALSE,
+                   colClasses = c("numeric", "character"))
> close.connection(con)
> tmp
   A   B
1  1 NaN
2 NA   2
> class(tmp[,1])
[1] "numeric"
> class(tmp[,2])
[1] "character"
> tmp[,2]
[1] "NaN" "2"


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Oct 23, 2019 at 6:31 PM Sebastien Bihorel via R-help <
[hidden email]> wrote:

> Hi,
>
> Is there a way to make read.table consider NaN as a string of characters
> rather than the internal NaN? Changing the na.strings argument does not
> seem to have any effect on how R interprets the NaN string (while it does
> for the NA string).
>
> con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
> tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '',
> stringsAsFactors = FALSE)
> close.connection(con)
> tmp
> class(tmp[,1])
> class(tmp[,2])

Re: read.table and NaN

R help mailing list-2
Thanks Gunter

It seems that one has to know the structure of the data and adapt the read.table call accordingly. I am working on a framework that is meant to process data files with unknown structure, so I have to think a bit more about that...
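
One possible way to handle files of unknown structure, sketched here under assumptions that are not part of the thread: read every column as character first, then let type.convert() guess a type only for the columns that do not contain the literal text "NaN". The helper name read_keep_nan is hypothetical, and treating "NA" as the only missing-value marker is an assumption.

```r
# Hypothetical helper (not from the thread): read a delimited table while
# keeping any column that contains the literal text "NaN" as character.
read_keep_nan <- function(...) {
  # colClasses = "character" stops read.table from converting any field
  raw <- read.table(..., colClasses = "character", na.strings = "NA")
  raw[] <- lapply(raw, function(col) {
    if (any(col == "NaN", na.rm = TRUE)) {
      col                                # leave "NaN" as a string
    } else {
      type.convert(col, as.is = TRUE)    # guess logical/integer/numeric
    }
  })
  raw
}

tmp <- read_keep_nan(text = "A,B\n1,NaN\nNA,2", header = TRUE, sep = ",")
sapply(tmp, class)
```

The trade-off is that a column holding a single "NaN" stays entirely character, so numeric values in that column would need an explicit conversion later.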
________________________________
From: Bert Gunter <[hidden email]>
Sent: Thursday, October 24, 2019 00:08
To: Sebastien Bihorel <[hidden email]>
Cc: [hidden email] <[hidden email]>
Subject: Re: [R] read.table and NaN

Like this?

> con <- textConnection(object = 'A,B\n1,NaN\nNA,2')
> tmp <- read.table(con, header = TRUE, sep = ',', na.strings = '',
+                   stringsAsFactors = FALSE,
+                   colClasses = c("numeric", "character"))
> close.connection(con)
> tmp
   A   B
1  1 NaN
2 NA   2
> class(tmp[,1])
[1] "numeric"
> class(tmp[,2])
[1] "character"
> tmp[,2]
[1] "NaN" "2"



Re: read.table and NaN

Bert Gunter-2
Not so. Read ?read.table carefully. You can use "NA" as a default.
Moreover, you **specified** that you want NaN read as character, which
means that any column containing NaN **must** be character. That's part of
the specification for data frames (all columns must be one data type). So
either change your specification or change your data structure.

And, incidentally, my first name is "Bert" .

Cheers,
Bert



On Thu, Oct 24, 2019 at 6:43 AM Sebastien Bihorel <
[hidden email]> wrote:

> Thanks Gunter
>
> It seems that one has to know the structure of the data and adapt the
> read.table call accordingly. I am working on a framework that is meant to
> process data files with unknown structure, so I have to think a bit more
> about that...

Re: read.table and NaN

Bert Gunter-2
Oh, and btw, I think you should omit the groups = argument.
It's not needed since "groups" is already the conditioning variable, so
only one group per panel,
and using it seems to interact unfavorably with the way jittering is done.




On Thu, Oct 24, 2019 at 7:39 AM Bert Gunter <[hidden email]> wrote:

> Not so. Read ?read.table carefully. You can use "NA" as a default.
> Moreover, you **specified** that you want NaN read as character, which
> means that any column containing NaN **must** be character. That's part of
> the specification for data frames (all columns must be one data type). So
> either change your specification or change your data structure.
>
> And, incidentally, my first name is "Bert" .
>
> Cheers,
> Bert

Re: read.table and NaN

R help mailing list-2
In reply to this post by Bert Gunter-2
My bad, Bert 😉

My point is that my function/framework has very minimal expectations about the source data (mostly, that it is a rectangular table of data delimited by some separator) and has no a priori knowledge of what the first, second, etc. columns in the data files must contain. So while it would be possible to pass down a class vector to be used as the colClasses argument of read.table, it is not necessarily reasonable in the context of the overall framework.

I guess I was surprised that read.table interprets NaN in an input file as the internal "Not a number" rather than as a string... there is nothing in ?read.table about that.
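
The behaviour described here can be reproduced without read.table. According to ?read.table, columns are read as character and then converted with type.convert() unless colClasses is given, and R's string-to-number conversion accepts the text NaN as a numeric value. A small illustration (the vectors x and y are only examples):

```r
# "NaN" is accepted as a number when character data are converted
x <- as.numeric(c("NaN", "2"))
is.nan(x)

# type.convert() applies the same kind of guessing to a character vector
y <- type.convert(c("NaN", "2"), as.is = TRUE)
class(y)
```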

Anyways, as I said, I need to think more about this in the context of the framework where this function operates...

Thanks for the input


________________________________
From: Bert Gunter <[hidden email]>
Sent: Thursday, October 24, 2019 10:39
To: Sebastien Bihorel <[hidden email]>
Cc: [hidden email] <[hidden email]>
Subject: Re: [R] read.table and NaN

Not so. Read ?read.table carefully. You can use "NA" as a default. Moreover, you **specified** that you want NaN read as character, which means that any column containing NaN **must** be character. That's part of the specification for data frames (all columns must be one data type). So either change your specification or change your data structure.

And, incidentally, my first name is "Bert" .

Cheers,
Bert


Re: read.table and NaN

Bert Gunter-2
Please read ?is.finite and sections on NA and NaN in The Intro to R and /or
The R Language Definition.

You seem to be engaging in lots of incorrect speculation without first
having ascertained the facts about how these things work.
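
For readers following the pointer to ?is.finite, a brief illustration of how NA and NaN differ in base R (the vector v is only an example):

```r
v <- c(1, NA, NaN, Inf)
is.na(v)      # TRUE for both NA and NaN
is.nan(v)     # TRUE only for NaN
is.finite(v)  # TRUE only for ordinary finite numbers
```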



On Fri, Oct 25, 2019 at 1:12 PM Sebastien Bihorel <
[hidden email]> wrote:

> My bad, Bert 😉
>
> My point is that my function/framework has very minimal expectations about
> the source data (mostly, that it is a rectangular shape table of data
> separated by some separator) and does not have any a-priori knowledge about
> what the first, second, etc columns in the data files must contain.... so
> while it would be possible to pass down some class vector which would be
> passed down as the colClasses argument to read.table, it is not necessarily
> reasonable in the context of the overall framework.
>
> I guess I was surprised that read.table interprets NaN in an input file as
> the internal "Not a number" rather than as a string... there is nothing in
> the ?read.table about that.
>
> Anyways, as I said, I need to think more about this in the context of the
> framework where this function operates...
>
> Thanks for the input