Buffering in R 3.5 connections causes incorrect data in readChar

Aaron Goodman
I noticed an issue where readChar does not return the correct value after a
call to readLines. It appears that readChar is not aware of the buffering,
so it reads from the end of the buffer rather than from the current position
in the file. This is a significant change in behavior from R-3.4.4.

Below is a test case that I used to home in on the problem.

---

p<-"test2.txt"
cat("abcdefg
hijklmn
opqrstu",file=p)

cat("read char after readline (h)\n")
con <- file(p,"r")
invisible(readLines(con,1))
print(readChar(con,1))
close(con)

cat("read char after readline and seek (h)\n")
con <- file(p,"r")
invisible(readLines(con,1))
invisible(seek(con,seek(con)))
print(readChar(con,1))
close(con)

cat("read lines after readline (hijklmn)\n")
con <- file(p,"r")
invisible(readLines(con,1) )
print(readLines(con,1))
close(con)


cat("read line after char (bcdefg):\n")
con <- file(p,"r")
invisible(readChar(con,1) )
print(readLines(con,1))
close(con)

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Buffering in R 3.5 connections causes incorrect data in readChar

Tomas Kalibera
On 05/26/2018 05:15 AM, Aaron Goodman wrote:
> I noticed an issue where readChar does not return the correct value after a
> call to readline. It appears that readChar is not aware of the buffering,
> so it reads from the end of the buffer, rather than the current position in
> the file. This is a significant change of behavior from R-3.4.4.
>
> Below is a test case that I used to home in on the problem.
Thanks for the report and analysis. You are right: readChar ignores the
buffer (and it also ignores the pushback). But please note that this
behavior is in line with the documentation; see ?readChar. readChar must
only be used with binary connections, but the example uses it on a text
connection. Buffering and pushback are only used on (readable) text
connections. I will check whether we could report a runtime error.

Best
Tomas
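
A minimal sketch of the point above, assuming the only change to the reporter's test case is that the file is opened in binary mode ("rb") instead of text mode ("r"). Since buffering and pushback apply only to text connections, readChar called after readLines should then continue from the start of the second line. The temporary file path is an illustrative choice, not part of the original report.

```r
# Write the three-line test file from the original report.
p <- tempfile(fileext = ".txt")
cat("abcdefg\nhijklmn\nopqrstu", file = p)

# Open in binary mode ("rb"): no read buffer or pushback is involved,
# so readLines and readChar work from the same file position.
con <- file(p, "rb")
first_line <- readLines(con, 1)  # consumes the first line
next_char <- readChar(con, 1)    # reads the next character after it
close(con)
unlink(p)

print(first_line)
print(next_char)
```

The same approach applies to the other cases in the test script: only the mode string passed to file() changes.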

Re: Buffering in R 3.5 connections causes incorrect data in readChar

Aaron Goodman
Tomas,

Thank you for the explanation. I see in the documentation: "These functions
are intended to be used with binary-mode connections."  So I see how using
it on a text connection is undefined, and not a bug. An error or warning
when attempting to use it on a text connection would be helpful,
considering how the behavior has changed in R-3.5.

On Tue, May 29, 2018 at 3:09 AM, Tomas Kalibera <[hidden email]>
wrote:

> Thanks for the report and analysis, you are right, readChar ignores the
> buffer (and it also ignores the pushback). But please note that this
> behavior is in line with the documentation, see ?readChar: readChar must
> only be used with binary connections, but the example uses it on a text
> connection. Buffering and pushback are only used on (readable) text
> connections. I will check whether we could report a runtime error.
>
> Best
> Tomas

Re: Buffering in R 3.5 connections causes incorrect data in readChar

Tomas Kalibera
For now I have added a warning that is issued (only) when there is
definitely a problem (currently a read buffer, a pushback buffer or
encoding conversion of the input). And I have added a similar warning
for writing when there is encoding conversion of the output (writeChar).
But, as you say, the behavior is undefined and it remains so regardless
of whether there is a warning or not: programs should only use these
functions with binary connections.

Best
Tomas
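
One way a program might enforce this rule on its own side is to check the connection's mode before calling readChar. The sketch below is only an illustration: read_char_binary is a hypothetical helper name, not part of R, and it relies on the "text" component of summary(con), which is either "text" or "binary".

```r
# Hypothetical helper (not part of base R): refuse to call readChar
# on anything other than a binary-mode connection.
read_char_binary <- function(con, nchars) {
  if (!identical(summary(con)$text, "binary")) {
    stop("readChar should only be used with binary connections")
  }
  readChar(con, nchars)
}

p <- tempfile()
cat("abcdefg\nhijklmn\n", file = p)

con_txt <- file(p, "r")   # text mode: rejected by the helper
txt_result <- tryCatch(read_char_binary(con_txt, 1), error = function(e) e)
close(con_txt)

con_bin <- file(p, "rb")  # binary mode: accepted
bin_result <- read_char_binary(con_bin, 3)
close(con_bin)
unlink(p)

print(inherits(txt_result, "error"))
print(bin_result)
```

Unlike the warning described above, which is issued only when a problem is certain, this check rejects every text-mode connection, so it is stricter than what R itself does.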

On 05/30/2018 12:00 AM, Aaron Goodman wrote:

> Tomas,
>
> Thank you for the explanation. I see in the documentation: "These
> functions are intended to be used with binary-mode connections."  So I
> see how using it on a text connection is undefined, and not a bug. An
> error or warning when attempting to use it on a text connection
> would be helpful considering how the behavior has changed in R-3.5.

readBin/writeBin (was Buffering in R 3.5, ref VF9FQ0oZEFNeWgQLDVgPBwACVFwOUgpyVgIYWVhQAQgDTVFdVQ==)

Tomas Kalibera
A small note: while working on this I realized that readBin/writeBin
documentation was still saying that these functions, even though not
intended for text connections, _may_ work with them. But with
readBin/writeBin there is always a runtime error when invoked on a text
connection, so I updated the docs. Thanks again for the report,

Best
Tomas
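
For comparison, a short sketch of the readBin behavior described above, assuming a file connection that is already open in text mode. The error is captured with tryCatch so the script keeps running; the exact error message is not relied on here.

```r
p <- tempfile()
cat("abcdefg\n", file = p)

# Text-mode connection: readBin is expected to signal an error.
con <- file(p, "r")
res <- tryCatch(readBin(con, what = "raw", n = 1L), error = function(e) e)
close(con)

# Binary-mode connection: the same call returns the first byte.
con <- file(p, "rb")
first_byte <- readBin(con, what = "raw", n = 1L)
close(con)
unlink(p)

print(inherits(res, "error"))
print(rawToChar(first_byte))
```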

On 05/30/2018 03:38 PM, Tomas Kalibera wrote:

> For now I have added a warning that is issued (only) when there is
> definitely a problem (currently a read buffer, a pushback buffer or
> encoding conversion of the input).  And I have added a similar warning
> for writing when there is encoding conversion of the output
> (writeChar). But, as you say, the behavior is undefined and it remains
> so regardless of whether there is a warning or not: programs should
> only use these functions with binary connections.
>
> Best
> Tomas