read.fwf only reads one "buffersize" of lines when reading multiline records (PR#8951)


David Hugh-Jones-3
Full_Name: David Hugh-Jones
Version: 2.3.0
OS: Windows
Submission from: (NULL) (155.245.34.194)


Here's a test case:

tf = tempfile()
cat(file=tf, rep("123", 100), sep="\n")
# this gives a data frame with 100 rows
read.fwf(tf, widths=c(1,2), buffersize=10)

# but this only gives 5 rows! (expected 50: 100 lines / 2 lines per record)
read.fwf(tf, widths=list(c(1,2), 3), buffersize=10)
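
For reference, a small sketch of what the multiline call should return once the bug is fixed. The variable names (multi, expected_rows, got_rows) are only illustrative, and the expected count assumes that every record spans exactly two lines, as in the test case above.

```r
# Same input as the test case: 100 lines, each containing "123".
tf <- tempfile()
cat(file = tf, rep("123", 100), sep = "\n")

# Each record spans two lines: widths c(1, 2) on the first line, 3 on the second.
multi <- read.fwf(tf, widths = list(c(1, 2), 3), buffersize = 10)

# 100 lines / 2 lines per record = 50 records, with 3 fields each.
expected_rows <- 100 / 2
got_rows <- nrow(multi)
unlink(tf)
```

On an affected version got_rows comes out as 5 (one buffer of 10 lines, i.e. 5 records) rather than 50.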


I think the correct fix is to replace this line in read.fwf

if (length(raw) < thisblock)

with

if (nread < thisblock)
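
My reading of why this helps, as a hedged sketch rather than the actual read.fwf source: by the time the end-of-input test runs, raw appears to have been collapsed to one element per record, so with multiline records length(raw) is always smaller than the number of lines requested and the loop stops after the first block. The function below is a hypothetical, simplified model of that loop (count_records and use_nread are made-up names); it only counts records and assumes nread holds the number of lines read before collapsing.

```r
# Hypothetical model of a block-reading loop; NOT the real read.fwf code.
# use_nread = FALSE mimics the test on length(raw); TRUE mimics the proposed fix.
count_records <- function(lines, recordlength, buffersize, use_nread = TRUE) {
  con <- textConnection(lines)
  on.exit(close(con))
  # Round the block size down to a whole number of records.
  thisblock <- (buffersize %/% recordlength) * recordlength
  total <- 0L
  repeat {
    raw <- readLines(con, n = thisblock)
    nread <- length(raw)                  # lines read, before collapsing
    if (nread == 0L) break
    # Collapse each group of `recordlength` lines into one record string,
    # ignoring any incomplete trailing record.
    nfull <- nread %/% recordlength
    raw <- vapply(seq_len(nfull), function(i) {
      paste(raw[((i - 1L) * recordlength + 1L):(i * recordlength)], collapse = "")
    }, character(1))
    total <- total + length(raw)
    # End-of-input test: compare against the number of lines requested.
    short <- if (use_nread) nread < thisblock else length(raw) < thisblock
    if (short) break
  }
  total
}

lines <- rep("123", 100)
buggy <- count_records(lines, recordlength = 2, buffersize = 10, use_nread = FALSE)
fixed <- count_records(lines, recordlength = 2, buffersize = 10, use_nread = TRUE)
single <- count_records(lines, recordlength = 1, buffersize = 10, use_nread = FALSE)
```

In this model buggy is 5 and fixed is 50, matching the test case, while single is 100 because with one line per record the two tests are equivalent.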


This bug is also present in earlier versions, going back at least as far as 2.1.0.

Cheers
David

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel