Bug (?): reading binary files in Windows 10

Bug (?): reading binary files in Windows 10

Kate Stone
Hello r-help,

Could you help me determine whether this is an R bug or not?

I've been trying to read this binary file in R:

download.file("ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/preprocessing_erp/s04.eeg","s04.eeg")

and I get a file of a different length (i.e. much longer) on Windows >= 8
x64 (build 9200) than on Ubuntu. I've tested it with different R
versions in Windows and different package versions with the same
incorrect result. Other colleagues have tested it on the same
Windows/Ubuntu builds and got the correct length.

I'm not sure whether this is an R problem or something to do with my
OS specifically, or even with the file itself. Any ideas?? I've
attached a small script demonstrating the issue.

Many thanks,
Kate

--
Kate Stone
PhD candidate
Vasishth Lab | Department of Linguistics
Potsdam University, 14467 Potsdam, Germany
https://auskate.github.io
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: Bug (?): reading binary files in Windows 10

Albrecht Kauffmann-2
Dear Kate,

I cannot find your small script, but I downloaded the file using your command line. It has a size of 142773760 bytes (136.2 MB).

Hth,
Albrecht

--
  Albrecht Kauffmann
  [hidden email]


Re: Bug (?): reading binary files in Windows 10

OmarGon
Hi,

This is what I got, just with base R:

> a <- download.file("ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/preprocessing_erp/s04.eeg","s04.eeg")
trying URL 'ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/preprocessing_erp/s04.eeg'
Content type 'unknown' length 142773760 bytes (136.2 MB)
==================================================
> a
[1] 0
> length(a)
[1] 1
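
A note on the transcript above: download.file() returns an integer status code (0 on success), so `a` is 0 and length(a) is 1 no matter how large the file is. To compare file lengths across systems, the size on disk is the relevant quantity. A minimal sketch in base R, where byte_count() is a hypothetical helper name and not something from this thread:

```r
# Hypothetical helper: report the size of a file in bytes and confirm that
# reading it in binary mode yields exactly that many bytes.
byte_count <- function(path) {
  size <- file.size(path)                         # size on disk, in bytes
  bytes <- readBin(path, what = "raw", n = size)  # raw read, no newline translation
  stopifnot(length(bytes) == size)
  size
}
```

After a successful binary download, byte_count("s04.eeg") should agree with the length that download.file() prints in its "Content type" line.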

Information about the session:

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=es_ES.UTF-8
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=es_ES.UTF-8
 [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4    yaml_2.2.0




Re: Bug (?): reading binary files in Windows 10

Duncan Murdoch-2
In reply to this post by Kate Stone
On 06/12/2018 7:45 AM, Kate Stone wrote:

> Hello r-help,
>
> Could you help me determine whether this is an R bug or not?
>
> I've been trying to read this binary file in R:
>
> download.file("ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/preprocessing_erp/s04.eeg","s04.eeg")
>
> and I get a different length file (i.e. much longer) in Windows  >= 8
> x64 (build 9200) than in Ubuntu. I've tested it with different R
> versions in Windows and different package versions with the same
> incorrect result. Other colleagues have tested it on the same
> Windows/Ubuntu builds and got the correct length.
>
> I'm not sure whether this is an R problem or something to do with my
> OS specifically, or even with the file itself. Any ideas?? I've
> attached a small script demonstrating the issue.

On Windows, the `mode = "wb"` argument to download.file() is important;
otherwise the file is assumed to be text, and each LF is changed to CR LF.
There may also be handling of EOF marks, I forget.
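
To make the point above concrete, here is a rough sketch in base R. text_mode_size() and download_binary() are hypothetical helper names, not part of R or of this thread. The first estimates how large a file becomes if every LF byte gains a CR in front of it, which would account for a file that is much longer on Windows; it assumes LF expansion is the only change and ignores any EOF handling. The second downloads with mode = "wb" and optionally checks the resulting size.

```r
# Predicted size after a text-mode write, assuming the only change is that
# each LF byte (0x0a) becomes the two bytes CR LF.
text_mode_size <- function(bytes) {
  length(bytes) + sum(bytes == as.raw(0x0a))
}

# Download in binary mode and optionally verify the size on disk.
download_binary <- function(url, destfile, expected_bytes = NA) {
  # mode = "wb" writes the received bytes unchanged; the default is "w".
  status <- download.file(url, destfile, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  size <- file.size(destfile)
  if (!is.na(expected_bytes) && size != expected_bytes) {
    stop("expected ", expected_bytes, " bytes but got ", size)
  }
  invisible(size)
}
```

For the file in this thread, the call would be download_binary("ftp://ftp.fieldtriptoolbox.org/pub/fieldtrip/tutorial/preprocessing_erp/s04.eeg", "s04.eeg", expected_bytes = 142773760), using the size reported elsewhere in the thread.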

Duncan Murdoch


Re: Bug (?): reading binary files in Windows 10

Jeff Newmiller
AFAIK this receiver-side responsibility to specify the text/binary status of the file is particularly a problem with the "ftp://" protocol because it does not use MIME file encoding (which "http://" uses). MIME allows the sending end of the connection to communicate whether the file is text or binary, though it uses more bandwidth for the transfer. If the server offers you a choice in these days of high bandwidth connections, you may be better off sticking with http/https.

Note that MIME is not magic... if the sender is improperly configured then the client can potentially receive corrupt data. Fortunately the most typical MIME misconfigurations cause the file to be unchanged in all cases, leaving it to the receiver to deal with any text file newline decoding choice/task after the file transfer is completed.
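
For a text file that arrives with the wrong newline convention, the clean-up after transfer mentioned above could be sketched in base R as follows. crlf_to_lf() is a hypothetical helper, not an existing R function. It is only appropriate for text: in binary data a CR LF pair may be legitimate content, so a binary file damaged by a text-mode download is better re-downloaded than repaired this way.

```r
# Hypothetical helper: remove each CR (0x0d) that is immediately followed
# by LF (0x0a) in a raw vector, leaving lone CR bytes untouched.
crlf_to_lf <- function(bytes) {
  n <- length(bytes)
  if (n < 2) return(bytes)
  is_cr <- bytes[-n] == as.raw(0x0d)       # positions 1..n-1 holding CR
  next_is_lf <- bytes[-1] == as.raw(0x0a)  # whether the following byte is LF
  drop <- which(is_cr & next_is_lf)
  if (length(drop) == 0) bytes else bytes[-drop]
}
```

A text file could then be rewritten with writeBin(crlf_to_lf(readBin(path, "raw", file.size(path))), path), where path is whatever file needs converting.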


--
Sent from my phone. Please excuse my brevity.


Re: Bug (?): reading binary files in Windows 10

Kate Stone
Ah wow, that answers many questions, thanks!


--
Kate Stone
PhD candidate
Vasishth Lab | Department of Linguistics
Potsdam University, 14467 Potsdam, Germany
https://auskate.github.io
