input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?


Spencer Graves-4
Hello:

What, if anything, should I do about messages from "load" or "attach"
of the form "input string ... cannot be translated to UTF-8, is it
valid in 'ANSI_X3.4-1968'?"

I'm running R 4.0.5 under macOS 11.2.3; see "sessionInfo()" and the
detailed instructions below on the precise file I downloaded from the
web and tried to read.

I may be able to get what I want by just ignoring this.  However, I'd
like to know how to fix it.

Thanks,
Spencer Graves


sessionInfo()
R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] compiler_4.0.5    htmltools_0.5.1.1 tools_4.0.5       yaml_2.2.1
 [5] tinytex_0.31      rmarkdown_2.7     knitr_1.31        digest_0.6.27
 [9] xfun_0.22         rlang_0.4.10      evaluate_0.14
 > search()
  [1] ".GlobalEnv"                "file:NAVCO 1.3 List.RData"
  [3] "file:NAVCO 1.3 List.RData" "tools:rstudio"
  [5] "package:stats"             "package:graphics"
  [7] "package:grDevices"         "package:utils"
  [9] "package:datasets"          "package:methods"
[11] "Autoloads"                 "package:base"


*** To get the file I used for this, I went to
"https://www.ericachenoweth.com/research".  From there I clicked
"Version 1.3".  This took me to


https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ON9XND


I then clicked the "Download" icon to the right of "NAVCO 1.3 List.tab".
This gave me 5 "Download Options", one of which was "RData Format"; I
selected that.  This downloaded "NAVCO 1.3 List.RData", which I moved to
getwd().  Then I did 'load("NAVCO 1.3 List.RData")' and
'attach("NAVCO 1.3 List.RData")'.  Both of those gave me 8 repetitions
of a message like "input string ... cannot be translated to UTF-8, is it
valid in 'ANSI_X3.4-1968'?" with different values substituted for "...".

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?

Duncan Murdoch-2
On 22/04/2021 9:25 p.m., Spencer Graves wrote:
> Hello:
>
>
>  What if anything should I do regarding notes from either "load" or
> "attach" that, "input string ... cannot be translated to UTF-8, is it
> valid in 'ANSI_X3.4-1968'?"?

First, ANSI_X3.4-1968 is an official name for a version of ASCII.
It appears in the file near the start, where I believe it records the
native encoding in place when the file was written, so readers using a
different encoding can translate.

Your actual file appears to have been encoded in UTF-8, but not marked
as such.  You're lucky you read it on macOS, where UTF-8 is the native
encoding, since the reader probably recognized the bytes weren't ASCII
bytes (and warned you about that), then just left them alone.  If you
read that file on Windows you'd likely get junk for those entries.
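
One possible way to act on this after loading is sketched below. It
assumes the affected strings really are valid UTF-8 that merely lacks
an encoding mark, in which case the encoding can be declared with
`Encoding<-`, which changes only the mark and not the bytes. The helper
name `mark_utf8`, the sample string and the data frame name `dat` are
hypothetical and do not come from the NAVCO file.

```r
# Hypothetical helper: mark unmarked strings that are valid UTF-8 as UTF-8.
# Only the encoding mark changes; the underlying bytes are left untouched.
mark_utf8 <- function(x) {
  if (is.factor(x)) {
    levels(x) <- mark_utf8(levels(x))
    return(x)
  }
  if (!is.character(x)) return(x)
  needs_mark <- !is.na(x) & Encoding(x) == "unknown" & validUTF8(x)
  Encoding(x[needs_mark]) <- "UTF-8"
  x
}

# Made-up stand-in for a problem entry: unmarked UTF-8 bytes spelling
# "C", o-circumflex, "t", "e".
x <- c("plain ascii", rawToChar(as.raw(c(0x43, 0xc3, 0xb4, 0x74, 0x65))))
fixed <- mark_utf8(x)
print(Encoding(fixed))

# For a data frame `dat` (hypothetical name), apply it column by column:
# dat[] <- lapply(dat, mark_utf8)
```

Pure ASCII strings always report "unknown" and are unaffected by the
mark, so the helper only touches entries containing non-ASCII bytes.
Saving the repaired objects again with save() should then store the
marks along with the strings.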

For your interest, here's a dump of the start of your file, after
gunzipping it:

00000000  52 44 58 33 0a 58 0a 00  00 00 03 00 03 06 00 00  |RDX3.X..........|
00000010  03 05 00 00 00 00 0e 41  4e 53 49 5f 58 33 2e 34  |.......ANSI_X3.4|
00000020  2d 31 39 36 38 00 00 04  02 00 00 00 01 00 04 00  |-1968...........|
00000030  09 00 00 00 01 78 00 00  03 13 00 00 00 10 00 00  |.....x..........|
00000040  02 0e 00 00 02 6e 40 90  0c 00 00 00 00 00 40 90  |.....n@.......@.|
00000050  44 00 00 00 00 00 40 10  00 00 00 00 00 00 40 7c  |D.....@.......@||
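
The same header fields can be read from within R. The sketch below
assumes a version-3 XDR file as written by save(), laid out as in the
dump above: "RDX3\n", "X\n", three big-endian integers, then the length
and bytes of the encoding name. The function name `read_rdata_header`
is hypothetical, and the demonstration uses a throwaway file rather
than the NAVCO download.

```r
# Read the header fields of a version-3 XDR workspace file written by save().
read_rdata_header <- function(path) {
  con <- gzfile(path, "rb")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  magic <- rawToChar(readBin(con, "raw", n = 5))  # "RDX3\n"
  fmt <- rawToChar(readBin(con, "raw", n = 2))    # "X\n" for XDR
  stopifnot(identical(magic, "RDX3\n"), identical(fmt, "X\n"))
  # Three big-endian integers (format version, writer R version, minimum
  # reader R version), followed by the length of the encoding name.
  ints <- readBin(con, "integer", n = 4, size = 4, endian = "big")
  list(
    version = ints[1],
    native_encoding = rawToChar(readBin(con, "raw", n = ints[4]))
  )
}

# Demonstrate on a temporary file saved from this session.
tmp <- tempfile(fileext = ".RData")
x <- 1:3
save(list = "x", file = tmp, version = 3)
hdr <- read_rdata_header(tmp)
print(hdr$native_encoding)
```

For the file in this thread the encoding field would be expected to
read ANSI_X3.4-1968, matching the dump; for the temporary file it
reflects whatever the current session's native encoding is.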

Duncan Murdoch


Re: input string ... cannot be translated to UTF-8, is it valid in 'ANSI_X3.4-1968'?

John Kane-3
The tab format seems to read in with no problem.
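
A minimal sketch of reading a tab-delimited file while declaring its
encoding follows. It assumes the .tab download is UTF-8, which the
thread does not confirm, and it uses a made-up two-column file in a
temporary location rather than "NAVCO 1.3 List.tab" itself.

```r
# Write a tiny tab-delimited file containing UTF-8 bytes (made-up data).
tab_path <- tempfile(fileext = ".tab")
bytes <- c(charToRaw("id\tlocation\n1\tC"), as.raw(c(0xc3, 0xb4)),
           charToRaw("te d'Ivoire\n"))
writeBin(bytes, tab_path)

# encoding = "UTF-8" marks non-ASCII strings as UTF-8 without re-encoding them.
dat <- read.delim(tab_path, encoding = "UTF-8", stringsAsFactors = FALSE)
print(Encoding(dat$location))
```

Because the strings are marked on the way in, the resulting data frame
should behave the same on platforms whose native encoding is not UTF-8.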

On Thu, 22 Apr 2021 at 23:08, Duncan Murdoch <[hidden email]> wrote:

> First, ANSI_X3.4-1968 is an official name for a version of ASCII.
> It appears in the file near the start, where I believe it records the
> native encoding in place when the file was written, so readers using a
> different encoding can translate.
>
> Your actual file appears to have been encoded in UTF-8, but not marked
> as such.  You're lucky you read it on macOS, where UTF-8 is the native
> encoding, since the reader probably recognized the bytes weren't ASCII
> bytes (and warned you about that), then just left them alone.  If you
> read that file on Windows you'd likely get junk for those entries.

--
John Kane
Kingston ON Canada
