suggestion to fix packageDescription() for Windows users


Ben Marwick
Recently I was trying to cite a package where the authors have ä
and ø in their names. I found that on Windows the citation() function
did not return the authors' names at all, but on Linux there was no
problem (sessionInfos at the bottom):

On Windows, no author names are returned:

#---------------

 > citation("readr")

To cite package ‘readr’ in publications use:

   (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
   https://CRAN.R-project.org/package=readr

A BibTeX entry for LaTeX users is

   @Manual{,
     title = {readr: Read Rectangular Text Data},
     year = {2017},
     note = {R package version 1.1.1},
     url = {https://CRAN.R-project.org/package=readr},
   }

ATTENTION: This citation information has been auto-generated from the
package DESCRIPTION file and may need manual editing, see
‘help("citation")’.
#---------------

On Linux we do see the author names:

#---------------
 > citation("readr")

To cite package ‘readr’ in publications use:

   Hadley Wickham, Jim Hester and Romain Francois (2017). readr:
   Read Rectangular Text Data. R package version 1.1.1.
   https://CRAN.R-project.org/package=readr

A BibTeX entry for LaTeX users is

   @Manual{,
     title = {readr: Read Rectangular Text Data},
     author = {Hadley Wickham and Jim Hester and Romain Francois},
     year = {2017},
     note = {R package version 1.1.1},
     url = {https://CRAN.R-project.org/package=readr},
   }
#---------------

This appears to be an OS-dependent encoding issue. The citation function
does not take an encoding argument, so it's not possible to set the
encoding at the point where that function is used. The citation function
relies on the packageDescription function, which does have an
encoding argument, but the default is not useful for Windows when there
is an encoding set in the DESCRIPTION of the package (in this case UTF-8).
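
As a minimal sketch of passing the encoding explicitly: this assumes the
documented utils signature packageDescription(pkg, lib.loc, fields, drop,
encoding). The helper name desc_author is made up for illustration, and
"stats" merely stands in for any installed package.

```r
# Hypothetical helper: read the Author field of an installed package and
# request a specific target encoding rather than the locale-dependent default.
desc_author <- function(pkg, encoding = "UTF-8") {
  utils::packageDescription(pkg, fields = "Author", encoding = encoding)
}

# "stats" ships with R, so this runs without installing anything.
author <- desc_author("stats")
print(author)
```

This shows that the encoding can be chosen per call, but citation() offers
no way to pass such a value through.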

We can set the encoding argument in packageDescription so it works in
Windows to give the authors as expected, but it is very inconvenient to
generate citations directly from the output of this function. So I'd
like to propose a solution to this problem by changing one line in the
packageDescription function, like so, from:

#---------------
if (missing(encoding) && Sys.getlocale("LC_CTYPE") == "C")
#---------------

to:

#---------------
if ((missing(encoding) && Sys.getlocale("LC_CTYPE") == "C") |
unname(Sys.info()['sysname']) == "Windows")
#---------------

If I understand correctly, that will force ASCII//TRANSLIT encoding when
DESCRIPTION files are read by packageDescription() on Windows machines.
The upside is that Windows users will get the authors in the package
citation, unlike the current situation. The downside is that the exotic
symbols in the authors' names are replaced with common ones that are
similar.
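
To get a feel for what transliteration does to characters like these,
iconv() can be called directly. This is only a sketch: the replacement
characters depend on the platform's iconv implementation, so no particular
output is claimed here, and the variable names are illustrative.

```r
# Characters mentioned in the post, written as escapes so the source
# stays ASCII: U+00E4 (a with diaeresis) and U+00F8 (o with stroke).
x <- c("\u00e4", "\u00f8")

# Ask iconv to approximate non-ASCII characters with ASCII look-alikes.
# Results vary by platform; some implementations may return "?" or NA.
ascii <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
print(ascii)
```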

I think getting the citations to easily include the authors' names is
pretty important, even if their names have exotic characters, so this is
worth fixing. Is this edit to packageDescription the best way to solve
this problem of exotic characters preventing the authors' names from
showing on Windows?
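
One way to examine the proposed condition in isolation is to lift it into
a small function. This is a sketch and not part of packageDescription: the
name forces_translit and its arguments are hypothetical stand-ins for
missing(encoding), Sys.getlocale("LC_CTYPE") and Sys.info()['sysname'], and
it uses || where the proposal uses |, which gives the same result for
single values.

```r
# Hypothetical stand-alone version of the proposed test. The arguments
# replace missing(encoding), Sys.getlocale() and Sys.info() so the logic
# can be tried on any platform.
forces_translit <- function(encoding_missing, ctype, sysname) {
  (encoding_missing && ctype == "C") || sysname == "Windows"
}

forces_translit(TRUE,  "C",           "Linux")    # TRUE: the existing case
forces_translit(TRUE,  "en_US.UTF-8", "Linux")    # FALSE
forces_translit(FALSE, "en_US.UTF-8", "Windows")  # TRUE, even with an explicit encoding
```

The last call suggests that, as written, the change would also override an
encoding supplied by the caller on Windows. Restricting the Windows clause
to the missing(encoding) case may be closer to the intent.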

thanks,

Ben




Windows sessionInfo:

#---------------
 > sessionInfo()
R version 3.4.0 Patched (2017-05-10 r72670)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] readr_1.1.1    compiler_3.4.0 R6_2.2.1       hms_0.3        tools_3.4.0
 [6] tibble_1.3.3   yaml_2.1.14    Rcpp_0.12.11   knitr_1.16     rlang_0.1.1
[11] fortunes_1.5-4
#---------------

Linux sessionInfo:

#---------------
 > sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.10

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.3.1 yaml_2.1.14 knitr_1.16
#---------------

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: suggestion to fix packageDescription() for Windows users

Duncan Murdoch-2
On 17/06/2017 7:10 AM, Ben Marwick wrote:
> Recently I was trying to cite a package where the authors have ä
> and ø in their names. I found that on Windows the citation() function
> did not return the authors' names at all, but on Linux there was no
> problem (sessionInfos at the bottom):
>
> On Windows, no author names are returned:

I'm not seeing this.  You have fairly strange localization settings; see
comments below.

> locale:
> [1] LC_COLLATE=English_Australia.1252
> [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
> [3] LC_MONETARY=English_Australia.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_Australia.1252

I don't know what English_Australia.1252 does that's different from what
I use (English_Canada.1252), but the Chinese locale setting could cause
trouble.  Could you try setting this (presumably in the Windows control
panel) to be consistent?  You're using a much simpler setting on Linux.

Duncan Murdoch


Re: suggestion to fix packageDescription() for Windows users

Ben Marwick
Hi Duncan,

Thanks for your reply. Yes, it does seem to be specific to setting
LC_CTYPE to Chinese on Windows. If I set the locale to English using
Sys.setlocale() there is no problem, but when I set LC_CTYPE back to
Chinese the authors disappear:

Sys.setlocale("LC_ALL","English")
citation("readr")

#' To cite package ‘readr’ in publications use:
#'
#'   Hadley Wickham, Jim Hester and Romain Francois (2017). readr: Read
#' Rectangular Text Data. R package version 1.1.1.
#' https://CRAN.R-project.org/package=readr
#'
#' A BibTeX entry for LaTeX users is
#'
#' @Manual{,
#'   title = {readr: Read Rectangular Text Data},
#'   author = {Hadley Wickham and Jim Hester and Romain Francois},
#'   year = {2017},
#'   note = {R package version 1.1.1},
#'   url = {https://CRAN.R-project.org/package=readr},
#' }


Sys.setlocale("LC_CTYPE", "Chinese")
citation("readr")

#'
#' To cite package ‘readr’ in publications use:
#'
#'   (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
#' https://CRAN.R-project.org/package=readr
#'
#' A BibTeX entry for LaTeX users is
#'
#' @Manual{,
#'   title = {readr: Read Rectangular Text Data},
#'   year = {2017},
#'   note = {R package version 1.1.1},
#'   url = {https://CRAN.R-project.org/package=readr},
#' }
#'
#' ATTENTION: This citation information has been auto-generated from the
#' package DESCRIPTION file and may need manual editing, see
#' ‘help("citation")’.
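
The two runs above can also be wrapped so that the locale is restored
afterwards. This is a rough sketch under stated assumptions: with_ctype is
a hypothetical helper, not an existing R function, and the locale names
"English" and "Chinese" are the Windows names used above.

```r
# Hypothetical helper: evaluate `expr` with LC_CTYPE temporarily set to
# `locale`, then restore the previous setting even if `expr` fails.
with_ctype <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  Sys.setlocale("LC_CTYPE", locale)
  force(expr)  # lazy evaluation: expr runs only now, under the new locale
}

# Portable check: the "C" locale exists on every platform.
inside <- with_ctype("C", Sys.getlocale("LC_CTYPE"))
print(inside)

# On the Windows machine from this thread one might compare (not run here):
# with_ctype("English", citation("readr"))
# with_ctype("Chinese", citation("readr"))
```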

Where do we go from here? I do want to use the Chinese locale with R on
Windows (and perhaps others do too), so switching the locale isn't a fix.

Thanks,

Ben


Re: suggestion to fix packageDescription() for Windows users

Duncan Murdoch-2
On 17/06/2017 9:13 AM, Ben Marwick wrote:
> Hi Duncan,
>
> Thanks for your reply. Yes, it does seem to be specific to the CTYPE
> setting to Chinese on Windows. If I set it to English using
> Sys.setlocale() there is no problem, then back to Chinese and the
> authors disappear:
>
> Sys.setlocale("LC_ALL","English")
> citation("readr")

Thanks, that makes the problem reproducible.  I'll submit it as a bug
report.  Maybe someone from Microsoft will fix it.

Duncan Murdoch


Re: suggestion to fix packageDescription() for Windows users

Ben Marwick
Thanks very much, I see your bug report here:
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17291

On 18/06/2017 2:26 AM, Duncan Murdoch wrote:

> On 17/06/2017 9:13 AM, Ben Marwick wrote:
>> Hi Duncan,
>>
>> Thanks for your reply. Yes, it does seem to be specific to the CTYPE
>> setting to Chinese on Windows. If I set it to English using
>> Sys.setlocale() there is no problem, then back to Chinese and the
>> authors disappear:
>>
>> Sys.setlocale("LC_ALL","English")
>> citation("readr")
>
> Thanks, that makes the problem reproducible.  I'll submit it as a bug
> report.  Maybe someone from Microsoft will fix it.
>
> Duncan Murdoch
>
>>
>> #' To cite package ‘readr’ in publications use:
>> #'
>> #'   Hadley Wickham, Jim Hester and Romain Francois (2017). readr: Read
>> #' Rectangular Text Data. R package version 1.1.1.
>> #' https://CRAN.R-project.org/package=readr
>> #'
>> #' A BibTeX entry for LaTeX users is
>> #'
>> #' @Manual{,
>> #'   title = {readr: Read Rectangular Text Data},
>> #'   author = {Hadley Wickham and Jim Hester and Romain Francois},
>> #'   year = {2017},
>> #'   note = {R package version 1.1.1},
>> #'   url = {https://CRAN.R-project.org/package=readr},
>> #' }
>>
>>
>> Sys.setlocale("LC_CTYPE", "Chinese")
>> citation("readr")
>>
>> #'
>> #' To cite package ‘readr’ in publications use:
>> #'
>> #'   (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
>> #' https://CRAN.R-project.org/package=readr
>> #'
>> #' A BibTeX entry for LaTeX users is
>> #'
>> #' @Manual{,
>> #'   title = {readr: Read Rectangular Text Data},
>> #'   year = {2017},
>> #'   note = {R package version 1.1.1},
>> #'   url = {https://CRAN.R-project.org/package=readr},
>> #' }
>> #'
>> #' ATTENTION: This citation information has been auto-generated from the
>> #' package DESCRIPTION file and may need manual editing, see
>> #' ‘help("citation")’.
>>
>> Where do we go from here? I do want to use the Chinese locale with R on
>> Windows (and perhaps others do too), so switching the locale isn't a fix.
>>
>> Thanks,
>>
>> Ben
>>
>> On 17/06/2017 10:36 PM, Duncan Murdoch wrote:
>>> On 17/06/2017 7:10 AM, Ben Marwick wrote:
>>>> Recently I was trying to cite a package where the authors have ä
>>>> and ø in their names. I found that on Windows the citation() function
>>>> did not return the authors' names at all, but on Linux there was no
>>>> problem (sessionInfos at the bottom):
>>>>
>>>> On Windows, no author names are returned:
>>>
>>> I'm not seeing this.  You have fairly strange localization settings; see
>>> comments below.
>>>
>>>>
>>>> #---------------
>>>>
>>>>  > citation("readr")
>>>>
>>>> To cite package ‘readr’ in publications use:
>>>>
>>>>    (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
>>>>    https://CRAN.R-project.org/package=readr
>>>>
>>>> A BibTeX entry for LaTeX users is
>>>>
>>>>    @Manual{,
>>>>      title = {readr: Read Rectangular Text Data},
>>>>      year = {2017},
>>>>      note = {R package version 1.1.1},
>>>>      url = {https://CRAN.R-project.org/package=readr},
>>>>    }
>>>>
>>>> ATTENTION: This citation information has been auto-generated from the
>>>> package DESCRIPTION file and may need manual editing, see
>>>> ‘help("citation")’.
>>>> #---------------
>>>>
>>>> On Linux we do see the author names:
>>>>
>>>> #---------------
>>>>  > citation("readr")
>>>>
>>>> To cite package ‘readr’ in publications use:
>>>>
>>>>    Hadley Wickham, Jim Hester and Romain Francois (2017). readr:
>>>>    Read Rectangular Text Data. R package version 1.1.1.
>>>>    https://CRAN.R-project.org/package=readr
>>>>
>>>> A BibTeX entry for LaTeX users is
>>>>
>>>>    @Manual{,
>>>>      title = {readr: Read Rectangular Text Data},
>>>>      author = {Hadley Wickham and Jim Hester and Romain Francois},
>>>>      year = {2017},
>>>>      note = {R package version 1.1.1},
>>>>      url = {https://CRAN.R-project.org/package=readr},
>>>>    }
>>>> #---------------
>>>>
>>>> This appears to be an OS-dependent encoding issue. The citation
>>>> function
>>>> does not take an encoding argument, so it's not possible to set the
>>>> encoding at the point where that function is used. The citation
>>>> function
>>>> working with the packageDescription function, which does have an
>>>> encoding argument, but the default is not useful for Windows when there
>>>> is an encoding set in the DESCRIPTION of the package (in this case
>>>> UTF-8).
>>>>
>>>> We can set the encoding argument in packageDescription so it works in
>>>> Windows to give the authors as expected, but it is very inconvenient to
>>>> generate citations directly from the output of this function. So I'd
>>>> like to propose a solution this problem by changing one line in the
>>>> packageDescription function, like so, from:
>>>>
>>>> #---------------
>>>> if (missing(encoding) && Sys.getlocale("LC_CTYPE") == "C")
>>>> #---------------
>>>>
>>>> to:
>>>>
>>>> #---------------
>>>> if ((missing(encoding) && Sys.getlocale("LC_CTYPE") == "C") |
>>>> unname(Sys.info()['sysname']) == "Windows")
>>>> #---------------
>>>>
>>>> If I understand correctly, that will force ASCII//TRANSLIT encoding
>>>> when
>>>> DESCRIPTION files are read by packageDescription() on Windows machines.
>>>> The upside is that Windows users will get the authors in the package
>>>> citation, unlike the current situation. The downside is that the exotic
>>>> symbols in the authors' names are replaced with common ones that are
>>>> similar.
>>>>
>>>> I think getting the citations to easily include the authors' names is
>>>> pretty important, even if their names have exotic characters, so this is
>>>> worth fixing. Is this edit to packageDescription the best way to solve
>>>> this problem of exotic characters preventing the authors' names from
>>>> showing on Windows?
>>>>
>>>> thanks,
>>>>
>>>> Ben
>>>>
>>>>
>>>>
>>>>
>>>> Windows sessionInfo
>>>>
>>>> #---------------
>>>>  > sessionInfo()
>>>> R version 3.4.0 Patched (2017-05-10 r72670)
>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>> Running under: Windows 7 x64 (build 7601) Service Pack 1
>>>>
>>>> Matrix products: default
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=English_Australia.1252
>>>> [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
>>>> [3] LC_MONETARY=English_Australia.1252
>>>> [4] LC_NUMERIC=C
>>>> [5] LC_TIME=English_Australia.1252
>>>
>>> I don't know what English_Australia.1252 does that's different from what
>>> I use (English_Canada.1252), but the Chinese locale setting could cause
>>> trouble.  Could you try setting this (presumably in the Windows control
>>> panel) to be consistent?  You're using a much simpler setting on Linux.
>>>
>>> Duncan Murdoch
>>>
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>>   [1] readr_1.1.1    compiler_3.4.0 R6_2.2.1       hms_0.3        tools_3.4.0
>>>>   [6] tibble_1.3.3   yaml_2.1.14    Rcpp_0.12.11   knitr_1.16     rlang_0.1.1
>>>> [11] fortunes_1.5-4
>>>> #---------------
>>>>
>>>> Linux sessionInfo:
>>>>
>>>> #---------------
>>>>  > sessionInfo()
>>>> R version 3.3.1 (2016-06-21)
>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>> Running under: Ubuntu 16.10
>>>>
>>>> locale:
>>>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] tools_3.3.1 yaml_2.1.14 knitr_1.16
>>>> #---------------
>>>>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: suggestion to fix packageDescription() for Windows users

Andrie de Vries-2
In reply to this post by Duncan Murdoch-2
Hi, Duncan

I have forwarded this thread to Nathan, who promised to look into it.

Andrie

On 17 Jun 2017 17:26, "Duncan Murdoch" <[hidden email]> wrote:

> On 17/06/2017 9:13 AM, Ben Marwick wrote:
>
>> Hi Duncan,
>>
>> Thanks for your reply. Yes, it does seem to be specific to the CTYPE
>> setting to Chinese on Windows. If I set it to English using
>> Sys.setlocale() there is no problem, then back to Chinese and the
>> authors disappear:
>>
>> Sys.setlocale("LC_ALL","English")
>> citation("readr")
>>
>
> Thanks, that makes the problem reproducible.  I'll submit it as a bug
> report.  Maybe someone from Microsoft will fix it.
>
> Duncan Murdoch
>
>
>> #' To cite package ‘readr’ in publications use:
>> #'
>> #'   Hadley Wickham, Jim Hester and Romain Francois (2017). readr: Read
>> #' Rectangular Text Data. R package version 1.1.1.
>> #' https://CRAN.R-project.org/package=readr
>> #'
>> #' A BibTeX entry for LaTeX users is
>> #'
>> #' @Manual{,
>> #'   title = {readr: Read Rectangular Text Data},
>> #'   author = {Hadley Wickham and Jim Hester and Romain Francois},
>> #'   year = {2017},
>> #'   note = {R package version 1.1.1},
>> #'   url = {https://CRAN.R-project.org/package=readr},
>> #' }
>>
>>
>> Sys.setlocale("LC_CTYPE", "Chinese")
>> citation("readr")
>>
>> #'
>> #' To cite package ‘readr’ in publications use:
>> #'
>> #'   (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
>> #' https://CRAN.R-project.org/package=readr
>> #'
>> #' A BibTeX entry for LaTeX users is
>> #'
>> #' @Manual{,
>> #'   title = {readr: Read Rectangular Text Data},
>> #'   year = {2017},
>> #'   note = {R package version 1.1.1},
>> #'   url = {https://CRAN.R-project.org/package=readr},
>> #' }
>> #'
>> #' ATTENTION: This citation information has been auto-generated from the
>> #' package DESCRIPTION file and may need manual editing, see
>> #' ‘help("citation")’.
>>
>> Where do we go from here? I do want to use the Chinese locale with R on
>> Windows (and perhaps others do too), so switching the locale isn't a fix.
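One possible stopgap, short of changing the session locale for good, is to switch LC_CTYPE only for the duration of a single call and restore it afterwards. The sketch below is an assumption-laden illustration, not a fix: with_ctype() is a hypothetical helper name, and ?Sys.setlocale warns that changing the character set during a session may not work reliably. The demonstration uses the "C" locale so that it runs on any platform.

```r
# Hypothetical helper: evaluate an expression with LC_CTYPE temporarily
# switched, then restore the previous setting even if an error occurs.
with_ctype <- function(locale, expr) {
  old <- Sys.getlocale("LC_CTYPE")
  on.exit(Sys.setlocale("LC_CTYPE", old), add = TRUE)
  Sys.setlocale("LC_CTYPE", locale)
  # 'expr' is a lazily evaluated promise, so it is only run here,
  # after the locale has been switched.
  force(expr)
}

# Portable demonstration with the "C" locale.
before <- Sys.getlocale("LC_CTYPE")
inside <- with_ctype("C", Sys.getlocale("LC_CTYPE"))
after  <- Sys.getlocale("LC_CTYPE")
print(inside)

# On Windows, the case from this thread would be (not run here):
# with_ctype("English", citation("readr"))
```

The on.exit() call is what keeps the Chinese locale in place for everything else in the session.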
>>
>> Thanks,
>>
>> Ben
>>
>> On 17/06/2017 10:36 PM, Duncan Murdoch wrote:
>>
>>> On 17/06/2017 7:10 AM, Ben Marwick wrote:
>>>
>>>> Recently I was trying to cite a package where the authors have ä
>>>> and ø in their names. I found that on Windows the citation() function
>>>> did not return the authors' names at all, but on Linux there was no
>>>> problem (sessionInfos at the bottom):
>>>>
>>>> On Windows, no author names are returned:
>>>>
>>>
>>> I'm not seeing this.  You have fairly strange localization settings; see
>>> comments below.
>>>
>>>
>>>> #---------------
>>>>
>>>>  > citation("readr")
>>>>
>>>> To cite package ‘readr’ in publications use:
>>>>
>>>>    (2017). readr: Read Rectangular Text Data. R package version 1.1.1.
>>>>    https://CRAN.R-project.org/package=readr
>>>>
>>>> A BibTeX entry for LaTeX users is
>>>>
>>>>    @Manual{,
>>>>      title = {readr: Read Rectangular Text Data},
>>>>      year = {2017},
>>>>      note = {R package version 1.1.1},
>>>>      url = {https://CRAN.R-project.org/package=readr},
>>>>    }
>>>>
>>>> ATTENTION: This citation information has been auto-generated from the
>>>> package DESCRIPTION file and may need manual editing, see
>>>> ‘help("citation")’.
>>>> #---------------
>>>>
>>>> On Linux we do see the author names:
>>>>
>>>> #---------------
>>>>  > citation("readr")
>>>>
>>>> To cite package ‘readr’ in publications use:
>>>>
>>>>    Hadley Wickham, Jim Hester and Romain Francois (2017). readr:
>>>>    Read Rectangular Text Data. R package version 1.1.1.
>>>>    https://CRAN.R-project.org/package=readr
>>>>
>>>> A BibTeX entry for LaTeX users is
>>>>
>>>>    @Manual{,
>>>>      title = {readr: Read Rectangular Text Data},
>>>>      author = {Hadley Wickham and Jim Hester and Romain Francois},
>>>>      year = {2017},
>>>>      note = {R package version 1.1.1},
>>>>      url = {https://CRAN.R-project.org/package=readr},
>>>>    }
>>>> #---------------
>>>>
>>>> This appears to be an OS-dependent encoding issue. The citation function
>>>> does not take an encoding argument, so it's not possible to set the
>>>> encoding at the point where that function is used. The citation function
>>>> works with the packageDescription function, which does have an
>>>> encoding argument, but the default is not useful for Windows when there
>>>> is an encoding set in the DESCRIPTION of the package (in this case
>>>> UTF-8).
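For readers who want to try the encoding argument described above, here is a minimal sketch that asks packageDescription() for an explicit target encoding instead of relying on the locale default (""). The helper name author_field() is hypothetical, and "utils" is only a stand-in package so that the snippet runs on any installation; substitute the package whose authors go missing.

```r
# Hypothetical helper: read the Author field of an installed package,
# requesting an explicit target encoding rather than the locale default.
author_field <- function(pkg, encoding = "UTF-8") {
  # With a single field and drop = TRUE, packageDescription() returns a
  # plain character string (NA if the field is absent).
  utils::packageDescription(pkg, fields = "Author", drop = TRUE,
                            encoding = encoding)
}

# "utils" is a stand-in that ships with every R installation.
author <- author_field("utils")
print(author)
```

Re-encoding is only attempted when the DESCRIPTION file declares an Encoding field, so for a package without one the encoding argument has no effect.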
>>>>
>>>> We can set the encoding argument in packageDescription so it works in
>>>> Windows to give the authors as expected, but it is very inconvenient to
>>>> generate citations directly from the output of this function. So I'd
>>>> like to propose a solution to this problem by changing one line in the
>>>> packageDescription function, like so, from:
>>>>
>>>> #---------------
>>>> if (missing(encoding) && Sys.getlocale("LC_CTYPE") == "C")
>>>> #---------------
>>>>
>>>> to:
>>>>
>>>> #---------------
>>>> if ((missing(encoding) && Sys.getlocale("LC_CTYPE") == "C") ||
>>>>     unname(Sys.info()["sysname"]) == "Windows")
>>>> #---------------
>>>>
>>>> If I understand correctly, that will force ASCII//TRANSLIT encoding when
>>>> DESCRIPTION files are read by packageDescription() on Windows machines.
>>>> The upside is that Windows users will get the authors in the package
>>>> citation, unlike the current situation. The downside is that the exotic
>>>> symbols in the authors' names are replaced with common ones that are
>>>> similar.
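To make the trade-off above concrete, the sketch below shows how iconv() treats a character that the target encoding cannot represent. The name is made up for illustration, and whether this is the exact mechanism behind the missing authors in the code page 936 locale is an assumption that this thread does not establish.

```r
# A made-up name containing "ø", written with a Unicode escape so the
# source itself stays ASCII.
name <- "J\u00f8rgen"

# Strict conversion: a character with no ASCII equivalent makes iconv()
# return NA for the whole string.
strict <- iconv(name, from = "UTF-8", to = "ASCII")

# Transliteration: a similar ASCII character is substituted where the
# platform's iconv supports it. The result is platform-dependent, and
# some platforms do not support //TRANSLIT at all, hence the tryCatch.
translit <- tryCatch(
  iconv(name, from = "UTF-8", to = "ASCII//TRANSLIT"),
  error = function(e) NA_character_
)

# Explicit substitution: each unconvertible byte is replaced by "?".
subbed <- iconv(name, from = "UTF-8", to = "ASCII", sub = "?")

print(is.na(strict))
print(translit)
print(subbed)
```

A strict conversion loses the whole string, while transliteration or substitution loses only the accented character, which is the upside and downside weighed in the paragraph above.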
>>>>
>>>> I think getting the citations to easily include the authors' names is
>>>> pretty important, even if their names have exotic characters, so this is
>>>> worth fixing. Is this edit to packageDescription the best way to solve
>>>> this problem of exotic characters preventing the authors' names from
>>>> showing on Windows?
>>>>
>>>> thanks,
>>>>
>>>> Ben
>>>>
>>>>
>>>>
>>>>
>>>> Windows sessionInfo
>>>>
>>>> #---------------
>>>>  > sessionInfo()
>>>> R version 3.4.0 Patched (2017-05-10 r72670)
>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>> Running under: Windows 7 x64 (build 7601) Service Pack 1
>>>>
>>>> Matrix products: default
>>>>
>>>> locale:
>>>> [1] LC_COLLATE=English_Australia.1252
>>>> [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
>>>> [3] LC_MONETARY=English_Australia.1252
>>>> [4] LC_NUMERIC=C
>>>> [5] LC_TIME=English_Australia.1252
>>>>
>>>
>>> I don't know what English_Australia.1252 does that's different from what
>>> I use (English_Canada.1252), but the Chinese locale setting could cause
>>> trouble.  Could you try setting this (presumably in the Windows control
>>> panel) to be consistent?  You're using a much simpler setting on Linux.
>>>
>>> Duncan Murdoch
>>>
>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>>   [1] readr_1.1.1    compiler_3.4.0 R6_2.2.1       hms_0.3        tools_3.4.0
>>>>   [6] tibble_1.3.3   yaml_2.1.14    Rcpp_0.12.11   knitr_1.16     rlang_0.1.1
>>>> [11] fortunes_1.5-4
>>>> #---------------
>>>>
>>>> Linux sessionInfo:
>>>>
>>>> #---------------
>>>>  > sessionInfo()
>>>> R version 3.3.1 (2016-06-21)
>>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>>> Running under: Ubuntu 16.10
>>>>
>>>> locale:
>>>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>> loaded via a namespace (and not attached):
>>>> [1] tools_3.3.1 yaml_2.1.14 knitr_1.16
>>>> #---------------
>>>>
