reg-tests-1d.R fails in r72721

Hiroyuki Kawakatsu
Hi,

I am failing make check in r72721 at the end of reg-tests-1d.R. The
relevant block of code is

## path.expand shouldn't translate to local encoding PR#17120
filename <- "\U9b3c.R"
print(Encoding(filename))
x1 <- path.expand(paste0("~/", filename))
print(Encoding(x1))
x2 <- paste0(path.expand("~/"), filename)
print(Encoding(x2))
stopifnot(identical( path.expand(paste0("~/", filename)), paste0(path.expand("~/"), filename)))
## Chinese character was changed to hex code

Encoding(x1) is "unknown" while Encoding(x2) is "UTF-8". If I run
this code with R --vanilla, both are UTF-8 and the assertion
passes. What is make check doing differently? Or is there something
wrong with my setting/environment? Thanks,
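
One way to compare the two environments is to print the locale alongside the encoding marks and the raw bytes of both results. The following is only a diagnostic sketch, not part of reg-tests-1d.R; the `report` list and its field names are made up for illustration. Running it once under make check and once under R --vanilla would show whether the sessions differ in LC_CTYPE.

```r
# Diagnostic sketch (hypothetical helper, not from the test suite):
# record the locale and compare both path constructions byte by byte.
filename <- "\U9b3c.R"                       # same file name as in the test

x1 <- path.expand(paste0("~/", filename))    # expand after pasting
x2 <- paste0(path.expand("~/"), filename)    # expand before pasting

report <- list(
  ctype      = Sys.getlocale("LC_CTYPE"),    # character-type locale of this session
  utf8       = l10n_info()[["UTF-8"]],       # does R treat the locale as UTF-8?
  marks      = Encoding(c(x1, x2)),          # declared encodings of both results
  same_bytes = identical(charToRaw(x1), charToRaw(x2))  # byte-level comparison
)
str(report)
```

If `same_bytes` is FALSE, the two strings differ in content and not only in their declared encoding.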

h.
--
+---
| Hiroyuki Kawakatsu
| Business School, Dublin City University
| Dublin 9, Ireland. Tel +353 (0)1 700 7496

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: reg-tests-1d.R fails in r72721

Duncan Murdoch-2
On 24/05/2017 5:47 AM, Hiroyuki Kawakatsu wrote:

> Hi,
>
> I am failing make check in r72721 at the end of reg-tests-1d.R. The
> relevant block of code is
>
> ## path.expand shouldn't translate to local encoding PR#17120
> filename <- "\U9b3c.R"
> print(Encoding(filename))
> x1 <- path.expand(paste0("~/", filename))
> print(Encoding(x1))
> x2 <- paste0(path.expand("~/"), filename)
> print(Encoding(x2))
> stopifnot(identical( path.expand(paste0("~/", filename)), paste0(path.expand("~/"), filename)))
> ## Chinese character was changed to hex code
>
> Encoding(x1) is "unknown" while Encoding(x2) is "UTF-8". If I run
> this code with R --vanilla, both are UTF-8 and the assertion
> passes. What is make check doing differently? Or is there something
> wrong with my setting/environment? Thanks,
>

I think the test is wrong because in the first case you are working in a
locale where that character is representable.  In my locale it is not,
so x1 is converted to UTF-8, and everything compares equal.

An explicit conversion of x1 to UTF-8 should fix this, i.e. replace

x1 <- path.expand(paste0("~/", filename))

with

x1 <- enc2utf8(path.expand(paste0("~/", filename)))

Could you try this and see if it helps?

Duncan Murdoch

Re: reg-tests-1d.R fails in r72721

Hiroyuki Kawakatsu
On 2017-05-24, Duncan Murdoch wrote:

>
> I think the test is wrong because in the first case you are working in a
> locale where that character is representable.  In my locale it is not, so x1
> is converted to UTF-8, and everything compares equal.
>
> An explicit conversion of x1 to UTF-8 should fix this, i.e. replace
>
> x1 <- path.expand(paste0("~/", filename))
>
> with
>
> x1 <- enc2utf8(path.expand(paste0("~/", filename)))
>
> Could you try this and see if it helps?

Nope:

> ## path.expand shouldn't translate to local encoding PR#17120
> filename <- "\U9b3c.R"
>
> x11 <- path.expand(paste0("~/", filename))
> print(Encoding(x11))
[1] "unknown"
> x12 <- enc2utf8( path.expand(paste0("~/", filename)) )
> print(Encoding(x12))
[1] "unknown"
> x2 <- paste0(path.expand("~/"), filename)
> print(Encoding(x2))
[1] "UTF-8"
>
> #stopifnot(identical(path.expand(paste0("~/", filename)),
> stopifnot(identical(enc2utf8( path.expand(paste0("~/", filename)) ),
+                   paste0(path.expand("~/"), filename)))
Error: identical(enc2utf8(path.expand(paste0("~/", filename))), paste0(path.expand("~/"),  .... is not TRUE
Execution halted

I forgot to report:

> sessionInfo()
R Under development (unstable) (2017-05-23 r72721)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS: /usr/local/share/R-devel/lib/libRblas.so
LAPACK: /usr/local/share/R-devel/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8  
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

loaded via a namespace (and not attached):
[1] compiler_3.5.0

h.

Re: reg-tests-1d.R fails in r72721

Duncan Murdoch-2
On 24/05/2017 7:59 AM, Hiroyuki Kawakatsu wrote:

> On 2017-05-24, Duncan Murdoch wrote:
>>
>> I think the test is wrong because in the first case you are working in a
>> locale where that character is representable.  In my locale it is not, so x1
>> is converted to UTF-8, and everything compares equal.
>>
>> An explicit conversion of x1 to UTF-8 should fix this, i.e. replace
>>
>> x1 <- path.expand(paste0("~/", filename))
>>
>> with
>>
>> x1 <- enc2utf8(path.expand(paste0("~/", filename)))
>>
>> Could you try this and see if it helps?
>
> Nope:

Okay, how about if we weaken the test?  Instead of

stopifnot(identical(path.expand(paste0("~/", filename)),
                     paste0(path.expand("~/"), filename)))

try

stopifnot(path.expand(paste0("~/", filename)) ==
                       paste0(path.expand("~/"), filename))

Duncan Murdoch

>
>> ## path.expand shouldn't translate to local encoding PR#17120
>> filename <- "\U9b3c.R"
>>
>> x11 <- path.expand(paste0("~/", filename))
>> print(Encoding(x11))
> [1] "unknown"
>> x12 <- enc2utf8( path.expand(paste0("~/", filename)) )
>> print(Encoding(x12))
> [1] "unknown"
>> x2 <- paste0(path.expand("~/"), filename)
>> print(Encoding(x2))
> [1] "UTF-8"
>>
>> #stopifnot(identical(path.expand(paste0("~/", filename)),
>> stopifnot(identical(enc2utf8( path.expand(paste0("~/", filename)) ),
> +                   paste0(path.expand("~/"), filename)))
> Error: identical(enc2utf8(path.expand(paste0("~/", filename))), paste0(path.expand("~/"),  .... is not TRUE
> Execution halted
>
> I forgot to report:
>
>> sessionInfo()
> R Under development (unstable) (2017-05-23 r72721)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 9 (stretch)
>
> Matrix products: default
> BLAS: /usr/local/share/R-devel/lib/libRblas.so
> LAPACK: /usr/local/share/R-devel/lib/libRlapack.so
>
> locale:
>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.0
>
> h.
>

Re: reg-tests-1d.R fails in r72721

Hiroyuki Kawakatsu
On 2017-05-24, Duncan Murdoch wrote:
[...]
> Okay, how about if we weaken the test?  
[...]
> try
>
> stopifnot(path.expand(paste0("~/", filename)) ==
>                       paste0(path.expand("~/"), filename))
>

Nope:

> ## path.expand shouldn't translate to local encoding PR#17120
> filename <- "\U9b3c.R"
>
> #stopifnot(identical(path.expand(paste0("~/", filename)),
> stopifnot(path.expand(paste0("~/", filename)) ==
+                   paste0(path.expand("~/"), filename))
Error: path.expand(paste0("~/", filename)) == paste0(path.expand("~/"),  .... is not TRUE
Execution halted
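
For context, the R documentation says that both identical() and == compare character strings after translating them to UTF-8 when their declared encodings differ, so a difference in the encoding mark alone should not make either test fail. The sketch below illustrates this with a Latin-1 example modelled on the one in ?Encoding; the variable names are illustrative only, and it does not reproduce the path.expand() failure itself.

```r
# The same text stored under two different declared encodings.
x_latin1 <- "fa\xE7ile"                        # bytes as they would be in Latin-1
Encoding(x_latin1) <- "latin1"                 # declare the encoding
x_utf8 <- iconv(x_latin1, "latin1", "UTF-8")   # same text, re-encoded as UTF-8

marks          <- Encoding(c(x_latin1, x_utf8))  # the declared encodings differ
same_identical <- identical(x_latin1, x_utf8)    # compared after translation
same_equal     <- x_latin1 == x_utf8             # likewise

print(marks)
print(c(identical = same_identical, equal = same_equal))
```

That both comparisons fail in the transcript above therefore suggests that the two paths differ in their actual characters, not merely in how they are marked.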

The problem is that path.expand() (do_pathexpand() on non-Windows
platforms) calls translateChar(), which in turn calls
translateToNative(). Under make check (but not under R --vanilla) the
result has "unknown" encoding on my setup. Once it is unknown, there
seems to be no way to force an encoding:

> ## path.expand shouldn't translate to local encoding PR#17120
> filename <- "\U9b3c.R"
> print(Encoding(filename))
[1] "UTF-8"
>
> y1 <- paste0("~/", filename)
> print(Encoding(y1))
[1] "UTF-8"
>
> y2 <- path.expand(y1)
> print(Encoding(y2))
[1] "unknown"
>
> y3a <- iconv(y2, to="UTF-8")
> print(Encoding(y3a))
[1] "unknown"
>
> y3b <- enc2utf8(y2)
> print(Encoding(y3b))
[1] "unknown"
>
> Encoding(y2) <- "UTF-8"
> print(Encoding(y2))
[1] "unknown"
>
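
One possible explanation for the last three results, consistent with the test's comment that the "Chinese character was changed to hex code" but not confirmed in this thread: ?Encoding documents that ASCII strings are never marked with a declared encoding. If translation to the native encoding replaced the character with an ASCII escape, the result would be pure ASCII and no later call could mark it. The sketch below uses a hypothetical stand-in string, `"<U+9B3C>.R"`, as an assumption about what such a result might look like.

```r
# Hypothetical stand-in for a path whose non-ASCII character was replaced
# by an ASCII escape during translation (an assumption, not observed output).
escaped <- "<U+9B3C>.R"

Encoding(escaped) <- "UTF-8"                  # attempt to declare an encoding
mark_after_set  <- Encoding(escaped)          # ASCII strings are never marked
mark_after_conv <- Encoding(enc2utf8(escaped))

# The raw bytes reveal the content regardless of any mark.
bytes_escaped <- charToRaw(escaped)           # all bytes below 0x80
bytes_real    <- charToRaw("\U9b3c.R")        # UTF-8 bytes of the real name

print(c(mark_after_set, mark_after_conv))
print(bytes_escaped)
print(bytes_real)
```

Inspecting `charToRaw(y2)` in the failing session would show whether this is what happened.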

h.

Re: reg-tests-1d.R fails in r72721

Duncan Murdoch-2
On 24/05/2017 9:59 AM, Hiroyuki Kawakatsu wrote:

> On 2017-05-24, Duncan Murdoch wrote:
> [...]
>> Okay, how about if we weaken the test?
> [...]
>> try
>>
>> stopifnot(path.expand(paste0("~/", filename)) ==
>>                       paste0(path.expand("~/"), filename))
>>
>
> Nope:
>
>> ## path.expand shouldn't translate to local encoding PR#17120
>> filename <- "\U9b3c.R"
>>
>> #stopifnot(identical(path.expand(paste0("~/", filename)),
>> stopifnot(path.expand(paste0("~/", filename)) ==
> +                   paste0(path.expand("~/"), filename))
> Error: path.expand(paste0("~/", filename)) == paste0(path.expand("~/"),  .... is not TRUE
> Execution halted

Thanks.  I've made that test conditional on running on Windows, and
re-opened bug 17120.  I indicated that it's now a Unix-only bug.

This may be a first:  a case where R handles non-native characters
better in Windows than it does in Unix.  I'm sure this will show up in a
Microsoft ad soon :-).

Duncan Murdoch

> The problem is that path.expand() (do_pathexpand() on non-Windows
> platforms) calls translateChar(), which in turn calls
> translateToNative(). Under make check (but not under R --vanilla) the
> result has "unknown" encoding on my setup. Once it is unknown, there
> seems to be no way to force an encoding:
>
>> ## path.expand shouldn't translate to local encoding PR#17120
>> filename <- "\U9b3c.R"
>> print(Encoding(filename))
> [1] "UTF-8"
>>
>> y1 <- paste0("~/", filename)
>> print(Encoding(y1))
> [1] "UTF-8"
>>
>> y2 <- path.expand(y1)
>> print(Encoding(y2))
> [1] "unknown"
>>
>> y3a <- iconv(y2, to="UTF-8")
>> print(Encoding(y3a))
> [1] "unknown"
>>
>> y3b <- enc2utf8(y2)
>> print(Encoding(y3b))
> [1] "unknown"
>>
>> Encoding(y2) <- "UTF-8"
>> print(Encoding(y2))
> [1] "unknown"
>>
>
> h.
>
