Rprintf expected encoding

Patrick Perry-2
I'm trying to find information about how to use Rprintf with a UTF-8
encoded string, and I'm not sure what the right cross-platform usage is.
I found an earlier thread about this
(http://r.789695.n4.nabble.com/How-to-print-UTF-8-encoded-strings-from-a-C-routine-to-R-s-output-td4724337.html)
but it wasn't very helpful.

If I want to print a UTF-8 string, I can do one of the following:

1) Send native data via Rprintf("%s", translateChar(str));

2) Send UTF-8 data via Rprintf("%s", translateCharUTF8(str));

If Rprintf is sending its output to stdout, then (1) seems like the
correct option. If Rprintf is sending to a file connection with encoding
set to UTF-8 (for example, after a call to sink(file(...,
encoding="UTF-8"))), then (2) is correct. Is there any way to know the
encoding that Rprintf is expecting?

Thanks,


Patrick
--

Patrick Perry
Assistant Professor
New York University

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: Rprintf expected encoding

Duncan Murdoch-2
On 30/06/2017 4:24 PM, Patrick Perry wrote:

> I'm trying to find information about how to use Rprintf with a UTF-8
> encoded string, and I'm not sure what the right cross-platform usage is.
> I found an earlier thread about this
> (http://r.789695.n4.nabble.com/How-to-print-UTF-8-encoded-strings-from-a-C-routine-to-R-s-output-td4724337.html)
> but it wasn't very helpful.
>
> If I want to print a UTF-8 string, I can do one of the following:
>
> 1) Send native data via Rprintf("%s", translateChar(str));
>
> 2) Send UTF-8 data via Rprintf("%s", translateCharUTF8(str));
>
> If Rprintf is sending its output to stdout, then (1) seems like the
> correct option. If Rprintf is sending to a file connection with encoding
> set to UTF-8 (for example, after a call to sink(file(...,
> encoding="UTF-8"))), then (2) is correct. Is there any way to know the
> encoding that Rprintf is expecting?

Rprintf always expects the native encoding.  If the output connection is
UTF-8 encoded, R will translate from native to UTF-8 as it writes.

Things will hopefully change in R 3.5.0, since the translation from
UTF-8 to native to UTF-8 can lose information (and is inefficient even
if not lossy).  I think old code should behave as it did in the past,
but there will be a way to say that the incoming string is in UTF-8.
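
To make the information loss concrete, here is a minimal standalone sketch of the UTF-8 to native to UTF-8 round trip. It does not use the R API: it assumes, purely for illustration, that the native encoding is Latin-1 and that unrepresentable characters are replaced by '?'. The helper names (utf8_decode, utf8_to_latin1, latin1_to_utf8, roundtrip_via_latin1) are hypothetical, the input is assumed to be well-formed UTF-8, and the substitution policy of a real converter may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode one UTF-8 sequence at s into *cp; return the bytes consumed.
 * Assumes well-formed input (no validation is done in this sketch). */
static size_t utf8_decode(const unsigned char *s, unsigned long *cp)
{
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((unsigned long)(s[0] & 0x0F) << 12)
            | ((unsigned long)(s[1] & 0x3F) << 6) | (s[2] & 0x3F);
        return 3;
    }
    *cp = ((unsigned long)(s[0] & 0x07) << 18)
        | ((unsigned long)(s[1] & 0x3F) << 12)
        | ((unsigned long)(s[2] & 0x3F) << 6) | (s[3] & 0x3F);
    return 4;
}

/* UTF-8 -> Latin-1 (the assumed "native" encoding). Code points above
 * U+00FF have no Latin-1 byte and become '?'. Returns the number of
 * substituted characters. */
static int utf8_to_latin1(const char *in, char *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;
    int lost = 0;
    while (*s) {
        unsigned long cp;
        s += utf8_decode(s, &cp);
        assert(n + 1 < cap);            /* room for one byte plus NUL */
        if (cp <= 0xFF) {
            out[n++] = (char)(unsigned char)cp;
        } else {
            out[n++] = '?';
            lost++;
        }
    }
    out[n] = '\0';
    return lost;
}

/* Latin-1 -> UTF-8. Every Latin-1 byte maps to U+0000..U+00FF, so this
 * direction never loses information. */
static void latin1_to_utf8(const char *in, char *out, size_t cap)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t n = 0;
    for (; *s; s++) {
        assert(n + 2 < cap);            /* room for two bytes plus NUL */
        if (*s < 0x80) {
            out[n++] = (char)*s;
        } else {
            out[n++] = (char)(0xC0 | (*s >> 6));
            out[n++] = (char)(0x80 | (*s & 0x3F));
        }
    }
    out[n] = '\0';
}

/* Round-trip utf8 through the assumed native encoding into result.
 * Returns 1 if the string came back unchanged, 0 if anything was lost.
 * Inputs longer than 255 characters trip the capacity assert. */
int roundtrip_via_latin1(const char *utf8, char *result, size_t cap)
{
    char native[256];
    int lost = utf8_to_latin1(utf8, native, sizeof native);
    latin1_to_utf8(native, result, cap);
    return lost == 0 && strcmp(utf8, result) == 0;
}
```

Under these assumptions a string such as "café" survives the round trip, because é (U+00E9) has a Latin-1 byte, while a euro sign (U+20AC) comes back as '?' even though both the source string and the output connection are UTF-8.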

Duncan Murdoch
