
encoding problem with rPython



Jason-Kim
Hi,
I want to analyze Korean text from R using a Python library.
But the characters returned from the Python code are not displayed correctly.
So I wrote the simple code below.
It just passes a Korean string into Python and then returns the split words.
I want to process text the way the split_word1 function does.
But the returned characters are broken.
What should I do?


::R code start::

> library(rPython)
> python.exec("def split_word1(s): return s.split(' ')")
> python.call("split_word1", "시스템 정보")
[1] "ܤ\\"      "\025\xf4"
>
> python.exec("def split_word2(s): return [unicode(w) for w in s.split(' ')]")
> python.call("split_word2", "시스템 정보")
[1] "ܤ\\"      "\025\xf4"
>
> python.exec("def constant_word1(): return ['시스템', '정보']")
> python.call("constant_word1")
[1] "ܤ\\"      "\025\xf4"
>
> python.exec("def constant_word2(): return [u'시스템', u'정보']")
> python.call("constant_word2")
[1] "시스템" "정보"  
>

::R code end::
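
For reference, the broken output above looks consistent with each character being cut down to the low byte of its Unicode code point somewhere on the way back to R. The sketch below only reproduces that byte pattern in plain Python; `low_bytes` is a hypothetical helper written for this illustration, not part of rPython, and this is an observation about the output rather than a confirmed diagnosis.

```python
# -*- coding: utf-8 -*-
# Illustration only: mimics the garbled bytes seen in the R session.
# It does not show what rPython actually does internally.

def low_bytes(text):
    """Keep only the low 8 bits of each code point (hypothetical helper)."""
    return bytes(bytearray(ord(ch) & 0xFF for ch in text))

words = u"시스템 정보".split(u" ")
garbled = [low_bytes(w) for w in words]

# First word: bytes DC A4 5C, which read as UTF-8 give U+0724 followed by a backslash.
# Second word: bytes 15 F4, which are not valid UTF-8 on their own.
print(garbled)
print(repr(garbled[0].decode("utf-8")))
```

If this reading is right, the Python side may be holding the correct text and the damage may happen during the conversion back to R, which would explain why wrapping the words in unicode() changes nothing.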