R prompt updates are not validated

Jack Wasey
I was trying to get an interactive R prompt with the current working directory. I reviewed R source 'main.c' and 'options.c', and saw that a 20 char buffer is used when in Browse debugging mode, but that no other validation is done on the length of the prompt option.

This hangs R, or takes extremely long to return:

# R --vanilla
big <- paste(sample(LETTERS, size = 1e7, replace = TRUE), collapse = "")
options(prompt = big)

Running R under gdb and interrupting it to get backtraces shows that 'pushReadLine' in 'unix/sys-std.c' results in a chain of libreadline calls, including, in my case at least, UTF-8 handling and a lot of __strlen_avx2 activity. 'R_PromptString' in 'main.c' should check that the prompt is a reasonable length, and 'options.c' should check it when the prompt is set. This may be a readline bug, too? I watched it do nothing for a while; it didn't seem to accumulate much, if any, new memory in 'top', but it did max out one CPU core.

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 19.04

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    

loaded via a namespace (and not attached):
[1] compiler_3.5.3
>

I've searched R-devel and see minimal discussion of security threats in R. Has anybody fuzzed R with data or source files? As R grows in popularity, I hope there is some pro-active security work going on, which I understand may not always best be done on a public mailing list.

Jack Wasey

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: R prompt updates are not validated

Tomas Kalibera
On 4/18/19 11:07 PM, Jack Wasey wrote:
> I was trying to get an interactive R prompt with the current working directory. I reviewed R source 'main.c' and 'options.c', and saw that a 20 char buffer is used when in Browse debugging mode, but that no other validation is done on the length of the prompt option.
There is no limit enforced on the length of the prompt option. There is
nothing wrong with accepting strings of unlimited length, when they are
handled properly. I don't see any problem in how the prompt string is
handled in the present code base. The only reason to add such a limit may
be to hide the readline issue you found, but perhaps it would be better
to solve that on the readline side, unless you have a realistic
reproducible example to trigger the problem.
> This hangs R, or takes extremely long to return:
>
> # R --vanilla
> big <- paste(sample(LETTERS, size = 1e7, replace = TRUE), collapse = "")
> options(prompt = big)
>
> Running R under gdb and interrupting it to get backtraces shows that 'pushReadLine' in 'unix/sys-std.c' results in a chain of libreadline calls, including, in my case at least, UTF-8 handling and a lot of __strlen_avx2 activity. 'R_PromptString' in 'main.c' should check that the prompt is a reasonable length, and 'options.c' should check it when the prompt is set. This may be a readline bug, too? I watched it do nothing for a while; it didn't seem to accumulate much, if any, new memory in 'top', but it did max out one CPU core.

I can reproduce this issue with readline on Ubuntu and Fedora:
rl_callback_handler_install() takes very long, spending a lot of time in
encoding conversions, and for large inputs corrupts memory. On macOS
with editline I could get long prompts working fine (fast and without
crashing). I don't see how this could be a problem in R; it seems to be
in readline. If you or anyone else finds it to be a problem worth spending
time on, I would suggest creating a small standalone C example to
trigger it and filing a bug against readline.
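
A minimal sketch of what such a standalone example might look like is given below. It is not taken from the R or readline sources and has not been confirmed to reproduce the slowdown or the memory corruption. The file name repro.c, the helper make_prompt(), the handler on_line() and the USE_READLINE macro are hypothetical names chosen for illustration. The readline-dependent part is guarded by USE_READLINE so that the prompt-building helper can be compiled and checked on a machine without libreadline.

```c
/* repro.c: hypothetical standalone reproducer sketch.
 * Build: cc -DUSE_READLINE repro.c -lreadline -o repro
 * Run:   ./repro 10000000
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: heap-allocate a prompt of n pseudo-random uppercase
 * letters, roughly mirroring
 *   paste(sample(LETTERS, size = n, replace = TRUE), collapse = "")
 * Returns NULL on allocation failure; the caller frees the result. */
static char *make_prompt(size_t n)
{
    char *p = malloc(n + 1);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        p[i] = (char)('A' + rand() % 26);
    p[n] = '\0';
    return p;
}

#ifdef USE_READLINE
#include <readline/readline.h>

/* Line handler required by readline's callback interface. readline passes
 * a malloc'ed line (or NULL at EOF); this sketch only releases it. */
static void on_line(char *line)
{
    free(line);
}

int main(int argc, char **argv)
{
    /* Prompt length from the command line; 1000 is an arbitrary default. */
    size_t n = (argc > 1) ? strtoul(argv[1], NULL, 10) : 1000;
    char *prompt = make_prompt(n);
    if (prompt == NULL) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    /* Time only the call reported above as slow. */
    clock_t start = clock();
    rl_callback_handler_install(prompt, on_line);
    clock_t end = clock();
    rl_callback_handler_remove();

    fprintf(stderr, "\nprompt length %zu: install took %.3f s of CPU time\n",
            n, (double)(end - start) / CLOCKS_PER_SEC);
    free(prompt);
    return 0;
}
#endif /* USE_READLINE */
```

Running it with increasing lengths (for example 1000, 100000, 10000000) would show whether the time spent in rl_callback_handler_install() grows faster than linearly. Since readline prints the prompt, redirecting standard output to /dev/null is probably advisable for large values.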

> I've searched R-devel and see minimal discussion of security threats in R. Has anybody fuzzed R with data or source files? As R grows in popularity, I hope there is some pro-active security work going on, which I understand may not always best be done on a public mailing list.

Keep in mind that R by design lets you run arbitrary code on the machine
without any restriction (e.g. via "system", "library", "dyn.load"), and
there is no API in R to restrict access to those and similar functions.
So, there is no point in exploiting, say, a buffer overflow bug. Of
course, a buffer overflow bug is still a correctness problem and will be
fixed if found and reported.

Tomas

>
> Jack Wasey
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel