R extension memory leak detection question

xiaoyan yu
I am writing a C++ program based on R extensions and am trying to test it
with Google's AddressSanitizer.

I thought that if I don't protect a variable returned by an allocation API
such as Rf_allocVector, there will be a memory leak. However, AddressSanitizer
didn't report one. Is my understanding correct? Or will I see the memory
leak only if I compile the R source code with AddressSanitizer?

 Please help!

Thanks,
Xiaoyan

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: R extension memory leak detection question

Tomas Kalibera
On 3/12/21 7:43 PM, xiaoyan yu wrote:
> I am writing a C++ program based on R extensions and am trying to test it
> with Google's AddressSanitizer.
>
> I thought that if I don't protect a variable returned by an allocation API
> such as Rf_allocVector, there will be a memory leak. However, AddressSanitizer
> didn't report one. Is my understanding correct? Or will I see the memory
> leak only if I compile the R source code with AddressSanitizer?

Yes, you should use special options for compilation and linking to use
address sanitizer. See Writing R Extensions, section 4.3.3.

If you allocate an R object using Rf_allocVector(), but don't protect
it, it means this object is available for the garbage collector to
reclaim. So it is not a memory leak.

Memory leaks with a garbage collector are much less common than without,
because if the program loses a pointer to some piece of memory, that
piece will automatically be reclaimed (not leaked). Still, memory leaks
are possible if the program forgets about a pointer to some piece of
memory no longer needed, and keeps that pointer in say some global
structure. Such memory leaks would not be found using address sanitizer.
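
The kind of leak described above can be illustrated with a plain C++ analogy that involves no R API. This is only a sketch: the global `g_results` and the three functions are hypothetical names invented for the illustration. Because every stored block stays reachable through the global, a leak checker that looks for unreachable memory has nothing to report, even though the program never uses the stored results again.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical global cache: every result is appended and never removed.
// The names here are illustrative only; nothing below is R API.
static std::vector<std::string> g_results;

// Simulates one "evaluation" that stores its result in the global cache.
// The caller never needs the result again, but the cache still owns it.
void evaluate_once(std::size_t bytes) {
    g_results.emplace_back(bytes, 'x');
}

// Total payload bytes still reachable through the global cache.
std::size_t retained_bytes() {
    std::size_t total = 0;
    for (const std::string& r : g_results) {
        total += r.size();
    }
    return total;
}

// The fix in this toy setting: drop the references once they are not needed.
void release_results() {
    g_results.clear();
    g_results.shrink_to_fit();
}
```

Calling `evaluate_once(1000)` three times leaves 3000 payload bytes retained until `release_results()` is called. In an R extension the analogous situation might be objects kept alive with R_PreserveObject() and never released; that is an assumption about where such references could live, not something established in this thread.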

Address sanitizer/Undefined behavior sanitizer can sometimes find errors
caused by the program forgetting to protect an R object, but this is
relatively rare, as they don't understand the R heap specifically, so you
cannot assume that if you create such an example, the error will always be
found.

Best
Tomas


Re: R extension memory leak detection question

Lionel Henry
> Still, memory leaks are possible if the program forgets about a
> pointer to some piece of memory no longer needed, and keeps that
> pointer in say some global structure. Such memory leaks would not be
> found using address sanitizer.

We had a few cases of this in the past. Given the difficulty of
tracing the leaking references, we wrote this package for taking
snapshots of the R heap and finding dominators and shortest paths
between nodes:

Repo: https://github.com/r-lib/memtools
Vignette: https://memtools.r-lib.org/articles/memtools.html

One issue that complicates taking snapshots is that R doesn't expose
the GC roots. In practice, only the precious list is needed I think.
Would you consider a patch that allows retrieving the precious list
for debugging purposes via a `.Internal()` call?

Best,
Lionel


On 3/15/21, Tomas Kalibera <[hidden email]> wrote:
> [...]

Re: R extension memory leak detection question

xiaoyan yu
Thank you all for your help.
We embedded R in our program and found that the process's memory
accumulates, whereas we expected it to go down after each R evaluation.
I started to write a test program with only a few lines of embedded R code
and found that the memory never went down, even after the R library is unloaded.
Please find more details in the readme and test program at
https://github.com/xiaoyanyuvt/RMemTest

Thanks,
Xiaoyan


On Fri, Mar 19, 2021 at 2:21 PM Lionel Henry <[hidden email]> wrote:
> [...]

Re: R extension memory leak detection question

Dirk Eddelbuettel

On 5 April 2021 at 18:27, xiaoyan yu wrote:
| Thank you all for your help.
| We embedded R in our program and found that the process's memory
| accumulates, whereas we expected it to go down after each R evaluation.
| I started to write a test program with only a few lines of embedded R code
| and found that the memory never went down, even after the R library is unloaded.
| Please find more details in the readme and test program at
| https://github.com/xiaoyanyuvt/RMemTest

You may find the projects RInside (for easily embedding R inside C++
programs) and littler (also embedding R, but using C only, for use in
lightweight cmdline applications) useful.  Those have existed for, give or
take, 10 and 15 years and have not proven to show memory leaks so I feel the
burden of proof is still on you.

Also I got your program to compile (after making the 'makefile' a bit more
general, and fixing two things upsetting current C++ compilers) but I am not
sure we really see memory consumption:

   edd@rob:~$ ps -fv $(pgrep -x foo)
       PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
   1456192 pts/9    S+     0:00      0     1  5890  1768  0.0 ./foo
   edd@rob:~$ ps -fv $(pgrep -x foo)
       PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
   1456192 pts/9    Sl+    0:00      0     1 1617174 9896  0.0 ./foo
   edd@rob:~$

Dirk

--
https://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]


Re: R extension memory leak detection question

xiaoyan yu
Thanks for your quick response. We were also surprised to notice the
memory accumulation, since it has been years since we developed our program.
Here is the memory usage I observed when running the test program: it went
from 15384k to 234208k to 242024k without decreasing.
[Before the first ENTER]$ps -aux | grep foo
xy 16985  0.0  0.0  *15384*  1312 pts/0    S+   00:09   0:00 ./foo
[After the first ENTER ]$ps -aux | grep foo
xy 16985  0.4  0.2 *234208* 42104 pts/0    S+   00:09   0:00 ./foo
[After the second ENTER and also before the program exit]$ps -aux | grep foo
xy    16985  0.1  0.2 *242024* 42244 pts/0    S+   00:09   0:00 ./foo
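
For reading the listings above: assuming the standard Linux `ps aux` column order (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), the starred figures are VSZ, the virtual size in KiB, and the column after it is RSS, the resident size in KiB. A minimal C++ sketch that pulls both values out of one such line follows. The struct `PsMemory` and the function `parse_ps_aux_line` are hypothetical helpers written for this illustration, and the emphasis asterisks have to be removed from a line before it is parsed.

```cpp
#include <sstream>
#include <string>

// Fields of interest from one `ps aux` data line (sizes are in KiB).
struct PsMemory {
    long pid = 0;
    long vsz_kb = 0;  // virtual memory size: address space mapped by the process
    long rss_kb = 0;  // resident set size: memory currently held in RAM
};

// Parses a line in the assumed column order
// USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND.
// Returns false if the leading six fields cannot be read.
bool parse_ps_aux_line(const std::string& line, PsMemory& out) {
    std::istringstream in(line);
    std::string user;
    double cpu = 0.0;
    double mem = 0.0;
    in >> user >> out.pid >> cpu >> mem >> out.vsz_kb >> out.rss_kb;
    return !in.fail();
}
```

Applied to the second listing with the asterisks removed, this yields a VSZ of 234208 and an RSS of 42104. Keeping the two columns apart matters when comparing readings, because VSZ counts mapped address space while RSS counts memory actually resident.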

The test program is just a small simplified portion of our program. We
observed even more memory in use when running our program. We would like to
try to understand more of the memory life cycle of the embedded R.

Thanks,
Xiaoyan


On Mon, Apr 5, 2021 at 6:53 PM Dirk Eddelbuettel <[hidden email]> wrote:
> [...]