custom allocators, Valgrind and uninitialized memory

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

custom allocators, Valgrind and uninitialized memory

Andreas Kersting
Hi,

In my package bettermc, I use a custom allocator, which hands out already defined/initialized memory (mmap of a POSIX shared memory object).

If my code is run in R which was configured --with-valgrind-instrumentation > 0, Valgrind will (correctly) complain about uninitialized memory being used, e.g.

==813836== Conditional jump or move depends on uninitialised value(s)
==813836==    at 0x4F0A9D: getvar (svn/R-devel/src/main/eval.c:5171)
==813836==    by 0x4D9B38: bcEval (svn/R-devel/src/main/eval.c:6867)
==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
==813836==    by 0x4F09AF: forcePromise (svn/R-devel/src/main/eval.c:555)
==813836==    by 0x4F0C57: FORCE_PROMISE (svn/R-devel/src/main/eval.c:5136)
==813836==    by 0x4F0C57: getvar (svn/R-devel/src/main/eval.c:5177)
==813836==    by 0x4D9B38: bcEval (svn/R-devel/src/main/eval.c:6867)
==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==813836==    by 0x4F2783: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==813836==    by 0x4DF61D: bcEval (svn/R-devel/src/main/eval.c:7083)
==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==813836==  Uninitialised value was created by a client request
==813836==    at 0x52D5CF: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2892)
==813836==    by 0x16B415EA: allocate_from_shm (packages/tests-vg/bettermc/src/copy2shm.c:289)
==813836==    by 0x49D123: R_doDotCall (svn/R-devel/src/main/dotcode.c:614)
==813836==    by 0x4DA36D: bcEval (svn/R-devel/src/main/eval.c:7671)
==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
==813836==    by 0x4F2783: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
==813836==    by 0x4F0243: Rf_eval (svn/R-devel/src/main/eval.c:850)
==813836==    by 0x49B68F: do_External (svn/R-devel/src/main/dotcode.c:573)
==813836==    by 0x4D3566: bcEval (svn/R-devel/src/main/eval.c:7115)
==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)

(allocate_from_shm() is my function calling allocVector3() with a custom allocator.) Valgrind is correct, because allocVector3() explicitly calls VALGRIND_MAKE_MEM_UNDEFINED() on the memory my custom allocator returns.

- Sould allocVector3() call VALGRIND_MAKE_MEM_UNDEFINED() also if a custom allocator is used? For some custom allocators this is correct, for others not.

- Or should the code using a custom allocator call VALGRIND_MAKE_MEM_DEFINED() on the DATAPTR() returned by allocVector3()? E.g.

...
ret = PROTECT(allocVector3(asInteger(type), asReal(length), &allocator));
VALGRIND_MAKE_MEM_DEFINED(DATAPTR(ret), size);
...

For the latter to work also on systems without Valgrind installed, I need to include both valgrind.h and memcheck.h in the src of my package and include these (rather than the system headers), correct? Should I best take these headers directly from R (src/include/vg)?

Thanks! Regards,
Andreas
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Andreas Kersting
Another idea for the second option. Instead of including the Valgrind headers, the following could be enough:

#if __has_include(<valgrind/memcheck.h>)
#include <valgrind/memcheck.h>
#else
#define VALGRIND_MAKE_MEM_DEFINED(_qzz_addr,_qzz_len)          \
  do {                                                         \
    (_qzz_addr);                                               \
    (_qzz_len);                                                \
  } while (0)
#endif

I guess the packages are build on the same CRAN machine which also runs the tests under Valgrind, i.e. valgrind/memcheck.h is available during build of the package!?

Not sure though if Oracle Developer Studio on Solaris supports __has_include() ...

2021-03-26 08:40 GMT+01:00 "Andreas Kersting" <[hidden email]>:

> Hi,
>
> In my package bettermc, I use a custom allocator, which hands out already defined/initialized memory (mmap of a POSIX shared memory object).
>
> If my code is run in R which was configured --with-valgrind-instrumentation > 0, Valgrind will (correctly) complain about uninitialized memory being used, e.g.
>
> ==813836== Conditional jump or move depends on uninitialised value(s)
> ==813836==    at 0x4F0A9D: getvar (svn/R-devel/src/main/eval.c:5171)
> ==813836==    by 0x4D9B38: bcEval (svn/R-devel/src/main/eval.c:6867)
> ==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
> ==813836==    by 0x4F09AF: forcePromise (svn/R-devel/src/main/eval.c:555)
> ==813836==    by 0x4F0C57: FORCE_PROMISE (svn/R-devel/src/main/eval.c:5136)
> ==813836==    by 0x4F0C57: getvar (svn/R-devel/src/main/eval.c:5177)
> ==813836==    by 0x4D9B38: bcEval (svn/R-devel/src/main/eval.c:6867)
> ==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
> ==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
> ==813836==    by 0x4F2783: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
> ==813836==    by 0x4DF61D: bcEval (svn/R-devel/src/main/eval.c:7083)
> ==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
> ==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
> ==813836==  Uninitialised value was created by a client request
> ==813836==    at 0x52D5CF: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2892)
> ==813836==    by 0x16B415EA: allocate_from_shm (packages/tests-vg/bettermc/src/copy2shm.c:289)
> ==813836==    by 0x49D123: R_doDotCall (svn/R-devel/src/main/dotcode.c:614)
> ==813836==    by 0x4DA36D: bcEval (svn/R-devel/src/main/eval.c:7671)
> ==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
> ==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
> ==813836==    by 0x4F2783: Rf_applyClosure (svn/R-devel/src/main/eval.c:1823)
> ==813836==    by 0x4F0243: Rf_eval (svn/R-devel/src/main/eval.c:850)
> ==813836==    by 0x49B68F: do_External (svn/R-devel/src/main/dotcode.c:573)
> ==813836==    by 0x4D3566: bcEval (svn/R-devel/src/main/eval.c:7115)
> ==813836==    by 0x4F0077: Rf_eval (svn/R-devel/src/main/eval.c:727)
> ==813836==    by 0x4F1A8D: R_execClosure (svn/R-devel/src/main/eval.c:1897)
>
> (allocate_from_shm() is my function calling allocVector3() with a custom allocator.) Valgrind is correct, because allocVector3() explicitly calls VALGRIND_MAKE_MEM_UNDEFINED() on the memory my custom allocator returns.
>
> - Sould allocVector3() call VALGRIND_MAKE_MEM_UNDEFINED() also if a custom allocator is used? For some custom allocators this is correct, for others not.
>
> - Or should the code using a custom allocator call VALGRIND_MAKE_MEM_DEFINED() on the DATAPTR() returned by allocVector3()? E.g.
>
> ...
> ret = PROTECT(allocVector3(asInteger(type), asReal(length), &allocator));
> VALGRIND_MAKE_MEM_DEFINED(DATAPTR(ret), size);
> ...
>
> For the latter to work also on systems without Valgrind installed, I need to include both valgrind.h and memcheck.h in the src of my package and include these (rather than the system headers), correct? Should I best take these headers directly from R (src/include/vg)?
>
> Thanks! Regards,
> Andreas
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Dirk Eddelbuettel

Andreas,

Can you briefly describe what it is you are trying to do?

In general, no R package would use valgrind directly; it is an optional
debugger. Also note _Writing R Extensions_ has a few things to say about how
memory destined for R object can and cannot be allocated -- I presume your
custom allocator is only for objects you managed and do not return to R?

Best, Dirk

--
https://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Andreas Kersting
Hi Dirk,

Sure, let me try to explain:

CRAN ran the tests of my package using R which was configured --with-valgrind-instrumentation > 0. Valgrind reported many errors related to the use of supposedly uninitialized memory and the CRAN team asked me to tackle these.

These errors are false positives, because I pass a custom allocator to allocVector3() which hands out memory which is already initialized. However, this memory is explicitly marked for Valgrind as uninitialized by allocVector3(), and I do not initialize it subsequently, so Valgrind complains.

Now I am asking if it is correct that allocVector3() marks memory as uninitialized/undefined, even if it comes from a custom allocator. This is because allocVector3() cannot know if the memory might already by initialized.

If this is the intended behavior of allocVector3(), then I am looking for the proper way to get rid of these false Valgrind errors. So to be able to more easily spot the true ones ...

Which section of _Writing R Extensions_ do you have in mind? I cannot find anything on custom allocators there, but maybe I am using the wrong search terms. No, these object are returned to R and I am not aware that this is a problem / not allowed.

Regards, Andreas

2021-03-26 19:51 GMT+01:00 "Dirk Eddelbuettel" <[hidden email]>:

>
> Andreas,
>
> Can you briefly describe what it is you are trying to do?
>
> In general, no R package would use valgrind directly; it is an optional
> debugger. Also note _Writing R Extensions_ has a few things to say about how
> memory destined for R object can and cannot be allocated -- I presume your
> custom allocator is only for objects you managed and do not return to R?
>
> Best, Dirk
>
> --
> https://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Dirk Eddelbuettel

Andreas,

Thanks for the clarification.

On 26 March 2021 at 20:48, Andreas Kersting wrote:
| Sure, let me try to explain:
|
| CRAN ran the tests of my package using R which was configured --with-valgrind-instrumentation > 0. Valgrind reported many errors related to the use of supposedly uninitialized memory and the CRAN team asked me to tackle these.
|
| These errors are false positives, because I pass a custom allocator to allocVector3() which hands out memory which is already initialized. However, this memory is explicitly marked for Valgrind as uninitialized by allocVector3(), and I do not initialize it subsequently, so Valgrind complains.
|
| Now I am asking if it is correct that allocVector3() marks memory as uninitialized/undefined, even if it comes from a custom allocator. This is because allocVector3() cannot know if the memory might already by initialized.
|
| If this is the intended behavior of allocVector3(), then I am looking for the proper way to get rid of these false Valgrind errors. So to be able to more easily spot the true ones ...
|
| Which section of _Writing R Extensions_ do you have in mind? I cannot find anything on custom allocators there, but maybe I am using the wrong search terms. No, these object are returned to R and I am not aware that this is a problem / not allowed.

I had the simpler section 6.1. in Writing R Extensions in mind, but you are
indeed using the allocVector3() interface from the public header in
R_ext/Rallocators.h -- but I only see scant mention of allocVector3 in R
Internals.  So I will have to pass. A search of CRAN via the Github mirror
[1] also reveals little ... but includes a hit from your package.

Dirk

[1] https://github.com/search?q=org%3Acran+allocVector3&type=code

--
https://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Tomas Kalibera
In reply to this post by Andreas Kersting
Hi Andreas,

On 3/26/21 8:48 PM, Andreas Kersting wrote:
> Hi Dirk,  > > Sure, let me try to explain: > > CRAN ran the tests of my package
using R which was configured > --with-valgrind-instrumentation > 0.
Valgrind reported many errors > related to the use of supposedly
uninitialized memory and the CRAN > team asked me to tackle these. > >
These errors are false positives, because I pass a custom allocator > to
allocVector3() which hands out memory which is already > initialized.
However, this memory is explicitly marked for Valgrind > as
uninitialized by allocVector3(), and I do not initialize it >
subsequently, so Valgrind complains. > > Now I am asking if it is
correct that allocVector3() marks memory as > uninitialized/undefined,
even if it comes from a custom allocator. > This is because
allocVector3() cannot know if the memory might > already by initialized.
I think the semantics of allocVector/allocVector3 should be the same
regardless of whether custom allocators are used. The semantics of
allocVector is to provide uninitialized memory (non-pointer types,
Writing R Extensions 5.9.2). Therefore, it is the caller who needs to
take care of initialization. This is also the semantics of "malloc" and
Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */".

So I think that your code using your custom allocator needs to
initialize allocated memory to be correct. If your allocator initializes
the memory, that is fine, but unnecessary.

So technically speaking, the valgrind reports are not false alarms. I
think your call sites should initialize.

Best
Tomas



        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Andreas Kersting
Hi Tomas,

Thanks for sharing your view on this! I understand your point, but still I think that the current situation is somewhat unfortunate:

I would argue that mmap() is a natural candidate to be used together with allocVector3(); it is even mentioned explicitly here: https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L2575-L2576

However, when using a non-anonymous mapping, i.e. we want mmap() to initialize the memory e.g. from a file or a POSIX shared memory object, this means that we need to use MAP_FIXED in case we are obliged to initialize the memory AFTER allocVector3() returned it; at least I cannot think of a different way to achieve this.

The use of MAP_FIXED
- is discouraged (e.g. https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html)
- requires two calls to mmap(): (1) to obtain the (anonymous) memory to be handed out by the custom allocater and (2) to actually map the file "over" the just allocated vector (using MAP_FIXED), which will overwrite the vector header; hence, we need to first back it up to later restore it

I have implemented my function using MAP_FIXED here: https://github.com/gfkse/bettermc/commit/f34c4f4c45c9ab11abe9b9e9b8b48064f128d731#diff-7098a5dde34efab163bbef27fe32f95c29e76236649479985d09c70100e4c737R278-R323

This solution, to me, is much more complicated and hacky than my previous one, which assumed it is OK to hand out already initialized memory directly from allocVector3().

Regards,
Andreas


2021-03-29 10:41 GMT+02:00 "Tomas Kalibera" <[hidden email]>:

> Hi Andreas,
> On 3/26/21 8:48 PM, Andreas Kersting wrote:
>> Hi Dirk,  > > Sure, let me try to explain: > > CRAN ran the tests of my package using R which was configured > --with-valgrind-instrumentation > 0. Valgrind reported many errors > related to the use of supposedly uninitialized memory and the CRAN > team asked me to tackle these. > > These errors are false positives, because I pass a custom allocator > to allocVector3() which hands out memory which is already > initialized. However, this memory is explicitly marked for Valgrind > as uninitialized by allocVector3(), and I do not initialize it > subsequently, so Valgrind complains. > > Now I am asking if it is correct that allocVector3() marks memory as > uninitialized/undefined, even if it comes from a custom allocator. > This is because allocVector3() cannot know if the memory might > already by initialized.
> I think the semantics of allocVector/allocVector3 should be the same regardless of whether custom allocators are used. The semantics of allocVector is to provide uninitialized memory (non-pointer types, Writing R Extensions 5.9.2). Therefore, it is the caller who needs to take care of initialization. This is also the semantics of "malloc" and Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */".
>
> So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary.
>
> So technically speaking, the valgrind reports are not false alarms. I think your call sites should initialize.
>
> Best
> Tomas
>
>
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Andreas Kersting
In reply to this post by Andreas Kersting
Hi Simon,

Yes, if this was acceptable on CRAN, I would agree that calling VALGRIND_MAKE_MEM_DEFINED() in my code would be sufficient.

But since Tomas said, "So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary.", I am not sure if it is acceptable.

Regards,
Andreas

2021-03-30 00:39 GMT+02:00 "Simon Urbanek" <[hidden email]>:

> Andres,
>
> correct me if I'm wrong, but the issue here is not initialisation but rather valgrind flagging. You simply have to call VALGRIND_MAKE_MEM_DEFINED() in your code after allocVector3() to declare that you have initialised the memory - or am I missing something?
>
> Cheers,
> Simon
>
>
>
>> On 30/03/2021, at 9:18 AM, Andreas Kersting <[hidden email]> wrote:
>>
>> Hi Tomas,
>>
>> Thanks for sharing your view on this! I understand your point, but still I think that the current situation is somewhat unfortunate:
>>
>> I would argue that mmap() is a natural candidate to be used together with allocVector3(); it is even mentioned explicitly here: https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L2575-L2576
>>
>> However, when using a non-anonymous mapping, i.e. we want mmap() to initialize the memory e.g. from a file or a POSIX shared memory object, this means that we need to use MAP_FIXED in case we are obliged to initialize the memory AFTER allocVector3() returned it; at least I cannot think of a different way to achieve this.
>>
>> The use of MAP_FIXED
>> - is discouraged (e.g. https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html)
>> - requires two calls to mmap(): (1) to obtain the (anonymous) memory to be handed out by the custom allocater and (2) to actually map the file "over" the just allocated vector (using MAP_FIXED), which will overwrite the vector header; hence, we need to first back it up to later restore it
>>
>> I have implemented my function using MAP_FIXED here: https://github.com/gfkse/bettermc/commit/f34c4f4c45c9ab11abe9b9e9b8b48064f128d731#diff-7098a5dde34efab163bbef27fe32f95c29e76236649479985d09c70100e4c737R278-R323
>>
>> This solution, to me, is much more complicated and hacky than my previous one, which assumed it is OK to hand out already initialized memory directly from allocVector3().
>>
>> Regards,
>> Andreas
>>
>>
>> 2021-03-29 10:41 GMT+02:00 "Tomas Kalibera" <[hidden email]>:
>>> Hi Andreas,
>>> On 3/26/21 8:48 PM, Andreas Kersting wrote:
>>>> Hi Dirk,  > > Sure, let me try to explain: > > CRAN ran the tests of my package using R which was configured > --with-valgrind-instrumentation > 0. Valgrind reported many errors > related to the use of supposedly uninitialized memory and the CRAN > team asked me to tackle these. > > These errors are false positives, because I pass a custom allocator > to allocVector3() which hands out memory which is already > initialized. However, this memory is explicitly marked for Valgrind > as uninitialized by allocVector3(), and I do not initialize it > subsequently, so Valgrind complains. > > Now I am asking if it is correct that allocVector3() marks memory as > uninitialized/undefined, even if it comes from a custom allocator. > This is because allocVector3() cannot know if the memory might > already by initialized.
>>> I think the semantics of allocVector/allocVector3 should be the same regardless of whether custom allocators are used. The semantics of allocVector is to provide uninitialized memory (non-pointer types, Writing R Extensions 5.9.2). Therefore, it is the caller who needs to take care of initialization. This is also the semantics of "malloc" and Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */".
>>>
>>> So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary.
>>>
>>> So technically speaking, the valgrind reports are not false alarms. I think your call sites should initialize.
>>>
>>> Best
>>> Tomas
>>>
>>>
>>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Tomas Kalibera
In reply to this post by Andreas Kersting
On 3/29/21 10:18 PM, Andreas Kersting wrote:
> Hi Tomas,
>
> Thanks for sharing your view on this! I understand your point, but still I think that the current situation is somewhat unfortunate:
>
> I would argue that mmap() is a natural candidate to be used together with allocVector3(); it is even mentioned explicitly here: https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L2575-L2576
>
> However, when using a non-anonymous mapping, i.e. we want mmap() to initialize the memory e.g. from a file or a POSIX shared memory object, this means that we need to use MAP_FIXED in case we are obliged to initialize the memory AFTER allocVector3() returned it; at least I cannot think of a different way to achieve this.

Hi Andreas,

I didn't mean to suggest that the memory to be returned by a custom
allocator must originate from malloc, mmap is fine, I was just arguing
that the mention of malloc in the header file is another indication that
the memory returned by a custom allocator may not be initialized. If
your allocator does initialize the memory anyway, it is fine and
appropriate to tell valgrind about it. As this is becoming rather too
technical and specific to the internals, feel free to take this offline
with Simon and me.

Best
Tomas

>
> The use of MAP_FIXED
> - is discouraged (e.g. https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html)
> - requires two calls to mmap(): (1) to obtain the (anonymous) memory to be handed out by the custom allocater and (2) to actually map the file "over" the just allocated vector (using MAP_FIXED), which will overwrite the vector header; hence, we need to first back it up to later restore it
>
> I have implemented my function using MAP_FIXED here: https://github.com/gfkse/bettermc/commit/f34c4f4c45c9ab11abe9b9e9b8b48064f128d731#diff-7098a5dde34efab163bbef27fe32f95c29e76236649479985d09c70100e4c737R278-R323
>
> This solution, to me, is much more complicated and hacky than my previous one, which assumed it is OK to hand out already initialized memory directly from allocVector3().
>
> Regards,
> Andreas
>
>
> 2021-03-29 10:41 GMT+02:00 "Tomas Kalibera" <[hidden email]>:
>> Hi Andreas,
>> On 3/26/21 8:48 PM, Andreas Kersting wrote:
>>> Hi Dirk,  > > Sure, let me try to explain: > > CRAN ran the tests of my package using R which was configured > --with-valgrind-instrumentation > 0. Valgrind reported many errors > related to the use of supposedly uninitialized memory and the CRAN > team asked me to tackle these. > > These errors are false positives, because I pass a custom allocator > to allocVector3() which hands out memory which is already > initialized. However, this memory is explicitly marked for Valgrind > as uninitialized by allocVector3(), and I do not initialize it > subsequently, so Valgrind complains. > > Now I am asking if it is correct that allocVector3() marks memory as > uninitialized/undefined, even if it comes from a custom allocator. > This is because allocVector3() cannot know if the memory might > already by initialized.
>> I think the semantics of allocVector/allocVector3 should be the same regardless of whether custom allocators are used. The semantics of allocVector is to provide uninitialized memory (non-pointer types, Writing R Extensions 5.9.2). Therefore, it is the caller who needs to take care of initialization. This is also the semantics of "malloc" and Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */".
>>
>> So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary.
>>
>> So technically speaking, the valgrind reports are not false alarms. I think your call sites should initialize.
>>
>> Best
>> Tomas
>>
>>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Kasper Daniel Hansen-2
On Tue, Mar 30, 2021 at 9:39 AM Tomas Kalibera <[hidden email]>
wrote:

<SNIP>

> appropriate to tell valgrind about it. As this is becoming rather too
> technical and specific to the internals, feel free to take this offline
> with Simon and me.
>

Respectfully, this seems to me to be exactly the kind of exchange R-devel
is meant for.
--
Best,
Kasper

        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: custom allocators, Valgrind and uninitialized memory

Andreas Kersting
In reply to this post by Andreas Kersting
Hi Simon, hi Tomas,

Let me try to wrap up this discussion:

- "What does any of this to do with CRAN?"
Not much, I agree. It is just that this whole issue arose because the CRAN team asked me to fix the use of uninitialized memory as reported by Valgrind. Sorry for mixing things up here.

- "I don't quite get your earlier response about allocating *after* the call since that makes no sense to me"
I was talking about *initializing* after the call as originally suggested by Tomas and - as I wrote - I also do not like my proposal involving MAP_FIXED.

- bottom line
allocVector3() is correctly marking memory as uninitialized because it cannot safely assume otherwise. It is ok for a custom allocator to return already initialized memory and inform Valgrind about this fact.

I hope, this summarizes it well.

Thanks for your time and support, Tomas and Simon. Very much appreciated!

Regards,
Andreas


2021-03-30 10:03 GMT+02:00 "Simon Urbanek" <[hidden email]>:

> Andreas,
>
> What does any of this to do with CRAN? This not a the CRAN list - we're discussing the proper approach of using valgrind and R can only assume that the memory is uninitialised (since it cannot safely assume anything else) so it is up to you to declare the memory as initialised if you can guarantee that being true. I don't quite get your earlier response about allocating *after* the call since that makes no sense to me - the whole point of a custom allocator is to allow you to allocate the memory, so whether it is initialised or not is under your control - but that means also it is your responsibility to flag the state accordingly. Note, however, that this is not merely just true by the virtue of using mmap - the memory content is only valid (initialised) if you used mmap with previously initialised content. Again, entirely up to you to decide what the semantics are since you are the author of the custom allocator. Does that make sense?
>
> Cheers,
> Simon
>
>
>
>> On Mar 30, 2021, at 18:27, Andreas Kersting <[hidden email]> wrote:
>>
>> Hi Simon,
>>
>> Yes, if this was acceptable on CRAN, I would agree that calling VALGRIND_MAKE_MEM_DEFINED() in my code would be sufficient.
>>
>> But since Tomas said, "So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary.", I am not sure if it is acceptable.
>>
>> Regards,
>> Andreas
>>
>> 2021-03-30 00:39 GMT+02:00 "Simon Urbanek" <[hidden email]>:
>>> Andres,
>>>
>>> correct me if I'm wrong, but the issue here is not initialisation but rather valgrind flagging. You simply have to call VALGRIND_MAKE_MEM_DEFINED() in your code after allocVector3() to declare that you have initialised the memory - or am I missing something?
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>
>>>> On 30/03/2021, at 9:18 AM, Andreas Kersting <[hidden email]> wrote:
>>>>
>>>> Hi Tomas,
>>>>
>>>> Thanks for sharing your view on this! I understand your point, but still I think that the current situation is somewhat unfortunate:
>>>>
>>>> I would argue that mmap() is a natural candidate to be used together with allocVector3(); it is even mentioned explicitly here: https://github.com/wch/r-source/blob/trunk/src/main/memory.c#L2575-L2576
>>>>
>>>> However, when using a non-anonymous mapping, i.e. we want mmap() to initialize the memory e.g. from a file or a POSIX shared memory object, this means that we need to use MAP_FIXED in case we are obliged to initialize the memory AFTER allocVector3() returned it; at least I cannot think of a different way to achieve this.
>>>>
>>>> The use of MAP_FIXED
>>>> - is discouraged (e.g. https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html)
>>>> - requires two calls to mmap(): (1) to obtain the (anonymous) memory to be handed out by the custom allocater and (2) to actually map the file "over" the just allocated vector (using MAP_FIXED), which will overwrite the vector header; hence, we need to first back it up to later restore it
>>>>
>>>> I have implemented my function using MAP_FIXED here: https://github.com/gfkse/bettermc/commit/f34c4f4c45c9ab11abe9b9e9b8b48064f128d731#diff-7098a5dde34efab163bbef27fe32f95c29e76236649479985d09c70100e4c737R278-R323
>>>>
>>>> This solution, to me, is much more complicated and hacky than my previous one, which assumed it is OK to hand out already initialized memory directly from allocVector3().
>>>>
>>>> Regards,
>>>> Andreas
>>>>
>>>>
>>>> 2021-03-29 10:41 GMT+02:00 "Tomas Kalibera" <[hidden email]>:
>>>>> Hi Andreas,
>>>>> On 3/26/21 8:48 PM, Andreas Kersting wrote:
>>>>>> Hi Dirk,  > > Sure, let me try to explain: > > CRAN ran the tests of my package using R which was configured > --with-valgrind-instrumentation > 0. Valgrind reported many errors > related to the use of supposedly uninitialized memory and the CRAN > team asked me to tackle these. > > These errors are false positives, because I pass a custom allocator > to allocVector3() which hands out memory which is already > initialized. However, this memory is explicitly marked for Valgrind > as uninitialized by allocVector3(), and I do not initialize it > subsequently, so Valgrind complains. > > Now I am asking if it is correct that allocVector3() marks memory as > uninitialized/undefined, even if it comes from a custom allocator. > This is because allocVector3() cannot know if the memory might > already by initialized.
>>>>> I think the semantics of allocVector/allocVector3 should be the same regardless of whether custom allocators are used. The semantics of allocVector is to provide uninitialized memory (non-pointer types, Writing R Extensions 5.9.2). Therefore, it is the caller who needs to take care of initialization. This is also the semantics of "malloc" and Rallocators.h says "custom_alloc_t mem_alloc; /* malloc equivalent */".
>>>>>
>>>>> So I think that your code using your custom allocator needs to initialize allocated memory to be correct. If your allocator initializes the memory, that is fine, but unnecessary.
>>>>>
>>>>> So technically speaking, the valgrind reports are not false alarms. I think your call sites should initialize.
>>>>>
>>>>> Best
>>>>> Tomas
>>>>>
>>>>>
>>>>>
>>>> ______________________________________________
>>>> [hidden email] mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>>
>>>
>> ______________________________________________
>> [hidden email] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel