Race condition on parallel package's mcexit and rmChild

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Race condition on parallel package's mcexit and rmChild

Sun Yijiang
I've been hacking with parallel package for some time and built a
parallel processing framework with it.  However, although very rarely,
I did notice "ignoring SIGPIPE signal" error every now and then.
After a deep dig into the source code, I think I found something worth
noticing.

In short, wring to pipe in the C function mc_exit(SEXP sRes) may cause
a SIGPIPE.  Code from src/library/parallel/src/fork.c:

SEXP NORET mc_exit(SEXP sRes)
{
    int res = asInteger(sRes);
... ...
    if (master_fd != -1) { /* send 0 to signify that we're leaving */
        size_t len = 0;
        /* assign result for Fedora security settings */
        ssize_t n = write(master_fd, &len, sizeof(len));
... ...
}

So a pipe write is made in mc_exit, and here's how this function is
used in src/library/parallel/R/unix/mcfork.R:

mcexit <- function(exit.code = 0L, send = NULL)
{
    if (!is.null(send)) try(sendMaster(send), silent = TRUE)
    .Call(C_mc_exit, as.integer(exit.code))
}

Between sendMaster() and mc_exit() calls, which are made in the child
process, the master process may call readChild() followed by
rmChild().  rmChild closes the pipe on the master side, and if it's
called before child calls mc_exit, a SIGPIPE will be raised when child
tries to write to the pipe in mc_exit.

rmChild is defined but not used in parallel package, so this problem
won't surface in most cases.  However, it is a useful API and may be
used by users like me for advanced control over child processes.  I
hope we can discuss a solution on it.

In fact, I don't see why we need to write to the pipe on child exit
and how it has anything to do with "Fedora security settings" as
suggested in the comments.  Removing it, IMHO, would be a good and
clean way to solve this problem.

Regards,
Yijiang

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Race condition on parallel package's mcexit and rmChild

Tomas Kalibera
This issue has already been addressed in 76462 (R-devel) and also ported
to R-patched. In fact rmChild() is used in mccollect(wait=FALSE).

Best
Tomas

On 5/19/19 11:39 AM, Sun Yijiang wrote:

> I've been hacking with parallel package for some time and built a
> parallel processing framework with it.  However, although very rarely,
> I did notice "ignoring SIGPIPE signal" error every now and then.
> After a deep dig into the source code, I think I found something worth
> noticing.
>
> In short, wring to pipe in the C function mc_exit(SEXP sRes) may cause
> a SIGPIPE.  Code from src/library/parallel/src/fork.c:
>
> SEXP NORET mc_exit(SEXP sRes)
> {
>      int res = asInteger(sRes);
> ... ...
>      if (master_fd != -1) { /* send 0 to signify that we're leaving */
>          size_t len = 0;
>          /* assign result for Fedora security settings */
>          ssize_t n = write(master_fd, &len, sizeof(len));
> ... ...
> }
>
> So a pipe write is made in mc_exit, and here's how this function is
> used in src/library/parallel/R/unix/mcfork.R:
>
> mcexit <- function(exit.code = 0L, send = NULL)
> {
>      if (!is.null(send)) try(sendMaster(send), silent = TRUE)
>      .Call(C_mc_exit, as.integer(exit.code))
> }
>
> Between sendMaster() and mc_exit() calls, which are made in the child
> process, the master process may call readChild() followed by
> rmChild().  rmChild closes the pipe on the master side, and if it's
> called before child calls mc_exit, a SIGPIPE will be raised when child
> tries to write to the pipe in mc_exit.
>
> rmChild is defined but not used in parallel package, so this problem
> won't surface in most cases.  However, it is a useful API and may be
> used by users like me for advanced control over child processes.  I
> hope we can discuss a solution on it.
>
> In fact, I don't see why we need to write to the pipe on child exit
> and how it has anything to do with "Fedora security settings" as
> suggested in the comments.  Removing it, IMHO, would be a good and
> clean way to solve this problem.
>
> Regards,
> Yijiang
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Race condition on parallel package's mcexit and rmChild

Sun Yijiang
Have read the latest code, but I still don't understand why mc_exit
needs to write zero on exit.  If a child closes its pipe, parent will
know that on next select.

Best,
Yijiang

Tomas Kalibera <[hidden email]> 于2019年5月20日周一 下午10:52写道:

>
> This issue has already been addressed in 76462 (R-devel) and also ported
> to R-patched. In fact rmChild() is used in mccollect(wait=FALSE).
>
> Best
> Tomas
>
> On 5/19/19 11:39 AM, Sun Yijiang wrote:
> > I've been hacking with parallel package for some time and built a
> > parallel processing framework with it.  However, although very rarely,
> > I did notice "ignoring SIGPIPE signal" error every now and then.
> > After a deep dig into the source code, I think I found something worth
> > noticing.
> >
> > In short, wring to pipe in the C function mc_exit(SEXP sRes) may cause
> > a SIGPIPE.  Code from src/library/parallel/src/fork.c:
> >
> > SEXP NORET mc_exit(SEXP sRes)
> > {
> >      int res = asInteger(sRes);
> > ... ...
> >      if (master_fd != -1) { /* send 0 to signify that we're leaving */
> >          size_t len = 0;
> >          /* assign result for Fedora security settings */
> >          ssize_t n = write(master_fd, &len, sizeof(len));
> > ... ...
> > }
> >
> > So a pipe write is made in mc_exit, and here's how this function is
> > used in src/library/parallel/R/unix/mcfork.R:
> >
> > mcexit <- function(exit.code = 0L, send = NULL)
> > {
> >      if (!is.null(send)) try(sendMaster(send), silent = TRUE)
> >      .Call(C_mc_exit, as.integer(exit.code))
> > }
> >
> > Between sendMaster() and mc_exit() calls, which are made in the child
> > process, the master process may call readChild() followed by
> > rmChild().  rmChild closes the pipe on the master side, and if it's
> > called before child calls mc_exit, a SIGPIPE will be raised when child
> > tries to write to the pipe in mc_exit.
> >
> > rmChild is defined but not used in parallel package, so this problem
> > won't surface in most cases.  However, it is a useful API and may be
> > used by users like me for advanced control over child processes.  I
> > hope we can discuss a solution on it.
> >
> > In fact, I don't see why we need to write to the pipe on child exit
> > and how it has anything to do with "Fedora security settings" as
> > suggested in the comments.  Removing it, IMHO, would be a good and
> > clean way to solve this problem.
> >
> > Regards,
> > Yijiang
> >
> > ______________________________________________
> > [hidden email] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Reply | Threaded
Open this post in threaded view
|

Re: Race condition on parallel package's mcexit and rmChild

Simon Urbanek
Because that's the communication protocol between the parent and child. There is a difference between unsolicited exit and empty result exit.

Cheers,
Simon


> On May 20, 2019, at 11:22 AM, Sun Yijiang <[hidden email]> wrote:
>
> Have read the latest code, but I still don't understand why mc_exit
> needs to write zero on exit.  If a child closes its pipe, parent will
> know that on next select.
>
> Best,
> Yijiang
>
> Tomas Kalibera <[hidden email]> 于2019年5月20日周一 下午10:52写道:
>>
>> This issue has already been addressed in 76462 (R-devel) and also ported
>> to R-patched. In fact rmChild() is used in mccollect(wait=FALSE).
>>
>> Best
>> Tomas
>>
>> On 5/19/19 11:39 AM, Sun Yijiang wrote:
>>> I've been hacking with parallel package for some time and built a
>>> parallel processing framework with it.  However, although very rarely,
>>> I did notice "ignoring SIGPIPE signal" error every now and then.
>>> After a deep dig into the source code, I think I found something worth
>>> noticing.
>>>
>>> In short, wring to pipe in the C function mc_exit(SEXP sRes) may cause
>>> a SIGPIPE.  Code from src/library/parallel/src/fork.c:
>>>
>>> SEXP NORET mc_exit(SEXP sRes)
>>> {
>>>     int res = asInteger(sRes);
>>> ... ...
>>>     if (master_fd != -1) { /* send 0 to signify that we're leaving */
>>>         size_t len = 0;
>>>         /* assign result for Fedora security settings */
>>>         ssize_t n = write(master_fd, &len, sizeof(len));
>>> ... ...
>>> }
>>>
>>> So a pipe write is made in mc_exit, and here's how this function is
>>> used in src/library/parallel/R/unix/mcfork.R:
>>>
>>> mcexit <- function(exit.code = 0L, send = NULL)
>>> {
>>>     if (!is.null(send)) try(sendMaster(send), silent = TRUE)
>>>     .Call(C_mc_exit, as.integer(exit.code))
>>> }
>>>
>>> Between sendMaster() and mc_exit() calls, which are made in the child
>>> process, the master process may call readChild() followed by
>>> rmChild().  rmChild closes the pipe on the master side, and if it's
>>> called before child calls mc_exit, a SIGPIPE will be raised when child
>>> tries to write to the pipe in mc_exit.
>>>
>>> rmChild is defined but not used in parallel package, so this problem
>>> won't surface in most cases.  However, it is a useful API and may be
>>> used by users like me for advanced control over child processes.  I
>>> hope we can discuss a solution on it.
>>>
>>> In fact, I don't see why we need to write to the pipe on child exit
>>> and how it has anything to do with "Fedora security settings" as
>>> suggested in the comments.  Removing it, IMHO, would be a good and
>>> clean way to solve this problem.
>>>
>>> Regards,
>>> Yijiang
>>>
>>> ______________________________________________
>>> [hidden email] mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel