Single-threaded aspect

Charles Determan
R Developers,

Could someone help explain what it means that R is single threaded?  I am
trying to understand what is actually going on inside R when users want to
parallelize code.  For example, using mclapply or foreach (with some
backend) somehow allows users to benefit from multiple CPUs.

Similarly there is the RcppParallel package for RMatrix/RVector objects.
But none of these address the general XPtr objects in Rcpp.  Some readers
here may recognize my question on SO (
http://stackoverflow.com/questions/37167479/rcpp-parallelize-functions-that-return-xptr)
where I was curious about parallel calls to C++/Rcpp functions that return
XPtr objects.  I am being a little more persistent here as this limitation
puts a very hard stop on the development of one of my packages that
heavily uses XPtr objects.  This is not meant as criticism or intended to
be rude; I just want to fully understand.

I am willing to accept that it may be impossible currently but I want to at
least understand why it is impossible so I can explain to future users why
parallel functionality is not available.  Which just echoes my original
question, what does it mean that R is single threaded?

Kind Regards,
Charles

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: Single-threaded aspect

Duncan Murdoch
On 12/05/2016 8:45 AM, Charles Determan wrote:
> R Developers,
>
> Could someone help explain what it means that R is single threaded?  I am
> trying to understand what is actually going on inside R when users want to
> parallelize code.  For example, using mclapply or foreach (with some
> backend) somehow allows users to benefit from multiple CPUs.

I don't know what document you are quoting when you say "R is single
threaded", but one possible meaning is that most base R calculations are
done in a single thread.  When you do vectorized calculations like x+y
for long vectors x and y, they are done internally as loops over the
entries.

On Windows, there are two threads when running Rterm, with one to
maintain the display, since otherwise the plot display couldn't update
while R is waiting for input.

The mclapply function in the parallel package forks the process to do
its calculations.
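
As a rough illustration of what forking means at the operating-system level, here is a minimal sketch assuming a POSIX system. It is not how mclapply is implemented; the names `sum_squares` and `forked_sum_squares` are hypothetical. The child process gets its own copy of the parent's memory, computes its share, and sends the result back through a pipe, so the two processes never share mutable state.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: sum of squares over x[begin, end).
static double sum_squares(const std::vector<double>& x,
                          std::size_t begin, std::size_t end) {
    double s = 0.0;
    for (std::size_t i = begin; i < end; ++i) s += x[i] * x[i];
    return s;
}

// Computes the second half in a forked child and the first half in the
// parent, then combines the two partial results. POSIX only.
double forked_sum_squares(const std::vector<double>& x) {
    int fd[2];
    if (pipe(fd) != 0) throw std::runtime_error("pipe failed");

    const std::size_t mid = x.size() / 2;
    const pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        throw std::runtime_error("fork failed");
    }

    if (pid == 0) {
        // Child: works on its own copy of the parent's memory.
        close(fd[0]);
        const double part = sum_squares(x, mid, x.size());
        const ssize_t n = write(fd[1], &part, sizeof part);
        close(fd[1]);
        _exit(n == static_cast<ssize_t>(sizeof part) ? 0 : 1);
    }

    // Parent: does its own share, then collects the child's result.
    close(fd[1]);
    const double mine = sum_squares(x, 0, mid);
    double theirs = 0.0;
    const ssize_t n = read(fd[0], &theirs, sizeof theirs);
    close(fd[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    if (n != static_cast<ssize_t>(sizeof theirs) ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        throw std::runtime_error("child process failed");
    }
    return mine + theirs;
}
```

Because the result has to travel back through a pipe as plain bytes, anything that only makes sense inside one process's address space (such as a raw pointer) cannot simply be returned from the child.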

Other packages can do other variations on parallel computations.

I can't help you with the rest of your question, as I don't know what XPtr
objects are.

Duncan Murdoch

Re: Single-threaded aspect

Mark van der Loo
In reply to this post by Charles Determan
Charles,

1. Perhaps this question is better directed at the R-help or
R-package-devel mailing list.

2. It basically means that R itself can only evaluate one R expression at
a time.

The parallel package circumvents this by starting multiple R sessions and
dividing the workload among them.

Compiled code called by R (such as C++ code through Rcpp or C code through
base R's interface) can execute multi-threaded code for internal purposes,
using e.g. OpenMP. A limitation is that compiled code cannot call R's C API
from multiple threads (in many cases). For example, it is not thread-safe
to create R variables from multiple threads running in C. (R's variable
administration is such that the order of (un)making them from compiled code
matters.)
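
One way to picture the remark about ordering: R's C API keeps objects alive with PROTECT and releases them with UNPROTECT, which pops the most recently protected entries. The sketch below is a toy model of such a last-in, first-out stack, not R's implementation; `ProtectStack` and both functions are hypothetical names. The interleaving of two callers is simulated sequentially so that the outcome is deterministic.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model of a LIFO protection stack (hypothetical, not R's code).
class ProtectStack {
public:
    void protect(const std::string& name) { stack_.push_back(name); }

    // Releases the most recently protected entry and reports which it was.
    std::string unprotect() {
        if (stack_.empty()) throw std::logic_error("stack imbalance");
        std::string top = stack_.back();
        stack_.pop_back();
        return top;
    }

    std::size_t size() const { return stack_.size(); }

private:
    std::vector<std::string> stack_;
};

// One caller: protect and unprotect pair up as intended.
std::string single_caller() {
    ProtectStack ps;
    ps.protect("a");
    return ps.unprotect();  // releases "a"
}

// Two callers sharing one stack, with their steps interleaved.
std::string interleaved_callers() {
    ProtectStack ps;
    ps.protect("a");        // caller A protects its object
    ps.protect("b");        // caller B protects its object in between
    return ps.unprotect();  // caller A releases "its" entry, but B's is on top
}
```

In the interleaved case caller A ends up releasing B's entry, which is the kind of bookkeeping error that uncoordinated threads could cause on a shared stack.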

I am not very savvy on Rcpp or XPtr objects, but it appears that Dirk
provided answers about that in your SO-question.

Best,
Mark

Re: Single-threaded aspect

Charles Determan
Thanks for the replies.  Regarding the answer by Dirk, I still didn't feel
that I understood why mclapply or foreach cannot handle XPtr objects.
Rather than clutter the SO question with comments, I asked here, because I
was getting the impression that this is a limitation inherited from R
objects (XPtr being a proxy for an R object, according to Dirk's comment).
If this is not the case, I could repost this on Rcpp-devel unless it could
be migrated.

Regards,
Charles

Re: Single-threaded aspect

Simon Urbanek
As others said, XPtr is not something in R, so the Rcpp mailing list would be the right place for that aspect.

However, if you forget Rcpp and phrase it as an R question, you also get much closer to the reason and the answer. The SEXP type is the internal representation of all objects in R. I assume your question is which operations in the R API on those are thread-safe. The answer is that most of them are not, the main reason being that the memory management is not thread-safe, i.e. you cannot allocate anything without synchronization. Since almost all API calls involve some memory allocation, they are not thread-safe.

You can, however, allocate objects and then operate on their payload: e.g., you can get numerical input vectors, allocate the result vector, perform your threaded computation in C on those, synchronize and get back. That's how most implicit parallel operations in R work (leveraging BLAS, OpenMP, etc.). That is also what Dirk replied to your SO question (quote: "Packages like RcppParallel are very careful about using non-R data structures for multithreaded work."). Note that the payload of most native vectors (integer, real, complex) is technically a non-R data structure in that sense, so you can operate on those directly (some read-only operations in the API are also thread-safe as long as they can't trigger errors, warnings or side effects).
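
The allocate-first pattern described above can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not R or Rcpp code: `parallel_sqrt` and `sqrt_all` are hypothetical names, and the raw pointers stand in for the payload of vectors that were allocated before the parallel work began. The OpenMP pragma takes effect only when the compiler is run with OpenMP enabled (for example `-fopenmp` with GCC); otherwise the loop runs serially with the same result.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// `in` and `out` stand in for payload pointers of vectors that were
// allocated on the calling thread before any parallel work starts.
void parallel_sqrt(const double* in, double* out, std::ptrdiff_t n) {
    // Each iteration writes only out[i], so no two threads touch the same
    // element, and nothing in the loop allocates memory or reports errors.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        out[i] = std::sqrt(in[i]);
    }
    // The implicit barrier at the end of the loop is the synchronization
    // point: all writes are complete once the function returns.
}

// Calling-thread wrapper: allocate first, compute in parallel, then return.
std::vector<double> sqrt_all(const std::vector<double>& x) {
    std::vector<double> result(x.size());  // the only allocation happens here
    parallel_sqrt(x.data(), result.data(),
                  static_cast<std::ptrdiff_t>(x.size()));
    return result;
}
```

The design point is the split of responsibilities: everything that allocates stays on the calling thread, and the workers see nothing but raw numeric buffers.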

For completeness, memory allocation is not the only obstacle to thread-safe R API calls, but it is a main one. Other issues involve error handling (you may long-jump out of your thread stack) and global state (devices, connections etc.). In short, it's not something that can really be solved without a complete redesign and rewrite.
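
A loose C++ analogy for the error-handling issue, offered as a sketch only: R signals errors with a long jump rather than with C++ exceptions, but the practical rule is similar, namely that a failure must not unwind out of a worker. In the hypothetical `inverse_all` below, each failure is recorded inside the parallel loop and re-raised on the calling thread once the loop has finished. `checked_inverse` is likewise an invented helper, and the pragmas are ignored (serial execution) if OpenMP is not enabled.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical per-element task that can fail.
static double checked_inverse(double v) {
    if (v == 0.0) throw std::domain_error("division by zero");
    return 1.0 / v;
}

// Returns the element-wise inverse. If any element fails, one of the
// captured failures is rethrown on the calling thread after the loop.
std::vector<double> inverse_all(const std::vector<double>& x) {
    std::vector<double> out(x.size());
    std::exception_ptr captured;  // shared; written under a critical section
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(x.size());

    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        try {
            out[i] = checked_inverse(x[i]);
        } catch (...) {
            // Never let an exception leave the worker: record it instead.
            #pragma omp critical
            {
                if (!captured) captured = std::current_exception();
            }
        }
    }

    if (captured) std::rethrow_exception(captured);  // calling thread only
    return out;
}
```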

Cheers,
Simon


Re: Single-threaded aspect

Bill Dunlap
In reply to this post by Charles Determan
The R language itself has features that limit how much
multithreading/parallel processing can be done.  There are functions with
side effects, such as library(), plot(), runif(), <-, and <<-, and there are
no mechanisms to isolate them.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

Re: Single-threaded aspect

Dirk Eddelbuettel
In reply to this post by Mark van der Loo

On 12 May 2016 at 13:11, Mark van der Loo wrote:
| Charles,
|
| 1. Perhaps this question is better directed at the R-help or
| R-package-devel mailing list.
|
| 2. It basically means that R itself can only evaluate one R expression at
| a time.
|
| The parallel package circumvents this by starting multiple R sessions and
| dividing the workload among them.
|
| Compiled code called by R (such as C++ code through Rcpp or C code through
| base R's interface) can execute multi-threaded code for internal purposes,
| using e.g. OpenMP. A limitation is that compiled code cannot call R's C API
| from multiple threads (in many cases). For example, it is not thread-safe
| to create R variables from multiple threads running in C. (R's variable
| administration is such that the order of (un)making them from compiled code
| matters.)

Well put.

| I am not very savvy on Rcpp or XPtr objects, but it appears that Dirk
| provided answers about that in your SO-question.

Charles seems to be getting completely hung up on a small detail, failing to
see the forest for the trees.

There are (many) working examples of parallel (compiled) code with R. All of
them stress (and I simplify here) that you cannot touch R objects, or call
back into R, for fear of any assignment or allocation triggering an R event.
R being single-threaded, it cannot do this.

My answer to this problem is to only use non-R data structures. That is what
RcppParallel does in the actual parallel code portions in all examples --
types RVector and RMatrix do NOT connect back to R. There are several working
examples.  That is also what the OpenMP examples at the Rcpp Gallery do.

Charles seems to be replying 'but I use XPtr' or 'I use XPtr on arma::mat or
Eigen::MatrixXd' and seems to forget that these are proxy objects to SEXPs.
XPtr just wraps the SEXP for external pointers; Arma's and Eigen's matrices
are performant via RcppArmadillo and RcppEigen because we use R memory via
proxies.  All of that is 'too close to R' for comfort.

So the short answer is:  enter compiled code from R, set a mutex (either
conceptually or explicitly), _copy_ your data into plain C++ data structures
and go to town in parallel via OpenMP and other multithreaded approaches.
Then collect the result, release the mutex and move back up.
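
The recipe above might look roughly like this in plain C++. This is a sketch under stated assumptions rather than Rcpp code: `MatrixHandle` is a hypothetical stand-in for data reached through an opaque handle, `column_sums` and `entry_mutex` are invented names, and the OpenMP pragma is ignored (serial execution) unless OpenMP is enabled at compile time.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for data reachable only through an opaque handle
// (the role an external pointer plays); this is not an Rcpp type.
struct MatrixHandle {
    const double* data;  // column-major payload owned elsewhere
    std::size_t nrow;
    std::size_t ncol;
};

static std::mutex entry_mutex;  // the explicit mutex: one caller at a time

std::vector<double> column_sums(const MatrixHandle& h) {
    std::lock_guard<std::mutex> lock(entry_mutex);

    // 1. Copy the data into a plain C++ container on the calling thread.
    const std::vector<double> local(h.data, h.data + h.nrow * h.ncol);

    // 2. Parallel section: reads `local`, writes one slot per column.
    std::vector<double> sums(h.ncol, 0.0);
    const std::ptrdiff_t ncol = static_cast<std::ptrdiff_t>(h.ncol);
    #pragma omp parallel for
    for (std::ptrdiff_t j = 0; j < ncol; ++j) {
        double s = 0.0;
        for (std::size_t i = 0; i < h.nrow; ++i) {
            s += local[static_cast<std::size_t>(j) * h.nrow + i];
        }
        sums[static_cast<std::size_t>(j)] = s;
    }

    // 3. Collect: the result is handed back on the calling thread, and the
    //    mutex is released when `lock` goes out of scope.
    return sums;
}
```

The copy costs memory and time, but it guarantees that the parallel section never reads or writes anything owned by another runtime.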

I hope this helps.

Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]

Re: Single-threaded aspect

Charles Determan
Thank you Simon for the detailed reply.  That explains much more of what I
was looking for from the R side.

Dirk, I'm sorry if I seem hung up on anything here but I am trying to
understand the details.  My reply about XPtr or XPtr on arma/Eigen was to
confirm my understanding was correct, which it appears it was.  I was not
aware that the RVector/RMatrix objects don't connect to R, as I am just now
familiarizing myself with the package; that explains more of my confusion.
I will look at doing work within the compiled code as you have suggested.

Regards,
Charles

Re: Single-threaded aspect

Dirk Eddelbuettel
In reply to this post by Dirk Eddelbuettel

On 12 May 2016 at 09:18, Dirk Eddelbuettel wrote:
| There are (many) working examples of parallel (compiled) code with R. All of
| them stress (and I simplify here) that can you touch R objects, or call back

An important 'not' is missing here (and a reordering):  "that you CANNOT touch R objects"

Sorry, Dirk

| into R, for fear of any assignment or allocation triggering an R event.  R
| being single-threaded it cannot do this.
|
| My answer to this problem is to only use non-R data structures. That is what
| RcpParallel does in the actual parallel code portions in all examples --
| types RVector and RMatrix do NOT connect back to R. There are several working
| examples.  That is also what the OpenMP examples at the Rcpp Gallery do.
|
| Charles seems to be replying 'but I use XPtr' or 'I use XPtr on arma::mat or
| Eigen::Matrixxd' and seems to forget that these are proxy objects to SEXPs.
| XPtr just wrap the SEXP for external pointers; Arma's and Eigen's matrices
| are performant via RcppArmadillo and RcppEigen because we use R memory via
| proxies.  All of that is 'too close to R' for comfort.
|
| So the short answer is:  enter compiled code from R, set a mutex (either
| conceptually or explicitly), _copy_ your data in to plain C++ data structures
| and go to town in parallel via OpenMP and other multithreaded approaches.
| Then collect the result, release the mutex and move back up.
|
| I hope this help.
|
| Dirk
|
| |
| | Best,
| | Mark
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | Op do 12 mei 2016 om 14:46 schreef Charles Determan <[hidden email]>:
| |
| | > R Developers,
| | >
| | > Could someone help explain what it means that R is single threaded?  I am
| | > trying to understand what is actually going on inside R when users want to
| | > parallelize code.  For example, using mclapply or foreach (with some
| | > backend) somehow allows users to benefit from multiple CPUs.
| | >
| | > Similarly there is the RcppParallel package for RMatrix/RVector objects.
| | > But none of these address the general XPtr objects in Rcpp.  Some readers
| | > here may recognize my question on SO (
| | >
| | > http://stackoverflow.com/questions/37167479/rcpp-parallelize-functions-that-return-xptr
| | > )
| | > where I was curious about parallel calls to C++/Rcpp functions that return
| | > XPtr objects.  I am being a little more persistent here as this limitation
| | > provides a very hard stop on the development on one of my packages that
| | > heavily uses XPtr objects.  It's not meant to be a criticism or intended to
| | > be rude, I just want to fully understand.
| | >
| | > I am willing to accept that it may be impossible currently but I want to at
| | > least understand why it is impossible so I can explain to future users why
| | > parallel functionality is not available.  Which just echos my original
| | > question, what does it mean that R is single threaded?
| | >
| | > Kind Regards,
| | > Charles
| | >
| | >         [[alternative HTML version deleted]]
| | >
| | > ______________________________________________
| | > [hidden email] mailing list
| | > https://stat.ethz.ch/mailman/listinfo/r-devel
| | >
| |
| | [[alternative HTML version deleted]]
| |
| | ______________________________________________
| | [hidden email] mailing list
| | https://stat.ethz.ch/mailman/listinfo/r-devel
|
| --
| http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]
|
| ______________________________________________
| [hidden email] mailing list
| https://stat.ethz.ch/mailman/listinfo/r-devel

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]

Re: Single-threaded aspect

Dirk Eddelbuettel
In reply to this post by Charles Determan

On 12 May 2016 at 09:25, Charles Determan wrote:
| Thank you Simon for the detailed reply.  That explains much more of what I was
| looking for from the R side.
|
| Dirk, I'm sorry if I seem hung up on anything here but I am trying to
| understand the details.  My reply about XPtr or XPtr on arma/Eigen was to
| confirm my understanding was correct, which it appears it was.  I was not aware

I still do not think so.

Step back, have a cup of tea or two, and start with the simple and short
OpenMP examples in Rcpp itself.  They have been there for years and should
still work.  I would encourage you to work through these, maybe take notes
and possibly even submit the notes as a new short piece in the Rcpp Gallery.

| the RVector/RMatrix objects don't connect to R as I am just now familiarizing
| myself with the package, that explains more of my confusion.  I will look at
| doing work within the compiled code as you have suggested.

Sounds good.  OpenMP and Intel TBB (as in RcppParallel) will only become more
important as we move to more and more cores.  Working with them is not all
that obvious as you are finding out.  Let's try to work to make the
documentation better.
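
As a rough illustration of the "copy into plain C++ data structures, then go parallel" advice quoted further down, here is a minimal sketch. It is not taken from the Rcpp examples; the function names `copy_to_plain` and `sum_of_squares` are hypothetical and chosen only for this illustration. The idea is that the copy happens on the main thread, and the parallel loop only ever sees a `std::vector<double>`, so no worker thread touches R's API or R-managed memory.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copy n doubles out of a raw buffer (standing in for
// R-managed memory) into a plain C++ container. This is meant to run on the
// main thread, before any parallel region starts.
std::vector<double> copy_to_plain(const double* src, std::size_t n) {
    return std::vector<double>(src, src + n);
}

// Hypothetical worker: sum of squares over plain C++ data only.
// With an OpenMP flag (e.g. -fopenmp for GCC/Clang) the loop is split across
// threads and the per-thread partial sums are combined by the reduction
// clause; without the flag the pragma is ignored and the loop runs serially.
double sum_of_squares(const std::vector<double>& data) {
    double total = 0.0;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(data.size());
    #pragma omp parallel for reduction(+:total)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        const double x = data[static_cast<std::size_t>(i)];
        total += x * x;
    }
    return total;
}
```

In an Rcpp function the copy step could instead be done with something like `Rcpp::as<std::vector<double> >(x)` before the loop, and the single `double` result handed back to R only after the parallel region has finished. The copy costs memory and time, which is the price paid for keeping the parallel part away from R.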

Dirk
 
| Regards,
| Charles
|
| On Thu, May 12, 2016 at 9:18 AM, Dirk Eddelbuettel <[hidden email]> wrote:
|
|    
|     On 12 May 2016 at 13:11, Mark van der Loo wrote:
|     | Charles,
|     |
|     | 1. Perhaps this question is better directed at the R-help or
|     | R-package-devel mailing list.
|     |
|     | 2. It basically means that R itself can only evaluate one R expression at
|     | a time.
|     |
|     | The parallel package circumvents this by starting multiple R sessions and
|     | dividing the workload among them.
|     |
|     | Compiled code called by R (such as C++ code through Rcpp or C code through
|     | base R's interface) can execute multi-threaded code for internal purposes,
|     | using e.g. OpenMP. A limitation is that compiled code cannot call R's C API
|     | from multiple threads (in many cases). For example, it is not thread-safe
|     | to create R variables from multiple threads running in C. (R's variable
|     | administration is such that the order of (un)making them from compiled code
|     | matters.)
|
|     Well put.
|    
|     | I am not very savvy on Rcpp or XPtr objects, but it appears that Dirk
|     | provided answers about that in your SO-question.
|
|     Charles seems to be completely hung up on a small detail, failing to
|     see the forest for the trees.
|
|     There are (many) working examples of parallel (compiled) code with R. All of
|     them stress (and I simplify here) that you cannot touch R objects, or call
|     back into R, for fear of any assignment or allocation triggering an R event.
|     R, being single-threaded, cannot handle this.
|
|     My answer to this problem is to only use non-R data structures. That is what
|     RcppParallel does in the actual parallel code portions in all examples --
|     types RVector and RMatrix do NOT connect back to R. There are several working
|     examples.  That is also what the OpenMP examples at the Rcpp Gallery do.
|
|     Charles seems to be replying 'but I use XPtr' or 'I use XPtr on arma::mat or
|     Eigen::MatrixXd' and seems to forget that these are proxy objects to SEXPs.
|     XPtr just wraps the SEXP for external pointers; Arma's and Eigen's matrices
|     are performant via RcppArmadillo and RcppEigen because we use R memory via
|     proxies.  All of that is 'too close to R' for comfort.
|
|     So the short answer is:  enter compiled code from R, set a mutex (either
|     conceptually or explicitly), _copy_ your data into plain C++ data structures
|     and go to town in parallel via OpenMP and other multithreaded approaches.
|     Then collect the result, release the mutex and move back up.
|
|     I hope this helps.
|
|     Dirk
|
|     |
|     | Best,
|     | Mark

--
http://dirk.eddelbuettel.com | @eddelbuettel | [hidden email]
