R on dual-core machines

R on dual-core machines

Aleš Žiberna-3
Dear expeRts!

I'm thinking of buying a new computer and am considering dual-core
processors, such as the AMD Athlon64 X2. Since I'm not a computer expert,
please forgive me if some of my questions are silly.

First, am I correct that using a dual-core processor is (from R's point of
view) the same as using a computer with two processors?

If that is true, the posts I found on the list imply that such a processor
usually brings significant improvements (in computational time) only when
the code (C or sometimes R) is specially designed for multiple processors
(see comments below).

So based on these and other comments I conclude that if I'm not prepared
(or able) to make such modifications, I can expect improvements only in
these two areas:
1. If I am running two instances of R.
2. If I'm running several other programs on the computer besides R, the
programs and R would run faster, since they would not "compete" for
processor time (so much).

Thanks in advance for any useful suggestions,
Ales Ziberna

P.S.: Useful posts on the list follow:



It depends on the usage pattern. If you run multiple CPU-bound processes in
parallel without too much coordination (parallel make is a good example,
simulations another), then you get close to double up from a dual. For a
single R process, you can get something like 40% improvement in large linear
algebra problems, using a threaded ATLAS.
For other problems the speedup is basically nil. There is some potential in
threading R or (much easier) some of its vector operations, but that is not
even on the drawing board at this stage.

------------------------------------------------------------------------

If you want to exploit multiple processors, you can write code (e.g., in C)
called from R (e.g., through .Call or .C) that performs parallel/threaded
computations in a thread-safe way (e.g., without calling back into R).
---
Another possibility is to replace the BLAS/LAPACK library with a thread-safe
version. This provides a boost to those R algorithms exploiting these
libraries.
---
An alternative is to do all of the parallelization within R using nice tools
like the snow package combined with Rmpi.  If your task is computationally
intensive on the R side, but not on the client, then parallelizing R code
may be the better way to go.  All depends on your application, I think.
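
A minimal sketch of the parallelize-within-R approach mentioned above. It is
an assumption-laden illustration, not the snow/Rmpi setup itself: it uses the
parallel package (bundled with R from 2.14.0, so later than this thread),
which offers snow-style functions, and a socket cluster on the local machine
instead of MPI. The function one_run is a hypothetical stand-in for one
independent simulation.

```r
library(parallel)  # bundled with R >= 2.14.0; provides a snow-style API

# Hypothetical stand-in for one independent, CPU-bound simulation run.
one_run <- function(i) {
  x <- rnorm(1e4)
  mean(x)
}

cl <- makeCluster(2)            # start two worker R processes
clusterSetRNGStream(cl, 123)    # independent, reproducible random streams
res <- parLapply(cl, 1:8, one_run)  # the 8 runs are split across the workers
stopCluster(cl)                 # always shut the workers down

str(unlist(res))
```

Each worker is a separate R process, so this is close in spirit to running
two instances of R by hand, with the bookkeeping done for you.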

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: R on dual-core machines

Uwe Ligges
Yes, you are right, and you found some relevant posts.

Uwe Ligges


Re: R on dual-core machines

Brian Ripley
On Mon, 30 Jan 2006, Aleš Žiberna wrote:

> Dear expeRts!
>
> I'm thinking of buying a new computer and am considering dual-core
> processors, such as the AMD Athlon64 X2. Since I'm not a computer expert,
> please forgive me if some of my questions are silly.
>
> First, am I correct that using a dual-core processor is (from R's point of
> view) the same as using a computer with two processors?

Depends on the OS (R does not get to see things at that level), but that's a
fair presumption.  For example, our dual dual-core Opteron box is reported
as having four processors by Linux.
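
To see what the OS reports on a given machine, something like the following
should work. This is only a sketch and assumes the parallel package, which
is bundled with R from 2.14.0 onward (so it was not available when this
thread was written). detectCores() can return NA where the count cannot be
determined.

```r
# Assumes R >= 2.14.0, where the 'parallel' package is bundled with R.
library(parallel)

# Logical processors as reported by the OS (may count hyper-threads).
n_logical <- detectCores(logical = TRUE)

# Physical cores, on platforms that can tell the difference; may be NA.
n_physical <- detectCores(logical = FALSE)

cat("logical:", n_logical, " physical:", n_physical, "\n")
```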

> If that is true, the posts I found on the list imply that such a processor
> usually brings significant improvements (in computational time) only when
> the code (C or sometimes R) is specially designed for multiple processors
> (see comments below).
>
> So based on these and other comments I conclude that if I'm not prepared
> (or able) to make such modifications, I can expect improvements only in
> these two areas:
> 1. If I am running two instances of R.
> 2. If I'm running several other programs on the computer besides R, the
> programs and R would run faster, since they would not "compete" for
> processor time (so much).

Yes, but running multiple R instances can be very useful.

Our long-term experience with multiple-processor machines is that you do
need to ensure you have adequate RAM and plenty of swap space, especially
on OSes that do not handle out-of-swap gracefully.

>
> Thanks in advance for any useful suggestions,
> Ales Ziberna
>
> P.S.: Useful posts on the list follow:
>
>
>
> It depends on the usage pattern. If you run multiple CPU-bound processes in
> parallel without too much coordination (parallel make is a good example,
> simulations another), then you get close to double up from a dual. For a
> single R process, you can get something like 40% improvement in large linear
> algebra problems, using a threaded ATLAS.
> For other problems the speedup is basically nil. There is some potential in
> threading R or (much easier) some of its vector operations, but that is not
> even on the drawing board at this stage.
>
> ------------------------------------------------------------------------
>
> If you want to exploit multiple processors, you can write code (e.g., in C)
> called from R (e.g., through .Call or .C) that performs parallel/threaded
> computations in a thread-safe way (e.g., without calling back into R).
> ---
> Another possibility is to replace the BLAS/LAPACK library with a thread-safe
> version. This provides a boost to those R algorithms exploiting these
> libraries.

Replace `thread-safe' by `multi-threaded', as you do need the BLAS to
use multiple threads itself (not be told to).  See the R-admin manual for
how to do this on Linux systems.
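
One rough way to check whether a multi-threaded BLAS is actually being used
is to time a BLAS-bound operation before and after the switch. The sketch
below is illustrative only: the matrix size is an arbitrary assumption, and
the timings depend entirely on the machine and the BLAS in use.

```r
# Time a BLAS-bound operation; rerun after switching to a multi-threaded
# BLAS and compare.
set.seed(1)
n <- 500                         # illustrative size; increase for a clearer signal
m <- matrix(rnorm(n * n), n, n)

timing <- system.time(xtx <- crossprod(m))  # computes t(m) %*% m via the BLAS
print(timing)
```

If the "user" time clearly exceeds the "elapsed" time, that suggests the
work was spread over more than one thread.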

> ---
> An alternative is to do all of the parallelization within R using nice tools
> like the snow package combined with Rmpi.  If your task is computationally
> intensive on the R side, but not on the client, then parallelizing R code
> may be the better way to go.  All depends on your application, I think.

--
Brian D. Ripley,                  [hidden email]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
