Using multicores in R

Using multicores in R

moriah
Hi,

I have an R script that is slow because it contains two nested loops of at least 5000 iterations each. I have tried the multicore package, but it doesn't seem to improve the elapsed time (even on a shorter test script), and I can't use mclapply for technical reasons.

I was wondering how I can make my script use more cores and memory, because I am running it on a server and it is a shame that it uses only one core.

Thanks!
Moriah
 
Re: Using multicores in R

Uwe Ligges-3


On 03.12.2012 11:14, moriah wrote:
> Hi,
>
> I have an R script that is slow because it contains two nested loops of at
> least 5000 iterations each. I have tried the multicore package, but it
> doesn't seem to improve the elapsed time (even on a shorter test script),
> and I can't use mclapply for technical reasons.

Errr, but without mclapply(), multicore does not have any effect ...

See the package "parallel", which offers various functions for parallel
computations. We cannot help much more unless you tell us what the
technical reasons are that keep mclapply() from working.
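
As a minimal sketch of what the "parallel" package offers (this is not the poster's script): mclapply() is a parallel counterpart of lapply() that forks worker processes. The function slow_square and the core count below are placeholders made up for illustration. Forking is not available on Windows, where mc.cores has to be 1.

```r
library(parallel)  # ships with R since 2.14.0

# Hypothetical stand-in for one expensive, independent iteration.
slow_square <- function(i) {
  Sys.sleep(0.01)  # pretend to do real work
  i^2
}

# Forked workers are unavailable on Windows, so fall back to one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_result   <- lapply(1:20, slow_square)
parallel_result <- mclapply(1:20, slow_square, mc.cores = n_cores)

# Check that both calls produced the same values in the same order.
all(unlist(serial_result) == unlist(parallel_result))
```

Loading the package alone changes nothing; only work that is explicitly handed to a function such as mclapply() is spread over several cores.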

Best,
Uwe Ligges



> --
> View this message in context: http://r.789695.n4.nabble.com/Using-multicores-in-R-tp4651808.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


Re: Using multicores in R

Steve Lianoglou-6
And also:


If the work you are doing within each iteration of the loop is trivial, you
will likely even see a decrease in performance if you try to parallelize it.

Without more info from you regarding your problem, there's little we can do
to help, tho.
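
A rough way to check this on your own machine is to time a serial and a parallel run of the same trivial task. The sketch below assumes nothing about the original script; the vector x and the core count are arbitrary illustration values, and no particular timing outcome is claimed.

```r
library(parallel)

n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
x <- 1:2000

# Trivial work per element: the cost of forking workers and collecting
# their results can outweigh the computation itself.
t_serial   <- system.time(r1 <- lapply(x, function(i) i + 1))
t_parallel <- system.time(r2 <- mclapply(x, function(i) i + 1, mc.cores = n_cores))

# Elapsed seconds for each variant; compare them on the target server.
print(c(serial   = t_serial[["elapsed"]],
        parallel = t_parallel[["elapsed"]]))
```

Parallelizing tends to pay off only when each unit of work handed to a worker is substantially more expensive than the overhead of dispatching it.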

 -Steve



--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: Using multicores in R

Spencer Graves-2
       1.  Have you looked at CRAN Task View: High-Performance and
Parallel Computing with R
(http://cran.r-project.org/web/views/HighPerformanceComputing.html)?


       2.  Have you tried the "compiler" package?  If I understand
correctly, R is a two-stage interpreter, first translating what we know
as R into byte code, which is then interpreted by a byte code
interpreter.  If my memory is correct, this approach can cut the compute
time by a factor of 100.


       3.  Have you reviewed the section on "Profiling R code for speed"
in the "Writing R Extensions" manual that becomes available after
help.start()?  The profiling tools discussed there help identify the
portion of more complex code that takes the most time.  The standard
advice then is to experiment with writing the most time consuming
portion several different ways.  I've seen many examples where writing
what appears to be the same thing in R several different ways identifies
one that is easily 10 and maybe 100 or 1000 times faster than the
slowest alternative tried.


       4.  Have you tried using the "sos" package to search for other
functions and packages in R that may already have good code doing some
of the things you want to do?  The "findFn" function in "sos" searches
the "functions" subset of the "RSiteSearch" database and returns the
results sorted by package.  There are also "union" and
"writeFindFn2xls" functions that make it easy to manipulate and evaluate
the results; they are described in a vignette. It's the best literature
search I know for anything statistical: If I don't find it there, it's OK
to look someplace else. [Caveat:  I'm the lead author of "sos", so I'm biased.]
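
Points 2 and 3 can be tried out with a small experiment. The following is only a sketch: the functions sum_sq_loop and sum_sq_vec are made-up examples, not code from this thread. Note that since R 3.4.0 functions are byte-compiled automatically by default, so an explicit cmpfun() call may show little or no difference on current versions.

```r
library(compiler)  # ships with R

# Two ways to write the same computation: an explicit loop and a vectorised call.
sum_sq_loop <- function(x) {
  total <- 0
  for (v in x) total <- total + v^2
  total
}
sum_sq_vec <- function(x) sum(x^2)

# Byte-compile the loop version explicitly.
sum_sq_loop_cmp <- cmpfun(sum_sq_loop)

x <- as.numeric(1:1e5)
stopifnot(isTRUE(all.equal(sum_sq_loop(x), sum_sq_vec(x))),
          isTRUE(all.equal(sum_sq_loop_cmp(x), sum_sq_vec(x))))

# Time each variant; the numbers depend on the machine and the R version.
system.time(for (k in 1:20) sum_sq_loop(x))
system.time(for (k in 1:20) sum_sq_loop_cmp(x))
system.time(for (k in 1:20) sum_sq_vec(x))

# Profile repeated calls to see which functions account for the run time.
prof_file <- tempfile()
Rprof(prof_file)
for (k in 1:300) sum_sq_loop(x)
Rprof(NULL)
head(summaryRprof(prof_file)$by.self)
```

The profile output lists the functions in which the most time was spent, which is where rewriting the code several different ways is most likely to pay off.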


       Best Wishes,
       Spencer




--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



Re: Using multicores in R

Jim Porzak
In reply to this post by moriah
Moriah,

Since you are doing nested loops, Rcpp may be an easy speed-up. Follow
all the links here
http://blog.revolutionanalytics.com/2012/11/hadleys-guide-to-high-performance-r-with-rcpp.html
for details.

HTH,
Jim Porzak
Minted.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF: www.meetup.com/R-Users/



Re: Using multicores in R

moriah
In reply to this post by Spencer Graves-2
Thanks for the help,

Perhaps I should elaborate a bit. I am working on a bioinformatics project in which I am trying to run a forward selection algorithm for machine-learning classification of two biological conditions.
At each iteration I want to find the gene that, added to those I have already selected, gives the best classification.

It looks something like this:

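library(e1071)  # provides naiveBayes()

# Assumed starting state (not shown in the snippet as originally posted):
idx <- integer(0)    # columns of the genes selected so far
res <- NULL          # one result row is appended per iteration
cls <- trn[, 20118]  # class labels are stored in column 20118

for (j in 1:5030)
{
  tp <- 0
  for (i in 1:5030)
  {
    if (!(i %in% idx))
    {
      # drop = FALSE keeps a data frame (and its column names) even for one column
      classifier <- naiveBayes(trn[, c(i, idx), drop = FALSE], cls)
      tbl <- table(predict(classifier, trn[, -20118]), cls)
      # accuracy on the training data: correct predictions / all predictions
      success <- sum(diag(tbl)) / sum(tbl)

      if (success > tp)
      {
        tp <- success
        ind <- i
        gene <- names(trn)[i]
      }
    }
  }
  idx <- c(idx, ind)
  res <- rbind(res, data.frame(Iteration = j, Success = tp * 100, Gene = gene))
}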

I am no expert when it comes to programming, so I am not sure how I can best optimize my relatively primitive code...

Thanks,
Moriah
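
One possible way to spread the loop above over several cores, offered as a sketch under stated assumptions rather than a tested fix for the original script: the outer loop has to stay sequential because each step depends on the genes already in idx, but the candidates tried within one step are independent of each other and can be scored with mclapply() from the "parallel" package. Run to completion, the nested loops amount to roughly 12.6 million classifier fits (5030 * 5031 / 2), so stopping after a fixed number of selected genes may matter more than the number of cores. The toy data gene_data and the scoring function score_subset below are hypothetical stand-ins for trn and for the naiveBayes()/predict() step.

```r
library(parallel)

# Toy data standing in for `trn`: 40 samples, 8 numeric "genes", two classes.
set.seed(1)
cls <- factor(rep(c("A", "B"), each = 20))
gene_data <- as.data.frame(matrix(rnorm(40 * 8), nrow = 40))
gene_data[cls == "B", 3] <- gene_data[cls == "B", 3] + 3  # make gene 3 informative

# Hypothetical scoring function: training accuracy of a nearest-centroid rule.
# In the original script this role is played by naiveBayes() and predict().
score_subset <- function(cols) {
  x  <- as.matrix(gene_data[, cols, drop = FALSE])
  ca <- colMeans(x[cls == "A", , drop = FALSE])  # centroid of class A
  cb <- colMeans(x[cls == "B", , drop = FALSE])  # centroid of class B
  da <- rowSums(sweep(x, 2, ca)^2)               # squared distance to A
  db <- rowSums(sweep(x, 2, cb)^2)               # squared distance to B
  pred <- ifelse(da <= db, "A", "B")
  mean(pred == cls)
}

# Forked workers are unavailable on Windows, so fall back to one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
idx <- integer(0)
res <- NULL

for (j in 1:3) {  # number of genes to select (kept small for the sketch)
  candidates <- setdiff(seq_along(gene_data), idx)
  # The candidates are independent of each other, so score them in parallel.
  scores <- unlist(mclapply(candidates,
                            function(i) score_subset(c(i, idx)),
                            mc.cores = n_cores))
  best <- which.max(scores)  # first maximum, matching `success > tp`
  idx  <- c(idx, candidates[best])
  res  <- rbind(res, data.frame(Iteration = j,
                                Success   = scores[best] * 100,
                                Gene      = names(gene_data)[candidates[best]]))
}
res
```

Replacing the body of score_subset with the naiveBayes() fit and the accuracy computation would adapt the sketch to the real data; whether that is fast enough depends on how long a single fit takes.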