# Why R is 200 times slower than Matlab ?

## Why R is 200 times slower than Matlab ?

I am switching from Matlab to R, but I found that R is 200 times slower than Matlab.

Since I am a newbie to R, I must be missing some important programming tips. Please help me out on this.

Here is the function:

    ## make the full pair-wise permutation of a vector
    ## input_fc = c(1,2,3);
    ## output_fc = (
    ##   1 1 1 2 2 2 3 3 3
    ##   1 2 3 1 2 3 1 2 3
    ## );
    grw_permute = function(input_fc){
      fc_vector = input_fc
      index = 1
      k = length(fc_vector)
      fc_matrix = matrix(0,2,k^2)
      for(i in 1:k){
        for(j in 1:k){
          fc_matrix[index]  =  fc_vector[i]
          fc_matrix[index+1]  =  fc_vector[j]
          index = index+2
        }
      }
      return(fc_matrix)
    }

For an input vector of size 300, it took R 2.17 seconds to run, but the same code in Matlab needs only 0.01 seconds.

Am I missing something in R? Is there a way to optimize this?

Thanks

-- 
Zhandong Liu
Genomics and Computational Biology
University of Pennsylvania
616 BRB II/III, 421 Curie Boulevard
University of Pennsylvania School of Medicine
Philadelphia, PA 19104-6160

## Re: Why R is 200 times slower than Matlab ?

 Hi, ZD, Your comment about speed is too general. Here is a benchmark comparison among several languages and HTH. http://www.sciviews.org/benchmark/index.html

On Wed, Apr 30, 2008 at 4:15 PM, Zhandong Liu wrote:
[quoted message content]

-- 
=============================== 
WenSui Liu 
ChoicePoint Precision Marketing 
Phone: 678-893-9457 
Email : [hidden email] 
Blog : statcompute.spaces.live.com
## Re: Why R is 200 times slower than Matlab ?

I would rather not comment on Matlab (where is your Matlab code, by the way?), but your function could be simplified a bit:

    grw.permute <- function(v) {
      cbind( rep(v, each=length(v)), rep(v, length(v)) )
    }

    > system.time(tmp <- grw.permute(1:300))
       user  system elapsed
      0.020   0.000   0.019

This is on my quite busy 4-year-old laptop...

Best,
Gabor

On Wed, Apr 30, 2008 at 04:15:46PM -0400, Zhandong Liu wrote:
> I am switching from Matlab to R, but I found that R is 200 times slower than
> matlab.
>
> Since I am newbie to R, I must be missing some important programming tips.
> [...]

-- 
Csardi Gabor

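A minimal sketch, not part of the original reply: the `cbind()` form above returns a k^2 x 2 matrix, which is the transpose of the 2 x k^2 matrix built by the loop in the original post. Assuming the original orientation is wanted, `rbind()` with the same two `rep()` calls reproduces it. The names `grw_permute_loop` and `grw_permute_vec` are illustrative only.

```r
# Loop version from the original post (returns a 2 x k^2 matrix).
grw_permute_loop <- function(v) {
  k <- length(v)
  out <- matrix(0, 2, k^2)
  index <- 1
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      out[index] <- v[i]        # row 1 of the current column
      out[index + 1] <- v[j]    # row 2 of the current column
      index <- index + 2
    }
  }
  out
}

# Vectorized version; rbind() keeps the 2 x k^2 orientation.
grw_permute_vec <- function(v) {
  k <- length(v)
  rbind(rep(v, each = k), rep(v, times = k))
}

v <- c(1, 2, 3)
loop_result <- grw_permute_loop(v)
vec_result <- grw_permute_vec(v)
print(vec_result)
```

Both functions fill row 1 with each element repeated k times and row 2 with the whole vector cycled k times, so the vectorized result can be checked against the loop with `all.equal()`.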
## Re: Why R is 200 times slower than Matlab ?

But please consider that this benchmark is five years old, and I believe that R has changed quite a lot since version 1.9.

Gabor

On Wed, Apr 30, 2008 at 04:21:51PM -0400, Wensui Liu wrote:
> Hi, ZD,
> Your comment about speed is too general. Here is a benchmark
> comparison among several languages and HTH.
> http://www.sciviews.org/benchmark/index.html
> [...]

-- 
Csardi Gabor

## Re: Why R is 200 times slower than Matlab ?

Zhandong Liu wrote:
> I am switching from Matlab to R, but I found that R is 200 times slower than
> matlab.
>
> Since I am newbie to R, I must be missing some important programming tips.
>
> [...]
>
> For an input vector of size 300. It took R 2.17 seconds to run.
>
> But the same code in matlab only needs 0.01 seconds to run.
>
> Am I missing sth in R.. Is there a away to optimize.  ???

This is pretty characteristic. With R, you really don't want nested loops doing single-element accessing (if you have better things to do with 2.16 seconds of our life). You will usually find that this sort of problem is handled either using vectorized operations at a higher level, or pushed into C code which is dynamically loaded.

For the particular problem, notice that the same result is obtained with

    > system.time(rbind(rep(1:300,300),rep(1:300,each=300)))
       user  system elapsed
      0.041   0.006   0.050

or even (OK, so it's transposed)

    > system.time(expand.grid(1:300,1:300))
       user  system elapsed
      0.027   0.011   0.040

    --
       O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
      c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
     (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
    ~~~~~~~~~~ - ([hidden email])              FAX: (+45) 35327907

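A small sketch of how the `expand.grid()` result relates to the original layout, offered as an illustration only. `expand.grid()` returns a data frame whose first column (`Var1`) varies fastest, so both the orientation and the row order differ from the 2 x k^2 matrix of the original function. Assuming that layout is needed, the columns can be swapped and the result transposed. `pairs_from_grid` and `pairs_from_rep` are hypothetical helper names.

```r
# Hypothetical helper: pairwise combinations via expand.grid(),
# reshaped to the 2 x k^2 layout used by the original loop.
pairs_from_grid <- function(v) {
  g <- expand.grid(v, v)                    # data frame; Var1 varies fastest
  m <- as.matrix(g[, c("Var2", "Var1")])    # slow-varying column first
  unname(t(m))                              # 2 rows, no dimnames
}

# Reference layout: row 1 repeats each element, row 2 cycles the vector.
pairs_from_rep <- function(v) {
  k <- length(v)
  rbind(rep(v, each = k), rep(v, times = k))
}

v <- c(1, 2, 3)
grid_result <- pairs_from_grid(v)
rep_result <- pairs_from_rep(v)
print(grid_result)
```

If the transposed data frame is acceptable for the downstream code, calling `expand.grid()` directly avoids the extra reshaping step.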
## Re: Why R is 200 times slower than Matlab ?

On Thu, 01 May 2008, Zhandong Liu wrote:
> I am switching from Matlab to R, but I found that R is 200 times slower
> than matlab.
>
> Since I am newbie to R, I must be missing some important programming tips.
>
> [...]
>
> For an input vector of size 300. It took R 2.17 seconds to run.
>
> But the same code in matlab only needs 0.01 seconds to run.

I am not a MATLAB user, but I suspect it wasn't "the same code" that produced an answer in MATLAB. You don't provide your MATLAB code, nor do you specify what version of R, of MATLAB, or what hardware and OS you are using.

I get {NetBSD, R version 2.6.0 (2007-10-03), Core 2 Duo, 3.x GHz}:

    > input_fc <- sample(1:600)
    > unix.time(a1 <- grw_permute(input_fc))
       user  system elapsed
      3.279  -0.001   3.280
    > unix.time({n <- length(input_fc); a2 <- matrix(c(rep(input_fc, each=n), rep(input_fc, n)), 2, n*n, byrow = TRUE)})
       user  system elapsed
      0.019   0.020   0.040
    > all.equal(a1, a2)
    [1] TRUE
    >

A sample of length 300 took less than 1 second using your grw_permute() (so your OS may be making a difference as well).

> Am I missing sth in R.. Is there a away to optimize.  ???

Yes. Loops are not efficient in R.

> Thanks

HTH,
Ray Brownrigg

## Re: Why R is 200 times slower than Matlab ?

You just have to use the right functions. Is this fast enough?

    > system.time(x <- expand.grid(1:300, 1:300))
       user  system elapsed
       0.00    0.01    0.01

On Wed, Apr 30, 2008 at 4:15 PM, Zhandong Liu wrote:
[quoted message]

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

## Re: Why R is 200 times slower than Matlab ?

Zhandong Liu wrote:
> I am switching from Matlab to R, but I found that R is 200 times slower than
> matlab.
>
> Since I am newbie to R, I must be missing some important programming tips.

The most important tip I would give you is to use the vectorized nature of R whenever possible. This helps avoid messy indexing and 'for' loops.

Look at the following 3 functions: yours, Gabor's, and my own (which I was about to post when I saw Gabor's nice solution, and is basically the same). Also see the system timings after the definitions.

    grw_permute <- function(input_fc){
       fc_vector <- input_fc
       index <- 1
       k <- length(fc_vector)
       fc_matrix <- matrix(0, 2, k^2)
       for(i in 1:k){
         for(j in 1:k){
           fc_matrix[index]  <-  fc_vector[i]
           fc_matrix[index+1]  <-  fc_vector[j]
           index <- index + 2
         }
       }
       return(fc_matrix)
    }

    grw_permute2 <- function(v) {
       cbind( rep(v, each=length(v)), rep(v, length(v)) )
    }

    grw_permute3 <- function(input_fc) {
       matrix(c(rep(input_fc, each = length(input_fc)),
                rep.int(input_fc, times = length(input_fc))),
              nrow = 2, byrow = TRUE)
    }

    > system.time(p1 <- grw_permute(1:300))
        user  system elapsed
       1.548   0.064   2.341
    > system.time(p2 <- grw_permute2(1:300))
        user  system elapsed
       0.009   0.001   0.010
    > system.time(p3 <- grw_permute3(1:300))
        user  system elapsed
       0.008   0.002   0.010

Erik Iverson

## Re: Why R is 200 times slower than Matlab ?

Ah, so the code is quite similar in MATLAB (and the *algorithm* is the same :-) ).

The "important programming tip" is that when converting from MATLAB to R, you shouldn't just 'translate' from MATLAB code to R code; you must reconsider the problem in the context of the R environment. This is very much like translating poetry, where the result should really be a poem in the target language, not just an accurate word-for-word (or even sentence-for-sentence) translation.

Ray

On Thu, 01 May 2008, you wrote:
> This is the missing Matlab code:
>
>     function[fc_matrix]=grw_permute(fc_vector)
>
>     n=length(fc_vector);
>
>     fc_matrix=zeros(2,n^2);
>
>     index=1;
>
>     for i=1:n
>         for j=1:n
>         fc_matrix(index)=fc_vector(i);
>         fc_matrix(index+1)=fc_vector(j);
>         index=index+2;
>         end
>     end

## efficiency & profiling? (was: Why R is 200 times slower than Matlab ?)

This has been an interesting discussion, and brings up two questions for me:

Is there a good collection of hints/suggestions for R language idioms in terms of efficiency? For instance, I read not to use for-loops, so I used apply, only to later read that "apply" is internally implemented as a "for", so nothing is gained there. Warnings about pitfalls (such as nested loops), hints, and suggestions would be great.

The second question: is there some sort of profiling tool available that would make it easy to recognize where the script is spending most of its time? It might be especially useful for newbies like me.

Thanks all,
Esmail

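To illustrate the point about `apply`, here is a rough sketch (not from the thread) that computes row sums three ways. `apply()` still calls an R function once per row, whereas `rowSums()` does the looping in compiled code. The matrix size and the repetition count are arbitrary assumptions, and the timings will vary by machine and R version.

```r
set.seed(1)
m <- matrix(runif(2000 * 50), nrow = 2000)

# 1. Explicit loop over rows.
row_sums_loop <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_sums_loop[i] <- sum(m[i, ])
}

# 2. apply(): tidier, but still one R-level function call per row.
row_sums_apply <- apply(m, 1, sum)

# 3. rowSums(): the iteration over rows happens in compiled code.
row_sums_vec <- rowSums(m)

# Repeat each approach a few times so the timings are measurable.
print(system.time(for (r in 1:50) apply(m, 1, sum)))
print(system.time(for (r in 1:50) rowSums(m)))
```

All three give the same numbers; the choice between a loop and `apply()` is mostly about readability, while the larger gain comes from a function that is vectorized internally.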
## Re: Why R is 200 times slower than Matlab ?

Aside from optimizing your code by making use of R functions that use C underneath as much as possible, the big difference between R and Matlab is Matlab's just-in-time compilation of code. When that was introduced in Matlab, huge speedups of Matlab programs were noticeable.

For R, there is a new package on CRAN, jit, that aims to provide similar speedups.

On Wed, Apr 30, 2008 at 4:15 PM, Zhandong Liu wrote:
[quoted message]

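The following sketch does not use the `jit` package mentioned above. As a swapped-in illustration of the same idea, it uses the byte-code compiler in the `compiler` package that ships with current R, on the assumption that it is available in the reader's installation. Recent R releases byte-compile functions automatically by default, so the two timings may turn out nearly identical. `grw_permute_loop` and `grw_permute_cmp` are illustrative names.

```r
library(compiler)  # byte-code compiler bundled with current R (assumption)

# The nested-loop function from the original post.
grw_permute_loop <- function(v) {
  k <- length(v)
  out <- matrix(0, 2, k^2)
  index <- 1
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      out[index] <- v[i]
      out[index + 1] <- v[j]
      index <- index + 2
    }
  }
  out
}

# Explicitly byte-compile the function.
grw_permute_cmp <- cmpfun(grw_permute_loop)

v <- 1:300
print(system.time(r1 <- grw_permute_loop(v)))
print(system.time(r2 <- grw_permute_cmp(v)))
```

Compilation changes only how the loop is executed, not what it computes, so the two results should be identical; the vectorized rewrites discussed elsewhere in the thread are a separate optimization.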
## Re: Why R is 200 times slower than Matlab ?

On Wed, Apr 30, 2008 at 6:27 PM, Gabor Grothendieck wrote:
> Aside from optimizing your code by making use of R functions
> that use C underneath as much as possible the big difference
> between R and Matlab is Matlab's just-in-time compilation of
> code.  When that was introduced in Matlab huge speedups of
> Matlab programs were noticeable.
>
> For R, there is a new package on CRAN, jit, that
> aims to provide similar speedups.

http://www.milbo.users.sonic.net/ra/index.html

Great! I just found out about ra. In Python I love psyco and I guess I will test ra soon. Thanks, N.- -- http://arhuaco.org
## Re: efficiency & profiling? (was: Why R is 200 times slower than Matlab ?)

On Wed, Apr 30, 2008 at 06:59:38PM -0400, esmail bonakdarian wrote:
>
> This has been an interesting discussion, and brings up two questions
> for me:
>
> Is there a good collection of hints/suggestions for R language idioms in terms
> of efficiency? For instance I read not to use for-loops, so I used apply only to
> later read that "apply" is internally implemented as a "for" so nothing gained
> here. Warnings about pitfalls (such as nested loops), hints,
## Re: efficiency & profiling?

On 30/04/2008 6:59 PM, esmail bonakdarian wrote:
> This has been an interesting discussion, and brings up two questions
> for me:
>
> Is there a good collection of hints/suggestions for R language idioms in terms
> of efficiency? For instance I read not to use for-loops, so I used apply only to
> later read that "apply" is internally implemented as a "for" so nothing gained
> here. Warnings about pitfalls (such as nested loops), hints, suggestions would
> be great.
>
> The second question - is there some sort of profiling tool available that would
> make it easy to recognize where the script is spending most of its time? Might
> be especially useful for newbies like me.

See ?Rprof for the tool. For the tips, I think you just need to hang around here a while. I don't know of a nice collection (but I'm sure there are several).

Duncan Murdoch

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

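As a rough sketch of what using `Rprof` looks like in practice (an illustration, not part of the reply above): start the sampling profiler, run the code of interest, stop the profiler, and summarize the samples with `summaryRprof()`. The function name `slow_pairs`, the input size, and the sampling interval are assumptions chosen so that the run lasts long enough to collect samples.

```r
# Hypothetical slow function: the nested loop from the original post.
slow_pairs <- function(v) {
  k <- length(v)
  out <- matrix(0, 2, k^2)
  index <- 1
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      out[index] <- v[i]
      out[index + 1] <- v[j]
      index <- index + 2
    }
  }
  out
}

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)     # start the sampling profiler
result <- slow_pairs(seq_len(1500))   # code to be profiled
Rprof(NULL)                           # stop profiling

prof <- summaryRprof(prof_file)       # aggregate the samples
print(head(prof$by.self))             # time spent in each function itself
```

The `by.self` table lists functions by the time spent in their own code, and `by.total` includes time spent in the functions they call; very short runs may collect too few samples to be informative.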
## Re: efficiency & profiling?

> See ?Rprof for the tool.  For the tips, I think you just need to hang
> around here a while.  I don't know of a nice collection (but I'm sure
> there are several.)
>
> Duncan Murdoch

Hi,

thanks .. several folks pointed me to Rprof, I'll take a look.

Yes, I have been reading the list; the number of messages per day is simply amazing, I can hardly keep up. Do most of you read this on the web or get it as a digest? I am getting them as individual e-mails (thank god for filters) ... :-)

Esmail

## Re: efficiency & profiling?

On 30/04/2008 7:47 PM, esmail bonakdarian wrote:
>> See ?Rprof for the tool.  For the tips, I think you just need to hang
>> around here a while.  I don't know of a nice collection (but I'm sure
>> there are several.)
>>
>> Duncan Murdoch
>
> Hi,
>
> thanks .. several folks pointed me to Rprof, I'll take a look.
>
> Yes, I have been reading the list, the amount of messages per day
> is simply amazing, I can hardly keep up. Do most of you read this
> on the web or get it as digest? I am getting them as individual
> e-mails (thank god for filters) ... :-)

I think most read as email, but a substantial minority read it on the web. You really do need filters. Personally, I scan the subject lines, and read about 1 in 10 threads.

Duncan Murdoch

## Re: efficiency & profiling?

On Wed, Apr 30, 2008 at 4:33 PM, Duncan Murdoch <[hidden email]> wrote:
> On 30/04/2008 6:59 PM, esmail bonakdarian wrote:
> > Is there a good collection of hints/suggestions for R language idioms in
> > terms of efficiency?
>
> See ?Rprof for the tool.  For the tips, I think you just need to hang
> around here a while.  I don't know of a nice collection (but I'm sure there
> are several.)
>
> Duncan Murdoch

;-) here's one: https://stat.ethz.ch/pipermail/r-help/2005-October/080991.html

Kingsford Jones

## Re: Why R is 200 times slower than Matlab ?

Will the use of jit improve the performance of user-contributed packages such as lme4?

Thanks.

Shige

On Thu, May 1, 2008 at 7:31 AM, Nelson Castillo <[hidden email]> wrote:
> On Wed, Apr 30, 2008 at 6:27 PM, Gabor Grothendieck
> <[hidden email]> wrote:
> > Aside from optimizing your code by making use of R functions
> > that use C underneath as much as possible the big difference
> > between R and Matlab is Matlab's just-in-time compilation of
> > code.  When that was introduced in Matlab huge speedups of
> > Matlab programs were noticeable.
> >
> > For R, there is a new package on CRAN, jit, that
> > aims to provide similar speedups.
>
> http://www.milbo.users.sonic.net/ra/index.html
>
> Great! I just found out about ra. In Python I love psyco and I guess I
> will test ra soon.
>
> Thanks,
> N.-
>
> --
> http://arhuaco.org