# how to make simulation faster

## how to make simulation faster

 Dear R users,

I need to run 1000 simulations to find maximum likelihood estimates. I print my output as a vector. However, it is taking too long. I am running 50 simulations at a time and it is taking me 30 minutes. Once I tried to run 200 simulations at once; after 2 hours I stopped it and saw that only about 40 of them had been simulated in those 2 hours. Is there any way to make my simulations faster? Thank you in advance.

Here is a part of my code. My data set is a clustered data set with 2 subjects in each group (100 groups). I generated 1000 of these and put them in a matrix, so each column represents one data set.

```r
output1 <- vector("numeric", r)
output2 <- vector("numeric", r)
output3 <- vector("numeric", r)
output4 <- vector("numeric", r)

for (m in 1:r) {
  ll <- function(p) {
    cumhaz <- (time * exp(Zi * p)) * p
    cumhaz <- aggregate(cumhaz, by = list(i), FUN = sum)
    lnhaz <- delta * log(exp(Zi * p) * p)
    lnhaz <- aggregate(lnhaz, by = list(i), FUN = sum)
    lnhaz <- lnhaz[2:(r + 1)]
    cumhaz <- cumhaz[2:(r + 1)]
    lik <- r[m] * log(p) -
      sum((di[, m] + 1/p) * log(1 + cumhaz[, m] * p)) + sum(lnhaz[, m]) -
      n * log(gamma(1/p)) + sum(log(gamma(di[, m] + 1/p)))
    -lik
  }
  initial <- c(1, 0.5, 0.5)
  estt <- nlm(ll, initial, hessian = TRUE)
  lambda <- estt$estimate[1]   # the original assigned the whole vector to each name
  theta  <- estt$estimate[2]
  beta   <- estt$estimate[3]
  hessian <- estt$hessian      # was t$hessian, presumably a typo for estt$hessian
  cov <- solve(hessian)
  vtheta <- cov[2, 2]
  vbeta  <- cov[3, 3]
  output1[m] <- theta
  output2[m] <- beta
  output3[m] <- vtheta
  output4[m] <- vbeta
}

theta  <- output1
beta   <- output2
vtheta <- output3
vbeta  <- output4
```
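[Editorial note: as the profiling later in this thread shows, the two `aggregate()` calls dominate the runtime, and they run on every likelihood evaluation inside `nlm()`. A minimal sketch of a faster drop-in for this per-group-sum pattern, using stand-in data (the variable names `i` and `x` follow the post; the values are made up): `rowsum()` computes the same group sums at C level.]

```r
# Sketch with hypothetical stand-in data (names follow the post).
# rowsum() computes per-group sums in compiled code and is typically
# much faster than aggregate() for this pattern.
set.seed(1)
i <- rep(1:100, each = 2)   # 100 groups, 2 subjects each, as in the post
x <- rexp(length(i))        # stand-in for the per-subject hazard terms

slow <- aggregate(x, by = list(i), FUN = sum)$x   # what the post does
fast <- as.numeric(rowsum(x, group = i))          # same sums, C level

stopifnot(all.equal(slow, fast))
```

Because `nlm()` calls `ll` many times per fit, replacing both `aggregate()` calls this way shrinks the cost of every single iteration.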

## Re: how to make simulation faster

 On Fri, Oct 26, 2012 at 4:23 AM, stats12 <[hidden email]> wrote:

> Dear R users,
>
> I need to run 1000 simulations to find maximum likelihood estimates. I
> print my output as a vector. However, it is taking too long. I am running 50
> simulations at a time and it is taking me 30 minutes. Once I tried to run
> 200 simulations at once, after 2 hours I stopped it and saw that only about
> 40 of them are simulated in those 2 hours. Is there any way to make my
> simulations faster? (I can post my code if needed, I'm just looking for
> general ideas here). Thank you in advance.

Code would be nice: I struggle to think of a basic MLE fit that would take ~36 s per iteration and scale so badly if you are writing idiomatic R:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Finally, I note you're posting from Nabble. Please include context in your reply -- I don't believe Nabble does this automatically, so you'll need to include it manually. Most of the regular respondents on this list don't use Nabble -- it is a _mailing list_, after all -- so we don't get the forum view you do, only emails of the individual posts. Combine that with the high volume of posts, and it's quite difficult to trace a discussion if we don't all make sure to include context.

RMW

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

## Re: how to make simulation faster

 In reply to this post by stats12

Use Rprof to profile your code to determine where time is being spent. This might tell you where to concentrate your effort.

Sent from my iPad

On Oct 25, 2012, at 23:23, stats12 <[hidden email]> wrote:

> Dear R users,
>
> I need to run 1000 simulations to find maximum likelihood estimates. I
> print my output as a vector. However, it is taking too long. I am running 50
> simulations at a time and it is taking me 30 minutes. Once I tried to run
> 200 simulations at once, after 2 hours I stopped it and saw that only about
> 40 of them are simulated in those 2 hours. Is there any way to make my
> simulations faster? (I can post my code if needed, I'm just looking for
> general ideas here). Thank you in advance.
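[Editorial note: for readers unfamiliar with the profiler, here is a minimal `Rprof` sketch. The workload and the output file name `sim.out` are made up for illustration: wrap the slow code between `Rprof(file)` and `Rprof(NULL)`, then rank functions by time with `summaryRprof()`. In its output, total.time counts every sample in which a function was anywhere on the call stack, while self.time counts only samples where it was at the top of the stack.]

```r
# Hypothetical workload: repeated aggregate() calls, sampled by Rprof.
Rprof("sim.out")                       # start the sampling profiler
x <- rnorm(1e5)
g <- rep(1:1000, length.out = 1e5)
for (k in 1:20) {
  agg <- aggregate(x, by = list(g), FUN = sum)
}
Rprof(NULL)                            # stop profiling

s <- summaryRprof("sim.out")
head(s$by.total)                       # functions ranked by total time
```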

## Re: how to make simulation faster

 In reply to this post by Michael Weylandt

Hi,

Thank you for your reply. I updated my post with the code. Also, about posting from Nabble: since I am a new user, I didn't know about that problem. If I post to the mailing list (r-help@r-project.org) instead, would that get rid of it?

```r
output1 <- vector("numeric", r)
output2 <- vector("numeric", r)
output3 <- vector("numeric", r)
output4 <- vector("numeric", r)

for (m in 1:r) {
  ll <- function(p) {
    cumhaz <- (time * exp(Zi * p)) * p
    cumhaz <- aggregate(cumhaz, by = list(i), FUN = sum)
    lnhaz <- delta * log(exp(Zi * p) * p)
    lnhaz <- aggregate(lnhaz, by = list(i), FUN = sum)
    lnhaz <- lnhaz[2:(r + 1)]
    cumhaz <- cumhaz[2:(r + 1)]
    lik <- r[m] * log(p) -
      sum((di[, m] + 1/p) * log(1 + cumhaz[, m] * p)) + sum(lnhaz[, m]) -
      n * log(gamma(1/p)) + sum(log(gamma(di[, m] + 1/p)))
    -lik
  }
  initial <- c(1, 0.5, 0.5)
  estt <- nlm(ll, initial, hessian = TRUE)
  lambda <- estt$estimate[1]   # the original assigned the whole vector to each name
  theta  <- estt$estimate[2]
  beta   <- estt$estimate[3]
  hessian <- estt$hessian      # was t$hessian, presumably a typo for estt$hessian
  cov <- solve(hessian)
  vtheta <- cov[2, 2]
  vbeta  <- cov[3, 3]
  output1[m] <- theta
  output2[m] <- beta
  output3[m] <- vtheta
  output4[m] <- vbeta
}

theta  <- output1
beta   <- output2
vtheta <- output3
vbeta  <- output4
```

## Re: how to make simulation faster

 In reply to this post by jholtman

Thank you. I tried Rprof, and it looks like the aggregate function I am using is one of the functions that takes most of the time. What is the difference between self time and total time?

```
$by.total
                     total.time total.pct self.time self.pct
f                        925.92     99.98      5.16     0.56
                         925.92     99.98      0.06     0.01
nlm                      925.92     99.98      0.00     0.00
aggregate                885.66     95.64      0.28     0.03
aggregate.default        885.38     95.61      0.02     0.00
aggregate.data.frame     885.34     95.60     12.54     1.35
lapply                   817.06     88.23    183.76    19.84
FUN                      814.48     87.95     53.50     5.78
split                    425.06     45.90      4.50     0.49
split.default            420.56     45.41     14.16     1.53
factor                   408.84     44.15    347.12    37.48
unique                   235.32     25.41     19.38     2.09
sapply                   202.60     21.88      1.70     0.18
unlist                   181.40     19.59    132.28    14.28
simplify2array           148.70     16.06      1.10     0.12
[[<-                      41.24      4.45      1.62     0.17
[[<-.data.frame           39.62      4.28     31.42     3.39
^                         26.18      2.83     26.18     2.83
```

jholtman wrote:

> use Rprof to profile your code to determine where time is being spent. This might tell you where to concentrate your effort.
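[Editorial note: the profile above shows nearly all of the self time inside `factor()`, which `split()` calls on the grouping variable inside `aggregate()` on every likelihood evaluation. A hedged sketch with made-up stand-in data (names follow the earlier post): building the grouping factor once, outside the likelihood, and summing with `tapply()` avoids recomputing the factor on every `nlm()` call.]

```r
# Stand-in data; i and x mirror the variables in the original post.
i  <- rep(1:100, each = 2)
x  <- rexp(length(i))
fi <- factor(i)        # build the factor once, outside the likelihood

a <- aggregate(x, by = list(i), FUN = sum)$x  # re-factors i on every call
b <- as.numeric(tapply(x, fi, sum))           # reuses the precomputed factor

stopifnot(all.equal(a, b))
```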