Re: Convergence in Monte Carlo Simulation

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: Convergence in Monte Carlo Simulation

Phat Chau
Hello,

I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations – the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if “model” does not return a string?

I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)?

powercrosssw <- function(nclus, clsize) {

  set.seed()

  cohortsw <- genData(nclus, id = "cluster")
  cohortsw <- addColumns(clusterDef, cohortsw)
  cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period")
  cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt")

  pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id")

  dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period"))
  dx <- addColumns(patError, dx)

  setkey(dx, id, cluster, period)

  dx <- addColumns(outDef, dx)

  return(dx)

}

i=1

while (i < 1000) {

  dx <- powercrosssw()

  #Fit multi-level model to simulated dataset
  model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"),
                     warning = function(w) { "warning" }
  )

  if (! is.character(model5)) {

    coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
    pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
    error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
    bresult <- c(bresult, coeff)
    presult <- c(presult, pvalue)
    eresult <- c(eresult, error)

    i <- i + 1
  }
}

Thank you so much.



        [[alternative HTML version deleted]]

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Convergence in Monte Carlo Simulation

Michael Dewey-3
I am not 100% clear what your code is doing as it gets a bit wangled as
you posted in HTML but here are a couple of thoughts.

You need to set the seed outside any loops so it happens once and for all.

I would test after trycatch and keep a separate count of failures and
successes as the failure to converge must be meaningful about the
scientific question whatever that is. At the moment your count appears
to be in the correct place to count successes.

Michael

On 14/06/2020 02:50, Phat Chau wrote:

> Hello,
>
> I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations – the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if “model” does not return a string?
>
> I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)?
>
> powercrosssw <- function(nclus, clsize) {
>
>    set.seed()
>
>    cohortsw <- genData(nclus, id = "cluster")
>    cohortsw <- addColumns(clusterDef, cohortsw)
>    cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period")
>    cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt")
>
>    pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id")
>
>    dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period"))
>    dx <- addColumns(patError, dx)
>
>    setkey(dx, id, cluster, period)
>
>    dx <- addColumns(outDef, dx)
>
>    return(dx)
>
> }
>
> i=1
>
> while (i < 1000) {
>
>    dx <- powercrosssw()
>
>    #Fit multi-level model to simulated dataset
>    model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"),
>                       warning = function(w) { "warning" }
>    )
>
>    if (! is.character(model5)) {
>
>      coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
>      pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
>      error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
>      bresult <- c(bresult, coeff)
>      presult <- c(presult, pvalue)
>      eresult <- c(eresult, error)
>
>      i <- i + 1
>    }
> }
>
> Thank you so much.
>
>
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> [hidden email] mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

--
Michael
http://www.dewey.myzen.co.uk/home.html

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Convergence in Monte Carlo Simulation

Phat Chau
Thank you Michael.

I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package.

I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power).

Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures.

Thank you.
Edward

On 2020-06-14, 6:46 AM, "Michael Dewey" <[hidden email]> wrote:

    I am not 100% clear what your code is doing as it gets a bit wangled as
    you posted in HTML but here are a couple of thoughts.
   
    You need to set the seed outside any loops so it happens once and for all.
   
    I would test after trycatch and keep a separate count of failures and
    successes as the failure to converge must be meaningful about the
    scientific question whatever that is. At the moment your count appears
    to be in the correct place to count successes.
   
    Michael
   
    On 14/06/2020 02:50, Phat Chau wrote:
    > Hello,
    >
    > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations – the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if “model” does not return a string?
    >
    > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)?
    >
    > powercrosssw <- function(nclus, clsize) {
    >
    >    set.seed()
    >
    >    cohortsw <- genData(nclus, id = "cluster")
    >    cohortsw <- addColumns(clusterDef, cohortsw)
    >    cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period")
    >    cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt")
    >
    >    pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id")
    >
    >    dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period"))
    >    dx <- addColumns(patError, dx)
    >
    >    setkey(dx, id, cluster, period)
    >
    >    dx <- addColumns(outDef, dx)
    >
    >    return(dx)
    >
    > }
    >
    > i=1
    >
    > while (i < 1000) {
    >
    >    dx <- powercrosssw()
    >
    >    #Fit multi-level model to simulated dataset
    >    model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"),
    >                       warning = function(w) { "warning" }
    >    )
    >
    >    if (! is.character(model5)) {
    >
    >      coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
    >      pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
    >      error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
    >      bresult <- c(bresult, coeff)
    >      presult <- c(presult, pvalue)
    >      eresult <- c(eresult, error)
    >
    >      i <- i + 1
    >    }
    > }
    >
    > Thank you so much.
    >
    >
    >
    > [[alternative HTML version deleted]]
    >
    > ______________________________________________
    > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
    > https://stat.ethz.ch/mailman/listinfo/r-help
    > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    > and provide commented, minimal, self-contained, reproducible code.
    >
    >
   
    --
    Michael
    http://www.dewey.myzen.co.uk/home.html
   

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Convergence in Monte Carlo Simulation

Michael Dewey-3
Dear Edward

Every time you call your function powercrosssw() it resets the seed so
you must be calling it multiple times in some way.

Michael

On 14/06/2020 13:57, Phat Chau wrote:

> Thank you Michael.
>
> I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package.
>
> I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power).
>
> Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures.
>
> Thank you.
> Edward
>
> On 2020-06-14, 6:46 AM, "Michael Dewey" <[hidden email]> wrote:
>
>      I am not 100% clear what your code is doing as it gets a bit wangled as
>      you posted in HTML but here are a couple of thoughts.
>      
>      You need to set the seed outside any loops so it happens once and for all.
>      
>      I would test after trycatch and keep a separate count of failures and
>      successes as the failure to converge must be meaningful about the
>      scientific question whatever that is. At the moment your count appears
>      to be in the correct place to count successes.
>      
>      Michael
>      
>      On 14/06/2020 02:50, Phat Chau wrote:
>      > Hello,
>      >
>      > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations – the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if “model” does not return a string?
>      >
>      > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)?
>      >
>      > powercrosssw <- function(nclus, clsize) {
>      >
>      >    set.seed()
>      >
>      >    cohortsw <- genData(nclus, id = "cluster")
>      >    cohortsw <- addColumns(clusterDef, cohortsw)
>      >    cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period")
>      >    cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt")
>      >
>      >    pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id")
>      >
>      >    dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period"))
>      >    dx <- addColumns(patError, dx)
>      >
>      >    setkey(dx, id, cluster, period)
>      >
>      >    dx <- addColumns(outDef, dx)
>      >
>      >    return(dx)
>      >
>      > }
>      >
>      > i=1
>      >
>      > while (i < 1000) {
>      >
>      >    dx <- powercrosssw()
>      >
>      >    #Fit multi-level model to simulated dataset
>      >    model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"),
>      >                       warning = function(w) { "warning" }
>      >    )
>      >
>      >    if (! is.character(model5)) {
>      >
>      >      coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
>      >      pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
>      >      error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
>      >      bresult <- c(bresult, coeff)
>      >      presult <- c(presult, pvalue)
>      >      eresult <- c(eresult, error)
>      >
>      >      i <- i + 1
>      >    }
>      > }
>      >
>      > Thank you so much.
>      >
>      >
>      >
>      > [[alternative HTML version deleted]]
>      >
>      > ______________________________________________
>      > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
>      > https://stat.ethz.ch/mailman/listinfo/r-help
>      > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>      > and provide commented, minimal, self-contained, reproducible code.
>      >
>      >
>      
>      --
>      Michael
>      http://www.dewey.myzen.co.uk/home.html
>      
>
>
>

--
Michael
http://www.dewey.myzen.co.uk/home.html

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Convergence in Monte Carlo Simulation

Phat Chau
Dear Michael,

So I shouldn't be setting the seed at all then since it is automatic? Or is the suggestion here that a new seed is chosen each time?

I think rather than having you guess at the problem (my apologies) I will post the entire of the code (with omissions where it is not directly impacting the problem at hand). Sometimes I hesitate to post huge blocks because it can be a bit daunting, but I realize in coding that even the smallest glitch can throw everything off.

Set.seed(123) <<<Placing the seed here leads to no variation at all in my simulations as noted previously

clusterDef <- defDataAdd(varname = "u_3", dist = "normal", formula = 0, variance = 25.77)
patDef <- defDataAdd(varname = "u_2", dist = "normal", formula = 0, variance = 120.62)
patError <- defDataAdd(varname = "error", dist = "normal", formula = 0, variance = 38.35)

...(Data definition code omitted)

setkey(patTm, id, cluster, period)

#Define outcome y
outDef <- defDataAdd(varname = "y", formula = "17.87 + 5.0*Ijt - 5.42*I(period == 1) - 5.72*I(period == 2) - 7.03*I(period == 3) - 6.13*I(period == 4) - 9.13*I(period == 5) + u_3 + u_2 + error", dist = "normal")

patTm <- addColumns(outDef, patTm)

powercrosssw <- function(nclus, clsize) {
       
        set.seed() < not sure if placing it the function rather than at the top is appropriate to generate a new and independent dataset for each of the 1000 iterations

Regarding the convergence issue, it seems that what you are saying is I have it all set up correctly (i.e. it will iterate until 1000 iterations converge). I do get this rather peculiar error though in some cases:

Error in lme.formula(y ~ factor(period) + factor(Ijt), data = patTm, random = ~1 |  :
  nlminb problem, convergence error code = 1
  message = false convergence (8)

I am not quite sure what the problem is there.

Edward


On 2020-06-14, 10:16 AM, "Michael Dewey" <[hidden email]> wrote:

    Dear Edward
   
    Every time you call your function powercrosssw() it resets the seed so
    you must be calling it multiple times in some way.
   
    Michael
   
    On 14/06/2020 13:57, Phat Chau wrote:
    > Thank you Michael.
    >
    > I will clarify some more. The function in the first part of the code that I posted generates the simulated dataset for a cluster randomized trial from the simstudy package.
    >
    > I am not quite clear what you mean by placing it outside the loop. So the goal here is to create n = 1000 independent datasets with different (randomly drawn values from the specified normal distributions not shown) for all of the parameters. What I have tried to do is place the seed at the very top of all my code in the past, but what that does is it leads to the creation of a single dataset that gets repeated over and over n = 1000 times. Hence, there ends up being no variability in the data (and power estimates from the p-values given the stated and required power).
    >
    > Regarding the counter, is it correct in this instance that the loop will continue until n = 1000 iterations have successfully converged? I am not so concerned with counting failures.
    >
    > Thank you.
    > Edward
    >
    > On 2020-06-14, 6:46 AM, "Michael Dewey" <[hidden email]> wrote:
    >
    >      I am not 100% clear what your code is doing as it gets a bit wangled as
    >      you posted in HTML but here are a couple of thoughts.
    >      
    >      You need to set the seed outside any loops so it happens once and for all.
    >      
    >      I would test after trycatch and keep a separate count of failures and
    >      successes as the failure to converge must be meaningful about the
    >      scientific question whatever that is. At the moment your count appears
    >      to be in the correct place to count successes.
    >      
    >      Michael
    >      
    >      On 14/06/2020 02:50, Phat Chau wrote:
    >      > Hello,
    >      >
    >      > I put together the following code and am curious about its correctness. My first question relates to the Monte Carlo simulations – the goal is to continue to iterate until I get n = 1000 simulations where the model successfully converges. I am wondering if I coded it correctly below with the while loop. Is the idea that the counter increments by one only if “model” does not return a string?
    >      >
    >      > I would also like to know how I can create n = 1000 independent data sets. I think to do this, I would have to set a random number seed via set.seed() before the creation of each dataset. Where would I enter set.seed in the syntax below? Would it be in the function (as indicated in red)?
    >      >
    >      > powercrosssw <- function(nclus, clsize) {
    >      >
    >      >    set.seed()
    >      >
    >      >    cohortsw <- genData(nclus, id = "cluster")
    >      >    cohortsw <- addColumns(clusterDef, cohortsw)
    >      >    cohortswTm <- addPeriods(cohortsw, nPeriods = 8, idvars = "cluster", perName = "period")
    >      >    cohortstep <- trtStepWedge(cohortswTm, "cluster", nWaves = 4, lenWaves = 1, startPer = 1, grpName = "Ijt")
    >      >
    >      >    pat <- genCluster(cohortswTm, cLevelVar = "timeID", numIndsVar = clsize, level1ID = "id")
    >      >
    >      >    dx <- merge(pat[, .(cluster, period, id)], cohortstep, by = c("cluster", "period"))
    >      >    dx <- addColumns(patError, dx)
    >      >
    >      >    setkey(dx, id, cluster, period)
    >      >
    >      >    dx <- addColumns(outDef, dx)
    >      >
    >      >    return(dx)
    >      >
    >      > }
    >      >
    >      > i=1
    >      >
    >      > while (i < 1000) {
    >      >
    >      >    dx <- powercrosssw()
    >      >
    >      >    #Fit multi-level model to simulated dataset
    >      >    model5 <- tryCatch(lme(y ~ factor(period) + factor(Ijt), data = dx, random = ~1|cluster, method = "REML"),
    >      >                       warning = function(w) { "warning" }
    >      >    )
    >      >
    >      >    if (! is.character(model5)) {
    >      >
    >      >      coeff <- coef(summary(model5))["factor(Ijt)1", "Value"]
    >      >      pvalue <- coef(summary(model5))["factor(Ijt)1", "p-value"]
    >      >      error <- coef(summary(model5))["factor(Ijt)1", "Std.Error"]
    >      >      bresult <- c(bresult, coeff)
    >      >      presult <- c(presult, pvalue)
    >      >      eresult <- c(eresult, error)
    >      >
    >      >      i <- i + 1
    >      >    }
    >      > }
    >      >
    >      > Thank you so much.
    >      >
    >      >
    >      >
    >      > [[alternative HTML version deleted]]
    >      >
    >      > ______________________________________________
    >      > [hidden email] mailing list -- To UNSUBSCRIBE and more, see
    >      > https://stat.ethz.ch/mailman/listinfo/r-help
    >      > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    >      > and provide commented, minimal, self-contained, reproducible code.
    >      >
    >      >
    >      
    >      --
    >      Michael
    >      http://www.dewey.myzen.co.uk/home.html
    >      
    >
    >
    >
   
    --
    Michael
    http://www.dewey.myzen.co.uk/home.html
   

______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.