withTimeout bug, it does not work properly with nlme anymore

Ramiro Barrantes-2
Hello,

I was relying on withTimeout (from R.utils) to help me stop nlme when it "hangs".  However, recently this stopped working.  I am pasting a reproducible example below: withTimeout should stop nlme after 10 seconds, but the code generates data for which nlme does not converge (or takes too long), and withTimeout does not stop it.  I tried this on both Linux (64-bit, CentOS 7, R 3.4.1, nlme 3.1-131, R.utils 2.6, and also with R 3.2.5) and Mac (Sierra 10.13.1, R 3.4.2, same versions of nlme and R.utils).  It takes over R and I need to use brute force to stop it.  As mentioned, this used to work, and it is very helpful when running a loop in which nlme goes through many models.

Thank you in advance for any help,
Ramiro

library(nlme)
library(R.utils)

dat<-data.frame(x=c(3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3,3,3,3,3,3,3,3,3,3,3,3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86),
y=c(0.35,0.69,0.57,1.48,6.08,-0.34,0.53,1.66,0.02,4.4,8.42,3.3,2.32,-2.3,7.52,-2.12,3.41,-4.76,7.9,5.04,10.26,-1.42,7.85,-1.88,3.81,-2.59,4.32,5.7,1.18, -1.74,1.81,6.16,4.2,-0.39,1.55,-1.4,1.76,-4.14,-2.36,-0.24,4.8,-7.07,1.34,1.98,0.86,-3.96,-0.61,2.68,-1.65,-2.06,3.67,-0.19,2.33,3.78,2.16,0.35, -5.6,1.32,2.99,4.21,-0.9,4.32,-4.01,2.03,0.9,-0.74,-5.78,5.76,0.52,1.37,-0.9,-4.06,-0.49,-2.39,-2.67,-0.71,-0.4,2.55,0.97,1.96,8.13,-5.93,4.01,0.79, -5.61,0.29,4.92,-2.89,-3.24,-3.06,-0.23,0.71,0.75,4.6,1.35, -3.35),
f.block=c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4),
id=c("a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3","a2","a2","a2","a2","a1","a1","a1","a1","a3","a3","a3","a3"))

fpl.B.range <- function(lx,logbase,A,B,C,D) {
    A/(1+logbase^(-B*(lx-C)))+D
}
myFormula<-list(formula(A~id),formula(B~id),formula(C~id),formula(D~id))
INIT <- c(A.a1=1,A.a2=0,A.a3=0,B=1,B.a2=0,B.a3=0,C=0,C.a2=0,C.a3=0,D=1,D.a2=0,D.a3=0)


for (i in 1:100) {
    print(paste("Iteration ",i,"...this will stall soon"))
    set.seed(i)
    dat$y <- dat$y+rnorm(nrow(dat), mean = 0, sd = 0.1)
    try({withTimeout(nlme(model=y~fpl.B.range(x,exp(1),A,B,C,D),
                      control=nlmeControl(maxIter=50,pnlsMaxIter=7,msMaxIter=50,niterEM=25),
                          data=dat, na.action=na.omit,
                          fixed=myFormula,random=list(f.block=pdSymm(A+B+C+D~1)),
                          start=INIT),timeout=10)})
}




______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: withTimeout bug, it does not work properly with nlme anymore

Martin Maechler
>>>>> Ramiro Barrantes <[hidden email]>
>>>>>     on Mon, 27 Nov 2017 21:02:52 +0000 writes:

    > Hello, I was relying on withTimeout (from R.utils) to help
    > me stop nlme when it "hangs".  However, recently this
    > stopped working.  I am pasting a reproducible example
    > below: withTimeout should stop nlme after 10 seconds but
    > the code will generate data for which nlme does not
    > converge (or takes too long) and withTimeout does not stop
    > it.  I tried this both on a linux (64 bit, CentOS 7, R
    > 3.4.1, nlme 3.1-131 R.util 2.6, and also with R 3.2.5) and
    > mac (Sierra 10.13.1, R 3.4.2, same versions or nlme and
    > R.utils).  It takes over R and I need to use brute-force
    > to stop it.  As mentioned, this used to work and it is
    > very helpful for the purposes of having a loop where nlme
    > goes through many models.

    > Thank you in advance for any help, Ramiro

Dear Ramiro,

As I thought you were reporting a bug in R.utils' withTimeout(),
I, and maybe others, did not react.

You've addressed this again in a non-public e-mail,
and indeed the underlying bug is really in nlme  which you do
mention implicitly.

I'm appending a version of your example that does not use R.utils
at all and reproducibly hangs for me with R 3.4.3, R 3.4.3
patched and R-devel (and almost surely with earlier versions of R,
which I did not check).

Indeed, the call to nlme() "stalls" / "hangs" / "freezes" R
and cannot be terminated in a regular way; like you, I need
"brute force" to stop it, killing the R process too.

As the maintainer of 'nlme' *is* R-core,
we are asked to fix this, at least by making it interruptible.
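
For context, a minimal sketch of why a timeout can be missed. According to the documentation of base R's setTimeLimit(), time limits are checked only at points where a user interrupt could occur: frequently in R code and during Sys.sleep(), but in compiled code only where its author has added such checks. The helper name with_limit below is hypothetical (it is not part of R or R.utils), and the assumption is that R.utils::withTimeout() relies on the same mechanism.

```r
## Sketch only: `with_limit` is a hypothetical helper, not an R or R.utils function.
## It evaluates `expr` under an elapsed-time limit set with base R's setTimeLimit().
with_limit <- function(expr, seconds) {
    ## always clear the limit again, whether or not it fired
    on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
    setTimeLimit(elapsed = seconds, transient = TRUE)
    ## `expr` is evaluated lazily here, i.e. after the limit has been set
    tryCatch(expr, error = function(e) conditionMessage(e))
}

## An endless loop written in R code: the limit is checked during Sys.sleep(),
## so the loop is stopped and the error message is returned as a string.
msg <- with_limit(repeat Sys.sleep(0.05), seconds = 0.5)
print(msg)
```

A loop running inside compiled code that never checks for interrupts would not be stopped this way, which is consistent with the behaviour reported in this thread.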

Still, I should not take time for that in the next couple of
weeks, as I have several other day-job duties to fulfill
instead, and so I will not promise anything here.

Tested (minimal) patches are welcome!

Here's a slightly simplified version of your script which
exhibits the problem and shows that it does not happen in
nlminb() -- as I wrongly assumed for a while -- but in nlme's
call to its own .C() code.

I am looking into fixing this (making it interruptible / detecting
the infinite loop).
My guess is that it only happens in degenerate cases like here.
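
Until the compiled code is interruptible, one possible workaround is to run each fit in a separate process that can be killed from outside. The following is a sketch only, assuming a Unix-alike (mcparallel() forks and is not available on Windows); the helper name run_with_kill is hypothetical, while mcparallel(), mccollect() and tools::pskill() are functions shipped with R.

```r
library(parallel)  # ships with R; forking works on Unix-alikes only

## Sketch only: `run_with_kill` is a hypothetical helper name.
## It calls fun() in a forked child and gives up after `timeout` seconds.
run_with_kill <- function(fun, timeout) {
    job <- mcparallel(fun())                        # evaluate fun() in a child process
    res <- mccollect(job, wait = FALSE, timeout = timeout)
    tools::pskill(job$pid, tools::SIGKILL)          # kill the child, finished or hung
    mccollect(job, wait = FALSE)                    # reap the child process
    if (is.null(res)) return(NULL)                  # no result arrived in time
    res[[1]]
}

## A quick job returns its value; a job that runs too long yields NULL.
quick <- run_with_kill(function() 1 + 1, timeout = 5)
slow  <- run_with_kill(function() { Sys.sleep(30); "done" }, timeout = 1)
```

In the loop from the original report, the nlme() call could be wrapped as run_with_kill(function() nlme(...), timeout = 10). Since the child is killed by the operating system, this does not depend on the hung code ever checking for interrupts; the price is one fork per fit.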

Martin Maechler
ETH Zurich



## From: Ramiro Barrantes <[hidden email]>
## To: "[hidden email]" <[hidden email]>
## Subject: [Rd] withTimeout bug, it does not work properly with nlme anymore
## Date: Mon, 27 Nov 2017 21:02:52 +0000

## Hello,

## I was relying on withTimeout (from R.utils) to help me stop nlme when it
## "hangs".  However, recently this stopped working.  I am pasting a
## reproducible example below: withTimeout should stop nlme after 10 seconds
## but the code will generate data for which nlme does not converge (or takes
## too long) and withTimeout does not stop it.  I tried this both on a linux
## (64 bit, CentOS 7, R 3.4.1, nlme 3.1-131 R.util 2.6, and also with R
## 3.2.5) and mac (Sierra 10.13.1, R 3.4.2, same versions or nlme and
## R.utils).  It takes over R and I need to use brute-force to stop it.  As
## mentioned, this used to work and it is very helpful for the purposes of
## having a loop where nlme goes through many models.

## Thank you in advance for any help,
## Ramiro

## (Modifications by Martin Maechler)
dat <- data.frame(
    x=c(3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3.69,3,3,3,3,3,3,3,3,3,3,3,3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,2.3,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,1.61,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.92,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-0.47,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86,-1.86),
    y=c(0.35,0.69,0.57,1.48,6.08,-0.34,0.53,1.66,0.02,4.4,8.42,3.3,2.32,-2.3,7.52,-2.12,3.41,-4.76,7.9,5.04,10.26,-1.42,7.85,-1.88,3.81,-2.59,4.32,5.7,1.18, -1.74,1.81,6.16,4.2,-0.39,1.55,-1.4,1.76,-4.14,-2.36,-0.24,4.8,-7.07,1.34,1.98,0.86,-3.96,-0.61,2.68,-1.65,-2.06,3.67,-0.19,2.33,3.78,2.16,0.35, -5.6,1.32,2.99,4.21,-0.9,4.32,-4.01,2.03,0.9,-0.74,-5.78,5.76,0.52,1.37,-0.9,-4.06,-0.49,-2.39,-2.67,-0.71,-0.4,2.55,0.97,1.96,8.13,-5.93,4.01,0.79, -5.61,0.29,4.92,-2.89,-3.24,-3.06,-0.23,0.71,0.75,4.6,1.35, -3.35),
    f.block = rep(1:4, 24),
    id= paste0("a", rep(c(2,1,3),each=4)))
str(dat)
## 'data.frame': 96 obs. of  4 variables:
##  $ x      : num  3.69 3.69 3.69 3.69 3.69 3.69 3.69 3.69 3.69 3.69 ...
##  $ y      : num  0.35 0.69 0.57 1.48 6.08 -0.34 0.53 1.66 0.02 4.4 ...
##  $ f.block: num  1 2 3 4 1 2 3 4 1 2 ...
##  $ id     : Factor w/ 3 levels "a1","a2","a3": 2 2 2 2 1 1 1 1 3 3 ...

table(dat$id) # 32 x 3 -- indeed the 2 factors are perfectly balanced:
xtabs(~id + f.block, data=dat)

## This is the version to directly trigger the bug
dd <- dat
set.seed(33)
dd$y <- dat$y + rnorm(nrow(dat), mean = 0, sd = 0.1)

library(nlme, lib.loc = .Library) # <- get R's version not a newer one
cat("nlme version: ", format(packageVersion("nlme")), "\n")
## MM: Barrantes used 'logbase' and 'logbase^(..)' -- I just use exp(..):
fpl.B.range <- function(lx,A,B,C,D) {
    A/(1+exp(-B*(lx-C))) + D
}

INIT <- c(A.a1=1, A.a2=0, A.a3=0,
          B = 1,  B.a2=0, B.a3=0,
          C = 0,  C.a2=0, C.a3=0,
          D = 1,  D.a2=0, D.a3=0)

if(FALSE) # for interactive experiments, eval the following
debugonce(nlme.formula)

trace(nlminb, ## show arguments on entry:
      quote(print(ls.str())),
      exit = quote({cat("exiting nlminb();  port_msg(iv1):\n");
          port_msg(iv1); cat("variables:\n"); print(ls.str())}))


## MM: from watching 'htop' I don't see a clear memory leak...
## >>>>>>>>>>>>>>>Careful: This does "freeze R" : >>>>>>>>>>>>>>>>>
nlme(y ~ fpl.B.range(x, A,B,C,D), data = dd,
     fixed = list(A~id, B~id, C~id, D~id),
     random = list(f.block = pdSymm(A+B+C+D ~ 1)),
     start = INIT,
     control= nlmeControl(## NB: msMaxIter=200, ## gives singularity error at iter.55
         msVerbose=TRUE), #==> passed as 'trace' to nlminb()
     verbose = TRUE) -> res
## Shows that nlminb() is entered, then
## prints 50 iterations, and then *AGAIN* number 50 (!!)
## and then shows how nlminb() *is* exited, then shows -- thanks to verbose=TRUE
## **Iteration 1
## LME step: Loglik: -245.5092, nlminb iterations: 50
## reStruct  parameters:
##   f.block1   f.block2   f.block3   f.block4   f.block5   f.block6   f.block7   f.block8   f.block9  f.block10
##  2.3611369 -0.8382860 13.0713658 -1.0197240 -1.1551335 -0.3378552  5.4881588 -0.4035375 -3.3995335 14.7498195
## and then
## it stalls, I need to kill the R process
## --  [on lynne Fedora 26 (4.14.11-200.fc26.x86_64), Jan.2018]
## in R 3.4.3 and R 3.4.3 patched with nlme 3.1.131
##                    and R-devel with nlme 3.1.135
