# suggestions for nls error: false convergence Classic List Threaded 6 messages Reply | Threaded
Open this post in threaded view
|

## suggestions for nls error: false convergence

 Hi,   I'm trying to fit some data using a logistic function defined as y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau) My data is below: x <- 1:100 y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5, 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21, 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169, 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268, 268,270,272,272,272,273,275,275,275,276) My first attempt was to use nls as below:     d <- data.frame(x=x, y=y)     model <- nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), data=d,     start=list(a=277,m=100,n=101,tau=10),     algorithm='port', trace=TRUE,     control=nls.control(maxiter=5000, minFactor=1/2048)) Running the above code I get the following error message: Convergence failure: function evaluation limit reached without convergence (9). To investigate this further I used nlminb() to get a set of starting parameters. Thus I did: func <- function( par ) {     a = par     m = par     n = par     tau = par     a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) } est <- nlminb( c(277, 100,101, 10), objective=func, control=list(eval.max=400, iter.max=1000)) I get absolute convergence and a set of parameter values. Plugging these into the nls call and trying again still gives me  Convergence failure: function evaluation limit reached without convergence (9) I have tried a number of different starting values for the nls() call but I usually end up getting the following error: Convergence failure: false convergence (8) After reading the PORT library docs, I see that this error can mean 1) gradient is calculated incorrectly 2) stopping tolerances are too tight 3) gradient is discontinous near some iterate However, since nls() usually reports the above error after 30 to 40 iterations, the PORT docs suggest that it is not problem 1. I'm not sure about (3) - I have other data which are somewhat similar to the above data, but they lead to a straightforward fit. In the end I tried a different starting value and lowered the tolerances a little, and I got a valid fit My questions are: 1) Why would the parameters that lead nlminb() to converge to work in nls() (since I'm using the PORT algorithm in both cases)? 2) Is there a strategy to obtain starting values? In my case I know that a should be around 277, but for the others I'm not sure. 3) Is there a quick way to check whether the gradient is discontinous at a point, numerically (rather than calculating the analytical derivatives)? I did plot(diff(y)) and it certainly looks messy, but I also have other y vectors which look equally jagged (though the jaggedness occurs at lower x) Any suggestions would be appreciated. Thanks, ------------------------------------------------------------------- Rajarshi Guha <[hidden email]> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE ------------------------------------------------------------------- After a number of decimal places, nobody gives a damn. ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Reply | Threaded
Open this post in threaded view
|

## Re: suggestions for nls error: false convergence

 I generally prefer "optim" to "nlminb", because it will optionally return the hessian, and a review of the eigenvalues and vectors can help diagnose problems like this.  When I tried "optim" with your "func", I got an error message:  > est. <- optim( c(277, 100,101, 10), func, +                hessian=TRUE) Error in optim(c(277, 100, 101, 10), func, hessian = TRUE) :         objective function in optim evaluates to length 100 not 1           For both "optim" and "nlminb", the second argument is a "function to be minimized", which must return a scalar.  Your "func" returned a vector of length 100 = length(x);  "nlminb" tried to minimize only the first component of that vector without reference to y!           I therefore modified your "func" as follows and tried it with "nlminb" and with three methods of "optim": func1 <- function( par,y, x ) { a = par m = par n = par tau = par y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) sum((y-y.)^2) } est.n <- nlminb( c(277, 100,101, 10), objective=func1, control=list(eval.max=400, iter.max=1000), y=y, x=x) est.o1 <- optim( c(277, 100,101, 10), func1,                 hessian=TRUE, y=y, x=x) est.o2 <- optim( c(277, 100,101, 10), func1,         method="BFGS", hessian=TRUE, y=y, x=x) est.o3 <- optim( c(277, 100,101, 10), func1,         method="CG", hessian=TRUE, y=y, x=x) est.o4 <- optim( c(277, 100,101, 10), func1,         method="SANN", hessian=TRUE, y=y, x=x)           These 5 optimizers found the following "minima", returning "convergence" messages as follows: nlminb:  \$objective=808.7282;  \$convergence=0, "relative convergence (4)" optim(method="Nelder-Mead"(default)):  \$value=9032.814 (11 times nlminb);  \$convergence=1 (the iteration limit 'maxit' had been reached.) optim(method="BFGS"):  \$value=1189625 (1471 times nlminb); \$convergence=0 ("successful convergence") optim(method="CG"):  \$value=1189574 (1471 times nlminb);  \$convergence=1 (the iteration limit 'maxit' had been reached.) optim(method="SANN"):  \$value=42819.06 (53 times nlminb); \$convergence=0 ("successful convergence")           Clearly, nlminb got the smallest residual sum of squares.  To get the hessian there, I fed the nlminb output into optim as follows: est.no1 <- optim(est.n\$par, func1,                 hessian=TRUE, y=y, x=x) est.no.hess.eig <- eigen(est.no1\$hessian, symmetric=TRUE) signif(est.no.hess.eig\$values, 2)   6.9e+05  3.2e+01  1.6e-04 -1.0e-07                   Since the smallest eigenvalue is negative, we might think that the hessian is indefinite.  However, the smallest eigenvalue is less than 1e-12 times the largest in absolute value, which really means that the smallest eigenvalue is essentially zero.  Moreover, since the third eigenvalue is less than 1e-9 times the largest, the true rank of the hessian is at most 2.  If we also compare the first and second eigenvalues, we that the first is over 2000 times the second.  This says that with your model and your data, you really can only estimate one parameter;  everything else looks like noise, to me.           The eigenvectors provide more detail:  > round(est.no.hess.eig\$vectors, 2)        [,1] [,2]  [,3] [,4] [1,] -0.01 1.00  0.00 0.00 [2,]  0.00 0.00  1.00 0.01 [3,]  0.00 0.00 -0.01 1.00 [4,]  1.00 0.01  0.00 0.00  >           This says that your data can estimate the fourth parameter very well, as the largest eigenvalue is associated almost exclusively with that parameter.  You might also be able to estimate the second parameter somewhat, as it is associated almost exclusively with the second eigenvalue.  However, you have to move the second and third parameters much more than the first and fourth to see much change in the residual sum of squares.           For more information, I suggest you consult the following: Seber and Wild (1988) Nonlinear Regression (Wiley;  reprinted in 2003, esp. ch. 7 on "Growth Models") Bates and Watts (1988) Nonlinear Regression Analysis and Its Applications (Wiley, esp. ch. 6 on "Graphical Summaries")           hope this helps.           spencer graves Rajarshi Guha wrote: > Hi, >   I'm trying to fit some data using a logistic function defined as > > y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau) > > My data is below: > > x <- 1:100 > > y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, > 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5, > 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21, > 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169, > 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268, > 268,270,272,272,272,273,275,275,275,276) > > My first attempt was to use nls as below: > >     d <- data.frame(x=x, y=y) >     model <- nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), data=d, >     start=list(a=277,m=100,n=101,tau=10), >     algorithm='port', trace=TRUE, >     control=nls.control(maxiter=5000, minFactor=1/2048)) > > Running the above code I get the following error message: > > Convergence failure: function evaluation limit reached without > convergence (9). > > To investigate this further I used nlminb() to get a set of starting > parameters. Thus I did: > > func <- function( par ) { >     a = par >     m = par >     n = par >     tau = par >     a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) > } > est <- nlminb( c(277, 100,101, 10), objective=func, > control=list(eval.max=400, iter.max=1000)) > > I get absolute convergence and a set of parameter values. Plugging these > into the nls call and trying again still gives me > >  Convergence failure: function evaluation limit reached without > convergence (9) > > I have tried a number of different starting values for the nls() call > but I usually end up getting the following error: > > Convergence failure: false convergence (8) > > After reading the PORT library docs, I see that this error can mean > > 1) gradient is calculated incorrectly > 2) stopping tolerances are too tight > 3) gradient is discontinous near some iterate > > However, since nls() usually reports the above error after 30 to 40 > iterations, the PORT docs suggest that it is not problem 1. I'm not sure > about (3) - I have other data which are somewhat similar to the above > data, but they lead to a straightforward fit. > > > > In the end I tried a different starting value and lowered the tolerances > a little, and I got a valid fit > > My questions are: > > 1) Why would the parameters that lead nlminb() to converge to work in > nls() (since I'm using the PORT algorithm in both cases)? > > 2) Is there a strategy to obtain starting values? In my case I know that > a should be around 277, but for the others I'm not sure. > > 3) Is there a quick way to check whether the gradient is discontinous at > a point, numerically (rather than calculating the analytical > derivatives)? I did > > plot(diff(y)) > > and it certainly looks messy, but I also have other y vectors which look > equally jagged (though the jaggedness occurs at lower x) > > Any suggestions would be appreciated. > > Thanks, > > ------------------------------------------------------------------- > Rajarshi Guha <[hidden email]> > GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE > ------------------------------------------------------------------- > After a number of decimal places, nobody gives a damn. > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html-- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA [hidden email] www.pdf.com Tel:  408-938-4420 Fax: 408-280-7915 ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Reply | Threaded
Open this post in threaded view
|

## Re: suggestions for nls error: false convergence

 In reply to this post by Rajarshi Guha Hi. An alternative is to use the package 'drc' on CRAN to fit your data! x <- 1:100 y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5, 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21, 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169, 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268, 268,270,272,272,272,273,275,275,275,276) ## Defining the function (in a bit different notation) logi <- function(dose, parm){parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 3])) / (1+parm[, 4]*exp(-dose/parm[, 3]))} ## Fitting the function to the data (see ?multdrc for details) library(drc) model <- multdrc(y~x, fct = list(logi, NULL, c("a", "m", "tau", "n")), startVal=c(277, 100, 10, 101)) summary(model) plot(model, log="") Hope this helps? Christian ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Reply | Threaded
Open this post in threaded view
|

## Re: suggestions for nls error: false convergence

 Hi, Christian:           That's very interesting.  May I ask how "multdrc" converged and found all 4 parameters statistically significant when "nls" failed and "optim" told me the hessian was singular?           To try to study this myself, I first compared your numbers with what I got and found that you had swapped the third and fourth parameters. When I swapped them back, I got the same estimates but substantially different standard errors from your function: logi <- function(dose, parm){ parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 3])) / (1+parm[, 4]*exp(-dose/parm[, 3]))} logi. <- function(dose, parm){ parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 4])) / (1+parm[, 3]*exp(-dose/parm[, 4]))} ## Fitting the function to the data (see ?multdrc for details) library(drc) model <- multdrc(y~x, fct = list(logi, NULL, c("a", "m", "tau", "n")),              startVal=c(277, 100, 10, 101)) model. <- multdrc(y~x, fct = list(logi., NULL, c("a", "m", "n", "tau")),              startVal=c(277, 100, 101, 10))   signif(summary(model)\$estimates, 2)                  Estimate Std. Error t-value  p-value a:(Intercept)    2.8e+02    9.6e-01   290.0 4.5e-143 m:(Intercept)    1.5e+03    6.0e+02     2.6  1.2e-02 tau:(Intercept)  5.6e+00    9.6e-02    59.0  2.7e-77 n:(Intercept)    2.3e+05    4.7e+04     4.8  5.1e-06  > signif(summary(model.)\$estimates, 2)                  Estimate Std. Error t-value  p-value a:(Intercept)    2.8e+02    8.5e-01   320.0 4.2e-148 m:(Intercept)    1.5e+03    4.8e+02     3.2  1.7e-03 n:(Intercept)    2.3e+05    3.3e+04     6.9  6.0e-10 tau:(Intercept)  5.6e+00    6.8e-02    83.0  1.2e-91           I then tried "optim" again, rescaling all parameters by your estimated standard errors.  If your code does what I thought it might, would expect that the residual variance times the main diagonal of the inverse of the Hessian should be all 1's.  That's not what I got, so I'm confused.  Here's what I did: rescale <- summary(model.)\$estimates[,2] func2 <- function( par,y, x, rescale ) { par <- rescale*par a = par m = par n = par tau = par y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) sum((y-y.)^2) } est.no2 <- optim(est.n\$par/rescale,       func2,  hessian=TRUE, y=y, x=x, rescale=rescale) round(var.resid <- est.no2\$value/96, 2)  8.42 est.no2.hessEig <- eigen(est.no2\$hessian, symmetric=TRUE)  > round(est.no2.hessEig\$values, 1)  6134.1   52.4   24.2    4.7 # MUCH better than before. hessInv <- with(est.no2.hessEig, vectors %*%(t(vectors)/values))  > var.par <- (est.no2\$value/96)*hessInv  > round(sqrt(diag(var.par)), 2)  0.72 0.76 0.79 0.77           What am I missing?           Thanks again for your work in the "drc" package and for your earlier reply to my comments on this example.           Spencer Graves Christian Ritz wrote: > Hi. > > An alternative is to use the package 'drc' on CRAN to fit your data! > > x <- 1:100 > > y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, > 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5, > 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21, > 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169, > 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268, > 268,270,272,272,272,273,275,275,275,276) > > > ## Defining the function (in a bit different notation) > logi <- function(dose, parm){parm[, 1] * (1+parm[, 2]*exp(-dose/parm[, 3])) / (1+parm[, 4]*exp(-dose/parm[, 3]))} > > ## Fitting the function to the data (see ?multdrc for details) > library(drc) > model <- multdrc(y~x, fct = list(logi, NULL, c("a", "m", "tau", "n")), startVal=c(277, 100, 10, 101)) > > summary(model) > plot(model, log="") > > > Hope this helps? > > Christian > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html-- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 333 West San Carlos Street Suite 700 San Jose, CA 95110, USA [hidden email] www.pdf.com Tel:  408-938-4420 Fax: 408-280-7915 ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Reply | Threaded
Open this post in threaded view
|

## Re: suggestions for nls error: false convergence

 Hi Spencer. When using 'optim' and the first try fails you could: 1) try some other methods: Nelder-Mead, BFGS, ... 2) increase the maximum number of iterations (argument maxit in the control list) 3) specify the argument parscale in the control list, in order to have all parameters of same magnitude during optimisation (this is useful if the parameters are suspected to be of different magnitudes). Using the default method (Nelder-Mead) with maxit=1000 results in convergence, and essentially the same estimates are obtained if you use the method BFGS and set maxit=1000 and parscale=c(277, 100, 101, 10) (the initial starting values): x <- 1:100 y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5, 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21, 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169, 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268, 268,270,272,272,272,273,275,275,275,276) func2 <- function( par,y, x, rescale ) { par <- rescale*par a = par m = par n = par tau = par y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) sum((y-y.)2) } est.no2 <- optim(c(277, 100, 101, 10), func2,  hessian=TRUE, y=y, x=x, rescale=1, control=list(maxit=1000)) est.no3 <- optim(c(277, 100, 101, 10), func2,  hessian=TRUE, method="BFGS", y=y, x=x, rescale=1, control=list(maxit=1000, parscale=c(277, 100, 101, 10))) The optimisation in the package 'drc' uses BFGS with the maxit and parscale arguments specified. Best wishes Christian ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Reply | Threaded
Open this post in threaded view
|

## Re: suggestions for nls error: false convergence

 Sometimes its just one parameter that's the culprit so just use a grid to get the starting value.  Since its known that in logistic growth the saturation level is a problem we conjecture that in this one m is the culprit and grid over it.  Note that with this approach we did not need to extend the allowable iterations at all so the control arg has been removed simplifying the call: > for(m in seq(50, 150, 25)) { +   model <- try(nls(y ~ a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)), +                  data=d, start = list(a = 277, m = m, n = 101, tau = 10), +                  algorithm='port')) +    if (!inherits(model, "try-error")) print(model) +    } Nonlinear regression model   model:  y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau))    data:  d            a            m            n          tau 2.757817e+02 1.594652e+03 2.312661e+05 5.617307e+00  residual sum-of-squares:  808.7282 Nonlinear regression model   model:  y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau))    data:  d            a            m            n          tau 2.757817e+02 1.594652e+03 2.312661e+05 5.617307e+00  residual sum-of-squares:  808.7282 Error in nls(y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau)), data = d,  :         Convergence failure: iteration limit reached without convergence (9) Error in nls(y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau)), data = d,  :         Convergence failure: singular convergence (7) Nonlinear regression model   model:  y ~ a * (1 + m * exp(-x/tau))/(1 + n * exp(-x/tau))    data:  d            a            m            n          tau 2.757817e+02 1.594652e+03 2.312661e+05 5.617307e+00  residual sum-of-squares:  808.7282 On 12/19/05, Christian Ritz <[hidden email]> wrote: > Hi Spencer. > > When using 'optim' and the first try fails you could: > > > 1) try some other methods: Nelder-Mead, BFGS, ... > > 2) increase the maximum number of iterations (argument maxit in the control list) > > 3) specify the argument parscale in the control list, in order to have all parameters of same magnitude during > optimisation (this is useful if the parameters are suspected to be of different magnitudes). > > > Using the default method (Nelder-Mead) with maxit=1000 results in convergence, and essentially the same estimates are > obtained if you use the method BFGS and set maxit=1000 and parscale=c(277, 100, 101, 10) (the initial starting values): > > > x <- 1:100 > > y <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, > 0,0,0,0,0,1,1,1,2,2,2,2,2,3,4,4,4,5, > 5,5,5,6,6,6,6,6,8,8,9,9,10,13,14,16,19,21, > 24,28,33,40,42,44,50,54,69,70,93,96,110,127,127,141,157,169, > 178,187,206,216,227,236,238,244,246,250,255,255,257,260,261,262,266,268, > 268,270,272,272,272,273,275,275,275,276) > > func2 <- function( par,y, x, rescale ) { > par <- rescale*par > a = par > m = par > n = par > tau = par > y. <- a * (1+m*exp(-x/tau)) / (1+n*exp(-x/tau)) > sum((y-y.)2) > } > > est.no2 <- optim(c(277, 100, 101, 10), func2,  hessian=TRUE, y=y, x=x, rescale=1, control=list(maxit=1000)) > > est.no3 <- optim(c(277, 100, 101, 10), func2,  hessian=TRUE, method="BFGS", y=y, x=x, rescale=1, > control=list(maxit=1000, parscale=c(277, 100, 101, 10))) > > > The optimisation in the package 'drc' uses BFGS with the maxit and parscale arguments specified. > > Best wishes > > Christian > > ______________________________________________ > [hidden email] mailing list > https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html> ______________________________________________ [hidden email] mailing list https://stat.ethz.ch/mailman/listinfo/r-helpPLEASE do read the posting guide! http://www.R-project.org/posting-guide.html