Hessian in box-constraint problem - concern OPTIM function

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Hessian in box-constraint problem - concern OPTIM function

Cleber N.Borges

Hello all useRs,

I am using the OPTIM function with particular interest in the method
L-BFGS-B,
because it is a box-constraint method.

I have interest in the errors estimates too.
I make:

s.e. <- sqrt( diag( solve(  optim(...,method='L-BFGS-B',
hessian=TRUE)$hessian   )))

but in help say:
"Note that this is the Hessian of the unconstrained problem even if the
 box constraints are active."

My doubts is:

How to obtain a authentic hessian for a box-constraint problem?
How I should make a interpretation of this result (concern the hessian) ?
Is possible make some transformation or so can I considerate this result
a good approximation??

I am grateful for some help!

References are welcome! :-D


Cleber Borges

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Hessian in box-constraint problem - concern OPTIM function

Spencer Graves
      This is an interesting question.

      What is the problem you are trying to solve and how do the
boundary conditions function as part of this system?

      I ask because the asymptotic theory behind your formula for 's.e.'
breaks down with parameters at boundaries.  It assumes that you are
minimizing the negative log(likelihood) AND the optimum is in the
interior of the region AND the log(likelihood) is sufficiently close to
being parabolic that a reasonable approximation for the distribution of
the maximum likelihood estimates (MLEs) has a density adequately
approximated by a second-order Taylor series expansion about the MLEs.  
In this case, transforming the parameters will not solve the problem.  
If the maximum is at a boundary and if you send the boundary to Inf with
a transformation, then a second-order Taylor series expansion of the
log(likelihood) about the MLEs will be locally flat in some
direction(s), so the hessian can not be inverted.

      These days, the experts typically approach problems like this
using Monte Carlo, often in the form of Markov Chain Monte Carlo
(MCMC).  One example of an analysis of this type of problem appears in
section 2.4 of Pinheiro and Bates (2000) Mixed-Effects Models in S and
S-Plus (Springer).

      You might get more help from this list if you provide a little
more detail, preferably including commented, minimal, self-contained,
reproducible code for a simplified version of your problem as described
in the posting guide "http://www.R-project.org/posting-guide.html".

      Hope this helps.
      Spencer Graves

Cleber Nogueira Borges wrote:

>
> Hello all useRs,
>
> I am using the OPTIM function with particular interest in the method
> L-BFGS-B,
> because it is a box-constraint method.
>
> I have interest in the errors estimates too.
> I make:
>
> s.e. <- sqrt( diag( solve(  optim(...,method='L-BFGS-B',
> hessian=TRUE)$hessian   )))
>
> but in help say:
> "Note that this is the Hessian of the unconstrained problem even if the
> box constraints are active."
>
> My doubts is:
>
> How to obtain a authentic hessian for a box-constraint problem?
> How I should make a interpretation of this result (concern the hessian) ?
> Is possible make some transformation or so can I considerate this
> result a good approximation??
>
> I am grateful for some help!
>
> References are welcome! :-D
>
>
> Cleber Borges
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Hessian in box-constraint problem - concern OPTIM function

Cleber N.Borges
Hello Mr. Graves,
Hello all useRs,

Many thanks for your attention.
>      This is an interesting question.
>      What is the problem you are trying to solve and how do the
> boundary conditions function as part of this system?
One of the functions that I need to minimized is:

sum( ( ( hat_xi - xi )^2 )*uxi^-2 + ( ( ( a+b*hat_xi ) - yi)^2 )*uyi^-2)

the context: I need to considerate errors in regressors:  x[i] ~ N( x[i] ;
ux[i]^2 )

u = 'uncertainty' is the same of std, but this 'u' is because the
metrology terminology.

And, I would like to restrict the x[i] variables in ~95% CI.

My dirty code (test) follow below

Thanks again.
Cleber

###############


n <- 7 ### TODO: any number???

xi <- c(1:n) ; uxi <- round( abs( rnorm( n,0,1e-1)),6)
yi <- round(xi + uxi + rnorm(n,0,.9),6) ; uyi <- round(abs(rnorm(
n,0,1e-1)),6)

naive <- lm( yi ~ xi )
# p: parameters
p <- 2
plot( xi,yi )
abline( naive )


fobjetiva <- function( optPar=c(xi,a,b) , xi, uxi, yi, uyi )
{
    n <- length( xi )
    hat_xi <- optPar[1:n] ; a <- optPar[n+1] ; b <- optPar[n+2]
        sum( ( (hat_xi - xi)^2 )*uxi^-2 + ( ( (a+b*hat_xi) - yi)^2
)*uyi^-2 )
}

## testing
fobjetiva(c(xi, coef(naive)), xi,uxi,yi,uyi)

### box-constraints
seCoef <- sqrt(diag(vcov( naive )))
Linf <- as.numeric(c( xi-2*uxi, coef(naive)-5000*seCoef ))
Lsup <- as.numeric(c( xi+2*uxi, coef(naive)+5000*seCoef ))



metodo <-  4   #   1,2,3,4 ou 5

all_iterOptim <- capture.output(
  reportOptim <- optim(
    par=c(xi,coef(naive)),
    fn=fobjetiva,
    xi=xi,
    uxi=uxi,
    yi=yi,
    uyi=uyi,
    method=c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN")[metodo],
    lower=Linf,
    upper=Lsup,
    hessian=TRUE,
    control=list(
      REPORT=1,
      maxit=1e4,
      trace=6
    )
  )
)

### get all steps
all_steps <- grep("^X", all_iterOptim )
all_steps <- all_iterOptim[ all_steps ]
steps <- length(all_steps)
steps

matrix_steps <- matrix(0,nr=steps, nc=(n+p) )
for( i in 1:steps )
  {
  matrix_steps[i,] <- as.numeric(unlist(strsplit(all_steps[i], "
"))[3:(2+n+p)])
  }
windows(restore=T)

### view a animation of this otimization

for( i in 1:steps )
  {
  x <- matrix_steps[i,1:n]
  a <- matrix_steps[i,n+p-1]
  b <- matrix_steps[i,n+p]
  y <- a+b*x
  plot( xi, yi, pch=19, main=paste("Passo",i,"de",steps,sep=" "), cex=2 )
  abline(a,b, col='blue', lwd=3)
  abline(naive, lwd=2, lty=2, col='red')
  points( x, y, col='red', pch=19, cex=2 )
  segments(xi,yi,x,y, lwd=2)
  Sys.sleep(.11)
  ###file = paste("ISO_",(i+100),".png", sep="")
  ###savePlot(filename=file, type ="png", device=dev.cur() )
  }

###comando = "convert -dispose previous -adjoin -delay 35  ISO_*.png
-loop 0  ISO_animator.gif"
###shell(comando)
###unlink("ISO_*.png")


## view trajectory

windows(restore=T)
par( mfrow=c(4,2))
for( i in 1:7 ){
restri <- xi[i]+uxi[i]*c(-2,2)
interv <- range(matrix_steps[,i], restri )
plot(1:steps, matrix_steps[,i], t='l', xlab="Passos", ylab="x1",
ylim=interv, lwd=2, las=2 )
abline( h=restri, col='red', lwd=2)
titulo=c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN")[metodo]
title(titulo)
}



>      I ask because the asymptotic theory behind your formula for
> 's.e.' breaks down with parameters at boundaries.  It assumes that you
> are minimizing the negative log(likelihood) AND the optimum is in the
> interior of the region AND the log(likelihood) is sufficiently close
> to being parabolic that a reasonable approximation for the
> distribution of the maximum likelihood estimates (MLEs) has a density
> adequately approximated by a second-order Taylor series expansion
> about the MLEs.  In this case, transforming the parameters will not
> solve the problem.  If the maximum is at a boundary and if you send
> the boundary to Inf with a transformation, then a second-order Taylor
> series expansion of the log(likelihood) about the MLEs will be locally
> flat in some direction(s), so the hessian can not be inverted.
>      These days, the experts typically approach problems like this
> using Monte Carlo, often in the form of Markov Chain Monte Carlo
> (MCMC).  One example of an analysis of this type of problem appears in
> section 2.4 of Pinheiro and Bates (2000) Mixed-Effects Models in S and
> S-Plus (Springer).
>    
>      Hope this helps.      Spencer Graves
>
> Cleber Nogueira Borges wrote:
>>
>> Hello all useRs,
>>
>> I am using the OPTIM function with particular interest in the method
>> L-BFGS-B,
>> because it is a box-constraint method.
>> I have interest in the errors estimates too.
>> I make:
>> s.e. <- sqrt( diag( solve(  optim(...,method='L-BFGS-B',
>> hessian=TRUE)$hessian   )))
>> but in help say:
>> "Note that this is the Hessian of the unconstrained problem even if the
>> box constraints are active."
>> My doubts is:
>> How to obtain a authentic hessian for a box-constraint problem?
>> How I should make a interpretation of this result (concern the
>> hessian) ?
>> Is possible make some transformation or so can I considerate this
>> result a good approximation??
>> I am grateful for some help!
>> References are welcome! :-D
>> Cleber Borges

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Reply | Threaded
Open this post in threaded view
|

Re: Hessian in box-constraint problem - concern OPTIM function

Spencer Graves
      The description of your problem is too complex for me to analyze
in the time I have available for this.  Could you develop a very brief,
simple example that illustrates your question?  A very brief Monte
Carlo, estimating 2 parameters in a very simple formula with one of the
parameters at a boundary would be great.  If you do this, it may help
you answer the question for yourself, as suggested in the famous book by
Polya on "How to Solve It"
(http://en.wikipedia.org/wiki/How_to_Solve_It).  If you develop such an
example but still can't solve it yourself, please send it to this list
(see the posting guide http://www.R-project.org/posting-guide.html).

      Hope this helps.
      Spencer

Cleber Nogueira Borges wrote:

> Hello Mr. Graves,
> Hello all useRs,
>
> Many thanks for your attention.
>>      This is an interesting question.
>>      What is the problem you are trying to solve and how do the
>> boundary conditions function as part of this system?
> One of the functions that I need to minimized is:
>
> sum( ( ( hat_xi - xi )^2 )*uxi^-2 + ( ( ( a+b*hat_xi ) - yi)^2 )*uyi^-2)
>
> the context: I need to considerate errors in regressors:  x[i] ~ N(
> x[i] ;
> ux[i]^2 )
>
> u = 'uncertainty' is the same of std, but this 'u' is because the
> metrology terminology.
>
> And, I would like to restrict the x[i] variables in ~95% CI.
>
> My dirty code (test) follow below
>
> Thanks again.
> Cleber
>
> ###############
>
>
> n <- 7 ### TODO: any number???
>
> xi <- c(1:n) ; uxi <- round( abs( rnorm( n,0,1e-1)),6)
> yi <- round(xi + uxi + rnorm(n,0,.9),6) ; uyi <- round(abs(rnorm(
> n,0,1e-1)),6)
>
> naive <- lm( yi ~ xi )
> # p: parameters
> p <- 2
> plot( xi,yi )
> abline( naive )
>
>
> fobjetiva <- function( optPar=c(xi,a,b) , xi, uxi, yi, uyi )
> {
>    n <- length( xi )
>    hat_xi <- optPar[1:n] ; a <- optPar[n+1] ; b <- optPar[n+2]
>        sum( ( (hat_xi - xi)^2 )*uxi^-2 + ( ( (a+b*hat_xi) - yi)^2
> )*uyi^-2 )
> }
>
> ## testing
> fobjetiva(c(xi, coef(naive)), xi,uxi,yi,uyi)
>
> ### box-constraints
> seCoef <- sqrt(diag(vcov( naive )))
> Linf <- as.numeric(c( xi-2*uxi, coef(naive)-5000*seCoef ))
> Lsup <- as.numeric(c( xi+2*uxi, coef(naive)+5000*seCoef ))
>
>
>
> metodo <-  4   #   1,2,3,4 ou 5
>
> all_iterOptim <- capture.output(
>  reportOptim <- optim(
>    par=c(xi,coef(naive)),
>    fn=fobjetiva,
>    xi=xi,
>    uxi=uxi,
>    yi=yi,
>    uyi=uyi,
>    method=c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN")[metodo],
>    lower=Linf,
>    upper=Lsup,
>    hessian=TRUE,
>    control=list(
>      REPORT=1,
>      maxit=1e4,
>      trace=6
>    )
>  )
> )
>
> ### get all steps
> all_steps <- grep("^X", all_iterOptim )
> all_steps <- all_iterOptim[ all_steps ]
> steps <- length(all_steps)
> steps
>
> matrix_steps <- matrix(0,nr=steps, nc=(n+p) )
> for( i in 1:steps )
>  {
>  matrix_steps[i,] <- as.numeric(unlist(strsplit(all_steps[i], "
> "))[3:(2+n+p)])
>  }
> windows(restore=T)
>
> ### view a animation of this otimization
>
> for( i in 1:steps )
>  {
>  x <- matrix_steps[i,1:n]
>  a <- matrix_steps[i,n+p-1]
>  b <- matrix_steps[i,n+p]
>  y <- a+b*x
>  plot( xi, yi, pch=19, main=paste("Passo",i,"de",steps,sep=" "), cex=2 )
>  abline(a,b, col='blue', lwd=3)
>  abline(naive, lwd=2, lty=2, col='red')
>  points( x, y, col='red', pch=19, cex=2 )
>  segments(xi,yi,x,y, lwd=2)
>  Sys.sleep(.11)
>  ###file = paste("ISO_",(i+100),".png", sep="")
>  ###savePlot(filename=file, type ="png", device=dev.cur() )
>  }
>
> ###comando = "convert -dispose previous -adjoin -delay 35  ISO_*.png
> -loop 0  ISO_animator.gif"
> ###shell(comando)
> ###unlink("ISO_*.png")
>
>
> ## view trajectory
>
> windows(restore=T)
> par( mfrow=c(4,2))
> for( i in 1:7 ){
> restri <- xi[i]+uxi[i]*c(-2,2)
> interv <- range(matrix_steps[,i], restri )
> plot(1:steps, matrix_steps[,i], t='l', xlab="Passos", ylab="x1",
> ylim=interv, lwd=2, las=2 )
> abline( h=restri, col='red', lwd=2)
> titulo=c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN")[metodo]
> title(titulo)
> }
>
>
>
>>      I ask because the asymptotic theory behind your formula for
>> 's.e.' breaks down with parameters at boundaries.  It assumes that
>> you are minimizing the negative log(likelihood) AND the optimum is in
>> the interior of the region AND the log(likelihood) is sufficiently
>> close to being parabolic that a reasonable approximation for the
>> distribution of the maximum likelihood estimates (MLEs) has a density
>> adequately approximated by a second-order Taylor series expansion
>> about the MLEs.  In this case, transforming the parameters will not
>> solve the problem.  If the maximum is at a boundary and if you send
>> the boundary to Inf with a transformation, then a second-order Taylor
>> series expansion of the log(likelihood) about the MLEs will be
>> locally flat in some direction(s), so the hessian can not be inverted.
>>      These days, the experts typically approach problems like this
>> using Monte Carlo, often in the form of Markov Chain Monte Carlo
>> (MCMC).  One example of an analysis of this type of problem appears
>> in section 2.4 of Pinheiro and Bates (2000) Mixed-Effects Models in S
>> and S-Plus (Springer).
>>          Hope this helps.      Spencer Graves
>>
>> Cleber Nogueira Borges wrote:
>>>
>>> Hello all useRs,
>>>
>>> I am using the OPTIM function with particular interest in the method
>>> L-BFGS-B,
>>> because it is a box-constraint method.
>>> I have interest in the errors estimates too.
>>> I make:
>>> s.e. <- sqrt( diag( solve(  optim(...,method='L-BFGS-B',
>>> hessian=TRUE)$hessian   )))
>>> but in help say:
>>> "Note that this is the Hessian of the unconstrained problem even if the
>>> box constraints are active."
>>> My doubts is:
>>> How to obtain a authentic hessian for a box-constraint problem?
>>> How I should make a interpretation of this result (concern the
>>> hessian) ?
>>> Is possible make some transformation or so can I considerate this
>>> result a good approximation??
>>> I am grateful for some help!
>>> References are welcome! :-D
>>> Cleber Borges
>
> ______________________________________________
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.