

I am trying to deal with a maximisation problem in which it is possible
for the objective function to (quite legitimately) return the value
-Inf, which causes the numerical optimisers that I have tried to fall over.

The -Inf values arise from expressions of the form "a * log(b)", with b
= 0. Under the *starting* values of the parameters, a must equal
0 whenever b = 0, so we can legitimately say that a * log(b) = 0 in
these circumstances. However as the maximisation algorithm searches
over parameters it is possible for b to take the value 0 for values of
a that are strictly positive. (The values of "a" do not change during
this search, although they *do* change between "successive searches".)

Clearly if one is *maximising* the objective then -Inf is not a value of
particular interest, and we should be able to "move away". But the
optimising function just stops.

It is also clear that "moving away" is not a simple task; you can't
estimate a gradient or Hessian at a point where the function value is -Inf.

Can anyone suggest a way out of this dilemma, perhaps an optimiser that
is equipped to cope with -Inf values in some sneaky way?

Various ad hoc kludges spring to mind, but they all seem to be fraught
with peril.

I have tried changing the value returned by the objective function from
"v" to exp(v), which maps -Inf to 0, which is nice and finite.
However this seemed to flatten out the objective surface too much, and
the search stalled at the 0 value, which is the antithesis of optimal.

The problem arises in the context of applying the EM algorithm, where the
M-step cannot be carried out explicitly, whence numerical optimisation.
I can give more detail if anyone thinks that it could be relevant.
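In the interim, here is a stripped-down sketch of the failure mode (purely
illustrative; the weights and starting value are made up, and this is not
my real objective):

```r
## Toy version of the problem: a softmax-style b that can underflow to 0.
a <- c(2, 1)                             # fixed "a" values (made up)
negobj <- function(z) {
  b <- exp(c(z, 0)) / sum(exp(c(z, 0)))  # b sums to 1, b >= 0
  -sum(a * log(b))                       # minimise the negated objective
}
negobj(0)     # finite, no problem
negobj(-800)  # Inf: exp(-800) underflows to 0, so log(b[1]) is -Inf
```

Any optimiser that wanders out to large-negative z hits the second case.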
I would appreciate advice from younger and wiser heads! :)
cheers,
Rolf Turner

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Have you tried reparameterizing, using logb (=log(b)) instead of b?
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner < [hidden email]> wrote:
> [...]


Rolf, What optimizers did you try? There are a few in the optimrx package on R-forge that handle bounds, and it may be
useful to set bounds in this case. Transformations using log or exp can be helpful if done carefully, but as you note,
they can make the function more difficult to optimize.

Be cautious about using the default numerical gradient approximations. optimrx allows selection of the numDeriv grad()
function, which is quite good. Complex step would be better, but you need a function which can be computed with complex
arguments. Unfortunately, numerical gradients often step over the cliff edge of computability of the function. The
bounds are not checked for the step used to compute things like (f(x + h) - f(x))/h.
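A tiny illustration (function, point, and step size all made up) of a
difference quotient stepping outside the domain:

```r
## f is only defined (real-valued) for x > 0.
f <- function(x) log(x)
x <- 1e-9                                # feasible, but near the boundary
h <- 1e-7                                # a plausible finite-difference step
(f(x + h) - f(x)) / h                    # forward step: fine, x + h > 0
suppressWarnings((f(x) - f(x - h)) / h)  # backward step: NaN, x - h < 0
```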
Cheers, JN
On 16-11-06 07:07 PM, William Dunlap via R-help wrote:
> [...]


On 07/11/16 13:07, William Dunlap wrote:
> Have you tried reparameterizing, using logb (=log(b)) instead of b?
Uh, no. I don't think that that makes any sense in my context.

The "b" values are probabilities and must satisfy a "sum-to-1"
constraint. To accommodate this constraint I reparametrise via a
"logistic" style parametrisation, basically

b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n

with the parameters that the optimiser works with being z_1, ...,
z_{n-1} (and with z_n == 0 for identifiability). The objective function
is of the form sum_i(a_i * log(b_i)), so I transform back
from the z_i to the b_i in order to calculate the value of the objective
function. But when the z_i get moderately large-negative, the b_i
become numerically 0 and then log(b_i) becomes -Inf. And the optimiser
falls over.
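A sketch of what goes wrong numerically (the a_i here are invented for
illustration):

```r
z <- c(-800, 0)             # z_n == 0 for identifiability
b <- exp(z) / sum(exp(z))   # the logistic-style back-transformation
b                           # 0 1 : b_1 has underflowed to exactly 0
a <- c(0.3, 0.7)            # hypothetical a_i
sum(a * log(b))             # -Inf, and the optimiser falls over
```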
cheers,
Rolf

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


On 07/11/16 14:14, ProfJCNash wrote:
> Rolf, What optimizers did you try? There are a few in the optimrx package on R-forge that handle bounds, and it may be
> useful to set bounds in this case. Transformations using log or exp can be helpful if done carefully, but as you note,
> they can make the function more difficult to optimize.
I can't see how to impose bounds in my circumstances. As you will have
seen from my previous answer to Bill Dunlap's post, the b_i are
probabilities, parametrised in terms of the z_i:

b_i = exp(z_i)/[sum_j exp(z_j)] .

It is not at all clear to me how to impose constraints on the z_i that
will bound the b_i away from 0. I can constrain L <= z_i <= U for all i
and get b_i >= exp(L)/[n*exp(U)] (I think; I may have things upside-down
and backwards), but this leaves an infinitude of choices for L and U.

Also, the starting values at each M-step are "naturally" given in terms
of the b_i. I.e. I can calculate "reasonable" values for the b_i and
then transform these to provide starting values for the z_i. The
starting values for the z_i might not satisfy a given set of constraints.
I guess I could simply truncate the starting values to fall within the
constraints, but that "feels wrong" to me.

I also worry about the impact that imposing constraints would have on
the monotonicity of the successive values of the expected log likelihood
in the EM algorithm context.
> Be cautious about using the default numerical gradient approximations. optimrx allows selection of the numDeriv grad()
> function, which is quite good. Complex step would be better, but you need a function which can be computed with complex
> arguments. Unfortunately, numerical gradients often step over the cliff edge of computability of the function. The
> bounds are not checked for the step used to compute things like (f(x + h) - f(x))/h.
I think I can program up an analytic gradient function. Maybe I'll try
that. I have been reluctant to do so because I have had peculiar (bad)
experiences in the past in trying to use analytic gradients with nlm().
Of course the (analytic) gradient becomes undefined if one of the b_i is 0.
cheers,
Rolf

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


On Mon, 7 Nov 2016, Rolf Turner wrote:

> On 07/11/16 13:07, William Dunlap wrote:
>> Have you tried reparameterizing, using logb (=log(b)) instead of b?
>
> Uh, no. I don't think that that makes any sense in my context.
>
> The "b" values are probabilities and must satisfy a "sum-to-1" constraint.
> To accommodate this constraint I reparametrise via a "logistic" style
> parametrisation, basically
>
> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
>
> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
> (and with z_n == 0 for identifiability). The objective function is of the
> form sum_i(a_i * log(b_i)),
This is sum_i(a_i * z_i) - sum(a_i)*log(sum_j(exp(z_j))), isn't it?

So you don't need to evaluate b_i here, do you?

Large values of z_j will lead to exp(z_j) == Inf, but using

sum_i(a_i * (z_i - max.z)) - sum(a_i)*log(sum_j(exp(z_j - max.z)))

will handle that.
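In R that rewrite might look like the following sketch (the function
names and test values here are mine, not from Rolf's code):

```r
## Objective written directly in terms of z, with the max subtracted out.
stable_obj <- function(z, a) {
  m <- max(z)
  sum(a * (z - m)) - sum(a) * log(sum(exp(z - m)))
}
## The original formulation, which evaluates the b_i explicitly.
naive_obj <- function(z, a) {
  b <- exp(z) / sum(exp(z))
  sum(a * log(b))
}
a <- c(0.3, 0.7)
all.equal(stable_obj(c(-5, 0), a), naive_obj(c(-5, 0), a))  # TRUE
naive_obj(c(-800, 0), a)   # -Inf: b_1 underflows to 0
stable_obj(c(-800, 0), a)  # -240: finite, so the optimiser can move away
```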
HTH,
Chuck
p.s. Regarding "advice from younger and wiser heads", I probably cannot
claim to be either.


On 07/11/16 15:46, Charles C. Berry wrote:
> On Mon, 7 Nov 2016, Rolf Turner wrote:
>
>> [...]
>
> This is sum_i(a_i * z_i) - sum(a_i)*log(sum_j(exp(z_j))), isn't it?
>
> So you don't need to evaluate b_i here, do you?
>
> Large values of z_j will lead to exp(z_j) == Inf, but using
>
> sum_i(a_i * (z_i - max.z)) - sum(a_i)*log(sum_j(exp(z_j - max.z)))
>
> will handle that.
Wow!!! That looks like it will work!!! I won't completely believe it
until I've programmed it up and tried it, but for the first time in
days I'm feeling hopeful.
>
> HTH,
>
> Chuck
>
> p.s. Regarding "advice from younger and wiser heads", I probably cannot
> claim to be either.
On present evidence you certainly appear to be one hell of a lot wiser!!!
Thanks.
cheers,
Rolf

Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, avoiding
unnecessary under- and overflow.
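At the R level the same idea for two terms can be sketched as follows
(the helper name here is invented, not part of base R):

```r
## log(exp(lx) + exp(ly)) computed without forming exp() of huge/tiny args.
logspace_add <- function(lx, ly) {
  m <- pmax(lx, ly)
  m + log1p(exp(-abs(lx - ly)))  # log1p keeps precision when the sum is ~1
}
logspace_add(0, -50)                    # ~1.93e-22 above 0, not rounded away
exp(logspace_add(log(0.25), log(0.5)))  # 0.75
```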
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner < [hidden email]> wrote:
> [...]


>>>>> William Dunlap via R-help < [hidden email]>
>>>>>     on Sun, 6 Nov 2016 20:53:17 -0800 writes:

> Perhaps the C function Rf_logspace_sum(double *x, int n) would help in
> computing log(b). It computes log(sum(exp(x_i))) for i in 1..n, avoiding
> unnecessary under- and overflow.
Indeed!

I had thought more than twice about also exporting it to the R level,
notably as we have been using two R-level versions in a package
I maintain ('copula'). They are vectorized there in a way that
seemed particularly useful to our (Marius Hofert's and my) use cases.

More on this (making these available in R, how exactly?) should
probably move to the R-devel list.

Thank you Bill for bringing it up!
Martin
> [...]


>
> I am trying to deal with a maximisation problem in which it is possible
> for the objective function to (quite legitimately) return the value
> -Inf,
(Just to add to the pedantic part of the discussion, by those of us that
do not qualify as younger and wiser:)

Setting log(0) to -Inf is often convenient, but really I think the log
function is undefined at zero, so I would not refer to this as "legitimate".

> which causes the numerical optimisers that I have tried to fall over.

In theory as well as practice. You need to have a function that is
defined on the whole domain.
>
> The -Inf values arise from expressions of the form "a * log(b)", with b
> = 0. Under the *starting* values of the parameters, a must equal
> 0 whenever b = 0, so we can legitimately say that a * log(b) = 0 in
This also is undefined and not "legitimate". I think there is no reason
it should equal zero. We tend to want to set it to the value we think
of as the "limit": for a = 0 the limit as b goes to zero would be zero,
but the limit of a*(-Inf) is -Inf as a goes to zero.

So, you really do need to avoid zero because your function is not
defined there, or find a redefinition that works properly at zero. I
think you have a solution from another post.
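For what it's worth, R's arithmetic takes the same view, since the IEEE
rules make 0 * (-Inf) undefined:

```r
0 * log(0)  # NaN, not 0: the "limit" convention is not applied
```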
Paul


It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
Rf_logspace_sub were available as R functions. (I wish the '_sub' were
'_subtract' because 'sub' means too many things in R.)
I think Rf_logspace_sum in R could be a little better. E.g., using the C
code

#include <R.h>
#include <Rinternals.h>
#include <Rmath.h>

SEXP Call_logspace_sum(SEXP x)
{
    if (TYPEOF(x) != REALSXP) {
        Rf_error("'x' must be a numeric vector");
    }
    return ScalarReal(Rf_logspace_sum(REAL(x), length(x)));
}

and the R functions

logspace_sum <- function (x) .Call("Call_logspace_sum", as.numeric(x))

and

test <- function (x) {
    x <- as.numeric(x)
    rbind(Rmpfr           = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits = 5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax   = log(sum(exp(x - max(x)))) + max(x),
          naive           = log(sum(exp(x))))
}
R3.3.2 on Linux gives, after options(digits=17)
> test(c(0, 50))
[,1]
Rmpfr 1.9287498479639178e22
Rf_logspace_sum 1.9287498479639178e22
subtract_xmax 0.0000000000000000e+00
naive 0.0000000000000000e+00
which is nice, but also the not-so-nice
> test(c(0, -50, -50))
                                  [,1]
Rmpfr           3.8574996959278356e-22
Rf_logspace_sum 0.0000000000000000e+00
subtract_xmax   0.0000000000000000e+00
naive           0.0000000000000000e+00
With TERR the second test has Rmpfr==Rf_logspace_sum for that example.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Nov 7, 2016 at 3:08 AM, Martin Maechler <[hidden email]> wrote:
> >>>>> William Dunlap via R-help <[hidden email]>
> >>>>> on Sun, 6 Nov 2016 20:53:17 -0800 writes:
>
> > Perhaps the C function Rf_logspace_sum(double *x, int n) would help
> > in computing log(b). It computes log(sum(exp(x_i))) for i in 1..n,
> > avoiding unnecessary under- and overflow.
>
> Indeed!
>
> I had thought more than twice about also exporting it to the R level,
> notably as we have been using two R-level versions in a package
> I maintain ('copula'). They are vectorized there in a way that
> seemed particularly useful to our (Marius Hofert's and my) use cases.
>
> More on this -- making these available in R, how exactly? --
> probably should move to the R-devel list.
>
> Thank you Bill for bringing it up!
> Martin
>
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
>
> > On Sun, Nov 6, 2016 at 5:25 PM, Rolf Turner < [hidden email]>
> wrote:
>
> >> On 07/11/16 13:07, William Dunlap wrote:
> >>
> >>> Have you tried reparameterizing, using logb (= log(b)) instead of b?
> >>>
> >>
> >> Uh, no. I don't think that that makes any sense in my context.
> >>
> >> The "b" values are probabilities and must satisfy a "sum-to-1"
> >> constraint. To accommodate this constraint I reparametrise via a
> >> "logistic" style parametrisation -- basically
> >>
> >> b_i = exp(z_i)/[sum_j exp(z_j)], j = 1, ..., n
> >>
> >> with the parameters that the optimiser works with being z_1, ..., z_{n-1}
> >> (and with z_n == 0 for identifiability). The objective function is of
> >> the form sum_i(a_i * log(b_i)), so I transform back from the z_i to the
> >> b_i in order to calculate the value of the objective function. But when
> >> the z_i get moderately large and negative, the b_i become numerically 0
> >> and then log(b_i) becomes -Inf. And the optimiser falls over.
> >>
> >> cheers,
> >>
> >> Rolf
> >>
> >>
> >>> Bill Dunlap
> >>> TIBCO Software
> >>> wdunlap tibco.com
> >>>
> >>> On Sun, Nov 6, 2016 at 1:17 PM, Rolf Turner <[hidden email]> wrote:
> >>>
> >>> I am trying to deal with a maximisation problem in which it is
> >>> possible for the objective function to (quite legitimately) return
> >>> the value -Inf, which causes the numerical optimisers that I have
> >>> tried to fall over.
> >>


Using log1p instead of log improves the accuracy of the 'subtract_xmax'
algorithm when the largest x is very close to zero. Perhaps that is what
is missing in Rf_logspace_add.
test <- function (x) {
    x <- as.numeric(x)
    i <- which.max(x)
    rbind(Rmpfr = as.numeric(log(sum(exp(Rmpfr::mpfr(x, precBits=5000))))),
          Rf_logspace_sum = logspace_sum(x),
          subtract_xmax_log1p = log1p(sum(exp(x[-i] - x[i]))) + x[i],
          subtract_xmax_naive = log(sum(exp(x - x[i]))) + x[i],
          naive = log(sum(exp(x))))
}
> test(c(-1e-50, -46, -47))
                                     [,1]
Rmpfr                1.4404614986241e-20
Rf_logspace_sum     -1.0000000000000e-50
subtract_xmax_log1p  1.4404614986241e-20
subtract_xmax_naive -1.0000000000000e-50
naive                0.0000000000000e+00
> test(c(0, -46, -47))
                                     [,1]
Rmpfr                1.4404614986241e-20
Rf_logspace_sum      0.0000000000000e+00
subtract_xmax_log1p  1.4404614986241e-20
subtract_xmax_naive  0.0000000000000e+00
naive                0.0000000000000e+00
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Nov 7, 2016 at 11:09 AM, William Dunlap < [hidden email]> wrote:
> It would be nice if the C functions Rf_logspace_sum, Rf_logspace_add, and
> Rf_logspace_sub were available as R functions. (I wish the '_sub' were
> '_subtract' because 'sub' means too many things in R.)

