

I am trying to familiarize myself with optim() on a relatively simple
maximization.
Description:
L and K are two terms which are constrained to add up to a total of 100000
(with respective weights on each). To encode this constraint I substituted K
into the function (to keep this as simple as possible).
Together these two feed into one nonlinear function, the product of
two functions that are monotonic on the positive interval. That number is
then returned by a function fed to optim(), which should maximize the output
by adjusting L. The whole code is:
production1 <- function(L){
  budget = 100000
  Lcost = 12
  Kcost = 15
  K = (budget - L*Lcost)/Kcost
  machines = 0.05*L^(2/3)*K^(1/3)
  return(machines)
}
# production1(6000) # example of an input with much higher output than the optim result
S1 = optim(1001, production1, method = "CG", control = list(fnscale = -1))
S1
Output:
$par
[1] 1006.536
$value
[1] 90.54671
$counts
function gradient
201 101
$convergence
[1] 1
$message
NULL
For some reason this never explores the problem space and just spits out
some answer close to the initial condition. What am I doing wrong?
Thanks,
Skyler S.
______________________________________________
[hidden email] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
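The commented-out check above is the key clue: an input far from the returned $par scores much better. A quick sketch of that comparison (restating production1 so the snippet is self-contained):

```r
production1 <- function(L) {
  budget <- 100000; Lcost <- 12; Kcost <- 15
  K <- (budget - L * Lcost) / Kcost  # whatever budget L doesn't use is spent on K
  0.05 * L^(2/3) * K^(1/3)
}

production1(6000)     # about 203.3
production1(1006.536) # about 90.5, the value CG stopped at
```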


The help file points out that CG is "fragile" ... and I would expect that failing to define a gradient function will exacerbate that.
I think you should use a different algorithm or specify a gradient function. You might also consider working with the more recent optimr package contributed by Dr Nash, author of the original optim function in R.
On March 12, 2020 2:30:26 AM PDT, Skyler Saleebyan < [hidden email]> wrote:

Sent from my phone. Please excuse my brevity.


It is possible to work out this problem explicitly. Playing with a few
different calls to optim shows that method="L-BFGS-B" gives the correct
answer.
I don't have particular insight into why method="CG" is problematic.
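For reference, a sketch of such a call (the box bounds are my addition, chosen only to keep K positive; they are not from the original post):

```r
production1 <- function(L) {
  budget <- 100000; Lcost <- 12; Kcost <- 15
  K <- (budget - L * Lcost) / Kcost
  0.05 * L^(2/3) * K^(1/3)
}

# L-BFGS-B with bounds keeping L (and hence K) strictly positive:
# K > 0 requires L < budget/Lcost = 8333.3
S2 <- optim(1001, production1, method = "L-BFGS-B",
            lower = 1, upper = 8000,
            control = list(fnscale = -1))
S2$par  # near 5555.6, consistent with the closed-form optimum
```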
On Thu, Mar 12, 2020 at 4:12 PM Jeff Newmiller <[hidden email]> wrote:


It looks like a bug in the CG method. The other methods in optim() all
work fine. CG is documented to be a good choice in high dimensions; why
did you choose it for a 1-dimensional problem?
Duncan Murdoch
On 12/03/2020 2:30 a.m., Skyler Saleebyan wrote:


As author of CG (at least the code that was used to build it), I can say I
was never happy with that code. Rcgmin is the replacement I wrote, and I
believe that could still be improved.
BUT:
- you have a 1D optimization. Use the Brent method and supply bounds.
- I never intended CG (or BFGS or Nelder-Mead or ...) to work for 1D
problems.
- as Jeff points out, you need the gradient. I stop Rcgmin and Rvmmin if
the user hasn't supplied one, as numerical approximations need to be very
good for these gradient methods.
JN
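A minimal sketch of the Brent suggestion (the bounds are illustrative; any finite bracket that contains the optimum and keeps K positive will do):

```r
production1 <- function(L) {
  budget <- 100000; Lcost <- 12; Kcost <- 15
  K <- (budget - L * Lcost) / Kcost
  0.05 * L^(2/3) * K^(1/3)
}

# method = "Brent" requires finite lower/upper in 1-D
S3 <- optim(1001, production1, method = "Brent",
            lower = 1, upper = 8000, control = list(fnscale = -1))
S3$par  # near 5555.6

# or, equivalently, base R's one-dimensional optimizer directly:
optimize(production1, c(1, 8000), maximum = TRUE)$maximum
```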
On 2020-03-12 10:03 a.m., Jeff Newmiller wrote:


Thanks for the replies. Since I was seeing this glitch with CG in both my
1D and 2D formulations of the problem, I was trying to figure out what was
going on that led to the failure. I'll switch to a more suitable method and
keep these considerations in mind.
On Thu, Mar 12, 2020, 9:23 AM J C Nash < [hidden email]> wrote:


I'm sorry, Duncan.
But I disagree.
This is not a "bug" in the optim function, as such.
(Or at least, there's nothing in this discussion to suggest that there is one.)
Rather, it is a floating-point arithmetic problem.
The OP's function looks simple enough, at first glance.
But it's not.
Plotting a numerical approximation of the derivative makes the
problem more apparent:

plot_derivative <- function (f, a = sol - offset, b = sol + offset,
    sol, offset=0.001, N=200)
{   FIRST <- 1:(N - 2)
    LAST <- 3:N
    MP <- 2:(N - 1)

    x <- seq (a, b, length.out=N)
    y <- f (x)
    dy <- (y [LAST] - y [FIRST]) / (x [LAST] - x [FIRST])

    plot (x [MP], dy, type="l", xlab="x", ylab="dy/dx (approx)")
}

optim.sol <- optim (1001, production1, method="CG",
    control = list (fnscale=-1) )$par
plot_derivative (production1, sol=optim.sol)
abline (v=optim.sol, lty=2, col="grey")

So, I would say the optim function (including the CG method) is doing
what it's supposed to do.
And collating/expanding on Nash's, Jeff's and Eric's comments:
(1) An exact solution can be derived quickly, so using a numerical
method is unnecessary, and inefficient.
(2) Possible problems with the CG method are noted in the documentation.
(3) Numerical approximations of the function's derivative need to be
well-behaved for gradient-based numerical methods to work properly.
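For what it's worth, point (1) can be sketched: with exponents 2/3 and 1/3 this is a Cobb-Douglas objective, and setting the derivative to zero gives the textbook result that the budget is split in proportion to the exponents (my derivation, worth double-checking):

```r
budget <- 100000; Lcost <- 12; Kcost <- 15

# Cobb-Douglas shares: spend 2/3 of the budget on L, 1/3 on K
L.star <- (2/3) * budget / Lcost  # 5555.556
K.star <- (1/3) * budget / Kcost  # 2222.222

0.05 * L.star^(2/3) * K.star^(1/3)  # about 204.7 machines
```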
On Fri, Mar 13, 2020 at 3:42 AM Duncan Murdoch < [hidden email]> wrote:


> (1) An exact solution can be derived quickly
Please disregard note (1) above.
I'm not sure if it was right.
And one more comment:
The conjugate gradient method is an established method.
So the question is: is the optim function applying this method or not?
Assuming that it is, then R is definitely doing what it should be doing.
If not, then I guess it would be a bug...


Hi Abby: Either way, thanks for your efforts with the derivative plot.
Note that John Nash is a SERIOUS EXPERT in optimization, so I would just
go by what he said earlier. Also, I don't want to speak for Duncan, but I
have a feeling that he meant an "inadequacy" in the CG method rather than
a bug in the R code.
Mark
On Thu, Mar 12, 2020 at 5:55 PM Abby Spurdle < [hidden email]> wrote:


On 12/03/2020 1:22 p.m., Abby Spurdle wrote:
> Plotting a numerical approximation of the derivative makes the
> problem more apparent:
There is nothing in that plot to indicate that the result given by
optim() should be accepted as optimal. The numerical approximation to
the derivative is 0.055851 everywhere in your graph, with numerical
errors out in the 8th decimal place or later. Clearly the max occurs
somewhere to the right of that. Yes, the 2nd derivative calculation
will be terrible if R chooses a step size of 0.00001 when calculating
it, but why would it do that, given that the 1st derivative is 3 orders
of magnitude larger?


That wasn't how I intended the plot to be interpreted.
By default, the step size (in x) is 1e-5, which seems like a moderate step size.
However, at that level, the numerical approximation is very badly behaved.
And if the step size is decreased, things get worse.
I haven't checked all the technical details of the optim function.
But any reliance on numerical approximations of the derivative has a
high chance of running into problems with a function like this.


On 12/03/2020 7:25 p.m., Abby Spurdle wrote:
>> There is nothing in that plot to indicate that the result given by
>> optim() should be accepted as optimal. The numerical approximation to
>> the derivative is 0.055851 everywhere in your graph
>
> That wasn't how I intended the plot to be interpreted.
> By default, the step size (in x) is 1e-5, which seems like a moderate step size.
optim() uses a much larger one. You can see it if you run this code
after yours:

production2 <- function(L){
  abline(v=L); cat("L=", L, "\n") # add this in to see the L values
  budget = 100000
  Lcost = 12
  Kcost = 15
  K = (budget - L*Lcost)/Kcost
  machines = 0.05*L^(2/3)*K^(1/3)
  return(machines)
}

optim.sol <- optim(1001, production2, method = "CG",
                   control = list(fnscale = -1))
You'll get just 3 evaluations within the scale of your plot. They are at
L= 1006.536
L= 1006.537
L= 1006.535
It appears to have chosen step size 0.001, not 0.00001. It should be
getting adequate accuracy in both 1st and 2nd derivatives.
Those little ripples you see in the plot are not relevant.
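One way to take finite differences out of the picture entirely is to supply the analytic gradient through optim's gr argument (a sketch; whether CG then behaves on this particular problem I have not verified):

```r
production1 <- function(L) {
  budget <- 100000; Lcost <- 12; Kcost <- 15
  K <- (budget - L * Lcost) / Kcost
  0.05 * L^(2/3) * K^(1/3)
}

# d/dL of the objective, using the chain rule with dK/dL = -Lcost/Kcost
production1.grad <- function(L) {
  budget <- 100000; Lcost <- 12; Kcost <- 15
  K  <- (budget - L * Lcost) / Kcost
  dK <- -Lcost / Kcost
  0.05 * ((2/3) * L^(-1/3) * K^(1/3) + (1/3) * L^(2/3) * K^(-2/3) * dK)
}

S4 <- optim(1001, production1, gr = production1.grad,
            method = "CG", control = list(fnscale = -1, maxit = 1000))
c(S4$par, S4$convergence)
```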


I'm impressed.
But you're still wrong.
Try this:

#not good R code!
v <- numeric()

production3 <- function(L){
  # store each evaluation point in v
  v <<- c(v, L)

  budget = 100000
  Lcost = 12
  Kcost = 15
  K = (budget - L*Lcost)/Kcost
  machines = 0.05*L^(2/3)*K^(1/3)
  return(machines)
}

optim.sol <- optim(1001, production3, method = "CG", control = list(fnscale = -1))

n <- length(v)
print(n)

plot(1:n, v, type = "l")

After 401 iterations (on my computer), the algorithm hasn't converged.
And I note it's converging extremely slowly, so I don't see any
argument for increasing the number of iterations.
And try this (the first 30 steps):

plot(1:30, v[1:30], type = "l")

Little ripples aren't going anywhere...


On 12/03/2020 8:52 p.m., Abby Spurdle wrote:
>
That's the bug.
It is correctly signalling that it hasn't converged (look at
optim.sol$convergence, which "indicates that the iteration limit maxit
had been reached".) But CG should be taking bigger steps. On a 1D
quadratic objective function with no errors in the derivatives, it
should take one step to convergence. Here we're not quadratic (though
it's pretty close), and we don't have exact derivatives (your ripples),
so the fact that it is sticking to one step size is a sign that it is
not working. If those ripples are big enough to matter (and I'm not
convinced of that), it should take highly variable steps.
The fact that it doesn't give a warning() when it knows it has failed to
converge is also a pretty serious design flaw.
Duncan Murdoch
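In the meantime, user code can promote the convergence code to a warning itself (a sketch; $convergence and $message are documented components of optim's return value):

```r
optim_checked <- function(...) {
  res <- optim(...)
  # 0 means successful convergence; anything else deserves a warning
  if (res$convergence != 0)
    warning("optim did not converge (code ", res$convergence, ")")
  res
}

# converges quietly (default Nelder-Mead on a smooth quadratic):
optim_checked(c(1, 1), function(p) sum((p - 2)^2))$convergence
# warns, because the iteration limit is hit:
# optim_checked(c(1, 1), function(p) sum((p - 2)^2), control = list(maxit = 1))
```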


Once again, CG and its successors aren't envisaged for 1D problems. Do you
really want to perform brain surgery with a chain saw?
Note that

production4 <- function(L) { -production3(L) }
sjn2 <- optimize(production3, c(900, 1100))
sjn2

gives

$minimum
[1] 900.0001

$objective
[1] 84.44156

Whether that is a good optimum I haven't checked.
JN
On 20200313 6:47 a.m., Duncan Murdoch wrote:


Hi Duncan,
I need to apologize.
The problem has nothing to do with little ripples.
(My bad...)
I tried approximating the function with a cubic Hermite spline
(essentially smoothing the function).
However, the optim function still returns the wrong result.
Which surprised me...
Then I tried changing the max number of iterations, and found
something quite interesting:

production.wr <- function(L){
  cat(L, "\n")
  budget = 100000
  Lcost = 12
  Kcost = 15
  K = (budget - L*Lcost)/Kcost
  machines = 0.05*L^(2/3)*K^(1/3)
  return(machines)
}

S1 = optim(1001, production.wr, method = "CG", control = list(fnscale = -1, maxit = 1))
S1 = optim(1001, production.wr, method = "CG", control = list(fnscale = -1, maxit = 2))
S1 = optim(1001, production.wr, method = "CG", control = list(fnscale = -1, maxit = 3))
S1 = optim(1001, production.wr, method = "CG", control = list(fnscale = -1, maxit = 4))

The first iteration calls the function (3 + 2) times.
Subsequent iterations call the function (2 + 2) times.
In the subset-of-3, the step size is exactly 0.001.
And in subsequent leading (but not trailing) subsets-of-2, the step
size is exactly 0.002.
I was wondering (hypothetically) if the first iteration is
approximating the second derivative, and subsequent iterations are
not...???
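The evaluation pattern described above can be reproduced with a small counting wrapper (a sketch, not from the thread; `dnorm` stands in for the production function, and the wrapper name is made up):

```r
# Hypothetical tracing wrapper: count how many times CG evaluates the
# objective for each value of maxit.
evals <- 0
counting_f <- function(x) {
  evals <<- evals + 1  # bump the global counter on every evaluation
  dnorm(x)
}
for (m in 1:4) {
  evals <- 0  # reset before each run
  optim(4, counting_f, method = "CG",
        control = list(fnscale = -1, maxit = m))
  cat("maxit =", m, "-> function evaluations:", evals, "\n")
}
```

Comparing consecutive lines of this output gives the per-iteration evaluation counts directly, without hand-counting `cat()` lines.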


I got that last point wrong as well.
(Each iteration is using five evaluations).
Ignore all my comments on this subject.
On 3/14/20, Abby Spurdle < [hidden email]> wrote:
>> It is correctly signalling that it hasn't converged (look at
>> optim.sol$convergence, which "indicates that the iteration limit maxit
>> had been reached".) But CG should be taking bigger steps. On a 1D
>> quadratic objective function with no errors in the derivatives, it
>> should take one step to convergence. Here we're not quadratic (though
>> it's pretty close), and we don't have exact derivatives (your ripples),
>> so the fact that it is sticking to one step size is a sign that it is
>> not working. If those ripples are big enough to matter (and I'm not
>> convinced of that), it should take highly variable steps.
>
> Hi Duncan,
>
> I need to apologize.
> The problem has nothing to do with little ripples.
> (My bad...)
>
> I tried approximating the function with a cubic Hermite spline.
> (Essentially smoothing the function).
>
> However, the optim function still returns the wrong result.
> Which surprised me...
>
> Then I tried changing the max number of iterations, and found
> something quite interesting:
> 
> production.wr <- function(L){
> cat (L, "\n")
> budget=100000
> Lcost=12
> Kcost=15
> K=(budget-L*Lcost)/Kcost
> machines=0.05*L^(2/3)*K^(1/3)
> return(machines)
> }
>
> S1=optim(1001,production.wr,method="CG",control=list(fnscale=-1, maxit=1))
> S1=optim(1001,production.wr,method="CG",control=list(fnscale=-1, maxit=2))
> S1=optim(1001,production.wr,method="CG",control=list(fnscale=-1, maxit=3))
> S1=optim(1001,production.wr,method="CG",control=list(fnscale=-1, maxit=4))
> 
>
> The first iteration calls the function (3 + 2) times.
> Subsequent iterations call the function (2 + 2) times.
> In the subset-of-3, the step size is exactly 0.001.
> And in subsequent leading (but not trailing) subsets-of-2, the step
> size is exactly 0.002.
>
> I was wondering (hypothetically) if the first iteration is
> approximating the second derivative, and subsequent iterations are
> not...???
>


It seems CG is having problems with the cube root. This converges while
still using CG:
S1 <- optim(1001, function(x) (production1(x)^3), method = "CG",
control = list(fnscale=-1))
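A self-contained version of this fix (with `production1` reconstructed from the original post below) can be checked against the analytic Cobb-Douglas optimum, where two-thirds of the budget goes to L:

```r
# production1 reconstructed from the original post, with fnscale = -1
# for maximization and the budget constraint K = (budget - L*Lcost)/Kcost.
production1 <- function(L) {
  budget <- 100000
  Lcost  <- 12
  Kcost  <- 15
  K <- (budget - L * Lcost) / Kcost
  0.05 * L^(2/3) * K^(1/3)
}

# Cubing removes the cube root; x^3 is monotone on the positive range,
# so the maximizer is unchanged.
S1 <- optim(1001, function(x) production1(x)^3, method = "CG",
            control = list(fnscale = -1))

# Analytic optimum: spend 2/3 of the budget on L,
# i.e. L* = (2/3) * 100000 / 12, about 5555.6
S1$par
```

This is the standard monotone-transformation trick: any strictly increasing transform of the objective preserves the argmax, so one can pick the transform that is easiest for the optimizer.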
On Thu, Mar 12, 2020 at 9:34 AM Skyler Saleebyan
< [hidden email]> wrote:
>
> I am trying to familiarize myself with optim() with a relatively simple
> maximization.
>
> Description:
> L and K are two terms which are constrained to add up to a total 100000
> (with respective weights to each). To map this constraint I plugged K into
> the function (to make this as simple as possible.)
>
> Together these two feed into one nonlinear function which is the product of
> two monotonic (on the positive interval) functions. Then that number is
> returned in a function fed to optim, which should maximize the output by
> adjusting L. The whole code is:
>
> production1 <- function(L){
> budget=100000
> Lcost=12
> Kcost=15
> K=(budget-L*Lcost)/Kcost
> machines=0.05*L^(2/3)*K^(1/3)
> return(machines)
> }
>
> # production1(6000) # example of a number with much higher output vs optim result
> S1=optim(1001,production1,method="CG",control=list(fnscale=-1))
> S1
>
> Output:
> $par
> [1] 1006.536
>
> $value
> [1] 90.54671
>
> $counts
> function gradient
> 201 101
>
> $convergence
> [1] 1
>
> $message
> NULL
>
>
> For some reason this never explores the problem space and just spits out
> some answer close to the initial condition. What am I doing wrong?
>
> Thanks,
> Skyler S.
>
> [[alternative HTML version deleted]]
>

Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


######################################################
I ran <while (TRUE) 0> before posting, and waited a while...
(Re: The posting guide, which I'm going to start putting a lot more weight on).
Noting, I was wondering if the posting guide has a mistake, because
<4*runif(1)> doesn't do anything special...
(Hopefully, Martin is reading this...).
And I'm planning to make this my last post in this thread...
######################################################
Here's a much simpler example of the problem:
optim (4, dnorm, method="CG", control = list (fnscale=-1) )$par
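For comparison (a quick sketch, not from the thread), the bounded Brent method finds the mode of dnorm from the same start value:

```r
# CG stalls near the start value x = 4, but Brent (which requires finite
# lower/upper bounds) locates the maximum of dnorm at 0 without trouble.
fit <- optim(4, dnorm, method = "Brent", lower = -10, upper = 10,
             control = list(fnscale = -1))
fit$par  # essentially 0
```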
This problem isn't limited to objective functions of one variable.
I tried similar problems with functions of two, three and four variables.
And the same thing happened.
I'm not sure if this is a bug in the R code, or not.
If it's a bug...
And if it's not a bug, I'm struggling to see why anyone would want to
use this method in an applied setting...
On Fri, Mar 13, 2020 at 3:42 AM Duncan Murdoch < [hidden email]> wrote:
>
> It looks like a bug in the CG method. The other methods in optim() all
> work fine. CG is documented to be a good choice in high dimensions; why
> did you choose it for a 1 dim problem?
>
> Duncan Murdoch


You are starting to sound like Dr Nash [1]... "use optimr".
[1] https://stat.ethz.ch/pipermail/r-help/2018-July/458498.html
On March 14, 2020 2:27:48 PM PDT, Abby Spurdle < [hidden email]> wrote:
>######################################################
>I ran <while (TRUE) 0> before posting, and waited a while...
>(Re: The posting guide, which I'm going to start putting a lot more
>weight on).
>
>Noting, I was wondering if the posting guide has a mistake, because
><4*runif(1)> doesn't do anything special...
>(Hopefully, Martin is reading this...).
>
>And I'm planning to make this my last post in this thread...
>######################################################
>Here's a much simpler example of the problem:
>
>optim (4, dnorm, method="CG", control = list (fnscale=-1) )$par
>
>This problem isn't limited to objective functions of one variable.
>I tried similar problems with functions of two, three and four
>variables.
>And the same thing happened.
>
>I'm not sure if this is a bug in the R code, or not.
>
>If it's a bug...
>
>And if it's not a bug, I'm struggling to see why anyone would want to
>use this method in an applied setting...
>
>
>On Fri, Mar 13, 2020 at 3:42 AM Duncan Murdoch
>< [hidden email]> wrote:
>>
>> It looks like a bug in the CG method. The other methods in optim()
>all
>> work fine. CG is documented to be a good choice in high dimensions;
>why
>> did you choose it for a 1 dim problem?
>>
>> Duncan Murdoch
>

Sent from my phone. Please excuse my brevity.

