

I've discovered an infelicity (I guess) in qbeta(): it's not a bug,
since there's a clear warning about lack of convergence of the numerical
algorithm ("full precision may not have been achieved"). I can work
around this, but I'm curious why it happens and whether there's a better
workaround; it doesn't seem to be in a particularly extreme corner of
parameter space. It happens, e.g., for these parameters:

phi <- 1.1
i <- 0.01
t <- 0.001
shape1 = i/phi       ## 0.009090909
shape2 = (1-i)/phi   ## 0.9
qbeta(t, shape1, shape2)   ## 5.562685e-309
## brute-force uniroot() version, see below
Qbeta0(t, shape1, shape2)  ## 0.9262824
The qbeta code is pretty scary to read: the warning "full precision
may not have been achieved" is triggered here:

https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qbeta.c#L530

Any thoughts? Should I report this on the bug list?
A more general illustration:
http://www.math.mcmaster.ca/bolker/misc/qbeta.png

===
fun <- function(phi, i=0.01, t=0.001, f=qbeta) {
  f(t, shape1=i/phi, shape2=(1-i)/phi, lower.tail=FALSE)
}
## brute-force beta quantile function
Qbeta0 <- function(t, shape1, shape2, lower.tail=FALSE) {
  fn <- function(x) { pbeta(x, shape1, shape2, lower.tail=lower.tail) - t }
  uniroot(fn, interval=c(0,1))$root
}
Qbeta <- Vectorize(Qbeta0, c("t","shape1","shape2"))
curve(fun, from=1, to=4)
curve(fun(x, f=Qbeta), add=TRUE, col=2)


>>>>> Ben Bolker
>>>>>     on Wed, 25 Mar 2020 21:09:16 -0400 writes:
> [... message quoted in full above ...]

> Any thoughts?
Well, qbeta() is mostly based on inverting pbeta(), and pbeta()
has *several* "dangerous" corners in its parameter space
{in some cases, it makes sense to look at the 4 different cases
log.p = TRUE/FALSE // lower.tail = TRUE/FALSE separately ..}

pbeta() itself is based on the most complex numerical code in
all of base R, i.e., src/nmath/toms708.c, and that algorithm
(TOMS 708) was already sophisticated when it was published;
it has been improved and tweaked several times since becoming
part of R, notably for the log.p=TRUE case, which had not been
a focus of the publication and its algorithm.
[[ NB: part of this you can read when reading help(pbeta) to the end ! ]]
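
[A minimal sketch tabulating those four cases at Ben's parameter values;
0.9262824 is the uniroot() quantile from the first post:

shape1 <- 0.01/1.1; shape2 <- (1 - 0.01)/1.1
## pbeta() in each of the 4 (lower.tail, log.p) combinations; per Martin's
## remark, each combination is worth examining separately
for (lt in c(TRUE, FALSE))
  for (lp in c(TRUE, FALSE))
    print(c(lower.tail = lt, log.p = lp,
            value = pbeta(0.9262824, shape1, shape2,
                          lower.tail = lt, log.p = lp)))
]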
I've already spent many "man weeks", even "man months", on pbeta() and
qbeta(), and have dreamed of getting a good student to do a
master's thesis about the problem and the potential solutions I've
looked into in the meantime.

My current gut feeling is that in some cases, new approximations
are necessary (i.e., tweaking the current approximations is not
going to help sufficiently).
Also note (in the R sources) tests/p-qbeta-strict-tst.R,
a whole file of "regression tests" for pbeta() and qbeta()
{where part of the true values have been computed with my CRAN
package Rmpfr (for high-precision computation), via the
Rmpfr::pbetaI() function, which gives arbitrarily precise pbeta()
values, but only when (a,b) are integers; that's the "I" in pbetaI()}.
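
[A minimal sketch of producing such reference values, assuming the CRAN
package Rmpfr is installed; integer shape parameters are required for
pbetaI() to be exact, and the precBits argument names the working precision:

library(Rmpfr)
## pbetaI() evaluates the beta CDF through a finite sum, so for integer
## (shape1, shape2) it is exact to the requested number of bits:
pbetaI(0.75, shape1 = 2, shape2 = 7, precBits = 128)  ## high-precision reference
pbeta (0.75, shape1 = 2, shape2 = 7)                  ## double-precision value
]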
Yes, it's intriguing ... and I'll look into your special
findings a bit later today.
> Should I report this on the bug list?
Yes, please. Not all problems of pbeta() / qbeta() are yet part
of R's bugzilla database, and maybe this will help to draw
more good applied mathematicians into looking at it.

Martin Maechler
ETH Zurich and R Core team
(I'd call myself the "dpq-hacker" within R Core, related to
my CRAN package 'DPQ')


It's a pretty extreme case; try e.g. curve(pbeta(x, shape1, shape2), n=10001).
It is also (probably; I can't be bothered to work out the relation between
beta shapes and F df parameters this morning...) outside what is normally
encountered in statistical analyses. Notice, though, that you have
lower=FALSE in your Qbeta0, so you should have it in qbeta as well:
> qbeta(t,shape1,shape2, lower=FALSE)
[1] 0.9999949
Warning message:
In qbeta(t, shape1, shape2, lower = FALSE) :
full precision may not have been achieved in 'qbeta'
which of course is still wrong. (I don't think there is a problem in the other tail; Qbeta0(t,...., lower=TRUE) returns zero.)
You can see the effect also with

curve(qbeta(x, shape1, shape2), n=10001, from=.99, to=1)

which kind of suggests one of those regime-switching bugs, where different methods are used for various parts of the domain and the crossover is not done quite right.
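
[A sketch of that zoomed-in comparison, reusing Ben's Qbeta() from the
first post; Qbeta0() inverts the upper tail by default, so Qbeta(1 - x, ...)
gives the lower-tail quantile:

shape1 <- 0.01/1.1; shape2 <- (1 - 0.01)/1.1  ## values from Ben's example
## near p = 1 the quantile curve should rise smoothly to 1; a kink or
## jump would mark a method-switch boundary inside qbeta()
curve(qbeta(x, shape1, shape2), n = 10001, from = 0.99, to = 1)
## overlay the brute-force inverse as a reference
curve(Qbeta(1 - x, shape1, shape2), add = TRUE, col = 2, n = 501)
]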
At any rate, qbeta is one of R's very basic workhorses, so we do want it to work right; by all means file a report.
pd
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [hidden email] Priv: [hidden email]


Given that a number of us are housebound, it might be a good time to try to
improve the approximation. It's not an area where I have much expertise, but in
looking at the qbeta.c code I see a lot of root-finding, where I do have some
background. However, I'm very reluctant to work alone on this, and will ask
interested others to email off-list. If there are others, I'll report back.

Ben: Do you have an idea of the parameter region where the approximation is poor?
I think that it would be smart to focus on that to start with.

Martin: On a separate precision matter, did you get my query early in the year
about double-length accumulation of inner products of vectors in Rmpfr? R-help
more or less implied that Rmpfr does NOT use extra length. I've been using David
Smith's FM Fortran, where DOT_PRODUCT does use double length, but it
would be nice to have that in R. My attempts to find "easy" workarounds have
not been successful, but I'll admit that other things took precedence.
Best,
John Nash


>>>>> J C Nash
>>>>>     on Thu, 26 Mar 2020 09:29:53 -0400 writes:
> Given that a number of us are housebound, it might be a good time to try to
> improve the approximation. It's not an area where I have much expertise, but in
> looking at the qbeta.c code I see a lot of root-finding, where I do have some
> background. However, I'm very reluctant to work alone on this, and will ask
> interested others to email off-list. If there are others, I'll report back.
Hi John.

Yes, qbeta() {in its "main branches"} does zero finding, but
zero finding of pbeta(...) - p*, and I tried to explain in my
last email that the real problem is that already pbeta() is not
accurate enough in some unstable corners ...

The order of fixing should typically be

1) fix pbeta()
2) look at qbeta(), which then may not even need a fix, because its
   problems may have been entirely a consequence of pbeta()'s inaccuracies.

And if there are cases where the qbeta() problems are not
only pbeta's "fault", it is still true that any fixes
still needed would crucially depend on the detailed
workings of the function whose zero(s) are sought, i.e., pbeta().
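
[That qbeta() is zero finding on pbeta(...) - p can be checked directly:
feed qbeta()'s answer back into pbeta() and inspect the residual. A sketch
using Ben's parameter values from the first post:

shape1 <- 0.01/1.1; shape2 <- (1 - 0.01)/1.1; t <- 0.001
q <- qbeta(t, shape1, shape2, lower.tail = FALSE)  ## warns, ~0.9999949
## the residual the zero finder should have driven to (near) zero:
pbeta(q, shape1, shape2, lower.tail = FALSE) - t
## the same residual at the brute-force uniroot() answer (~0.9262824):
pbeta(0.9262824, shape1, shape2, lower.tail = FALSE) - t
]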
> Ben: Do you have an idea of parameter region where approximation is poor?
> I think that it would be smart to focus on that to start with.

Rmpfr matrix/vector products:
> Martin: On a separate precision matter, did you get my query early in the year
> about double-length accumulation of inner products of vectors in Rmpfr? R-help
> more or less implied that Rmpfr does NOT use extra length. I've been using David
> Smith's FM Fortran, where DOT_PRODUCT does use double length, but it
> would be nice to have that in R. My attempts to find "easy" workarounds have
> not been successful, but I'll admit that other things took precedence.
Well, the current development version of 'Rmpfr' on R-forge now
contains facilities to enlarge the precision of the computations
by a factor 'fPrec', with default 'fPrec = 1';
notably, x %*% y (where `%*%` cannot have more
than two arguments) now has a counterpart matmult(x, y, ....)
which allows more arguments, namely 'fPrec', or directly 'precBits';
and of course there are crossprod() and tcrossprod(), which one should
use when applicable, and they have also gained the 'fPrec' and
'precBits' arguments.

{The %*% etc. precision increase still does not work optimally
efficiency-wise, as it simply increases the precision of all
computations by just increasing the precision of x and y (the inputs).}
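
[A sketch of that interface as just described, assuming a version of Rmpfr
that has matmult(); the fPrec/precBits argument names are as given above:

library(Rmpfr)
x <- mpfr(matrix(1:4, 2, 2), precBits = 64) / 7  ## mpfr matrices with
y <- mpfr(matrix(4:1, 2, 2), precBits = 64) / 3  ## non-terminating entries
x %*% y                    ## product at the operands' own 64-bit precision
matmult(x, y, fPrec = 2)   ## same product, intermediates at doubled precision
crossprod(x, y, fPrec = 2) ## t(x) %*% y, likewise accepting fPrec/precBits
]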
The whole matrix and matrix-vector arithmetic is still
comparatively slow in Rmpfr .. mostly because I valued human time
(mine!) much higher than computer time in its implementation.
That's one reason I would never want to double the precision
everywhere, as it decreases speed even more, and oftentimes
unnecessarily: doubling the precision is basically a "worst-case
scenario" precaution.

Martin


Despite the need to focus on pbeta, I'm still willing to put in some effort.
But I find it really helps to have 2-3 others involved, since the questions back
and forth keep matters moving forward. Volunteers?
Thanks to Martin for detailed comments.
JN


This is also strange:

qbeta <- function (p, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE)
{
    if (missing(ncp))
        .Call(C_qbeta, p, shape1, shape2, lower.tail, log.p)
    else .Call(C_qnbeta, p, shape1, shape2, ncp, lower.tail, log.p)
}
Since the default value is 0 for the noncentrality parameter, the logic above looks wrong: when ncp=0 is supplied explicitly, C_qnbeta is called rather than C_qbeta.
Am I missing something?
Ravi
________________________________
From: R-devel < [hidden email]> on behalf of J C Nash < [hidden email]>
Sent: Thursday, March 26, 2020 10:40:05 AM
To: Martin Maechler
Cc: [hidden email]
Subject: Re: [Rd] unstable corner of parameter space for qbeta?


>>>>> Ravi Varadhan
>>>>> on Thu, 26 Mar 2020 18:33:43 +0000 writes:
> This is also strange: [qbeta()'s R-level definition, quoted above]

> Since the default value is 0 for the noncentrality parameter, the logic
> above looks wrong: when ncp=0 is supplied explicitly, C_qnbeta is called
> rather than C_qbeta.
> Am I missing something?
Yes. This has been on purpose (forever), and I think the help
page mentions that, though probably a bit subtly.
The way it is now, one can use both algorithms to compute what
in principle should be the same thing.
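
[In other words, the dispatch on missing(ncp) gives users a handle on both
code paths; a small illustration, with arbitrary shape values:

qbeta(0.9, shape1 = 2, shape2 = 3)           ## central algorithm (C_qbeta)
qbeta(0.9, shape1 = 2, shape2 = 3, ncp = 0)  ## noncentral algorithm
## (C_qnbeta) evaluated at ncp = 0; in principle the same quantile, so
## any difference between the two exposes error in one of the algorithms
]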


On 2020-03-26 4:02 a.m., Martin Maechler wrote:
>>>>>> Ben Bolker
>>>>>>     on Wed, 25 Mar 2020 21:09:16 -0400 writes:
>
> [... quoted in full above ...]
>
> > Should I report this on the bug list?
>
> Yes, please. Not all problems of pbeta() / qbeta() are yet part
> of R's bugzilla database, and maybe this will help to draw
> more good applied mathematicians into looking at it.
Will report.

I'm not at all surprised that this is a super-tough problem. The only
part that was surprising to me was that my naive uniroot-based solution
worked for this particular corner of parameter space where qbeta() has
trouble (it was terrible elsewhere), so now I'm using a hybrid solution:
my brute-force uniroot thing whenever I get a warning from qbeta().
I hesitated to even bring it up because I know you're really busy, but I
figured it was better to tag it now and let you deal with it some time
later.
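
[A sketch of such a hybrid; a hypothetical helper, not Ben's actual code:
muffle qbeta()'s precision warning and fall back to the uniroot() inverse
whenever it fires.

qbeta_safe <- function(p, shape1, shape2, lower.tail = FALSE) {
  warned <- FALSE
  q <- withCallingHandlers(
    qbeta(p, shape1, shape2, lower.tail = lower.tail),
    warning = function(w) {
      warned <<- TRUE                  ## remember that qbeta() complained
      invokeRestart("muffleWarning")   ## and suppress the warning itself
    })
  if (!warned) return(q)
  ## brute-force fallback: invert pbeta() directly, as in Qbeta0() above
  uniroot(function(x) pbeta(x, shape1, shape2, lower.tail = lower.tail) - p,
          interval = c(0, 1))$root
}
]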
Bugzilla report at https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17746

cheers
 Ben Bolker
______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

