Adjust thresholds in Bessel function implementations (bug 14469)
Commit Message
A recent discussion in bug 14469 notes that a threshold in float
Bessel function implementations, used to determine when to use a
simpler implementation approach, results in substantially inaccurate
results.
As I discussed in
<https://sourceware.org/ml/libc-alpha/2013-03/msg00345.html>, a
heuristic argument suggests 2^(S+P) as the right order of magnitude
for a suitable threshold, where S is the number of significand bits in
the floating-point type and P is the number of significant bits in the
representation of the floating-point type; the float and ldbl-96
implementations use thresholds well below that. Some threshold is
still needed, there or elsewhere in the implementation, to avoid
spurious underflow and overflow for large arguments.
This patch sets the thresholds in the affected implementations to more
heuristically justifiable values. Results will still be inaccurate
close to zeroes of the functions (thus this patch does *not* fix any
of the bugs for Bessel function inaccuracy); fixing that would require
a different implementation approach, likely along the lines described
in <http://www.cl.cam.ac.uk/~jrh13/papers/bessel.ps.gz>.
So the justification for a change such as this would be statistical
rather than based on particular tests that had excessive errors and no
longer do so (no doubt such tests could be found, but would probably
be too fragile to add to the testsuite, as liable to give large errors
again from very small implementation changes or even from compiler
changes). Paul, could you run your exhaustive tests of j0f, j1f, y0f,
y1f with this patch and see what effect it has on the maximum ulp
error and the number of inputs with large errors?
Tested (glibc testsuite) for x86_64.
Comments
Dear Joseph,
> Paul, could you run your exhaustive tests of j0f, j1f, y0f,
> y1f with this patch and see what effect it has on the maximum ulp
> error and the number of inputs with large errors?
here are the results on x86_64 for the binary32 functions:
Maximal error in ulps:

         glibc-2.31   glibc-2.31 + patch
j0f      4760682496              6177902
j1f       714456064              2246675
y0f     15117099008              4837658
y1f       464033280              6177902
Number of incorrectly rounded results:

         glibc-2.31   glibc-2.31 + patch
j0f      1353797232           1334176546
j1f      1369557306           1340594104
y0f      1314115311           1304302535
y1f      1213975420           1199498354
The maximal errors have thus been greatly reduced.
Best regards,
Paul
On Fri, 14 Feb 2020, paul zimmermann wrote:
> Dear Joseph,
>
> > Paul, could you run your exhaustive tests of j0f, j1f, y0f,
> > y1f with this patch and see what effect it has on the maximum ulp
> > error and the number of inputs with large errors?
>
> here are the results on x86_64 for the binary32 functions:
Thanks. I've now committed the patch.
@@ -60,7 +60,7 @@ __ieee754_j0f(float x)
* j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
* y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
*/
- if(ix>0x48000000) z = (invsqrtpi*cc)/sqrtf(x);
+ if(ix>0x5c000000) z = (invsqrtpi*cc)/sqrtf(x);
else {
u = pzerof(x); v = qzerof(x);
z = invsqrtpi*(u*cc-v*ss)/sqrtf(x);
@@ -133,7 +133,7 @@ __ieee754_y0f(float x)
if ((s*c)<zero) cc = z/ss;
else ss = z/cc;
}
- if(ix>0x48000000) z = (invsqrtpi*ss)/sqrtf(x);
+ if(ix>0x5c000000) z = (invsqrtpi*ss)/sqrtf(x);
else {
u = pzerof(x); v = qzerof(x);
z = invsqrtpi*(u*ss+v*cc)/sqrtf(x);
@@ -65,7 +65,7 @@ __ieee754_j1f(float x)
* j1(x) = 1/sqrt(pi) * (P(1,x)*cc - Q(1,x)*ss) / sqrt(x)
* y1(x) = 1/sqrt(pi) * (P(1,x)*ss + Q(1,x)*cc) / sqrt(x)
*/
- if(ix>0x48000000) z = (invsqrtpi*cc)/sqrtf(y);
+ if(ix>0x5c000000) z = (invsqrtpi*cc)/sqrtf(y);
else {
u = ponef(y); v = qonef(y);
z = invsqrtpi*(u*cc-v*ss)/sqrtf(y);
@@ -139,7 +139,7 @@ __ieee754_y1f(float x)
* sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x))
* to compute the worse one.
*/
- if(ix>0x48000000) z = (invsqrtpi*ss)/sqrtf(x);
+ if(ix>0x5c000000) z = (invsqrtpi*ss)/sqrtf(x);
else {
u = ponef(x); v = qonef(x);
z = invsqrtpi*(u*ss+v*cc)/sqrtf(x);
@@ -134,7 +134,7 @@ __ieee754_j0l (long double x)
* j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
* y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
*/
- if (__glibc_unlikely (ix > 0x4080)) /* 2^129 */
+ if (__glibc_unlikely (ix > 0x408e)) /* 2^143 */
z = (invsqrtpi * cc) / sqrtl (x);
else
{
@@ -236,7 +236,7 @@ __ieee754_y0l (long double x)
else
ss = z / cc;
}
- if (__glibc_unlikely (ix > 0x4080)) /* 1e39 */
+ if (__glibc_unlikely (ix > 0x408e)) /* 2^143 */
z = (invsqrtpi * ss) / sqrtl (x);
else
{
@@ -138,7 +138,7 @@ __ieee754_j1l (long double x)
* j1(x) = 1/sqrt(pi) * (P(1,x)*cc - Q(1,x)*ss) / sqrt(x)
* y1(x) = 1/sqrt(pi) * (P(1,x)*ss + Q(1,x)*cc) / sqrt(x)
*/
- if (__glibc_unlikely (ix > 0x4080))
+ if (__glibc_unlikely (ix > 0x408e))
z = (invsqrtpi * cc) / sqrtl (y);
else
{
@@ -232,7 +232,7 @@ __ieee754_y1l (long double x)
* sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x))
* to compute the worse one.
*/
- if (__glibc_unlikely (ix > 0x4080))
+ if (__glibc_unlikely (ix > 0x408e))
z = (invsqrtpi * ss) / sqrtl (x);
else
{