fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is tiny
Commit Message
Hi,
before the patch below, the maximum ulp error for j0 in the whole binary32
range is 6177902 ulps (for x = 3.153646966e+38).
After this patch, it is 900691 ulps (for x = 2.404825449e+00).
The patch fixes the case where x >= 2^127 and tiny sin(x)+cos(x).
Large remaining errors are due to a cancellation in another branch of the code.
Paul
PS: the same method can be applied to j1 and y1.
PS2: this can wait for 2.33 of course.
From 6b731f36b1a5badf4704645d0dda40957cedd0db Mon Sep 17 00:00:00 2001
From: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Date: Mon, 27 Jul 2020 19:01:18 +0200
Subject: [PATCH] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x) is
tiny
---
sysdeps/ieee754/flt-32/e_j0f.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
Comments
On Mon, 27 Jul 2020, Paul Zimmermann wrote:
> + float x0 = 3.153646966e+38f;
> + float y = x - x0; /* exact */
> + /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
> + z = __sinf (y);
> + float eps = 8.17583368e-8f;
> + /* cos(x0) ~ -sin(x0) + eps */
> + z += eps * __cosf (x);
> + /* now z ~ (sin(x)-cos(x))*cos(x0) */
> + float cosx0 = -0.707106740f;
In new code we generally prefer to use hex float constants in such cases
where a specific floating-point value is wanted.
> In new code we generally prefer to use hex float constants in such cases
> where a specific floating-point value is wanted.
thank you Joseph. Here is a new version. The maximal error for x >= 2^127
is now 4 ulps (attained for x=1.740713465e+38).
Total: errors=4220511 (0.10%) errors2=393216 maxerr=4 ulp(s)
Paul
From 6b731f36b1a5badf4704645d0dda40957cedd0db Mon Sep 17 00:00:00 2001
From: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Date: Mon, 27 Jul 2020 19:01:18 +0200
Subject: [PATCH 1/2] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x)
is tiny
---
sysdeps/ieee754/flt-32/e_j0f.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c
index c89b9f2688..f85d8a59e0 100644
--- a/sysdeps/ieee754/flt-32/e_j0f.c
+++ b/sysdeps/ieee754/flt-32/e_j0f.c
@@ -56,6 +56,22 @@ __ieee754_j0f(float x)
if ((s*c)<zero) cc = z/ss;
else ss = z/cc;
}
+ else {
+ /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)
+ is very near from 0, and use the identity
+ sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
+ sin(x) + cos(x) with extra accuracy */
+ float x0 = 3.153646966e+38f;
+ float y = x - x0; /* exact */
+ /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
+ z = __sinf (y);
+ float eps = 8.17583368e-8f;
+ /* cos(x0) ~ -sin(x0) + eps */
+ z += eps * __cosf (x);
+ /* now z ~ (sin(x)-cos(x))*cos(x0) */
+ float cosx0 = -0.707106740f;
+ cc = z / cosx0;
+ }
/*
* j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
* y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
On Jul 28 2020, Paul Zimmermann wrote:
> + /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)
> + is very near from 0, and use the identity
Did you mean "near to"?
Andreas.
Dear Andreas,
yes thanks. Sorry my english is not perfect.
Paul
From 6b731f36b1a5badf4704645d0dda40957cedd0db Mon Sep 17 00:00:00 2001
From: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Date: Mon, 27 Jul 2020 19:01:18 +0200
Subject: [PATCH 1/3] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x)
is tiny
---
sysdeps/ieee754/flt-32/e_j0f.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c
index c89b9f2688..f85d8a59e0 100644
--- a/sysdeps/ieee754/flt-32/e_j0f.c
+++ b/sysdeps/ieee754/flt-32/e_j0f.c
@@ -56,6 +56,22 @@ __ieee754_j0f(float x)
if ((s*c)<zero) cc = z/ss;
else ss = z/cc;
}
+ else {
+ /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)
+ is very near from 0, and use the identity
+ sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
+ sin(x) + cos(x) with extra accuracy */
+ float x0 = 3.153646966e+38f;
+ float y = x - x0; /* exact */
+ /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
+ z = __sinf (y);
+ float eps = 8.17583368e-8f;
+ /* cos(x0) ~ -sin(x0) + eps */
+ z += eps * __cosf (x);
+ /* now z ~ (sin(x)-cos(x))*cos(x0) */
+ float cosx0 = -0.707106740f;
+ cc = z / cosx0;
+ }
/*
* j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
* y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
On 28/07/2020 07:50, Paul Zimmermann wrote:
> Dear Andreas,
>
> yes thanks. Sorry my english is not perfect.
>
> Paul
Could you send v2 patch with all the fixes indicated by Joseph and Andreas
(this change from a change format is confusing)? Also please fix the
indentation issue and the open brackets on next line.
I also think this fix should also add an entry on math/auto-libm-test-out-y0
that exercises this code path and with a check if the ULPs file require
some adjustments as well.
>
> From 6b731f36b1a5badf4704645d0dda40957cedd0db Mon Sep 17 00:00:00 2001
> From: Paul Zimmermann <Paul.Zimmermann@inria.fr>
> Date: Mon, 27 Jul 2020 19:01:18 +0200
> Subject: [PATCH 1/3] fix inaccuracy of j0f for x >= 2^127 when sin(x)+cos(x)
> is tiny
>
> ---
> sysdeps/ieee754/flt-32/e_j0f.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/sysdeps/ieee754/flt-32/e_j0f.c b/sysdeps/ieee754/flt-32/e_j0f.c
> index c89b9f2688..f85d8a59e0 100644
> --- a/sysdeps/ieee754/flt-32/e_j0f.c
> +++ b/sysdeps/ieee754/flt-32/e_j0f.c
> @@ -56,6 +56,22 @@ __ieee754_j0f(float x)
> if ((s*c)<zero) cc = z/ss;
> else ss = z/cc;
> }
> + else {
> + /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)
> + is very near from 0, and use the identity
> + sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
> + sin(x) + cos(x) with extra accuracy */
> + float x0 = 3.153646966e+38f;
> + float y = x - x0; /* exact */
> + /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
> + z = __sinf (y);
> + float eps = 8.17583368e-8f;
> + /* cos(x0) ~ -sin(x0) + eps */
> + z += eps * __cosf (x);
> + /* now z ~ (sin(x)-cos(x))*cos(x0) */
> + float cosx0 = -0.707106740f;
> + cc = z / cosx0;
> + }
> /*
> * j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
> * y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)
>
@@ -56,6 +56,22 @@ __ieee754_j0f(float x)
if ((s*c)<zero) cc = z/ss;
else ss = z/cc;
}
+ else {
+ /* we subtract (exactly) a value x0 such that cos(x0)+sin(x0)
+ is very near from 0, and use the identity
+ sin(x-x0) = sin(x)*cos(x0)-cos(x)*sin(x0) to get
+ sin(x) + cos(x) with extra accuracy */
+ float x0 = 3.153646966e+38f;
+ float y = x - x0; /* exact */
+ /* sin(y) = sin(x)*cos(x0)-cos(x)*sin(x0) */
+ z = __sinf (y);
+ float eps = 8.17583368e-8f;
+ /* cos(x0) ~ -sin(x0) + eps */
+ z += eps * __cosf (x);
+ /* now z ~ (sin(x)-cos(x))*cos(x0) */
+ float cosx0 = -0.707106740f;
+ cc = z / cosx0;
+ }
/*
* j0(x) = 1/sqrt(pi) * (P(0,x)*cc - Q(0,x)*ss) / sqrt(x)
* y0(x) = 1/sqrt(pi) * (P(0,x)*ss + Q(0,x)*cc) / sqrt(x)