Improves __ieee754_exp() performance by greater than 5x on sparc/x86.

  modified file:   sysdeps/ieee754/dbl-64/e_exp.c

These changes will be active for all platforms that don't provide
their own exp() routines. They will also be active for the ieee754
versions of ccos, ccosh, cosh, csin, csinh, sinh, exp10, gamma, and
erf which call __ieee754_exp() directly or indirectly.

Performance gains measured on a Sparc S7, testing common values
between exp(1) and exp(40), are typically around 5x.

Using the glibc perf tests on sparc and x86_64 (times in nsec):

        sparc           x86_64
        old     new     old     new
max   17629     400    5173     802
min     399      64      15      15
mean   5317     211    1349      29

The extreme max times for the old (ieee754) exp are due to the
multiprecision computation that the old algorithm performs when the
true value is very near 0.5 ulp away from a value representable in
double precision. The new algorithm does not take special measures for
those cases. The current glibc exp perf tests overrepresent those
values: informal testing suggests that approximately one in 200 cases
might invoke the high-cost computation. The performance advantage of
the new algorithm for other values is still large, but not as large as
the table above indicates.

Glibc correctness tests for exp() and expf() were run on sparc and x86_64.
The results match on both platforms. Within the test suite, three
input values were found for which the old and new results differ by
1 ulp (the last bit) when the "FE_TONEAREST" rounding mode is set. No
differences were seen for the tested values in the other rounding modes.
Typical example:
exp(-0x1.760cd2p+0)  (-1.46113312244415283203125)
 new code:    2.31973271630014299393707e-01   0x1.db14cd799387ap-3
 old code:    2.31973271630014271638132e-01   0x1.db14cd7993879p-3
 exp     =    2.31973271630014285508337e-01   (high precision)
Old delta: off by 0.49 ulp
New delta: off by 0.51 ulp
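
The 1 ulp gap between the two results above can be checked by
comparing their bit patterns. The following is only a minimal sketch:
ulp_distance() is a hypothetical helper, not part of glibc or of this
patch, and it assumes both arguments are finite IEEE 754 doubles with
the same sign.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in glibc): count the representable doubles
   separating A and B.  Assumes both are finite and have the same sign,
   so that their bit patterns, read as integers, are ordered by
   magnitude.  */
static int64_t
ulp_distance (double a, double b)
{
  int64_t ia, ib;

  /* Copy the raw bits; memcpy avoids strict-aliasing problems.  */
  memcpy (&ia, &a, sizeof ia);
  memcpy (&ib, &b, sizeof ib);

  return ia > ib ? ia - ib : ib - ia;
}
```

Under those assumptions,
ulp_distance (0x1.db14cd799387ap-3, 0x1.db14cd7993879p-3) returns 1,
since the two values differ only in the last bit of the mantissa.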

In addition, because __ieee754_exp() is used by other routines, the
cexp() tests showed cases with very small imaginary input values where
the imaginary part of the result was off by 3 ulp in upward rounding
mode, but not in the other rounding modes.
---
 sysdeps/ieee754/dbl-64/e_exp.c |  618 +++++++++++++++++++++++++++-------------
 1 files changed, 416 insertions(+), 202 deletions(-)
