Fix x86/x86_64 expl/exp10l spurious underflows (bug 16348)

Message ID Pine.LNX.4.64.1403271650430.649@digraph.polyomino.org.uk
State Committed
Headers

Commit Message

Joseph Myers March 27, 2014, 4:51 p.m. UTC
  This patch fixes bug 16348, spurious underflows from x86/x86_64 expl
on arguments close to 0.  These implementations effectively use expm1
(on the fractional part of the argument) internally, so resulting in
spurious underflows when the result is very close to 1.  For arguments
small enough that the round-to-nearest correct result is 1, this patch
uses 1+x instead.

These implementations are also used for exp10l and so the patch fixes
similar issues there (the 0x1p-67 threshold being small enough to be
correct for exp10l as well as expl).  But because of spurious
underflows in other exp10 implementations (bug 16560), the tests
aren't added for exp10 at this point - they can be added when the
other exp10 parts of that bug are fixed.

Tested x86_64 and x86; no ulps updates needed.

(auto-libm-test-out diffs omitted below.)

2014-03-27  Joseph Myers  <joseph@codesourcery.com>

	[BZ #16348]
	* sysdeps/i386/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]: Use
	1+x for argument with exponent below -67.
	* sysdeps/x86_64/fpu/e_expl.S (IEEE754_EXPL) [!USE_AS_EXPM1L]:
	Likewise.
	* math/auto-libm-test-in: Add more tests of exp.
	* math/auto-libm-test-out: Regenerated.
  

Comments

Andreas Jaeger March 27, 2014, 6:30 p.m. UTC | #1
On 03/27/2014 05:51 PM, Joseph S. Myers wrote:
> This patch fixes bug 16348, spurious underflows from x86/x86_64 expl
> on arguments close to 0.  These implementations effectively use expm1
> (on the fractional part of the argument) internally, so resulting in
> spurious underflows when the result is very close to 1.  For arguments
> small enough that the round-to-nearest correct result is 1, this patch
> uses 1+x instead.
> 
> These implementations are also used for exp10l and so the patch fixes
> similar issues there (the 0x1p-67 threshold being small enough to be
> correct for exp10l as well as expl).  But because of spurious
> underflows in other exp10 implementations (bug 16560), the tests
> aren't added for exp10 at this point - they can be added when the
> other exp10 parts of that bug are fixed.

Thanks,
Andreas
  

Patch

diff --git a/math/auto-libm-test-in b/math/auto-libm-test-in
index 7c80192..7ad407c 100644
--- a/math/auto-libm-test-in
+++ b/math/auto-libm-test-in
@@ -843,6 +843,24 @@  exp -7.4444006192138124e+02
 exp -0x1.75f113c30b1c8p+9
 exp -max
 exp -11342.8125
+exp 0x1p-10
+exp -0x1p-10
+exp 0x1p-20
+exp -0x1p-20
+exp 0x1p-30
+exp -0x1p-30
+exp 0x1p-40
+exp -0x1p-40
+exp 0x1p-50
+exp -0x1p-50
+exp 0x1p-60
+exp -0x1p-60
+exp 0x1p-100
+exp -0x1p-100
+exp min
+exp -min
+exp min_subnorm
+exp -min_subnorm
 
 exp10 0
 exp10 -0
diff --git a/sysdeps/i386/fpu/e_expl.S b/sysdeps/i386/fpu/e_expl.S
index 5917f57..eb4086b 100644
--- a/sysdeps/i386/fpu/e_expl.S
+++ b/sysdeps/i386/fpu/e_expl.S
@@ -112,8 +112,14 @@  ENTRY(IEEE754_EXPL)
 	movzwl	4+8(%esp), %eax
 	andl	$0x7fff, %eax
 	cmpl	$0x400d, %eax
-	jle	3f
-	/* Overflow, underflow or infinity or NaN as argument.  */
+	jg	5f
+	cmpl	$0x3fbc, %eax
+	jge	3f
+	/* Argument's exponent below -67, result rounds to 1.  */
+	fld1
+	faddp
+	jmp	2f
+5:	/* Overflow, underflow or infinity or NaN as argument.  */
 	fstsw	%ax
 	movb	$0x45, %dh
 	andb	%ah, %dh
diff --git a/sysdeps/x86_64/fpu/e_expl.S b/sysdeps/x86_64/fpu/e_expl.S
index 36d30c2..338527b 100644
--- a/sysdeps/x86_64/fpu/e_expl.S
+++ b/sysdeps/x86_64/fpu/e_expl.S
@@ -109,8 +109,14 @@  ENTRY(IEEE754_EXPL)
 	movzwl	8+8(%rsp), %eax
 	andl	$0x7fff, %eax
 	cmpl	$0x400d, %eax
-	jle	3f
-	/* Overflow, underflow or infinity or NaN as argument.  */
+	jg	5f
+	cmpl	$0x3fbc, %eax
+	jge	3f
+	/* Argument's exponent below -67, result rounds to 1.  */
+	fld1
+	faddp
+	jmp	2f
+5:	/* Overflow, underflow or infinity or NaN as argument.  */
 	fstsw	%ax
 	movb	$0x45, %dh
 	andb	%ah, %dh