[1/3] Optimize trunc() and truncf().

Message ID 1464139950-31943-1-git-send-email-mattst88@gmail.com
State New, archived
Headers

Commit Message

Matt Turner May 25, 2016, 1:32 a.m. UTC
  By creating a mask of non-fractional bits from the exponent.
---

Joseph suggested an SSE 4.1 implementation of trunc/truncf, so I thought
now would be a good time to send these patches even sans benchmarking data.

I do not believe the other two generic trunc* implementations (ldbl-128,
dbl-64) would benefit from the change made in this patch

Suggestions for ChangeLog entries welcome. Guidance for generating benchmark
data requested.

 sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c | 32 +++++++++++-----------------
 sysdeps/ieee754/flt-32/s_truncf.c            | 32 +++++++++++-----------------
 2 files changed, 26 insertions(+), 38 deletions(-)
  

Comments

Florian Weimer May 25, 2016, 9:02 a.m. UTC | #1
On 05/25/2016 03:32 AM, Matt Turner wrote:
 > +  int64_t exp = (i0 >> 52) & 0x7ff;
 > +  int64_t mask = UINT64_C(-1) << max(52 - (exp - 1023), 0);

I think it will help GCC if you make exp an int and change the max 
function accordingly.

You should use spaces in front of the parenthesis in a function call, 
and also for most macro calls.

Thanks,
Florian
  
Joseph Myers May 25, 2016, 10:39 a.m. UTC | #2
On Tue, 24 May 2016, Matt Turner wrote:

> +  int64_t exp = (i0 >> 52) & 0x7ff;
> +  int64_t mask = UINT64_C(-1) << max(52 - (exp - 1023), 0);

This can involve undefined behavior (shift by more than 63), so ...

> +  if (exp < 1023)
> +    mask = UINT64_C(0x8000000000000000);

... the shift should only be done at all if exp is large enough.

Likewise for truncf.
  
Joseph Myers May 25, 2016, 3:36 p.m. UTC | #3
On Tue, 24 May 2016, Matt Turner wrote:

> -  else
> -    {
> -      if (j0 == 0x400)
> -	/* x is inf or NaN.  */
> -	return x + x;
> -    }

You're removing too much code here.  You need to keep this addition in the 
NaN case to ensure that signaling NaNs raise "invalid" and get converted 
to quiet NaNs in the process.
  
Mike Frysinger May 25, 2016, 5:20 p.m. UTC | #4
On 25 May 2016 15:36, Joseph Myers wrote:
> On Tue, 24 May 2016, Matt Turner wrote:
> > -  else
> > -    {
> > -      if (j0 == 0x400)
> > -	/* x is inf or NaN.  */
> > -	return x + x;
> > -    }
> 
> You're removing too much code here.  You need to keep this addition in the 
> NaN case to ensure that signaling NaNs raise "invalid" and get converted 
> to quiet NaNs in the process.

does this mean current test coverage is insufficient ?
-mike
  
Joseph Myers May 25, 2016, 5:24 p.m. UTC | #5
On Wed, 25 May 2016, Mike Frysinger wrote:

> On 25 May 2016 15:36, Joseph Myers wrote:
> > On Tue, 24 May 2016, Matt Turner wrote:
> > > -  else
> > > -    {
> > > -      if (j0 == 0x400)
> > > -	/* x is inf or NaN.  */
> > > -	return x + x;
> > > -    }
> > 
> > You're removing too much code here.  You need to keep this addition in the 
> > NaN case to ensure that signaling NaNs raise "invalid" and get converted 
> > to quiet NaNs in the process.
> 
> does this mean current test coverage is insufficient ?

Yes.  I intend to add sNaN support to libm-test.inc at some point if noone 
else gets there first; no doubt this would then show up lots of bugs in 
existing code.  Cf. 
<https://sourceware.org/ml/libc-ports/2013-04/msg00008.html>, but that 
would need completely reworking now for current libm-test.
  

Patch

diff --git a/sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c b/sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c
index 81ac55e..e4cba3b 100644
--- a/sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c
+++ b/sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c
@@ -1,7 +1,6 @@ 
 /* Truncate argument to nearest integral value not larger than the argument.
    Copyright (C) 1997-2016 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@cygnus.com>, 1997.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -21,31 +20,26 @@ 
 
 #include <math_private.h>
 
+static int64_t
+max (int64_t x, int64_t y)
+{
+  return x > y ? x : y;
+}
 
 double
 __trunc (double x)
 {
-  int64_t i0, j0;
-  int64_t sx;
+  int64_t i0;
 
   EXTRACT_WORDS64 (i0, x);
-  sx = i0 & UINT64_C(0x8000000000000000);
-  j0 = ((i0 >> 52) & 0x7ff) - 0x3ff;
-  if (j0 < 52)
-    {
-      if (j0 < 0)
-	/* The magnitude of the number is < 1 so the result is +-0.  */
-	INSERT_WORDS64 (x, sx);
-      else
-	INSERT_WORDS64 (x, sx | (i0 & ~(UINT64_C(0x000fffffffffffff) >> j0)));
-    }
-  else
-    {
-      if (j0 == 0x400)
-	/* x is inf or NaN.  */
-	return x + x;
-    }
+  int64_t exp = (i0 >> 52) & 0x7ff;
+  int64_t mask = UINT64_C(-1) << max(52 - (exp - 1023), 0);
+
+  if (exp < 1023)
+    mask = UINT64_C(0x8000000000000000);
 
+  i0 &= mask;
+  INSERT_WORDS64(x, i0);
   return x;
 }
 weak_alias (__trunc, trunc)
diff --git a/sysdeps/ieee754/flt-32/s_truncf.c b/sysdeps/ieee754/flt-32/s_truncf.c
index 43d35c7..67fdcc8 100644
--- a/sysdeps/ieee754/flt-32/s_truncf.c
+++ b/sysdeps/ieee754/flt-32/s_truncf.c
@@ -1,7 +1,6 @@ 
 /* Truncate argument to nearest integral value not larger than the argument.
    Copyright (C) 1997-2016 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@cygnus.com>, 1997.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -21,31 +20,26 @@ 
 
 #include <math_private.h>
 
+static int32_t
+max (int32_t x, int32_t y)
+{
+  return x > y ? x : y;
+}
 
 float
 __truncf (float x)
 {
-  int32_t i0, j0;
-  int sx;
+  int32_t i0;
 
   GET_FLOAT_WORD (i0, x);
-  sx = i0 & 0x80000000;
-  j0 = ((i0 >> 23) & 0xff) - 0x7f;
-  if (j0 < 23)
-    {
-      if (j0 < 0)
-	/* The magnitude of the number is < 1 so the result is +-0.  */
-	SET_FLOAT_WORD (x, sx);
-      else
-	SET_FLOAT_WORD (x, sx | (i0 & ~(0x007fffff >> j0)));
-    }
-  else
-    {
-      if (j0 == 0x80)
-	/* x is inf or NaN.  */
-	return x + x;
-    }
+  int32_t exp = (i0 >> 23) & 0xff;
+  int32_t mask = ~0u << max(23 - (exp - 127), 0);
+
+  if (exp < 127)
+    mask = 0x80000000;
 
+  i0 &= mask;
+  SET_FLOAT_WORD (x, i0);
   return x;
 }
 weak_alias (__truncf, truncf)