Patchwork [v3,1/3] Cleanup __ieee754_sqrt(f/l)

login
register
mail settings
Submitter Wilco Dijkstra
Date March 13, 2018, 6:27 p.m.
Message ID <DB6PR0801MB2053DCCDB0C7A490A861AE8783D20@DB6PR0801MB2053.eurprd08.prod.outlook.com>
Download mbox | patch
Permalink /patch/26301/
State New
Headers show

Comments

Wilco Dijkstra - March 13, 2018, 6:27 p.m.
This patch series cleans up the many uses of  __ieee754_sqrt(f/l) in GLIBC.
The goal is to enable GCC to do the inlining, and if this fails call the
__ieee754_sqrt function.  This is done by internally declaring sqrt with asm
redirects.  The compat symbols and sqrt wrappers need to disable the redirect.
The redirect is also disabled if there are already redirects defined when
using -ffinite-math-only.  The benchtests still use the libc internal headers
(which is incorrect), so update sqrt-inputs to use the external math.h so there
is no redirection.

All math functions (but not math tests, non-library code and libnldbl) are
built with -fno-math-errno which means GCC will typically inline sqrt as a
single instruction.  This means targets are no longer forced to add a special
inline for sqrt.

Tested using buildmanyglibc (13 most relevant variants since it keeps running
out of diskspace...)

ChangeLog:
2018-03-13  Wilco Dijkstra  <wdijkstr@arm.com>

        * include/math.h (sqrt): Declare with asm redirect.
        (sqrtf): Likewise.
        (sqrtl): Likewise.
        * benchtests/sqrt-inputs: Use correct math/math.h include.
        * Makeconfig: Add -fno-math-errno for libc/libm, but build testsuite,
        nonlib and libnldbl with -fmath-errno.
        * math/w_sqrt_compat.c: Define NO_MATH_REDIRECT.
        * math/w_sqrt_template.c: Likewise.
        * math/w_sqrtf_compat.c: Likewise.
        * math/w_sqrtl_compat.c: Likewise.
        * sysdeps/generic/math-type-macros-float128.h: Remove math.h and complex.h.
        * sysdeps/i386/fpu/w_sqrt.c: Likewise.
        * sysdeps/i386/fpu/w_sqrt_compat.c: Likewise.

--
Joseph Myers - March 13, 2018, 10:45 p.m.
On Tue, 13 Mar 2018, Wilco Dijkstra wrote:

> Tested using buildmanyglibc (13 most relevant variants since it keeps running
> out of diskspace...)

What variants exactly?

What execution testing have you done?  It would be a good idea for that to 
include at least one configuration where sqrt is not inlined by the 
compiler, and also to include 32-bit x86 to make sure the special-case 
wrappers there (to avoid double rounding) continue to work as expected.

Why is the benchtests/sqrt-inputs change needed?  I think it's correct for 
benchmark inputs to refer to the include name as in user code, not to the 
path within the glibc source tree, and certainly this test should not be 
doing things differently from all the others.

Please repost this patch series with ChangeLog entries updated to reflect 
the current patch contents so they don't confuse the review.  The 
ChangeLog entry for this patch fails to mention the sqrtf128 declaration 
in include/math.h.  The one for patch 2 includes changes to 
math/w_sqrt*_compat.c and sysdeps/aaarch64 files that aren't actually 
included in the patch.  The one for patch 3 suggests that only some parts 
of sysdeps/s390/fpu/bits/mathinline.h are removed, when actually that 
patch (correctly) removes the whole file.
Wilco Dijkstra - March 14, 2018, 2:47 p.m.
Joseph Myers wrote:

> What variants exactly?

powerpc-linux-gnu-soft
powerpc-linux-gnu
powerpc64-linux-gnu
x86_64-linux-gnu
arm-linux-gnueabihf
aarch64-linux-gnu
sparc64-linux-gnu
sparcv9-linux-gnu
390-linux-gnu
m68k-linux-gnu
i486-linux-gnu
i586-linux-gnu
mips64-linux-gnu-n64

Maybe we can agree on a useful subset in the script - the number of variants for
some ISAs is ridiculous, do we really need 24 variants for MIPS? I suppose we
can also remove tile and a few others like sh given the recent discussions of
obsoleting those ports?

> What execution testing have you done?  It would be a good idea for that to 
> include at least one configuration where sqrt is not inlined by the 
> compiler, and also to include 32-bit x86 to make sure the special-case 
> wrappers there (to avoid double rounding) continue to work as expected.

I run on AArch64 and x64 when it is relevant. Finding a config that doesn't inline
sqrt may be difficult, since pretty much all ISAs have a sqrt instruction.

> Why is the benchtests/sqrt-inputs change needed?  I think it's correct for 
> benchmark inputs to refer to the include name as in user code, not to the 
> path within the glibc source tree, and certainly this test should not be 
> doing things differently from all the others.

We build the testsuite explicitly with -fmath-errno so there will be calls to sqrt.
Given the include path is incorrect when running the testsuite (it uses internal
paths rather than external ones), we get redirection to __ieee754_sqrt which is
not exported outside of GLIBC. Since the include path issue is a separate bug
I just ensure the sqrt benchmark gets the right include - when it is fixed we can
easily change it back.

Btw we'll also need to create some real inputs and decide what we want to test:
the inlined sqrt instruction or the sqrt call in GLIBC (which will usually be the sqrt
instruction), or the sqrt emulation...

I'll fix the ChangeLogs shortly.

Wilco
Joseph Myers - March 14, 2018, 4:50 p.m.
On Wed, 14 Mar 2018, Wilco Dijkstra wrote:

> Maybe we can agree on a useful subset in the script - the number of 
> variants for some ISAs is ridiculous, do we really need 24 variants for 
> MIPS? I suppose we can also remove tile and a few others like sh given 
> the recent discussions of obsoleting those ports?

The point is to cover all ABIs supported by glibc (of which there are 24 
for MIPS), and then useful non-ABI variants that significantly affect the 
code that gets built.

If an architecture port is removed from glibc, the same commit removing it 
from glibc should of course remove it from build-many-glibcs.py.  If tile 
does get removed from the Linux kernel I think at that point we should 
remove it from glibc.  I'm not aware of any evidence that sh is going to 
be removed from the Linux kernel or any proposals to remove it from glibc 
(although it does need a maintainer in glibc).

> > What execution testing have you done?  It would be a good idea for that to 
> > include at least one configuration where sqrt is not inlined by the 
> > compiler, and also to include 32-bit x86 to make sure the special-case 
> > wrappers there (to avoid double rounding) continue to work as expected.
> 
> I run on AArch64 and x64 when it is relevant. Finding a config that 
> doesn't inline sqrt may be difficult, since pretty much all ISAs have a 
> sqrt instruction.

Soft-float configurations (e.g. soft-float ARM) don't inline sqrt.

> We build the testsuite explicitly with -fmath-errno so there will be 
> calls to sqrt. Given the include path is incorrect when running the 
> testsuite (it uses internal paths rather than external ones), we get 

_ISOMAC should be defined for building the testsuite except for a few 
tests listed in tests-internal.  The include/ headers should, when _ISOMAC 
is defined, do nothing except include the corresponding public header 
(which is required so it gets found at all, since directories such as 
math/ contain the public headers but aren't in the include search path 
when building source files from other glibc subdirectories).

> redirection to __ieee754_sqrt which is not exported outside of GLIBC. 
> Since the include path issue is a separate bug I just ensure the sqrt 
> benchmark gets the right include - when it is fixed we can easily change 
> it back.

If the benchmarks are compiled without _ISOMAC being defined, that's the 
bug that should be fixed.  Changing them to use math/math.h is definitely 
wrong.

> Btw we'll also need to create some real inputs and decide what we want 
> to test: the inlined sqrt instruction or the sqrt call in GLIBC (which 
> will usually be the sqrt instruction), or the sqrt emulation...

The benchmarks are supposed to test the function call in glibc (and thus 
use -fno-builtin).
Wilco Dijkstra - March 14, 2018, 5:34 p.m.
Joseph Myers wrote:

> If the benchmarks are compiled without _ISOMAC being defined, that's the 
> bug that should be fixed.  Changing them to use math/math.h is definitely 
> wrong.

Unfortunately it's not possible to define _ISOMAC since many tests actually require
internal headers, for example the string benchmarks often need inhibit_loop_to_libcall
which is defined only if _ISOMAC is not defined.

Defining _ISOMAC just for the math tests seems to work though, that looks like a
reasonable workaround for now.

Wilco
Joseph Myers - March 14, 2018, 5:44 p.m.
On Wed, 14 Mar 2018, Wilco Dijkstra wrote:

> Unfortunately it's not possible to define _ISOMAC since many tests 
> actually require internal headers, for example the string benchmarks 
> often need inhibit_loop_to_libcall which is defined only if _ISOMAC is 
> not defined.

At least some string tests do (string/test-string.h):

/* We are compiled under _ISOMAC, so libc-symbols.h does not do this
   for us.  */
#include "config.h"
#ifdef HAVE_CC_INHIBIT_LOOP_TO_LIBCALL
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif

I'd expect string benchmarks to do similarly (possibly factoring this out 
to a separate header).  That's much better than building them without 
_ISOMAC and so getting *all* the internal header pieces.

Patch

diff --git a/Makeconfig b/Makeconfig
index 86a71e580213f6e5de6e619d9f3370f4365b4ae2..1afe86475c10ea208ed88b9b137cc7863510929f 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -831,6 +831,9 @@  endif
 # disable any optimization that assume default rounding mode.
 +math-flags = -frounding-math
 
+# Build libc/libm using -fno-math-errno, but run testsuite with -fmath-errno.
++extra-math-flags = $(if $(filter libnldbl nonlib testsuite,$(in-module)),-fmath-errno,-fno-math-errno)
+
 # We might want to compile with some stack-protection flag.
 ifneq ($(stack-protector),)
 +stack-protector=$(stack-protector)
@@ -966,6 +969,7 @@  endif
 
 override CFLAGS	= -std=gnu11 -fgnu89-inline $(config-extra-cflags) \
 		  $(filter-out %frame-pointer,$(+cflags)) $(+gccwarn-c) \
+		  $(+extra-math-flags) \
 		  $(sysdep-CFLAGS) $(CFLAGS-$(suffix $@)) $(CFLAGS-$(<F)) \
 		  $(CFLAGS-$(@F)) $(tls-model) \
 		  $(foreach lib,$(libof-$(basename $(@F))) \
diff --git a/benchtests/sqrt-inputs b/benchtests/sqrt-inputs
index 370bc05cc876fd5c55004f13ad2f017c7b01e475..13777a227d7aff5b70c0b68af05647892ca04ae4 100644
--- a/benchtests/sqrt-inputs
+++ b/benchtests/sqrt-inputs
@@ -1,6 +1,6 @@ 
 ## args: double
 ## ret: double
-## includes: math.h
+## includes: math/math.h
 0.25
 0.75
 2.0
diff --git a/include/math.h b/include/math.h
index 7ee291fc972088a1409949326f7024e8e3fe5415..c25033fcd5add2f4067fdbb51fb345056d6a1400 100644
--- a/include/math.h
+++ b/include/math.h
@@ -54,5 +54,20 @@  libm_hidden_proto (__expf128)
 libm_hidden_proto (__expm1f128)
 # endif
 
+# if !(defined __FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
+#  ifndef NO_MATH_REDIRECT
+/* Declare sqrt for use within GLIBC.  Sqrt will typically inline into a
+   single instruction.  Use an asm to avoid use of PLTs if it doesn't.  */
+float (sqrtf) (float) asm ("__ieee754_sqrtf");
+double (sqrt) (double) asm ("__ieee754_sqrt");
+#   ifndef __NO_LONG_DOUBLE_MATH
+long double (sqrtl) (long double) asm ("__ieee754_sqrtl");
+#   endif
+#   if __HAVE_DISTINCT_FLOAT128 > 0
+_Float128 (sqrtf128) (_Float128) asm ("__ieee754_sqrtf128");
+#   endif
+#  endif
+# endif
+
 #endif
 #endif
diff --git a/math/w_sqrt_compat.c b/math/w_sqrt_compat.c
index 7884d146331d7c43c79e1957e3ba4f2dab608ead..e76a8079aa0a2dd4dc09aba48b6a82aaddd1a584 100644
--- a/math/w_sqrt_compat.c
+++ b/math/w_sqrt_compat.c
@@ -16,6 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#define NO_MATH_REDIRECT
 #include <math.h>
 #include <math_private.h>
 #include <math-svid-compat.h>
diff --git a/math/w_sqrt_template.c b/math/w_sqrt_template.c
index 52a21c0bf1cd7bdab164356635ffa5f10973b633..9c6ac75956ca8abf9448d631fd286b5e2d1deb73 100644
--- a/math/w_sqrt_template.c
+++ b/math/w_sqrt_template.c
@@ -21,6 +21,7 @@ 
    for each floating-point type.  */
 #if __USE_WRAPPER_TEMPLATE
 
+# define NO_MATH_REDIRECT
 # include <errno.h>
 # include <fenv.h>
 # include <math.h>
diff --git a/math/w_sqrtf_compat.c b/math/w_sqrtf_compat.c
index 824ce0ed4658210b75e38953ee39d3dcd0bbb3f7..cad9b4ad77f1707af6c1df5ada82b07537d79dfa 100644
--- a/math/w_sqrtf_compat.c
+++ b/math/w_sqrtf_compat.c
@@ -16,6 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#define NO_MATH_REDIRECT
 #include <math.h>
 #include <math_private.h>
 #include <math-svid-compat.h>
diff --git a/math/w_sqrtl_compat.c b/math/w_sqrtl_compat.c
index 56627a615dce25aae893f9d84292922a6e452b3f..16dda403b77144667e3f76b633f6f9165c8a38d3 100644
--- a/math/w_sqrtl_compat.c
+++ b/math/w_sqrtl_compat.c
@@ -16,6 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
+#define NO_MATH_REDIRECT
 #include <math.h>
 #include <math_private.h>
 #include <math-svid-compat.h>
diff --git a/sysdeps/generic/math-type-macros-float128.h b/sysdeps/generic/math-type-macros-float128.h
index 605996e0ba73f3ad1e421f516396442b2c2b1e25..485c13bb88a1995c69d62e11ea819d1323dd938b 100644
--- a/sysdeps/generic/math-type-macros-float128.h
+++ b/sysdeps/generic/math-type-macros-float128.h
@@ -19,9 +19,6 @@ 
 #ifndef _MATH_TYPE_MACROS_FLOAT128
 #define _MATH_TYPE_MACROS_FLOAT128
 
-#include <math.h>
-#include <complex.h>
-
 #define M_LIT(c) __f128 (c)
 #define M_PFX FLT128
 #define M_SUF(c) c ## f128
diff --git a/sysdeps/i386/fpu/w_sqrt.c b/sysdeps/i386/fpu/w_sqrt.c
index d37a5d55bf439cebdff05f05dd66f910d6d48c18..8bef04e68a7be3d21d89e6eb41dcfa2561f3f6a5 100644
--- a/sysdeps/i386/fpu/w_sqrt.c
+++ b/sysdeps/i386/fpu/w_sqrt.c
@@ -1,5 +1,6 @@ 
 /* The inline __ieee754_sqrt is not correctly rounding; it's OK for
    most internal uses in glibc, but not for sqrt itself.  */
+#define NO_MATH_REDIRECT
 #define __ieee754_sqrt __avoid_ieee754_sqrt
 #include <math.h>
 #include <math_private.h>
diff --git a/sysdeps/i386/fpu/w_sqrt_compat.c b/sysdeps/i386/fpu/w_sqrt_compat.c
index ddd36d0964c2945579bc59b02f127af88384acca..dd485f4b88c7f5348abaab81a1308aeadc9594ec 100644
--- a/sysdeps/i386/fpu/w_sqrt_compat.c
+++ b/sysdeps/i386/fpu/w_sqrt_compat.c
@@ -1,5 +1,6 @@ 
 /* The inline __ieee754_sqrt is not correctly rounding; it's OK for
    most internal uses in glibc, but not for sqrt itself.  */
+#define NO_MATH_REDIRECT
 #define __ieee754_sqrt __avoid_ieee754_sqrt
 #include <math.h>
 #include <math_private.h>