[v3,1/3] Cleanup __ieee754_sqrt(f/l)
Commit Message
This patch series cleans up the many uses of __ieee754_sqrt(f/l) in GLIBC.
The goal is to enable GCC to do the inlining, and if this fails call the
__ieee754_sqrt function. This is done by internally declaring sqrt with asm
redirects. The compat symbols and sqrt wrappers need to disable the redirect.
The redirect is also disabled if there are already redirects defined when
using -ffinite-math-only. The benchtests still use the libc internal headers
(which is incorrect), so update sqrt-inputs to use the external math.h so there
is no redirection.
All math functions (but not math tests, non-library code and libnldbl) are
built with -fno-math-errno which means GCC will typically inline sqrt as a
single instruction. This means targets are no longer forced to add a special
inline for sqrt.
Tested using buildmanyglibc (13 most relevant variants since it keeps running
out of diskspace...)
ChangeLog:
2018-03-13 Wilco Dijkstra <wdijkstr@arm.com>
* include/math.h (sqrt): Declare with asm redirect.
(sqrtf): Likewise.
(sqrtl): Likewise.
* benchtests/sqrt-inputs: Use correct math/math.h include.
* Makeconfig: Add -fno-math-errno for libc/libm, but build testsuite,
nonlib and libnldbl with -fmath-errno.
* math/w_sqrt_compat.c: Define NO_MATH_REDIRECT.
* math/w_sqrt_template.c: Likewise.
* math/w_sqrtf_compat.c: Likewise.
* math/w_sqrtl_compat.c: Likewise.
* sysdeps/generic/math-type-macros-float128.h: Remove math.h and complex.h.
* sysdeps/i386/fpu/w_sqrt.c: Likewise.
* sysdeps/i386/fpu/w_sqrt_compat.c: Likewise.
--
Comments
On Tue, 13 Mar 2018, Wilco Dijkstra wrote:
> Tested using buildmanyglibc (13 most relevant variants since it keeps running
> out of diskspace...)
What variants exactly?
What execution testing have you done? It would be a good idea for that to
include at least one configuration where sqrt is not inlined by the
compiler, and also to include 32-bit x86 to make sure the special-case
wrappers there (to avoid double rounding) continue to work as expected.
Why is the benchtests/sqrt-inputs change needed? I think it's correct for
benchmark inputs to refer to the include name as in user code, not to the
path within the glibc source tree, and certainly this test should not be
doing things differently from all the others.
Please repost this patch series with ChangeLog entries updated to reflect
the current patch contents so they don't confuse the review. The
ChangeLog entry for this patch fails to mention the sqrtf128 declaration
in include/math.h. The one for patch 2 includes changes to
math/w_sqrt*_compat.c and sysdeps/aaarch64 files that aren't actually
included in the patch. The one for patch 3 suggests that only some parts
of sysdeps/s390/fpu/bits/mathinline.h are removed, when actually that
patch (correctly) removes the whole file.
Joseph Myers wrote:
> What variants exactly?
powerpc-linux-gnu-soft
powerpc-linux-gnu
powerpc64-linux-gnu
x86_64-linux-gnu
arm-linux-gnueabihf
aarch64-linux-gnu
sparc64-linux-gnu
sparcv9-linux-gnu
390-linux-gnu
m68k-linux-gnu
i486-linux-gnu
i586-linux-gnu
mips64-linux-gnu-n64
Maybe we can agree on a useful subset in the script - the number of variants for
some ISAs is ridiculous, do we really need 24 variants for MIPS? I suppose we
can also remove tile and a few others like sh given the recent discussions of
obsoleting those ports?
> What execution testing have you done? It would be a good idea for that to
> include at least one configuration where sqrt is not inlined by the
> compiler, and also to include 32-bit x86 to make sure the special-case
> wrappers there (to avoid double rounding) continue to work as expected.
I run on AArch64 and x64 when it is relevant. Finding a config that doesn't inline
sqrt may be difficult, since pretty much all ISAs have a sqrt instruction.
> Why is the benchtests/sqrt-inputs change needed? I think it's correct for
> benchmark inputs to refer to the include name as in user code, not to the
> path within the glibc source tree, and certainly this test should not be
> doing things differently from all the others.
We build the testsuite explicitly with -fmath-errno so there will be calls to sqrt.
Given the include path is incorrect when running the testsuite (it uses internal
paths rather than external ones), we get redirection to __ieee754_sqrt which is
not exported outside of GLIBC. Since the include path issue is a separate bug
I just ensure the sqrt benchmark gets the right include - when it is fixed we can
easily change it back.
Btw we'll also need to create some real inputs and decide what we want to test:
the inlined sqrt instruction or the sqrt call in GLIBC (which will usually be the sqrt
instruction), or the sqrt emulation...
I'll fix the ChangeLogs shortly.
Wilco
On Wed, 14 Mar 2018, Wilco Dijkstra wrote:
> Maybe we can agree on a useful subset in the script - the number of
> variants for some ISAs is ridiculous, do we really need 24 variants for
> MIPS? I suppose we can also remove tile and a few others like sh given
> the recent discussions of obsoleting those ports?
The point is to cover all ABIs supported by glibc (of which there are 24
for MIPS), and then useful non-ABI variants that significantly affect the
code that gets built.
If an architecture port is removed from glibc, the same commit removing it
from glibc should of course remove it from build-many-glibcs.py. If tile
does get removed from the Linux kernel I think at that point we should
remove it from glibc. I'm not aware of any evidence that sh is going to
be removed from the Linux kernel or any proposals to remove it from glibc
(although it does need a maintainer in glibc).
> > What execution testing have you done? It would be a good idea for that to
> > include at least one configuration where sqrt is not inlined by the
> > compiler, and also to include 32-bit x86 to make sure the special-case
> > wrappers there (to avoid double rounding) continue to work as expected.
>
> I run on AArch64 and x64 when it is relevant. Finding a config that
> doesn't inline sqrt may be difficult, since pretty much all ISAs have a
> sqrt instruction.
Soft-float configurations (e.g. soft-float ARM) don't inline sqrt.
> We build the testsuite explicitly with -fmath-errno so there will be
> calls to sqrt. Given the include path is incorrect when running the
> testsuite (it uses internal paths rather than external ones), we get
_ISOMAC should be defined for building the testsuite except for a few
tests listed in tests-internal. The include/ headers should, when _ISOMAC
is defined, do nothing except include the corresponding public header
(which is required so it gets found at all, since directories such as
math/ contain the public headers but aren't in the include search path
when building source files from other glibc subdirectories).
> redirection to __ieee754_sqrt which is not exported outside of GLIBC.
> Since the include path issue is a separate bug I just ensure the sqrt
> benchmark gets the right include - when it is fixed we can easily change
> it back.
If the benchmarks are compiled without _ISOMAC being defined, that's the
bug that should be fixed. Changing them to use math/math.h is definitely
wrong.
> Btw we'll also need to create some real inputs and decide what we want
> to test: the inlined sqrt instruction or the sqrt call in GLIBC (which
> will usually be the sqrt instruction), or the sqrt emulation...
The benchmarks are supposed to test the function call in glibc (and thus
use -fno-builtin).
Joseph Myers wrote:
> If the benchmarks are compiled without _ISOMAC being defined, that's the
> bug that should be fixed. Changing them to use math/math.h is definitely
> wrong.
Unfortunately it's not possible to define _ISOMAC since many tests actually require
internal headers, for example the string benchmarks often need inhibit_loop_to_libcall
which is defined only if _ISOMAC is not defined.
Defining _ISOMAC just for the math tests seems to work though, that looks like a
reasonable workaround for now.
Wilco
On Wed, 14 Mar 2018, Wilco Dijkstra wrote:
> Unfortunately it's not possible to define _ISOMAC since many tests
> actually require internal headers, for example the string benchmarks
> often need inhibit_loop_to_libcall which is defined only if _ISOMAC is
> not defined.
At least some string tests do (string/test-string.h):
/* We are compiled under _ISOMAC, so libc-symbols.h does not do this
for us. */
#include "config.h"
#ifdef HAVE_CC_INHIBIT_LOOP_TO_LIBCALL
# define inhibit_loop_to_libcall \
__attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif
I'd expect string benchmarks to do similarly (possibly factoring this out
to a separate header). That's much better than building them without
_ISOMAC and so getting *all* the internal header pieces.
@@ -831,6 +831,9 @@ endif
# disable any optimization that assume default rounding mode.
+math-flags = -frounding-math
+# Build libc/libm using -fno-math-errno, but run testsuite with -fmath-errno.
++extra-math-flags = $(if $(filter libnldbl nonlib testsuite,$(in-module)),-fmath-errno,-fno-math-errno)
+
# We might want to compile with some stack-protection flag.
ifneq ($(stack-protector),)
+stack-protector=$(stack-protector)
@@ -966,6 +969,7 @@ endif
override CFLAGS = -std=gnu11 -fgnu89-inline $(config-extra-cflags) \
$(filter-out %frame-pointer,$(+cflags)) $(+gccwarn-c) \
+ $(+extra-math-flags) \
$(sysdep-CFLAGS) $(CFLAGS-$(suffix $@)) $(CFLAGS-$(<F)) \
$(CFLAGS-$(@F)) $(tls-model) \
$(foreach lib,$(libof-$(basename $(@F))) \
@@ -1,6 +1,6 @@
## args: double
## ret: double
-## includes: math.h
+## includes: math/math.h
0.25
0.75
2.0
@@ -54,5 +54,20 @@ libm_hidden_proto (__expf128)
libm_hidden_proto (__expm1f128)
# endif
+# if !(defined __FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0)
+# ifndef NO_MATH_REDIRECT
+/* Declare sqrt for use within GLIBC. Sqrt will typically inline into a
+ single instruction. Use an asm to avoid use of PLTs if it doesn't. */
+float (sqrtf) (float) asm ("__ieee754_sqrtf");
+double (sqrt) (double) asm ("__ieee754_sqrt");
+# ifndef __NO_LONG_DOUBLE_MATH
+long double (sqrtl) (long double) asm ("__ieee754_sqrtl");
+# endif
+# if __HAVE_DISTINCT_FLOAT128 > 0
+_Float128 (sqrtf128) (_Float128) asm ("__ieee754_sqrtf128");
+# endif
+# endif
+# endif
+
#endif
#endif
@@ -16,6 +16,7 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#define NO_MATH_REDIRECT
#include <math.h>
#include <math_private.h>
#include <math-svid-compat.h>
@@ -21,6 +21,7 @@
for each floating-point type. */
#if __USE_WRAPPER_TEMPLATE
+# define NO_MATH_REDIRECT
# include <errno.h>
# include <fenv.h>
# include <math.h>
@@ -16,6 +16,7 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#define NO_MATH_REDIRECT
#include <math.h>
#include <math_private.h>
#include <math-svid-compat.h>
@@ -16,6 +16,7 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#define NO_MATH_REDIRECT
#include <math.h>
#include <math_private.h>
#include <math-svid-compat.h>
@@ -19,9 +19,6 @@
#ifndef _MATH_TYPE_MACROS_FLOAT128
#define _MATH_TYPE_MACROS_FLOAT128
-#include <math.h>
-#include <complex.h>
-
#define M_LIT(c) __f128 (c)
#define M_PFX FLT128
#define M_SUF(c) c ## f128
@@ -1,5 +1,6 @@
/* The inline __ieee754_sqrt is not correctly rounding; it's OK for
most internal uses in glibc, but not for sqrt itself. */
+#define NO_MATH_REDIRECT
#define __ieee754_sqrt __avoid_ieee754_sqrt
#include <math.h>
#include <math_private.h>
@@ -1,5 +1,6 @@
/* The inline __ieee754_sqrt is not correctly rounding; it's OK for
most internal uses in glibc, but not for sqrt itself. */
+#define NO_MATH_REDIRECT
#define __ieee754_sqrt __avoid_ieee754_sqrt
#include <math.h>
#include <math_private.h>