From patchwork Wed Feb 27 03:24:49 2019
X-Patchwork-Submitter: GT
X-Patchwork-Id: 31607
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Date: Wed, 27 Feb 2019 03:24:49 +0000
To: "libc-alpha@sourceware.org"
From: GT
Reply-To: GT
Subject: [PATCH] PPC64: First in the series of patches implementing POWER8 vector math.

From e7d282c21dd987a30d5b6eb674f32a501594273f Mon Sep 17 00:00:00 2001
From: Bert Tenjy
Date: Wed, 27 Feb 2019 02:00:29 +0000
Subject: [PATCH] PPC64: First in the series of patches implementing POWER8
 vector math.

[BZ #24205]

Implements double-precision cosine using VSX vector capability. Algorithm
for cosine is from x86_64 [commit #2193311288] adapted to PPC64.

Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details are
at

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian. It
is tested using the framework created for libmvec on x86_64 which runs tests
on issuing 'make check'.
Tests of the new vector cosine function all pass.

Glibc built with this patch was installed using the procedure outlined at .
Compiling against the new library created a test executable which computes
cosines using the vector version of the function. The results are at most
2-ulps away from the scalar cosine. That is expected and indicated in the
comments describing the algorithm - as obtained from x86_64 commit
#2193311288.
---
 ChangeLog                                     | 17 ++++
 NEWS                                          | 13 +++
 sysdeps/powerpc/bits/math-vector.h            | 41 +++++++++
 sysdeps/powerpc/fpu/libm-test-ulps            |  3 +
 sysdeps/powerpc/powerpc64/fpu/Makefile        |  7 ++
 sysdeps/powerpc/powerpc64/fpu/Versions        |  5 ++
 .../powerpc/powerpc64/fpu/multiarch/Makefile  | 17 ++++
 .../multiarch/test-double-vlen2-wrappers.c    | 24 +++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_vsx.c  | 88 +++++++++++++++++++
 .../powerpc64/fpu/multiarch/vec_d_trig_data.h | 60 +++++++++++++
 .../powerpc/powerpc64/fpu/vec_finite_alias.c  | 41 +++++++++
 .../linux/powerpc/powerpc64/libmvec.abilist   |  1 +
 12 files changed, 317 insertions(+)
 create mode 100644 sysdeps/powerpc/bits/math-vector.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Makefile
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Versions
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

Notable differences from the previous patch and further commentary:

1. Renamed the main C source file from vec_d_cos2_power8.c to
   vec_d_cos2_vsx.c. VSX functionality is also available on POWER7 and
   POWER9, hence the change.

2. Removed vec_d_cos2_core.c and vec_d_cos2_vmx.c. The former did ifunc
   selection between the latter and the main C implementation.
   File vec_d_cos2_vmx.c was not a true Altivec implementation. It was only
   a wrapper to the scalar cosine function.

3. A new file, vec_finite_alias.c, is a workaround until the vector log
   function is implemented. It is needed so that libmvec_nonshared.a is
   built. Without it, compiling against the newly-built glibc will fail due
   to its being missing.

4. __PPC64__ is the macro tested in math-vector.h. Table 5.1 of the POWER
   ELFv2 ABI defines it and __powerpc64__ as synonyms. The other macros in
   that file are all-uppercase and the choice made preserves consistency.

5. GCC has no vectorizing support for PPC64. The OpenMP pragmas are ignored
   and only scalar cosine calls are generated, exactly as when libmvec does
   not exist.

6. The executables created to test against the new glibc installation
   required a workaround, as did x86_64 when I compiled the same test. The
   test is a modification of Example #1 at . The only change initially is a
   replacement of the call to cos () with one to the vector version
   _ZGVbN2v_cos (). Compilation fails because the function has no
   prototype. The solution for both PPC64 and x86_64 was to supply an
   'extern _ZGVbN2v_cos ()' forward declaration. Then compilation created
   an executable that used the new vector cosine.

7. This patch is half of the requirement for BZ #24205. The other half is
   implementing vector single-precision cosine. There are two outstanding
   issues which I ask to be pushed into the patch for cosf: gracefully
   terminating configure if the GCC used does not provide the VSX builtins
   required to build libmvec, and runtime avoidance of tests of the vector
   functions on machines without VSX hardware.

diff --git a/ChangeLog b/ChangeLog
index 8096175cc9..654774d690 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2019-02-27
+
+	[BZ #24205]
+	* sysdeps/powerpc/bits/math-vector.h: New file.
+	* sysdeps/powerpc/fpu/libm-test-ulps (cos_vlen2): Regenerated.
+	* sysdeps/powerpc/powerpc64/fpu/Makefile: New file.
+	* sysdeps/powerpc/powerpc64/fpu/Versions: Likewise.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libmvec-sysdep_routines)
+	(CFLAGS-vec_d_cos2_vsx.c, libmvec-tests, double-vlen2-funcs)
+	(double-vlen2-arch-ext-cflags): Added build of VSX vector cos function
+	and its tests.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c: New file.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c: Likewise.
+	* sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h: Likewise.
+	* sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c: Likewise.
+	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: Likewise.
+
 2019-02-26  Joseph Myers
 
 	* sysdeps/arm/sysdep.h (#if condition): Break lines before rather
diff --git a/NEWS b/NEWS
index 0a3b6c7a5a..fc08f11c51 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,19 @@ See the end for copying conditions.
 Please send GNU C library bug reports via
 using `glibc' in the "product" field.
+
+* Start of implementing vector math library libmvec on PPC64/POWER8.
+  The double-precision cosine now has a vector version.
+  GCC support for auto-vectorization of functions on PPC64 is not yet
+  available. Until that is done, the new vector math functions will be
+  inaccessible to applications.
+  Building libmvec for PPC64 VSX hardware is done at configuration with
+  --enable-mathvec. The default is to not build.
+  The library ABI specification is x86_64 Vector Function ABI.
+  More information on libmvec including a link to the ABI document is at:
+
+
+
 Version 2.30
 
 Major new features:
diff --git a/sysdeps/powerpc/bits/math-vector.h b/sysdeps/powerpc/bits/math-vector.h
new file mode 100644
index 0000000000..78d9db64bf
--- /dev/null
+++ b/sysdeps/powerpc/bits/math-vector.h
@@ -0,0 +1,41 @@
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _MATH_H
+# error "Never include directly;\
+ include instead."
+#endif
+
+/* Get default empty definitions for simd declarations. */
+#include
+
+#if defined __PPC64__ && defined __FAST_MATH__
+# if defined _OPENMP && _OPENMP >= 201307
+/* OpenMP case. */
+# define __DECL_SIMD_PPC64 _Pragma ("omp declare simd notinbranch")
+# elif __GNUC_PREREQ (6,0)
+/* W/o OpenMP use GCC 6.* __attribute__ ((__simd__)). */
+# define __DECL_SIMD_PPC64 __attribute__ ((__simd__ ("notinbranch")))
+# endif
+
+# ifdef __DECL_SIMD_PPC64
+# undef __DECL_SIMD_cos
+# define __DECL_SIMD_cos __DECL_SIMD_PPC64
+
+# endif
+#endif
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 1eec27c1dc..d392b135a7 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1311,6 +1311,9 @@ ifloat128: 2
 ildouble: 5
 ldouble: 5
 
+Function: "cos_vlen2":
+double: 2
+
 Function: "cosh":
 double: 1
 float: 1
diff --git a/sysdeps/powerpc/powerpc64/fpu/Makefile b/sysdeps/powerpc/powerpc64/fpu/Makefile
new file mode 100644
index 0000000000..21dc67ff73
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Makefile
@@ -0,0 +1,7 @@
+ifeq ($(subdir),mathvec)
+libmvec-support += vec_finite_alias
+
+CFLAGS-vec_finite_alias.c += -mvsx
+
+libmvec-static-only-routines = vec_finite_alias
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
new file mode 100644
index 0000000000..9a3e1211cc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVbN2v_cos;
+  }
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 39b557604c..44c1c04c13 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -42,3 +42,20 @@ CFLAGS-e_hypotf-power7.c = -mcpu=power7
 CFLAGS-s_modf-ppc64.c += -fsignaling-nans
 CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
+
+ifeq ($(subdir),mathvec)
+libmvec-sysdep_routines += vec_d_cos2_vsx
+CFLAGS-vec_d_cos2_vsx.c += -mvsx
+endif
+
+# Variables for libmvec tests.
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+libmvec-tests += double-vlen2
+
+double-vlen2-funcs = cos
+
+double-vlen2-arch-ext-cflags = -mvsx -DREQUIRE_VSX
+
+endif
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
new file mode 100644
index 0000000000..17e2cc0724
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
@@ -0,0 +1,24 @@
+/* Wrapper part of tests for VSX ISA versions of vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include "test-double-vlen2.h"
+#include
+
+#define VEC_TYPE vector double
+
+VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVbN2v_cos)
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
new file mode 100644
index 0000000000..ed8fe330c1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
@@ -0,0 +1,88 @@
+/* Function cos vectorized with VSX.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include "vec_d_trig_data.h"
+
+vector double
+_ZGVbN2v_cos (vector double x)
+{
+
+  /*
+     ARGUMENT RANGE REDUCTION:
+     Add Pi/2 to argument: X' = X+Pi/2. */
+  vector double x_prime = (vector double) d_half_pi + x;
+
+  /* Get absolute argument value: X' = |X'|. */
+  vector double abs_x_prime = vec_abs (x_prime);
+
+  /* Y = X'*InvPi + RS : right shifter add. */
+  vector double y = (x_prime * d_inv_pi) + d_rshifter;
+
+  /* Check for large arguments path. */
+  vector bool long long large_in = vec_cmpgt (abs_x_prime, d_rangeval);
+
+  /* N = Y - RS : right shifter sub. */
+  vector double n = y - d_rshifter;
+
+  /* SignRes = Y<<63 : shift LSB to MSB place for result sign. */
+  vector double sign_res = (vector double) vec_sl ((vector long long) y,
+                                                   (vector unsigned long long)
+                                                   vec_splats (63));
+
+  /* N = N - 0.5. */
+  n = n - d_one_half;
+
+  /* R = X - N*Pi1. */
+  vector double r = x - (n * d_pi1_fma);
+
+  /* R = R - N*Pi2. */
+  r = r - (n * d_pi2_fma);
+
+  /* R = R - N*Pi3. */
+  r = r - (n * d_pi3_fma);
+
+  /* R2 = R*R. */
+  vector double r2 = r * r;
+
+  /* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))). */
+  vector double poly = r2 * d_coeff7 + d_coeff6;
+  poly = poly * r2 + d_coeff5;
+  poly = poly * r2 + d_coeff4;
+  poly = poly * r2 + d_coeff3;
+
+  /* Poly = R+R*(R2*(C1+R2*(C2+R2*Poly))). */
+  poly = poly * r2 + d_coeff2;
+  poly = poly * r2 + d_coeff1;
+  poly = poly * r2 * r + r;
+
+  /*
+     RECONSTRUCTION:
+     Final sign setting: Res = Poly^SignRes. */
+  vector double out
+    = (vector double) ((vector long long) poly ^ (vector long long) sign_res);
+
+  if (large_in[0] != 0)
+    out[0] = cos (x[0]);
+
+  if (large_in[1] != 0)
+    out[1] = cos (x[1]);
+
+  return out;
+
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
new file mode 100644
index 0000000000..4b2678928f
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
@@ -0,0 +1,60 @@
+/* Constants used in polynomial approximations for vectorized sin, cos,
+   and sincos functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef D_TRIG_DATA_H
+#define D_TRIG_DATA_H
+
+#include
+
+/* PI/2. */
+const vector double d_half_pi = {0x1.921fb54442d18p+0, 0x1.921fb54442d18p+0};
+
+/* Inverse PI. */
+const vector double d_inv_pi = {0x1.45f306dc9c883p-2, 0x1.45f306dc9c883p-2};
+
+/* Right-shifter constant. */
+const vector double d_rshifter = {0x1.8p+52, 0x1.8p+52};
+
+/* Working range threshold. */
+const vector double d_rangeval = {0x1p+23, 0x1p+23};
+
+/* One-half. */
+const vector double d_one_half = {0x1p-1, 0x1p-1};
+
+/* Range reduction PI-based constants if FMA available:
+   PI high part (FMA available). */
+const vector double d_pi1_fma = {0x1.921fb54442d18p+1, 0x1.921fb54442d18p+1};
+
+/* PI mid part (FMA available). */
+const vector double d_pi2_fma = {0x1.1a62633145c06p-53, 0x1.1a62633145c06p-53};
+
+/* PI low part (FMA available). */
+const vector double d_pi3_fma
+= {0x1.c1cd129024e09p-106,0x1.c1cd129024e09p-106};
+
+/* Polynomial coefficients (relative error 2^(-52.115)). */
+const vector double d_coeff7 = {-0x1.9f0d60811aac8p-41,-0x1.9f0d60811aac8p-41};
+const vector double d_coeff6 = {0x1.60e6857a2f22p-33,0x1.60e6857a2f22p-33};
+const vector double d_coeff5 = {-0x1.ae63546002231p-26,-0x1.ae63546002231p-26};
+const vector double d_coeff4 = {0x1.71de38030feap-19,0x1.71de38030feap-19};
+const vector double d_coeff3 = {-0x1.a01a019a5b86dp-13,-0x1.a01a019a5b86dp-13};
+const vector double d_coeff2 = {0x1.111111110a4a8p-7,0x1.111111110a4a8p-7};
+const vector double d_coeff1 = {-0x1.55555555554a7p-3,-0x1.55555555554a7p-3};
+
+#endif /* D_TRIG_DATA_H. */
diff --git a/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c b/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
new file mode 100644
index 0000000000..f1a062aadf
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
@@ -0,0 +1,41 @@
+/* A temporary workaround until vector log is implemented.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#include
+#include
+
+/* We need this wrapper to the scalar log function so that
+   libmvec_nonshared.a is generated. Otherwise compiling
+   against the new glibc during testing results in an error
+   due to the missing libmvec_nonshared.a. */
+
+vector double
+_ZGVbN2v___log_finite (vector double x)
+{
+
+  /*
+     Calls the scalar log function twice, once for each
+     of the pair of doubles in the input argument. */
+  vector double out;
+
+  out[0] = log (x[0]);
+  out[1] = log (x[1]);
+
+  return out;
+
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
new file mode 100644
index 0000000000..656ce0541f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
@@ -0,0 +1 @@
+GLIBC_2.30 _ZGVbN2v_cos F
-- 
2.20.1
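
For readers following the algorithm in vec_d_cos2_vsx.c, the range reduction
and polynomial steps can be sketched for a single lane in plain C. This is
only an illustration under stated assumptions, not part of the patch: the
name cos_sketch and its return convention are hypothetical, the constants are
copied from vec_d_trig_data.h, and it assumes IEEE-754 binary64 arithmetic
with round-to-nearest, no -ffast-math and no excess precision. It uses plain
multiplies where the patch expects fused multiply-add, so accuracy may
degrade as |x| grows.

```c
/* Scalar sketch of one lane of the patch's _ZGVbN2v_cos.
   cos_sketch is a hypothetical name, not a glibc symbol.  */
#include <stdint.h>
#include <string.h>

/* Constants copied from vec_d_trig_data.h in the patch.  */
static const double half_pi = 0x1.921fb54442d18p+0;
static const double inv_pi = 0x1.45f306dc9c883p-2;
static const double rshifter = 0x1.8p+52;
static const double rangeval = 0x1p+23;
static const double pi1 = 0x1.921fb54442d18p+1;
static const double pi2 = 0x1.1a62633145c06p-53;
static const double pi3 = 0x1.c1cd129024e09p-106;
static const double coeff[7] = {
  -0x1.55555555554a7p-3,  /* C1 */
  0x1.111111110a4a8p-7,   /* C2 */
  -0x1.a01a019a5b86dp-13, /* C3 */
  0x1.71de38030feap-19,   /* C4 */
  -0x1.ae63546002231p-26, /* C5 */
  0x1.60e6857a2f22p-33,   /* C6 */
  -0x1.9f0d60811aac8p-41  /* C7 */
};

/* Returns 1 and stores the result in *out when x is inside the working
   range.  Returns 0 when the caller should fall back to scalar cos,
   which is what the patch does per lane for large arguments.  */
int
cos_sketch (double x, double *out)
{
  double x_prime = x + half_pi;
  double abs_x_prime = x_prime < 0 ? -x_prime : x_prime;
  if (abs_x_prime > rangeval)
    return 0;

  /* Adding 1.5 * 2^52 rounds x'/pi to an integer N that ends up in the
     low mantissa bits of y.  */
  double y = x_prime * inv_pi + rshifter;
  uint64_t y_bits;
  memcpy (&y_bits, &y, sizeof y_bits);
  uint64_t sign = y_bits << 63; /* Parity of N moved to the sign bit.  */

  double n = (y - rshifter) - 0.5;

  /* r = x - (N - 0.5) * pi, with pi split into three parts.  The patch
     relies on fused multiply-add here; plain multiplies are an
     assumption of this sketch.  */
  double r = x - n * pi1;
  r = r - n * pi2;
  r = r - n * pi3;

  /* sin (r) ~= r + r * r2 * (C1 + r2 * (C2 + ... + r2 * C7)).  */
  double r2 = r * r;
  double poly = coeff[6];
  for (int i = 5; i >= 0; i--)
    poly = poly * r2 + coeff[i];
  poly = poly * r2 * r + r;

  /* cos (x) = (-1)^N * sin (r): flip the sign bit when N is odd.  */
  uint64_t bits;
  memcpy (&bits, &poly, sizeof bits);
  bits ^= sign;
  memcpy (out, &bits, sizeof bits);
  return 1;
}
```

A caller would use the stored result when cos_sketch returns 1 and call the
scalar cos from libm otherwise, mirroring the large_in checks at the end of
_ZGVbN2v_cos.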