From patchwork Wed Mar 6 22:22:26 2019
X-Patchwork-Submitter: GT
X-Patchwork-Id: 31754
Date: Wed, 06 Mar 2019 22:22:26 +0000
To: "libc-alpha@sourceware.org", "tnggil@protonmail.com"
From: GT
Reply-To: GT
Subject: [RESEND] [PATCH 1/12] PPC64: First in the series of patches
 implementing POWER8 vector math.

Differences from prior submission of same patch:

1. Removed the powerpc specific bits/math-vector.h file. With that gone I
believe the patches can go into glibc master, as opposed to a branch.
There is now no API for compilers to possibly produce erroneously
vectorized calls to the SIMD math functions. That makes these functions
totally inert until GCC comes along and needs to test its new PPC64
autovectorization.
Then, a PPC specific bits/math-vector.h will be added back into glibc.
I verified that libmvec does build and pass make check without the
powerpc bits/math-vector.h. I also verified that GCC produces the same
instructions in an executable, with and without the pragma omp simd
directives. That compilation used a new installation of glibc including
libmvec. So, the new glibc including PPC64 libmvec should not break any
existing code.

2. Fixed names of constants to include leading double-underscores as
required by their namespace.

3. Removed configure check for VSX builtins in compiler. As glibc requires
GCC >= 6.2, we are guaranteed the presence of these builtins.

4. Changed patch subject line to now include m/n sequence indicator.

From 50692e8bcebb66a7fd14381f0a912019d3fb6fef Mon Sep 17 00:00:00 2001
From: Bert Tenjy
Date: Wed, 6 Mar 2019 21:52:48 +0000
Subject: [PATCH 1/12] PPC64: First in the series of patches implementing
 POWER8 vector math.

[BZ #24205]

Implements double-precision cosine using VSX vector capability. Algorithm
for cosine is from x86_64 [commit #2193311288] adapted to PPC64.

Name-mangling exactly duplicates SSE ISA of the x86_64 ABI. The details
are at

The patch has been tested on PPC64/POWER8 Little Endian and Big Endian.
It is tested using the framework created for libmvec on x86_64, which runs
the tests on issuing 'make check'. Tests of the new vector cosine function
all pass.

The vector math library is built by default. Configure does not check for
VSX builtins, because glibc requires GCC >= 6.2, which provides them. A
runtime check prevents vector tests running on systems lacking VSX
hardware.

File vec_finite_alias.c is empty. Its existence ensures that
libmvec_nonshared.a is created. Otherwise, linking during testing of a
newly-installed glibc fails.

Glibc built with this patch was installed using the procedure outlined
at . Compiling against the new library created a test executable which
computes cosines using the vector version of the function.
The results are at most 2 ulps away from the scalar cosine. That is
expected and indicated in the comments describing the algorithm, as
obtained from x86_64 commit #2193311288.
---
 ChangeLog                                     | 21 +++++
 NEWS                                          | 12 +++
 sysdeps/powerpc/fpu/libm-test-ulps            |  3 +
 sysdeps/powerpc/powerpc64/fpu/Makefile        |  7 ++
 sysdeps/powerpc/powerpc64/fpu/Versions        |  5 ++
 sysdeps/powerpc/powerpc64/fpu/configure       |  6 ++
 sysdeps/powerpc/powerpc64/fpu/configure.ac    |  6 ++
 .../powerpc/powerpc64/fpu/math-tests-arch.h   | 34 +++++++
 .../powerpc/powerpc64/fpu/multiarch/Makefile  | 17 ++++
 .../multiarch/test-double-vlen2-wrappers.c    | 24 +++++
 .../powerpc64/fpu/multiarch/vec_d_cos2_vsx.c  | 88 +++++++++++++++++++
 .../powerpc64/fpu/multiarch/vec_d_trig_data.h | 60 +++++++++++++
 .../powerpc/powerpc64/fpu/vec_finite_alias.c  | 17 ++++
 .../linux/powerpc/powerpc64/libmvec.abilist   |  1 +
 14 files changed, 301 insertions(+)
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Makefile
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/Versions
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/configure
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/configure.ac
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/math-tests-arch.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
 create mode 100644 sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
 create mode 100644 sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist

diff --git a/ChangeLog b/ChangeLog
index 8096175cc9..51956e6b78 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,24 @@
+2019-03-06
+
+        [BZ #24205]
+        * NEWS: Updated to note new SIMD vector cosine function.
+        * sysdeps/powerpc/fpu/libm-test-ulps (cos_vlen2): Regenerated.
+        * sysdeps/powerpc/powerpc64/fpu/Makefile: New file.
+        * sysdeps/powerpc/powerpc64/fpu/Versions: Likewise.
+        * sysdeps/powerpc/powerpc64/fpu/configure: Generated.
+        * sysdeps/powerpc/powerpc64/fpu/configure.ac: New file.
+        * sysdeps/powerpc/powerpc64/fpu/math-tests-arch.h: New file.
+        * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libmvec-sysdep_routines)
+        (CFLAGS-vec_d_cos2_vsx.c, libmvec-tests, double-vlen2-funcs)
+        (double-vlen2-arch-ext-cflags): Added build of VSX vector cos function
+        and its tests.
+        * sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c: New file.
+        * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c: Likewise.
+        * sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h: Likewise.
+        * sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c: Likewise.
+        * sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist: Likewise.
+
+
 2019-02-26  Joseph Myers
 
         * sysdeps/arm/sysdep.h (#if condition): Break lines before rather
diff --git a/NEWS b/NEWS
index 0a3b6c7a5a..22735e1d1d 100644
--- a/NEWS
+++ b/NEWS
@@ -4,11 +4,23 @@ See the end for copying conditions.
 Please send GNU C library bug reports via
 using `glibc' in the "product" field.
 
+
 Version 2.30
 
 Major new features:
 
+* Start of implementing vector math library libmvec on PPC64/POWER8.
+  The double-precision cosine now has a vector version.
+  GCC support for auto-vectorization of functions on PPC64 is not yet
+  available. Until that is done, the new vector math functions are
+  inaccessible to applications.
+  Library libmvec is built by default for PPC64. Disable its creation by
+  passing flag --disable-mathvec to configure.
+  The library ABI specification is x86_64 Vector Function ABI.
+  More information on libmvec including a link to the ABI document is at:
+
+
 * The dynamic linker accepts the --preload argument to preload shared
   objects, in addition to the LD_PRELOAD environment variable.
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index 1eec27c1dc..d392b135a7 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1311,6 +1311,9 @@ ifloat128: 2
 ildouble: 5
 ldouble: 5
 
+Function: "cos_vlen2":
+double: 2
+
 Function: "cosh":
 double: 1
 float: 1
diff --git a/sysdeps/powerpc/powerpc64/fpu/Makefile b/sysdeps/powerpc/powerpc64/fpu/Makefile
new file mode 100644
index 0000000000..21dc67ff73
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Makefile
@@ -0,0 +1,7 @@
+ifeq ($(subdir),mathvec)
+libmvec-support += vec_finite_alias
+
+CFLAGS-vec_finite_alias.c += -mvsx
+
+libmvec-static-only-routines = vec_finite_alias
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/Versions b/sysdeps/powerpc/powerpc64/fpu/Versions
new file mode 100644
index 0000000000..9a3e1211cc
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVbN2v_cos;
+  }
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/configure b/sysdeps/powerpc/powerpc64/fpu/configure
new file mode 100644
index 0000000000..1b2bb78165
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/configure
@@ -0,0 +1,6 @@
+# This file is generated from configure.ac by Autoconf.  DO NOT EDIT!
+ # Local configure fragment for sysdeps/powerpc/powerpc64/fpu/.
+
+if test "$enable_mathvec" != "no" ; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/powerpc/powerpc64/fpu/configure.ac b/sysdeps/powerpc/powerpc64/fpu/configure.ac
new file mode 100644
index 0000000000..d2ef5c8739
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/configure.ac
@@ -0,0 +1,6 @@
+GLIBC_PROVIDES dnl See aclocal.m4 in the top level source directory.
+# Local configure fragment for sysdeps/powerpc/powerpc64/fpu/.
+
+if test "$enable_mathvec" != "no" ; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/powerpc/powerpc64/fpu/math-tests-arch.h b/sysdeps/powerpc/powerpc64/fpu/math-tests-arch.h
new file mode 100644
index 0000000000..74fac298d2
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/math-tests-arch.h
@@ -0,0 +1,34 @@
+/* Runtime architecture check for math tests. PPC64 version.
+   Copyright (C) 2014-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#if defined REQUIRE_VSX
+
+# include "init-arch.h"
+
+# define INIT_ARCH_EXT
+# define CHECK_ARCH_EXT                                        \
+  do                                                           \
+    {                                                          \
+      INIT_ARCH()                                              \
+      if (!(hwcap & PPC_FEATURE_HAS_VSX)) return;              \
+    }                                                          \
+  while (0)
+
+#else
+# include
+#endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 39b557604c..44c1c04c13 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -42,3 +42,20 @@ CFLAGS-e_hypotf-power7.c = -mcpu=power7
 CFLAGS-s_modf-ppc64.c += -fsignaling-nans
 CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
+
+ifeq ($(subdir),mathvec)
+libmvec-sysdep_routines += vec_d_cos2_vsx
+CFLAGS-vec_d_cos2_vsx.c += -mvsx
+endif
+
+# Variables for libmvec tests.
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+libmvec-tests += double-vlen2
+
+double-vlen2-funcs = cos
+
+double-vlen2-arch-ext-cflags = -mvsx -DREQUIRE_VSX
+
+endif
+endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
new file mode 100644
index 0000000000..17e2cc0724
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/test-double-vlen2-wrappers.c
@@ -0,0 +1,24 @@
+/* Wrapper part of tests for VSX ISA versions of vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "test-double-vlen2.h"
+#include
+
+#define VEC_TYPE vector double
+
+VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVbN2v_cos)
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
new file mode 100644
index 0000000000..e1467a78b1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_cos2_vsx.c
@@ -0,0 +1,88 @@
+/* Function cos vectorized with VSX.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include "vec_d_trig_data.h"
+
+vector double
+_ZGVbN2v_cos (vector double x)
+{
+
+  /*
+   ARGUMENT RANGE REDUCTION:
+   Add Pi/2 to argument: X' = X+Pi/2.  */
+  vector double x_prime = (vector double) __d_half_pi + x;
+
+  /* Get absolute argument value: X' = |X'|.  */
+  vector double abs_x_prime = vec_abs (x_prime);
+
+  /* Y = X'*InvPi + RS : right shifter add.  */
+  vector double y = (x_prime * __d_inv_pi) + __d_rshifter;
+
+  /* Check for large arguments path.  */
+  vector bool long long large_in = vec_cmpgt (abs_x_prime, __d_rangeval);
+
+  /* N = Y - RS : right shifter sub.  */
+  vector double n = y - __d_rshifter;
+
+  /* SignRes = Y<<63 : shift LSB to MSB place for result sign.  */
+  vector double sign_res = (vector double) vec_sl ((vector long long) y,
+                                                   (vector unsigned long long)
+                                                   vec_splats (63));
+
+  /* N = N - 0.5.  */
+  n = n - __d_one_half;
+
+  /* R = X - N*Pi1.  */
+  vector double r = x - (n * __d_pi1_fma);
+
+  /* R = R - N*Pi2.  */
+  r = r - (n * __d_pi2_fma);
+
+  /* R = R - N*Pi3.  */
+  r = r - (n * __d_pi3_fma);
+
+  /* R2 = R*R.  */
+  vector double r2 = r * r;
+
+  /* Poly = C3+R2*(C4+R2*(C5+R2*(C6+R2*C7))).  */
+  vector double poly = r2 * __d_coeff7 + __d_coeff6;
+  poly = poly * r2 + __d_coeff5;
+  poly = poly * r2 + __d_coeff4;
+  poly = poly * r2 + __d_coeff3;
+
+  /* Poly = R+R*(R2*(C1+R2*(C2+R2*Poly))).  */
+  poly = poly * r2 + __d_coeff2;
+  poly = poly * r2 + __d_coeff1;
+  poly = poly * r2 * r + r;
+
+  /*
+   RECONSTRUCTION:
+   Final sign setting: Res = Poly^SignRes.  */
+  vector double out
+    = (vector double) ((vector long long) poly ^ (vector long long) sign_res);
+
+  if (large_in[0] != 0)
+    out[0] = cos (x[0]);
+
+  if (large_in[1] != 0)
+    out[1] = cos (x[1]);
+
+  return out;
+
+}
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
new file mode 100644
index 0000000000..ecd47d258f
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/vec_d_trig_data.h
@@ -0,0 +1,60 @@
+/* Constants used in polynomial approximations for vectorized sin, cos,
+   and sincos functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef D_TRIG_DATA_H
+#define D_TRIG_DATA_H
+
+#include
+
+/* PI/2.  */
+const vector double __d_half_pi = {0x1.921fb54442d18p+0, 0x1.921fb54442d18p+0};
+
+/* Inverse PI.  */
+const vector double __d_inv_pi = {0x1.45f306dc9c883p-2, 0x1.45f306dc9c883p-2};
+
+/* Right-shifter constant.  */
+const vector double __d_rshifter = {0x1.8p+52, 0x1.8p+52};
+
+/* Working range threshold.  */
+const vector double __d_rangeval = {0x1p+23, 0x1p+23};
+
+/* One-half.  */
+const vector double __d_one_half = {0x1p-1, 0x1p-1};
+
+/* Range reduction PI-based constants if FMA available:
+   PI high part (FMA available).  */
+const vector double __d_pi1_fma = {0x1.921fb54442d18p+1, 0x1.921fb54442d18p+1};
+
+/* PI mid part (FMA available).  */
+const vector double __d_pi2_fma = {0x1.1a62633145c06p-53, 0x1.1a62633145c06p-53};
+
+/* PI low part (FMA available).  */
+const vector double __d_pi3_fma
+= {0x1.c1cd129024e09p-106,0x1.c1cd129024e09p-106};
+
+/* Polynomial coefficients (relative error 2^(-52.115)).  */
+const vector double __d_coeff7 = {-0x1.9f0d60811aac8p-41,-0x1.9f0d60811aac8p-41};
+const vector double __d_coeff6 = {0x1.60e6857a2f22p-33,0x1.60e6857a2f22p-33};
+const vector double __d_coeff5 = {-0x1.ae63546002231p-26,-0x1.ae63546002231p-26};
+const vector double __d_coeff4 = {0x1.71de38030feap-19,0x1.71de38030feap-19};
+const vector double __d_coeff3 = {-0x1.a01a019a5b86dp-13,-0x1.a01a019a5b86dp-13};
+const vector double __d_coeff2 = {0x1.111111110a4a8p-7,0x1.111111110a4a8p-7};
+const vector double __d_coeff1 = {-0x1.55555555554a7p-3,-0x1.55555555554a7p-3};
+
+#endif /* D_TRIG_DATA_H.  */
diff --git a/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c b/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
new file mode 100644
index 0000000000..5fd180fbeb
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/fpu/vec_finite_alias.c
@@ -0,0 +1,17 @@
+/* A temporary workaround to force creation of libmvec_nonshared.a.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
new file mode 100644
index 0000000000..656ce0541f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libmvec.abilist
@@ -0,0 +1 @@
+GLIBC_2.30 _ZGVbN2v_cos F
--
2.20.1