From patchwork Fri Mar 29 13:35:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 32084
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH 25/28] powerpc: Refactor modf{f}
Date: Fri, 29 Mar 2019 10:35:26 -0300
Message-Id: <20190329133529.22523-26-adhemerval.zanella@linaro.org>
In-Reply-To: <20190329133529.22523-1-adhemerval.zanella@linaro.org>
References: <20190329133529.22523-1-adhemerval.zanella@linaro.org>

The power5+ modf{f} optimization does not improve performance on
ISA 2.07+.  This patch moves the IFUNC for powerpc64 only, moves the
power5+ implementation to the generic location, and includes the
generic implementation for ISA 2.07+.
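To illustrate why a generic implementation can win once moves between
general-purpose and floating-point registers are cheap, here is a minimal
sketch of a modf that works on the integer representation of its argument.
It assumes the generic code follows this bit-masking style, which this patch
does not show; the names modf_bits and from_bits are hypothetical, and double
is assumed to be IEEE 754 binary64 with the same byte order as uint64_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret an IEEE 754 binary64 bit pattern as a double.  */
static double
from_bits (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Hypothetical integer-based modf; an illustration, not glibc's code.  */
static double
modf_bits (double x, double *iptr)
{
  uint64_t bits, sign, frac_mask;
  int e;

  assert (iptr != NULL);
  memcpy (&bits, &x, sizeof bits);          /* FP -> integer register.  */
  sign = bits & UINT64_C (0x8000000000000000);
  e = (int) ((bits >> 52) & 0x7ff) - 1023;  /* Unbiased exponent.  */

  if (e < 0)
    {
      /* |x| < 1: the integral part is a zero with the sign of x.  */
      *iptr = from_bits (sign);
      return x;
    }
  if (e >= 52)
    {
      /* No fraction bits are stored; this also covers Inf and NaN.  */
      *iptr = x;
      return x != x ? x : from_bits (sign);
    }

  /* Mantissa bits that lie below the binary point.  */
  frac_mask = UINT64_C (0x000fffffffffffff) >> e;
  if ((bits & frac_mask) == 0)
    {
      /* x is already integral.  */
      *iptr = x;
      return from_bits (sign);
    }
  *iptr = from_bits (bits & ~frac_mask);    /* Integer register -> FP.  */
  return x - *iptr;
}
```

Each call crosses between the register files at least twice, so the cost of
those moves is presumably what decides whether this style or a floor/ceil
based one is faster on a given core.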

The performance changes are based on modf benchtests:

* POWER9 - generic code
  "modf": {
   "": {
    "duration": 4.97057e+09,
    "iterations": 1.00688e+09,
    "max": 28.76,
    "min": 4.912,
    "mean": 4.9366
   }
  }

* POWER9 - power5+ optimization
  "modf": {
   "": {
    "duration": 4.98291e+09,
    "iterations": 9.32818e+08,
    "max": 15.058,
    "min": 5.107,
    "mean": 5.34178
   }
  }

* POWER8 - generic code
  "modf": {
   "": {
    "duration": 5.05329e+09,
    "iterations": 8.38814e+08,
    "max": 518.051,
    "min": 5.79,
    "mean": 6.02433
   }
  }

* POWER8 - power5+ optimization
  "modf": {
   "": {
    "duration": 5.05573e+09,
    "iterations": 8.35254e+08,
    "max": 63.141,
    "min": 5.873,
    "mean": 6.05293
   }
  }

* POWER7 - generic code
  "modf": {
   "": {
    "duration": 4.89818e+09,
    "iterations": 1.08408e+09,
    "max": 57.556,
    "min": 3.953,
    "mean": 4.51827
   }
  }

* POWER7 - power5+ optimization
  "modf": {
   "": {
    "duration": 4.83789e+09,
    "iterations": 1.33409e+09,
    "max": 46.608,
    "min": 2.224,
    "mean": 3.62636
   }
  }

Checked on powerpc-linux-gnu (built without --with-cpu, with
--with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch),
powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+
and --disable-multi-arch).

	* sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ...
	* sysdeps/powerpc/fpu/s_modf.c: ... here.  Add ISA 2.07 optimization.
	* sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ...
	* sysdeps/powerpc/fpu/s_modff.c: ... here.  Add ISA 2.07 optimization.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c:
	Adjust include.
	* sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c:
	Likewise.
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls,
	sysdep_routines): Add s_modf* objects.
	(CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c,
	CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c: ... here.
	* sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ...
	* sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here.

Reviewed-by: Gabriel F. T. Gomes
---
 sysdeps/powerpc/{power5+ => }/fpu/s_modf.c    | 17 +++++++++++++----
 sysdeps/powerpc/{power5+ => }/fpu/s_modff.c   | 13 +++++++++++--
 .../power4/fpu/multiarch/s_modf-power5+.c     | 13 +------------
 .../power4/fpu/multiarch/s_modff-power5+.c    |  9 +--------
 .../powerpc64/be/fpu/multiarch/Makefile       | 19 ++++++++++++++++++-
 .../{ => be}/fpu/multiarch/s_modf-power5+.c   |  3 ++-
 .../{ => be}/fpu/multiarch/s_modf-ppc64.c     |  0
 .../powerpc64/{ => be}/fpu/multiarch/s_modf.c |  0
 .../{ => be}/fpu/multiarch/s_modff-power5+.c  |  3 ++-
 .../{ => be}/fpu/multiarch/s_modff-ppc64.c    |  0
 .../{ => be}/fpu/multiarch/s_modff.c          |  0
 .../powerpc/powerpc64/fpu/multiarch/Makefile  | 13 -------------
 12 files changed, 48 insertions(+), 42 deletions(-)
 rename sysdeps/powerpc/{power5+ => }/fpu/s_modf.c (74%)
 rename sysdeps/powerpc/{power5+ => }/fpu/s_modff.c (77%)
 rename sysdeps/powerpc/powerpc64/{ => be}/fpu/multiarch/s_modf-power5+.c (91%)
 rename sysdeps/powerpc/powerpc64/{ => be}/fpu/multiarch/s_modf-ppc64.c (100%)
 rename sysdeps/powerpc/powerpc64/{ => be}/fpu/multiarch/s_modf.c (100%)
 rename sysdeps/powerpc/powerpc64/{ => be}/fpu/multiarch/s_modff-power5+.c (91%)
 rename sysdeps/powerpc/powerpc64/{ => be}/fpu/multiarch/s_modff-ppc64.c (100%)
 rename sysdeps/powerpc/powerpc64/{ => be}/fpu/multiarch/s_modff.c (100%)

diff --git
a/sysdeps/powerpc/power5+/fpu/s_modf.c b/sysdeps/powerpc/fpu/s_modf.c
similarity index 74%
rename from sysdeps/powerpc/power5+/fpu/s_modf.c
rename to sysdeps/powerpc/fpu/s_modf.c
index dbb11652e1..2304fc48ed 100644
--- a/sysdeps/powerpc/power5+/fpu/s_modf.c
+++ b/sysdeps/powerpc/fpu/s_modf.c
@@ -15,9 +15,15 @@
    License along with the GNU C Library; see the file COPYING.LIB.  If
    not, see .  */
 
-#include
-#include
-#include
+/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
+   generic implementation faster.  Also disables for old ISAs that do not
+   have ceil/floor instructions.  */
+#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR5X)
+# include
+#else
+# include
+# include
+# include
 
 double
 __modf (double x, double *iptr)
@@ -44,7 +50,10 @@ __modf (double x, double *iptr)
       return copysign (x - *iptr, x);
     }
 }
+# ifndef __modf
 libm_alias_double (__modf, modf)
-#if LONG_DOUBLE_COMPAT (libc, GLIBC_2_0)
+# if LONG_DOUBLE_COMPAT (libc, GLIBC_2_0)
 compat_symbol (libc, __modf, modfl, GLIBC_2_0);
+# endif
+# endif
 #endif
diff --git a/sysdeps/powerpc/power5+/fpu/s_modff.c b/sysdeps/powerpc/fpu/s_modff.c
similarity index 77%
rename from sysdeps/powerpc/power5+/fpu/s_modff.c
rename to sysdeps/powerpc/fpu/s_modff.c
index 87c9f020f7..2a0f114b20 100644
--- a/sysdeps/powerpc/power5+/fpu/s_modff.c
+++ b/sysdeps/powerpc/fpu/s_modff.c
@@ -15,8 +15,14 @@
    License along with the GNU C Library; see the file COPYING.LIB.  If
    not, see .  */
 
-#include
-#include
+/* ISA 2.07 provides fast GPR to FP instruction (mfvsr{d,wz}) which make
+   generic implementation faster.  Also disables for old ISAs that do not
+   have ceil/floor instructions.  */
+#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR5X)
+# include
+#else
+# include
+# include
 
 float
 __modff (float x, float *iptr)
@@ -43,4 +49,7 @@ __modff (float x, float *iptr)
       return copysignf (x - *iptr, x);
     }
 }
+# ifndef __modff
 libm_alias_float (__modf, modf)
+# endif
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c
index b1d0540b31..6f93c2b652 100644
--- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c
+++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c
@@ -16,16 +16,5 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
-#include
-
-#undef weak_alias
-#define weak_alias(a,b)
-#undef strong_alias
-#define strong_alias(a,b)
-#undef compat_symbol
-#define compat_symbol(a,b,c,d)
-
 #define __modf __modf_power5plus
-
-#include
+#include
diff --git a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c
index 8b333eae0d..2e701881e8 100644
--- a/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c
+++ b/sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c
@@ -16,12 +16,5 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
-#include
-
-#undef weak_alias
-#define weak_alias(a,b)
-
 #define __modff __modff_power5plus
-
-#include
+#include
diff --git a/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
index 9917cd15d5..63de39bbf1 100644
--- a/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile
@@ -1,4 +1,13 @@
 ifeq ($(subdir),math)
+# These functions are built both for libc and libm because they're required
+# by printf.  While the libc objects have the prefix s_, the libm ones are
+# prefixed with m_.
+sysdep_calls := s_modf-power5+ \
+		s_modf-ppc64 \
+		s_modff-power5+ \
+		s_modff-ppc64
+
+sysdep_routines += $(sysdep_calls)
 libm-sysdep_routines += s_ceil-power5+ \
 			s_ceil-ppc64 \
 			s_ceilf-power5+ \
@@ -22,7 +31,8 @@ libm-sysdep_routines += s_ceil-power5+ \
 			s_llround-power6x \
 			s_llround-power5+ \
 			s_llround-ppc64 \
-			s_llroundf-ppc64
+			s_llroundf-ppc64 \
+			$(sysdep_calls:s_%=m_%)
 
 CFLAGS-s_ceil-power5+.c = -mcpu=power5+
 CFLAGS-s_ceilf-power5+.c = -mcpu=power5+
@@ -35,4 +45,11 @@ CFLAGS-s_truncf-power5+.c = -mcpu=power5+
 CFLAGS-s_llround-power8.c += -mcpu=power8
 CFLAGS-s_llround-power6x.c += -mcpu=power6x
 CFLAGS-s_llround-power5+.c += -mcpu=power5+
+
+CFLAGS-s_modf-power5+.c += -mcpu=power5+
+CFLAGS-s_modff-power5+.c += -mcpu=power5+
+# These files quiet sNaNs in a way that is optimized away without
+# -fsignaling-nans.
+CFLAGS-s_modf-ppc64.c += -fsignaling-nans
+CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c
similarity index 91%
rename from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c
rename to sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c
index 1a958de178..6f93c2b652 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c
@@ -16,4 +16,5 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
+#define __modf __modf_power5plus
+#include
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c
similarity index 100%
rename from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-ppc64.c
rename to sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-ppc64.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c
similarity index 100%
rename from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c
rename to sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c
similarity index 91%
rename from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c
rename to sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c
index 4939d4bc2b..2e701881e8 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c
+++ b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c
@@ -16,4 +16,5 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
+#define __modff __modff_power5plus
+#include
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c
similarity index 100%
rename from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c
rename to sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c b/sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c
similarity index 100%
rename from sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c
rename to sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c
diff --git a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
index 534d5a7133..d7ad1e2724 100644
--- a/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile
@@ -1,10 +1,4 @@
 ifeq ($(subdir),math)
-# These functions are built both for libc and libm because they're required
-# by printf.  While the libc objects have the prefix s_, the libm ones are
-# prefixed with m_.
-sysdep_calls := s_modf-power5+ s_modf-ppc64 \
-		s_modff-power5+ s_modff-ppc64
-
 sysdep_routines += $(sysdep_calls)
 libm-sysdep_routines += s_logb-power7 s_logbf-power7 \
 			s_logbl-power7 s_logb-ppc64 s_logbf-ppc64 \
@@ -14,11 +8,4 @@ libm-sysdep_routines += s_logb-power7 s_logbf-power7 \
 CFLAGS-s_logbf-power7.c = -mcpu=power7
 CFLAGS-s_logbl-power7.c = -mcpu=power7
 CFLAGS-s_logb-power7.c = -mcpu=power7
-CFLAGS-s_modf-power5+.c = -mcpu=power5+
-CFLAGS-s_modff-power5+.c = -mcpu=power5+
-
-# These files quiet sNaNs in a way that is optimized away without
-# -fsignaling-nans.
-CFLAGS-s_modf-ppc64.c += -fsignaling-nans
-CFLAGS-s_modff-ppc64.c += -fsignaling-nans
 endif
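
The preprocessor guard that this patch adds to s_modf.c and s_modff.c can be
read as a small truth table over two compiler-predefined macros.  The sketch
below restates it as a plain C function so the three cases are easy to check.
The enum and function names are hypothetical and exist only for illustration;
only the condition itself is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical labels for the two code paths the guard chooses between.  */
enum modf_impl
{
  MODF_GENERIC,     /* The generic implementation is included.  */
  MODF_FLOOR_CEIL   /* The former power5+ body is compiled.  */
};

/* Same condition as the patch's
   "#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR5X)",
   with the two macros modelled as booleans.  */
static enum modf_impl
select_modf_impl (bool arch_pwr8, bool arch_pwr5x)
{
  if (arch_pwr8 || !arch_pwr5x)
    return MODF_GENERIC;
  return MODF_FLOOR_CEIL;
}

/* What the guard resolves to for the compiler flags of this build.  */
static enum modf_impl
modf_impl_for_this_build (void)
{
#if defined(_ARCH_PWR8) || !defined(_ARCH_PWR5X)
  return MODF_GENERIC;
#else
  return MODF_FLOOR_CEIL;
#endif
}
```

Read this way, the floor/ceil body is only compiled for targets from power5+
up to, but not including, power8; everything older or newer gets the generic
code.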