From patchwork Tue May 31 13:08:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andreas Schwab
X-Patchwork-Id: 54562
To: libc-alpha@sourceware.org
Subject: [PATCH] x86_64: Optimize sincos where sin/cos is optimized (bug 29193)
Date: Tue, 31 May 2022 15:08:40 +0200
From: Andreas Schwab
List-Id: Libc-alpha mailing list

The compiler may substitute calls to sin or cos with calls to sincos,
thus we should have the same optimized implementations for sincos.  The
optimized implementations may produce results that differ from the
generic ones, so this also makes sure that the sincos call agrees with
the sin and cos calls.
Reviewed-by: Adhemerval Zanella
---
 sysdeps/ieee754/dbl-64/s_sincos.c            |  7 +++++
 sysdeps/x86_64/fpu/multiarch/Makefile        | 12 ++++++--
 sysdeps/x86_64/fpu/multiarch/s_sincos-avx.c  |  3 ++
 sysdeps/x86_64/fpu/multiarch/s_sincos-fma.c  |  3 ++
 sysdeps/x86_64/fpu/multiarch/s_sincos-fma4.c |  3 ++
 sysdeps/x86_64/fpu/multiarch/s_sincos.c      | 30 ++++++++++++++++++++
 6 files changed, 55 insertions(+), 3 deletions(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_sincos-avx.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_sincos-fma.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_sincos-fma4.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/s_sincos.c

diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index 12c3021e66..890c137ebb 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -24,10 +24,15 @@
 #include
 #include
 
+#ifndef SECTION
+# define SECTION
+#endif
+
 #define IN_SINCOS
 #include "s_sin.c"
 
 void
+SECTION
 __sincos (double x, double *sinx, double *cosx)
 {
   mynumber u;
@@ -100,4 +105,6 @@ __sincos (double x, double *sinx, double *cosx)
   *sinx = *cosx = x / x;
 }
 
+#ifndef __sincos
 libm_alias_double (__sincos, sincos)
+#endif
diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile
index ec796277a5..248162525b 100644
--- a/sysdeps/x86_64/fpu/multiarch/Makefile
+++ b/sysdeps/x86_64/fpu/multiarch/Makefile
@@ -10,7 +10,8 @@ libm-sysdep_routines += s_ceil-sse4_1 s_ceilf-sse4_1 s_floor-sse4_1 \
 			s_trunc-sse4_1 s_truncf-sse4_1
 
 libm-sysdep_routines += e_exp-fma e_log-fma e_pow-fma s_atan-fma \
-			e_asin-fma e_atan2-fma s_sin-fma s_tan-fma
+			e_asin-fma e_atan2-fma s_sin-fma s_tan-fma \
+			s_sincos-fma
 
 CFLAGS-e_asin-fma.c = -mfma -mavx2
 CFLAGS-e_atan2-fma.c = -mfma -mavx2
@@ -20,6 +21,7 @@ CFLAGS-e_pow-fma.c = -mfma -mavx2
 CFLAGS-s_atan-fma.c = -mfma -mavx2
 CFLAGS-s_sin-fma.c = -mfma -mavx2
 CFLAGS-s_tan-fma.c = -mfma -mavx2
+CFLAGS-s_sincos-fma.c = -mfma -mavx2
 
 libm-sysdep_routines += s_sinf-sse2 s_cosf-sse2 s_sincosf-sse2
 
@@ -36,7 +38,8 @@ CFLAGS-s_cosf-fma.c = -mfma -mavx2
 CFLAGS-s_sincosf-fma.c = -mfma -mavx2
 
 libm-sysdep_routines += e_exp-fma4 e_log-fma4 e_pow-fma4 s_atan-fma4 \
-			e_asin-fma4 e_atan2-fma4 s_sin-fma4 s_tan-fma4
+			e_asin-fma4 e_atan2-fma4 s_sin-fma4 s_tan-fma4 \
+			s_sincos-fma4
 
 CFLAGS-e_asin-fma4.c = -mfma4
 CFLAGS-e_atan2-fma4.c = -mfma4
@@ -46,9 +49,11 @@ CFLAGS-e_pow-fma4.c = -mfma4
 CFLAGS-s_atan-fma4.c = -mfma4
 CFLAGS-s_sin-fma4.c = -mfma4
 CFLAGS-s_tan-fma4.c = -mfma4
+CFLAGS-s_sincos-fma4.c = -mfma4
 
 libm-sysdep_routines += e_exp-avx e_log-avx s_atan-avx \
-			e_atan2-avx s_sin-avx s_tan-avx
+			e_atan2-avx s_sin-avx s_tan-avx \
+			s_sincos-avx
 
 CFLAGS-e_atan2-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-e_exp-avx.c = -msse2avx -DSSE2AVX
@@ -56,6 +61,7 @@ CFLAGS-e_log-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-s_atan-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-s_sin-avx.c = -msse2avx -DSSE2AVX
 CFLAGS-s_tan-avx.c = -msse2avx -DSSE2AVX
+CFLAGS-s_sincos-avx.c = -msse2avx -DSSE2AVX
 endif
 
 ifeq ($(subdir),mathvec)
diff --git a/sysdeps/x86_64/fpu/multiarch/s_sincos-avx.c b/sysdeps/x86_64/fpu/multiarch/s_sincos-avx.c
new file mode 100644
index 0000000000..debea0f619
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_sincos-avx.c
@@ -0,0 +1,3 @@
+#define __sincos __sincos_avx
+#define SECTION __attribute__ ((section (".text.avx")))
+#include
diff --git a/sysdeps/x86_64/fpu/multiarch/s_sincos-fma.c b/sysdeps/x86_64/fpu/multiarch/s_sincos-fma.c
new file mode 100644
index 0000000000..31610b3356
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_sincos-fma.c
@@ -0,0 +1,3 @@
+#define __sincos __sincos_fma
+#define SECTION __attribute__ ((section (".text.fma")))
+#include
diff --git a/sysdeps/x86_64/fpu/multiarch/s_sincos-fma4.c b/sysdeps/x86_64/fpu/multiarch/s_sincos-fma4.c
new file mode 100644
index 0000000000..7e8b8e6484
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_sincos-fma4.c
@@ -0,0 +1,3 @@
+#define __sincos __sincos_fma4
+#define SECTION __attribute__ ((section (".text.fma4")))
+#include
diff --git a/sysdeps/x86_64/fpu/multiarch/s_sincos.c b/sysdeps/x86_64/fpu/multiarch/s_sincos.c
new file mode 100644
index 0000000000..67c2817d68
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/s_sincos.c
@@ -0,0 +1,30 @@
+/* Multiple versions of sincos.
+   Copyright (C) 2017-2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+
+extern void __redirect_sincos (double, double *, double *);
+
+#define SYMBOL_NAME sincos
+#include "ifunc-fma4.h"
+
+libc_ifunc_redirected (__redirect_sincos, __sincos, IFUNC_SELECTOR ());
+libm_alias_double (__sincos, sincos)
+
+#define __sincos __sincos_sse2
+#include