From patchwork Mon Apr 8 02:52:28 2024
X-Patchwork-Submitter: yulong
X-Patchwork-Id: 88147
From: shiyulong@iscas.ac.cn
To: libc-alpha@sourceware.org
Cc: palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com,
 maskray@google.com, kito.cheng@sifive.com, wuwei2016@iscas.ac.cn,
 jiawei@iscas.ac.cn, shihua@iscas.ac.cn, chenyixuan@iscas.ac.cn, yulong
Subject: [RFC] Enable libmvec support for RISC-V
Date: Mon, 8 Apr 2024 10:52:28 +0800
Message-Id: <20240408025228.4136585-1-shiyulong@iscas.ac.cn>

From: yulong

This patch is a first attempt at enabling libmvec on RISC-V. To
demonstrate how the pieces fit together, it also adds an implementation
of vector cos. Comments are very welcome.

Thanks, yulong
---
 sysdeps/riscv/configure                       |   4 +
 sysdeps/riscv/configure.ac                    |   4 +
 sysdeps/riscv/rvd/Makefile                    |   5 +
 sysdeps/riscv/rvd/Versions                    |   5 +
 sysdeps/riscv/rvd/bits/math-vector.h          |  29 ++++
 sysdeps/riscv/rvd/cos.c                       |  94 ++++++++++++
 sysdeps/riscv/rvd/math_private.h              |  42 ++++++
 sysdeps/riscv/rvd/v_math.h                    | 139 ++++++++++++++++++
 sysdeps/riscv/rvd/vecmath_config.h            |  33 +++++
 sysdeps/unix/sysv/linux/riscv/libmvec.abilist |   1 +
 10 files changed, 356 insertions(+)
 mode change 100644 => 100755 sysdeps/riscv/configure
 create mode 100644 sysdeps/riscv/rvd/Makefile
 create mode 100644 sysdeps/riscv/rvd/Versions
 create mode 100644 sysdeps/riscv/rvd/bits/math-vector.h
 create mode 100644 sysdeps/riscv/rvd/cos.c
 create mode 100644 sysdeps/riscv/rvd/math_private.h
 create mode 100644 sysdeps/riscv/rvd/v_math.h
 create mode 100644 sysdeps/riscv/rvd/vecmath_config.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/libmvec.abilist

diff --git a/sysdeps/riscv/configure b/sysdeps/riscv/configure
old mode 100644
new mode 100755
index c8f01709f8..2010f7d0fd
--- a/sysdeps/riscv/configure
+++ b/sysdeps/riscv/configure
@@ -80,3 +80,7 @@ if test "$libc_cv_static_pie_on_riscv" = yes; then
   printf "%s\n" "#define SUPPORT_STATIC_PIE 1" >>confdefs.h
 
 fi
+
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/riscv/configure.ac b/sysdeps/riscv/configure.ac
index ee3d1ed014..44eabed9d6 100644
--- a/sysdeps/riscv/configure.ac
+++ b/sysdeps/riscv/configure.ac
@@ -43,3 +43,7 @@ EOF
 if test "$libc_cv_static_pie_on_riscv" = yes; then
   AC_DEFINE(SUPPORT_STATIC_PIE)
 fi
+
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/riscv/rvd/Makefile b/sysdeps/riscv/rvd/Makefile
new file mode 100644
index 0000000000..1adb2ee582
--- /dev/null
+++ b/sysdeps/riscv/rvd/Makefile
@@ -0,0 +1,5 @@
+libmvec-supported-funcs = cos
+
+ifeq ($(subdir),mathvec)
+libmvec-support = $(addprefix d,$(libmvec-supported-funcs))
+endif
diff --git a/sysdeps/riscv/rvd/Versions b/sysdeps/riscv/rvd/Versions
new file mode 100644
index 0000000000..c381b0b9fb
--- /dev/null
+++ b/sysdeps/riscv/rvd/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.39 {
+    _ZGVnN2v_cos;
+  }
+}
\ No newline at end of file
diff --git a/sysdeps/riscv/rvd/bits/math-vector.h b/sysdeps/riscv/rvd/bits/math-vector.h
new file mode 100644
index 0000000000..b34ffc9bc1
--- /dev/null
+++ b/sysdeps/riscv/rvd/bits/math-vector.h
@@ -0,0 +1,29 @@
+/* Platform-specific SIMD declarations of math functions.
+
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _MATH_H
+# error "Never include directly;\
+ include instead."
+#endif
+
+#if defined __riscv__
+# define __DECL_RVV_RISCV _Pragma
+# undef __DECL_RVV_cos
+# define __DECL_RVV_cos __DECL_RVV_RISCV
+#endif
diff --git a/sysdeps/riscv/rvd/cos.c b/sysdeps/riscv/rvd/cos.c
new file mode 100644
index 0000000000..1806acd629
--- /dev/null
+++ b/sysdeps/riscv/rvd/cos.c
@@ -0,0 +1,94 @@
+/* Double-precision vector cos function.
+
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "v_math.h"
+
+
+static const struct data
+{
+  vfloat64m2_t poly[7];
+  vfloat64m2_t range_val, shift, inv_pi, half_pi, pi_1, pi_2, pi_3;
+} data = {
+  /* Worst-case error is 3.3 ulp in [-pi/2, pi/2].  */
+  .poly = { V2 (-0x1.555555555547bp-3), V2 (0x1.1111111108a4dp-7),
+            V2 (-0x1.a01a019936f27p-13), V2 (0x1.71de37a97d93ep-19),
+            V2 (-0x1.ae633919987c6p-26), V2 (0x1.60e277ae07cecp-33),
+            V2 (-0x1.9e9540300a1p-41) },
+  .inv_pi = V2 (0x1.45f306dc9c883p-2),
+  .half_pi = V2 (0x1.921fb54442d18p+0),
+  .pi_1 = V2 (0x1.921fb54442d18p+1),
+  .pi_2 = V2 (0x1.1a62633145c06p-53),
+  .pi_3 = V2 (0x1.c1cd129024e09p-106),
+  .shift = V2 (0x1.8p52),
+  .range_val = V2 (0x1p23)
+};
+
+#define C(i) d->poly[i]
+
+static vfloat64m2_t NOINLINE
+special_case (vfloat64m2_t x, vfloat64m2_t y, vuint64m2_t odd, vuint64m2_t cmp)
+{
+  y = vreinterpret_v_u64m2_f64m2 (vor (vreinterpret_v_f64m2_u64m2 (y), odd, 1));
+  return v_call_f64 (cos, x, y, cmp);
+}
+
+vfloat64m2_t V_NAME_D1 (cos) (vfloat64m2_t x)
+{
+  const struct data *d = ptr_barrier (&data);
+  vfloat64m2_t n, r, r2, r3, r4, t1, t2, t3, y;
+  vuint64m2_t odd, cmp;
+
+  r = vfabs_v_f64m2 (x, 2);
+  cmp = (vuint64m2_t) vmsgeu (vreinterpret_v_f64m2_u64m2 (r),
+                              vreinterpret_v_f64m2_u64m2 (d->range_val));
+  if (__glibc_unlikely (v_any_u64 (cmp)))
+    /* If fenv exceptions are to be triggered correctly, set any special lanes
+       to 1 (which is neutral w.r.t. fenv).  These lanes will be fixed by
+       special-case handler later.  */
+    r = vmsltu (cmp, v_f64 (1.0), r);
+
+  /* n = rint((|x|+pi/2)/pi) - 0.5.  */
+  n = vfmadd (d->shift, d->inv_pi, vfadd (r, d->half_pi,2), 2);
+  odd = vshlq_n_u64 (vreinterpret_v_f64m2_u64m2 (n), 63);
+  n = vfsub (n, d->shift, 2);
+  n = vfsub (n, v_f64 (0.5), 2);
+
+  /* r = |x| - n*pi  (range reduction into -pi/2 .. pi/2).  */
+  r = vfmsub (r, d->pi_1, n, 2);
+  r = vfmsub (r, d->pi_2, n, 2);
+  r = vfmsub (r, d->pi_3, n, 2);
+
+  /* sin(r) poly approx.  */
+  r2 = vfmul (r, r, 2);
+  r3 = vfmul (r2, r, 2);
+  r4 = vfmul (r2, r2, 2);
+
+  t1 = vfmadd (C (4), C (5), r2, 2);
+  t2 = vfmadd (C (2), C (3), r2, 2);
+  t3 = vfmadd (C (0), C (1), r2, 2);
+
+  y = vfmadd (t1, C (6), r4, 2);
+  y = vfmadd (t2, y, r4, 2);
+  y = vfmadd (t3, y, r4, 2);
+  y = vfmadd (r, y, r3, 2);
+
+  if (__glibc_unlikely (v_any_u64 (cmp)))
+    return special_case (x, y, odd, cmp);
+  return vreinterpretq_f64_u64 (vor (vreinterpret_v_f64m2_u64m2 (y), odd, 2));
+}
diff --git a/sysdeps/riscv/rvd/math_private.h b/sysdeps/riscv/rvd/math_private.h
new file mode 100644
index 0000000000..655a4dcd55
--- /dev/null
+++ b/sysdeps/riscv/rvd/math_private.h
@@ -0,0 +1,42 @@
+/* Configure optimized libm functions.  RISC-V version.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef RISCV_MATH_PRIVATE_H
+#define RISCV_MATH_PRIVATE_H 1
+
+#include
+#include
+
+/* Use inline round and lround instructions.  */
+#define TOINT_INTRINSICS 1
+
+static inline double_t
+roundtoint (double_t x)
+{
+  return round (x);
+}
+
+static inline int32_t
+converttoint (double_t x)
+{
+  return lround (x);
+}
+
+#include_next
+
+#endif
diff --git a/sysdeps/riscv/rvd/v_math.h b/sysdeps/riscv/rvd/v_math.h
new file mode 100644
index 0000000000..d2e821aeb2
--- /dev/null
+++ b/sysdeps/riscv/rvd/v_math.h
@@ -0,0 +1,139 @@
+/* Utilities for Advanced SIMD libmvec routines.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _V_MATH_H
+#define _V_MATH_H
+
+#include
+#include "vecmath_config.h"
+
+#define V_NAME_D1(fun) _ZGVnN2v_##fun
+
+/* Shorthand helpers for declaring constants.  */
+#define V2(X) { X, X }
+#define V4(X) { X, X, X, X }
+#define V8(X) { X, X, X, X, X, X, X, X }
+
+static inline vfloat32m4_t
+v_f32 (float x)
+{
+  return (vfloat32m4_t) V4 (x);
+}
+static inline vuint32m4_t
+v_u32 (uint32_t x)
+{
+  return (vuint32m4_t) V4 (x);
+}
+static inline vint32m4_t
+v_s32 (int32_t x)
+{
+  return (vint32m4_t) V4 (x);
+}
+
+/* true if any elements of a vector compare result is non-zero.  */
+static inline int
+v_any_u32 (vuint32m4_t x)
+{
+  /* assume elements in x are either 0 or -1u.  */
+  return vpaddd_u64 (vreinterpret_v_u64m2_u32m2 (x)) != 0;
+}
+static inline int
+v_any_u32h (vuint32m2_t x)
+{
+  return vget_lane_u64 (vreinterpret_v_u32m2_u64m2 (x), 0) != 0;
+}
+static inline vfloat32m4_t
+v_lookup_f32 (const float *tab, vuint32m4_t idx)
+{
+  return (vfloat32m4_t){ tab[idx[0]], tab[idx[1]], tab[idx[2]], tab[idx[3]] };
+}
+static inline vuint32m4_t
+v_lookup_u32 (const uint32_t *tab, vuint32m4_t idx)
+{
+  return (vuint32m4_t){ tab[idx[0]], tab[idx[1]], tab[idx[2]], tab[idx[3]] };
+}
+static inline vfloat32m4_t
+v_call_f32 (float (*f) (float), vfloat32m4_t x, vfloat32m4_t y, vuint32m4_t p)
+{
+  return (vfloat32m4_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1],
+                         p[2] ? f (x[2]) : y[2], p[3] ? f (x[3]) : y[3] };
+}
+static inline vfloat32m4_t
+v_call2_f32 (float (*f) (float, float), vfloat32m4_t x1, vfloat32m4_t x2,
+             vfloat32m4_t y, vuint32m4_t p)
+{
+  return (vfloat32m4_t){ p[0] ? f (x1[0], x2[0]) : y[0],
+                         p[1] ? f (x1[1], x2[1]) : y[1],
+                         p[2] ? f (x1[2], x2[2]) : y[2],
+                         p[3] ? f (x1[3], x2[3]) : y[3] };
+}
+
+static inline vfloat64m2_t
+v_f64 (double x)
+{
+  return (vfloat64m2_t) V2 (x);
+}
+static inline vuint64m2_t
+v_u64 (uint64_t x)
+{
+  return (vuint64m2_t) V2 (x);
+}
+static inline vint64m2_t
+v_s64 (int64_t x)
+{
+  return (vint64m2_t) V2 (x);
+}
+
+/* true if any elements of a vector compare result is non-zero.  */
+static inline int
+v_any_u64 (vuint64m1_t x)
+{
+  /* assume elements in x are either 0 or -1u.  */
+  return vpaddd_u64 (x) != 0;
+}
+/* true if all elements of a vector compare result is 1.  */
+static inline int
+v_all_u64 (vuint64m1_t x)
+{
+  /* assume elements in x are either 0 or -1u.  */
+  return vpaddd_s64 (vreinterpretq_s64_u64 (x)) == -2;
+}
+static inline vfloat64m1_t
+v_lookup_f64 (const double *tab, vuint64m1_t idx)
+{
+  return (vfloat64m1_t){ tab[idx[0]], tab[idx[1]] };
+}
+static inline vuint64m1_t
+v_lookup_u64 (const uint64_t *tab, vuint64m1_t idx)
+{
+  return (vuint64m1_t){ tab[idx[0]], tab[idx[1]] };
+}
+static inline vfloat64m1_t
+v_call_f64 (double (*f) (double), vfloat64m1_t x, vfloat64m1_t y, vuint64m1_t p)
+{
+  return (vfloat64m1_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1] };
+}
+static inline vfloat64m1_t
+v_call2_f64 (double (*f) (double, double), vfloat64m1_t x1, vfloat64m1_t x2,
+             vfloat64m1_t y, vuint64m1_t p)
+{
+  return (vfloat64m1_t){ p[0] ? f (x1[0], x2[0]) : y[0],
+                         p[1] ? f (x1[1], x2[1]) : y[1] };
+}
+
+#endif
diff --git a/sysdeps/riscv/rvd/vecmath_config.h b/sysdeps/riscv/rvd/vecmath_config.h
new file mode 100644
index 0000000000..290ea1e33c
--- /dev/null
+++ b/sysdeps/riscv/rvd/vecmath_config.h
@@ -0,0 +1,33 @@
+/* Configuration for libmvec routines.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _VECMATH_CONFIG_H
+#define _VECMATH_CONFIG_H
+
+#include
+
+/* Return ptr but hide its value from the compiler so accesses through it
+   cannot be optimized based on the contents.  */
+#define ptr_barrier(ptr)                                                      \
+  ({                                                                          \
+    __typeof (ptr) __ptr = (ptr);                                             \
+    __asm("" : "+r"(__ptr));                                                  \
+    __ptr;                                                                    \
+  })
+
+#endif
diff --git a/sysdeps/unix/sysv/linux/riscv/libmvec.abilist b/sysdeps/unix/sysv/linux/riscv/libmvec.abilist
new file mode 100644
index 0000000000..7389378e58
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/libmvec.abilist
@@ -0,0 +1 @@
+GLIBC_2.39 _ZGVnN2v_cos F
\ No newline at end of file