From patchwork Wed Mar 6 17:18:43 2019
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 31742
From: Steve Ellcey
To: "libc-alpha@sourceware.org"
Subject: [PATCH] Aarch64: Add simd exp/expf functions
Date: Wed, 6 Mar 2019 17:18:43 +0000

Here are float and double vector exp functions for Aarch64.  The vector
functions are based on the IEEE ones in sysdeps/ieee754/flt-32/e_expf.c
and sysdeps/ieee754/dbl-64/e_exp.c.  If any input element is NaN or has a
magnitude at or above a fixed limit (700.0 for exp, 80.0 for expf), the
vector functions call the scalar routines on every element; otherwise they
use the Aarch64 SIMD instructions with the same algorithm as the IEEE
functions.

In my testing I have not found any differences in exp output between the
scalar and vector versions, and the newly added tests for the vector
routines pass using the updated libm-test-ulps file.

This patch also sets build_mathvec to yes by default on Aarch64, applies
the simd attribute to exp and expf in the C header, and includes a Fortran
header.  The Fortran header is in finclude, so this patch needs Martin
Liska's patch that moves math-vector-fortran.h from bits to finclude in
order to work correctly.

Comments?

Steve Ellcey
sellcey@marvell.com


2019-03-06  Steve Ellcey

	* sysdeps/aarch64/configure.ac (build_mathvec): Set to yes by default.
	* sysdeps/aarch64/configure: Regenerate.
	* sysdeps/aarch64/fpu/Makefile (CFLAGS-libmvec_double_vlen2_exp.c):
	Set flag.
	(CFLAGS-libmvec_float_vlen4_expf.c): Likewise.
	(CFLAGS-libmvec_exp_data.c): Likewise.
	(CFLAGS-libmvec_exp2f_data.c): Likewise.
	(libmvec-support): Add libmvec_double_vlen2_exp,
	libmvec_float_vlen4_expf, libmvec_exp_data, libmvec_exp2f_data to list.
	(libmvec-static-only-routines): Add dummy name to list.
	(libmvec-tests): Add double-vlen2, float-vlen4 to list.
	(double-vlen2-funcs): Add new vector function name.
	(float-vlen4-funcs): Add new vector function name.
	* sysdeps/aarch64/fpu/Versions: New file.
	* sysdeps/aarch64/fpu/bits/math-vector.h: New file.
	* sysdeps/aarch64/fpu/finclude/math-vector-fortran.h: New file.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.c: New file.
	* sysdeps/aarch64/fpu/libmvec_exp2f_data.c: New file.
	* sysdeps/aarch64/fpu/libmvec_exp_data.c: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.c: New file.
	* sysdeps/aarch64/fpu/libmvec_util.h: New file.
	* sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c: New file.
	* sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c: New file.
	* sysdeps/aarch64/libm-test-ulps (exp_vlen2): New entry.
	(exp_vlen4): Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libmvec.abilist: New file.

diff --git a/sysdeps/aarch64/configure.ac b/sysdeps/aarch64/configure.ac
index 7851dd4..c6d9646 100644
--- a/sysdeps/aarch64/configure.ac
+++ b/sysdeps/aarch64/configure.ac
@@ -20,3 +20,7 @@ if test $libc_cv_aarch64_be = yes; then
 else
   LIBC_CONFIG_VAR([default-abi], [lp64])
 fi
+
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
index 4a182bd..579b6a5 100644
--- a/sysdeps/aarch64/fpu/Makefile
+++ b/sysdeps/aarch64/fpu/Makefile
@@ -12,3 +12,27 @@ CFLAGS-s_fmaxf.c += -ffinite-math-only
 CFLAGS-s_fmin.c += -ffinite-math-only
 CFLAGS-s_fminf.c += -ffinite-math-only
 endif
+
+ifeq ($(subdir),mathvec)
+CFLAGS-libmvec_double_vlen2_exp.c += -march=armv8-a+simd -fno-math-errno
+CFLAGS-libmvec_float_vlen4_expf.c += -march=armv8-a+simd -fno-math-errno
+CFLAGS-libmvec_exp_data.c += -march=armv8-a+simd -fno-math-errno
+CFLAGS-libmvec_exp2f_data.c += -march=armv8-a+simd -fno-math-errno
+
+libmvec-support += libmvec_double_vlen2_exp
+libmvec-support += libmvec_float_vlen4_expf
+libmvec-support += libmvec_exp_data
+libmvec-support += libmvec_exp2f_data
+
+# If I do not add a static routine I do not get libmvec_nonshared.a
+# installed and GCC will fail to link when it cannot find it.
+libmvec-static-only-routines += libmvec_dummy
+endif
+
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+libmvec-tests += double-vlen2 float-vlen4
+double-vlen2-funcs = exp
+float-vlen4-funcs = exp
+endif
+endif
diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
index e69de29..9fe90ba 100644
--- a/sysdeps/aarch64/fpu/Versions
+++ b/sysdeps/aarch64/fpu/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVnN2v___exp_finite; _ZGVnN2v_exp; _ZGVnN4v___expf_finite; _ZGVnN4v_expf;
+  }
+}
diff --git a/sysdeps/aarch64/fpu/bits/math-vector.h b/sysdeps/aarch64/fpu/bits/math-vector.h
index e69de29..4c34159 100644
--- a/sysdeps/aarch64/fpu/bits/math-vector.h
+++ b/sysdeps/aarch64/fpu/bits/math-vector.h
@@ -0,0 +1,43 @@
+/* Platform-specific SIMD declarations of math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _MATH_H
+# error "Never include directly;\
+ include instead."
+#endif
+
+/* Get default empty definitions for simd declarations.  */
+#include
+
+#if defined __FAST_MATH__
+# if defined _OPENMP && _OPENMP >= 201307
+/* OpenMP case.  */
+# define __DECL_SIMD_AARCH64 _Pragma ("omp declare simd notinbranch")
+# elif __GNUC_PREREQ (6,0)
+/* W/o OpenMP use GCC 6.* __attribute__ ((__simd__)).  */
+# define __DECL_SIMD_AARCH64 __attribute__ ((__simd__ ("notinbranch")))
+# endif
+
+# ifdef __DECL_SIMD_AARCH64
+# undef __DECL_SIMD_exp
+# define __DECL_SIMD_exp __DECL_SIMD_AARCH64
+# undef __DECL_SIMD_expf
+# define __DECL_SIMD_expf __DECL_SIMD_AARCH64
+
+# endif
+#endif
diff --git a/sysdeps/aarch64/fpu/finclude/math-vector-fortran.h b/sysdeps/aarch64/fpu/finclude/math-vector-fortran.h
index e69de29..e42bed4 100644
--- a/sysdeps/aarch64/fpu/finclude/math-vector-fortran.h
+++ b/sysdeps/aarch64/fpu/finclude/math-vector-fortran.h
@@ -0,0 +1,20 @@
+! Platform-specific declarations of SIMD math functions for Fortran. -*- f90 -*-
+!   Copyright (C) 2019 Free Software Foundation, Inc.
+!   This file is part of the GNU C Library.
+!
+!   The GNU C Library is free software; you can redistribute it and/or
+!   modify it under the terms of the GNU Lesser General Public
+!   License as published by the Free Software Foundation; either
+!   version 2.1 of the License, or (at your option) any later version.
+!
+!   The GNU C Library is distributed in the hope that it will be useful,
+!   but WITHOUT ANY WARRANTY; without even the implied warranty of
+!   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+!   Lesser General Public License for more details.
+!
+!   You should have received a copy of the GNU Lesser General Public
+!   License along with the GNU C Library; if not, see
+!   .
+
+!GCC$ builtin (exp) attributes simd (notinbranch) if('aarch64')
+!GCC$ builtin (expf) attributes simd (notinbranch) if('aarch64')
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.c b/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.c
index e69de29..fecb0ad 100644
--- a/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.c
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.c
@@ -0,0 +1,95 @@
+/* Double-precision 2 element vector e^x function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* This function is based on sysdeps/ieee754/dbl-64/e_exp.c.  */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "math_config.h"
+#include "libmvec_util.h"
+
+#define N (1 << EXP_TABLE_BITS)
+#define InvLn2N __exp_data.invln2N
+#define NegLn2hiN __exp_data.negln2hiN
+#define NegLn2loN __exp_data.negln2loN
+#define Shift __exp_data.shift
+#define T __exp_data.tab
+#define C2 __exp_data.poly[5 - EXP_POLY_ORDER]
+#define C3 __exp_data.poly[6 - EXP_POLY_ORDER]
+#define C4 __exp_data.poly[7 - EXP_POLY_ORDER]
+#define C5 __exp_data.poly[8 - EXP_POLY_ORDER]
+
+#define LIMIT 700.0
+
+/* Do not inline this call.  That way _ZGVnN2v_exp has no calls to non-vector
+   functions.  This reduces the register saves that _ZGVnN2v_exp has to do.  */
+
+__attribute__((aarch64_vector_pcs, noinline)) static __Float64x2_t
+__scalar_exp(__Float64x2_t x)
+{
+  return (__Float64x2_t) { exp(x[0]), exp(x[1]) };
+}
+
+__attribute__((aarch64_vector_pcs)) __Float64x2_t
+_ZGVnN2v_exp(__Float64x2_t x)
+{
+  double h, z_0, z_1;
+  __Float64x2_t g, scale_v, tail_v, tmp_v, r_v, r2_v, kd_v;
+  __Float64x2_t NegLn2hiN_v, NegLn2loN_v, C2_v, C3_v, C4_v, C5_v;
+  uint64_t ki_0, ki_1, idx_0, idx_1;
+  uint64_t top_0, top_1, sbits_0, sbits_1;
+
+  /* If any value is larger than LIMIT, or NAN, call scalar operation.  */
+  g = __builtin_aarch64_absv2df (x);
+  h = __builtin_aarch64_reduc_smax_scal_v2df (g);
+  if (__glibc_unlikely (!(h < LIMIT)))
+    return __scalar_exp (x);
+
+  z_0 = InvLn2N * x[0];
+  z_1 = InvLn2N * x[1];
+  ki_0 = converttoint (z_0);
+  ki_1 = converttoint (z_1);
+
+  idx_0 = 2 * (ki_0 % N);
+  idx_1 = 2 * (ki_1 % N);
+  top_0 = ki_0 << (52 - EXP_TABLE_BITS);
+  top_1 = ki_1 << (52 - EXP_TABLE_BITS);
+  sbits_0 = T[idx_0 + 1] + top_0;
+  sbits_1 = T[idx_1 + 1] + top_1;
+
+  kd_v = (__Float64x2_t) { roundtoint (z_0), roundtoint (z_1) };
+  scale_v = (__Float64x2_t) { asdouble (sbits_0), asdouble (sbits_1) };
+  tail_v = (__Float64x2_t) { asdouble (T[idx_0]), asdouble (T[idx_1]) };
+  NegLn2hiN_v = (__Float64x2_t) { NegLn2hiN, NegLn2hiN };
+  NegLn2loN_v = (__Float64x2_t) { NegLn2loN, NegLn2loN };
+  C2_v = (__Float64x2_t) { C2, C2 };
+  C3_v = (__Float64x2_t) { C3, C3 };
+  C4_v = (__Float64x2_t) { C4, C4 };
+  C5_v = (__Float64x2_t) { C5, C5 };
+
+  r_v = x + kd_v * NegLn2hiN_v + kd_v * NegLn2loN_v;
+  r2_v = r_v * r_v;
+  tmp_v = tail_v + r_v + r2_v * (C2_v + r_v * C3_v) + r2_v * r2_v
+          * (C4_v + r_v * C5_v);
+  return scale_v + scale_v * tmp_v;
+}
+weak_alias (_ZGVnN2v_exp, _ZGVnN2v___exp_finite)
diff --git a/sysdeps/aarch64/fpu/libmvec_exp2f_data.c b/sysdeps/aarch64/fpu/libmvec_exp2f_data.c
index e69de29..d97ce15 100644
--- a/sysdeps/aarch64/fpu/libmvec_exp2f_data.c
+++ b/sysdeps/aarch64/fpu/libmvec_exp2f_data.c
@@ -0,0 +1,2 @@
+#include
+#include
diff --git a/sysdeps/aarch64/fpu/libmvec_exp_data.c b/sysdeps/aarch64/fpu/libmvec_exp_data.c
index e69de29..a83661b 100644
--- a/sysdeps/aarch64/fpu/libmvec_exp_data.c
+++ b/sysdeps/aarch64/fpu/libmvec_exp_data.c
@@ -0,0 +1 @@
+#include
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.c b/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.c
index e69de29..6504574 100644
--- a/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.c
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.c
@@ -0,0 +1,115 @@
+/* Single-precision 4 element vector e^x function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+/* This function is based on sysdeps/ieee754/flt-32/e_expf.c.  */
+
+#include
+#include
+#include
+#include
+#include "libmvec_util.h"
+
+#define N (1 << EXP2F_TABLE_BITS)
+#define LIMIT 80.0
+
+#define InvLn2N __exp2f_data.invln2_scaled
+#define T __exp2f_data.tab
+#define C __exp2f_data.poly_scaled
+#define SHIFT __exp2f_data.shift
+
+/* Do not inline this call.  That way _ZGVnN4v_expf has no calls to non-vector
+   functions.  This reduces the register saves that _ZGVnN4v_expf has to do.  */
+
+__attribute__((aarch64_vector_pcs,noinline)) static __Float32x4_t
+__scalar_expf (__Float32x4_t x)
+{
+  return (__Float32x4_t) { expf(x[0]), expf(x[1]), expf(x[2]), expf(x[3]) };
+}
+
+__attribute__((aarch64_vector_pcs)) __Float32x4_t
+_ZGVnN4v_expf(__Float32x4_t x)
+{
+  __Float32x4_t g, result;
+  __Float64x2_t xd_0, xd_1, vInvLn2N, z_0, z_1, vkd_0, vkd_1, r_0, r_1;
+  __Float64x2_t vs_0, vs_1, c0, c1, c2, y_0, y_1, r2_0, r2_1, one;
+  uint64_t ki_0, ki_1, ki_2, ki_3, t_0, t_1, t_2, t_3;
+  double s_0, s_1, s_2, s_3;
+  float f;
+
+  /* If any value is larger than LIMIT, or NAN, call scalar operation.  */
+  g = __builtin_aarch64_absv4sf (x);
+  f = __builtin_aarch64_reduc_smax_scal_v4sf (g);
+  if (__glibc_unlikely (!(f < LIMIT)))
+    return __scalar_expf (x);
+
+  xd_0 = get_lo_and_extend (x);
+  xd_1 = get_hi_and_extend (x);
+
+  vInvLn2N = (__Float64x2_t) { InvLn2N, InvLn2N };
+  /* x*N/Ln2 = k + r with r in [-1/2, 1/2] and int k.  */
+  z_0 = vInvLn2N * xd_0;
+  z_1 = vInvLn2N * xd_1;
+
+  /* Round and convert z to int, the result is in [-150*N, 128*N] and
+     ideally ties-to-even rule is used, otherwise the magnitude of r
+     can be bigger which gives larger approximation error.  */
+  vkd_0 = __builtin_aarch64_roundv2df (z_0);
+  vkd_1 = __builtin_aarch64_roundv2df (z_1);
+  r_0 = z_0 - vkd_0;
+  r_1 = z_1 - vkd_1;
+
+  ki_0 = (long) vkd_0[0];
+  ki_1 = (long) vkd_0[1];
+  ki_2 = (long) vkd_1[0];
+  ki_3 = (long) vkd_1[1];
+
+  /* exp(x) = 2^(k/N) * 2^(r/N) ~= s * (C0*r^3 + C1*r^2 + C2*r + 1) */
+  t_0 = T[ki_0 % N];
+  t_1 = T[ki_1 % N];
+  t_2 = T[ki_2 % N];
+  t_3 = T[ki_3 % N];
+  t_0 += ki_0 << (52 - EXP2F_TABLE_BITS);
+  t_1 += ki_1 << (52 - EXP2F_TABLE_BITS);
+  t_2 += ki_2 << (52 - EXP2F_TABLE_BITS);
+  t_3 += ki_3 << (52 - EXP2F_TABLE_BITS);
+  s_0 = asdouble (t_0);
+  s_1 = asdouble (t_1);
+  s_2 = asdouble (t_2);
+  s_3 = asdouble (t_3);
+
+  vs_0 = (__Float64x2_t) { s_0, s_1 };
+  vs_1 = (__Float64x2_t) { s_2, s_3 };
+  c0 = (__Float64x2_t) { C[0], C[0] };
+  c1 = (__Float64x2_t) { C[1], C[1] };
+  c2 = (__Float64x2_t) { C[2], C[2] };
+  one = (__Float64x2_t) { 1.0, 1.0 };
+
+  z_0 = c0 * r_0 + c1;
+  z_1 = c0 * r_1 + c1;
+  r2_0 = r_0 * r_0;
+  r2_1 = r_1 * r_1;
+  y_0 = c2 * r_0 + one;
+  y_1 = c2 * r_1 + one;
+  y_0 = z_0 * r2_0 + y_0;
+  y_1 = z_1 * r2_1 + y_1;
+  y_0 = y_0 * vs_0;
+  y_1 = y_1 * vs_1;
+  result = pack_and_trunc (y_0, y_1);
+  return result;
+}
+weak_alias (_ZGVnN4v_expf, _ZGVnN4v___expf_finite)
diff --git a/sysdeps/aarch64/fpu/libmvec_util.h b/sysdeps/aarch64/fpu/libmvec_util.h
index e69de29..a127724 100644
--- a/sysdeps/aarch64/fpu/libmvec_util.h
+++ b/sysdeps/aarch64/fpu/libmvec_util.h
@@ -0,0 +1,53 @@
+/* Utility functions for Aarch64 vector functions.
+   Copyright (C) 2015-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+
+/* Copy lower 2 elements of a 4 element float vector into a 2 element
+   double vector.  */
+
+static __always_inline
+__Float64x2_t get_lo_and_extend (__Float32x4_t x)
+{
+  __Uint64x2_t tmp1 = (__Uint64x2_t) x;
+#ifdef BIG_ENDIAN
+  uint64_t tmp2 = (uint64_t) tmp1[1];
+#else
+  uint64_t tmp2 = (uint64_t) tmp1[0];
+#endif
+  return __builtin_aarch64_float_extend_lo_v2df ((__Float32x2_t) tmp2);
+}
+
+/* Copy upper 2 elements of a 4 element float vector into a 2 element
+   double vector.  */
+
+static __always_inline
+__Float64x2_t get_hi_and_extend (__Float32x4_t x)
+{
+  return __builtin_aarch64_vec_unpacks_hi_v4sf (x);
+}
+
+/* Copy a pair of 2 element double vectors into a 4 element float vector.  */
+
+static __always_inline
+__Float32x4_t pack_and_trunc (__Float64x2_t x, __Float64x2_t y)
+{
+  __Float32x2_t xx = __builtin_aarch64_float_truncate_lo_v2sf (x);
+  __Float32x2_t yy = __builtin_aarch64_float_truncate_lo_v2sf (y);
+  return (__builtin_aarch64_combinev2sf (xx, yy));
+}
diff --git a/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c b/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
index e69de29..331a51e 100644
--- a/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
@@ -0,0 +1,23 @@
+/* Wrapper part of tests for aarch64 double vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "test-double-vlen2.h"
+
+#define VEC_TYPE __Float64x2_t
+
+VECTOR_WRAPPER (WRAPPER_NAME (exp), _ZGVnN2v_exp)
diff --git a/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c b/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
index e69de29..e3feef6 100644
--- a/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
@@ -0,0 +1,23 @@
+/* Wrapper part of tests for float aarch64 vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "test-float-vlen4.h"
+
+#define VEC_TYPE __Float32x4_t
+
+VECTOR_WRAPPER (WRAPPER_NAME (expf), _ZGVnN4v_expf)
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 585e5bb..1ed4af9 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1601,6 +1601,12 @@ float: 1
 idouble: 1
 ifloat: 1
 
+Function: "exp_vlen2":
+double: 1
+
+Function: "exp_vlen4":
+float: 1
+
 Function: "expm1":
 double: 1
 float: 1
diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
index e69de29..b7431a3 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
@@ -0,0 +1,4 @@
+GLIBC_2.30 _ZGVnN2v___exp_finite F
+GLIBC_2.30 _ZGVnN2v_exp F
+GLIBC_2.30 _ZGVnN4v___expf_finite F
+GLIBC_2.30 _ZGVnN4v_expf F