[v3,0/1] Add vector math function tan/tanf to libmvec

Message ID 20211230000358.3894697-1-skpgkp2@gmail.com

Sunil K Pandey Dec. 30, 2021, 12:03 a.m. UTC
This patch may look big, but 74% of it is data tables.

Changes from v2:
-  Replace big negative rip offset with Table Lookup Bias.
-  Remove more unused data table fields.
-  Include LOE (live on exit) register info.
-  Apply more peephole optimizations.
-  Optimize load of all bits set into ZMM register.
-  Replace 3 kmovw + andl with a kandw instruction.
-  Restructure data table and remove unused fields.
-  Fix data table and field alignment according to ISA.
-  Fix data offset according to ISA.
-  Remove exit call dead code.
-  Remove unnecessary save/restore.
-  Keep cfi_escape for callee saved registers only.
-  Add DW_CFA_expression comments corresponding to each cfi_escape.
-  Define macro corresponding to each numeric data table offset.
-  Replace numeric data table offset with macro name.
-  Add data table structure definition as comments.
-  Restructure data table and add comments to each data field value.
-  Rename numeric sequential labels with meaningful label name.
-  Add more comments to labels as well as on call sites.
-  Replace internal special-value processing paths with calls to
   standard scalar math functions, which makes the code more compact
   and consistent with previous libmvec submissions.
  
Changes from v1:
-  Add ISA specific sections for all libmvec functions.
-  Add libmvec functions to math-vector-fortran.h.
-  Change labels to sequential numbers.
-  Fix function name in GNU header plate.

This patch implements the tan/tanf vector math functions for libmvec,
with SSE, AVX, AVX2 and AVX512 versions as per the vector ABI.
It also contains accuracy and ABI tests with regenerated ulps.

Sunil K Pandey (1):
  x86-64: Add vector tan/tanf implementation to libmvec

 bits/libm-simd-decl-stubs.h                   |   11 +
 math/bits/mathcalls.h                         |    2 +-
 .../unix/sysv/linux/x86_64/libmvec.abilist    |    8 +
 sysdeps/x86/fpu/bits/math-vector.h            |    4 +
 .../x86/fpu/finclude/math-vector-fortran.h    |    4 +
 sysdeps/x86_64/fpu/Makeconfig                 |    1 +
 sysdeps/x86_64/fpu/Versions                   |    2 +
 sysdeps/x86_64/fpu/libm-test-ulps             |   20 +
 .../fpu/multiarch/svml_d_tan2_core-sse2.S     |   20 +
 .../x86_64/fpu/multiarch/svml_d_tan2_core.c   |   27 +
 .../fpu/multiarch/svml_d_tan2_core_sse4.S     | 6259 +++++++++++++++++
 .../fpu/multiarch/svml_d_tan4_core-sse.S      |   20 +
 .../x86_64/fpu/multiarch/svml_d_tan4_core.c   |   27 +
 .../fpu/multiarch/svml_d_tan4_core_avx2.S     | 6227 ++++++++++++++++
 .../fpu/multiarch/svml_d_tan8_core-avx2.S     |   20 +
 .../x86_64/fpu/multiarch/svml_d_tan8_core.c   |   27 +
 .../fpu/multiarch/svml_d_tan8_core_avx512.S   | 2733 +++++++
 .../fpu/multiarch/svml_s_tanf16_core-avx2.S   |   20 +
 .../x86_64/fpu/multiarch/svml_s_tanf16_core.c |   28 +
 .../fpu/multiarch/svml_s_tanf16_core_avx512.S |  927 +++
 .../fpu/multiarch/svml_s_tanf4_core-sse2.S    |   20 +
 .../x86_64/fpu/multiarch/svml_s_tanf4_core.c  |   28 +
 .../fpu/multiarch/svml_s_tanf4_core_sse4.S    | 2600 +++++++
 .../fpu/multiarch/svml_s_tanf8_core-sse.S     |   20 +
 .../x86_64/fpu/multiarch/svml_s_tanf8_core.c  |   28 +
 .../fpu/multiarch/svml_s_tanf8_core_avx2.S    | 2595 +++++++
 sysdeps/x86_64/fpu/svml_d_tan2_core.S         |   29 +
 sysdeps/x86_64/fpu/svml_d_tan4_core.S         |   29 +
 sysdeps/x86_64/fpu/svml_d_tan4_core_avx.S     |   25 +
 sysdeps/x86_64/fpu/svml_d_tan8_core.S         |   25 +
 sysdeps/x86_64/fpu/svml_s_tanf16_core.S       |   25 +
 sysdeps/x86_64/fpu/svml_s_tanf4_core.S        |   29 +
 sysdeps/x86_64/fpu/svml_s_tanf8_core.S        |   29 +
 sysdeps/x86_64/fpu/svml_s_tanf8_core_avx.S    |   25 +
 .../x86_64/fpu/test-double-libmvec-tan-avx.c  |    1 +
 .../x86_64/fpu/test-double-libmvec-tan-avx2.c |    1 +
 .../fpu/test-double-libmvec-tan-avx512f.c     |    1 +
 sysdeps/x86_64/fpu/test-double-libmvec-tan.c  |    3 +
 .../x86_64/fpu/test-double-vlen2-wrappers.c   |    1 +
 .../fpu/test-double-vlen4-avx2-wrappers.c     |    1 +
 .../x86_64/fpu/test-double-vlen4-wrappers.c   |    1 +
 .../x86_64/fpu/test-double-vlen8-wrappers.c   |    1 +
 .../x86_64/fpu/test-float-libmvec-tanf-avx.c  |    1 +
 .../x86_64/fpu/test-float-libmvec-tanf-avx2.c |    1 +
 .../fpu/test-float-libmvec-tanf-avx512f.c     |    1 +
 sysdeps/x86_64/fpu/test-float-libmvec-tanf.c  |    3 +
 .../x86_64/fpu/test-float-vlen16-wrappers.c   |    1 +
 .../x86_64/fpu/test-float-vlen4-wrappers.c    |    1 +
 .../fpu/test-float-vlen8-avx2-wrappers.c      |    1 +
 .../x86_64/fpu/test-float-vlen8-wrappers.c    |    1 +
 50 files changed, 21913 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan2_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan2_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan2_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan4_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan4_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan8_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan8_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf16_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf16_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf16_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf4_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf4_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf8_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf8_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan2_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan4_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf16_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf8_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf.c