[v3,0/1] Add vector math function acos/acosf to libmvec

Message ID 20211215185413.4137536-1-skpgkp2@gmail.com
Headers
Series Add vector math function acos/acosf to libmvec |

Message

Sunil Pandey Dec. 15, 2021, 6:54 p.m. UTC
  This is single function patch as suggested.  We incorporated following
changes in v3.  Rest of the libmvec patches will follow similar change.
Let me know if it looks reasonable?

Changes from v2:
-  Keep cfi_escape for callee saved registers only.
-  Add DW_CFA_expression comments corresponding to each cfi_escape.
-  Define macro corresponding to each numeric data table offset.
-  Replace numeric data table offset with macro name.
-  Add data table structure definition as comments.
-  Restructure data table and add comments to each data field value.
-  Rename numeric sequential labels with meaningful label name.
-  Add more comments to labels as well as on call sites.
-  Internal special value processing paths replaced by calls to standard
   scalar math functions, makes code more compact and aligned with
   previous libmvec submission.
  
Changes from v1:
-  Add ISA specific sections for all libmvec functions.
-  Add libmvec functions to math-vector-fortran.h.
-  Change label to sequential.
-  Fix function name in GNU header plate.

This patch implements acos/acosf vector math functions containing
SSE, AVX, AVX2 and AVX512 versions for libmvec as per vector ABI.
It also contains accuracy and ABI tests with regenerated ulps.

Sunil K Pandey (1):
  x86-64: Add vector acos/acosf implementation to libmvec

 bits/libm-simd-decl-stubs.h                   |  11 +
 math/bits/mathcalls.h                         |   2 +-
 .../unix/sysv/linux/x86_64/libmvec.abilist    |   8 +
 sysdeps/x86/fpu/bits/math-vector.h            |   4 +
 .../x86/fpu/finclude/math-vector-fortran.h    |   4 +
 sysdeps/x86_64/fpu/Makeconfig                 |   1 +
 sysdeps/x86_64/fpu/Versions                   |   4 +
 sysdeps/x86_64/fpu/libm-test-ulps             |  20 +
 .../fpu/multiarch/ifunc-mathvec-avx512-skx.h  |  39 ++
 .../fpu/multiarch/svml_d_acos2_core-sse2.S    |  20 +
 .../x86_64/fpu/multiarch/svml_d_acos2_core.c  |  27 ++
 .../fpu/multiarch/svml_d_acos2_core_sse4.S    | 399 ++++++++++++++++++
 .../fpu/multiarch/svml_d_acos4_core-sse.S     |  20 +
 .../x86_64/fpu/multiarch/svml_d_acos4_core.c  |  27 ++
 .../fpu/multiarch/svml_d_acos4_core_avx2.S    | 368 ++++++++++++++++
 .../fpu/multiarch/svml_d_acos8_core-avx2.S    |  20 +
 .../x86_64/fpu/multiarch/svml_d_acos8_core.c  |  27 ++
 .../fpu/multiarch/svml_d_acos8_core_avx512.S  | 386 +++++++++++++++++
 .../fpu/multiarch/svml_s_acosf16_core-avx2.S  |  20 +
 .../fpu/multiarch/svml_s_acosf16_core.c       |  28 ++
 .../multiarch/svml_s_acosf16_core_avx512.S    | 332 +++++++++++++++
 .../fpu/multiarch/svml_s_acosf4_core-sse2.S   |  20 +
 .../x86_64/fpu/multiarch/svml_s_acosf4_core.c |  28 ++
 .../fpu/multiarch/svml_s_acosf4_core_sse4.S   | 351 +++++++++++++++
 .../fpu/multiarch/svml_s_acosf8_core-sse.S    |  20 +
 .../x86_64/fpu/multiarch/svml_s_acosf8_core.c |  28 ++
 .../fpu/multiarch/svml_s_acosf8_core_avx2.S   | 332 +++++++++++++++
 sysdeps/x86_64/fpu/svml_d_acos2_core.S        |  29 ++
 sysdeps/x86_64/fpu/svml_d_acos4_core.S        |  29 ++
 sysdeps/x86_64/fpu/svml_d_acos4_core_avx.S    |  25 ++
 sysdeps/x86_64/fpu/svml_d_acos8_core.S        |  25 ++
 sysdeps/x86_64/fpu/svml_s_acosf16_core.S      |  25 ++
 sysdeps/x86_64/fpu/svml_s_acosf4_core.S       |  29 ++
 sysdeps/x86_64/fpu/svml_s_acosf8_core.S       |  29 ++
 sysdeps/x86_64/fpu/svml_s_acosf8_core_avx.S   |  25 ++
 .../x86_64/fpu/test-double-libmvec-acos-avx.c |   1 +
 .../fpu/test-double-libmvec-acos-avx2.c       |   1 +
 .../fpu/test-double-libmvec-acos-avx512f.c    |   1 +
 sysdeps/x86_64/fpu/test-double-libmvec-acos.c |   3 +
 .../x86_64/fpu/test-double-vlen2-wrappers.c   |   1 +
 .../fpu/test-double-vlen4-avx2-wrappers.c     |   1 +
 .../x86_64/fpu/test-double-vlen4-wrappers.c   |   1 +
 .../x86_64/fpu/test-double-vlen8-wrappers.c   |   1 +
 .../x86_64/fpu/test-float-libmvec-acosf-avx.c |   1 +
 .../fpu/test-float-libmvec-acosf-avx2.c       |   1 +
 .../fpu/test-float-libmvec-acosf-avx512f.c    |   1 +
 sysdeps/x86_64/fpu/test-float-libmvec-acosf.c |   3 +
 .../x86_64/fpu/test-float-vlen16-wrappers.c   |   1 +
 .../x86_64/fpu/test-float-vlen4-wrappers.c    |   1 +
 .../fpu/test-float-vlen8-avx2-wrappers.c      |   1 +
 .../x86_64/fpu/test-float-vlen8-wrappers.c    |   1 +
 51 files changed, 2781 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512-skx.h
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos2_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos2_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos2_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos4_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos4_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos8_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_acos8_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf16_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf16_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf16_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf4_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf4_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf8_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_acosf8_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_acos2_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_acos4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_acos4_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_acos8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_acosf16_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_acosf4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_acosf8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_acosf8_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-acos-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-acos-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-acos-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-acos.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-acosf-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-acosf-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-acosf-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-acosf.c