[RFC,V4] Enable libmvec support for RISC-V

Message ID 20240415072108.3741341-1-shiyulong@iscas.ac.cn
State RFC
Series [RFC,V4] Enable libmvec support for RISC-V

Checks

Context                                           Check    Description
redhat-pt-bot/TryBot-apply_patch                  success  Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-arm      success  Testing passed
redhat-pt-bot/TryBot-32bit                        success  Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64  success  Testing passed

Commit Message

yulong April 15, 2024, 7:21 a.m. UTC
  From: yulong <shiyulong@iscas.ac.cn>

Diff: Change the version from GLIBC_2.39 to GLIBC_2.40.
This patch tries to enable libmvec on RISC-V. I have also demonstrated
how this all fits together by adding an implementation of vector cos.
This patch is a first attempt, and we hope to receive valuable comments.

Thanks,
yulong

---
 sysdeps/riscv/configure                       |   4 +
 sysdeps/riscv/configure.ac                    |   4 +
 sysdeps/riscv/rvd/Makefile                    |   5 +
 sysdeps/riscv/rvd/Versions                    |   5 +
 sysdeps/riscv/rvd/bits/math-vector.h          |  29 ++++
 sysdeps/riscv/rvd/cos.c                       |  94 ++++++++++++
 sysdeps/riscv/rvd/math_private.h              |  42 ++++++
 sysdeps/riscv/rvd/v_math.h                    | 139 ++++++++++++++++++
 sysdeps/riscv/rvd/vecmath_config.h            |  33 +++++
 sysdeps/unix/sysv/linux/riscv/libmvec.abilist |   1 +
 10 files changed, 356 insertions(+)
 mode change 100644 => 100755 sysdeps/riscv/configure
 create mode 100644 sysdeps/riscv/rvd/Makefile
 create mode 100644 sysdeps/riscv/rvd/Versions
 create mode 100644 sysdeps/riscv/rvd/bits/math-vector.h
 create mode 100644 sysdeps/riscv/rvd/cos.c
 create mode 100644 sysdeps/riscv/rvd/math_private.h
 create mode 100644 sysdeps/riscv/rvd/v_math.h
 create mode 100644 sysdeps/riscv/rvd/vecmath_config.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/libmvec.abilist
  

Comments

Jeff Law April 25, 2024, 5:07 a.m. UTC | #1
On 4/15/24 1:21 AM, shiyulong@iscas.ac.cn wrote:
> From: yulong <shiyulong@iscas.ac.cn>
> 
> Diff: Chande the version from GLIBC_2.39 to GLIBC_2.40.
> This patch tries to enable libmvec on RISC-V. I also have demonstrated
> how this all fits together by adding implementations for vector cos.
> This patch is a try and we hope to receive valuable comments.
Just an FYI -- Palmer's team over at Rivos has implementations for a 
number of routines that would fit into libmvec.  You might reach out to 
Ping Tak Peter Tang  <ptpt@rivosinc.com> for information on his 
implementation.

> https://github.com/rivosinc/veclibm/


Their implementations may provide good guidance on performant 
implementations of various routines that libmvec typically provides.

jeff
  
yulong April 29, 2024, 1:12 a.m. UTC | #2
Thanks for your advice, Jeff. I'm working on a new implementation after 
reading the veclibm code you linked.
  
Palmer Dabbelt April 30, 2024, 4:26 p.m. UTC | #3
Ya, that's the idea of veclibm.  The actual functions are written in a 
way that's more suitable for some other libraries, but the core 
computational implementations should be the same.  A few of us had 
briefly talked internally about getting these into glibc.  IIUC all the 
code was written at Rivos and thus could be copyright assigned to the 
FSF and used in glibc.  We don't have time to do that right now, but if 
you're interested in helping that'd be awesome.  We'll need to be 
careful with the copyright/licensing, though.

That said, I've never really quite managed to figure out how all the 
libmvec stuff is supposed to fit together.  I'm more worried about the 
ABI side of things than the implementation, so I think starting with 
just one function to get the ABI template figured out is a reasonable way 
to go, and we can get the rest of the implementations ported over next.  
The first thing that jumps out on the ABI side of things is cos() taking 
EMUL=2 types.  I'm not sure if there's a reason for that, but it seems 
we'd want EMUL=1 to fit more data in the argument registers?

Also, I think some of this can be split out: the roundtoint/converttoint 
isn't really a libmvec thing (see 
https://inbox.sourceware.org/libc-alpha/20220803174258.4235-1-palmer@rivosinc.com/, 
which fails some tests), and ptr_barrier() can probably be pulled out to 
something generic as it's the same as arm64's version.

I'm also only seeing draft versions of the vector intrinsics.  I know we 
merged them into GCC and usually that means things are stable, but we 
merged these pre-freeze (based on some assertions things wouldn't 
change) and things have drifted around a bit in the spec.  I think we're 
probably safe just depending on the types.  If there's no frozen version, 
we should at least write down exactly which version we're following, 
though.

Also: are there GCC patches for these?  It'd be great to be able to test 
things through the whole codegen stack so we can make sure it works.

  
yulong May 10, 2024, 1:06 p.m. UTC | #4
On 2024/5/1 0:26, Palmer Dabbelt wrote:
> Ya, that's the idea of veclibm.  The actual functions are written in a 
> way that's more suitable for some other libraries, but the core 
> computational implementations should be the same.  A few of us had 
> briefly talked internally about getting these into glibc, IIUC all the 
> code was written at Rivos and thus could be copyright assigned to the 
> FSF and used in glibc.  We don't have time to do that right now, but 
> if you're interested in helping that'd be awesome.  We'll need to be 
> careful with the copyright/licensing, though.
Thanks for your reply. I also received an email from Peter Tang. I am 
very interested in contributing to glibc.
>
> That said, I've never really quite managed to figure out how all the 
> libmvec stuff is supposed to fit together.  I'm more worried about the 
> ABI side of things than the implementation, so I think starting with 
> just one function to get the ABI template figured out is a reasonable 
> way to go and we can get the rest of the implementations ported over 
> next.  The first thing that jumps out on the ABI side of things is 
> cos() taking EMUL=2 types, I'm not sure if there's a reason for that 
> but it seems we'd want EMUL=1 to fit more data in the argument registers?
Setting EMUL=2 was just a personal experiment. I think you are right, and 
I will change it in the next version.
>
> Also, I think some of this can be split out: the 
> roundtoint/converttoint isn't really a libmvec thing (see 
> https://inbox.sourceware.org/libc-alpha/20220803174258.4235-1-palmer@rivosinc.com/, 
> which fails some tests), and ptr_barrier() can probably be pulled out 
> to something generic as it's the same as arm64's version.
>
> I'm also only seeing draft versions of the vector intrinsics.  I know 
> we merged them into GCC and usually that means things are stable, but 
> we merged these pre-freeze (based on some assertions things wouldn't 
> change) and things have drifted around a bit in the spec.  I think 
> we're probably safe just depending on the types, if there's no frozen 
> version we should at least write down exactly which version we're 
> following though.
We are currently developing based on the latest branches. Can we declare 
that we are following RVV 1.0?
>
> Also: are there GCC patches for these?  It'd be great to be able to 
> test things through the whole codegen stack so we can make sure it works.
Unfortunately, there are no patches for GCC right now. This may be a 
direction for future work.
  
Zhijin Zeng Nov. 4, 2024, 4:41 a.m. UTC | #5
Hi yulong, have you made any further progress? I have finished a new 
version of libmvec support for RISC-V, which is also based on the 
implementations by Palmer's team over at Rivos.

     https://github.com/rivosinc/veclibm/

I couldn't find a vector function name mangling scheme for RISC-V, so I 
defined one as follows. It may be incorrect, but I think it's worth discussing.

     _ZGV<x>N<y>v<v...>_<func_name>

     'x' is the LMUL: 'x' is 1, 2, 4, or 8 when the LMUL is 1, 2, 4, or 8.

     'y' is the element count, i.e. 'simdlen' in GCC.

     'v...' depends on the number of parameters: there are as many 'v' 
characters as there are parameters.

     'func_name' is the scalar function name.

This patch adds vectorized versions of the following math functions for 
RISC-V (it currently only supports VLENB <= 256, but it is easy to 
extend to larger VLENB). I have also finished a GCC patch to support 
libmvec on RISC-V.

exp/asin/atan/acos/atanh/exp10/exp2/tan/tanh/pow/sin/log/cos/acosh/asinh/atan2/expm1/tgamma/lgamma/log2/log10/cbrt/erfc/erf/cosh/sinh

Hi Palmer, I have temporarily changed the copyright information in some 
files that come from veclibm. This is not meant as a violation of your 
copyright; I simply don't know how to resolve the conflict between the 
LGPL and Apache 2.0. If you know how, please tell me so I can fix it. 
Thank you.

Zhijin Zeng


From 0eda8e538c7f7d4036d9decceb714acf3314f885 Mon Sep 17 00:00:00 2001
From: Zhijin Zeng <zhijin.zeng@spacemit.com>
Date: Thu, 31 Oct 2024 18:13:19 +0800
Subject: [PATCH] RISC-V: support vector math library for risc-v

Add RISC-V vector function mangling rules as follows:

_ZGV<x>N<y>v_<func_name>

'x' is the LMUL: 'x' is 1, 2, 4, or 8 when the LMUL is 1, 2, 4, or 8.
'y' is the element count, i.e. 'simdlen' in GCC.
'func_name' is the scalar function name.

gcc/ChangeLog:

	* config/riscv/riscv.cc (INCLUDE_STRING):
	(riscv_vector_type_p):
	(supported_simd_type):
	(lane_size):
	(riscv_simd_clone_compute_vecsize_and_simdlen):
	(riscv_simd_clone_adjust):
	(riscv_simd_clone_usable):
	(TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN):
	(TARGET_SIMD_CLONE_ADJUST):
	(TARGET_SIMD_CLONE_USABLE):
---
 gcc/config/riscv/riscv.cc | 241 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 240 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4f8e3ab931a..9b44d36b171 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 #define IN_TARGET_CODE 1
 
 #define INCLUDE_STRING
+#include <cmath>
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -33,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "insn-config.h"
 #include "insn-attr.h"
 #include "recog.h"
+#include "cgraph.h"
 #include "output.h"
 #include "alias.h"
 #include "tree.h"
@@ -5197,7 +5199,9 @@ riscv_vector_type_p (const_tree type)
 {
   /* Currently, only builtin scalabler vector type is allowed, in the future,
      more vector types may be allowed, such as GNU vector type, etc.  */
-  return riscv_vector::builtin_type_p (type);
+  if (!type)
+    return false;
+  return riscv_vector::builtin_type_p (type) || VECTOR_TYPE_P (type);
 }
 
 static unsigned int
@@ -11099,6 +11103,231 @@ riscv_get_raw_result_mode (int regno)
   return default_get_reg_raw_mode (regno);
 }
 
+/* Return true for types that could be supported as SIMD return or
+   argument types.  */
+
+static bool
+supported_simd_type (tree t)
+{
+  if (SCALAR_FLOAT_TYPE_P (t) || INTEGRAL_TYPE_P (t))
+    {
+      HOST_WIDE_INT s = tree_to_shwi (TYPE_SIZE_UNIT (t));
+      return s == 1 || s == 2 || s == 4 || s == 8;
+    }
+  return false;
+}
+
+static unsigned
+lane_size (cgraph_simd_clone_arg_type clone_arg_type, tree type)
+{
+  gcc_assert (clone_arg_type != SIMD_CLONE_ARG_TYPE_MASK);
+
+  if (INTEGRAL_TYPE_P (type)
+      || SCALAR_FLOAT_TYPE_P (type))
+    switch (TYPE_PRECISION (type) / BITS_PER_UNIT)
+      {
+      default:
+	break;
+      case 1:
+      case 2:
+      case 4:
+      case 8:
+	return TYPE_PRECISION (type);
+      }
+  gcc_unreachable ();
+}
+
+/* Implement TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN.  */
+
+static int
+riscv_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
+					struct cgraph_simd_clone *clonei,
+					tree base_type ATTRIBUTE_UNUSED,
+					int num, bool explicit_p)
+{
+  tree t, ret_type;
+  unsigned int elt_bit = 0;
+  unsigned HOST_WIDE_INT const_simdlen;
+
+  if (!TARGET_VECTOR)
+    return 0;
+
+  if (maybe_ne (clonei->simdlen, 0U)
+      && clonei->simdlen.is_constant (&const_simdlen)
+      && (const_simdlen < 2
+	  || const_simdlen > 1024
+	  || (const_simdlen & (const_simdlen - 1)) != 0))
+    {
+      if (explicit_p)
+	warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		    "unsupported simdlen %wd", const_simdlen);
+      return 0;
+    }
+
+  ret_type = TREE_TYPE (TREE_TYPE (node->decl));
+  if (TREE_CODE (ret_type) != VOID_TYPE
+      && !supported_simd_type (ret_type))
+    {
+      if (!explicit_p)
+	;
+      else if (COMPLEX_FLOAT_TYPE_P (ret_type))
+	warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		    "GCC does not currently support return type %qT "
+		    "for simd", ret_type);
+      else
+	warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		    "unsupported return type %qT for simd",
+		    ret_type);
+      return 0;
+    }
+
+  auto_vec<std::pair <tree, unsigned int>> vec_elts (clonei->nargs + 1);
+  if (TREE_CODE (ret_type) != VOID_TYPE)
+    {
+      elt_bit = lane_size (SIMD_CLONE_ARG_TYPE_VECTOR, ret_type);
+      vec_elts.safe_push (std::make_pair (ret_type, elt_bit));
+    }
+
+  int i;
+  tree type_arg_types = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
+  bool decl_arg_p = (node->definition || type_arg_types == NULL_TREE);
+  for (t = (decl_arg_p ? DECL_ARGUMENTS (node->decl) : type_arg_types), i = 0;
+       t && t != void_list_node; t = TREE_CHAIN (t), i++)
+    {
+      tree arg_type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t);
+      if (clonei->args[i].arg_type != SIMD_CLONE_ARG_TYPE_UNIFORM
+	  && !supported_simd_type (arg_type))
+	{
+	  if (!explicit_p)
+	    ;
+	  else if (COMPLEX_FLOAT_TYPE_P (arg_type))
+	    warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+			"GCC does not currently support argument type %qT "
+			"for simd", arg_type);
+	  else
+	    warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+			"unsupported argument type %qT for simd",
+			arg_type);
+	  return 0;
+	}
+      unsigned lane_bits = lane_size (clonei->args[i].arg_type, arg_type);
+      if (clonei->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR)
+	vec_elts.safe_push (std::make_pair (arg_type, lane_bits));
+      if (!elt_bit)
+	elt_bit = lane_bits;
+      if (elt_bit != lane_bits)
+	return 0;
+    }
+
+  if (!elt_bit)
+    return 0;
+
+  clonei->vecsize_mangle = 'n';
+  clonei->mask_mode = VOIDmode;
+  poly_uint64 simdlen;
+  auto_vec<poly_uint64> simdlens (2);
+
+  clonei->vecsize_int = 0;
+  clonei->vecsize_float = 0;
+
+  if ((unsigned int)TARGET_MIN_VLEN <= elt_bit)
+    return 0;
+
+  /* Keep track of the possible simdlens the clones of this function can have,
+     and check them later to see if we support them.  */
+  if (known_eq (clonei->simdlen, 0U))
+    {
+      if (TARGET_MAX_LMUL >= RVV_M1)
+	simdlens.safe_push (
+	    exact_div (poly_uint64 (TARGET_MIN_VLEN * RVV_M1), elt_bit));
+      if (TARGET_MAX_LMUL >= RVV_M2)
+	simdlens.safe_push (
+	    exact_div (poly_uint64 (TARGET_MIN_VLEN * RVV_M2), elt_bit));
+      if (TARGET_MAX_LMUL >= RVV_M4)
+	simdlens.safe_push (
+	    exact_div (poly_uint64 (TARGET_MIN_VLEN * RVV_M4), elt_bit));
+      if (TARGET_MAX_LMUL >= RVV_M8)
+	simdlens.safe_push (
+	    exact_div (poly_uint64 (TARGET_MIN_VLEN * RVV_M8), elt_bit));
+    }
+  else
+    simdlens.safe_push (clonei->simdlen);
+
+  unsigned j = 0;
+  while (j < simdlens.length ())
+    {
+      bool remove_simdlen = false;
+      for (auto elt : vec_elts)
+	if (known_gt (simdlens[j] * elt.second,
+	    TARGET_MIN_VLEN * TARGET_MAX_LMUL))
+	  {
+	    /* Don't issue a warning for every simdclone when there is no
+	       specific simdlen clause.  */
+	    if (explicit_p && maybe_ne (clonei->simdlen, 0U))
+	      warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+			  "GCC does not currently support simdlen %wd for "
+			  "type %qT",
+			  constant_lower_bound (simdlens[j]), elt.first);
+	    remove_simdlen = true;
+	    break;
+	  }
+      if (remove_simdlen)
+	simdlens.ordered_remove (j);
+      else
+	j++;
+    }
+
+  int count = simdlens.length ();
+  if (count == 0)
+    {
+      if (explicit_p && known_eq (clonei->simdlen, 0U))
+	{
+	  /* Warn the user if we can't generate any simdclone.  */
+	  //simdlen = exact_div (TARGET_MIN_VLEN * LMUL, elt_bit);
+	  warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		      "GCC does not currently support a simdclone with simdlens"
+		      " %wd and %wd for these types.",
+		      constant_lower_bound (simdlen),
+		      constant_lower_bound (simdlen*2));
+	}
+      return 0;
+    }
+
+  gcc_assert (num < count);
+  clonei->vecsize_mangle = std::exp2 (num) + '0';
+  clonei->simdlen = simdlens[num];
+  return count;
+}
+
+/* Implement TARGET_SIMD_CLONE_ADJUST.  */
+
+static void
+riscv_simd_clone_adjust (struct cgraph_node *node)
+{
+  tree t = TREE_TYPE (node->decl);
+  TYPE_ATTRIBUTES (t) = make_attribute ("riscv_vector_cc", "default",
+					TYPE_ATTRIBUTES (t));
+}
+
+/* Implement TARGET_SIMD_CLONE_USABLE.  */
+
+static int
+riscv_simd_clone_usable (struct cgraph_node *node)
+{
+  switch (node->simdclone->vecsize_mangle)
+    {
+    case '1':
+    case '2':
+    case '4':
+    case '8':
+      if (!TARGET_VECTOR)
+	return -1;
+      return 0;
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11451,6 +11680,16 @@ riscv_get_raw_result_mode (int regno)
 #undef TARGET_GET_RAW_RESULT_MODE
 #define TARGET_GET_RAW_RESULT_MODE riscv_get_raw_result_mode
 
+#undef TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN
+#define TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN \
+  riscv_simd_clone_compute_vecsize_and_simdlen
+
+#undef TARGET_SIMD_CLONE_ADJUST
+#define TARGET_SIMD_CLONE_ADJUST riscv_simd_clone_adjust
+
+#undef TARGET_SIMD_CLONE_USABLE
+#define TARGET_SIMD_CLONE_USABLE riscv_simd_clone_usable
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-riscv.h"
  
yulong Nov. 5, 2024, 3:06 a.m. UTC | #6
Hi, Zhijin Zeng:
Thank you for your contribution.

I am still working on this, but I have not pushed it upstream yet 
because I have been busy with an urgent project recently. Once that is 
done, I will send the patch upstream as soon as possible.

Thanks!

yulong

  

Patch

diff --git a/sysdeps/riscv/configure b/sysdeps/riscv/configure
old mode 100644
new mode 100755
index c8f01709f8..a6d0b4becb
--- a/sysdeps/riscv/configure
+++ b/sysdeps/riscv/configure
@@ -80,3 +80,7 @@  if test "$libc_cv_static_pie_on_riscv" = yes; then
   printf "%s\n" "#define SUPPORT_STATIC_PIE 1" >>confdefs.h
 
 fi
+
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/riscv/configure.ac b/sysdeps/riscv/configure.ac
index ee3d1ed014..b1c1105baa 100644
--- a/sysdeps/riscv/configure.ac
+++ b/sysdeps/riscv/configure.ac
@@ -43,3 +43,7 @@  EOF
 if test "$libc_cv_static_pie_on_riscv" = yes; then
   AC_DEFINE(SUPPORT_STATIC_PIE)
 fi
+
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
diff --git a/sysdeps/riscv/rvd/Makefile b/sysdeps/riscv/rvd/Makefile
new file mode 100644
index 0000000000..1adb2ee582
--- /dev/null
+++ b/sysdeps/riscv/rvd/Makefile
@@ -0,0 +1,5 @@ 
+libmvec-supported-funcs = cos
+
+ifeq ($(subdir),mathvec)
+libmvec-support = $(addprefix d,$(libmvec-supported-funcs))
+endif
diff --git a/sysdeps/riscv/rvd/Versions b/sysdeps/riscv/rvd/Versions
new file mode 100644
index 0000000000..0fd283329c
--- /dev/null
+++ b/sysdeps/riscv/rvd/Versions
@@ -0,0 +1,5 @@ 
+libmvec {
+  GLIBC_2.40 {
+    _ZGVnN2v_cos;
+  }
+}
diff --git a/sysdeps/riscv/rvd/bits/math-vector.h b/sysdeps/riscv/rvd/bits/math-vector.h
new file mode 100644
index 0000000000..b34ffc9bc1
--- /dev/null
+++ b/sysdeps/riscv/rvd/bits/math-vector.h
@@ -0,0 +1,29 @@ 
+/* Platform-specific SIMD declarations of math functions.
+
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_H
+#  error "Never include <bits/math-vector.h> directly;\
+ include <math.h> instead."
+#endif
+
+#if defined __riscv
+# define __DECL_RVV_RISCV _Pragma
+# undef __DECL_RVV_cos
+# define __DECL_RVV_cos __DECL_RVV_RISCV
+#endif
diff --git a/sysdeps/riscv/rvd/cos.c b/sysdeps/riscv/rvd/cos.c
new file mode 100644
index 0000000000..1806acd629
--- /dev/null
+++ b/sysdeps/riscv/rvd/cos.c
@@ -0,0 +1,94 @@ 
+/* Double-precision vector cos function.
+
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include "v_math.h"
+
+
+static const struct data
+{
+  vfloat64m2_t poly[7];
+  vfloat64m2_t range_val, shift, inv_pi, half_pi, pi_1, pi_2, pi_3;
+} data = {
+  /* Worst-case error is 3.3 ulp in [-pi/2, pi/2].  */
+  .poly = { V2 (-0x1.555555555547bp-3), V2 (0x1.1111111108a4dp-7),
+	    V2 (-0x1.a01a019936f27p-13), V2 (0x1.71de37a97d93ep-19),
+	    V2 (-0x1.ae633919987c6p-26), V2 (0x1.60e277ae07cecp-33),
+	    V2 (-0x1.9e9540300a1p-41) },
+  .inv_pi = V2 (0x1.45f306dc9c883p-2),
+  .half_pi = V2 (0x1.921fb54442d18p+0),
+  .pi_1 = V2 (0x1.921fb54442d18p+1),
+  .pi_2 = V2 (0x1.1a62633145c06p-53),
+  .pi_3 = V2 (0x1.c1cd129024e09p-106),
+  .shift = V2 (0x1.8p52),
+  .range_val = V2 (0x1p23)
+};
+
+#define C(i) d->poly[i]
+
+static vfloat64m2_t NOINLINE
+special_case (vfloat64m2_t x, vfloat64m2_t y, vuint64m2_t odd, vuint64m2_t cmp)
+{
+  y = vreinterpret_v_u64m2_f64m2 (vor (vreinterpret_v_f64m2_u64m2 (y), odd, 1));
+  return v_call_f64 (cos, x, y, cmp);
+}
+
+vfloat64m2_t V_NAME_D1 (cos) (vfloat64m2_t x)
+{
+  const struct data *d = ptr_barrier (&data);
+  vfloat64m2_t n, r, r2, r3, r4, t1, t2, t3, y;
+  vuint64m2_t odd, cmp;
+
+  r = vfabs_v_f64m2 (x, 2);
+  cmp = (vuint64m2_t) vmsgeu (vreinterpret_v_f64m2_u64m2 (r),
+		   vreinterpret_v_f64m2_u64m2 (d->range_val));
+  if (__glibc_unlikely (v_any_u64 (cmp)))
+    /* If fenv exceptions are to be triggered correctly, set any special lanes
+       to 1 (which is neutral w.r.t. fenv). These lanes will be fixed by
+       special-case handler later.  */
+    r = vmsltu (cmp, v_f64 (1.0), r);
+
+  /* n = rint((|x|+pi/2)/pi) - 0.5.  */
+  n = vfmadd (d->shift, d->inv_pi, vfadd (r, d->half_pi,2), 2);
+  odd = vshlq_n_u64 (vreinterpret_v_f64m2_u64m2 (n), 63);
+  n = vfsub (n, d->shift, 2);
+  n = vfsub (n, v_f64 (0.5), 2);
+
+  /* r = |x| - n*pi  (range reduction into -pi/2 .. pi/2).  */
+  r = vfmsub (r, d->pi_1, n, 2);
+  r = vfmsub (r, d->pi_2, n, 2);
+  r = vfmsub (r, d->pi_3, n, 2);
+
+  /* sin(r) poly approx.  */
+  r2 = vfmul (r, r, 2);
+  r3 = vfmul (r2, r, 2);
+  r4 = vfmul (r2, r2, 2);
+
+  t1 = vfmadd (C (4), C (5), r2, 2);
+  t2 = vfmadd (C (2), C (3), r2, 2);
+  t3 = vfmadd (C (0), C (1), r2, 2);
+
+  y = vfmadd (t1, C (6), r4, 2);
+  y = vfmadd (t2, y, r4, 2);
+  y = vfmadd (t3, y, r4, 2);
+  y = vfmadd (r, y, r3, 2);
+
+  if (__glibc_unlikely (v_any_u64 (cmp)))
+    return special_case (x, y, odd, cmp);
+  return vreinterpretq_f64_u64 (vor (vreinterpret_v_f64m2_u64m2 (y), odd, 2));
+}
diff --git a/sysdeps/riscv/rvd/math_private.h b/sysdeps/riscv/rvd/math_private.h
new file mode 100644
index 0000000000..655a4dcd55
--- /dev/null
+++ b/sysdeps/riscv/rvd/math_private.h
@@ -0,0 +1,42 @@ 
+/* Configure optimized libm functions.  RISC-V version.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef RISCV_MATH_PRIVATE_H
+#define RISCV_MATH_PRIVATE_H 1
+
+#include <stdint.h>
+#include <math.h>
+
+/* Use inline round and lround instructions.  */
+#define TOINT_INTRINSICS 1
+
+static inline double_t
+roundtoint (double_t x)
+{
+  return round (x);
+}
+
+static inline int32_t
+converttoint (double_t x)
+{
+  return lround (x);
+}
+
+#include_next <math_private.h>
+
+#endif
diff --git a/sysdeps/riscv/rvd/v_math.h b/sysdeps/riscv/rvd/v_math.h
new file mode 100644
index 0000000000..d2e821aeb2
--- /dev/null
+++ b/sysdeps/riscv/rvd/v_math.h
@@ -0,0 +1,139 @@ 
+/* Utilities for RISC-V Vector (RVV) libmvec routines.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _V_MATH_H
+#define _V_MATH_H
+
+#include <riscv_vector.h>
+#include "vecmath_config.h"
+
+#define V_NAME_D1(fun) _ZGVnN2v_##fun
+
+/* Shorthand helpers for declaring constants.  */
+#define V2(X) { X, X }
+#define V4(X) { X, X, X, X }
+#define V8(X) { X, X, X, X, X, X, X, X }
+
+static inline vfloat32m4_t
+v_f32 (float x)
+{
+  return (vfloat32m4_t) V4 (x);
+}
+static inline vuint32m4_t
+v_u32 (uint32_t x)
+{
+  return (vuint32m4_t) V4 (x);
+}
+static inline vint32m4_t
+v_s32 (int32_t x)
+{
+  return (vint32m4_t) V4 (x);
+}
+
+/* True if any element of a vector compare result is non-zero.  */
+static inline int
+v_any_u32 (vuint32m4_t x)
+{
+  /* assume elements in x are either 0 or -1u.  */
+  return vpaddd_u64 (vreinterpret_v_u64m2_u32m2 (x)) != 0;
+}
+static inline int
+v_any_u32h (vuint32m2_t x)
+{
+  return vget_lane_u64 (vreinterpret_v_u32m2_u64m2 (x), 0) != 0;
+}
+static inline vfloat32m4_t
+v_lookup_f32 (const float *tab, vuint32m4_t idx)
+{
+  return (vfloat32m4_t){ tab[idx[0]], tab[idx[1]], tab[idx[2]], tab[idx[3]] };
+}
+static inline vuint32m4_t
+v_lookup_u32 (const uint32_t *tab, vuint32m4_t idx)
+{
+  return (vuint32m4_t){ tab[idx[0]], tab[idx[1]], tab[idx[2]], tab[idx[3]] };
+}
+static inline vfloat32m4_t
+v_call_f32 (float (*f) (float), vfloat32m4_t x, vfloat32m4_t y, vuint32m4_t p)
+{
+  return (vfloat32m4_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1],
+			p[2] ? f (x[2]) : y[2], p[3] ? f (x[3]) : y[3] };
+}
+static inline vfloat32m4_t
+v_call2_f32 (float (*f) (float, float), vfloat32m4_t x1, vfloat32m4_t x2,
+	     vfloat32m4_t y, vuint32m4_t p)
+{
+  return (vfloat32m4_t){ p[0] ? f (x1[0], x2[0]) : y[0],
+			p[1] ? f (x1[1], x2[1]) : y[1],
+			p[2] ? f (x1[2], x2[2]) : y[2],
+			p[3] ? f (x1[3], x2[3]) : y[3] };
+}
+
+static inline vfloat64m2_t
+v_f64 (double x)
+{
+  return (vfloat64m2_t) V2 (x);
+}
+static inline vuint64m2_t
+v_u64 (uint64_t x)
+{
+  return (vuint64m2_t) V2 (x);
+}
+static inline vint64m2_t
+v_s64 (int64_t x)
+{
+  return (vint64m2_t) V2 (x);
+}
+
+/* True if any element of a vector compare result is non-zero.  */
+static inline int
+v_any_u64 (vuint64m1_t x)
+{
+  /* assume elements in x are either 0 or -1u.  */
+  return vpaddd_u64 (x) != 0;
+}
+/* True if all elements of a vector compare result are set.  */
+static inline int
+v_all_u64 (vuint64m1_t x)
+{
+  /* assume elements in x are either 0 or -1u.  */
+  return vpaddd_s64 (vreinterpretq_s64_u64 (x)) == -2;
+}
+static inline vfloat64m1_t
+v_lookup_f64 (const double *tab, vuint64m1_t idx)
+{
+  return (vfloat64m1_t){ tab[idx[0]], tab[idx[1]] };
+}
+static inline vuint64m1_t
+v_lookup_u64 (const uint64_t *tab, vuint64m1_t idx)
+{
+  return (vuint64m1_t){ tab[idx[0]], tab[idx[1]] };
+}
+static inline vfloat64m1_t
+v_call_f64 (double (*f) (double), vfloat64m1_t x, vfloat64m1_t y, vuint64m1_t p)
+{
+  return (vfloat64m1_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1] };
+}
+static inline vfloat64m1_t
+v_call2_f64 (double (*f) (double, double), vfloat64m1_t x1, vfloat64m1_t x2,
+	     vfloat64m1_t y, vuint64m1_t p)
+{
+  return (vfloat64m1_t){ p[0] ? f (x1[0], x2[0]) : y[0],
+			p[1] ? f (x1[1], x2[1]) : y[1] };
+}
+
+#endif
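
The `v_call_f64` style helpers above index lanes with `[]` and build vectors with brace initializers. RVV intrinsic types are sizeless, so as far as I know neither is available on them. A portable way to express the same per-lane fallback is to store the lanes to plain arrays first and loop over them. The sketch below assumes that has been done. `call_f64_lanes` and `twice` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scalar libm routine.  */
static double
twice (double v)
{
  return 2.0 * v;
}

/* For each lane i: out[i] = p[i] ? f (x[i]) : y[i].
   x, y, p and out are plain arrays of n lanes, for example lanes
   stored from vector registers before the call.  */
static void
call_f64_lanes (double (*f) (double), const double *x, const double *y,
		const uint64_t *p, double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = p[i] ? f (x[i]) : y[i];
}
```

Passing the lane count `n` explicitly also avoids baking a fixed vector length of two doubles into the helper, which matters for a vector-length-agnostic ISA.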
diff --git a/sysdeps/riscv/rvd/vecmath_config.h b/sysdeps/riscv/rvd/vecmath_config.h
new file mode 100644
index 0000000000..290ea1e33c
--- /dev/null
+++ b/sysdeps/riscv/rvd/vecmath_config.h
@@ -0,0 +1,33 @@ 
+/* Configuration for libmvec routines.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _VECMATH_CONFIG_H
+#define _VECMATH_CONFIG_H
+
+#include <math_private.h>
+
+/* Return ptr but hide its value from the compiler so accesses through it
+   cannot be optimized based on the contents.  */
+#define ptr_barrier(ptr)                                                      \
+  ({                                                                          \
+    __typeof (ptr) __ptr = (ptr);                                             \
+    __asm("" : "+r"(__ptr));                                                  \
+    __ptr;                                                                    \
+  })
+
+#endif
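
As a small illustration of what `ptr_barrier` is for: the empty asm with a `"+r"` constraint makes the compiler treat the pointer as possibly modified, so loads through it are not folded into immediate constants and the table is addressed through one base register. The sketch below repeats the macro from the patch and applies it to a made-up table. The `consts` struct and `scale_by_inv_pi` are hypothetical, and the code relies on GNU C extensions (statement expressions, `__typeof`, inline asm).

```c
/* Same definition as in vecmath_config.h above.  */
#define ptr_barrier(ptr)                                                      \
  ({                                                                          \
    __typeof (ptr) __ptr = (ptr);                                             \
    __asm("" : "+r"(__ptr));                                                  \
    __ptr;                                                                    \
  })

/* Hypothetical constant table, in the style of the data struct in cos.c.  */
static const struct consts
{
  double inv_pi, shift;
} consts = { 0x1.45f306dc9c883p-2, 0x1.8p52 };

/* Loads inv_pi through the barriered pointer; the result is unchanged,
   only the compiler's view of the pointer value is hidden.  */
static double
scale_by_inv_pi (double x)
{
  const struct consts *c = ptr_barrier (&consts);
  return x * c->inv_pi;
}
```

The barrier does not change the computed value, so it can be checked in isolation like any other pure function.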
diff --git a/sysdeps/unix/sysv/linux/riscv/libmvec.abilist b/sysdeps/unix/sysv/linux/riscv/libmvec.abilist
new file mode 100644
index 0000000000..fe8141b189
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/libmvec.abilist
@@ -0,0 +1 @@ 
+GLIBC_2.40 _ZGVnN2v_cos F