From patchwork Mon Dec  4 02:57:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Wang <wangfeng@eswincomputing.com>
X-Patchwork-Id: 81241
X-Patchwork-Delegate: kito.cheng@gmail.com
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 3CB65385C322
	for <patchwork@sourceware.org>; Mon,  4 Dec 2023 02:59:16 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from zg8tndyumtaxlji0oc4xnzya.icoremail.net
 (zg8tndyumtaxlji0oc4xnzya.icoremail.net [46.101.248.176])
 by sourceware.org (Postfix) with ESMTP id 73E41385E00D
 for <gcc-patches@gcc.gnu.org>; Mon,  4 Dec 2023 02:58:28 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 73E41385E00D
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
 header.from=eswincomputing.com
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=eswincomputing.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 73E41385E00D
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=46.101.248.176
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701658713; cv=none;
 b=ExY3uK1w6APxkbqa2/yf+dmmUREfaeBAUrHw75TIKiAjJ8G47yWGBphdEGDWolf2FXj2Jq5g9LCjm2AyptydmeoLrb/ftGvJAiEwq8zujqlImgMsT9wCQQsqldexmsXCFYlCvXdgQcDns3bzBUdRzVVmuLCtl5W2JhIEeVTFx8k=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701658713; c=relaxed/simple;
 bh=NVXC947w+DB7xraFc9t2iTe7Picd1XiveZOay5DyuZ8=;
 h=From:To:Subject:Date:Message-Id;
 b=AY5DLAYc3HTLFJBe8bH2BFlfkOpXhXChaI24zudRwdxOy/Ma/JLAQ9w/B5hcLrdxz1q8f7qOIELAAHLj47EQk9YlPgcEixpEKBxXcqg4lUKYJ8yrpZAZXTvpGzucVfUQ5HIjq76rr6xn7A9fMCPHVeJVrKBz0FP0/A+lC/JGL+c=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from localhost.localdomain (unknown [10.12.130.31])
 by app1 (Coremail) with SMTP id TAJkCgC3Qv39P21lJhYAAA--.1700S9;
 Mon, 04 Dec 2023 10:57:21 +0800 (CST)
From: Feng Wang <wangfeng@eswincomputing.com>
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, jeffreyalaw@gmail.com, zhusonghe@eswincomputing.com,
 panciyan@eswincomputing.com, Feng Wang <wangfeng@eswincomputing.com>
Subject: [PATCH 6/7] RISC-V: Add intrinsic functions for crypto vector Zvksed
 extension.
Date: Mon,  4 Dec 2023 02:57:08 +0000
Message-Id: <20231204025709.3783-6-wangfeng@eswincomputing.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20231204025709.3783-1-wangfeng@eswincomputing.com>
References: <20231204025709.3783-1-wangfeng@eswincomputing.com>
X-CM-TRANSID: TAJkCgC3Qv39P21lJhYAAA--.1700S9
X-Coremail-Antispam: 1UD129KBjvAXoWfur18Wr4kWFWfCw4kKr1xXwb_yoWrGr17uo
 Z8Krs5u3WrXr17uw4Duw48Gr1xXa1xXrs3A3WfKrnru3WfZa1Fk3ZFqa1DZFs2yr4DZFZ8
 CFs3Zr4xXF13tF1rn29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3
 AaLaJ3UjIYCTnIWjp_UUUOY7AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva
 j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r126s0DM28Irc
 Ia0xkI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l
 84ACjcxK6xIIjxv20xvE14v26w1j6s0DM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJV
 WxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE
 3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2I
 x0cI8IcVAFwI0_JrI_JrylYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8
 JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lc2xSY4AK6svPMxAIw2
 8IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMxCIbckI1I0E14v26r1q6r43MI8I
 3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxV
 WUAVWUtwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r4j6ryUMIIF0xvE2Ix0cI8I
 cVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aV
 AFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZE
 Xa7VU17GYJUUUUU==
X-CM-SenderInfo: pzdqwwxhqjqvxvzl0uprps33xlqjhudrp/
X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS,
 TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org

This patch add the intrinsic functions(according to https://github.com/
riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/
vector-crypto/intrinsic_funcs.md) for crypto vector Zvksed extension. And all
the test cases are added for api-testing.

gcc/ChangeLog:

        * common/config/riscv/riscv-common.cc: Add Zvksed in riscv_implied_info.
        * config/riscv/riscv-vector-builtins-bases.cc (class vaeskf1): Add new function_base for Zvksed.
        (class crypto_vi): Ditto.
        (BASE): Add Zvksed BASE declaration.
        * config/riscv/riscv-vector-builtins-bases.h: Ditto.
        * config/riscv/riscv-vector-builtins-shapes.cc (struct crypto_vv_def): Add function_builder for Zvksed.
        * config/riscv/riscv-vector-crypto-builtins-avail.h (AVAIL): Add enable condition.
        * config/riscv/riscv-vector-crypto-builtins-functions.def (vsha2cl): Add intrinsc def.
        (vsm4k): Ditto.
        (vsm4r): Ditto.
        * config/riscv/riscv.md: Add Zvksed ins name.
        * config/riscv/vector-crypto.md (sm4k): Add Zvksed md patterns.
        (@pred_vaeskf1<mode>_scalar):Ditto.
        (@pred_crypto_vi<vi_ins_name><mode>_scalar): Ditto.
        * config/riscv/vector.md: Add the corresponding attribute for Zvksed.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/zvk/zvk.exp:
        * gcc.target/riscv/zvk/zvksed/vsm4k.c: New test.
        * gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c: New test.
        * gcc.target/riscv/zvk/zvksed/vsm4r.c: New test.
        * gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c: New test.
---
 gcc/common/config/riscv/riscv-common.cc       |   1 +
 .../riscv/riscv-vector-builtins-bases.cc      |  13 +-
 .../riscv/riscv-vector-builtins-bases.h       |   2 +
 .../riscv/riscv-vector-builtins-shapes.cc     |   2 +-
 .../riscv-vector-crypto-builtins-avail.h      |   1 +
 ...riscv-vector-crypto-builtins-functions.def |  10 +-
 gcc/config/riscv/riscv.md                     |   5 +-
 gcc/config/riscv/vector-crypto.md             |  40 +++--
 gcc/config/riscv/vector.md                    |  20 ++-
 gcc/testsuite/gcc.target/riscv/zvk/zvk.exp    |   3 +-
 .../gcc.target/riscv/zvk/zvksed/vsm4k.c       |  50 ++++++
 .../riscv/zvk/zvksed/vsm4k_overloaded.c       |  50 ++++++
 .../gcc.target/riscv/zvk/zvksed/vsm4r.c       | 170 ++++++++++++++++++
 .../riscv/zvk/zvksed/vsm4r_overloaded.c       | 170 ++++++++++++++++++
 14 files changed, 505 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 7201ac3866c..87595b135ef 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -127,6 +127,7 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"zvkned",   "v"},
   {"zvknha",   "v"},
   {"zvknhb",   "v"},
+  {"zvksed",   "v"},
 
   {"zfh", "zfhmin"},
   {"zfhmin", "f"},
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index a3670ec5b38..83309f07661 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -2288,8 +2288,9 @@ public:
   }
 };
 
-/* Implements vaeskf1. */
-class vaeskf1 : public function_base
+/* Implements vaeskf1/vsm4k. */
+template<int UNSPEC>
+class crypto_vi : public function_base
 {
 public:
   bool apply_mask_policy_p () const override { return false; }
@@ -2297,7 +2298,7 @@ public:
 
   rtx expand (function_expander &e) const override
   {
-    return e.use_exact_insn (code_for_pred_vaeskf1_scalar (e.vector_mode ()));
+    return e.use_exact_insn (code_for_pred_crypto_vi_scalar (UNSPEC, e.vector_mode ()));
   }
 };
 
@@ -2591,11 +2592,13 @@ static CONSTEXPR const crypto_vv<UNSPEC_VAESEM>  vaesem_obj;
 static CONSTEXPR const crypto_vv<UNSPEC_VAESDF>  vaesdf_obj;
 static CONSTEXPR const crypto_vv<UNSPEC_VAESDM>  vaesdm_obj;
 static CONSTEXPR const crypto_vv<UNSPEC_VAESZ>   vaesz_obj;
-static CONSTEXPR const vaeskf1 vaeskf1_obj;
+static CONSTEXPR const crypto_vi<UNSPEC_VAESKF1> vaeskf1_obj;
 static CONSTEXPR const vaeskf2 vaeskf2_obj;
 static CONSTEXPR const vg_nhab<UNSPEC_VSHA2MS>   vsha2ms_obj;
 static CONSTEXPR const vg_nhab<UNSPEC_VSHA2CH>   vsha2ch_obj;
 static CONSTEXPR const vg_nhab<UNSPEC_VSHA2CL>   vsha2cl_obj;
+static CONSTEXPR const crypto_vi<UNSPEC_VSM4K>   vsm4k_obj;
+static CONSTEXPR const crypto_vv<UNSPEC_VSM4R>   vsm4r_obj;
 
 /* Declare the function base NAME, pointing it to an instance
    of class <NAME>_obj.  */
@@ -2882,4 +2885,6 @@ BASE (vaeskf2)
 BASE (vsha2ms)
 BASE (vsha2ch)
 BASE (vsha2cl)
+BASE (vsm4k)
+BASE (vsm4r)
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 0560b0008f0..e9e6d7bfe7f 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -304,6 +304,8 @@ extern const function_base *const vaeskf2;
 extern const function_base *const vsha2ms;
 extern const function_base *const vsha2ch;
 extern const function_base *const vsha2cl;
+extern const function_base *const vsm4k;
+extern const function_base *const vsm4r;
 }
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 5873103857a..4fe298917f6 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -1050,7 +1050,7 @@ struct crypto_vv_def : public build_base
   }
 };
 
-/* vaeskf1/vaeskf2 class.  */
+/* vaeskf1/vaeskf2/vsm4k class.  */
 struct crypto_vi_def : public build_base
 {
   char *get_name (function_builder &b, const function_instance &instance,
diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
index bc1b6ec9b5b..f09315923f3 100755
--- a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
+++ b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h
@@ -19,5 +19,6 @@ AVAIL (zvkg, TARGET_ZVKG)
 AVAIL (zvkned, TARGET_ZVKNED)
 AVAIL (zvknha_or_zvknhb, TARGET_ZVKNHA || TARGET_ZVKNHB)
 AVAIL (zvknhb, TARGET_ZVKNHB)
+AVAIL (zvksed, TARGET_ZVKSED)
 }
 #endif
diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def b/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def
index 9c89412a9a9..67f3bf5284b 100755
--- a/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def
@@ -64,4 +64,12 @@ DEF_VECTOR_CRYPTO_FUNCTION (vsha2cl,  crypto_vv, none_tu_preds, u_vvvv_crypto_se
 //ZVKNHB
 DEF_VECTOR_CRYPTO_FUNCTION (vsha2ms,  crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb)
 DEF_VECTOR_CRYPTO_FUNCTION (vsha2ch,  crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb)
-DEF_VECTOR_CRYPTO_FUNCTION (vsha2cl,  crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb)
\ No newline at end of file
+DEF_VECTOR_CRYPTO_FUNCTION (vsha2cl,  crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb)
+//Zvksed
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4k,    crypto_vi, none_tu_preds, u_vv_size_crypto_sew32_ops, zvksed)
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4r,    crypto_vv, none_tu_preds, u_vvv_crypto_sew32_ops, zvksed)
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4r,    crypto_vv, none_tu_preds, u_vvs_crypto_sew32_ops, zvksed)
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4r,    crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x2_ops,  zvksed)
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4r,    crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x4_ops,  zvksed)
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4r,    crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x8_ops,  zvksed)
+DEF_VECTOR_CRYPTO_FUNCTION (vsm4r,    crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x16_ops, zvksed)
\ No newline at end of file
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index e8fc21e8ceb..c076b82008a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -428,6 +428,7 @@
 ;; vcompress    vector compress instruction
 ;; vmov         whole vector register move
 ;; vector       unknown vector instruction
+;; 17. Crypto Vector instructions
 ;; vandn        crypto vector bitwise and-not instructions
 ;; vbrev        crypto vector reverse bits in elements instructions
 ;; vbrev8       crypto vector reverse bits in bytes instructions
@@ -451,6 +452,8 @@
 ;; vsha2ms      crypto vector SHA-2 message schedule instructions
 ;; vsha2ch      crypto vector SHA-2 two rounds of compression instructions
 ;; vsha2cl      crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k        crypto vector SM4 KeyExpansion instructions
+;; vsm4r        crypto vector SM4 Rounds instructions
 (define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
    mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -472,7 +475,7 @@
    vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
    vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll,
    vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
-   vsha2ms,vsha2ch,vsha2cl"
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r"
   (cond [(eq_attr "got" "load") (const_string "load")
 
 	 ;; If a doubleword move uses these expensive instructions,
diff --git a/gcc/config/riscv/vector-crypto.md b/gcc/config/riscv/vector-crypto.md
index 38b41fb3664..7bd4cd9f8b9 100755
--- a/gcc/config/riscv/vector-crypto.md
+++ b/gcc/config/riscv/vector-crypto.md
@@ -33,6 +33,10 @@
     UNSPEC_VSHA2MS
     UNSPEC_VSHA2CH
     UNSPEC_VSHA2CL
+    UNSPEC_VSM4K
+    UNSPEC_VSM4R
+    UNSPEC_VSM4RVV
+    UNSPEC_VSM4RVS
 ])
 
 (define_int_attr ror_rol [(UNSPEC_VROL "rol") (UNSPEC_VROR "ror")])
@@ -47,16 +51,20 @@
                               (UNSPEC_VAESEMVV "aesem") (UNSPEC_VAESDFVV "aesdf")
                               (UNSPEC_VAESDMVV "aesdm") (UNSPEC_VAESEFVS "aesef")
                               (UNSPEC_VAESEMVS "aesem") (UNSPEC_VAESDFVS "aesdf")
-                              (UNSPEC_VAESDMVS "aesdm") (UNSPEC_VAESZVS  "aesz" )])
+                              (UNSPEC_VAESDMVS "aesdm") (UNSPEC_VAESZVS  "aesz" )
+                              (UNSPEC_VSM4RVV  "sm4r" ) (UNSPEC_VSM4RVS  "sm4r" )])
 
 (define_int_attr vv_ins1_name [(UNSPEC_VGHSH "ghsh")     (UNSPEC_VSHA2MS "sha2ms")
                                (UNSPEC_VSHA2CH "sha2ch") (UNSPEC_VSHA2CL "sha2cl")])
 
+(define_int_attr vi_ins_name [(UNSPEC_VAESKF1 "aeskf1") (UNSPEC_VSM4K "sm4k")])
+
 (define_int_attr ins_type [(UNSPEC_VGMUL    "vv") (UNSPEC_VAESEFVV "vv")
                            (UNSPEC_VAESEMVV "vv") (UNSPEC_VAESDFVV "vv")
                            (UNSPEC_VAESDMVV "vv") (UNSPEC_VAESEFVS "vs")
                            (UNSPEC_VAESEMVS "vs") (UNSPEC_VAESDFVS "vs")
-                           (UNSPEC_VAESDMVS "vs") (UNSPEC_VAESZVS  "vs")])
+                           (UNSPEC_VAESDMVS "vs") (UNSPEC_VAESZVS  "vs")
+                           (UNSPEC_VSM4RVV  "vv") (UNSPEC_VSM4RVS  "vs")])
 
 (define_int_iterator UNSPEC_VRORL [UNSPEC_VROL UNSPEC_VROR])
 
@@ -69,10 +77,12 @@
 (define_int_iterator UNSPEC_CRYPTO_VV [UNSPEC_VGMUL    UNSPEC_VAESEFVV UNSPEC_VAESEMVV
                                        UNSPEC_VAESDFVV UNSPEC_VAESDMVV UNSPEC_VAESEFVS
                                        UNSPEC_VAESEMVS UNSPEC_VAESDFVS UNSPEC_VAESDMVS
-                                       UNSPEC_VAESZVS])
+                                       UNSPEC_VAESZVS  UNSPEC_VSM4RVV  UNSPEC_VSM4RVS])
 
 (define_int_iterator UNSPEC_VGNHAB [UNSPEC_VGHSH UNSPEC_VSHA2MS UNSPEC_VSHA2CH UNSPEC_VSHA2CL])
 
+(define_int_iterator UNSPEC_CRYPTO_VI [UNSPEC_VAESKF1 UNSPEC_VSM4K])
+
 ;; zvbb instructions patterns.
 ;; vandn.vv vandn.vx vrol.vv vrol.vx
 ;; vror.vv vror.vx vror.vi
@@ -338,7 +348,7 @@
              [(match_operand:VSI 1 "register_operand" " 0")
               (match_operand:VSI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV)
           (match_dup 1)))]
-  "TARGET_ZVKNED"
+  "TARGET_ZVKG || TARGET_ZVKNED"
   "v<vv_ins_name>.<ins_type>\t%0,%2"
   [(set_attr "type" "v<vv_ins_name>")
    (set_attr "mode" "<VSI:MODE>")])
@@ -356,7 +366,7 @@
              [(match_operand:VSI 1 "register_operand" " 0")
               (match_operand:VSI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV)
           (match_dup 1)))]
-  "TARGET_ZVKNED"
+  "TARGET_ZVKNED || TARGET_ZVKSED"
   "v<vv_ins_name>.<ins_type>\t%0,%2"
   [(set_attr "type" "v<vv_ins_name>")
    (set_attr "mode" "<VSI:MODE>")])
@@ -374,7 +384,7 @@
              [(match_operand:<VSIX2> 1  "register_operand" " 0")
               (match_operand:VLMULX2_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV)
           (match_dup 1)))]
-  "TARGET_ZVKNED"
+  "TARGET_ZVKNED || TARGET_ZVKSED"
   "v<vv_ins_name>.<ins_type>\t%0,%2"
   [(set_attr "type" "v<vv_ins_name>")
    (set_attr "mode" "<VLMULX2_SI:MODE>")])
@@ -392,7 +402,7 @@
             [(match_operand:<VSIX4> 1 "register_operand" " 0")
              (match_operand:VLMULX4_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV)
          (match_dup 1)))]
- "TARGET_ZVKNED"
+ "TARGET_ZVKNED || TARGET_ZVKSED"
  "v<vv_ins_name>.<ins_type>\t%0,%2"
  [(set_attr "type" "v<vv_ins_name>")
   (set_attr "mode" "<VLMULX4_SI:MODE>")])
@@ -410,7 +420,7 @@
             [(match_operand:<VSIX8> 1 "register_operand" " 0")
              (match_operand:VLMULX8_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV)
          (match_dup 1)))]
- "TARGET_ZVKNED"
+ "TARGET_ZVKNED || TARGET_ZVKSED"
  "v<vv_ins_name>.<ins_type>\t%0,%2"
  [(set_attr "type" "v<vv_ins_name>")
   (set_attr "mode" "<VLMULX8_SI:MODE>")])
@@ -428,13 +438,13 @@
             [(match_operand:<VSIX16> 1 "register_operand" " 0")
              (match_operand:VLMULX16_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV)
          (match_dup 1)))]
- "TARGET_ZVKNED"
+ "TARGET_ZVKNED || TARGET_ZVKSED"
  "v<vv_ins_name>.<ins_type>\t%0,%2"
  [(set_attr "type" "v<vv_ins_name>")
   (set_attr "mode" "<VLMULX16_SI:MODE>")])
 
-;; vaeskf1.vi
-(define_insn "@pred_vaeskf1<mode>_scalar"
+;; vaeskf1.vi vsm4k.vi
+(define_insn "@pred_crypto_vi<vi_ins_name><mode>_scalar"
   [(set (match_operand:VSI 0 "register_operand"           "=vd, vd")
         (if_then_else:VSI
           (unspec:<VM>
@@ -445,11 +455,11 @@
              (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
           (unspec:VSI
              [(match_operand:VSI 2       "register_operand"  "vr, vr")
-              (match_operand:<VEL> 3 "const_int_operand" " i,  i")] UNSPEC_VAESKF1)
+              (match_operand:<VEL> 3 "const_int_operand" " i,  i")] UNSPEC_CRYPTO_VI)
           (match_operand:VSI 1 "vector_merge_operand" "vu, 0")))]
-  "TARGET_ZVKNED"
-  "vaeskf1.vi\t%0,%2,%3"
-  [(set_attr "type" "vaeskf1")
+  "TARGET_ZVKNED || TARGET_ZVKSED"
+  "v<vi_ins_name>.vi\t%0,%2,%3"
+  [(set_attr "type" "v<vi_ins_name>")
    (set_attr "mode" "<MODE>")])
 
 ;; vaeskf2.vi
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index e2de9bfc496..7fae91b3860 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -54,7 +54,7 @@
 			  vgather,vcompress,vlsegde,vssegte,vlsegds,vssegts,vlsegdux,vlsegdox,\
 			  vssegtux,vssegtox,vlsegdff,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,\
                           vror,vwsll,vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\
-                          vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl")
+                          vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r")
 	 (const_string "true")]
 	(const_string "false")))
 
@@ -78,7 +78,7 @@
 			  vgather,vcompress,vlsegde,vssegte,vlsegds,vssegts,vlsegdux,vlsegdox,\
 			  vssegtux,vssegtox,vlsegdff,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,\
                           vror,vwsll,vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\
-                          vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl")
+                          vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r")
 	 (const_string "true")]
 	(const_string "false")))
 
@@ -707,7 +707,7 @@
 	       (const_int 2)
 
 	       (eq_attr "type" "vimerge,vfmerge,vcompress,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\
-                                vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl")
+                                vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r")
 	       (const_int 1)
 
 	       (eq_attr "type" "vimuladd,vfmuladd")
@@ -747,7 +747,7 @@
 			  vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\
 			  vfncvtftoi,vfncvtftof,vfclass,vimovxv,vfmovfv,vcompress,\
 			  vlsegde,vssegts,vssegtux,vssegtox,vlsegdff,vbrev,vbrev8,vrev8,\
-                          vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl")
+                          vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl,vsm4k")
 	   (const_int 4)
 
 	 ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast.
@@ -770,7 +770,7 @@
 	   (const_int 6)
 
 	 (eq_attr "type" "vmpop,vmffs,vmidx,vssegte,vclz,vctz,vgmul,vaesef,vaesem,vaesdf,vaesdm,\
-                          vaesz")
+                          vaesz,vsm4r")
 	   (const_int 3)]
   (const_int INVALID_ATTRIBUTE)))
 
@@ -780,7 +780,7 @@
 			  vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,\
 			  vfncvtitof,vfncvtftoi,vfncvtftof,vfclass,vimovxv,vfmovfv,\
 			  vcompress,vldff,vlsegde,vlsegdff,vbrev,vbrev8,vrev8,vghsh,\
-                          vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl")
+                          vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl,vsm4k")
 	   (symbol_ref "riscv_vector::get_ta(operands[5])")
 
 	 ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast.
@@ -802,7 +802,7 @@
 	 (eq_attr "type" "vimuladd,vfmuladd")
 	   (symbol_ref "riscv_vector::get_ta(operands[7])")
 
-	 (eq_attr "type" "vmidx,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz")
+	 (eq_attr "type" "vmidx,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz,vsm4r")
 	   (symbol_ref "riscv_vector::get_ta(operands[4])")]
 	(const_int INVALID_ATTRIBUTE)))
 
@@ -844,7 +844,8 @@
 			  vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\
 			  vimovxv,vfmovfv,vlsegde,vlsegdff,vbrev,vbrev8,vrev8")
 	   (const_int 7)
-	 (eq_attr "type" "vldm,vstm,vmalu,vmalu,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz")
+	 (eq_attr "type" "vldm,vstm,vmalu,vmalu,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz,\
+                          vsm4r")
 	   (const_int 5)
 
 	 ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast.
@@ -867,7 +868,8 @@
 	 (eq_attr "type" "vimuladd,vfmuladd")
 	   (const_int 9)
 
-	 (eq_attr "type" "vmsfs,vmidx,vcompress,vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl")
+	 (eq_attr "type" "vmsfs,vmidx,vcompress,vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl,\
+                          vsm4k")
 	   (const_int 6)
 
 	 (eq_attr "type" "vmpop,vmffs,vssegte,vclz,vctz")
diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp b/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp
index 13f1302314a..7d87b0c1bee 100644
--- a/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp
+++ b/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp
@@ -46,6 +46,7 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/zvknha/*.\[cS\]]] \
         "" $DEFAULT_CFLAGS
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/zvknhb/*.\[cS\]]] \
         "" $DEFAULT_CFLAGS
-
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/zvksed/*.\[cS\]]] \
+        "" $DEFAULT_CFLAGS
 # All done.
 dg-finish
diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c
new file mode 100644
index 00000000000..7a8a0857f31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+/* non-policy */
+vuint32mf2_t test_vsm4k_vi_u32mf2(vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32mf2(vs2, 0, vl);
+}
+
+vuint32m1_t test_vsm4k_vi_u32m1(vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m1(vs2, 0, vl);
+}
+
+vuint32m2_t test_vsm4k_vi_u32m2(vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m2(vs2, 0, vl);
+}
+
+vuint32m4_t test_vsm4k_vi_u32m4(vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m4(vs2, 0, vl);
+}
+
+vuint32m8_t test_vsm4k_vi_u32m8(vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m8(vs2, 0, vl);
+}
+
+/* policy */
+vuint32mf2_t test_vsm4k_vi_u32mf2_tu(vuint32mf2_t maskedoff, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32mf2_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m1_t test_vsm4k_vi_u32m1_tu(vuint32m1_t maskedoff, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m1_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m2_t test_vsm4k_vi_u32m2_tu(vuint32m2_t maskedoff, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m2_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m4_t test_vsm4k_vi_u32m4_tu(vuint32m4_t maskedoff, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m4_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m8_t test_vsm4k_vi_u32m8_tu(vuint32m8_t maskedoff, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4k_vi_u32m8_tu(maskedoff, vs2, 0, vl);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vsm4k\.vi\s+v[0-9]+,\s*v[0-9]+,0} 10 } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c
new file mode 100644
index 00000000000..dd06a7e58d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */
+
+#include "riscv_vector.h"
+
+/* non-policy */
+vuint32mf2_t test_vsm4k_vi_u32mf2(vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4k(vs2, 0, vl);
+}
+
+vuint32m1_t test_vsm4k_vi_u32m1(vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4k(vs2, 0, vl);
+}
+
+vuint32m2_t test_vsm4k_vi_u32m2(vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4k(vs2, 0, vl);
+}
+
+vuint32m4_t test_vsm4k_vi_u32m4(vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4k(vs2, 0, vl);
+}
+
+vuint32m8_t test_vsm4k_vi_u32m8(vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4k(vs2, 0, vl);
+}
+
+/* policy */
+vuint32mf2_t test_vsm4k_vi_u32mf2_tu(vuint32mf2_t maskedoff, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m1_t test_vsm4k_vi_u32m1_tu(vuint32m1_t maskedoff, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m2_t test_vsm4k_vi_u32m2_tu(vuint32m2_t maskedoff, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m4_t test_vsm4k_vi_u32m4_tu(vuint32m4_t maskedoff, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl);
+}
+
+vuint32m8_t test_vsm4k_vi_u32m8_tu(vuint32m8_t maskedoff, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 5 } } */
+/* { dg-final { scan-assembler-times {vsm4k\.vi\s+v[0-9]+,\s*v[0-9]+,0} 10 } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c
new file mode 100644
index 00000000000..dac66db3abb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c
@@ -0,0 +1,170 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */
+#include "riscv_vector.h"
+
+/* non-policy */
+vuint32mf2_t test_vsm4r_vv_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32mf2(vd, vs2, vl);
+}
+
+vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32mf2(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32mf2_u32m1(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m1(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32mf2_u32m2(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m2(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32mf2_u32m4(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m4(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32mf2_u32m8(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m8(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vv_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m1(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32m1_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m1(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m1_u32m2(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m2(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m1_u32m4(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m4(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m1_u32m8(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m8(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vv_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m2(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m2_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m2_u32m2(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m2_u32m4(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m2_u32m4(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m2_u32m8(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m2_u32m8(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vv_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m4(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m4_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m4_u32m4(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m4_u32m8(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m4_u32m8(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vv_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m8(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m8_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m8_u32m8(vd, vs2, vl);
+}
+
+/* policy */
+vuint32mf2_t test_vsm4r_vv_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32mf2_tu(vd, vs2, vl);
+}
+
+vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32mf2_tu(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32mf2_u32m1_tu(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m1_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32mf2_u32m2_tu(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m2_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32mf2_u32m4_tu(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m4_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32mf2_u32m8_tu(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32mf2_u32m8_tu(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vv_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m1_tu(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32m1_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m1_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m1_u32m2_tu(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m2_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m1_u32m4_tu(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m4_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m1_u32m8_tu(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m1_u32m8_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vv_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m2_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m2_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m2_u32m2_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m2_u32m4_tu(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m2_u32m4_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m2_u32m8_tu(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m2_u32m8_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vv_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m4_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m4_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m4_u32m4_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m4_u32m8_tu(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m4_u32m8_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vv_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_u32m8_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m8_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_u32m8_u32m8_tu(vd, vs2, vl);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 20 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 20 } } */
+/* { dg-final { scan-assembler-times {vsm4r\.vv\s+v[0-9]+,\s*v[0-9]} 10 } } */
+/* { dg-final { scan-assembler-times {vsm4r\.vs\s+v[0-9]+,\s*v[0-9]} 30 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c
new file mode 100644
index 00000000000..6311adfb2d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c
@@ -0,0 +1,170 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */
+#include "riscv_vector.h"
+
+/* non-policy */
+vuint32mf2_t test_vsm4r_vv_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv(vd, vs2, vl);
+}
+
+vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32mf2_u32m1(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32mf2_u32m2(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32mf2_u32m4(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32mf2_u32m8(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vv_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32m1_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m1_u32m2(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m1_u32m4(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m1_u32m8(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vv_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m2_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m2_u32m4(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m2_u32m8(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vv_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m4_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m4_u32m8(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vv_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m8_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs(vd, vs2, vl);
+}
+
+/* policy */
+vuint32mf2_t test_vsm4r_vv_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_tu(vd, vs2, vl);
+}
+
+vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32mf2_u32m1_tu(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32mf2_u32m2_tu(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32mf2_u32m4_tu(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32mf2_u32m8_tu(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vv_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_tu(vd, vs2, vl);
+}
+
+vuint32m1_t test_vsm4r_vs_u32m1_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m1_u32m2_tu(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m1_u32m4_tu(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m1_u32m8_tu(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vv_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_tu(vd, vs2, vl);
+}
+
+vuint32m2_t test_vsm4r_vs_u32m2_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m2_u32m4_tu(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m2_u32m8_tu(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vv_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_tu(vd, vs2, vl);
+}
+
+vuint32m4_t test_vsm4r_vs_u32m4_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m4_u32m8_tu(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vv_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vv_tu(vd, vs2, vl);
+}
+
+vuint32m8_t test_vsm4r_vs_u32m8_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) {
+  return __riscv_vsm4r_vs_tu(vd, vs2, vl);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 20 } } */
+/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 20 } } */
+/* { dg-final { scan-assembler-times {vsm4r\.vv\s+v[0-9]+,\s*v[0-9]} 10 } } */
+/* { dg-final { scan-assembler-times {vsm4r\.vs\s+v[0-9]+,\s*v[0-9]} 30 } } */