From patchwork Mon Dec 4 02:57:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Feng Wang X-Patchwork-Id: 81241 X-Patchwork-Delegate: kito.cheng@gmail.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3CB65385C322 for ; Mon, 4 Dec 2023 02:59:16 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from zg8tndyumtaxlji0oc4xnzya.icoremail.net (zg8tndyumtaxlji0oc4xnzya.icoremail.net [46.101.248.176]) by sourceware.org (Postfix) with ESMTP id 73E41385E00D for ; Mon, 4 Dec 2023 02:58:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 73E41385E00D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=eswincomputing.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=eswincomputing.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 73E41385E00D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=46.101.248.176 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701658713; cv=none; b=ExY3uK1w6APxkbqa2/yf+dmmUREfaeBAUrHw75TIKiAjJ8G47yWGBphdEGDWolf2FXj2Jq5g9LCjm2AyptydmeoLrb/ftGvJAiEwq8zujqlImgMsT9wCQQsqldexmsXCFYlCvXdgQcDns3bzBUdRzVVmuLCtl5W2JhIEeVTFx8k= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701658713; c=relaxed/simple; bh=NVXC947w+DB7xraFc9t2iTe7Picd1XiveZOay5DyuZ8=; h=From:To:Subject:Date:Message-Id; b=AY5DLAYc3HTLFJBe8bH2BFlfkOpXhXChaI24zudRwdxOy/Ma/JLAQ9w/B5hcLrdxz1q8f7qOIELAAHLj47EQk9YlPgcEixpEKBxXcqg4lUKYJ8yrpZAZXTvpGzucVfUQ5HIjq76rr6xn7A9fMCPHVeJVrKBz0FP0/A+lC/JGL+c= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from localhost.localdomain (unknown [10.12.130.31]) by app1 (Coremail) with SMTP id TAJkCgC3Qv39P21lJhYAAA--.1700S9; Mon, 04 Dec 2023 10:57:21 +0800 (CST) From: Feng Wang To: gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, jeffreyalaw@gmail.com, zhusonghe@eswincomputing.com, panciyan@eswincomputing.com, Feng Wang Subject: [PATCH 6/7] RISC-V: Add intrinsic functions for crypto vector Zvksed extension. Date: Mon, 4 Dec 2023 02:57:08 +0000 Message-Id: <20231204025709.3783-6-wangfeng@eswincomputing.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20231204025709.3783-1-wangfeng@eswincomputing.com> References: <20231204025709.3783-1-wangfeng@eswincomputing.com> X-CM-TRANSID: TAJkCgC3Qv39P21lJhYAAA--.1700S9 X-Coremail-Antispam: 1UD129KBjvAXoWfur18Wr4kWFWfCw4kKr1xXwb_yoWrGr17uo Z8Krs5u3WrXr17uw4Duw48Gr1xXa1xXrs3A3WfKrnru3WfZa1Fk3ZFqa1DZFs2yr4DZFZ8 CFs3Zr4xXF13tF1rn29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUOY7AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r126s0DM28Irc Ia0xkI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l 84ACjcxK6xIIjxv20xvE14v26w1j6s0DM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJV WxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE 3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2I x0cI8IcVAFwI0_JrI_JrylYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8 JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lc2xSY4AK6svPMxAIw2 8IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMxCIbckI1I0E14v26r1q6r43MI8I 3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxV WUAVWUtwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r4j6ryUMIIF0xvE2Ix0cI8I cVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aV AFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZE Xa7VU17GYJUUUUU== X-CM-SenderInfo: pzdqwwxhqjqvxvzl0uprps33xlqjhudrp/ X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org This patch add the intrinsic functions(according to https://github.com/ riscv-non-isa/rvv-intrinsic-doc/blob/eopc/vector-crypto/auto-generated/ vector-crypto/intrinsic_funcs.md) for crypto vector Zvksed extension. And all the test cases are added for api-testing. gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Add Zvksed in riscv_implied_info. * config/riscv/riscv-vector-builtins-bases.cc (class vaeskf1): Add new function_base for Zvksed. (class crypto_vi): Ditto. (BASE): Add Zvksed BASE declaration. * config/riscv/riscv-vector-builtins-bases.h: Ditto. * config/riscv/riscv-vector-builtins-shapes.cc (struct crypto_vv_def): Add function_builder for Zvksed. * config/riscv/riscv-vector-crypto-builtins-avail.h (AVAIL): Add enable condition. * config/riscv/riscv-vector-crypto-builtins-functions.def (vsha2cl): Add intrinsc def. (vsm4k): Ditto. (vsm4r): Ditto. * config/riscv/riscv.md: Add Zvksed ins name. * config/riscv/vector-crypto.md (sm4k): Add Zvksed md patterns. (@pred_vaeskf1_scalar):Ditto. (@pred_crypto_vi_scalar): Ditto. * config/riscv/vector.md: Add the corresponding attribute for Zvksed. gcc/testsuite/ChangeLog: * gcc.target/riscv/zvk/zvk.exp: * gcc.target/riscv/zvk/zvksed/vsm4k.c: New test. * gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c: New test. * gcc.target/riscv/zvk/zvksed/vsm4r.c: New test. * gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c: New test. --- gcc/common/config/riscv/riscv-common.cc | 1 + .../riscv/riscv-vector-builtins-bases.cc | 13 +- .../riscv/riscv-vector-builtins-bases.h | 2 + .../riscv/riscv-vector-builtins-shapes.cc | 2 +- .../riscv-vector-crypto-builtins-avail.h | 1 + ...riscv-vector-crypto-builtins-functions.def | 10 +- gcc/config/riscv/riscv.md | 5 +- gcc/config/riscv/vector-crypto.md | 40 +++-- gcc/config/riscv/vector.md | 20 ++- gcc/testsuite/gcc.target/riscv/zvk/zvk.exp | 3 +- .../gcc.target/riscv/zvk/zvksed/vsm4k.c | 50 ++++++ .../riscv/zvk/zvksed/vsm4k_overloaded.c | 50 ++++++ .../gcc.target/riscv/zvk/zvksed/vsm4r.c | 170 ++++++++++++++++++ .../riscv/zvk/zvksed/vsm4r_overloaded.c | 170 ++++++++++++++++++ 14 files changed, 505 insertions(+), 32 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c create mode 100644 gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 7201ac3866c..87595b135ef 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -127,6 +127,7 @@ static const riscv_implied_info_t riscv_implied_info[] = {"zvkned", "v"}, {"zvknha", "v"}, {"zvknhb", "v"}, + {"zvksed", "v"}, {"zfh", "zfhmin"}, {"zfhmin", "f"}, diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc index a3670ec5b38..83309f07661 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc @@ -2288,8 +2288,9 @@ public: } }; -/* Implements vaeskf1. */ -class vaeskf1 : public function_base +/* Implements vaeskf1/vsm4k. */ +template +class crypto_vi : public function_base { public: bool apply_mask_policy_p () const override { return false; } @@ -2297,7 +2298,7 @@ public: rtx expand (function_expander &e) const override { - return e.use_exact_insn (code_for_pred_vaeskf1_scalar (e.vector_mode ())); + return e.use_exact_insn (code_for_pred_crypto_vi_scalar (UNSPEC, e.vector_mode ())); } }; @@ -2591,11 +2592,13 @@ static CONSTEXPR const crypto_vv vaesem_obj; static CONSTEXPR const crypto_vv vaesdf_obj; static CONSTEXPR const crypto_vv vaesdm_obj; static CONSTEXPR const crypto_vv vaesz_obj; -static CONSTEXPR const vaeskf1 vaeskf1_obj; +static CONSTEXPR const crypto_vi vaeskf1_obj; static CONSTEXPR const vaeskf2 vaeskf2_obj; static CONSTEXPR const vg_nhab vsha2ms_obj; static CONSTEXPR const vg_nhab vsha2ch_obj; static CONSTEXPR const vg_nhab vsha2cl_obj; +static CONSTEXPR const crypto_vi vsm4k_obj; +static CONSTEXPR const crypto_vv vsm4r_obj; /* Declare the function base NAME, pointing it to an instance of class _obj. */ @@ -2882,4 +2885,6 @@ BASE (vaeskf2) BASE (vsha2ms) BASE (vsha2ch) BASE (vsha2cl) +BASE (vsm4k) +BASE (vsm4r) } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h b/gcc/config/riscv/riscv-vector-builtins-bases.h index 0560b0008f0..e9e6d7bfe7f 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.h +++ b/gcc/config/riscv/riscv-vector-builtins-bases.h @@ -304,6 +304,8 @@ extern const function_base *const vaeskf2; extern const function_base *const vsha2ms; extern const function_base *const vsha2ch; extern const function_base *const vsha2cl; +extern const function_base *const vsm4k; +extern const function_base *const vsm4r; } } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc b/gcc/config/riscv/riscv-vector-builtins-shapes.cc index 5873103857a..4fe298917f6 100644 --- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc +++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc @@ -1050,7 +1050,7 @@ struct crypto_vv_def : public build_base } }; -/* vaeskf1/vaeskf2 class. */ +/* vaeskf1/vaeskf2/vsm4k class. */ struct crypto_vi_def : public build_base { char *get_name (function_builder &b, const function_instance &instance, diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h index bc1b6ec9b5b..f09315923f3 100755 --- a/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h +++ b/gcc/config/riscv/riscv-vector-crypto-builtins-avail.h @@ -19,5 +19,6 @@ AVAIL (zvkg, TARGET_ZVKG) AVAIL (zvkned, TARGET_ZVKNED) AVAIL (zvknha_or_zvknhb, TARGET_ZVKNHA || TARGET_ZVKNHB) AVAIL (zvknhb, TARGET_ZVKNHB) +AVAIL (zvksed, TARGET_ZVKSED) } #endif diff --git a/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def b/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def index 9c89412a9a9..67f3bf5284b 100755 --- a/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-crypto-builtins-functions.def @@ -64,4 +64,12 @@ DEF_VECTOR_CRYPTO_FUNCTION (vsha2cl, crypto_vv, none_tu_preds, u_vvvv_crypto_se //ZVKNHB DEF_VECTOR_CRYPTO_FUNCTION (vsha2ms, crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb) DEF_VECTOR_CRYPTO_FUNCTION (vsha2ch, crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb) -DEF_VECTOR_CRYPTO_FUNCTION (vsha2cl, crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb) \ No newline at end of file +DEF_VECTOR_CRYPTO_FUNCTION (vsha2cl, crypto_vv, none_tu_preds, u_vvvv_crypto_sew64_ops, zvknhb) +//Zvksed +DEF_VECTOR_CRYPTO_FUNCTION (vsm4k, crypto_vi, none_tu_preds, u_vv_size_crypto_sew32_ops, zvksed) +DEF_VECTOR_CRYPTO_FUNCTION (vsm4r, crypto_vv, none_tu_preds, u_vvv_crypto_sew32_ops, zvksed) +DEF_VECTOR_CRYPTO_FUNCTION (vsm4r, crypto_vv, none_tu_preds, u_vvs_crypto_sew32_ops, zvksed) +DEF_VECTOR_CRYPTO_FUNCTION (vsm4r, crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x2_ops, zvksed) +DEF_VECTOR_CRYPTO_FUNCTION (vsm4r, crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x4_ops, zvksed) +DEF_VECTOR_CRYPTO_FUNCTION (vsm4r, crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x8_ops, zvksed) +DEF_VECTOR_CRYPTO_FUNCTION (vsm4r, crypto_vv, none_tu_preds, u_vvs_crypto_sew32_lmul_x16_ops, zvksed) \ No newline at end of file diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index e8fc21e8ceb..c076b82008a 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -428,6 +428,7 @@ ;; vcompress vector compress instruction ;; vmov whole vector register move ;; vector unknown vector instruction +;; 17. Crypto Vector instructions ;; vandn crypto vector bitwise and-not instructions ;; vbrev crypto vector reverse bits in elements instructions ;; vbrev8 crypto vector reverse bits in bytes instructions @@ -451,6 +452,8 @@ ;; vsha2ms crypto vector SHA-2 message schedule instructions ;; vsha2ch crypto vector SHA-2 two rounds of compression instructions ;; vsha2cl crypto vector SHA-2 two rounds of compression instructions +;; vsm4k crypto vector SM4 KeyExpansion instructions +;; vsm4r crypto vector SM4 Rounds instructions (define_attr "type" "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore, mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul, @@ -472,7 +475,7 @@ vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down, vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll, vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz, - vsha2ms,vsha2ch,vsha2cl" + vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r" (cond [(eq_attr "got" "load") (const_string "load") ;; If a doubleword move uses these expensive instructions, diff --git a/gcc/config/riscv/vector-crypto.md b/gcc/config/riscv/vector-crypto.md index 38b41fb3664..7bd4cd9f8b9 100755 --- a/gcc/config/riscv/vector-crypto.md +++ b/gcc/config/riscv/vector-crypto.md @@ -33,6 +33,10 @@ UNSPEC_VSHA2MS UNSPEC_VSHA2CH UNSPEC_VSHA2CL + UNSPEC_VSM4K + UNSPEC_VSM4R + UNSPEC_VSM4RVV + UNSPEC_VSM4RVS ]) (define_int_attr ror_rol [(UNSPEC_VROL "rol") (UNSPEC_VROR "ror")]) @@ -47,16 +51,20 @@ (UNSPEC_VAESEMVV "aesem") (UNSPEC_VAESDFVV "aesdf") (UNSPEC_VAESDMVV "aesdm") (UNSPEC_VAESEFVS "aesef") (UNSPEC_VAESEMVS "aesem") (UNSPEC_VAESDFVS "aesdf") - (UNSPEC_VAESDMVS "aesdm") (UNSPEC_VAESZVS "aesz" )]) + (UNSPEC_VAESDMVS "aesdm") (UNSPEC_VAESZVS "aesz" ) + (UNSPEC_VSM4RVV "sm4r" ) (UNSPEC_VSM4RVS "sm4r" )]) (define_int_attr vv_ins1_name [(UNSPEC_VGHSH "ghsh") (UNSPEC_VSHA2MS "sha2ms") (UNSPEC_VSHA2CH "sha2ch") (UNSPEC_VSHA2CL "sha2cl")]) +(define_int_attr vi_ins_name [(UNSPEC_VAESKF1 "aeskf1") (UNSPEC_VSM4K "sm4k")]) + (define_int_attr ins_type [(UNSPEC_VGMUL "vv") (UNSPEC_VAESEFVV "vv") (UNSPEC_VAESEMVV "vv") (UNSPEC_VAESDFVV "vv") (UNSPEC_VAESDMVV "vv") (UNSPEC_VAESEFVS "vs") (UNSPEC_VAESEMVS "vs") (UNSPEC_VAESDFVS "vs") - (UNSPEC_VAESDMVS "vs") (UNSPEC_VAESZVS "vs")]) + (UNSPEC_VAESDMVS "vs") (UNSPEC_VAESZVS "vs") + (UNSPEC_VSM4RVV "vv") (UNSPEC_VSM4RVS "vs")]) (define_int_iterator UNSPEC_VRORL [UNSPEC_VROL UNSPEC_VROR]) @@ -69,10 +77,12 @@ (define_int_iterator UNSPEC_CRYPTO_VV [UNSPEC_VGMUL UNSPEC_VAESEFVV UNSPEC_VAESEMVV UNSPEC_VAESDFVV UNSPEC_VAESDMVV UNSPEC_VAESEFVS UNSPEC_VAESEMVS UNSPEC_VAESDFVS UNSPEC_VAESDMVS - UNSPEC_VAESZVS]) + UNSPEC_VAESZVS UNSPEC_VSM4RVV UNSPEC_VSM4RVS]) (define_int_iterator UNSPEC_VGNHAB [UNSPEC_VGHSH UNSPEC_VSHA2MS UNSPEC_VSHA2CH UNSPEC_VSHA2CL]) +(define_int_iterator UNSPEC_CRYPTO_VI [UNSPEC_VAESKF1 UNSPEC_VSM4K]) + ;; zvbb instructions patterns. ;; vandn.vv vandn.vx vrol.vv vrol.vx ;; vror.vv vror.vx vror.vi @@ -338,7 +348,7 @@ [(match_operand:VSI 1 "register_operand" " 0") (match_operand:VSI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV) (match_dup 1)))] - "TARGET_ZVKNED" + "TARGET_ZVKG || TARGET_ZVKNED" "v.\t%0,%2" [(set_attr "type" "v") (set_attr "mode" "")]) @@ -356,7 +366,7 @@ [(match_operand:VSI 1 "register_operand" " 0") (match_operand:VSI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV) (match_dup 1)))] - "TARGET_ZVKNED" + "TARGET_ZVKNED || TARGET_ZVKSED" "v.\t%0,%2" [(set_attr "type" "v") (set_attr "mode" "")]) @@ -374,7 +384,7 @@ [(match_operand: 1 "register_operand" " 0") (match_operand:VLMULX2_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV) (match_dup 1)))] - "TARGET_ZVKNED" + "TARGET_ZVKNED || TARGET_ZVKSED" "v.\t%0,%2" [(set_attr "type" "v") (set_attr "mode" "")]) @@ -392,7 +402,7 @@ [(match_operand: 1 "register_operand" " 0") (match_operand:VLMULX4_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV) (match_dup 1)))] - "TARGET_ZVKNED" + "TARGET_ZVKNED || TARGET_ZVKSED" "v.\t%0,%2" [(set_attr "type" "v") (set_attr "mode" "")]) @@ -410,7 +420,7 @@ [(match_operand: 1 "register_operand" " 0") (match_operand:VLMULX8_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV) (match_dup 1)))] - "TARGET_ZVKNED" + "TARGET_ZVKNED || TARGET_ZVKSED" "v.\t%0,%2" [(set_attr "type" "v") (set_attr "mode" "")]) @@ -428,13 +438,13 @@ [(match_operand: 1 "register_operand" " 0") (match_operand:VLMULX16_SI 2 "register_operand" "vr")] UNSPEC_CRYPTO_VV) (match_dup 1)))] - "TARGET_ZVKNED" + "TARGET_ZVKNED || TARGET_ZVKSED" "v.\t%0,%2" [(set_attr "type" "v") (set_attr "mode" "")]) -;; vaeskf1.vi -(define_insn "@pred_vaeskf1_scalar" +;; vaeskf1.vi vsm4k.vi +(define_insn "@pred_crypto_vi_scalar" [(set (match_operand:VSI 0 "register_operand" "=vd, vd") (if_then_else:VSI (unspec: @@ -445,11 +455,11 @@ (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VSI [(match_operand:VSI 2 "register_operand" "vr, vr") - (match_operand: 3 "const_int_operand" " i, i")] UNSPEC_VAESKF1) + (match_operand: 3 "const_int_operand" " i, i")] UNSPEC_CRYPTO_VI) (match_operand:VSI 1 "vector_merge_operand" "vu, 0")))] - "TARGET_ZVKNED" - "vaeskf1.vi\t%0,%2,%3" - [(set_attr "type" "vaeskf1") + "TARGET_ZVKNED || TARGET_ZVKSED" + "v.vi\t%0,%2,%3" + [(set_attr "type" "v") (set_attr "mode" "")]) ;; vaeskf2.vi diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index e2de9bfc496..7fae91b3860 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -54,7 +54,7 @@ vgather,vcompress,vlsegde,vssegte,vlsegds,vssegts,vlsegdux,vlsegdox,\ vssegtux,vssegtox,vlsegdff,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,\ vror,vwsll,vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\ - vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl") + vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r") (const_string "true")] (const_string "false"))) @@ -78,7 +78,7 @@ vgather,vcompress,vlsegde,vssegte,vlsegds,vssegts,vlsegdux,vlsegdox,\ vssegtux,vssegtox,vlsegdff,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,\ vror,vwsll,vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\ - vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl") + vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r") (const_string "true")] (const_string "false"))) @@ -707,7 +707,7 @@ (const_int 2) (eq_attr "type" "vimerge,vfmerge,vcompress,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,\ - vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl") + vaeskf1,vaeskf2,vaesz,vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r") (const_int 1) (eq_attr "type" "vimuladd,vfmuladd") @@ -747,7 +747,7 @@ vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\ vfncvtftoi,vfncvtftof,vfclass,vimovxv,vfmovfv,vcompress,\ vlsegde,vssegts,vssegtux,vssegtox,vlsegdff,vbrev,vbrev8,vrev8,\ - vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl") + vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl,vsm4k") (const_int 4) ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast. @@ -770,7 +770,7 @@ (const_int 6) (eq_attr "type" "vmpop,vmffs,vmidx,vssegte,vclz,vctz,vgmul,vaesef,vaesem,vaesdf,vaesdm,\ - vaesz") + vaesz,vsm4r") (const_int 3)] (const_int INVALID_ATTRIBUTE))) @@ -780,7 +780,7 @@ vfcvtitof,vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,\ vfncvtitof,vfncvtftoi,vfncvtftof,vfclass,vimovxv,vfmovfv,\ vcompress,vldff,vlsegde,vlsegdff,vbrev,vbrev8,vrev8,vghsh,\ - vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl") + vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl,vsm4k") (symbol_ref "riscv_vector::get_ta(operands[5])") ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast. @@ -802,7 +802,7 @@ (eq_attr "type" "vimuladd,vfmuladd") (symbol_ref "riscv_vector::get_ta(operands[7])") - (eq_attr "type" "vmidx,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz") + (eq_attr "type" "vmidx,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz,vsm4r") (symbol_ref "riscv_vector::get_ta(operands[4])")] (const_int INVALID_ATTRIBUTE))) @@ -844,7 +844,8 @@ vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\ vimovxv,vfmovfv,vlsegde,vlsegdff,vbrev,vbrev8,vrev8") (const_int 7) - (eq_attr "type" "vldm,vstm,vmalu,vmalu,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz") + (eq_attr "type" "vldm,vstm,vmalu,vmalu,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaesz,\ + vsm4r") (const_int 5) ;; If operands[3] of "vlds" is not vector mode, it is pred_broadcast. @@ -867,7 +868,8 @@ (eq_attr "type" "vimuladd,vfmuladd") (const_int 9) - (eq_attr "type" "vmsfs,vmidx,vcompress,vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl") + (eq_attr "type" "vmsfs,vmidx,vcompress,vghsh,vaeskf1,vaeskf2,vsha2ms,vsha2ch,vsha2cl,\ + vsm4k") (const_int 6) (eq_attr "type" "vmpop,vmffs,vssegte,vclz,vctz") diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp b/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp index 13f1302314a..7d87b0c1bee 100644 --- a/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp +++ b/gcc/testsuite/gcc.target/riscv/zvk/zvk.exp @@ -46,6 +46,7 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/zvknha/*.\[cS\]]] \ "" $DEFAULT_CFLAGS dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/zvknhb/*.\[cS\]]] \ "" $DEFAULT_CFLAGS - +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/zvksed/*.\[cS\]]] \ + "" $DEFAULT_CFLAGS # All done. dg-finish diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c new file mode 100644 index 00000000000..7a8a0857f31 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */ + +#include "riscv_vector.h" + +/* non-policy */ +vuint32mf2_t test_vsm4k_vi_u32mf2(vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32mf2(vs2, 0, vl); +} + +vuint32m1_t test_vsm4k_vi_u32m1(vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m1(vs2, 0, vl); +} + +vuint32m2_t test_vsm4k_vi_u32m2(vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m2(vs2, 0, vl); +} + +vuint32m4_t test_vsm4k_vi_u32m4(vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m4(vs2, 0, vl); +} + +vuint32m8_t test_vsm4k_vi_u32m8(vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m8(vs2, 0, vl); +} + +/* policy */ +vuint32mf2_t test_vsm4k_vi_u32mf2_tu(vuint32mf2_t maskedoff, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32mf2_tu(maskedoff, vs2, 0, vl); +} + +vuint32m1_t test_vsm4k_vi_u32m1_tu(vuint32m1_t maskedoff, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m1_tu(maskedoff, vs2, 0, vl); +} + +vuint32m2_t test_vsm4k_vi_u32m2_tu(vuint32m2_t maskedoff, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m2_tu(maskedoff, vs2, 0, vl); +} + +vuint32m4_t test_vsm4k_vi_u32m4_tu(vuint32m4_t maskedoff, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m4_tu(maskedoff, vs2, 0, vl); +} + +vuint32m8_t test_vsm4k_vi_u32m8_tu(vuint32m8_t maskedoff, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4k_vi_u32m8_tu(maskedoff, vs2, 0, vl); +} + +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 5 } } */ +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 5 } } */ +/* { dg-final { scan-assembler-times {vsm4k\.vi\s+v[0-9]+,\s*v[0-9]+,0} 10 } } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c new file mode 100644 index 00000000000..dd06a7e58d8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4k_overloaded.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */ + +#include "riscv_vector.h" + +/* non-policy */ +vuint32mf2_t test_vsm4k_vi_u32mf2(vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4k(vs2, 0, vl); +} + +vuint32m1_t test_vsm4k_vi_u32m1(vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4k(vs2, 0, vl); +} + +vuint32m2_t test_vsm4k_vi_u32m2(vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4k(vs2, 0, vl); +} + +vuint32m4_t test_vsm4k_vi_u32m4(vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4k(vs2, 0, vl); +} + +vuint32m8_t test_vsm4k_vi_u32m8(vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4k(vs2, 0, vl); +} + +/* policy */ +vuint32mf2_t test_vsm4k_vi_u32mf2_tu(vuint32mf2_t maskedoff, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl); +} + +vuint32m1_t test_vsm4k_vi_u32m1_tu(vuint32m1_t maskedoff, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl); +} + +vuint32m2_t test_vsm4k_vi_u32m2_tu(vuint32m2_t maskedoff, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl); +} + +vuint32m4_t test_vsm4k_vi_u32m4_tu(vuint32m4_t maskedoff, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl); +} + +vuint32m8_t test_vsm4k_vi_u32m8_tu(vuint32m8_t maskedoff, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4k_tu(maskedoff, vs2, 0, vl); +} + +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 5 } } */ +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 5 } } */ +/* { dg-final { scan-assembler-times {vsm4k\.vi\s+v[0-9]+,\s*v[0-9]+,0} 10 } } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c new file mode 100644 index 00000000000..dac66db3abb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */ +#include "riscv_vector.h" + +/* non-policy */ +vuint32mf2_t test_vsm4r_vv_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32mf2(vd, vs2, vl); +} + +vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32mf2(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32mf2_u32m1(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m1(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32mf2_u32m2(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m2(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32mf2_u32m4(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m4(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32mf2_u32m8(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m8(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vv_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m1(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32m1_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m1(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m1_u32m2(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m2(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m1_u32m4(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m4(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m1_u32m8(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m8(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vv_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m2(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m2_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m2_u32m2(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m2_u32m4(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m2_u32m4(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m2_u32m8(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m2_u32m8(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vv_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m4(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m4_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m4_u32m4(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m4_u32m8(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m4_u32m8(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vv_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m8(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m8_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m8_u32m8(vd, vs2, vl); +} + +/* policy */ +vuint32mf2_t test_vsm4r_vv_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32mf2_tu(vd, vs2, vl); +} + +vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32mf2_tu(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32mf2_u32m1_tu(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m1_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32mf2_u32m2_tu(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m2_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32mf2_u32m4_tu(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m4_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32mf2_u32m8_tu(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32mf2_u32m8_tu(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vv_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m1_tu(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32m1_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m1_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m1_u32m2_tu(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m2_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m1_u32m4_tu(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m4_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m1_u32m8_tu(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m1_u32m8_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vv_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m2_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m2_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m2_u32m2_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m2_u32m4_tu(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m2_u32m4_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m2_u32m8_tu(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m2_u32m8_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vv_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m4_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m4_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m4_u32m4_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m4_u32m8_tu(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m4_u32m8_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vv_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vv_u32m8_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m8_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vs_u32m8_u32m8_tu(vd, vs2, vl); +} + +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 20 } } */ +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 20 } } */ +/* { dg-final { scan-assembler-times {vsm4r\.vv\s+v[0-9]+,\s*v[0-9]} 10 } } */ +/* { dg-final { scan-assembler-times {vsm4r\.vs\s+v[0-9]+,\s*v[0-9]} 30 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c new file mode 100644 index 00000000000..6311adfb2d5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zvk/zvksed/vsm4r_overloaded.c @@ -0,0 +1,170 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zvksed -mabi=lp64d -O2 -Wno-psabi" } */ +#include "riscv_vector.h" + +/* non-policy */ +vuint32mf2_t test_vsm4r_vv_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vv(vd, vs2, vl); +} + +vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32mf2_u32m1(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32mf2_u32m2(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32mf2_u32m4(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32mf2_u32m8(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vv_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vv(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32m1_u32m1(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m1_u32m2(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m1_u32m4(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m1_u32m8(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vv_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vv(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m2_u32m2(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m2_u32m4(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m2_u32m8(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vv_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vv(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m4_u32m4(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m4_u32m8(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vv_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vv(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m8_u32m8(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vs(vd, vs2, vl); +} + +/* policy */ +vuint32mf2_t test_vsm4r_vv_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vv_tu(vd, vs2, vl); +} + +vuint32mf2_t test_vsm4r_vs_u32mf2_u32mf2_tu(vuint32mf2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32mf2_u32m1_tu(vuint32m1_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32mf2_u32m2_tu(vuint32m2_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32mf2_u32m4_tu(vuint32m4_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32mf2_u32m8_tu(vuint32m8_t vd, vuint32mf2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vv_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vv_tu(vd, vs2, vl); +} + +vuint32m1_t test_vsm4r_vs_u32m1_u32m1_tu(vuint32m1_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m1_u32m2_tu(vuint32m2_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m1_u32m4_tu(vuint32m4_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m1_u32m8_tu(vuint32m8_t vd, vuint32m1_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vv_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vv_tu(vd, vs2, vl); +} + +vuint32m2_t test_vsm4r_vs_u32m2_u32m2_tu(vuint32m2_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m2_u32m4_tu(vuint32m4_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m2_u32m8_tu(vuint32m8_t vd, vuint32m2_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vv_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vv_tu(vd, vs2, vl); +} + +vuint32m4_t test_vsm4r_vs_u32m4_u32m4_tu(vuint32m4_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m4_u32m8_tu(vuint32m8_t vd, vuint32m4_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vv_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vv_tu(vd, vs2, vl); +} + +vuint32m8_t test_vsm4r_vs_u32m8_u32m8_tu(vuint32m8_t vd, vuint32m8_t vs2, size_t vl) { + return __riscv_vsm4r_vs_tu(vd, vs2, vl); +} + +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*ta,\s*ma} 20 } } */ +/* { dg-final { scan-assembler-times {vsetvli\s*zero,\s*[a-x0-9]+,\s*[a-x0-9]+,m[a-x0-9]+,\s*tu,\s*ma} 20 } } */ +/* { dg-final { scan-assembler-times {vsm4r\.vv\s+v[0-9]+,\s*v[0-9]} 10 } } */ +/* { dg-final { scan-assembler-times {vsm4r\.vs\s+v[0-9]+,\s*v[0-9]} 30 } } */