From patchwork Mon Dec 4 13:44:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 81284 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D09FD3858018 for ; Mon, 4 Dec 2023 13:45:30 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgjp3.qq.com (smtpbgjp3.qq.com [54.92.39.34]) by sourceware.org (Postfix) with ESMTPS id A9F5538582BB for ; Mon, 4 Dec 2023 13:45:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A9F5538582BB Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A9F5538582BB Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=54.92.39.34 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701697513; cv=none; b=V9fTmwwxvuE9xKZvfRZ7sIDU5KaVPYrWcu/eSKa17LAy2Qj09VCLcXV9QUZHhx0yDds5qXSnKVnCI3cWxQ3dprobePnNVGK5vFM5L38vvLhdWOS0gTdU3otgHbI3SBQvyh1+zrQEjKHqsuToTEINFluTkWeCliC4pn9FD+iz4PQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701697513; c=relaxed/simple; bh=Ug0BWwTQnBRhKP/sqX6aoRmMoSobWtBe+yBZLvIhNWw=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=BFWK5ipDgbpt8RREkBPwXrwZ6rzb4svpZ3/ajLED7dFwGYrSHHR+/89rwcGvSnxWyX7/fvsOUsAblEudweNlTpW1GaaiTYj+Th/Qs7kFi5BgNibuT5qQRSGt/lJK4QhvOaNeESsrJJTGOkOt0JOOi/PMl96m14YdHDHtVYLGQ68= ARC-Authentication-Results: i=1; server2.sourceware.org X-QQ-mid: bizesmtp71t1701697498t7ngc7yg Received: from rios-cad121.hadoop.rioslab.org ( [58.60.1.9]) by bizesmtp.qq.com (ESMTP) with id ; Mon, 04 Dec 2023 21:44:57 +0800 (CST) X-QQ-SSF: 01400000000000G0V000000A0000000 X-QQ-FEAT: znfcQSa1hKaq+Vcso8t+Pkot4iwwcQLDwqFVn2I3aemuBxIhOpMpH0nsnjXi2 lWTcYX0ACHHyBlr/ZMXYZjZSSTqdn4AfqX023Q49OFF0uqRYz1iWiuYeltF05M1gOOvDuxY U5URUDjPhLuhPESIilSLKwdOGXY/vwxcan6hb5GPLhRCwCCHC4B2BexkFmD5hqM0wP/Fksu iEkdYMeQR7v+ad3oWzFHtXrv4DO16G+eReEdWjPB0XN9C8EkbMtoDaOY/RsOvflUkMk26Q0 6j9gFsJ124mPzq0iU5P/upmLX/TW1ZbRdijva/YyMXT6OjVW4IRb0eAj5wCOYQAHvfep36Y XpeK1odtxX1BkxIhfrUrRudf/HJOXttR2VLTpCGdUFvr1tvpCnaN3goaksMu209ajPQ5DTD WRuYqDue6nEjT3RsGmxAWQ== X-QQ-GoodBg: 2 X-BIZMAIL-ID: 13140439415722099536 From: Juzhe-Zhong To: gcc-patches@gcc.gnu.org Cc: rdapp.gcc@gmail.com, Juzhe-Zhong Subject: [Committed V2] RISC-V: Fix overlap group incorrect overlap on v0 Date: Mon, 4 Dec 2023 21:44:56 +0800 Message-Id: <20231204134456.453587-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-9.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_35_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org In serious high register pressure case (appended in this patch): We see vluxei8.v v0,(s1),v1,v0.t which is not allowed. Since according to RVV ISA: +;; The destination vector register group for a masked vector instruction cannot overlap the source mask register (v0), +;; unless the destination vector register is being written with a mask value (e.g., compares) or the scalar result of a reduction. Such case doesn't have spillings, however, we expect such case should be spilled and reload data. The rootcause is I made a mistake in previous patch on matching dest operand and mask operand constraints: dest: "=vr" mask: "vmWc1" After this patch: dest: "vd,vr" mask: "vm,Wc1" make EEW widening pattern are same as other instruction patterns. PR target/112431 gcc/ChangeLog: * config/riscv/vector.md: Fix incorrect overlap in v0. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr112431-34.c: New test. --- gcc/config/riscv/vector.md | 268 +++++++++--------- .../gcc.target/riscv/rvv/base/pr112431-34.c | 101 +++++++ 2 files changed, 235 insertions(+), 134 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-34.c diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index cec1edc8190..ba0714a9971 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -2223,70 +2223,70 @@ ;; DEST eew is greater than SOURCE eew. (define_insn "@pred_indexed_load_x2_greater_eew" - [(set (match_operand:VEEWEXT2 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VEEWEXT2 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VEEWEXT2 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWEXT2 - [(match_operand 3 "pmode_reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ") + [(match_operand 3 "pmode_reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")] ORDER) - (match_operand:VEEWEXT2 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 4 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")] ORDER) + (match_operand:VEEWEXT2 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%z3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) (define_insn "@pred_indexed_load_x4_greater_eew" - [(set (match_operand:VEEWEXT4 0 "register_operand" "=vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VEEWEXT4 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VEEWEXT4 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWEXT4 - [(match_operand 3 "pmode_reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ") + [(match_operand 3 "pmode_reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " W43, W43, W86, W86, vr, vr")] ORDER) - (match_operand:VEEWEXT4 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] + (match_operand: 4 "register_operand" "W43,W43,W43,W43,W86,W86,W86,W86, vr, vr")] ORDER) + (match_operand:VEEWEXT4 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%z3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "") - (set_attr "group_overlap" "W43,W43,W86,W86,none,none")]) + (set_attr "group_overlap" "W43,W43,W43,W43,W86,W86,W86,W86,none,none")]) (define_insn "@pred_indexed_load_x8_greater_eew" - [(set (match_operand:VEEWEXT8 0 "register_operand" "=vr, vr, ?&vr, ?&vr") + [(set (match_operand:VEEWEXT8 0 "register_operand" "=vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VEEWEXT8 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWEXT8 - [(match_operand 3 "pmode_reg_or_0_operand" " rJ, rJ, rJ, rJ") + [(match_operand 3 "pmode_reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " W87, W87, vr, vr")] ORDER) - (match_operand:VEEWEXT8 2 "vector_merge_operand" " vu, 0, vu, 0")))] + (match_operand: 4 "register_operand" "W87,W87,W87,W87, vr, vr")] ORDER) + (match_operand:VEEWEXT8 2 "vector_merge_operand" " vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%z3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "") - (set_attr "group_overlap" "W87,W87,none,none")]) + (set_attr "group_overlap" "W87,W87,W87,W87,none,none")]) ;; DEST eew is smaller than SOURCE eew. (define_insn "@pred_indexed_load_x2_smaller_eew" @@ -3686,66 +3686,66 @@ ;; Vector Double-Widening Sign-extend and Zero-extend. (define_insn "@pred__vf2" - [(set (match_operand:VWEXTI 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_extend:VWEXTI - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) - (match_operand:VWEXTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) + (match_operand:VWEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vext.vf2\t%0,%3%p1" [(set_attr "type" "vext") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) ;; Vector Quad-Widening Sign-extend and Zero-extend. (define_insn "@pred__vf4" - [(set (match_operand:VQEXTI 0 "register_operand" "=vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VQEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VQEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_extend:VQEXTI - (match_operand: 3 "register_operand" " W43, W43, W86, W86, vr, vr")) - (match_operand:VQEXTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] + (match_operand: 3 "register_operand" "W43,W43,W43,W43,W86,W86,W86,W86, vr, vr")) + (match_operand:VQEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vext.vf4\t%0,%3%p1" [(set_attr "type" "vext") (set_attr "mode" "") - (set_attr "group_overlap" "W43,W43,W86,W86,none,none")]) + (set_attr "group_overlap" "W43,W43,W43,W43,W86,W86,W86,W86,none,none")]) ;; Vector Oct-Widening Sign-extend and Zero-extend. (define_insn "@pred__vf8" - [(set (match_operand:VOEXTI 0 "register_operand" "=vr, vr, ?&vr, ?&vr") + [(set (match_operand:VOEXTI 0 "register_operand" "=vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VOEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_extend:VOEXTI - (match_operand: 3 "register_operand" " W87, W87, vr, vr")) - (match_operand:VOEXTI 2 "vector_merge_operand" " vu, 0, vu, 0")))] + (match_operand: 3 "register_operand" "W87,W87,W87,W87, vr, vr")) + (match_operand:VOEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vext.vf8\t%0,%3%p1" [(set_attr "type" "vext") (set_attr "mode" "") - (set_attr "group_overlap" "W87,W87,none,none")]) + (set_attr "group_overlap" "W87,W87,W87,W87,none,none")]) ;; Vector Widening Add/Subtract/Multiply. (define_insn "@pred_dual_widen_" @@ -3771,28 +3771,28 @@ (set_attr "mode" "")]) (define_insn "@pred_dual_widen__scalar" - [(set (match_operand:VWEXTI 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_widen_binop:VWEXTI (any_extend:VWEXTI - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) (any_extend:VWEXTI (vec_duplicate: - (match_operand: 4 "reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ")))) - (match_operand:VWEXTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 4 "reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ")))) + (match_operand:VWEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vw.vx\t%0,%3,%z4%p1" [(set_attr "type" "vi") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) (define_insn "@pred_single_widen_sub" [(set (match_operand:VWEXTI 0 "register_operand" "=&vr,&vr") @@ -3881,47 +3881,47 @@ (set_attr "mode" "")]) (define_insn "@pred_widen_mulsu_scalar" - [(set (match_operand:VWEXTI 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (mult:VWEXTI (sign_extend:VWEXTI - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) (zero_extend:VWEXTI (vec_duplicate: - (match_operand: 4 "reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ")))) - (match_operand:VWEXTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 4 "reg_or_0_operand" " rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ, rJ")))) + (match_operand:VWEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vwmulsu.vx\t%0,%3,%z4%p1" [(set_attr "type" "viwmul") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) ;; vwcvt.x.x.v (define_insn "@pred_" - [(set (match_operand:VWEXTI 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWEXTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (plus:VWEXTI (any_extend:VWEXTI - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) (vec_duplicate:VWEXTI (reg: X0_REGNUM))) - (match_operand:VWEXTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand:VWEXTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vwcvt.x.x.v\t%0,%3%p1" [(set_attr "type" "viwalu") @@ -3930,7 +3930,7 @@ (set (attr "ta") (symbol_ref "riscv_vector::get_ta(operands[5])")) (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) (set (attr "avl_type_idx") (const_int 7)) - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) ;; ------------------------------------------------------------------------------- ;; ---- Predicated integer Narrowing operations @@ -7045,32 +7045,32 @@ (symbol_ref "riscv_vector::get_frm_mode (operands[9])"))]) (define_insn "@pred_dual_widen__scalar" - [(set (match_operand:VWEXTF 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWEXTF 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTF (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 9 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 9 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM) (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE) (any_widen_binop:VWEXTF (float_extend:VWEXTF - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) (float_extend:VWEXTF (vec_duplicate: - (match_operand: 4 "register_operand" " f, f, f, f, f, f, f, f")))) - (match_operand:VWEXTF 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 4 "register_operand" " f, f, f, f, f, f, f, f, f, f, f, f, f, f")))) + (match_operand:VWEXTF 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vfw.vf\t%0,%3,%4%p1" [(set_attr "type" "vf") (set_attr "mode" "") (set (attr "frm_mode") (symbol_ref "riscv_vector::get_frm_mode (operands[9])")) - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) (define_insn "@pred_single_widen_add" [(set (match_operand:VWEXTF 0 "register_operand" "=&vr, &vr") @@ -7635,88 +7635,88 @@ ;; ------------------------------------------------------------------------------- (define_insn "@pred_widen_fcvt_x_f" - [(set (match_operand:VWCONVERTI 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWCONVERTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWCONVERTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM) (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE) (unspec:VWCONVERTI - [(match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")] VFCVTS) - (match_operand:VWCONVERTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + [(match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")] VFCVTS) + (match_operand:VWCONVERTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vfwcvt.x.f.v\t%0,%3%p1" [(set_attr "type" "vfwcvtftoi") (set_attr "mode" "") (set (attr "frm_mode") (symbol_ref "riscv_vector::get_frm_mode (operands[8])")) - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) (define_insn "@pred_widen_" - [(set (match_operand:VWCONVERTI 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWCONVERTI 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWCONVERTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_fix:VWCONVERTI - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) - (match_operand:VWCONVERTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) + (match_operand:VWCONVERTI 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vfwcvt.rtz.x.f.v\t%0,%3%p1" [(set_attr "type" "vfwcvtftoi") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) (define_insn "@pred_widen_" - [(set (match_operand:V_VLSF 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:V_VLSF 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:V_VLSF (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_float:V_VLSF - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) - (match_operand:V_VLSF 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) + (match_operand:V_VLSF 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vfwcvt.f.x.v\t%0,%3%p1" [(set_attr "type" "vfwcvtitof") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) (define_insn "@pred_extend" - [(set (match_operand:VWEXTF_ZVFHMIN 0 "register_operand" "=vr, vr, vr, vr, vr, vr, ?&vr, ?&vr") + [(set (match_operand:VWEXTF_ZVFHMIN 0 "register_operand" "=vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, vd, vr, ?&vr, ?&vr") (if_then_else:VWEXTF_ZVFHMIN (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 6 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" "i, i, i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (float_extend:VWEXTF_ZVFHMIN - (match_operand: 3 "register_operand" " W21, W21, W42, W42, W84, W84, vr, vr")) - (match_operand:VWEXTF_ZVFHMIN 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] + (match_operand: 3 "register_operand" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84, vr, vr")) + (match_operand:VWEXTF_ZVFHMIN 2 "vector_merge_operand" " vu, vu, 0, 0, vu, vu, 0, 0, vu, vu, 0, 0, vu, 0")))] "TARGET_VECTOR" "vfwcvt.f.f.v\t%0,%3%p1" [(set_attr "type" "vfwcvtftof") (set_attr "mode" "") - (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")]) + (set_attr "group_overlap" "W21,W21,W21,W21,W42,W42,W42,W42,W84,W84,W84,W84,none,none")]) ;; ------------------------------------------------------------------------------- ;; ---- Predicated floating-point narrow conversions diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-34.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-34.c new file mode 100644 index 00000000000..80ea65b85ff --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-34.c @@ -0,0 +1,101 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +size_t __attribute__ ((noinline)) +sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3, size_t sum4, + size_t sum5, size_t sum6, size_t sum7, size_t sum8, size_t sum9, + size_t sum10, size_t sum11, size_t sum12, size_t sum13, size_t sum14, + size_t sum15) +{ + return sum0 + sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9 + + sum10 + sum11 + sum12 + sum13 + sum14 + sum15; +} + +size_t +foo (char const *buf, size_t len) +{ + size_t sum = 0; + size_t vl = __riscv_vsetvlmax_e8m8 (); + size_t step = vl * 4; + const char *it = buf, *end = buf + len; + for (; it + step <= end;) + { + vuint8m1_t v0 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v1 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v2 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v3 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v4 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v5 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v6 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v7 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v8 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v9 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v10 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v11 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v12 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v13 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v14 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + vuint8m1_t v15 = __riscv_vle8_v_u8m1 ((void *) it, vl); + it += vl; + + asm volatile("nop" ::: "memory"); + vint16m2_t vw0 = __riscv_vluxei8_v_i16m2 ((void *) it, v0, vl); + vint16m2_t vw1 = __riscv_vluxei8_v_i16m2 ((void *) it, v1, vl); + vint16m2_t vw2 = __riscv_vluxei8_v_i16m2 ((void *) it, v2, vl); + vint16m2_t vw3 = __riscv_vluxei8_v_i16m2 ((void *) it, v3, vl); + vint16m2_t vw4 = __riscv_vluxei8_v_i16m2 ((void *) it, v4, vl); + vint16m2_t vw5 = __riscv_vluxei8_v_i16m2 ((void *) it, v5, vl); + vint16m2_t vw6 = __riscv_vluxei8_v_i16m2 ((void *) it, v6, vl); + vint16m2_t vw7 = __riscv_vluxei8_v_i16m2 ((void *) it, v7, vl); + vint16m2_t vw8 = __riscv_vluxei8_v_i16m2 ((void *) it, v8, vl); + vint16m2_t vw9 = __riscv_vluxei8_v_i16m2 ((void *) it, v9, vl); + vint16m2_t vw10 = __riscv_vluxei8_v_i16m2 ((void *) it, v10, vl); + vint16m2_t vw11 = __riscv_vluxei8_v_i16m2 ((void *) it, v11, vl); + vint16m2_t vw12 = __riscv_vluxei8_v_i16m2 ((void *) it, v12, vl); + vint16m2_t vw13 = __riscv_vluxei8_v_i16m2 ((void *) it, v13, vl); + vint16m2_t vw14 = __riscv_vluxei8_v_i16m2 ((void *) it, v14, vl); + vbool8_t mask = *(vbool8_t*)it; + vint16m2_t vw15 = __riscv_vluxei8_v_i16m2_m (mask, (void *) it, v15, vl); + + asm volatile("nop" ::: "memory"); + size_t sum0 = __riscv_vmv_x_s_i16m2_i16 (vw0); + size_t sum1 = __riscv_vmv_x_s_i16m2_i16 (vw1); + size_t sum2 = __riscv_vmv_x_s_i16m2_i16 (vw2); + size_t sum3 = __riscv_vmv_x_s_i16m2_i16 (vw3); + size_t sum4 = __riscv_vmv_x_s_i16m2_i16 (vw4); + size_t sum5 = __riscv_vmv_x_s_i16m2_i16 (vw5); + size_t sum6 = __riscv_vmv_x_s_i16m2_i16 (vw6); + size_t sum7 = __riscv_vmv_x_s_i16m2_i16 (vw7); + size_t sum8 = __riscv_vmv_x_s_i16m2_i16 (vw8); + size_t sum9 = __riscv_vmv_x_s_i16m2_i16 (vw9); + size_t sum10 = __riscv_vmv_x_s_i16m2_i16 (vw10); + size_t sum11 = __riscv_vmv_x_s_i16m2_i16 (vw11); + size_t sum12 = __riscv_vmv_x_s_i16m2_i16 (vw12); + size_t sum13 = __riscv_vmv_x_s_i16m2_i16 (vw13); + size_t sum14 = __riscv_vmv_x_s_i16m2_i16 (vw14); + size_t sum15 = __riscv_vmv_x_s_i16m2_i16 (vw15); + + sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7, sum8, + sum9, sum10, sum11, sum12, sum13, sum14, sum15); + } + return sum; +} + +/* { dg-final { scan-assembler-not {vluxei8\.v\tv0,\s*\([a-x0-9]+\),\s*v[0-9]+,\s*v0.t} } } */