From patchwork Fri Dec  1 07:00:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Juzhe-Zhong <juzhe.zhong@rivai.ai>
X-Patchwork-Id: 81071
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id E533D3857025
	for <patchwork@sourceware.org>; Fri,  1 Dec 2023 07:00:56 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from smtpbgjp3.qq.com (smtpbgjp3.qq.com [54.92.39.34])
 by sourceware.org (Postfix) with ESMTPS id B56413858D38
 for <gcc-patches@gcc.gnu.org>; Fri,  1 Dec 2023 07:00:34 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B56413858D38
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivai.ai
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B56413858D38
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=54.92.39.34
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701414041; cv=none;
 b=M4cYJsiXUUUagmQBQvqdXrvjqJayXG5jjudCEk2ikiL2hYlGDMwTkKSUFUAvxR353abIAj+AOWsZBD5KZybX2y5h4X+sd3rJ7V7dvZZqNMvQFfftMF1ViWeiH3m7buCTCwXfFFGrMx0k3HCAh7DKOfn18zfiwyeSpHcbfumkU2k=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701414041; c=relaxed/simple;
 bh=Nz01JQtyHzF043/8xsbeTm29ydbFZqjSfV7aunPj060=;
 h=From:To:Subject:Date:Message-Id:MIME-Version;
 b=vtIv5vLM0HcSuRYP9nkU2FffTOx9CmmMiSllxCHpZzoFtJpR58w7SjZJvsOp6wguJV0IKHcgW18yNzQ45JAl/T/D4uGj2+QtOXkv/hMF0yJqYFWEGR7+4pxHCwXL5ckm/l0+jjUYaivqHuMQOD1vjlWBRtQJfUoSzGWA5UXie8k=
ARC-Authentication-Results: i=1; server2.sourceware.org
X-QQ-mid: bizesmtp63t1701414029tbx9oe5f
Received: from rios-cad122.hadoop.rioslab.org ( [58.60.1.26])
 by bizesmtp.qq.com (ESMTP) with
 id ; Fri, 01 Dec 2023 15:00:27 +0800 (CST)
X-QQ-SSF: 01400000000000G0V000000A0000000
X-QQ-FEAT: QityeSR92A2oTTybNb3UhzqgdQCRre1hegFaD/8cRMVCX9KXl3lH2/WBC81r+
 ZkiRJm8S78lzw66tzEpMkuDWRY1wgBdARhzyx3nfaq39pBGPyMeNJhfjrLyqM6UkjKO5yyH
 Rof3PHYzhDehiq89fGosljw0FM8NikIW6V0RVlF0nLIZBsq3LF8+j7zkXNUNCYIpx0wH9xv
 upbFEPNfFsQJ1dLdlQn5hoR2sgfRURyovz6r4pKF8oBy63RVUEP02+OreM8P0uZG9X1LdaV
 txnx0mSui0isEyOs+7F/LGxveJgmtmLl1k89N1UrwLRBKb1X0vFJHtC7M37TTk6zkAXdPAe
 b2ZWPxcK3z86ht6qw7w0Ijbpz+YohZw5D/ATqqMeLmBeXK+L5BVigSQcf1xS4P5V4JnAnoM
 z3LCy8pQ50I=
X-QQ-GoodBg: 2
X-BIZMAIL-ID: 13757078184952998760
From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com,
 rdapp.gcc@gmail.com, Juzhe-Zhong <juzhe.zhong@rivai.ai>
Subject: [PATCH] RISC-V: Support highpart register overlap for widen vx/vf
 instructions
Date: Fri,  1 Dec 2023 15:00:27 +0800
Message-Id: <20231201070027.581910-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
X-QQ-SENDSIZE: 520
Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0
X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE,
 RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org

This patch leverages the same approach as vwcvt.

Before this patch:

.L5:
        add     a3,s0,s1
        add     a4,s6,s1
        add     a5,s7,s1
        vsetvli zero,s0,e32,m4,ta,ma
        vle32.v v16,0(s1)
        vle32.v v12,0(a3)
        mv      s1,s2
        vle32.v v8,0(a4)
        vle32.v v4,0(a5)
        nop
        vfwadd.vf       v24,v16,fs0
        vfwadd.vf       v16,v12,fs0
        vs8r.v  v16,0(sp)                -----> spill
        vfwadd.vf       v16,v8,fs0
        vfwadd.vf       v8,v4,fs0
        nop
        vsetvli zero,zero,e64,m8,ta,ma
        vfmv.f.s        fa4,v24
        vl8re64.v       v24,0(sp)       -----> reload
        vfmv.f.s        fa5,v24
        fcvt.lu.d a0,fa4,rtz
        fcvt.lu.d a1,fa5,rtz
        vfmv.f.s        fa4,v16
        vfmv.f.s        fa5,v8
        fcvt.lu.d a2,fa4,rtz
        fcvt.lu.d a3,fa5,rtz
        add     s2,s2,s5
        call    sumation
        add     s3,s3,a0
        bgeu    s4,s2,.L5

After this patch:

.L5:
	add	a3,s0,s1
	add	a4,s6,s1
	add	a5,s7,s1
	vsetvli	zero,s0,e32,m4,ta,ma
	vle32.v	v4,0(s1)
	vle32.v	v28,0(a3)
	mv	s1,s2
	vle32.v	v20,0(a4)
	vle32.v	v12,0(a5)
	vfwadd.vf	v0,v4,fs0
	vfwadd.vf	v24,v28,fs0
	vfwadd.vf	v16,v20,fs0
	vfwadd.vf	v8,v12,fs0
	vsetvli	zero,zero,e64,m8,ta,ma
	vfmv.f.s	fa4,v0
	vfmv.f.s	fa5,v24
	fcvt.lu.d a0,fa4,rtz
	fcvt.lu.d a1,fa5,rtz
	vfmv.f.s	fa4,v16
	vfmv.f.s	fa5,v8
	fcvt.lu.d a2,fa4,rtz
	fcvt.lu.d a3,fa5,rtz
	add	s2,s2,s5
	call	sumation
	add	s3,s3,a0
	bgeu	s4,s2,.L5

	PR target/112431

gcc/ChangeLog:

	* config/riscv/vector.md: Support highpart overlap for vx/vf.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/pr112431-22.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-23.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-24.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-25.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-26.c: New test.
	* gcc.target/riscv/rvv/base/pr112431-27.c: New test.
---
 gcc/config/riscv/vector.md                    |  65 +++---
 .../gcc.target/riscv/rvv/base/pr112431-22.c   | 188 ++++++++++++++++++
 .../gcc.target/riscv/rvv/base/pr112431-23.c   | 119 +++++++++++
 .../gcc.target/riscv/rvv/base/pr112431-24.c   |  86 ++++++++
 .../gcc.target/riscv/rvv/base/pr112431-25.c   | 104 ++++++++++
 .../gcc.target/riscv/rvv/base/pr112431-26.c   |  68 +++++++
 .../gcc.target/riscv/rvv/base/pr112431-27.c   |  51 +++++
 7 files changed, 650 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-22.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-23.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-24.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-25.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-26.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-27.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index b47b9742b62..7a1b22fb58d 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -3768,27 +3768,28 @@
    (set_attr "mode" "<V_DOUBLE_TRUNC>")])
 
 (define_insn "@pred_dual_widen_<any_widen_binop:optab><any_extend:su><mode>_scalar"
-  [(set (match_operand:VWEXTI 0 "register_operand"                  "=&vr,&vr")
+  [(set (match_operand:VWEXTI 0 "register_operand"                   "=vr,   vr,   vr,   vr,  vr,    vr, ?&vr, ?&vr")
 	(if_then_else:VWEXTI
 	  (unspec:<VM>
-	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1")
-	     (match_operand 5 "vector_length_operand"              "   rK,   rK")
-	     (match_operand 6 "const_int_operand"                  "    i,    i")
-	     (match_operand 7 "const_int_operand"                  "    i,    i")
-	     (match_operand 8 "const_int_operand"                  "    i,    i")
+	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
+	     (match_operand 5 "vector_length_operand"              "   rK,   rK,   rK,   rK,   rK,   rK,   rK,   rK")
+	     (match_operand 6 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 7 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 8 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (any_widen_binop:VWEXTI
 	    (any_extend:VWEXTI
-	      (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" "   vr,   vr"))
+	      (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" "  W21,  W21,  W42,  W42,  W84,  W84,   vr,   vr"))
 	    (any_extend:VWEXTI
 	      (vec_duplicate:<V_DOUBLE_TRUNC>
-		(match_operand:<VSUBEL> 4 "reg_or_0_operand"       "   rJ,   rJ"))))
-	  (match_operand:VWEXTI 2 "vector_merge_operand"           "   vu,    0")))]
+		(match_operand:<VSUBEL> 4 "reg_or_0_operand"       "   rJ,   rJ,   rJ,   rJ,   rJ,   rJ,   rJ,   rJ"))))
+	  (match_operand:VWEXTI 2 "vector_merge_operand"           "   vu,    0,   vu,    0,   vu,    0,   vu,    0")))]
   "TARGET_VECTOR"
   "vw<any_widen_binop:insn><any_extend:u>.vx\t%0,%3,%z4%p1"
   [(set_attr "type" "vi<widen_binop_insn_type>")
-   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")
+   (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")])
 
 (define_insn "@pred_single_widen_sub<any_extend:su><mode>"
   [(set (match_operand:VWEXTI 0 "register_operand"                  "=&vr,&vr")
@@ -3877,27 +3878,28 @@
    (set_attr "mode" "<V_DOUBLE_TRUNC>")])
 
 (define_insn "@pred_widen_mulsu<mode>_scalar"
-  [(set (match_operand:VWEXTI 0 "register_operand"                  "=&vr,&vr")
+  [(set (match_operand:VWEXTI 0 "register_operand"                   "=vr,   vr,   vr,   vr,  vr,    vr, ?&vr, ?&vr")
 	(if_then_else:VWEXTI
 	  (unspec:<VM>
-	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1")
-	     (match_operand 5 "vector_length_operand"              "   rK,   rK")
-	     (match_operand 6 "const_int_operand"                  "    i,    i")
-	     (match_operand 7 "const_int_operand"                  "    i,    i")
-	     (match_operand 8 "const_int_operand"                  "    i,    i")
+	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
+	     (match_operand 5 "vector_length_operand"              "   rK,   rK,   rK,   rK,   rK,   rK,   rK,   rK")
+	     (match_operand 6 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 7 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 8 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (mult:VWEXTI
 	    (sign_extend:VWEXTI
-	      (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" "   vr,   vr"))
+	      (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" "  W21,  W21,  W42,  W42,  W84,  W84,   vr,   vr"))
 	    (zero_extend:VWEXTI
 	      (vec_duplicate:<V_DOUBLE_TRUNC>
-		(match_operand:<VSUBEL> 4 "reg_or_0_operand"       "   rJ,   rJ"))))
-	  (match_operand:VWEXTI 2 "vector_merge_operand"           "   vu,    0")))]
+		(match_operand:<VSUBEL> 4 "reg_or_0_operand"       "   rJ,   rJ,   rJ,   rJ,   rJ,   rJ,   rJ,   rJ"))))
+	  (match_operand:VWEXTI 2 "vector_merge_operand"           "   vu,    0,   vu,    0,   vu,    0,   vu,    0")))]
   "TARGET_VECTOR"
   "vwmulsu.vx\t%0,%3,%z4%p1"
   [(set_attr "type" "viwmul")
-   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")
+   (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")])
 
 ;; vwcvt<u>.x.x.v
 (define_insn "@pred_<optab><mode>"
@@ -7037,31 +7039,32 @@
 	(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
 
 (define_insn "@pred_dual_widen_<optab><mode>_scalar"
-  [(set (match_operand:VWEXTF 0 "register_operand"                  "=&vr,  &vr")
+  [(set (match_operand:VWEXTF 0 "register_operand"                   "=vr,   vr,   vr,   vr,  vr,    vr, ?&vr, ?&vr")
 	(if_then_else:VWEXTF
 	  (unspec:<VM>
-	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1")
-	     (match_operand 5 "vector_length_operand"              "   rK,   rK")
-	     (match_operand 6 "const_int_operand"                  "    i,    i")
-	     (match_operand 7 "const_int_operand"                  "    i,    i")
-	     (match_operand 8 "const_int_operand"                  "    i,    i")
-	     (match_operand 9 "const_int_operand"                  "    i,    i")
+	    [(match_operand:<VM> 1 "vector_mask_operand"           "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
+	     (match_operand 5 "vector_length_operand"              "   rK,   rK,   rK,   rK,   rK,   rK,   rK,   rK")
+	     (match_operand 6 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 7 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 8 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
+	     (match_operand 9 "const_int_operand"                  "    i,    i,    i,    i,    i,    i,    i,    i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)
 	     (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
 	  (any_widen_binop:VWEXTF
 	    (float_extend:VWEXTF
-	      (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" "   vr,   vr"))
+	      (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand" "  W21,  W21,  W42,  W42,  W84,  W84,   vr,   vr"))
 	    (float_extend:VWEXTF
 	      (vec_duplicate:<V_DOUBLE_TRUNC>
-		(match_operand:<VSUBEL> 4 "register_operand"       "    f,    f"))))
-	  (match_operand:VWEXTF 2 "vector_merge_operand"           "   vu,    0")))]
+		(match_operand:<VSUBEL> 4 "register_operand"       "    f,    f,    f,    f,    f,    f,    f,    f"))))
+	  (match_operand:VWEXTF 2 "vector_merge_operand"           "   vu,    0,   vu,    0,   vu,    0,   vu,    0")))]
   "TARGET_VECTOR"
   "vfw<insn>.vf\t%0,%3,%4%p1"
   [(set_attr "type" "vf<widen_binop_insn_type>")
    (set_attr "mode" "<V_DOUBLE_TRUNC>")
    (set (attr "frm_mode")
-	(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
+	(symbol_ref "riscv_vector::get_frm_mode (operands[9])"))
+   (set_attr "group_overlap" "W21,W21,W42,W42,W84,W84,none,none")])
 
 (define_insn "@pred_single_widen_add<mode>"
   [(set (match_operand:VWEXTF 0 "register_operand"                  "=&vr,  &vr")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-22.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-22.c
new file mode 100644
index 00000000000..90db18217bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-22.c
@@ -0,0 +1,188 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3, size_t sum4,
+	  size_t sum5, size_t sum6, size_t sum7, size_t sum8, size_t sum9,
+	  size_t sum10, size_t sum11, size_t sum12, size_t sum13, size_t sum14,
+	  size_t sum15)
+{
+  return sum0 + sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9
+	 + sum10 + sum11 + sum12 + sum13 + sum14 + sum15;
+}
+
+size_t
+foo (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v4 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v5 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v6 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v7 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v8 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v9 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v10 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v11 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v12 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v13 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v14 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v15 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      
+      asm volatile("nop" ::: "memory");
+      vint16m2_t vw0 = __riscv_vwadd_vx_i16m2 (v0, 33, vl);
+      vint16m2_t vw1 = __riscv_vwadd_vx_i16m2 (v1, 33, vl);
+      vint16m2_t vw2 = __riscv_vwadd_vx_i16m2 (v2, 33, vl);
+      vint16m2_t vw3 = __riscv_vwadd_vx_i16m2 (v3, 33, vl);
+      vint16m2_t vw4 = __riscv_vwadd_vx_i16m2 (v4, 33, vl);
+      vint16m2_t vw5 = __riscv_vwadd_vx_i16m2 (v5, 33, vl);
+      vint16m2_t vw6 = __riscv_vwadd_vx_i16m2 (v6, 33, vl);
+      vint16m2_t vw7 = __riscv_vwadd_vx_i16m2 (v7, 33, vl);
+      vint16m2_t vw8 = __riscv_vwadd_vx_i16m2 (v8, 33, vl);
+      vint16m2_t vw9 = __riscv_vwadd_vx_i16m2 (v9, 33, vl);
+      vint16m2_t vw10 = __riscv_vwadd_vx_i16m2 (v10, 33, vl);
+      vint16m2_t vw11 = __riscv_vwadd_vx_i16m2 (v11, 33, vl);
+      vint16m2_t vw12 = __riscv_vwadd_vx_i16m2 (v12, 33, vl);
+      vint16m2_t vw13 = __riscv_vwadd_vx_i16m2 (v13, 33, vl);
+      vint16m2_t vw14 = __riscv_vwadd_vx_i16m2 (v14, 33, vl);
+      vint16m2_t vw15 = __riscv_vwadd_vx_i16m2 (v15, 33, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vmv_x_s_i16m2_i16 (vw0);
+      size_t sum1 = __riscv_vmv_x_s_i16m2_i16 (vw1);
+      size_t sum2 = __riscv_vmv_x_s_i16m2_i16 (vw2);
+      size_t sum3 = __riscv_vmv_x_s_i16m2_i16 (vw3);
+      size_t sum4 = __riscv_vmv_x_s_i16m2_i16 (vw4);
+      size_t sum5 = __riscv_vmv_x_s_i16m2_i16 (vw5);
+      size_t sum6 = __riscv_vmv_x_s_i16m2_i16 (vw6);
+      size_t sum7 = __riscv_vmv_x_s_i16m2_i16 (vw7);
+      size_t sum8 = __riscv_vmv_x_s_i16m2_i16 (vw8);
+      size_t sum9 = __riscv_vmv_x_s_i16m2_i16 (vw9);
+      size_t sum10 = __riscv_vmv_x_s_i16m2_i16 (vw10);
+      size_t sum11 = __riscv_vmv_x_s_i16m2_i16 (vw11);
+      size_t sum12 = __riscv_vmv_x_s_i16m2_i16 (vw12);
+      size_t sum13 = __riscv_vmv_x_s_i16m2_i16 (vw13);
+      size_t sum14 = __riscv_vmv_x_s_i16m2_i16 (vw14);
+      size_t sum15 = __riscv_vmv_x_s_i16m2_i16 (vw15);
+
+      sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7, sum8,
+		       sum9, sum10, sum11, sum12, sum13, sum14, sum15);
+    }
+  return sum;
+}
+
+size_t
+foo2 (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v4 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v5 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v6 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v7 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v8 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v9 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v10 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v11 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v12 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v13 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v14 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      vint8m1_t v15 = __riscv_vle8_v_i8m1 ((void *) it, vl);
+      it += vl;
+      
+      asm volatile("nop" ::: "memory");
+      vint16m2_t vw0 = __riscv_vwmulsu_vx_i16m2 (v0, 33, vl);
+      vint16m2_t vw1 = __riscv_vwmulsu_vx_i16m2 (v1, 33, vl);
+      vint16m2_t vw2 = __riscv_vwmulsu_vx_i16m2 (v2, 33, vl);
+      vint16m2_t vw3 = __riscv_vwmulsu_vx_i16m2 (v3, 33, vl);
+      vint16m2_t vw4 = __riscv_vwmulsu_vx_i16m2 (v4, 33, vl);
+      vint16m2_t vw5 = __riscv_vwmulsu_vx_i16m2 (v5, 33, vl);
+      vint16m2_t vw6 = __riscv_vwmulsu_vx_i16m2 (v6, 33, vl);
+      vint16m2_t vw7 = __riscv_vwmulsu_vx_i16m2 (v7, 33, vl);
+      vint16m2_t vw8 = __riscv_vwmulsu_vx_i16m2 (v8, 33, vl);
+      vint16m2_t vw9 = __riscv_vwmulsu_vx_i16m2 (v9, 33, vl);
+      vint16m2_t vw10 = __riscv_vwmulsu_vx_i16m2 (v10, 33, vl);
+      vint16m2_t vw11 = __riscv_vwmulsu_vx_i16m2 (v11, 33, vl);
+      vint16m2_t vw12 = __riscv_vwmulsu_vx_i16m2 (v12, 33, vl);
+      vint16m2_t vw13 = __riscv_vwmulsu_vx_i16m2 (v13, 33, vl);
+      vint16m2_t vw14 = __riscv_vwmulsu_vx_i16m2 (v14, 33, vl);
+      vint16m2_t vw15 = __riscv_vwmulsu_vx_i16m2 (v15, 33, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vmv_x_s_i16m2_i16 (vw0);
+      size_t sum1 = __riscv_vmv_x_s_i16m2_i16 (vw1);
+      size_t sum2 = __riscv_vmv_x_s_i16m2_i16 (vw2);
+      size_t sum3 = __riscv_vmv_x_s_i16m2_i16 (vw3);
+      size_t sum4 = __riscv_vmv_x_s_i16m2_i16 (vw4);
+      size_t sum5 = __riscv_vmv_x_s_i16m2_i16 (vw5);
+      size_t sum6 = __riscv_vmv_x_s_i16m2_i16 (vw6);
+      size_t sum7 = __riscv_vmv_x_s_i16m2_i16 (vw7);
+      size_t sum8 = __riscv_vmv_x_s_i16m2_i16 (vw8);
+      size_t sum9 = __riscv_vmv_x_s_i16m2_i16 (vw9);
+      size_t sum10 = __riscv_vmv_x_s_i16m2_i16 (vw10);
+      size_t sum11 = __riscv_vmv_x_s_i16m2_i16 (vw11);
+      size_t sum12 = __riscv_vmv_x_s_i16m2_i16 (vw12);
+      size_t sum13 = __riscv_vmv_x_s_i16m2_i16 (vw13);
+      size_t sum14 = __riscv_vmv_x_s_i16m2_i16 (vw14);
+      size_t sum15 = __riscv_vmv_x_s_i16m2_i16 (vw15);
+
+      sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7, sum8,
+		       sum9, sum10, sum11, sum12, sum13, sum14, sum15);
+    }
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-23.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-23.c
new file mode 100644
index 00000000000..ee0b928e9df
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-23.c
@@ -0,0 +1,119 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3, size_t sum4,
+	  size_t sum5, size_t sum6, size_t sum7)
+{
+  return sum0 + sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7;
+}
+
+size_t
+foo (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vint8m2_t v0 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v1 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v2 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v3 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v4 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v5 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v6 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v7 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+
+      asm volatile("nop" ::: "memory");
+      vint16m4_t vw0 = __riscv_vwadd_vx_i16m4 (v0, 55, vl);
+      vint16m4_t vw1 = __riscv_vwadd_vx_i16m4 (v1, 55, vl);
+      vint16m4_t vw2 = __riscv_vwadd_vx_i16m4 (v2, 55, vl);
+      vint16m4_t vw3 = __riscv_vwadd_vx_i16m4 (v3, 55, vl);
+      vint16m4_t vw4 = __riscv_vwadd_vx_i16m4 (v4, 55, vl);
+      vint16m4_t vw5 = __riscv_vwadd_vx_i16m4 (v5, 55, vl);
+      vint16m4_t vw6 = __riscv_vwadd_vx_i16m4 (v6, 55, vl);
+      vint16m4_t vw7 = __riscv_vwadd_vx_i16m4 (v7, 55, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vmv_x_s_i16m4_i16 (vw0);
+      size_t sum1 = __riscv_vmv_x_s_i16m4_i16 (vw1);
+      size_t sum2 = __riscv_vmv_x_s_i16m4_i16 (vw2);
+      size_t sum3 = __riscv_vmv_x_s_i16m4_i16 (vw3);
+      size_t sum4 = __riscv_vmv_x_s_i16m4_i16 (vw4);
+      size_t sum5 = __riscv_vmv_x_s_i16m4_i16 (vw5);
+      size_t sum6 = __riscv_vmv_x_s_i16m4_i16 (vw6);
+      size_t sum7 = __riscv_vmv_x_s_i16m4_i16 (vw7);
+
+      sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7);
+    }
+  return sum;
+}
+
+size_t
+foo2 (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vint8m2_t v0 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v1 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v2 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v3 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v4 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v5 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v6 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+      vint8m2_t v7 = __riscv_vle8_v_i8m2 ((void *) it, vl);
+      it += vl;
+
+      asm volatile("nop" ::: "memory");
+      vint16m4_t vw0 = __riscv_vwmulsu_vx_i16m4 (v0, 55, vl);
+      vint16m4_t vw1 = __riscv_vwmulsu_vx_i16m4 (v1, 55, vl);
+      vint16m4_t vw2 = __riscv_vwmulsu_vx_i16m4 (v2, 55, vl);
+      vint16m4_t vw3 = __riscv_vwmulsu_vx_i16m4 (v3, 55, vl);
+      vint16m4_t vw4 = __riscv_vwmulsu_vx_i16m4 (v4, 55, vl);
+      vint16m4_t vw5 = __riscv_vwmulsu_vx_i16m4 (v5, 55, vl);
+      vint16m4_t vw6 = __riscv_vwmulsu_vx_i16m4 (v6, 55, vl);
+      vint16m4_t vw7 = __riscv_vwmulsu_vx_i16m4 (v7, 55, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vmv_x_s_i16m4_i16 (vw0);
+      size_t sum1 = __riscv_vmv_x_s_i16m4_i16 (vw1);
+      size_t sum2 = __riscv_vmv_x_s_i16m4_i16 (vw2);
+      size_t sum3 = __riscv_vmv_x_s_i16m4_i16 (vw3);
+      size_t sum4 = __riscv_vmv_x_s_i16m4_i16 (vw4);
+      size_t sum5 = __riscv_vmv_x_s_i16m4_i16 (vw5);
+      size_t sum6 = __riscv_vmv_x_s_i16m4_i16 (vw6);
+      size_t sum7 = __riscv_vmv_x_s_i16m4_i16 (vw7);
+
+      sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7);
+    }
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-24.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-24.c
new file mode 100644
index 00000000000..603e2941cd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-24.c
@@ -0,0 +1,86 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3)
+{
+  return sum0 + sum1 + sum2 + sum3;
+}
+
+size_t
+foo (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vint8m4_t v0 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+      vint8m4_t v1 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+      vint8m4_t v2 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+      vint8m4_t v3 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+
+      asm volatile("nop" ::: "memory");
+      vint16m8_t vw0 = __riscv_vwadd_vx_i16m8 (v0, 66, vl);
+      vint16m8_t vw1 = __riscv_vwadd_vx_i16m8 (v1, 66, vl);
+      vint16m8_t vw2 = __riscv_vwadd_vx_i16m8 (v2, 66, vl);
+      vint16m8_t vw3 = __riscv_vwadd_vx_i16m8 (v3, 66, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vmv_x_s_i16m8_i16 (vw0);
+      size_t sum1 = __riscv_vmv_x_s_i16m8_i16 (vw1);
+      size_t sum2 = __riscv_vmv_x_s_i16m8_i16 (vw2);
+      size_t sum3 = __riscv_vmv_x_s_i16m8_i16 (vw3);
+
+      sum += sumation (sum0, sum1, sum2, sum3);
+    }
+  return sum;
+}
+
+size_t
+foo2 (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vint8m4_t v0 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+      vint8m4_t v1 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+      vint8m4_t v2 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+      vint8m4_t v3 = __riscv_vle8_v_i8m4 ((void *) it, vl);
+      it += vl;
+
+      asm volatile("nop" ::: "memory");
+      vint16m8_t vw0 = __riscv_vwmulsu_vx_i16m8 (v0, 66, vl);
+      vint16m8_t vw1 = __riscv_vwmulsu_vx_i16m8 (v1, 66, vl);
+      vint16m8_t vw2 = __riscv_vwmulsu_vx_i16m8 (v2, 66, vl);
+      vint16m8_t vw3 = __riscv_vwmulsu_vx_i16m8 (v3, 66, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vmv_x_s_i16m8_i16 (vw0);
+      size_t sum1 = __riscv_vmv_x_s_i16m8_i16 (vw1);
+      size_t sum2 = __riscv_vmv_x_s_i16m8_i16 (vw2);
+      size_t sum3 = __riscv_vmv_x_s_i16m8_i16 (vw3);
+
+      sum += sumation (sum0, sum1, sum2, sum3);
+    }
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-25.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-25.c
new file mode 100644
index 00000000000..0b52b9f24eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-25.c
@@ -0,0 +1,104 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3, size_t sum4,
+	  size_t sum5, size_t sum6, size_t sum7, size_t sum8, size_t sum9,
+	  size_t sum10, size_t sum11, size_t sum12, size_t sum13, size_t sum14,
+	  size_t sum15)
+{
+  return sum0 + sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7 + sum8 + sum9
+	 + sum10 + sum11 + sum12 + sum13 + sum14 + sum15;
+}
+
+size_t
+foo (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vfloat32m1_t v0 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v1 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v2 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v3 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v4 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v5 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v6 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v7 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v8 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v9 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v10 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v11 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v12 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v13 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v14 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      vfloat32m1_t v15 = __riscv_vle32_v_f32m1 ((void *) it, vl);
+      it += vl;
+      
+      asm volatile("nop" ::: "memory");
+      vfloat64m2_t vw0 = __riscv_vfwadd_vf_f64m2 (v0, 33, vl);
+      vfloat64m2_t vw1 = __riscv_vfwadd_vf_f64m2 (v1, 33, vl);
+      vfloat64m2_t vw2 = __riscv_vfwadd_vf_f64m2 (v2, 33, vl);
+      vfloat64m2_t vw3 = __riscv_vfwadd_vf_f64m2 (v3, 33, vl);
+      vfloat64m2_t vw4 = __riscv_vfwadd_vf_f64m2 (v4, 33, vl);
+      vfloat64m2_t vw5 = __riscv_vfwadd_vf_f64m2 (v5, 33, vl);
+      vfloat64m2_t vw6 = __riscv_vfwadd_vf_f64m2 (v6, 33, vl);
+      vfloat64m2_t vw7 = __riscv_vfwadd_vf_f64m2 (v7, 33, vl);
+      vfloat64m2_t vw8 = __riscv_vfwadd_vf_f64m2 (v8, 33, vl);
+      vfloat64m2_t vw9 = __riscv_vfwadd_vf_f64m2 (v9, 33, vl);
+      vfloat64m2_t vw10 = __riscv_vfwadd_vf_f64m2 (v10, 33, vl);
+      vfloat64m2_t vw11 = __riscv_vfwadd_vf_f64m2 (v11, 33, vl);
+      vfloat64m2_t vw12 = __riscv_vfwadd_vf_f64m2 (v12, 33, vl);
+      vfloat64m2_t vw13 = __riscv_vfwadd_vf_f64m2 (v13, 33, vl);
+      vfloat64m2_t vw14 = __riscv_vfwadd_vf_f64m2 (v14, 33, vl);
+      vfloat64m2_t vw15 = __riscv_vfwadd_vf_f64m2 (v15, 33, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vfmv_f_s_f64m2_f64 (vw0);
+      size_t sum1 = __riscv_vfmv_f_s_f64m2_f64 (vw1);
+      size_t sum2 = __riscv_vfmv_f_s_f64m2_f64 (vw2);
+      size_t sum3 = __riscv_vfmv_f_s_f64m2_f64 (vw3);
+      size_t sum4 = __riscv_vfmv_f_s_f64m2_f64 (vw4);
+      size_t sum5 = __riscv_vfmv_f_s_f64m2_f64 (vw5);
+      size_t sum6 = __riscv_vfmv_f_s_f64m2_f64 (vw6);
+      size_t sum7 = __riscv_vfmv_f_s_f64m2_f64 (vw7);
+      size_t sum8 = __riscv_vfmv_f_s_f64m2_f64 (vw8);
+      size_t sum9 = __riscv_vfmv_f_s_f64m2_f64 (vw9);
+      size_t sum10 = __riscv_vfmv_f_s_f64m2_f64 (vw10);
+      size_t sum11 = __riscv_vfmv_f_s_f64m2_f64 (vw11);
+      size_t sum12 = __riscv_vfmv_f_s_f64m2_f64 (vw12);
+      size_t sum13 = __riscv_vfmv_f_s_f64m2_f64 (vw13);
+      size_t sum14 = __riscv_vfmv_f_s_f64m2_f64 (vw14);
+      size_t sum15 = __riscv_vfmv_f_s_f64m2_f64 (vw15);
+
+      sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7, sum8,
+		       sum9, sum10, sum11, sum12, sum13, sum14, sum15);
+    }
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-26.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-26.c
new file mode 100644
index 00000000000..d21a73765ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-26.c
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3, size_t sum4,
+	  size_t sum5, size_t sum6, size_t sum7)
+{
+  return sum0 + sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7;
+}
+
+size_t
+foo (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vfloat32m2_t v0 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v1 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v2 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v3 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v4 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v5 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v6 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      vfloat32m2_t v7 = __riscv_vle32_v_f32m2 ((void *) it, vl);
+      it += vl;
+      
+      asm volatile("nop" ::: "memory");
+      vfloat64m4_t vw0 = __riscv_vfwadd_vf_f64m4 (v0, 33, vl);
+      vfloat64m4_t vw1 = __riscv_vfwadd_vf_f64m4 (v1, 33, vl);
+      vfloat64m4_t vw2 = __riscv_vfwadd_vf_f64m4 (v2, 33, vl);
+      vfloat64m4_t vw3 = __riscv_vfwadd_vf_f64m4 (v3, 33, vl);
+      vfloat64m4_t vw4 = __riscv_vfwadd_vf_f64m4 (v4, 33, vl);
+      vfloat64m4_t vw5 = __riscv_vfwadd_vf_f64m4 (v5, 33, vl);
+      vfloat64m4_t vw6 = __riscv_vfwadd_vf_f64m4 (v6, 33, vl);
+      vfloat64m4_t vw7 = __riscv_vfwadd_vf_f64m4 (v7, 33, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vfmv_f_s_f64m4_f64 (vw0);
+      size_t sum1 = __riscv_vfmv_f_s_f64m4_f64 (vw1);
+      size_t sum2 = __riscv_vfmv_f_s_f64m4_f64 (vw2);
+      size_t sum3 = __riscv_vfmv_f_s_f64m4_f64 (vw3);
+      size_t sum4 = __riscv_vfmv_f_s_f64m4_f64 (vw4);
+      size_t sum5 = __riscv_vfmv_f_s_f64m4_f64 (vw5);
+      size_t sum6 = __riscv_vfmv_f_s_f64m4_f64 (vw6);
+      size_t sum7 = __riscv_vfmv_f_s_f64m4_f64 (vw7);
+
+      sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7);
+    }
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-27.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-27.c
new file mode 100644
index 00000000000..2423f7b33ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-27.c
@@ -0,0 +1,51 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+size_t __attribute__ ((noinline))
+sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3)
+{
+  return sum0 + sum1 + sum2 + sum3;
+}
+
+size_t
+foo (char const *buf, size_t len)
+{
+  size_t sum = 0;
+  size_t vl = __riscv_vsetvlmax_e8m8 ();
+  size_t step = vl * 4;
+  const char *it = buf, *end = buf + len;
+  for (; it + step <= end;)
+    {
+      vfloat32m4_t v0 = __riscv_vle32_v_f32m4 ((void *) it, vl);
+      it += vl;
+      vfloat32m4_t v1 = __riscv_vle32_v_f32m4 ((void *) it, vl);
+      it += vl;
+      vfloat32m4_t v2 = __riscv_vle32_v_f32m4 ((void *) it, vl);
+      it += vl;
+      vfloat32m4_t v3 = __riscv_vle32_v_f32m4 ((void *) it, vl);
+      it += vl;
+      
+      asm volatile("nop" ::: "memory");
+      vfloat64m8_t vw0 = __riscv_vfwadd_vf_f64m8 (v0, 33, vl);
+      vfloat64m8_t vw1 = __riscv_vfwadd_vf_f64m8 (v1, 33, vl);
+      vfloat64m8_t vw2 = __riscv_vfwadd_vf_f64m8 (v2, 33, vl);
+      vfloat64m8_t vw3 = __riscv_vfwadd_vf_f64m8 (v3, 33, vl);
+
+      asm volatile("nop" ::: "memory");
+      size_t sum0 = __riscv_vfmv_f_s_f64m8_f64 (vw0);
+      size_t sum1 = __riscv_vfmv_f_s_f64m8_f64 (vw1);
+      size_t sum2 = __riscv_vfmv_f_s_f64m8_f64 (vw2);
+      size_t sum3 = __riscv_vfmv_f_s_f64m8_f64 (vw3);
+
+      sum += sumation (sum0, sum1, sum2, sum3);
+    }
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {vmv1r} } } */
+/* { dg-final { scan-assembler-not {vmv2r} } } */
+/* { dg-final { scan-assembler-not {vmv4r} } } */
+/* { dg-final { scan-assembler-not {vmv8r} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */