RISC-V: Add duplicate vector support.

Message ID 20221125160639.43024-1-juzhe.zhong@rivai.ai
State Deferred, archived
Series RISC-V: Add duplicate vector support.

Commit Message

钟居哲 Nov. 25, 2022, 4:06 p.m. UTC
  From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

        * config/riscv/constraints.md (Wdm): New constraint.
        * config/riscv/predicates.md (direct_broadcast_operand): New predicate.
        * config/riscv/riscv-protos.h (RVV_VLMAX): New macro.
        (emit_pred_op): Refine function.
        * config/riscv/riscv-selftests.cc (run_const_vector_selftests): New function.
        (run_broadcast_selftests): Ditto.
        (BROADCAST_TEST): New tests.
        (riscv_run_selftests): More tests. 
        * config/riscv/riscv-v.cc (emit_pred_move): Refine function.
        (emit_vlmax_vsetvl): Ditto.
        (emit_pred_op): Ditto.
        (expand_const_vector): New function.
        (legitimize_move): Add constant vector support.
        * config/riscv/riscv.cc (riscv_print_operand): New asm print rule for const vector.
        * config/riscv/riscv.h (X0_REGNUM): New macro.
        * config/riscv/vector-iterators.md: New attribute.
        * config/riscv/vector.md (vec_duplicate<mode>): New pattern.
        (@pred_broadcast<mode>): New pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/dup-1.c: New test.
        * gcc.target/riscv/rvv/base/dup-2.c: New test.

---
 gcc/config/riscv/constraints.md               |   5 +
 gcc/config/riscv/predicates.md                |   5 +
 gcc/config/riscv/riscv-protos.h               |   2 +
 gcc/config/riscv/riscv-selftests.cc           | 127 +++++
 gcc/config/riscv/riscv-v.cc                   |  86 ++-
 gcc/config/riscv/riscv.cc                     |  13 +
 gcc/config/riscv/riscv.h                      |   3 +
 gcc/config/riscv/vector-iterators.md          |   9 +
 gcc/config/riscv/vector.md                    |  53 +-
 .../gcc.target/riscv/rvv/base/dup-1.c         | 521 ++++++++++++++++++
 .../gcc.target/riscv/rvv/base/dup-2.c         |  75 +++
 11 files changed, 881 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c
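
The constant-duplicate handling in this patch comes down to one decision: a splat whose element fits the vmv.v.i immediate (-16 ~ 15, or floating-point 0.0) is moved directly, and anything else is forced into a scalar register and broadcast with vmv.v.x or vfmv.v.f. The sketch below models only that decision in standalone C++. The helper names are hypothetical and are not GCC functions, and treating -0.0 as "not zero" is an assumption about how the Wc0 constraint behaves, not something this patch states.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of GCC): choose the instruction used to
// splat an integer constant. vmv.v.i carries a 5-bit signed immediate,
// so only values in -16 ~ 15 can be encoded directly.
std::string select_int_dup_insn(std::int64_t val) {
  const bool fits_simm5 = val >= -16 && val <= 15;
  return fits_simm5 ? "vmv.v.i" : "vmv.v.x";
}

// Hypothetical helper: the same choice for a floating-point constant.
// Assumption: only +0.0 counts as the all-zero vector; any other value,
// including -0.0, goes through a scalar FP register and vfmv.v.f.
std::string select_fp_dup_insn(double val) {
  const bool is_pos_zero = val == 0.0 && !std::signbit(val);
  return is_pos_zero ? "vmv.v.i" : "vfmv.v.f";
}

// Replays the integer worklist and the FP constant used by
// run_const_vector_selftests in the patch.
void check_selftest_worklist() {
  const std::int64_t worklist[] = {-111, -17, -16, 7, 15, 16, 111};
  for (std::int64_t val : worklist) {
    const bool small = (val == -16 || val == 7 || val == 15);
    assert(select_int_dup_insn(val) == (small ? "vmv.v.i" : "vmv.v.x"));
  }
  assert(select_fp_dup_insn(0.2928932) == "vfmv.v.f");
  assert(select_fp_dup_insn(0.0) == "vmv.v.i");
}
```

Calling check_selftest_worklist () exercises the same boundary values (-17/-16 and 15/16) that the new selftest checks at the RTL level.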
  

Comments

Jeff Law Nov. 28, 2022, 4:49 p.m. UTC | #1
I think this should wait for the next stage1 cycle.

jeff
  
钟居哲 Nov. 28, 2022, 10:54 p.m. UTC | #2
OK.

Kito Cheng Dec. 1, 2022, 4:04 p.m. UTC | #3
LGTM. As we discussed in another patch [1], I support continuing to merge
RVV-related work for the moment; we agreed that this is not ideal but
acceptable, so I have committed this to trunk :)

[1] https://patchwork.ozlabs.org/project/gcc/patch/20221128141406.242953-1-juzhe.zhong@rivai.ai/

  

Patch

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 4088c48150a..51cffb2bcb6 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -151,3 +151,8 @@ 
  A constraint that matches a vector of immediate all ones."
  (and (match_code "const_vector")
       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
+
+(define_constraint "Wdm"
+  "Vector duplicate memory operand"
+  (and (match_operand 0 "memory_operand")
+       (match_code "reg" "0")))
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index dfd98761b8b..5a5a49bf7c0 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -286,6 +286,11 @@ 
 	    (match_test "GET_CODE (op) == UNSPEC
 			 && (XINT (op, 1) == UNSPEC_VUNDEF)"))))
 
+;; The scalar operand can be directly broadcast by RVV instructions.
+(define_predicate "direct_broadcast_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_test "satisfies_constraint_Wdm (op)")))
+
 ;; A CONST_INT operand that has exactly two bits cleared.
 (define_predicate "const_nottwobits_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 2ec3af05aa4..27692ffb210 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -119,6 +119,7 @@  extern void riscv_run_selftests (void);
 #endif
 
 namespace riscv_vector {
+#define RVV_VLMAX gen_rtx_REG (Pmode, X0_REGNUM)
 /* Routines implemented in riscv-vector-builtins.cc.  */
 extern void init_builtins (void);
 extern const char *mangle_builtin_type (const_tree);
@@ -130,6 +131,7 @@  extern tree builtin_decl (unsigned, bool);
 extern rtx expand_builtin (unsigned int, tree, rtx);
 extern bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 extern bool legitimize_move (rtx, rtx, machine_mode);
+extern void emit_pred_op (unsigned, rtx, rtx, machine_mode);
 enum tail_policy
 {
   TAIL_UNDISTURBED = 0,
diff --git a/gcc/config/riscv/riscv-selftests.cc b/gcc/config/riscv/riscv-selftests.cc
index 636874ebc0f..1bf1a648fa1 100644
--- a/gcc/config/riscv/riscv-selftests.cc
+++ b/gcc/config/riscv/riscv-selftests.cc
@@ -33,6 +33,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "expr.h"
 #include "selftest.h"
 #include "selftest-rtl.h"
+#include "insn-attr.h"
+#include "target.h"
+#include "optabs.h"
 
 #if CHECKING_P
 using namespace selftest;
@@ -230,12 +233,136 @@  run_poly_int_selftests (void)
   run_poly_int_selftest ("rv32imafd_zve32x1p0", ABI_ILP32D, POLY_TEST_DIMODE,
 			 worklist);
 }
+
+static void
+run_const_vector_selftests (void)
+{
+  /* We don't need to repeat the tests for every march/mabi combination.
+     Just pick a march/mabi pair that fully supports all RVV modes.  */
+  riscv_selftest_arch_abi_setter rv ("rv64imafdcv", ABI_LP64D);
+  rtl_dump_test t (SELFTEST_LOCATION, locate_file ("riscv/empty-func.rtl"));
+  set_new_first_and_last_insn (NULL, NULL);
+
+  machine_mode mode;
+  std::vector<HOST_WIDE_INT> worklist = {-111, -17, -16, 7, 15, 16, 111};
+
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_INT)
+    {
+      if (riscv_v_ext_vector_mode_p (mode))
+	{
+	  for (const HOST_WIDE_INT &val : worklist)
+	    {
+	      start_sequence ();
+	      rtx dest = gen_reg_rtx (mode);
+	      rtx dup = gen_const_vec_duplicate (mode, GEN_INT (val));
+	      emit_move_insn (dest, dup);
+	      rtx_insn *insn = get_last_insn ();
+	      rtx src = XEXP (SET_SRC (PATTERN (insn)), 1);
+	      /* 1. Should be vmv.v.i for values in the range -16 ~ 15.
+		 2. Should be vmv.v.x for values outside -16 ~ 15.  */
+	      if (IN_RANGE (val, -16, 15))
+		ASSERT_TRUE (rtx_equal_p (src, dup));
+	      else
+		ASSERT_TRUE (
+		  rtx_equal_p (src,
+			       gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0))));
+	      end_sequence ();
+	    }
+	}
+    }
+
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_FLOAT)
+    {
+      if (riscv_v_ext_vector_mode_p (mode))
+	{
+	  scalar_mode inner_mode = GET_MODE_INNER (mode);
+	  REAL_VALUE_TYPE f = REAL_VALUE_ATOF ("0.2928932", inner_mode);
+	  rtx ele = const_double_from_real_value (f, inner_mode);
+
+	  start_sequence ();
+	  rtx dest = gen_reg_rtx (mode);
+	  rtx dup = gen_const_vec_duplicate (mode, ele);
+	  emit_move_insn (dest, dup);
+	  rtx_insn *insn = get_last_insn ();
+	  rtx src = XEXP (SET_SRC (PATTERN (insn)), 1);
+	  /* Should always be vfmv.v.f.  */
+	  ASSERT_TRUE (
+	    rtx_equal_p (src, gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0))));
+	  end_sequence ();
+	}
+    }
+
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_BOOL)
+    {
+      /* Test vmset.m.  */
+      if (riscv_v_ext_vector_mode_p (mode))
+	{
+	  start_sequence ();
+	  rtx dest = gen_reg_rtx (mode);
+	  emit_move_insn (dest, CONSTM1_RTX (mode));
+	  rtx_insn *insn = get_last_insn ();
+	  rtx src = XEXP (SET_SRC (PATTERN (insn)), 1);
+	  ASSERT_TRUE (rtx_equal_p (src, CONSTM1_RTX (mode)));
+	  end_sequence ();
+	}
+    }
+}
+
+static void
+run_broadcast_selftests (void)
+{
+  /* We don't need to repeat the tests for every march/mabi combination.
+     Just pick a march/mabi pair that fully supports all RVV modes.  */
+  riscv_selftest_arch_abi_setter rv ("rv64imafdcv", ABI_LP64D);
+  rtl_dump_test t (SELFTEST_LOCATION, locate_file ("riscv/empty-func.rtl"));
+  set_new_first_and_last_insn (NULL, NULL);
+
+  machine_mode mode;
+
+#define BROADCAST_TEST(MODE_CLASS)                                             \
+  FOR_EACH_MODE_IN_CLASS (mode, MODE_CLASS)                                    \
+    {                                                                          \
+      if (riscv_v_ext_vector_mode_p (mode))                                    \
+	{                                                                      \
+	  rtx_insn *insn;                                                      \
+	  rtx src;                                                             \
+	  scalar_mode inner_mode = GET_MODE_INNER (mode);                      \
+	  /* Test vlse.v with zero stride.  */                                 \
+	  start_sequence ();                                                   \
+	  rtx addr = gen_reg_rtx (Pmode);                                      \
+	  rtx mem = gen_rtx_MEM (inner_mode, addr);                            \
+	  expand_vector_broadcast (mode, mem);                                 \
+	  insn = get_last_insn ();                                             \
+	  src = XEXP (SET_SRC (PATTERN (insn)), 1);                            \
+	  ASSERT_TRUE (MEM_P (XEXP (src, 0)));                                 \
+	  ASSERT_TRUE (                                                        \
+	    rtx_equal_p (src, gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0))));   \
+	  end_sequence ();                                                     \
+	  /* Test vmv.v.x or vfmv.v.f.  */                                     \
+	  start_sequence ();                                                   \
+	  rtx reg = gen_reg_rtx (inner_mode);                                  \
+	  expand_vector_broadcast (mode, reg);                                 \
+	  insn = get_last_insn ();                                             \
+	  src = XEXP (SET_SRC (PATTERN (insn)), 1);                            \
+	  ASSERT_TRUE (REG_P (XEXP (src, 0)));                                 \
+	  ASSERT_TRUE (                                                        \
+	    rtx_equal_p (src, gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0))));   \
+	  end_sequence ();                                                     \
+	}                                                                      \
+    }
+
+  BROADCAST_TEST (MODE_VECTOR_INT)
+  BROADCAST_TEST (MODE_VECTOR_FLOAT)
+}
+
 namespace selftest {
 /* Run all target-specific selftests.  */
 void
 riscv_run_selftests (void)
 {
   run_poly_int_selftests ();
+  run_const_vector_selftests ();
+  run_broadcast_selftests ();
 }
 } // namespace selftest
 #endif /* #if CHECKING_P */
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index e0459e3f610..fbd8bbfe254 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -40,6 +40,7 @@ 
 #include "target.h"
 #include "expr.h"
 #include "optabs.h"
+#include "tm-constrs.h"
 
 using namespace riscv_vector;
 
@@ -104,34 +105,80 @@  const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
 	  && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
-/* Emit an RVV unmask && vl mov from SRC to DEST.  */
-static void
-emit_pred_move (rtx dest, rtx src, machine_mode mask_mode)
+static rtx
+emit_vlmax_vsetvl (machine_mode vmode)
 {
-  insn_expander<7> e;
-  machine_mode mode = GET_MODE (dest);
   rtx vl = gen_reg_rtx (Pmode);
-  unsigned int sew = GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
+  unsigned int sew = GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL
 		       ? 8
-		       : GET_MODE_BITSIZE (GET_MODE_INNER (mode));
+		       : GET_MODE_BITSIZE (GET_MODE_INNER (vmode));
 
-  emit_insn (gen_vsetvl_no_side_effects (
-    Pmode, vl, gen_rtx_REG (Pmode, 0), gen_int_mode (sew, Pmode),
-    gen_int_mode ((unsigned int) mode, Pmode), const1_rtx, const1_rtx));
+  emit_insn (
+    gen_vsetvl_no_side_effects (Pmode, vl, RVV_VLMAX, gen_int_mode (sew, Pmode),
+				gen_int_mode ((unsigned int) vmode, Pmode),
+				const1_rtx, const1_rtx));
+  return vl;
+}
+
+/* Emit the unmasked, VLMAX-length RVV operation ICODE from SRC to DEST.  */
+void
+emit_pred_op (unsigned icode, rtx dest, rtx src, machine_mode mask_mode)
+{
+  insn_expander<7> e;
+  machine_mode mode = GET_MODE (dest);
 
   e.add_output_operand (dest, mode);
   e.add_all_one_mask_operand (mask_mode);
   e.add_vundef_operand (mode);
 
-  e.add_input_operand (src, mode);
+  e.add_input_operand (src, GET_MODE (src));
 
-  e.add_input_operand (vl, Pmode);
+  rtx vlmax = emit_vlmax_vsetvl (mode);
+  e.add_input_operand (vlmax, Pmode);
 
   e.add_policy_operand (TAIL_AGNOSTIC, MASK_AGNOSTIC);
 
-  enum insn_code icode;
-  icode = code_for_pred_mov (mode);
-  e.expand (icode, true);
+  e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src));
+}
+
+static void
+expand_const_vector (rtx target, rtx src, machine_mode mask_mode)
+{
+  machine_mode mode = GET_MODE (target);
+  scalar_mode elt_mode = GET_MODE_INNER (mode);
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
+    {
+      rtx elt;
+      gcc_assert (
+	const_vec_duplicate_p (src, &elt)
+	&& (rtx_equal_p (elt, const0_rtx) || rtx_equal_p (elt, const1_rtx)));
+      emit_pred_op (code_for_pred_mov (mode), target, src, mode);
+      return;
+    }
+
+  rtx elt;
+  if (const_vec_duplicate_p (src, &elt))
+    {
+      rtx tmp = register_operand (target, mode) ? target : gen_reg_rtx (mode);
+      /* For an integer element in the range -16 ~ 15, or a 0.0
+	 floating-point element, use the vmv.v.i instruction.  */
+      if (satisfies_constraint_vi (src) || satisfies_constraint_Wc0 (src))
+	emit_pred_op (code_for_pred_mov (mode), tmp, src, mask_mode);
+      else
+	emit_pred_op (code_for_pred_broadcast (mode), tmp,
+		      force_reg (elt_mode, elt), mask_mode);
+
+      if (tmp != target)
+	emit_move_insn (target, tmp);
+      return;
+    }
+
+  /* TODO: We only support const duplicate vector for now. More cases
+     will be supported when we support auto-vectorization:
+
+       1. series vector.
+       2. multiple elts duplicate vector.
+       3. multiple patterns with multiple elts.  */
 }
 
 /* Expand a pre-RA RVV data move from SRC to DEST.
@@ -140,6 +187,11 @@  bool
 legitimize_move (rtx dest, rtx src, machine_mode mask_mode)
 {
   machine_mode mode = GET_MODE (dest);
+  if (CONST_VECTOR_P (src))
+    {
+      expand_const_vector (dest, src, mask_mode);
+      return true;
+    }
   if (known_ge (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)
       && GET_MODE_CLASS (mode) != MODE_VECTOR_BOOL)
     {
@@ -153,12 +205,12 @@  legitimize_move (rtx dest, rtx src, machine_mode mask_mode)
     {
       rtx tmp = gen_reg_rtx (mode);
       if (MEM_P (src))
-	emit_pred_move (tmp, src, mask_mode);
+	emit_pred_op (code_for_pred_mov (mode), tmp, src, mask_mode);
       else
 	emit_move_insn (tmp, src);
       src = tmp;
     }
-  emit_pred_move (dest, src, mask_mode);
+  emit_pred_op (code_for_pred_mov (mode), dest, src, mask_mode);
   return true;
 }
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7bfc0e9f595..0267494ae5a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4205,6 +4205,19 @@  riscv_print_operand (FILE *file, rtx op, int letter)
 
   switch (letter)
     {
+      case 'v': {
+	rtx elt;
+
+	if (!const_vec_duplicate_p (op, &elt))
+	  output_operand_lossage ("invalid vector constant");
+	else if (satisfies_constraint_Wc0 (op))
+	  asm_fprintf (file, "0");
+	else if (satisfies_constraint_vi (op))
+	  asm_fprintf (file, "%wd", INTVAL (elt));
+	else
+	  output_operand_lossage ("invalid vector constant");
+	break;
+      }
       case 'm': {
 	if (riscv_v_ext_vector_mode_p (mode))
 	  {
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index b05c3c1545c..defb475f948 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -651,6 +651,9 @@  enum reg_class
 #define FP_ARG_FIRST (FP_REG_FIRST + 10)
 #define FP_ARG_LAST  (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1)
 
+/* Helper macro for RVV vsetvl instruction generation.  */
+#define X0_REGNUM GP_REG_FIRST
+
 #define CALLEE_SAVED_REG_NUMBER(REGNO)			\
   ((REGNO) >= 8 && (REGNO) <= 9 ? (REGNO) - 8 :		\
    (REGNO) >= 18 && (REGNO) <= 27 ? (REGNO) - 16 : -1)
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 9d4a9dc8a0e..92c4bd0a6a3 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -71,6 +71,15 @@ 
   (VNx1DF "VNx1BI") (VNx2DF "VNx2BI") (VNx4DF "VNx4BI") (VNx8DF "VNx8BI")
 ])
 
+(define_mode_attr VEL [
+  (VNx1QI "QI") (VNx2QI "QI") (VNx4QI "QI") (VNx8QI "QI") (VNx16QI "QI") (VNx32QI "QI") (VNx64QI "QI")
+  (VNx1HI "HI") (VNx2HI "HI") (VNx4HI "HI") (VNx8HI "HI") (VNx16HI "HI") (VNx32HI "HI")
+  (VNx1SI "SI") (VNx2SI "SI") (VNx4SI "SI") (VNx8SI "SI") (VNx16SI "SI")
+  (VNx1DI "DI") (VNx2DI "DI") (VNx4DI "DI") (VNx8DI "DI")
+  (VNx1SF "SF") (VNx2SF "SF") (VNx4SF "SF") (VNx8SF "SF") (VNx16SF "SF")
+  (VNx1DF "DF") (VNx2DF "DF") (VNx4DF "DF") (VNx8DF "DF")
+])
+
 (define_mode_attr sew [
   (VNx1QI "8") (VNx2QI "8") (VNx4QI "8") (VNx8QI "8") (VNx16QI "8") (VNx32QI "8") (VNx64QI "8")
   (VNx1HI "16") (VNx2HI "16") (VNx4HI "16") (VNx8HI "16") (VNx16HI "16") (VNx32HI "16")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 01418ac5fcf..0dace449316 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -151,6 +151,26 @@ 
   [(set_attr "type" "vmov")
    (set_attr "mode" "<MODE>")])
 
+;; -----------------------------------------------------------------
+;; ---- Duplicate Operations
+;; -----------------------------------------------------------------
+
+;; According to the GCC internals documentation:
+;; This pattern only handles duplicates of non-constant inputs.
+;; Constant vectors go through the movm pattern instead.
+;; So "direct_broadcast_operand" only needs to accept mem or reg, never a constant.
+(define_expand "vec_duplicate<mode>"
+  [(set (match_operand:V 0 "register_operand")
+	(vec_duplicate:V
+	  (match_operand:<VEL> 1 "direct_broadcast_operand")))]
+  "TARGET_VECTOR"
+  {
+    riscv_vector::emit_pred_op (code_for_pred_broadcast (<MODE>mode), operands[0],
+		    		operands[1], <VM>mode);
+    DONE;
+  }
+)
+
 ;; -----------------------------------------------------------------
 ;; ---- 6. Configuration-Setting Instructions
 ;; -----------------------------------------------------------------
@@ -368,7 +388,7 @@ 
    vle<sew>.v\t%0,%3%p1
    vse<sew>.v\t%3,%0%p1
    vmv.v.v\t%0,%3
-   vmv.v.i\t%0,v%3"
+   vmv.v.i\t%0,%v3"
   "&& register_operand (operands[0], <MODE>mode)
    && register_operand (operands[3], <MODE>mode)
    && satisfies_constraint_vu (operands[2])"
@@ -407,3 +427,34 @@ 
   ""
   [(set_attr "type" "vldm,vstm,vimov,vmalu,vmalu")
    (set_attr "mode" "<MODE>")])
+
+;; -------------------------------------------------------------------------------
+;; ---- Predicated Broadcast
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - 7.5. Vector Strided Instructions (zero stride)
+;; - 11.16 Vector Integer Move Instructions (vmv.v.x)
+;; - 13.16 Vector Floating-Point Move Instruction (vfmv.v.f)
+;; -------------------------------------------------------------------------------
+
+(define_insn "@pred_broadcast<mode>"
+  [(set (match_operand:V 0 "register_operand"                 "=vr,  vr,  vr,  vr")
+	(if_then_else:V
+	  (unspec:<VM>
+	    [(match_operand:<VM> 1 "vector_mask_operand"      " Wc1, Wc1, vm, Wc1")
+	     (match_operand 4 "vector_length_operand"         " rK,  rK,  rK,  rK")
+	     (match_operand 5 "const_int_operand"             "  i,   i,   i,   i")
+	     (match_operand 6 "const_int_operand"             "  i,   i,   i,   i")
+	     (reg:SI VL_REGNUM)
+	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	  (vec_duplicate:V
+	    (match_operand:<VEL> 3 "direct_broadcast_operand" "  r,   f, Wdm, Wdm"))
+	  (match_operand:V 2 "vector_merge_operand"           "vu0, vu0, vu0, vu0")))]
+  "TARGET_VECTOR"
+  "@
+   vmv.v.x\t%0,%3
+   vfmv.v.f\t%0,%3
+   vlse<sew>.v\t%0,%3,zero,%1.t
+   vlse<sew>.v\t%0,%3,zero"
+  [(set_attr "type" "vimov,vfmov,vlds,vlds")
+   (set_attr "mode" "<MODE>")])
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c
new file mode 100644
index 00000000000..2a83afae056
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c
@@ -0,0 +1,521 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -O3 -fgimple" } */
+
+#include "riscv_vector.h"
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f1 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8mf8_t> ((vint8mf8_t *)out_2(D)) = _Literal (vint8mf8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f2 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8mf4_t> ((vint8mf4_t *)out_2(D)) = _Literal (vint8mf4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f3 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8mf2_t> ((vint8mf2_t *)out_2(D)) = _Literal (vint8mf2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f4 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8m1_t> ((vint8m1_t *)out_2(D)) = _Literal (vint8m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f5 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8m2_t> ((vint8m2_t *)out_2(D)) = _Literal (vint8m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f6 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8m4_t> ((vint8m4_t *)out_2(D)) = _Literal (vint8m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f7 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint8m8_t> ((vint8m8_t *)out_2(D)) = _Literal (vint8m8_t) 0;
+  return;
+
+}
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f8 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8mf8_t> ((vuint8mf8_t *)out_2(D)) = _Literal (vuint8mf8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f9 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8mf4_t> ((vuint8mf4_t *)out_2(D)) = _Literal (vuint8mf4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f10 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8mf2_t> ((vuint8mf2_t *)out_2(D)) = _Literal (vuint8mf2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f11 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8m1_t> ((vuint8m1_t *)out_2(D)) = _Literal (vuint8m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f12 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8m2_t> ((vuint8m2_t *)out_2(D)) = _Literal (vuint8m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f13 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8m4_t> ((vuint8m4_t *)out_2(D)) = _Literal (vuint8m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f14 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint8m8_t> ((vuint8m8_t *)out_2(D)) = _Literal (vuint8m8_t) 0;
+  return;
+
+}
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f15 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint16mf4_t> ((vint16mf4_t *)out_2(D)) = _Literal (vint16mf4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f16 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint16mf2_t> ((vint16mf2_t *)out_2(D)) = _Literal (vint16mf2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f17 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint16m1_t> ((vint16m1_t *)out_2(D)) = _Literal (vint16m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f18 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint16m2_t> ((vint16m2_t *)out_2(D)) = _Literal (vint16m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f19 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint16m4_t> ((vint16m4_t *)out_2(D)) = _Literal (vint16m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f20 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint16m8_t> ((vint16m8_t *)out_2(D)) = _Literal (vint16m8_t) 0;
+  return;
+
+}
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f21 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint16mf4_t> ((vuint16mf4_t *)out_2(D)) = _Literal (vuint16mf4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f22 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint16mf2_t> ((vuint16mf2_t *)out_2(D)) = _Literal (vuint16mf2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f23 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint16m1_t> ((vuint16m1_t *)out_2(D)) = _Literal (vuint16m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f24 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint16m2_t> ((vuint16m2_t *)out_2(D)) = _Literal (vuint16m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f25 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint16m4_t> ((vuint16m4_t *)out_2(D)) = _Literal (vuint16m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f26 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint16m8_t> ((vuint16m8_t *)out_2(D)) = _Literal (vuint16m8_t) 0;
+  return;
+
+}
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f27 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint32mf2_t> ((vint32mf2_t *)out_2(D)) = _Literal (vint32mf2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f28 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint32m1_t> ((vint32m1_t *)out_2(D)) = _Literal (vint32m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f29 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint32m2_t> ((vint32m2_t *)out_2(D)) = _Literal (vint32m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f30 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint32m4_t> ((vint32m4_t *)out_2(D)) = _Literal (vint32m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f31 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint32m8_t> ((vint32m8_t *)out_2(D)) = _Literal (vint32m8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f32 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint32mf2_t> ((vuint32mf2_t *)out_2(D)) = _Literal (vuint32mf2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f33 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint32m1_t> ((vuint32m1_t *)out_2(D)) = _Literal (vuint32m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f34 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint32m2_t> ((vuint32m2_t *)out_2(D)) = _Literal (vuint32m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f35 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint32m4_t> ((vuint32m4_t *)out_2(D)) = _Literal (vuint32m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f36 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint32m8_t> ((vuint32m8_t *)out_2(D)) = _Literal (vuint32m8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f37 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint64m1_t> ((vint64m1_t *)out_2(D)) = _Literal (vint64m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f38 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint64m2_t> ((vint64m2_t *)out_2(D)) = _Literal (vint64m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f39 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint64m4_t> ((vint64m4_t *)out_2(D)) = _Literal (vint64m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f40 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vint64m8_t> ((vint64m8_t *)out_2(D)) = _Literal (vint64m8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f41 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint64m1_t> ((vuint64m1_t *)out_2(D)) = _Literal (vuint64m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f42 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint64m2_t> ((vuint64m2_t *)out_2(D)) = _Literal (vuint64m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f43 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint64m4_t> ((vuint64m4_t *)out_2(D)) = _Literal (vuint64m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f44 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vuint64m8_t> ((vuint64m8_t *)out_2(D)) = _Literal (vuint64m8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f45 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat32m1_t> ((vfloat32m1_t *)out_2(D)) = _Literal (vfloat32m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f46 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat32m2_t> ((vfloat32m2_t *)out_2(D)) = _Literal (vfloat32m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f47 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat32m4_t> ((vfloat32m4_t *)out_2(D)) = _Literal (vfloat32m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f48 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat32m8_t> ((vfloat32m8_t *)out_2(D)) = _Literal (vfloat32m8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f49 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat64m1_t> ((vfloat64m1_t *)out_2(D)) = _Literal (vfloat64m1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f50 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat64m2_t> ((vfloat64m2_t *)out_2(D)) = _Literal (vfloat64m2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f51 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat64m4_t> ((vfloat64m4_t *)out_2(D)) = _Literal (vfloat64m4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f52 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vfloat64m8_t> ((vfloat64m8_t *)out_2(D)) = _Literal (vfloat64m8_t) 0;
+  return;
+
+}
+
+/* { dg-final { scan-assembler-times {vmv\.v\.i\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*0} 52 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c
new file mode 100644
index 00000000000..c6903039c2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c
@@ -0,0 +1,75 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -O3 -fgimple" } */
+
+#include "riscv_vector.h"
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f1 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool1_t> ((vbool1_t *)out_2(D)) = _Literal (vbool1_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f2 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool2_t> ((vbool2_t *)out_2(D)) = _Literal (vbool2_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f3 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool4_t> ((vbool4_t *)out_2(D)) = _Literal (vbool4_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f4 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool8_t> ((vbool8_t *)out_2(D)) = _Literal (vbool8_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f5 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool16_t> ((vbool16_t *)out_2(D)) = _Literal (vbool16_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f6 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool32_t> ((vbool32_t *)out_2(D)) = _Literal (vbool32_t) 0;
+  return;
+
+}
+
+
+void __GIMPLE (ssa,guessed_local(1073741824))
+f7 (void * out)
+{
+  __BB(2,guessed_local(1073741824)):
+  __MEM <vbool64_t> ((vbool64_t *)out_2(D)) = _Literal (vbool64_t) 0;
+  return;
+
+}
+
+/* { dg-final { scan-assembler-times {vmclr\.m\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 7 } } */