[v11,03/12] Update constant creation for BB SLP with predicated tails

Message ID 20260603151924.53706-4-chris.bazley@arm.com
State New
Series Extend BB SLP vectorization to use predicated tails

Commit Message

Christopher Bazley June 3, 2026, 3:19 p.m. UTC
Created a new function, gimple_build_vector_from_elems,
for use when creating vectorized definitions for basic block
vectorization in vect_create_constant_vectors.

The existing gimple_build_vector function cannot be used
for SVE vector types because it relies on the type
associated with the tree_vector_builder having a constant
number of subparts. Even if that limitation were lifted,
tree_vector_builder may encode the elements as repeating patterns,
which is inappropriate here.

The new function takes a vector type and a vec of tree nodes
giving the element values to put into the built vector, instead of an
instance of tree_vector_builder. If the number of values is zero then
a zero constant is built. If all values are constant then a vector
constant is built. Otherwise, a new constructor node is created.

gcc/ChangeLog:

	* gimple-fold.cc (gimple_build_vector_from_elems): Define a
	new function to build a vector from a list of elements that need
	not be complete.
	* gimple-fold.h (gimple_build_vector_from_elems): Declare a new
	function and a simpler overloaded version with fewer parameters.
	* tree-vect-slp.cc (vect_create_constant_vectors):
	Use gimple_build_vector_from_elems instead of
	duplicate_and_interleave to create non-uniform constant
	vectors for BB SLP vectorization.
---
 gcc/gimple-fold.cc   | 55 ++++++++++++++++++++++++++++++++++++++++++++
 gcc/gimple-fold.h    | 14 +++++++++++
 gcc/tree-vect-slp.cc | 40 +++++++++++++++++++++++++-------
 3 files changed, 101 insertions(+), 8 deletions(-)
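
The three cases described in the commit message (no elements, all elements constant, at least one non-constant element) can be modelled outside GCC roughly as follows. This is only a sketch under stated assumptions: Elem, BuiltKind and classify_build are hypothetical names invented for illustration, a constant element is represented by a present std::optional value, and none of this is GCC's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a GCC tree operand: the value is present when
// the element is a compile-time constant and absent when it is only known
// at run time (for example an SSA name).
using Elem = std::optional<int>;

// The three outcomes described in the commit message.
enum class BuiltKind { ZeroConstant, VectorConstant, Constructor };

// Classify what would be built for ELTS, where MIN_SUBPARTS models the
// constant lower bound on the number of subparts of the vector type.
BuiltKind classify_build(std::size_t min_subparts,
                         const std::vector<Elem> &elts)
{
  // The element list may be shorter than the vector but never longer.
  assert(elts.size() <= min_subparts);

  // No values supplied: the result is an all-zero constant.
  if (elts.empty())
    return BuiltKind::ZeroConstant;

  // Any non-constant element forces a constructor.
  for (const Elem &e : elts)
    if (!e.has_value())
      return BuiltKind::Constructor;

  // Every supplied element is constant: a vector constant suffices.
  return BuiltKind::VectorConstant;
}
```

In this model the elements omitted from a short list are not represented at all; the patch's own comment states that they are implicitly zero.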
  

Comments

Richard Biener June 9, 2026, 9:33 a.m. UTC | #1
On Wed, Jun 3, 2026 at 5:20 PM Christopher Bazley <chris.bazley@arm.com> wrote:
>
> Created a new function, gimple_build_vector_from_elems,
> for use when creating vectorized definitions for basic block
> vectorization in vect_create_constant_vectors.
>
> The existing gimple_build_vector function cannot be used
> for SVE vector types because it relies on the type
> associated with the tree_vector_builder having a constant
> number of subparts. Even if that limitation were lifted, the
> possibility of tree_vector_builder patterns being used is
> inappropriate.
>
> The new function takes a vector type and vec of tree nodes
> giving the element values to put into the built vector, instead of an
> instance of tree_vector_builder. If the number of values is zero then
> a zero constant is built. If all values are constant then a vector
> constant is built. Otherwise, a new constructor node is created.
>
> gcc/ChangeLog:
>
>         * gimple-fold.cc (gimple_build_vector_from_elems): Define a
>         new function to build a vector from a list of elements that need
>         not be complete.
>         * gimple-fold.h (gimple_build_vector_from_elems): Declare a new
>         function and a simpler overloaded version with fewer parameters.
>         * tree-vect-slp.cc (vect_create_constant_vectors):
>         Use gimple_build_vector_from_elems instead of
>         duplicate_and_interleave to create non-uniform constant
>         vectors for BB SLP vectorization.
> ---
>  gcc/gimple-fold.cc   | 55 ++++++++++++++++++++++++++++++++++++++++++++
>  gcc/gimple-fold.h    | 14 +++++++++++
>  gcc/tree-vect-slp.cc | 40 +++++++++++++++++++++++++-------
>  3 files changed, 101 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> index 1ceb5aa5fba..3462c5acb6e 100644
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> @@ -11425,6 +11425,61 @@ gimple_build_vector (gimple_stmt_iterator *gsi,
>    return builder->build ();
>  }
>
> +/* Build a vector of type VECTYPE from a partial list of ELTS, handling the case
> +   in which some elements are non-constant.  The list of values may be shorter
> +   than the minimum number of subparts implied by VECTYPE. (When the vector
> +   type is variable-length, the actual number of subparts may not be known.)
> +   Omitted elements are implicitly zero.
> +
> +   Return a gimple value for the result, inserting any new instructions
> +   to GSI honoring BEFORE and UPDATE.  */
> +
> +tree
> +gimple_build_vector_from_elems (gimple_stmt_iterator *gsi, bool before,
> +                               gsi_iterator_update update, location_t loc,
> +                               tree vectype, const vec<tree> &elts)

The name is not distinctive enough to answer why it's used over
gimple_build_vector.
In particular ...

> +{
> +  unsigned int encoded_nelts = elts.length ();
> +  gimple_seq seq = NULL;
> +  gcc_assert (TREE_CODE (vectype) == VECTOR_TYPE);
> +  unsigned int lower_bound
> +    = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vectype));
> +  gcc_assert (encoded_nelts <= lower_bound);
> +
> +  if (encoded_nelts == 0)
> +    return build_zero_cst (vectype);
> +
> +  /* Prepare a vector of constructor elements and find out whether all
> +     of the element values are constant.  */
> +  vec<constructor_elt, va_gc> *v;
> +  vec_alloc (v, encoded_nelts);
> +  bool is_constant = true;
> +
> +  for (unsigned int i = 0; i < encoded_nelts; ++i)
> +    {
> +      if (!CONSTANT_CLASS_P (elts[i]))
> +       is_constant = false;
> +
> +      CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, elts[i]);
> +    }
> +
> +  /* If all element values are constant then we can return a new VECTOR_CST
> +     node.  Any elements for which no value is supplied will be zero.  */
> +  if (is_constant)
> +    return build_vector_from_ctor (vectype, v);

... this case will exactly use a vector builder again and thus likely ICE for
cases that cannot be handled by can_duplicate_and_interleave_p.

Which raises the question - we agreed on how to handle VLA vector
CONSTRUCTORs, but the VLA VECTOR_CST representation does not
have something equivalent here?

As for naming, I'd prefer something like

gimple_build_forced_constant_size_vector ()

or something similar.  Not sure why we cannot use a tree_vector_builder
here; possibly it could even get a special force-constant-size mode
that we can just switch on?

> +
> +  tree res;
> +  if (gimple_in_ssa_p (cfun))
> +    res = make_ssa_name (vectype);
> +  else
> +    res = create_tmp_reg (vectype);
> +  gimple *stmt = gimple_build_assign (res, build_constructor (vectype, v));
> +  gimple_set_location (stmt, loc);
> +  gimple_seq_add_stmt_without_update (&seq, stmt);
> +  gimple_build_insert_seq (gsi, before, update, seq);
> +  return res;
> +}
> +
>  /* Emit gimple statements into &stmts that take a value given in OLD_SIZE
>     and generate a value guaranteed to be rounded upwards to ALIGN.
>
> diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
> index f1853560779..8b324be005a 100644
> --- a/gcc/gimple-fold.h
> +++ b/gcc/gimple-fold.h
> @@ -243,6 +243,20 @@ gimple_build_vector (gimple_seq *seq, tree_vector_builder *builder)
>                               UNKNOWN_LOCATION, builder);
>  }
>
> +extern tree gimple_build_vector_from_elems (gimple_stmt_iterator *, bool,
> +                                           enum gsi_iterator_update,
> +                                           location_t, tree vectype,
> +                                           const vec<tree> &);
> +
> +inline tree
> +gimple_build_vector_from_elems (gimple_seq *seq, tree vectype,
> +                               const vec<tree> &elts)
> +{
> +  gimple_stmt_iterator gsi = gsi_last (*seq);
> +  return gimple_build_vector_from_elems (&gsi, false, GSI_CONTINUE_LINKING,
> +                                        UNKNOWN_LOCATION, vectype, elts);
> +}
> +
>  extern tree gimple_build_round_up (gimple_stmt_iterator *, bool,
>                                    enum gsi_iterator_update,
>                                    location_t, tree, tree,
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 4dd7e6e1e21..f91d3e723ec 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -10727,7 +10727,7 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
>    unsigned j, number_of_places_left_in_vector;
>    tree vector_type;
>    tree vop;
> -  int group_size = op_node->ops.length ();
> +  unsigned int group_size = op_node->ops.length ();
>    unsigned int vec_num, i;
>    unsigned number_of_copies = 1;
>    bool constant_p;
> @@ -10757,10 +10757,23 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
>       (s1, s2, ..., s8).  We will create two vectors {s1, s2, s3, s4} and
>       {s5, s6, s7, s8}.  */
>
> -  /* When using duplicate_and_interleave, we just need one element for
> -     each scalar statement.  */
> -  if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
> -    nunits = group_size;
> +  if (is_a<bb_vec_info> (vinfo))
> +    {
> +      /* We don't use duplicate_and_interleave for basic block vectorization.
> +        We know that either the group size is exactly divisible by the vector
> +        length or it fits within a single vector.  */
> +      nunits = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vector_type));
> +      gcc_checking_assert (multiple_p (group_size, nunits)
> +                          || known_le (group_size, nunits));
> +      nunits = MIN (nunits, group_size);
> +    }
> +  else
> +    {
> +      /* When using duplicate_and_interleave, we just need one element for
> +        each scalar statement.  */
> +      if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
> +       nunits = group_size;
> +    }
>
>    number_of_copies = nunits * number_of_vectors / group_size;
>
> @@ -10860,6 +10873,11 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
>                        ? multiple_p (type_nunits, nunits)
>                        : known_eq (type_nunits, nunits))
>                 vec_cst = gimple_build_vector (&ctor_seq, &elts);
> +             else if (is_a<bb_vec_info> (vinfo))
> +               {
> +                 vec_cst = gimple_build_vector_from_elems (&ctor_seq,
> +                                                           elts.type (), elts);
> +               }
>               else
>                 {
>                   if (permute_results.is_empty ())
> @@ -10925,9 +10943,15 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
>       NUMBER_OF_SCALARS/NUNITS or NUNITS/NUMBER_OF_SCALARS, and hence we have
>       to replicate the vectors.  */
>    while (number_of_vectors > SLP_TREE_VEC_DEFS (op_node).length ())
> -    for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
> -        i++)
> -      SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
> +    {
> +      /* Guard against the outer loop never terminating because the
> +        inner loop is never entered.  */
> +      gcc_checking_assert (vec_num > 0);
> +
> +      for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
> +          i++)
> +       SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
> +    }
>  }
>
>  /* Get the scalar definition of the Nth lane from SLP_NODE or NULL_TREE
> --
> 2.43.0
>
  

Patch

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 1ceb5aa5fba..3462c5acb6e 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -11425,6 +11425,61 @@  gimple_build_vector (gimple_stmt_iterator *gsi,
   return builder->build ();
 }
 
+/* Build a vector of type VECTYPE from a partial list of ELTS, handling the case
+   in which some elements are non-constant.  The list of values may be shorter
+   than the minimum number of subparts implied by VECTYPE. (When the vector
+   type is variable-length, the actual number of subparts may not be known.)
+   Omitted elements are implicitly zero.
+
+   Return a gimple value for the result, inserting any new instructions
+   to GSI honoring BEFORE and UPDATE.  */
+
+tree
+gimple_build_vector_from_elems (gimple_stmt_iterator *gsi, bool before,
+				gsi_iterator_update update, location_t loc,
+				tree vectype, const vec<tree> &elts)
+{
+  unsigned int encoded_nelts = elts.length ();
+  gimple_seq seq = NULL;
+  gcc_assert (TREE_CODE (vectype) == VECTOR_TYPE);
+  unsigned int lower_bound
+    = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vectype));
+  gcc_assert (encoded_nelts <= lower_bound);
+
+  if (encoded_nelts == 0)
+    return build_zero_cst (vectype);
+
+  /* Prepare a vector of constructor elements and find out whether all
+     of the element values are constant.  */
+  vec<constructor_elt, va_gc> *v;
+  vec_alloc (v, encoded_nelts);
+  bool is_constant = true;
+
+  for (unsigned int i = 0; i < encoded_nelts; ++i)
+    {
+      if (!CONSTANT_CLASS_P (elts[i]))
+	is_constant = false;
+
+      CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, elts[i]);
+    }
+
+  /* If all element values are constant then we can return a new VECTOR_CST
+     node.  Any elements for which no value is supplied will be zero.  */
+  if (is_constant)
+    return build_vector_from_ctor (vectype, v);
+
+  tree res;
+  if (gimple_in_ssa_p (cfun))
+    res = make_ssa_name (vectype);
+  else
+    res = create_tmp_reg (vectype);
+  gimple *stmt = gimple_build_assign (res, build_constructor (vectype, v));
+  gimple_set_location (stmt, loc);
+  gimple_seq_add_stmt_without_update (&seq, stmt);
+  gimple_build_insert_seq (gsi, before, update, seq);
+  return res;
+}
+
 /* Emit gimple statements into &stmts that take a value given in OLD_SIZE
    and generate a value guaranteed to be rounded upwards to ALIGN.
 
diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
index f1853560779..8b324be005a 100644
--- a/gcc/gimple-fold.h
+++ b/gcc/gimple-fold.h
@@ -243,6 +243,20 @@  gimple_build_vector (gimple_seq *seq, tree_vector_builder *builder)
 			      UNKNOWN_LOCATION, builder);
 }
 
+extern tree gimple_build_vector_from_elems (gimple_stmt_iterator *, bool,
+					    enum gsi_iterator_update,
+					    location_t, tree vectype,
+					    const vec<tree> &);
+
+inline tree
+gimple_build_vector_from_elems (gimple_seq *seq, tree vectype,
+				const vec<tree> &elts)
+{
+  gimple_stmt_iterator gsi = gsi_last (*seq);
+  return gimple_build_vector_from_elems (&gsi, false, GSI_CONTINUE_LINKING,
+					 UNKNOWN_LOCATION, vectype, elts);
+}
+
 extern tree gimple_build_round_up (gimple_stmt_iterator *, bool,
 				   enum gsi_iterator_update,
 				   location_t, tree, tree,
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 4dd7e6e1e21..f91d3e723ec 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -10727,7 +10727,7 @@  vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
   unsigned j, number_of_places_left_in_vector;
   tree vector_type;
   tree vop;
-  int group_size = op_node->ops.length ();
+  unsigned int group_size = op_node->ops.length ();
   unsigned int vec_num, i;
   unsigned number_of_copies = 1;
   bool constant_p;
@@ -10757,10 +10757,23 @@  vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
      (s1, s2, ..., s8).  We will create two vectors {s1, s2, s3, s4} and
      {s5, s6, s7, s8}.  */
 
-  /* When using duplicate_and_interleave, we just need one element for
-     each scalar statement.  */
-  if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
-    nunits = group_size;
+  if (is_a<bb_vec_info> (vinfo))
+    {
+      /* We don't use duplicate_and_interleave for basic block vectorization.
+	 We know that either the group size is exactly divisible by the vector
+	 length or it fits within a single vector.  */
+      nunits = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vector_type));
+      gcc_checking_assert (multiple_p (group_size, nunits)
+			   || known_le (group_size, nunits));
+      nunits = MIN (nunits, group_size);
+    }
+  else
+    {
+      /* When using duplicate_and_interleave, we just need one element for
+	 each scalar statement.  */
+      if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
+	nunits = group_size;
+    }
 
   number_of_copies = nunits * number_of_vectors / group_size;
 
@@ -10860,6 +10873,11 @@  vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
 		       ? multiple_p (type_nunits, nunits)
 		       : known_eq (type_nunits, nunits))
 		vec_cst = gimple_build_vector (&ctor_seq, &elts);
+	      else if (is_a<bb_vec_info> (vinfo))
+		{
+		  vec_cst = gimple_build_vector_from_elems (&ctor_seq,
+							    elts.type (), elts);
+		}
 	      else
 		{
 		  if (permute_results.is_empty ())
@@ -10925,9 +10943,15 @@  vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
      NUMBER_OF_SCALARS/NUNITS or NUNITS/NUMBER_OF_SCALARS, and hence we have
      to replicate the vectors.  */
   while (number_of_vectors > SLP_TREE_VEC_DEFS (op_node).length ())
-    for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
-	 i++)
-      SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
+    {
+      /* Guard against the outer loop never terminating because the
+	 inner loop is never entered.  */
+      gcc_checking_assert (vec_num > 0);
+
+      for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
+	   i++)
+	SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
+    }
 }
 
 /* Get the scalar definition of the Nth lane from SLP_NODE or NULL_TREE
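
The nunits selection that the tree-vect-slp.cc hunk adds for basic block vectorization, and the existing number_of_copies computation that follows it, can be checked with plain integers. The sketch below is an assumption-laden model, not GCC code: bb_slp_nunits and number_of_copies are hypothetical helper names, and min_subparts stands in for constant_lower_bound (TYPE_VECTOR_SUBPARTS (vector_type)).

```cpp
#include <algorithm>
#include <cassert>

// Model of the BB SLP branch: clamp the minimum number of subparts to the
// group size.  The group must either be a whole multiple of the minimum
// vector length or fit within a single vector.
unsigned bb_slp_nunits(unsigned min_subparts, unsigned group_size)
{
  assert(min_subparts > 0 && group_size > 0);
  assert(group_size % min_subparts == 0 || group_size <= min_subparts);
  return std::min(min_subparts, group_size);
}

// Model of the unchanged line
//   number_of_copies = nunits * number_of_vectors / group_size;
unsigned number_of_copies(unsigned nunits, unsigned number_of_vectors,
                          unsigned group_size)
{
  return nunits * number_of_vectors / group_size;
}
```

For instance, with a minimum of four subparts, a group of three scalars gives nunits = 3 and one copy in a single vector, while a group of eight scalars gives nunits = 4 and one copy spread over two vectors.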