[v11,03/12] Update constant creation for BB SLP with predicated tails
Commit Message
Created a new function, gimple_build_vector_from_elems,
for use when creating vectorized definitions for basic block
vectorization in vect_create_constant_vectors.
The existing gimple_build_vector function cannot be used
for SVE vector types because it relies on the type
associated with the tree_vector_builder having a constant
number of subparts. Even if that limitation were lifted, the
possibility of tree_vector_builder patterns being used is
inappropriate.
The new function takes a vector type and vec of tree nodes
giving the element values to put into the built vector, instead of an
instance of tree_vector_builder. If the number of values is zero then
a zero constant is built. If all values are constant then a vector
constant is built. Otherwise, a new constructor node is created.
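
The three cases above can be modelled outside GCC as follows. This is only
a standalone sketch under stated assumptions: Elem, Kind, Built and
build_vector_from_elems are hypothetical stand-ins rather than GCC's tree,
VECTOR_CST or CONSTRUCTOR interfaces, and min_subparts stands for the
constant lower bound on the number of subparts of the vector type.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a tree node: either a known constant or a
// named non-constant value.
struct Elem {
  std::optional<long> constant;  // set when the element is a constant
  std::string name;              // only meaningful for non-constant elements
};

// Which kind of node the model would produce.
enum class Kind { ZeroConstant, VectorConstant, Constructor };

struct Built {
  Kind kind;
  std::vector<Elem> elems;  // explicit elements only; omitted ones are zero
};

// Model of the decision described in the commit message.  The list of
// elements may be shorter than MIN_SUBPARTS but must not be longer.
Built build_vector_from_elems(std::size_t min_subparts,
                              const std::vector<Elem>& elts) {
  assert(elts.size() <= min_subparts);

  // No values at all: the result is an all-zero constant.
  if (elts.empty())
    return {Kind::ZeroConstant, {}};

  // Find out whether every supplied value is constant.
  bool all_constant = true;
  for (const Elem& e : elts)
    if (!e.constant)
      all_constant = false;

  // All constant: a vector constant.  Otherwise: a constructor.
  return {all_constant ? Kind::VectorConstant : Kind::Constructor, elts};
}
```

For example, with min_subparts equal to 4, an empty list gives
Kind::ZeroConstant, two constant elements give Kind::VectorConstant, and a
list containing any non-constant element gives Kind::Constructor.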
gcc/ChangeLog:
* gimple-fold.cc (gimple_build_vector_from_elems): Define a
new function to build a vector from a list of elements that need
not be complete.
* gimple-fold.h (gimple_build_vector_from_elems): Declare a new
function and a simpler overloaded version with fewer parameters.
* tree-vect-slp.cc (vect_create_constant_vectors):
Use gimple_build_vector_from_elems instead of
duplicate_and_interleave to create non-uniform constant
vectors for BB SLP vectorization.
---
gcc/gimple-fold.cc | 55 ++++++++++++++++++++++++++++++++++++++++++++
gcc/gimple-fold.h | 14 +++++++++++
gcc/tree-vect-slp.cc | 40 +++++++++++++++++++++++++-------
3 files changed, 101 insertions(+), 8 deletions(-)
Comments
On Wed, Jun 3, 2026 at 5:20 PM Christopher Bazley <chris.bazley@arm.com> wrote:
>
> Created a new function, gimple_build_vector_from_elems,
> for use when creating vectorized definitions for basic block
> vectorization in vect_create_constant_vectors.
>
> The existing gimple_build_vector function cannot be used
> for SVE vector types because it relies on the type
> associated with the tree_vector_builder having a constant
> number of subparts. Even if that limitation were lifted, the
> possibility of tree_vector_builder patterns being used is
> inappropriate.
>
> The new function takes a vector type and vec of tree nodes
> giving the element values to put into the built vector, instead of an
> instance of tree_vector_builder. If the number of values is zero then
> a zero constant is built. If all values are constant then a vector
> constant is built. Otherwise, a new constructor node is created.
>
> gcc/ChangeLog:
>
> * gimple-fold.cc (gimple_build_vector_from_elems): Define a
> new function to build a vector from a list of elements that need
> not be complete.
> * gimple-fold.h (gimple_build_vector_from_elems): Declare a new
> function and a simpler overloaded version with fewer parameters.
> * tree-vect-slp.cc (vect_create_constant_vectors):
> Use gimple_build_vector_from_elems instead of
> duplicate_and_interleave to create non-uniform constant
> vectors for BB SLP vectorization.
> ---
> gcc/gimple-fold.cc | 55 ++++++++++++++++++++++++++++++++++++++++++++
> gcc/gimple-fold.h | 14 +++++++++++
> gcc/tree-vect-slp.cc | 40 +++++++++++++++++++++++++-------
> 3 files changed, 101 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> index 1ceb5aa5fba..3462c5acb6e 100644
> --- a/gcc/gimple-fold.cc
> +++ b/gcc/gimple-fold.cc
> @@ -11425,6 +11425,61 @@ gimple_build_vector (gimple_stmt_iterator *gsi,
> return builder->build ();
> }
>
> +/* Build a vector of type VECTYPE from a partial list of ELTS, handling the case
> + in which some elements are non-constant. The list of values may be shorter
> + than the minimum number of subparts implied by VECTYPE. (When the vector
> + type is variable-length, the actual number of subparts may not be known.)
> + Omitted elements are implicitly zero.
> +
> + Return a gimple value for the result, inserting any new instructions
> + to GSI honoring BEFORE and UPDATE. */
> +
> +tree
> +gimple_build_vector_from_elems (gimple_stmt_iterator *gsi, bool before,
> + gsi_iterator_update update, location_t loc,
> + tree vectype, const vec<tree> &elts)
The name is not distinctive enough to explain why this would be used
instead of gimple_build_vector.
In particular ...
> +{
> + unsigned int encoded_nelts = elts.length ();
> + gimple_seq seq = NULL;
> + gcc_assert (TREE_CODE (vectype) == VECTOR_TYPE);
> + unsigned int lower_bound
> + = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vectype));
> + gcc_assert (encoded_nelts <= lower_bound);
> +
> + if (encoded_nelts == 0)
> + return build_zero_cst (vectype);
> +
> + /* Prepare a vector of constructor elements and find out whether all
> + of the element values are constant. */
> + vec<constructor_elt, va_gc> *v;
> + vec_alloc (v, encoded_nelts);
> + bool is_constant = true;
> +
> + for (unsigned int i = 0; i < encoded_nelts; ++i)
> + {
> + if (!CONSTANT_CLASS_P (elts[i]))
> + is_constant = false;
> +
> + CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, elts[i]);
> + }
> +
> + /* If all element values are constant then we can return a new VECTOR_CST
> + node. Any elements for which no value is supplied will be zero. */
> + if (is_constant)
> + return build_vector_from_ctor (vectype, v);
... this case will again use a vector builder and will thus likely ICE for
cases that cannot be handled by can_duplicate_and_interleave_p.
Which raises the question: we agreed on how to handle VLA vector
CONSTRUCTORs, but does the VLA VECTOR_CST representation not
have something equivalent here?
As for naming, I'd prefer something like
gimple_build_forced_constant_size_vector ()
or similar. I'm not sure why we cannot use a tree_vector_builder
here; possibly it could even get a special force-constant-size mode
that we can just switch on?
> +
> + tree res;
> + if (gimple_in_ssa_p (cfun))
> + res = make_ssa_name (vectype);
> + else
> + res = create_tmp_reg (vectype);
> + gimple *stmt = gimple_build_assign (res, build_constructor (vectype, v));
> + gimple_set_location (stmt, loc);
> + gimple_seq_add_stmt_without_update (&seq, stmt);
> + gimple_build_insert_seq (gsi, before, update, seq);
> + return res;
> +}
> +
> /* Emit gimple statements into &stmts that take a value given in OLD_SIZE
> and generate a value guaranteed to be rounded upwards to ALIGN.
>
> diff --git a/gcc/gimple-fold.h b/gcc/gimple-fold.h
> index f1853560779..8b324be005a 100644
> --- a/gcc/gimple-fold.h
> +++ b/gcc/gimple-fold.h
> @@ -243,6 +243,20 @@ gimple_build_vector (gimple_seq *seq, tree_vector_builder *builder)
> UNKNOWN_LOCATION, builder);
> }
>
> +extern tree gimple_build_vector_from_elems (gimple_stmt_iterator *, bool,
> + enum gsi_iterator_update,
> + location_t, tree vectype,
> + const vec<tree> &);
> +
> +inline tree
> +gimple_build_vector_from_elems (gimple_seq *seq, tree vectype,
> + const vec<tree> &elts)
> +{
> + gimple_stmt_iterator gsi = gsi_last (*seq);
> + return gimple_build_vector_from_elems (&gsi, false, GSI_CONTINUE_LINKING,
> + UNKNOWN_LOCATION, vectype, elts);
> +}
> +
> extern tree gimple_build_round_up (gimple_stmt_iterator *, bool,
> enum gsi_iterator_update,
> location_t, tree, tree,
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 4dd7e6e1e21..f91d3e723ec 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -10727,7 +10727,7 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
> unsigned j, number_of_places_left_in_vector;
> tree vector_type;
> tree vop;
> - int group_size = op_node->ops.length ();
> + unsigned int group_size = op_node->ops.length ();
> unsigned int vec_num, i;
> unsigned number_of_copies = 1;
> bool constant_p;
> @@ -10757,10 +10757,23 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
> (s1, s2, ..., s8). We will create two vectors {s1, s2, s3, s4} and
> {s5, s6, s7, s8}. */
>
> - /* When using duplicate_and_interleave, we just need one element for
> - each scalar statement. */
> - if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
> - nunits = group_size;
> + if (is_a<bb_vec_info> (vinfo))
> + {
> + /* We don't use duplicate_and_interleave for basic block vectorization.
> + We know that either the group size is exactly divisible by the vector
> + length or it fits within a single vector. */
> + nunits = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vector_type));
> + gcc_checking_assert (multiple_p (group_size, nunits)
> + || known_le (group_size, nunits));
> + nunits = MIN (nunits, group_size);
> + }
> + else
> + {
> + /* When using duplicate_and_interleave, we just need one element for
> + each scalar statement. */
> + if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
> + nunits = group_size;
> + }
>
> number_of_copies = nunits * number_of_vectors / group_size;
>
> @@ -10860,6 +10873,11 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
> ? multiple_p (type_nunits, nunits)
> : known_eq (type_nunits, nunits))
> vec_cst = gimple_build_vector (&ctor_seq, &elts);
> + else if (is_a<bb_vec_info> (vinfo))
> + {
> + vec_cst = gimple_build_vector_from_elems (&ctor_seq,
> + elts.type (), elts);
> + }
> else
> {
> if (permute_results.is_empty ())
> @@ -10925,9 +10943,15 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
> NUMBER_OF_SCALARS/NUNITS or NUNITS/NUMBER_OF_SCALARS, and hence we have
> to replicate the vectors. */
> while (number_of_vectors > SLP_TREE_VEC_DEFS (op_node).length ())
> - for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
> - i++)
> - SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
> + {
> + /* Guard against the outer loop never terminating because the
> + inner loop is never entered. */
> + gcc_checking_assert (vec_num > 0);
> +
> + for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
> + i++)
> + SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
> + }
> }
>
> /* Get the scalar definition of the Nth lane from SLP_NODE or NULL_TREE
> --
> 2.43.0
>
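
The nunits computation in the basic block branch of
vect_create_constant_vectors can be checked with small numbers. The
following is a standalone sketch, not GCC code: bb_slp_nunits and
number_of_copies are hypothetical helper names, plain unsigned integers
replace poly_uint64, and min_subparts is assumed to be the nonzero result
of constant_lower_bound (TYPE_VECTOR_SUBPARTS (vector_type)).

```cpp
#include <algorithm>
#include <cassert>

// Number of elements placed in each built vector for BB SLP.  The group
// either fills whole vectors exactly or fits within a single vector.
unsigned bb_slp_nunits(unsigned min_subparts, unsigned group_size) {
  assert(min_subparts > 0);
  assert(group_size % min_subparts == 0 || group_size <= min_subparts);
  return std::min(min_subparts, group_size);
}

// Same formula as in the patch context:
//   number_of_copies = nunits * number_of_vectors / group_size
unsigned number_of_copies(unsigned nunits, unsigned number_of_vectors,
                          unsigned group_size) {
  return nunits * number_of_vectors / group_size;
}
```

With a lower bound of 4 subparts, a group of 8 scalars spread over 2
vectors gives nunits = 4 and one copy of the group, while a group of 3
scalars in 1 vector gives nunits = 3 and again one copy.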
@@ -11425,6 +11425,61 @@ gimple_build_vector (gimple_stmt_iterator *gsi,
return builder->build ();
}
+/* Build a vector of type VECTYPE from a partial list of ELTS, handling the case
+ in which some elements are non-constant. The list of values may be shorter
+ than the minimum number of subparts implied by VECTYPE. (When the vector
+ type is variable-length, the actual number of subparts may not be known.)
+ Omitted elements are implicitly zero.
+
+ Return a gimple value for the result, inserting any new instructions
+ to GSI honoring BEFORE and UPDATE. */
+
+tree
+gimple_build_vector_from_elems (gimple_stmt_iterator *gsi, bool before,
+ gsi_iterator_update update, location_t loc,
+ tree vectype, const vec<tree> &elts)
+{
+ unsigned int encoded_nelts = elts.length ();
+ gimple_seq seq = NULL;
+ gcc_assert (TREE_CODE (vectype) == VECTOR_TYPE);
+ unsigned int lower_bound
+ = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vectype));
+ gcc_assert (encoded_nelts <= lower_bound);
+
+ if (encoded_nelts == 0)
+ return build_zero_cst (vectype);
+
+ /* Prepare a vector of constructor elements and find out whether all
+ of the element values are constant. */
+ vec<constructor_elt, va_gc> *v;
+ vec_alloc (v, encoded_nelts);
+ bool is_constant = true;
+
+ for (unsigned int i = 0; i < encoded_nelts; ++i)
+ {
+ if (!CONSTANT_CLASS_P (elts[i]))
+ is_constant = false;
+
+ CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, elts[i]);
+ }
+
+ /* If all element values are constant then we can return a new VECTOR_CST
+ node. Any elements for which no value is supplied will be zero. */
+ if (is_constant)
+ return build_vector_from_ctor (vectype, v);
+
+ tree res;
+ if (gimple_in_ssa_p (cfun))
+ res = make_ssa_name (vectype);
+ else
+ res = create_tmp_reg (vectype);
+ gimple *stmt = gimple_build_assign (res, build_constructor (vectype, v));
+ gimple_set_location (stmt, loc);
+ gimple_seq_add_stmt_without_update (&seq, stmt);
+ gimple_build_insert_seq (gsi, before, update, seq);
+ return res;
+}
+
/* Emit gimple statements into &stmts that take a value given in OLD_SIZE
and generate a value guaranteed to be rounded upwards to ALIGN.
@@ -243,6 +243,20 @@ gimple_build_vector (gimple_seq *seq, tree_vector_builder *builder)
UNKNOWN_LOCATION, builder);
}
+extern tree gimple_build_vector_from_elems (gimple_stmt_iterator *, bool,
+ enum gsi_iterator_update,
+ location_t, tree vectype,
+ const vec<tree> &);
+
+inline tree
+gimple_build_vector_from_elems (gimple_seq *seq, tree vectype,
+ const vec<tree> &elts)
+{
+ gimple_stmt_iterator gsi = gsi_last (*seq);
+ return gimple_build_vector_from_elems (&gsi, false, GSI_CONTINUE_LINKING,
+ UNKNOWN_LOCATION, vectype, elts);
+}
+
extern tree gimple_build_round_up (gimple_stmt_iterator *, bool,
enum gsi_iterator_update,
location_t, tree, tree,
@@ -10727,7 +10727,7 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
unsigned j, number_of_places_left_in_vector;
tree vector_type;
tree vop;
- int group_size = op_node->ops.length ();
+ unsigned int group_size = op_node->ops.length ();
unsigned int vec_num, i;
unsigned number_of_copies = 1;
bool constant_p;
@@ -10757,10 +10757,23 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
(s1, s2, ..., s8). We will create two vectors {s1, s2, s3, s4} and
{s5, s6, s7, s8}. */
- /* When using duplicate_and_interleave, we just need one element for
- each scalar statement. */
- if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
- nunits = group_size;
+ if (is_a<bb_vec_info> (vinfo))
+ {
+ /* We don't use duplicate_and_interleave for basic block vectorization.
+ We know that either the group size is exactly divisible by the vector
+ length or it fits within a single vector. */
+ nunits = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vector_type));
+ gcc_checking_assert (multiple_p (group_size, nunits)
+ || known_le (group_size, nunits));
+ nunits = MIN (nunits, group_size);
+ }
+ else
+ {
+ /* When using duplicate_and_interleave, we just need one element for
+ each scalar statement. */
+ if (!TYPE_VECTOR_SUBPARTS (vector_type).is_constant (&nunits))
+ nunits = group_size;
+ }
number_of_copies = nunits * number_of_vectors / group_size;
@@ -10860,6 +10873,11 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
? multiple_p (type_nunits, nunits)
: known_eq (type_nunits, nunits))
vec_cst = gimple_build_vector (&ctor_seq, &elts);
+ else if (is_a<bb_vec_info> (vinfo))
+ {
+ vec_cst = gimple_build_vector_from_elems (&ctor_seq,
+ elts.type (), elts);
+ }
else
{
if (permute_results.is_empty ())
@@ -10925,9 +10943,15 @@ vect_create_constant_vectors (vec_info *vinfo, slp_tree op_node)
NUMBER_OF_SCALARS/NUNITS or NUNITS/NUMBER_OF_SCALARS, and hence we have
to replicate the vectors. */
while (number_of_vectors > SLP_TREE_VEC_DEFS (op_node).length ())
- for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
- i++)
- SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
+ {
+ /* Guard against the outer loop never terminating because the
+ inner loop is never entered. */
+ gcc_checking_assert (vec_num > 0);
+
+ for (i = 0; SLP_TREE_VEC_DEFS (op_node).iterate (i, &vop) && i < vec_num;
+ i++)
+ SLP_TREE_VEC_DEFS (op_node).quick_push (vop);
+ }
}
/* Get the scalar definition of the Nth lane from SLP_NODE or NULL_TREE
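
The effect of the new gcc_checking_assert in the replication loop can be
illustrated with a standalone model. This is a sketch only: replicate_defs
is a hypothetical name, std::vector<int> stands in for
SLP_TREE_VEC_DEFS (op_node), and the check that the list is non-empty is
an extra assumption of this sketch, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Append copies of the first VEC_NUM definitions until at least
// NUMBER_OF_VECTORS definitions exist, then return the resulting list.
std::vector<int> replicate_defs(std::vector<int> defs,
                                std::size_t number_of_vectors,
                                std::size_t vec_num) {
  while (number_of_vectors > defs.size()) {
    // If vec_num were zero, the inner loop would never run and the outer
    // loop would never terminate.
    assert(vec_num > 0);
    // Extra assumption of this sketch: there is something to replicate.
    assert(!defs.empty());

    for (std::size_t i = 0; i < defs.size() && i < vec_num; ++i) {
      int vop = defs[i];  // copy first, since push_back may reallocate
      defs.push_back(vop);
    }
  }
  return defs;
}
```

Starting from two definitions with vec_num equal to 2 and four vectors
required, one pass of the outer loop appends both definitions again; with
vec_num equal to 0 the model stops at the assertion instead of spinning.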