[v10,02/11] Implement recording/getting of mask/length for BB SLP

Message ID 20260603131548.50668-3-chris.bazley@arm.com
State New
Series Extend BB SLP vectorization to use predicated tails

Commit Message

Christopher Bazley June 3, 2026, 1:15 p.m. UTC
  Add three new fields to SLP tree nodes, which are accessed as
SLP_TREE_CAN_USE_PARTIAL_VECTORS_P, SLP_TREE_PARTIAL_VECTORS_STYLE and
SLP_TREE_NUM_PARTIAL_VECTORS (the last is used only for costing).

SLP_TREE_CAN_USE_PARTIAL_VECTORS_P is analogous to the existing
predicate LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P. It is initialized to
true. This flag just records whether the target could vectorize a
node using a partial vector; it does not say anything about
whether the vector actually is partial, or how the target would support
use of a partial vector. Some kinds of node require mask/length for
partial vectors; others don't. In the latter case (e.g., for add
operations), SLP_TREE_CAN_USE_PARTIAL_VECTORS_P will remain true.

SLP_TREE_PARTIAL_VECTORS_STYLE is analogous to the existing field
LOOP_VINFO_PARTIAL_VECTORS_STYLE. Both are initialized to 'none'.
The vect_partial_vectors_avx512 enumerator is not used for BB SLP.
Unlike loop vectorization, a different style of partial vectors can be
chosen for each node during analysis of that node.

Implement the recently-introduced wrapper functions,
vect_record_(len|mask), for BB SLP by setting
SLP_TREE_PARTIAL_VECTORS_STYLE to indicate that a mask or length should
be used for a given SLP node. The passed-in vec_info is ignored.

Implement the vect_fully_(masked|with_length)_p wrapper functions for
BB SLP by checking the SLP_TREE_PARTIAL_VECTORS_STYLE. This should be
sufficient because at most one of vect_record_(len|mask) and
vect_cannot_use_partial_vectors is expected to be called for any
given SLP node. SLP_TREE_CAN_USE_PARTIAL_VECTORS_P should be true if
the style is not 'none', but its value isn't used beyond the analysis
phase.
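
As an illustration only, the record/query scheme described above can be
modelled outside GCC roughly as follows. This is a minimal sketch: the
toy_* names and the toy_slp_node struct are hypothetical stand-ins rather
than GCC's real types, and only the enumerator and member names are taken
from the patch.

```cpp
#include <cassert>

// Enumerators as named in the patch.
enum vect_partial_vector_style {
  vect_partial_vectors_none,
  vect_partial_vectors_while_ult,
  vect_partial_vectors_avx512,
  vect_partial_vectors_len
};

// Hypothetical stand-in for the three new _slp_tree members.
struct toy_slp_node {
  vect_partial_vector_style partial_vector_style = vect_partial_vectors_none;
  bool can_use_partial_vectors = true;
  int num_partial_vectors = 0;
};

// The first call fixes the style; later calls must request the same style,
// so a node cannot flip-flop between mask and length.
void toy_record_style (toy_slp_node &node, vect_partial_vector_style style)
{
  assert (style != vect_partial_vectors_none);
  assert (style != vect_partial_vectors_avx512);
  if (node.partial_vector_style == vect_partial_vectors_none)
    node.partial_vector_style = style;
  else
    assert (node.partial_vector_style == style);
}

// Each record call also counts one more mask/length for costing.
void toy_record_mask (toy_slp_node &node)
{
  toy_record_style (node, vect_partial_vectors_while_ult);
  node.num_partial_vectors++;
}

void toy_record_len (toy_slp_node &node)
{
  toy_record_style (node, vect_partial_vectors_len);
  node.num_partial_vectors++;
}

// The predicates only need to look at the recorded style.
bool toy_fully_masked_p (const toy_slp_node &node)
{
  return node.partial_vector_style == vect_partial_vectors_while_ult;
}

bool toy_fully_with_length_p (const toy_slp_node &node)
{
  return node.partial_vector_style == vect_partial_vectors_len;
}
```

In this model a freshly created node answers false to both predicates, and
two different nodes may end up with different styles, which is the
per-node freedom that loop vectorization does not have.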

The implementations of vect_get_mask and vect_get_len for BB SLP are
non-trivial (albeit simpler than for loop vectorization), so they are
delegated to the SLP-specific functions vect_slp_get_bb_mask and
vect_slp_get_bb_len, which are defined in tree-vect-slp.cc.
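
For fixed-length vectors, the core of both functions is the same piece of
arithmetic: every vector except the last is full, and the last one covers
whatever lanes remain. The sketch below models only that arithmetic, with
plain integers instead of trees and poly_ints; toy_active_lanes and
toy_len are hypothetical names, and the variable-length case (a single
vector whose lane count is not a compile-time constant) is not modelled.

```cpp
#include <cassert>

// Number of active lanes in vector INDEX (0 <= INDEX < NVECTORS) when
// GROUP_SIZE scalar lanes are spread over NVECTORS vectors of NUNITS lanes.
unsigned int
toy_active_lanes (unsigned int group_size, unsigned int nunits,
		  unsigned int nvectors, unsigned int index)
{
  assert (nvectors >= 1);
  assert (index < nvectors);

  // Only the last vector can be a partial vector.
  if (index + 1 < nvectors)
    return nunits;

  // Lanes already covered by the full vectors preceding the tail.
  const unsigned int head_size = (nvectors - 1) * nunits;
  assert (head_size <= group_size);
  return group_size - head_size;
}

// A length is the active lane count scaled by FACTOR, i.e. measured in
// subelements.
unsigned int
toy_len (unsigned int group_size, unsigned int nunits, unsigned int nvectors,
	 unsigned int index, unsigned int factor)
{
  return toy_active_lanes (group_size, nunits, nvectors, index) * factor;
}
```

For example, 7 lanes spread over two 4-lane vectors give 4 active lanes in
vector 0 and 3 in vector 1. In the patch, the mask for such a tail is
produced as WHILE_ULT (0, 3), while a tail that turns out to be full gets
an all-ones mask constant instead.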

Implement the vect_cannot_use_partial_vectors wrapper function by
setting the SLP_TREE_CAN_USE_PARTIAL_VECTORS_P flag to false.
To prevent regressions, vect_can_use_partial_vectors_p still returns
false for BB SLP (for now), regardless of the value of the flag. This
prevents vect_record_mask or vect_record_len from being called.
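
The interaction between the per-node flag and the still-disabled query can
be pictured with the following sketch. It is an assumption-laden model,
not the real code: the body of vect_can_use_partial_vectors_p is not part
of this patch, and toy_node, toy_cannot_use_partial_vectors and
toy_can_use_partial_vectors_p are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for an SLP node; the flag starts out true.
struct toy_node {
  bool can_use_partial_vectors = true;
};

// BB SLP side of vect_cannot_use_partial_vectors: clear the per-node flag.
void toy_cannot_use_partial_vectors (toy_node &node)
{
  node.can_use_partial_vectors = false;
}

// Model of the query: loop vectorization consults its own loop-wide flag,
// whereas BB SLP answers false for now, whatever the node's flag says.
bool toy_can_use_partial_vectors_p (bool is_loop, bool loop_flag,
				    const toy_node &node)
{
  if (is_loop)
    return loop_flag;
  (void) node; /* FORNOW */
  return false;
}
```

Under this model, callers that test the query before recording a mask or
length never reach the record functions for BB SLP, so every node keeps
the 'none' style until the query is enabled.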

gcc/ChangeLog:

	* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize new
	partial_vector_style, can_use_partial_vectors and
	num_partial_vectors members.
	(vect_slp_analyze_node_operations): Account for worst-case
	prologue costs of per-node partial-vector mask or length
	materialisation.
	(vect_slp_record_bb_style): New function to set the partial
	vector style of an SLP node, checking that the style does not
	flip-flop between mask and length.
	(vect_slp_record_bb_mask): New function that uses
	vect_slp_record_bb_style to set the partial vector style of the
	SLP tree node to vect_partial_vectors_while_ult.
	(vect_slp_get_bb_mask): New function to materialize a mask for
	basic block SLP vectorization.
	(vect_slp_record_bb_len): New function that uses
	vect_slp_record_bb_style to set the partial vector style of the
	SLP tree node to vect_partial_vectors_len.
	(vect_slp_get_bb_len): New function to materialize a length for
	basic block SLP vectorization.
	* tree-vect-stmts.cc (vect_record_mask): Handle the basic block
	SLP use case by delegating to vect_slp_record_bb_mask.
	(vect_get_mask): Handle the basic block SLP use case by
	delegating to vect_slp_get_bb_mask.
	(vect_record_len): Handle the basic block SLP use case by
	delegating to vect_slp_record_bb_len.
	(vect_get_len): Handle the basic block SLP use case by
	delegating to vect_slp_get_bb_len.
	(vect_gen_while_ssa_name): New function containing code
	refactored out of vect_gen_while for reuse by
	vect_slp_get_bb_mask.
	(vect_gen_while): Use vect_gen_while_ssa_name instead of custom
	code for some of the implementation.
	* tree-vectorizer.h (enum vect_partial_vector_style): Move this
	definition earlier to allow reuse by struct _slp_tree.
	(struct _slp_tree): Add a partial_vector_style member to record
	whether to use a length or mask for the SLP tree node, if
	partial vectors are required and supported.
	Add a can_use_partial_vectors member to record whether partial
	vectors are supported for the SLP tree node.
	Add a num_partial_vectors member for costing.
	(SLP_TREE_PARTIAL_VECTORS_STYLE): New member accessor macro.
	(SLP_TREE_CAN_USE_PARTIAL_VECTORS_P): New member accessor macro.
	(SLP_TREE_NUM_PARTIAL_VECTORS): New member accessor macro.
	(vect_gen_while_ssa_name): Declaration of a new function.
	(vect_slp_record_bb_mask): As above.
	(vect_slp_get_bb_mask): As above.
	(vect_slp_record_bb_len): As above.
	(vect_slp_get_bb_len): As above.
	(vect_cannot_use_partial_vectors): Handle the basic block SLP
	use case by setting SLP_TREE_CAN_USE_PARTIAL_VECTORS_P to
	false.
	(vect_fully_with_length_p): Handle the basic block SLP use
	case by checking whether the SLP_TREE_PARTIAL_VECTORS_STYLE is
	vect_partial_vectors_len.
	(vect_fully_masked_p): Handle the basic block SLP use case by
	checking whether the SLP_TREE_PARTIAL_VECTORS_STYLE is
	vect_partial_vectors_while_ult.
---
 gcc/tree-vect-slp.cc   | 182 +++++++++++++++++++++++++++++++++++++++++
 gcc/tree-vect-stmts.cc |  52 +++++++-----
 gcc/tree-vectorizer.h  |  52 ++++++++----
 3 files changed, 247 insertions(+), 39 deletions(-)
  

Patch

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 075e93f04a9..4dd7e6e1e21 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -125,6 +125,9 @@  _slp_tree::_slp_tree ()
   SLP_TREE_GS_BASE (this) = NULL_TREE;
   this->ldst_lanes = false;
   this->avoid_stlf_fail = false;
+  SLP_TREE_PARTIAL_VECTORS_STYLE (this) = vect_partial_vectors_none;
+  SLP_TREE_CAN_USE_PARTIAL_VECTORS_P (this) = true;
+  SLP_TREE_NUM_PARTIAL_VECTORS (this) = 0;
   SLP_TREE_VECTYPE (this) = NULL_TREE;
   SLP_TREE_REPRESENTATIVE (this) = NULL;
   this->cycle_info.id = -1;
@@ -8958,6 +8961,40 @@  vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
 	  vect_prologue_cost_for_slp (vinfo, child, cost_vec);
 	}
 
+  if (res)
+    {
+      /* Take care of special costs for partial vectors.
+	 Costing each partial vector is excessive for many SLP instances,
+	 because it is common to materialise identical masks/lengths for related
+	 operations (e.g., for vector loads and stores of the same length).
+	 Masks/lengths can also be shared between SLP subgraphs or eliminated by
+	 pattern-based lowering during instruction selection.  However, it's
+	 simpler and safer to use the worst-case cost; if this ends up being the
+	 tie-breaker between vectorizing or not, then it's probably better not
+	 to vectorize.  */
+      const int num_partial_vectors = SLP_TREE_NUM_PARTIAL_VECTORS (node);
+
+      if (SLP_TREE_PARTIAL_VECTORS_STYLE (node)
+	  == vect_partial_vectors_while_ult)
+	{
+	  gcc_assert (num_partial_vectors > 0);
+	  record_stmt_cost (cost_vec, num_partial_vectors, vector_stmt, NULL,
+			    NULL, NULL_TREE, 0, vect_prologue);
+	}
+      else if (SLP_TREE_PARTIAL_VECTORS_STYLE (node)
+	       == vect_partial_vectors_len)
+	{
+	  /* Need to set up a length in the prologue.  */
+	  gcc_assert (num_partial_vectors > 0);
+	  record_stmt_cost (cost_vec, num_partial_vectors, scalar_stmt, NULL,
+			    NULL, NULL_TREE, 0, vect_prologue);
+	}
+      else
+	{
+	  gcc_assert (num_partial_vectors == 0);
+	}
+    }
+
   /* If this node or any of its children can't be vectorized, try pruning
      the tree here rather than felling the whole thing.  */
   if (!res && vect_slp_convert_to_external (vinfo, node, node_instance))
@@ -12441,3 +12478,148 @@  vect_schedule_slp (vec_info *vinfo, const vec<slp_instance> &slp_instances)
         }
     }
 }
+
+/* Record that a specific partial vector style could be used to vectorize
+   SLP_NODE if required.  */
+
+static void
+vect_slp_record_bb_style (slp_tree slp_node, vect_partial_vector_style style)
+{
+  gcc_assert (style != vect_partial_vectors_none);
+  gcc_assert (style != vect_partial_vectors_avx512);
+
+  if (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) == vect_partial_vectors_none)
+    SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) = style;
+  else
+    gcc_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node) == style);
+}
+
+/* Record that a mask could be used to vectorize SLP_NODE if required, and
+   count one more mask for costing purposes.  The number of vectors, the
+   vector type and the scalar mask passed by the caller are currently
+   unused.  */
+void
+vect_slp_record_bb_mask (slp_tree slp_node, unsigned int /* nvectors */,
+			 tree /* vectype */, tree /* scalar_mask */)
+{
+  vect_slp_record_bb_style (slp_node, vect_partial_vectors_while_ult);
+
+  /* FORNOW: this often overestimates the number of masks for costing purposes
+     because, after lowering, masks have often been eliminated, shared between
+     SLP nodes, or even shared between SLP subgraphs.  */
+  SLP_TREE_NUM_PARTIAL_VECTORS (slp_node)++;
+}
+
+/* Materialize mask number INDEX for a group of scalar stmts in SLP_NODE that
+   operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS.
+   Insert any set-up statements before GSI.  */
+
+tree
+vect_slp_get_bb_mask (slp_tree slp_node, gimple_stmt_iterator *gsi,
+		      unsigned int nvectors, tree vectype, unsigned int index)
+{
+  gcc_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
+	      == vect_partial_vectors_while_ult);
+  gcc_assert (nvectors >= 1);
+  gcc_assert (index < nvectors);
+
+  const poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  const unsigned int group_size = SLP_TREE_LANES (slp_node);
+  unsigned int mask_size = group_size;
+  const tree masktype = truth_type_for (vectype);
+
+  if (nunits.is_constant ())
+    {
+      /* Only the last vector can be a partial vector.  */
+      if (index + 1 < nvectors)
+	return build_minus_one_cst (masktype);
+
+      /* Return a mask for a possibly-partial tail vector.  */
+      const unsigned int const_nunits = nunits.to_constant ();
+      const unsigned int head_size = (nvectors - 1) * const_nunits;
+      gcc_assert (head_size <= group_size);
+      mask_size = group_size - head_size;
+
+      if (mask_size == const_nunits)
+	return build_minus_one_cst (masktype);
+    }
+  else
+    {
+      /* Return a mask for a single variable-length vector.  */
+      gcc_assert (nvectors == 1);
+      gcc_assert (known_le (mask_size, nunits));
+    }
+
+  /* FORNOW: don't bother maintaining a set of mask constants to allow
+     sharing between nodes belonging to the same instance of bb_vec_info
+     or even within the same SLP subgraph.  */
+  gimple_seq stmts = NULL;
+  const tree cmp_type = size_type_node;
+  const tree start_index = build_zero_cst (cmp_type);
+  const tree end_index = build_int_cst (cmp_type, mask_size);
+  const tree mask = make_temp_ssa_name (masktype, NULL, "slp_mask");
+  vect_gen_while_ssa_name (&stmts, masktype, start_index, end_index, mask);
+  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+  return mask;
+}
+
+/* Record that a complete set of lengths associated with SLP_NODE would need to
+   contain a sequence of NVECTORS lengths for controlling an operation on
+   VECTYPE.  The operation splits each element of VECTYPE into FACTOR separate
+   subelements, measuring the length as a number of these subelements.  */
+
+void
+vect_slp_record_bb_len (slp_tree slp_node, unsigned int /* nvectors */,
+			tree /* vectype */, unsigned int /* factor */)
+{
+  vect_slp_record_bb_style (slp_node, vect_partial_vectors_len);
+
+  /* FORNOW: this probably overestimates the number of lengths for costing
+     purposes because, after lowering, lengths might have been eliminated,
+     shared between SLP nodes, or even shared between SLP subgraphs.  */
+  SLP_TREE_NUM_PARTIAL_VECTORS (slp_node)++;
+}
+
+/* Materialize length number INDEX for a group of scalar stmts in SLP_NODE that
+   operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS.
+   Return a value that contains FACTOR multiplied by the number of elements that
+   should be processed.  */
+
+tree
+vect_slp_get_bb_len (slp_tree slp_node, unsigned int nvectors, tree vectype,
+		     unsigned int index, unsigned int factor, bool adjusted)
+{
+  gcc_checking_assert (SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
+		       == vect_partial_vectors_len);
+  gcc_assert (nvectors >= 1);
+  gcc_assert (index < nvectors);
+  (void) adjusted;
+
+  const poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  const unsigned int group_size = SLP_TREE_LANES (slp_node);
+  unsigned int len = group_size;
+
+  if (nunits.is_constant ())
+    {
+      const unsigned int const_nunits = nunits.to_constant ();
+
+      /* Only the last vector can be a partial vector.  */
+      if (index + 1 < nvectors)
+	len = const_nunits;
+      else
+	{
+	  /* Return a length for a possibly-partial tail vector.  */
+	  const unsigned int head_size = (nvectors - 1) * const_nunits;
+	  gcc_assert (head_size <= group_size);
+	  len = group_size - head_size;
+	}
+    }
+  else
+    {
+      /* Return a length for a single variable-length vector.  */
+      gcc_assert (nvectors == 1);
+      gcc_assert (known_le (len, nunits));
+    }
+
+  return size_int (len * factor);
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 15fca17a407..ecad74e7cbf 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1385,7 +1385,9 @@  vectorizable_internal_function (combined_fn cfn, tree fndecl,
 /* Record that a complete set of masks associated with VINFO would need to
    contain a sequence of NVECTORS masks that each control a vector of type
    VECTYPE.  If SCALAR_MASK is nonnull, the fully-masked loop would AND
-   these vector masks with the vector version of SCALAR_MASK.  */
+   these vector masks with the vector version of SCALAR_MASK.  Alternatively,
+   if doing basic block vectorization, record that a mask could be used to
+   vectorize SLP_NODE if required.  */
 static void
 vect_record_mask (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
 		  tree vectype, tree scalar_mask)
@@ -1395,7 +1397,7 @@  vect_record_mask (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
     vect_record_loop_mask (loop_vinfo, &LOOP_VINFO_MASKS (loop_vinfo), nvectors,
 			   vectype, scalar_mask);
   else
-    (void) slp_node; /* FORNOW */
+    vect_slp_record_bb_mask (slp_node, nvectors, vectype, scalar_mask);
 }
 
 /* Given a complete set of masks associated with VINFO, extract mask number
@@ -1413,16 +1415,15 @@  vect_get_mask (vec_info *vinfo, slp_tree slp_node, gimple_stmt_iterator *gsi,
     return vect_get_loop_mask (loop_vinfo, gsi, &LOOP_VINFO_MASKS (loop_vinfo),
 			       nvectors, vectype, index);
   else
-    {
-      (void) slp_node; /* FORNOW */
-      return NULL_TREE;
-    }
+    return vect_slp_get_bb_mask (slp_node, gsi, nvectors, vectype, index);
 }
 
 /* Record that a complete set of lengths associated with VINFO would need to
    contain a sequence of NVECTORS lengths for controlling an operation on
    VECTYPE.  The operation splits each element of VECTYPE into FACTOR separate
-   subelements, measuring the length as a number of these subelements.  */
+   subelements, measuring the length as a number of these subelements.
+   Alternatively, if doing basic block vectorization, record that a length limit
+   could be used to vectorize SLP_NODE if required.  */
 static void
 vect_record_len (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
 		 tree vectype, unsigned int factor)
@@ -1432,7 +1433,7 @@  vect_record_len (vec_info *vinfo, slp_tree slp_node, unsigned int nvectors,
     vect_record_loop_len (loop_vinfo, &LOOP_VINFO_LENS (loop_vinfo), nvectors,
 			  vectype, factor);
   else
-    (void) slp_node; /* FORNOW */
+    vect_slp_record_bb_len (slp_node, nvectors, vectype, factor);
 }
 
 /* Given a complete set of lengths associated with VINFO, extract length number
@@ -1453,10 +1454,8 @@  vect_get_len (vec_info *vinfo, slp_tree slp_node, gimple_stmt_iterator *gsi,
     return vect_get_loop_len (loop_vinfo, gsi, &LOOP_VINFO_LENS (loop_vinfo),
 			      nvectors, vectype, index, factor, adjusted);
   else
-    {
-      (void) slp_node; /* FORNOW */
-      return NULL_TREE;
-    }
+    return vect_slp_get_bb_len (slp_node, nvectors, vectype, index, factor,
+				adjusted);
 }
 
 static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info,
@@ -14710,24 +14709,35 @@  supportable_indirect_convert_operation (code_helper code,
    mask[I] is true iff J + START_INDEX < END_INDEX for all J <= I.
    Add the statements to SEQ.  */
 
+void
+vect_gen_while_ssa_name (gimple_seq *seq, tree mask_type, tree start_index,
+			 tree end_index, tree ssa_name)
+{
+  tree cmp_type = TREE_TYPE (start_index);
+  gcc_checking_assert (direct_internal_fn_supported_p (IFN_WHILE_ULT, cmp_type,
+						       mask_type,
+						       OPTIMIZE_FOR_SPEED));
+  gcall *call
+    = gimple_build_call_internal (IFN_WHILE_ULT, 3, start_index, end_index,
+				  build_zero_cst (mask_type));
+  gimple_call_set_lhs (call, ssa_name);
+  gimple_seq_add_stmt (seq, call);
+}
+
+/* Like vect_gen_while_ssa_name except that it creates a new SSA_NAME node
+   for type MASK_TYPE defined in the created GIMPLE_CALL statement.  If NAME
+   is not a null pointer then it is used for the SSA_NAME in dumps.  */
+
 tree
 vect_gen_while (gimple_seq *seq, tree mask_type, tree start_index,
 		tree end_index, const char *name)
 {
-  tree cmp_type = TREE_TYPE (start_index);
-  gcc_checking_assert (direct_internal_fn_supported_p (IFN_WHILE_ULT,
-						       cmp_type, mask_type,
-						       OPTIMIZE_FOR_SPEED));
-  gcall *call = gimple_build_call_internal (IFN_WHILE_ULT, 3,
-					    start_index, end_index,
-					    build_zero_cst (mask_type));
   tree tmp;
   if (name)
     tmp = make_temp_ssa_name (mask_type, NULL, name);
   else
     tmp = make_ssa_name (mask_type);
-  gimple_call_set_lhs (call, tmp);
-  gimple_seq_add_stmt (seq, call);
+  vect_gen_while_ssa_name (seq, mask_type, start_index, end_index, tmp);
   return tmp;
 }
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index a3855568b09..f79f04ff8ac 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -312,6 +312,13 @@  struct vect_load_store_data : vect_data {
   bool subchain_p; // VMAT_STRIDED_SLP and VMAT_GATHER_SCATTER
 };
 
+enum vect_partial_vector_style {
+  vect_partial_vectors_none,
+  vect_partial_vectors_while_ult,
+  vect_partial_vectors_avx512,
+  vect_partial_vectors_len
+};
+
 /* A computation tree of an SLP instance.  Each node corresponds to a group of
    stmts to be packed in a SIMD stmt.  */
 struct _slp_tree {
@@ -377,7 +384,16 @@  struct _slp_tree {
   /* For BB vect, flag to indicate this load node should be vectorized
      as to avoid STLF fails because of related stores.  */
   bool avoid_stlf_fail;
-
+  /* The style used for implementing partial vectors if LANES is less than
+     the minimum number of lanes implied by the VECTYPE.  */
+  vect_partial_vector_style partial_vector_style;
+  /* Flag to indicate whether we still have the option of vectorizing this node
+     using partial vectors (i.e.  using lengths or masks to prevent use of
+     inactive scalar lanes).  */
+  bool can_use_partial_vectors;
+  /* Number of partial vectors, for costing purposes.  Should be 0 unless a
+     partial vector style has been set.  */
+  int num_partial_vectors;
   int vertex;
 
   /* The kind of operation as determined by analysis and optional
@@ -476,6 +492,9 @@  public:
 #define SLP_TREE_GS_BASE(S)			 (S)->gs_base
 #define SLP_TREE_REDUC_IDX(S)			 (S)->cycle_info.reduc_idx
 #define SLP_TREE_PERMUTE_P(S)			 ((S)->code == VEC_PERM_EXPR)
+#define SLP_TREE_PARTIAL_VECTORS_STYLE(S)	 (S)->partial_vector_style
+#define SLP_TREE_CAN_USE_PARTIAL_VECTORS_P(S)	 (S)->can_use_partial_vectors
+#define SLP_TREE_NUM_PARTIAL_VECTORS(S)		 (S)->num_partial_vectors
 
 inline vect_memory_access_type
 SLP_TREE_MEMORY_ACCESS_TYPE (slp_tree node)
@@ -486,13 +505,6 @@  SLP_TREE_MEMORY_ACCESS_TYPE (slp_tree node)
   return VMAT_UNINITIALIZED;
 }
 
-enum vect_partial_vector_style {
-    vect_partial_vectors_none,
-    vect_partial_vectors_while_ult,
-    vect_partial_vectors_avx512,
-    vect_partial_vectors_len
-};
-
 /* Key for map that records association between
    scalar conditions and corresponding loop mask, and
    is populated by vect_record_loop_mask.  */
@@ -2607,6 +2619,7 @@  extern tree vect_gen_perm_mask_checked (tree, const vec_perm_indices &);
 extern void optimize_mask_stores (class loop*);
 extern tree vect_gen_while (gimple_seq *, tree, tree, tree,
 			    const char * = nullptr);
+extern void vect_gen_while_ssa_name (gimple_seq *, tree, tree, tree, tree);
 extern tree vect_gen_while_not (gimple_seq *, tree, tree, tree);
 extern opt_result vect_get_vector_types_for_stmt (vec_info *,
 						  stmt_vec_info, tree *,
@@ -2788,7 +2801,14 @@  extern slp_tree vect_create_new_slp_node (unsigned, tree_code);
 extern void vect_free_slp_tree (slp_tree);
 extern bool compatible_calls_p (gcall *, gcall *, bool);
 extern int vect_slp_child_index_for_operand (const stmt_vec_info, int op);
-
+extern void vect_slp_record_bb_mask (slp_tree slp_node, unsigned int nvectors,
+				     tree vectype, tree scalar_mask);
+extern tree vect_slp_get_bb_mask (slp_tree, gimple_stmt_iterator *,
+				  unsigned int, tree, unsigned int);
+extern void vect_slp_record_bb_len (slp_tree slp_node, unsigned int nvectors,
+				    tree vectype, unsigned int factor);
+extern tree vect_slp_get_bb_len (slp_tree, unsigned int, tree, unsigned int,
+				 unsigned int, bool);
 extern tree prepare_vec_mask (vec_info *, tree, tree, tree,
 			      gimple_stmt_iterator *);
 extern tree vect_get_mask_load_else (int, tree);
@@ -2953,7 +2973,7 @@  vect_cannot_use_partial_vectors (vec_info *vinfo, slp_tree slp_node)
   if (loop_vinfo)
     LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
   else
-    (void) slp_node; /* FORNOW */
+    SLP_TREE_CAN_USE_PARTIAL_VECTORS_P (slp_node) = false;
 }
 
 /* Return true if VINFO is vectorizer state for loop vectorization, we've
@@ -2967,10 +2987,8 @@  vect_fully_with_length_p (vec_info *vinfo, slp_tree slp_node)
   if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
     return LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo);
   else
-    {
-      (void) slp_node; /* FORNOW */
-      return false;
-    }
+    return SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
+	   == vect_partial_vectors_len;
 }
 
 /* Return true if VINFO is vectorizer state for loop vectorization, we've
@@ -2984,10 +3002,8 @@  vect_fully_masked_p (vec_info *vinfo, slp_tree slp_node)
   if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
     return LOOP_VINFO_FULLY_MASKED_P (loop_vinfo);
   else
-    {
-      (void) slp_node; /* FORNOW */
-      return false;
-    }
+    return SLP_TREE_PARTIAL_VECTORS_STYLE (slp_node)
+	   == vect_partial_vectors_while_ult;
 }
 
 /* If STMT_INFO describes a reduction, return the vect_reduction_type