[2/3] tree-optimization/104582 - make SLP node available in vector cost hook

Message ID 20220218135904.E462F13C91@imap2.suse-dmz.suse.de
State New
Series [1/3] tree-optimization/104582 - Simplify vectorizer cost API and fixes

Commit Message

Richard Biener Feb. 18, 2022, 1:59 p.m. UTC
  This adjusts the vectorizer costing API to allow passing down the
SLP node the vector stmt is created from.
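
As a rough illustration of what the extra parameter makes possible, the following is a minimal, self-contained sketch and not GCC code.  All `mock_*` names, the lane threshold and the cost numbers are hypothetical stand-ins; the real hook also takes a vectype, a misalignment and a cost location, and in this patch all three targets leave the new parameter unnamed and unused.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's internal types; the real definitions
// live in tree-vectorizer.h and carry far more state.
struct mock_stmt_info { int id; };
struct mock_slp_node { unsigned lanes; };
enum mock_cost_kind { vector_stmt, vec_perm, vec_construct };

// Base class with the shape the hook has after the patch: the SLP node
// follows the statement info.  Either pointer may be null.
struct mock_vector_costs
{
  virtual ~mock_vector_costs () {}

  virtual unsigned add_stmt_cost (int count, mock_cost_kind kind,
                                  const mock_stmt_info *stmt_info,
                                  const mock_slp_node *node)
  {
    (void) kind; (void) stmt_info; (void) node;
    return (unsigned) count;	// flat cost of 1 per statement
  }
};

// A made-up target that charges more for permutes on wide SLP nodes.
// The threshold (4 lanes) and the cost (3) are invented for the example.
struct mock_target_costs : mock_vector_costs
{
  unsigned add_stmt_cost (int count, mock_cost_kind kind,
                          const mock_stmt_info *stmt_info,
                          const mock_slp_node *node) override
  {
    (void) stmt_info;
    unsigned per_stmt = 1;
    if (kind == vec_perm && node && node->lanes > 4)
      per_stmt = 3;
    return per_stmt * (unsigned) count;
  }
};
```

With this sketch, costing two permutes against an 8-lane node through `mock_target_costs` yields 6, while passing a null node falls back to the flat cost of 2.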

Bootstrapped and tested on x86_64-unknown-linux-gnu, I've built
aarch64 and rs6000 cc1 crosses.

OK?

Thanks,
Richard.

2022-02-18  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/104582
	* tree-vectorizer.h (stmt_info_for_cost::node): New field.
	(vector_costs::add_stmt_cost): Add SLP node parameter.
	(dump_stmt_cost): Likewise.
	(add_stmt_cost): Likewise, new overload and adjust.
	(add_stmt_costs): Adjust.
	(record_stmt_cost): New overload.
	* tree-vectorizer.cc (dump_stmt_cost): Dump the SLP node.
	(vector_costs::add_stmt_cost): Adjust.
	* tree-vect-loop.cc (vect_estimate_min_profitable_iters):
	Adjust.
	* tree-vect-slp.cc (vect_prologue_cost_for_slp): Record
	the SLP node for costing.
	(vectorizable_slp_permutation): Likewise.
	* tree-vect-stmts.cc (record_stmt_cost): Adjust and add
	new overloads.
	* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
	Adjust.
	* config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost):
	Adjust.
	* config/rs6000/rs6000.cc (rs6000_cost_data::add_stmt_cost):
	Adjust.
	(rs6000_cost_data::adjust_vect_cost_per_loop): Likewise.
---
 gcc/config/aarch64/aarch64.cc |  6 +++---
 gcc/config/i386/i386.cc       |  9 +++++----
 gcc/config/rs6000/rs6000.cc   | 10 ++++++----
 gcc/tree-vect-loop.cc         | 16 +++++++++-------
 gcc/tree-vect-slp.cc          |  4 ++--
 gcc/tree-vect-stmts.cc        | 30 ++++++++++++++++++++++++++----
 gcc/tree-vectorizer.cc        | 10 +++++++---
 gcc/tree-vectorizer.h         | 28 +++++++++++++++++-----------
 8 files changed, 75 insertions(+), 38 deletions(-)
  

Comments

Richard Sandiford Feb. 21, 2022, 10:13 a.m. UTC | #1
Richard Biener <rguenther@suse.de> writes:
> This adjusts the vectorizer costing API to allow passing down the
> SLP node the vector stmt is created from.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, I've built
> aarch64 and rs6000 cc1 crosses.
>
> OK?

Not sure about the stmt_info + no node overload.  It'll make it too
easy to forget to add the SLP node in new code, in cases where a node is
available.  (This reminds me of how we have sometimes forgotten to pass
a vectype to vect_is_simple_use in cases where the vectype is actually
necessary.)

No strong opinion about the no stmt_info + node overload, but it looks
like it's only used in a couple of places, so maybe it would be better
not to have that either?

LGTM otherwise FWIW.  If you disagree with the above then let's just
go with the patch as-is.
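
To make the concern concrete, here is a small self-contained sketch of the failure mode, using hypothetical `mock_*` types and a simplified `record_cost` in place of the real `record_stmt_cost`.  It is an illustration of the worry, not a reproduction of any existing bug.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; not the real GCC types.
struct mock_stmt_info { int id; };
struct mock_slp_node { unsigned lanes; };

struct mock_cost_entry
{
  int count;
  const mock_stmt_info *stmt_info;
  const mock_slp_node *node;
};

typedef std::vector<mock_cost_entry> mock_cost_vec;

// Full form: the caller states both the statement and the node.
inline void
record_cost (mock_cost_vec &v, int count,
             const mock_stmt_info *stmt_info, const mock_slp_node *node)
{
  mock_cost_entry e = { count, stmt_info, node };
  v.push_back (e);
}

// Convenience overload: statement only, node silently recorded as null.
inline void
record_cost (mock_cost_vec &v, int count, const mock_stmt_info *stmt_info)
{
  record_cost (v, count, stmt_info, nullptr);
}

// New code that has a node in scope but reaches for the short form.
// Nothing warns that the node was dropped on the way to the cost hook.
inline void
cost_some_stmt (mock_cost_vec &v, const mock_stmt_info *stmt_info,
                const mock_slp_node *node_in_scope)
{
  (void) node_in_scope;		// available, but never forwarded
  record_cost (v, 1, stmt_info);
}
```

After `cost_some_stmt` runs, the recorded entry has a null `node` even though the caller had one, which is the kind of omission that would be hard to spot in review.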

Thanks,
Richard

> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 37ed22bcc94..fb40b7e9c78 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -15008,7 +15008,7 @@ public:
>    aarch64_vector_costs (vec_info *, bool);
>  
>    unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> -			      stmt_vec_info stmt_info, tree vectype,
> +			      stmt_vec_info stmt_info, slp_tree, tree vectype,
>  			      int misalign,
>  			      vect_cost_model_location where) override;
>    void finish_cost (const vector_costs *) override;
> @@ -15953,8 +15953,8 @@ aarch64_stp_sequence_cost (unsigned int count, vect_cost_for_stmt kind,
>  
>  unsigned
>  aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> -				     stmt_vec_info stmt_info, tree vectype,
> -				     int misalign,
> +				     stmt_vec_info stmt_info, slp_tree,
> +				     tree vectype, int misalign,
>  				     vect_cost_model_location where)
>  {
>    fractional_cost stmt_cost
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index e4b42fbba6f..0830dbd7dca 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -22982,8 +22982,8 @@ class ix86_vector_costs : public vector_costs
>    using vector_costs::vector_costs;
>  
>    unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> -			      stmt_vec_info stmt_info, tree vectype,
> -			      int misalign,
> +			      stmt_vec_info stmt_info, slp_tree node,
> +			      tree vectype, int misalign,
>  			      vect_cost_model_location where) override;
>  };
>  
> @@ -22997,8 +22997,9 @@ ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
>  
>  unsigned
>  ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> -				  stmt_vec_info stmt_info, tree vectype,
> -				  int misalign, vect_cost_model_location where)
> +				  stmt_vec_info stmt_info, slp_tree,
> +				  tree vectype, int misalign,
> +				  vect_cost_model_location where)
>  {
>    unsigned retval = 0;
>    bool scalar_p
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 32a13cdaf1f..91e5efb6288 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -5236,7 +5236,7 @@ public:
>    using vector_costs::vector_costs;
>  
>    unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> -			      stmt_vec_info stmt_info, tree vectype,
> +			      stmt_vec_info stmt_info, slp_tree, tree vectype,
>  			      int misalign,
>  			      vect_cost_model_location where) override;
>    void finish_cost (const vector_costs *) override;
> @@ -5452,8 +5452,9 @@ rs6000_cost_data::update_target_cost_per_stmt (vect_cost_for_stmt kind,
>  
>  unsigned
>  rs6000_cost_data::add_stmt_cost (int count, vect_cost_for_stmt kind,
> -				 stmt_vec_info stmt_info, tree vectype,
> -				 int misalign, vect_cost_model_location where)
> +				 stmt_vec_info stmt_info, slp_tree,
> +				 tree vectype, int misalign,
> +				 vect_cost_model_location where)
>  {
>    unsigned retval = 0;
>  
> @@ -5494,7 +5495,8 @@ rs6000_cost_data::adjust_vect_cost_per_loop (loop_vec_info loop_vinfo)
>  	  /* Each length needs one shift to fill into bits 0-7.  */
>  	  shift_cnt += num_vectors_m1 + 1;
>  
> -      add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL_TREE, 0, vect_body);
> +      add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL,
> +		     NULL_TREE, 0, vect_body);
>      }
>  }
>  
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 5c7b163f01c..1f30fc82ca1 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -3982,7 +3982,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>      {
>        /*  FIXME: Make cost depend on complexity of individual check.  */
>        (void) add_stmt_cost (target_cost_data, 1, vector_stmt,
> -			    NULL, NULL_TREE, 0, vect_prologue);
> +			    NULL, NULL, NULL_TREE, 0, vect_prologue);
>        if (dump_enabled_p ())
>  	dump_printf (MSG_NOTE,
>  		     "cost model: Adding cost of checks for loop "
> @@ -4079,8 +4079,8 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>        {
>  	(void) add_stmt_cost (target_cost_data,
>  			      si->count * peel_iters_prologue, si->kind,
> -			      si->stmt_info, si->vectype, si->misalign,
> -			      vect_prologue);
> +			      si->stmt_info, si->node, si->vectype,
> +			      si->misalign, vect_prologue);
>        }
>  
>    /* Add costs associated with peel_iters_epilogue.  */
> @@ -4089,8 +4089,8 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>        {
>  	(void) add_stmt_cost (target_cost_data,
>  			      si->count * peel_iters_epilogue, si->kind,
> -			      si->stmt_info, si->vectype, si->misalign,
> -			      vect_epilogue);
> +			      si->stmt_info, si->node, si->vectype,
> +			      si->misalign, vect_epilogue);
>        }
>  
>    /* Add possible cond_branch_taken/cond_branch_not_taken cost.  */
> @@ -4136,9 +4136,11 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
>  	 being the tie-breaker between vectorizing or not, then it's
>  	 probably better not to vectorize.  */
>        (void) add_stmt_cost (target_cost_data, num_masks,
> -			    vector_stmt, NULL, NULL_TREE, 0, vect_prologue);
> +			    vector_stmt, NULL, NULL, NULL_TREE, 0,
> +			    vect_prologue);
>        (void) add_stmt_cost (target_cost_data, num_masks - 1,
> -			    vector_stmt, NULL, NULL_TREE, 0, vect_body);
> +			    vector_stmt, NULL, NULL, NULL_TREE, 0,
> +			    vect_body);
>      }
>    else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
>      {
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index c6b5a0696a2..9188d727e33 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -4612,7 +4612,7 @@ vect_prologue_cost_for_slp (slp_tree node,
>  	kind = scalar_to_vec;
>        else
>  	kind = vec_construct;
> -      record_stmt_cost (cost_vec, 1, kind, NULL, vectype, 0, vect_prologue);
> +      record_stmt_cost (cost_vec, 1, kind, node, vectype, 0, vect_prologue);
>      }
>  }
>  
> @@ -7120,7 +7120,7 @@ vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi,
>      }
>  
>    if (!gsi)
> -    record_stmt_cost (cost_vec, nperms, vec_perm, NULL, vectype, 0, vect_body);
> +    record_stmt_cost (cost_vec, nperms, vec_perm, node, vectype, 0, vect_body);
>  
>    return true;
>  }
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index acb9fa611c1..a21ce946d8b 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -89,9 +89,10 @@ stmt_in_inner_loop_p (vec_info *vinfo, class _stmt_vec_info *stmt_info)
>     target model or by saving it in a vector for later processing.
>     Return a preliminary estimate of the statement's cost.  */
>  
> -unsigned
> +static unsigned
>  record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
> -		  enum vect_cost_for_stmt kind, stmt_vec_info stmt_info,
> +		  enum vect_cost_for_stmt kind,
> +		  stmt_vec_info stmt_info, slp_tree node,
>  		  tree vectype, int misalign,
>  		  enum vect_cost_model_location where)
>  {
> @@ -102,20 +103,41 @@ record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
>        && (stmt_info && STMT_VINFO_GATHER_SCATTER_P (stmt_info)))
>      kind = vector_scatter_store;
>  
> -  stmt_info_for_cost si = { count, kind, where, stmt_info, vectype, misalign };
> +  stmt_info_for_cost si
> +    = { count, kind, where, stmt_info, node, vectype, misalign };
>    body_cost_vec->safe_push (si);
>  
>    return (unsigned)
>        (builtin_vectorization_cost (kind, vectype, misalign) * count);
>  }
>  
> +unsigned
> +record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
> +		  enum vect_cost_for_stmt kind, stmt_vec_info stmt_info,
> +		  tree vectype, int misalign,
> +		  enum vect_cost_model_location where)
> +{
> +  return record_stmt_cost (body_cost_vec, count, kind, stmt_info, NULL,
> +			   vectype, misalign, where);
> +}
> +
> +unsigned
> +record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
> +		  enum vect_cost_for_stmt kind, slp_tree node,
> +		  tree vectype, int misalign,
> +		  enum vect_cost_model_location where)
> +{
> +  return record_stmt_cost (body_cost_vec, count, kind, NULL, node,
> +			   vectype, misalign, where);
> +}
> +
>  unsigned
>  record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
>  		  enum vect_cost_for_stmt kind,
>  		  enum vect_cost_model_location where)
>  {
>    gcc_assert (kind == cond_branch_taken || kind == cond_branch_not_taken);
> -  return record_stmt_cost (body_cost_vec, count, kind, NULL,
> +  return record_stmt_cost (body_cost_vec, count, kind, NULL, NULL,
>  			   NULL_TREE, 0, where);
>  }
>  
> diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
> index 344dc419d8c..a63fa391273 100644
> --- a/gcc/tree-vectorizer.cc
> +++ b/gcc/tree-vectorizer.cc
> @@ -99,7 +99,8 @@ auto_purge_vect_location::~auto_purge_vect_location ()
>  
>  void
>  dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind,
> -		stmt_vec_info stmt_info, tree, int misalign, unsigned cost,
> +		stmt_vec_info stmt_info, slp_tree node, tree,
> +		int misalign, unsigned cost,
>  		enum vect_cost_model_location where)
>  {
>    if (stmt_info)
> @@ -107,6 +108,8 @@ dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind,
>        print_gimple_expr (f, STMT_VINFO_STMT (stmt_info), 0, TDF_SLIM);
>        fprintf (f, " ");
>      }
> +  else if (node)
> +    fprintf (f, "node %p ", (void *)node);
>    else
>      fprintf (f, "<unknown> ");
>    fprintf (f, "%d times ", count);
> @@ -1766,8 +1769,9 @@ scalar_cond_masked_key::get_cond_ops_from_tree (tree t)
>  
>  unsigned int
>  vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
> -			     stmt_vec_info stmt_info, tree vectype,
> -			     int misalign, vect_cost_model_location where)
> +			     stmt_vec_info stmt_info, slp_tree,
> +			     tree vectype, int misalign,
> +			     vect_cost_model_location where)
>  {
>    unsigned int cost
>      = builtin_vectorization_cost (kind, vectype, misalign) * count;
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index da99f28c0dc..642eb0aeb21 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>  #define GCC_TREE_VECTORIZER_H
>  
>  typedef class _stmt_vec_info *stmt_vec_info;
> +typedef struct _slp_tree *slp_tree;
>  
>  #include "tree-data-ref.h"
>  #include "tree-hash-traits.h"
> @@ -101,6 +102,7 @@ struct stmt_info_for_cost {
>    enum vect_cost_for_stmt kind;
>    enum vect_cost_model_location where;
>    stmt_vec_info stmt_info;
> +  slp_tree node;
>    tree vectype;
>    int misalign;
>  };
> @@ -151,7 +153,6 @@ struct vect_scalar_ops_slice_hash : typed_noop_remove<vect_scalar_ops_slice>
>  /************************************************************************
>    SLP
>   ************************************************************************/
> -typedef struct _slp_tree *slp_tree;
>  typedef vec<std::pair<unsigned, unsigned> > lane_permutation_t;
>  typedef vec<unsigned> load_permutation_t;
>  
> @@ -1462,7 +1463,7 @@ public:
>       - WHERE specifies whether the cost occurs in the loop prologue,
>         the loop body, or the loop epilogue.
>       - KIND is the kind of statement, which is always meaningful.
> -     - STMT_INFO, if nonnull, describes the statement that will be
> +     - STMT_INFO or NODE, if nonnull, describe the statement that will be
>         vectorized.
>       - VECTYPE, if nonnull, is the vector type that the vectorized
>         statement will operate on.  Note that this should be used in
> @@ -1476,8 +1477,9 @@ public:
>       Return the calculated cost as well as recording it.  The return
>       value is used for dumping purposes.  */
>    virtual unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> -				      stmt_vec_info stmt_info, tree vectype,
> -				      int misalign,
> +				      stmt_vec_info stmt_info,
> +				      slp_tree node,
> +				      tree vectype, int misalign,
>  				      vect_cost_model_location where);
>  
>    /* Finish calculating the cost of the code.  The results can be
> @@ -1743,7 +1745,7 @@ init_cost (vec_info *vinfo, bool costing_for_scalar)
>  }
>  
>  extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
> -			    stmt_vec_info, tree, int, unsigned,
> +			    stmt_vec_info, slp_tree, tree, int, unsigned,
>  			    enum vect_cost_model_location);
>  
>  /* Alias targetm.vectorize.add_stmt_cost.  */
> @@ -1751,13 +1753,14 @@ extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
>  static inline unsigned
>  add_stmt_cost (vector_costs *costs, int count,
>  	       enum vect_cost_for_stmt kind,
> -	       stmt_vec_info stmt_info, tree vectype, int misalign,
> +	       stmt_vec_info stmt_info, slp_tree node,
> +	       tree vectype, int misalign,
>  	       enum vect_cost_model_location where)
>  {
> -  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, vectype,
> +  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, node, vectype,
>  					misalign, where);
>    if (dump_file && (dump_flags & TDF_DETAILS))
> -    dump_stmt_cost (dump_file, count, kind, stmt_info, vectype, misalign,
> +    dump_stmt_cost (dump_file, count, kind, stmt_info, node, vectype, misalign,
>  		    cost, where);
>    return cost;
>  }
> @@ -1768,7 +1771,7 @@ add_stmt_cost (vector_costs *costs, int count, enum vect_cost_for_stmt kind,
>  {
>    gcc_assert (kind == cond_branch_taken || kind == cond_branch_not_taken
>  	      || kind == scalar_stmt);
> -  return add_stmt_cost (costs, count, kind, NULL, NULL_TREE, 0, where);
> +  return add_stmt_cost (costs, count, kind, NULL, NULL, NULL_TREE, 0, where);
>  }
>  
>  /* Alias targetm.vectorize.add_stmt_cost.  */
> @@ -1776,7 +1779,7 @@ add_stmt_cost (vector_costs *costs, int count, enum vect_cost_for_stmt kind,
>  static inline unsigned
>  add_stmt_cost (vector_costs *costs, stmt_info_for_cost *i)
>  {
> -  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info,
> +  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info, i->node,
>  			i->vectype, i->misalign, i->where);
>  }
>  
> @@ -1802,7 +1805,7 @@ add_stmt_costs (vector_costs *costs, stmt_vector_for_cost *cost_vec)
>    unsigned i;
>    FOR_EACH_VEC_ELT (*cost_vec, i, cost)
>      add_stmt_cost (costs, cost->count, cost->kind, cost->stmt_info,
> -		   cost->vectype, cost->misalign, cost->where);
> +		   cost->node, cost->vectype, cost->misalign, cost->where);
>  }
>  
>  /*-----------------------------------------------------------------*/
> @@ -2129,6 +2132,9 @@ extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
>  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
>  				  enum vect_cost_for_stmt, stmt_vec_info,
>  				  tree, int, enum vect_cost_model_location);
> +extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> +				  enum vect_cost_for_stmt, slp_tree,
> +				  tree, int, enum vect_cost_model_location);
>  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
>  				  enum vect_cost_for_stmt,
>  				  enum vect_cost_model_location);
  
Richard Biener Feb. 21, 2022, 10:30 a.m. UTC | #2
On Mon, 21 Feb 2022, Richard Sandiford wrote:

> Richard Biener <rguenther@suse.de> writes:
> > This adjusts the vectorizer costing API to allow passing down the
> > SLP node the vector stmt is created from.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, I've built
> > aarch64 and rs6000 cc1 crosses.
> >
> > OK?
> 
> Not sure about the stmt_info + no node overload.  It'll make it too
> easy to forget to add the SLP node in new code, in cases where a node is
> available.  (This reminds me of how we have sometimes forgotten to pass
> a vectype to vect_is_simple_use in cases where the vectype is actually
> necessary.)
> 
> No strong opinion about the no stmt_info + node overload, but it looks
> like it's only used in a couple of places, so maybe it would be better
> not to have that either?
> 
> LGTM otherwise FWIW.  If you disagree with the above then let's just
> go with the patch as-is.

I've gone with the two overloads because we shouldn't pass down both
the SLP node and stmt_info; it should be either-or (or neither), never both.
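
A minimal sketch of that shape, with hypothetical `mock_*` types and a simplified `record_cost` standing in for `record_stmt_cost`: the public overloads each accept one of the two pointers and forward to a file-local helper.  The `assert` on the invariant is an addition for illustration only; the patch itself does not check it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; not the real GCC types.
struct mock_stmt_info { int id; };
struct mock_slp_node { unsigned lanes; };

struct mock_cost_entry
{
  int count;
  const mock_stmt_info *stmt_info;
  const mock_slp_node *node;
};

typedef std::vector<mock_cost_entry> mock_cost_vec;

// File-local worker taking both pointers.  Only the overloads below call
// it, so callers cannot pass a statement and a node together.
static unsigned
record_cost_full (mock_cost_vec &v, int count,
                  const mock_stmt_info *stmt_info, const mock_slp_node *node)
{
  assert (!(stmt_info && node));	// either-or, or neither
  mock_cost_entry e = { count, stmt_info, node };
  v.push_back (e);
  return (unsigned) count;		// flat preliminary estimate
}

// Statement-based costing.
unsigned
record_cost (mock_cost_vec &v, int count, const mock_stmt_info *stmt_info)
{
  return record_cost_full (v, count, stmt_info, nullptr);
}

// SLP-node-based costing.
unsigned
record_cost (mock_cost_vec &v, int count, const mock_slp_node *node)
{
  return record_cost_full (v, count, nullptr, node);
}

// Neither a statement nor a node.
unsigned
record_cost (mock_cost_vec &v, int count)
{
  return record_cost_full (v, count, nullptr, nullptr);
}
```

One consequence of this layout in plain C++: a bare `nullptr` argument is ambiguous between the two pointer overloads, so the "neither" case needs its own entry point, as in the last overload of the sketch.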

In the future it might all be different of course but I've tried to
not complicate things here.  In principle we might now ditch the
vectype argument in favor of stmt_info->vectype / node->vectype, but
I've tried to be simple at this point.
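
For illustration only, dropping the vectype argument could look roughly like the sketch below.  The `mock_*` types and the `cost_vectype` helper are hypothetical, and the sketch assumes the either-or rule holds, so at most one of the two pointers is non-null.

```cpp
#include <cassert>

// Hypothetical stand-ins; in GCC the vector type is a tree, and both the
// statement info and the SLP node carry one.
struct mock_vectype { unsigned nunits; };
struct mock_stmt_info { const mock_vectype *vectype; };
struct mock_slp_node { const mock_vectype *vectype; };

// Derive the vector type from whichever of the two is present, instead
// of having every caller pass it separately.  Returns null when neither
// is given, e.g. for branch costs.
inline const mock_vectype *
cost_vectype (const mock_stmt_info *stmt_info, const mock_slp_node *node)
{
  assert (!(stmt_info && node));	// assumed invariant: never both
  if (node)
    return node->vectype;
  if (stmt_info)
    return stmt_info->vectype;
  return nullptr;
}
```

Whether the types recorded on the statement and on the node always match what callers pass today is not something this sketch can answer; it only shows the lookup.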

So yeah, I think doing better needs a lot more thought.  A good next
target would be to improve costing for permutes: the patch gets
us access to the ones with explicit SLP nodes but not those from
load permutations (how we implement those depends on the
vectorizable_load strategy as well).  Permutes are also not so
obvious to use given they need to be "translated" with the actual
number of vector stmts and the vector type in mind.  One of the ideas
I had was to build, during vectorizable_*, an actual instantiation SLP
tree matching the code generation strategy, duplicating nodes to
have one for each actual vector (but then one could maybe also just
code generate to GIMPLE?).

Anyway, unless you see stage4 uses for patches 1 + 2 in the series
I'm going to wait for the patch 3 conclusion from the x86 maintainers
and otherwise will push this to stage1 (I still think it's a step
in the right direction).

Thanks,
Richard.


> > -			     int misalign, vect_cost_model_location where)
> > +			     stmt_vec_info stmt_info, slp_tree,
> > +			     tree vectype, int misalign,
> > +			     vect_cost_model_location where)
> >  {
> >    unsigned int cost
> >      = builtin_vectorization_cost (kind, vectype, misalign) * count;
> > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > index da99f28c0dc..642eb0aeb21 100644
> > --- a/gcc/tree-vectorizer.h
> > +++ b/gcc/tree-vectorizer.h
> > @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #define GCC_TREE_VECTORIZER_H
> >  
> >  typedef class _stmt_vec_info *stmt_vec_info;
> > +typedef struct _slp_tree *slp_tree;
> >  
> >  #include "tree-data-ref.h"
> >  #include "tree-hash-traits.h"
> > @@ -101,6 +102,7 @@ struct stmt_info_for_cost {
> >    enum vect_cost_for_stmt kind;
> >    enum vect_cost_model_location where;
> >    stmt_vec_info stmt_info;
> > +  slp_tree node;
> >    tree vectype;
> >    int misalign;
> >  };
> > @@ -151,7 +153,6 @@ struct vect_scalar_ops_slice_hash : typed_noop_remove<vect_scalar_ops_slice>
> >  /************************************************************************
> >    SLP
> >   ************************************************************************/
> > -typedef struct _slp_tree *slp_tree;
> >  typedef vec<std::pair<unsigned, unsigned> > lane_permutation_t;
> >  typedef vec<unsigned> load_permutation_t;
> >  
> > @@ -1462,7 +1463,7 @@ public:
> >       - WHERE specifies whether the cost occurs in the loop prologue,
> >         the loop body, or the loop epilogue.
> >       - KIND is the kind of statement, which is always meaningful.
> > -     - STMT_INFO, if nonnull, describes the statement that will be
> > +     - STMT_INFO or NODE, if nonnull, describe the statement that will be
> >         vectorized.
> >       - VECTYPE, if nonnull, is the vector type that the vectorized
> >         statement will operate on.  Note that this should be used in
> > @@ -1476,8 +1477,9 @@ public:
> >       Return the calculated cost as well as recording it.  The return
> >       value is used for dumping purposes.  */
> >    virtual unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
> > -				      stmt_vec_info stmt_info, tree vectype,
> > -				      int misalign,
> > +				      stmt_vec_info stmt_info,
> > +				      slp_tree node,
> > +				      tree vectype, int misalign,
> >  				      vect_cost_model_location where);
> >  
> >    /* Finish calculating the cost of the code.  The results can be
> > @@ -1743,7 +1745,7 @@ init_cost (vec_info *vinfo, bool costing_for_scalar)
> >  }
> >  
> >  extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
> > -			    stmt_vec_info, tree, int, unsigned,
> > +			    stmt_vec_info, slp_tree, tree, int, unsigned,
> >  			    enum vect_cost_model_location);
> >  
> >  /* Alias targetm.vectorize.add_stmt_cost.  */
> > @@ -1751,13 +1753,14 @@ extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
> >  static inline unsigned
> >  add_stmt_cost (vector_costs *costs, int count,
> >  	       enum vect_cost_for_stmt kind,
> > -	       stmt_vec_info stmt_info, tree vectype, int misalign,
> > +	       stmt_vec_info stmt_info, slp_tree node,
> > +	       tree vectype, int misalign,
> >  	       enum vect_cost_model_location where)
> >  {
> > -  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, vectype,
> > +  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, node, vectype,
> >  					misalign, where);
> >    if (dump_file && (dump_flags & TDF_DETAILS))
> > -    dump_stmt_cost (dump_file, count, kind, stmt_info, vectype, misalign,
> > +    dump_stmt_cost (dump_file, count, kind, stmt_info, node, vectype, misalign,
> >  		    cost, where);
> >    return cost;
> >  }
> > @@ -1768,7 +1771,7 @@ add_stmt_cost (vector_costs *costs, int count, enum vect_cost_for_stmt kind,
> >  {
> >    gcc_assert (kind == cond_branch_taken || kind == cond_branch_not_taken
> >  	      || kind == scalar_stmt);
> > -  return add_stmt_cost (costs, count, kind, NULL, NULL_TREE, 0, where);
> > +  return add_stmt_cost (costs, count, kind, NULL, NULL, NULL_TREE, 0, where);
> >  }
> >  
> >  /* Alias targetm.vectorize.add_stmt_cost.  */
> > @@ -1776,7 +1779,7 @@ add_stmt_cost (vector_costs *costs, int count, enum vect_cost_for_stmt kind,
> >  static inline unsigned
> >  add_stmt_cost (vector_costs *costs, stmt_info_for_cost *i)
> >  {
> > -  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info,
> > +  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info, i->node,
> >  			i->vectype, i->misalign, i->where);
> >  }
> >  
> > @@ -1802,7 +1805,7 @@ add_stmt_costs (vector_costs *costs, stmt_vector_for_cost *cost_vec)
> >    unsigned i;
> >    FOR_EACH_VEC_ELT (*cost_vec, i, cost)
> >      add_stmt_cost (costs, cost->count, cost->kind, cost->stmt_info,
> > -		   cost->vectype, cost->misalign, cost->where);
> > +		   cost->node, cost->vectype, cost->misalign, cost->where);
> >  }
> >  
> >  /*-----------------------------------------------------------------*/
> > @@ -2129,6 +2132,9 @@ extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
> >  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> >  				  enum vect_cost_for_stmt, stmt_vec_info,
> >  				  tree, int, enum vect_cost_model_location);
> > +extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> > +				  enum vect_cost_for_stmt, slp_tree,
> > +				  tree, int, enum vect_cost_model_location);
> >  extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
> >  				  enum vect_cost_for_stmt,
> >  				  enum vect_cost_model_location);
>
  

Patch

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 37ed22bcc94..fb40b7e9c78 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -15008,7 +15008,7 @@  public:
   aarch64_vector_costs (vec_info *, bool);
 
   unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
-			      stmt_vec_info stmt_info, tree vectype,
+			      stmt_vec_info stmt_info, slp_tree, tree vectype,
 			      int misalign,
 			      vect_cost_model_location where) override;
   void finish_cost (const vector_costs *) override;
@@ -15953,8 +15953,8 @@  aarch64_stp_sequence_cost (unsigned int count, vect_cost_for_stmt kind,
 
 unsigned
 aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
-				     stmt_vec_info stmt_info, tree vectype,
-				     int misalign,
+				     stmt_vec_info stmt_info, slp_tree,
+				     tree vectype, int misalign,
 				     vect_cost_model_location where)
 {
   fractional_cost stmt_cost
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index e4b42fbba6f..0830dbd7dca 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22982,8 +22982,8 @@  class ix86_vector_costs : public vector_costs
   using vector_costs::vector_costs;
 
   unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
-			      stmt_vec_info stmt_info, tree vectype,
-			      int misalign,
+			      stmt_vec_info stmt_info, slp_tree node,
+			      tree vectype, int misalign,
 			      vect_cost_model_location where) override;
 };
 
@@ -22997,8 +22997,9 @@  ix86_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
 
 unsigned
 ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
-				  stmt_vec_info stmt_info, tree vectype,
-				  int misalign, vect_cost_model_location where)
+				  stmt_vec_info stmt_info, slp_tree,
+				  tree vectype, int misalign,
+				  vect_cost_model_location where)
 {
   unsigned retval = 0;
   bool scalar_p
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 32a13cdaf1f..91e5efb6288 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -5236,7 +5236,7 @@  public:
   using vector_costs::vector_costs;
 
   unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
-			      stmt_vec_info stmt_info, tree vectype,
+			      stmt_vec_info stmt_info, slp_tree, tree vectype,
 			      int misalign,
 			      vect_cost_model_location where) override;
   void finish_cost (const vector_costs *) override;
@@ -5452,8 +5452,9 @@  rs6000_cost_data::update_target_cost_per_stmt (vect_cost_for_stmt kind,
 
 unsigned
 rs6000_cost_data::add_stmt_cost (int count, vect_cost_for_stmt kind,
-				 stmt_vec_info stmt_info, tree vectype,
-				 int misalign, vect_cost_model_location where)
+				 stmt_vec_info stmt_info, slp_tree,
+				 tree vectype, int misalign,
+				 vect_cost_model_location where)
 {
   unsigned retval = 0;
 
@@ -5494,7 +5495,8 @@  rs6000_cost_data::adjust_vect_cost_per_loop (loop_vec_info loop_vinfo)
 	  /* Each length needs one shift to fill into bits 0-7.  */
 	  shift_cnt += num_vectors_m1 + 1;
 
-      add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL_TREE, 0, vect_body);
+      add_stmt_cost (shift_cnt, scalar_stmt, NULL, NULL,
+		     NULL_TREE, 0, vect_body);
     }
 }
 
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 5c7b163f01c..1f30fc82ca1 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -3982,7 +3982,7 @@  vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
     {
       /*  FIXME: Make cost depend on complexity of individual check.  */
       (void) add_stmt_cost (target_cost_data, 1, vector_stmt,
-			    NULL, NULL_TREE, 0, vect_prologue);
+			    NULL, NULL, NULL_TREE, 0, vect_prologue);
       if (dump_enabled_p ())
 	dump_printf (MSG_NOTE,
 		     "cost model: Adding cost of checks for loop "
@@ -4079,8 +4079,8 @@  vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
       {
 	(void) add_stmt_cost (target_cost_data,
 			      si->count * peel_iters_prologue, si->kind,
-			      si->stmt_info, si->vectype, si->misalign,
-			      vect_prologue);
+			      si->stmt_info, si->node, si->vectype,
+			      si->misalign, vect_prologue);
       }
 
   /* Add costs associated with peel_iters_epilogue.  */
@@ -4089,8 +4089,8 @@  vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
       {
 	(void) add_stmt_cost (target_cost_data,
 			      si->count * peel_iters_epilogue, si->kind,
-			      si->stmt_info, si->vectype, si->misalign,
-			      vect_epilogue);
+			      si->stmt_info, si->node, si->vectype,
+			      si->misalign, vect_epilogue);
       }
 
   /* Add possible cond_branch_taken/cond_branch_not_taken cost.  */
@@ -4136,9 +4136,11 @@  vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo,
 	 being the tie-breaker between vectorizing or not, then it's
 	 probably better not to vectorize.  */
       (void) add_stmt_cost (target_cost_data, num_masks,
-			    vector_stmt, NULL, NULL_TREE, 0, vect_prologue);
+			    vector_stmt, NULL, NULL, NULL_TREE, 0,
+			    vect_prologue);
       (void) add_stmt_cost (target_cost_data, num_masks - 1,
-			    vector_stmt, NULL, NULL_TREE, 0, vect_body);
+			    vector_stmt, NULL, NULL, NULL_TREE, 0,
+			    vect_body);
     }
   else if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
     {
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index c6b5a0696a2..9188d727e33 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4612,7 +4612,7 @@  vect_prologue_cost_for_slp (slp_tree node,
 	kind = scalar_to_vec;
       else
 	kind = vec_construct;
-      record_stmt_cost (cost_vec, 1, kind, NULL, vectype, 0, vect_prologue);
+      record_stmt_cost (cost_vec, 1, kind, node, vectype, 0, vect_prologue);
     }
 }
 
@@ -7120,7 +7120,7 @@  vectorizable_slp_permutation (vec_info *vinfo, gimple_stmt_iterator *gsi,
     }
 
   if (!gsi)
-    record_stmt_cost (cost_vec, nperms, vec_perm, NULL, vectype, 0, vect_body);
+    record_stmt_cost (cost_vec, nperms, vec_perm, node, vectype, 0, vect_body);
 
   return true;
 }
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index acb9fa611c1..a21ce946d8b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -89,9 +89,10 @@  stmt_in_inner_loop_p (vec_info *vinfo, class _stmt_vec_info *stmt_info)
    target model or by saving it in a vector for later processing.
    Return a preliminary estimate of the statement's cost.  */
 
-unsigned
+static unsigned
 record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
-		  enum vect_cost_for_stmt kind, stmt_vec_info stmt_info,
+		  enum vect_cost_for_stmt kind,
+		  stmt_vec_info stmt_info, slp_tree node,
 		  tree vectype, int misalign,
 		  enum vect_cost_model_location where)
 {
@@ -102,20 +103,41 @@  record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
       && (stmt_info && STMT_VINFO_GATHER_SCATTER_P (stmt_info)))
     kind = vector_scatter_store;
 
-  stmt_info_for_cost si = { count, kind, where, stmt_info, vectype, misalign };
+  stmt_info_for_cost si
+    = { count, kind, where, stmt_info, node, vectype, misalign };
   body_cost_vec->safe_push (si);
 
   return (unsigned)
       (builtin_vectorization_cost (kind, vectype, misalign) * count);
 }
 
+unsigned
+record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
+		  enum vect_cost_for_stmt kind, stmt_vec_info stmt_info,
+		  tree vectype, int misalign,
+		  enum vect_cost_model_location where)
+{
+  return record_stmt_cost (body_cost_vec, count, kind, stmt_info, NULL,
+			   vectype, misalign, where);
+}
+
+unsigned
+record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
+		  enum vect_cost_for_stmt kind, slp_tree node,
+		  tree vectype, int misalign,
+		  enum vect_cost_model_location where)
+{
+  return record_stmt_cost (body_cost_vec, count, kind, NULL, node,
+			   vectype, misalign, where);
+}
+
 unsigned
 record_stmt_cost (stmt_vector_for_cost *body_cost_vec, int count,
 		  enum vect_cost_for_stmt kind,
 		  enum vect_cost_model_location where)
 {
   gcc_assert (kind == cond_branch_taken || kind == cond_branch_not_taken);
-  return record_stmt_cost (body_cost_vec, count, kind, NULL,
+  return record_stmt_cost (body_cost_vec, count, kind, NULL, NULL,
 			   NULL_TREE, 0, where);
 }
 
diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
index 344dc419d8c..a63fa391273 100644
--- a/gcc/tree-vectorizer.cc
+++ b/gcc/tree-vectorizer.cc
@@ -99,7 +99,8 @@  auto_purge_vect_location::~auto_purge_vect_location ()
 
 void
 dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind,
-		stmt_vec_info stmt_info, tree, int misalign, unsigned cost,
+		stmt_vec_info stmt_info, slp_tree node, tree,
+		int misalign, unsigned cost,
 		enum vect_cost_model_location where)
 {
   if (stmt_info)
@@ -107,6 +108,8 @@  dump_stmt_cost (FILE *f, int count, enum vect_cost_for_stmt kind,
       print_gimple_expr (f, STMT_VINFO_STMT (stmt_info), 0, TDF_SLIM);
       fprintf (f, " ");
     }
+  else if (node)
+    fprintf (f, "node %p ", (void *)node);
   else
     fprintf (f, "<unknown> ");
   fprintf (f, "%d times ", count);
@@ -1766,8 +1769,9 @@  scalar_cond_masked_key::get_cond_ops_from_tree (tree t)
 
 unsigned int
 vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
-			     stmt_vec_info stmt_info, tree vectype,
-			     int misalign, vect_cost_model_location where)
+			     stmt_vec_info stmt_info, slp_tree,
+			     tree vectype, int misalign,
+			     vect_cost_model_location where)
 {
   unsigned int cost
     = builtin_vectorization_cost (kind, vectype, misalign) * count;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index da99f28c0dc..642eb0aeb21 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -22,6 +22,7 @@  along with GCC; see the file COPYING3.  If not see
 #define GCC_TREE_VECTORIZER_H
 
 typedef class _stmt_vec_info *stmt_vec_info;
+typedef struct _slp_tree *slp_tree;
 
 #include "tree-data-ref.h"
 #include "tree-hash-traits.h"
@@ -101,6 +102,7 @@  struct stmt_info_for_cost {
   enum vect_cost_for_stmt kind;
   enum vect_cost_model_location where;
   stmt_vec_info stmt_info;
+  slp_tree node;
   tree vectype;
   int misalign;
 };
@@ -151,7 +153,6 @@  struct vect_scalar_ops_slice_hash : typed_noop_remove<vect_scalar_ops_slice>
 /************************************************************************
   SLP
  ************************************************************************/
-typedef struct _slp_tree *slp_tree;
 typedef vec<std::pair<unsigned, unsigned> > lane_permutation_t;
 typedef vec<unsigned> load_permutation_t;
 
@@ -1462,7 +1463,7 @@  public:
      - WHERE specifies whether the cost occurs in the loop prologue,
        the loop body, or the loop epilogue.
      - KIND is the kind of statement, which is always meaningful.
-     - STMT_INFO, if nonnull, describes the statement that will be
+     - STMT_INFO or NODE, if nonnull, describe the statement that will be
        vectorized.
      - VECTYPE, if nonnull, is the vector type that the vectorized
        statement will operate on.  Note that this should be used in
@@ -1476,8 +1477,9 @@  public:
      Return the calculated cost as well as recording it.  The return
      value is used for dumping purposes.  */
   virtual unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
-				      stmt_vec_info stmt_info, tree vectype,
-				      int misalign,
+				      stmt_vec_info stmt_info,
+				      slp_tree node,
+				      tree vectype, int misalign,
 				      vect_cost_model_location where);
 
   /* Finish calculating the cost of the code.  The results can be
@@ -1743,7 +1745,7 @@  init_cost (vec_info *vinfo, bool costing_for_scalar)
 }
 
 extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
-			    stmt_vec_info, tree, int, unsigned,
+			    stmt_vec_info, slp_tree, tree, int, unsigned,
 			    enum vect_cost_model_location);
 
 /* Alias targetm.vectorize.add_stmt_cost.  */
@@ -1751,13 +1753,14 @@  extern void dump_stmt_cost (FILE *, int, enum vect_cost_for_stmt,
 static inline unsigned
 add_stmt_cost (vector_costs *costs, int count,
 	       enum vect_cost_for_stmt kind,
-	       stmt_vec_info stmt_info, tree vectype, int misalign,
+	       stmt_vec_info stmt_info, slp_tree node,
+	       tree vectype, int misalign,
 	       enum vect_cost_model_location where)
 {
-  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, vectype,
+  unsigned cost = costs->add_stmt_cost (count, kind, stmt_info, node, vectype,
 					misalign, where);
   if (dump_file && (dump_flags & TDF_DETAILS))
-    dump_stmt_cost (dump_file, count, kind, stmt_info, vectype, misalign,
+    dump_stmt_cost (dump_file, count, kind, stmt_info, node, vectype, misalign,
 		    cost, where);
   return cost;
 }
@@ -1768,7 +1771,7 @@  add_stmt_cost (vector_costs *costs, int count, enum vect_cost_for_stmt kind,
 {
   gcc_assert (kind == cond_branch_taken || kind == cond_branch_not_taken
 	      || kind == scalar_stmt);
-  return add_stmt_cost (costs, count, kind, NULL, NULL_TREE, 0, where);
+  return add_stmt_cost (costs, count, kind, NULL, NULL, NULL_TREE, 0, where);
 }
 
 /* Alias targetm.vectorize.add_stmt_cost.  */
@@ -1776,7 +1779,7 @@  add_stmt_cost (vector_costs *costs, int count, enum vect_cost_for_stmt kind,
 static inline unsigned
 add_stmt_cost (vector_costs *costs, stmt_info_for_cost *i)
 {
-  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info,
+  return add_stmt_cost (costs, i->count, i->kind, i->stmt_info, i->node,
 			i->vectype, i->misalign, i->where);
 }
 
@@ -1802,7 +1805,7 @@  add_stmt_costs (vector_costs *costs, stmt_vector_for_cost *cost_vec)
   unsigned i;
   FOR_EACH_VEC_ELT (*cost_vec, i, cost)
     add_stmt_cost (costs, cost->count, cost->kind, cost->stmt_info,
-		   cost->vectype, cost->misalign, cost->where);
+		   cost->node, cost->vectype, cost->misalign, cost->where);
 }
 
 /*-----------------------------------------------------------------*/
@@ -2129,6 +2132,9 @@  extern bool supportable_narrowing_operation (enum tree_code, tree, tree,
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
 				  enum vect_cost_for_stmt, stmt_vec_info,
 				  tree, int, enum vect_cost_model_location);
+extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
+				  enum vect_cost_for_stmt, slp_tree,
+				  tree, int, enum vect_cost_model_location);
 extern unsigned record_stmt_cost (stmt_vector_for_cost *, int,
 				  enum vect_cost_for_stmt,
 				  enum vect_cost_model_location);
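
For readers skimming the diff, the record_stmt_cost change in tree-vect-stmts.cc boils down to one file-local worker that takes both a statement and an SLP node, plus thin public overloads that fill in a null for whichever is missing. The following is a minimal standalone sketch of that pattern, not the GCC code: stmt_info_stub, slp_node_stub, cost_entry, cost_vector, record_cost and record_cost_1 are hypothetical stand-ins, and the unit cost of 1 per statement is an assumption replacing builtin_vectorization_cost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GCC's stmt_vec_info and slp_tree, which are
// pointers to much richer structures in the real sources.
struct stmt_info_stub { int id; };
struct slp_node_stub { int id; };
using stmt_ptr = stmt_info_stub *;
using node_ptr = slp_node_stub *;

// Simplified analogue of stmt_info_for_cost, including the new node field.
struct cost_entry
{
  int count;
  stmt_ptr stmt_info;
  node_ptr node;
};

using cost_vector = std::vector<cost_entry>;

// File-local worker that accepts both handles, mirroring the overload the
// patch makes static.  Returning COUNT assumes a unit cost per statement;
// that is a placeholder, not GCC's cost model.
static unsigned
record_cost_1 (cost_vector &vec, int count, stmt_ptr stmt_info, node_ptr node)
{
  assert (count >= 0);  // sanity check added for this sketch only
  vec.push_back ({ count, stmt_info, node });
  return static_cast<unsigned> (count);
}

// Public overload for callers that have a statement.
unsigned
record_cost (cost_vector &vec, int count, stmt_ptr stmt_info)
{
  return record_cost_1 (vec, count, stmt_info, nullptr);
}

// Public overload for callers that have an SLP node.
unsigned
record_cost (cost_vector &vec, int count, node_ptr node)
{
  return record_cost_1 (vec, count, nullptr, node);
}
```

In this sketch a caller picks the overload through the static type of its third argument, for example record_cost (vec, 2, &some_node). A bare nullptr in that position would be ambiguous between the two overloads, so a caller with neither handle needs a typed null pointer or a separate entry point.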