[v2] Fix SLP scalar costing with stmts also used in externals

Message ID 20250114093314.DC58F3857357@sourceware.org
State Committed
Commit 21edcb95340424efe2e66831f3b32cbab9cc6d31

Checks

Context                                                                           Check    Description
rivoscibot/toolchain-ci-rivos-lint                                                success  Lint passed
rivoscibot/toolchain-ci-rivos-apply-patch                                         success  Patch applied
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gcv-lp64d-multilib                success  Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64                                    success  Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gcv-lp64d-multilib                 success  Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc_zba_zbb_zbc_zbs-lp64d-multilib  success  Build passed
rivoscibot/toolchain-ci-rivos-test                                                success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64                                    success  Test passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm                                        success  Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm                                        success  Test passed

Commit Message

Richard Biener Jan. 14, 2025, 9:32 a.m. UTC
When an external SLP node is permuted, the scalar stmts recorded
in the permute node do not mean the scalar computation can be
removed.  We remove those stmts from vectorized_scalar_stmts for
this reason, but we fail to check this set when we cost scalar
stmts.  Note that vectorized_scalar_stmts isn't a complete set, so
also pass scalar_stmts_in_externs and check that.
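
As a rough illustration of the costing rule described above, here is a
minimal standalone sketch.  It is not GCC code: scalar_stmt,
removable_scalar_cost and the integer ids are hypothetical stand-ins
for stmt_vec_info and the hash_set lookups, and the per-stmt costs are
made up.  It only shows the idea that a lane whose stmt also feeds an
external node contributes nothing to the scalar cost.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for stmt_vec_info: a stmt id plus its scalar cost.
struct scalar_stmt
{
  int id;
  unsigned cost;
};

// Sum the scalar cost that vectorization would make redundant.
// A lane is skipped when it is still live, or when its stmt is also
// used by an external node and therefore has to be preserved.
unsigned
removable_scalar_cost (const std::vector<scalar_stmt> &lanes,
                       const std::vector<bool> &life,
                       const std::unordered_set<int> &stmts_in_externs)
{
  assert (lanes.size () == life.size ());
  unsigned total = 0;
  for (std::size_t i = 0; i < lanes.size (); ++i)
    {
      if (life[i] || stmts_in_externs.count (lanes[i].id) != 0)
        continue;
      total += lanes[i].cost;
    }
  return total;
}
```

In this sketch, counting a preserved stmt would overstate the scalar
cost and so make vectorization look more profitable than it is, which
is the kind of miscosting the change is meant to avoid.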

The following fixes this.

This shows in PR115777 when we avoid vectorizing the load, but
on its own doesn't help the PR yet.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

	PR tree-optimization/115777
	* tree-vect-slp.cc (vect_bb_slp_scalar_cost): Do not
	cost a scalar stmt that needs to be preserved.
---
 gcc/tree-vect-slp.cc | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
  

Patch

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 02e7f5c4d58..1d8e62e2fc1 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -8676,6 +8676,7 @@  vect_bb_slp_scalar_cost (vec_info *vinfo,
 			 slp_tree node, vec<bool, va_heap> *life,
 			 stmt_vector_for_cost *cost_vec,
 			 hash_set<stmt_vec_info> &vectorized_scalar_stmts,
+			 hash_set<stmt_vec_info> &scalar_stmts_in_externs,
 			 hash_set<slp_tree> &visited)
 {
   unsigned i;
@@ -8690,7 +8691,12 @@  vect_bb_slp_scalar_cost (vec_info *vinfo,
       ssa_op_iter op_iter;
       def_operand_p def_p;
 
-      if (!stmt_info || (*life)[i])
+      if (!stmt_info
+	  || (*life)[i]
+	  /* Defs also used in external nodes are not in the
+	     vectorized_scalar_stmts set as they need to be preserved.
+	     Honor that.  */
+	  || scalar_stmts_in_externs.contains (stmt_info))
 	continue;
 
       stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info);
@@ -8809,7 +8815,8 @@  next_lane:
 	      subtree_life.safe_splice (*life);
 	    }
 	  vect_bb_slp_scalar_cost (vinfo, child, &subtree_life, cost_vec,
-				   vectorized_scalar_stmts, visited);
+				   vectorized_scalar_stmts,
+				   scalar_stmts_in_externs, visited);
 	  subtree_life.truncate (0);
 	}
     }
@@ -8891,7 +8898,7 @@  vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
       vect_bb_slp_scalar_cost (bb_vinfo,
 			       SLP_INSTANCE_TREE (instance),
 			       &life, &scalar_costs, vectorized_scalar_stmts,
-			       visited);
+			       scalar_stmts_in_externs, visited);
       vector_costs.safe_splice (instance->cost_vec);
       instance->cost_vec.release ();
     }