Simplify logic in tree-scalar-evolution's expensive_expression_p.

Message ID 011c01d86a16$baaf6030$300e2090$@nextmovesoftware.com
State New
Headers
Series Simplify logic in tree-scalar-evolution's expensive_expression_p. |

Commit Message

Roger Sayle May 17, 2022, 5:51 p.m. UTC
  This patch simplifies tree-scalar-evolution's expensive_expression_p, but
produces identical results; the replacement implementation is just smaller
(uses less memory), faster and easier to understand.

The current idiom (introduced to fix PR90726) looks like:

    hash_map<tree, uint64_t> cache;
    uint64_t expanded_size = 0;
    return (expression_expensive_p (expr, cache, expanded_size)
           || expanded_size > cache.elements ());

Here the recursive function computes expanded_size, effectively the
number of tree nodes visited, which is then only used in the comparison
against cache.elements(), i.e. to check whether the number of visited
nodes is greater than the number of unique visited nodes.  This is
equivalent to instead checking where expression_expensive_p's recursion
visits any node more than once.

Instead of using a map to cache the "cost" of revisited sub-trees, the
same outcome can be determined using a set, and immediately returning
true as soon as encountering a previously seen tree node, avoiding the
unnecessary "cost"/expanded_size computation.  [A simplification analogous
to checking STL's empty() instead of comparing size() with zero].

The semantics of expensive_expression_p (both before and after) are
quite reasonable, as calling unshare_expr on a generic tree can result
in an exponential growth in the number of gimple statements, hence
any "shared" nodes are indeed expensive.  If shared nodes are to be
allowed, they'll need to be managed explicitly with SAVE_EXPR (or similar
mechanism) that avoids exponential growth.


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}, with
no new failures.  Is this a reasonable clean-up for mainline?


2022-05-17  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
 	* tree_scalar_evolution.cc (expression_expensive_p): Change type
 	of cache from hash_map to hash_set, and remove cost argument.
 	When expr appears in the hash_set, return true.  Calculation of
 	cost (and updating hash_map) is no longer required.
 	(expression_expensive_p):  Simplify top-level implementation.


Thanks in advance,
Roger
--
  

Comments

Richard Biener May 18, 2022, 6:19 a.m. UTC | #1
On Tue, May 17, 2022 at 7:51 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch simplifies tree-scalar-evolution's expensive_expression_p, but
> produces identical results; the replacement implementation is just smaller
> (uses less memory), faster and easier to understand.
>
> The current idiom (introduced to fix PR90726) looks like:
>
>     hash_map<tree, uint64_t> cache;
>     uint64_t expanded_size = 0;
>     return (expression_expensive_p (expr, cache, expanded_size)
>            || expanded_size > cache.elements ());
>
> Here the recursive function computes expanded_size, effectively the
> number of tree nodes visited, which is then only used in the comparison
> against cache.elements(), i.e. to check whether the number of visited
> nodes is greater than the number of unique visited nodes.  This is
> equivalent to instead checking where expression_expensive_p's recursion
> visits any node more than once.
>
> Instead of using a map to cache the "cost" of revisited sub-trees, the
> same outcome can be determined using a set, and immediately returning
> true as soon as encountering a previously seen tree node, avoiding the
> unnecessary "cost"/expanded_size computation.  [A simplification analogous
> to checking STL's empty() instead of comparing size() with zero].
>
> The semantics of expensive_expression_p (both before and after) are
> quite reasonable, as calling unshare_expr on a generic tree can result
> in an exponential growth in the number of gimple statements, hence
> any "shared" nodes are indeed expensive.  If shared nodes are to be
> allowed, they'll need to be managed explicitly with SAVE_EXPR (or similar
> mechanism) that avoids exponential growth.

Indeed.  The code seems to allow for doing "better" than counting the
cost of a sub-expression either as one or giving up on the whole expression
though while the cleanup removes this possibility.  In fact the predicate
currently allows arbitrary expensive (read: large) expressions and the
result of expression_expensive_p (x) and expression_expensive_p
(unshare_expr (x))
isn't equal which IMHO is a bug.

> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}, with
> no new failures.  Is this a reasonable clean-up for mainline?

So I think the change goes in the wrong direction, even if it preserves
the current semantics.  The bug is probably in the callers allowing
arbitrarily large expressions so maybe the API can be changed to
provide a max_size argument.

I know I placed the extra expression expansion limit ontop of the
previous implementation which just looked for an expensive operation.
Also note that technically unshare_expr isn't necessary if
gimplification would cope with shared trees - possibly a variant which,
instead of unsharing in-expression uses, would insert SAVE_EXPRs,
would do the trick here.

Richard.

>
> 2022-05-17  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * tree_scalar_evolution.cc (expression_expensive_p): Change type
>         of cache from hash_map to hash_set, and remove cost argument.
>         When expr appears in the hash_set, return true.  Calculation of
>         cost (and updating hash_map) is no longer required.
>         (expression_expensive_p):  Simplify top-level implementation.
>
>
> Thanks in advance,
> Roger
> --
>
  

Patch

diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 72ceb40..347dede 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -3352,8 +3352,7 @@  scev_finalize (void)
    for scev_const_prop.  */
 
 static bool
-expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
-			uint64_t &cost)
+expression_expensive_p (tree expr, hash_set<tree> &cache)
 {
   enum tree_code code;
 
@@ -3377,19 +3376,11 @@  expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
 	return true;
     }
 
-  bool visited_p;
-  uint64_t &local_cost = cache.get_or_insert (expr, &visited_p);
-  if (visited_p)
-    {
-      uint64_t tem = cost + local_cost;
-      if (tem < cost)
-	return true;
-      cost = tem;
-      return false;
-    }
-  local_cost = 1;
+  /* If we've encountered this expression before, it would be duplicated
+     by unshare_expr, which makes this expression expensive.  */
+  if (cache.add (expr))
+    return true;
 
-  uint64_t op_cost = 0;
   if (code == CALL_EXPR)
     {
       tree arg;
@@ -3431,16 +3422,14 @@  expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
 	}
 
       FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
-	if (expression_expensive_p (arg, cache, op_cost))
+	if (expression_expensive_p (arg, cache))
 	  return true;
-      *cache.get (expr) += op_cost;
-      cost += op_cost + 1;
       return false;
     }
 
   if (code == COND_EXPR)
     {
-      if (expression_expensive_p (TREE_OPERAND (expr, 0), cache, op_cost)
+      if (expression_expensive_p (TREE_OPERAND (expr, 0), cache)
 	  || (EXPR_P (TREE_OPERAND (expr, 1))
 	      && EXPR_P (TREE_OPERAND (expr, 2)))
 	  /* If either branch has side effects or could trap.  */
@@ -3448,13 +3437,9 @@  expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
 	  || generic_expr_could_trap_p (TREE_OPERAND (expr, 1))
 	  || TREE_SIDE_EFFECTS (TREE_OPERAND (expr, 0))
 	  || generic_expr_could_trap_p (TREE_OPERAND (expr, 0))
-	  || expression_expensive_p (TREE_OPERAND (expr, 1),
-				     cache, op_cost)
-	  || expression_expensive_p (TREE_OPERAND (expr, 2),
-				     cache, op_cost))
+	  || expression_expensive_p (TREE_OPERAND (expr, 1), cache)
+	  || expression_expensive_p (TREE_OPERAND (expr, 2), cache))
 	return true;
-      *cache.get (expr) += op_cost;
-      cost += op_cost + 1;
       return false;
     }
 
@@ -3462,15 +3447,13 @@  expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
     {
     case tcc_binary:
     case tcc_comparison:
-      if (expression_expensive_p (TREE_OPERAND (expr, 1), cache, op_cost))
+      if (expression_expensive_p (TREE_OPERAND (expr, 1), cache))
 	return true;
 
       /* Fallthru.  */
     case tcc_unary:
-      if (expression_expensive_p (TREE_OPERAND (expr, 0), cache, op_cost))
+      if (expression_expensive_p (TREE_OPERAND (expr, 0), cache))
 	return true;
-      *cache.get (expr) += op_cost;
-      cost += op_cost + 1;
       return false;
 
     default:
@@ -3481,10 +3464,8 @@  expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache,
 bool
 expression_expensive_p (tree expr)
 {
-  hash_map<tree, uint64_t> cache;
-  uint64_t expanded_size = 0;
-  return (expression_expensive_p (expr, cache, expanded_size)
-	  || expanded_size > cache.elements ());
+  hash_set<tree> cache;
+  return expression_expensive_p (expr, cache);
 }
 
 /* Do final value replacement for LOOP, return true if we did anything.  */