PR tree-optimization/108356 - Ranger cache - always use range_from_dom when updating.

Message ID 4edb00e2-4691-bdd4-1b4d-12e6b9983ad7@redhat.com
State New
Headers
Series PR tree-optimization/108356 - Ranger cache - always use range_from_dom when updating. |

Commit Message

Andrew MacLeod Jan. 31, 2023, 8:10 p.m. UTC
  This turned out to be a more interesting problem than I wanted.

the situation boils down to:

<bb 10>
# g_5 = PHI <0(2), 2(8)>
   if (g_5 <= 1)
     goto <bb 4>; [INV]

<bb 4> :
   if (g_5 != 0)
     goto <bb 7>; [INV]
   else
     goto <bb 8>; [INV]

   <bb 7> :
   c = 0;

   <bb 8> :
     goto <bb 10>; [INV]

  We globally know that g_5 is [0,0][2,2]
we also know that 10->4 , g_5 will be [0,0]
Which means we also know that that the branch in bb_4 will never be 
taken, and will always go to bb 8.
THis is all processed in EVRP, the branch is changed, and the next time 
VRP is called, we blow the block with c = 0 like we want...

Unfortunately it doesnt happen within EVRP because when this updated 
range for g_5 is propagated in the cache, it was tripping over a shotcut 
which tried to avoid using lookups when it thinks it didnt matter, and 
would occasionally drop back to the global range.

In particular, the cache had originally propagated [0,0][2,2] as the on 
entry range to bb8. when we rewrite the branch, we mark 4->7 and 7->8  
as unexecutable edges, then propagate the new range for g_5..  This 
requires recalculating the existing range on entry to bb8.

It properly picked up [0,0] from 4->8, but when the cache query checked 
the range from 7->8, it discovered that no value was yet set, so instead 
of looking it up, it fell back to the global range of [0,0][2,2].  If it 
properly calculates that edge instead, it comes up with UNDEFINED (as it 
is unexecutable) and results in [0,0]... and EVRP then also removes the 
block is c = 0 in.

By picking up the global value, it still thought 2 was a possibility 
later on, and a single VRP pass couldn't eliminate the branch ultimately 
leading to the store... it required a second one with the adjusted CFG 
to catch it.

This patch tells the cache to always do a read-only scan of the 
dominator tree to find the nearest actual value and use that instead.  
This may solve other lingering weird propagation issues.

I also ran a performance run on this change. It does slow VRP by down 
about 1%, but the overall change is nominal at around 0.05%.

Bootstraps on x86_64-pc-linux-gnu with no regressions.  OK?

Andrew
  

Comments

Richard Biener Feb. 1, 2023, 7:52 a.m. UTC | #1
On Tue, Jan 31, 2023 at 9:11 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This turned out to be a more interesting problem than I wanted.
>
> the situation boils down to:
>
> <bb 10>
> # g_5 = PHI <0(2), 2(8)>
>    if (g_5 <= 1)
>      goto <bb 4>; [INV]
>
> <bb 4> :
>    if (g_5 != 0)
>      goto <bb 7>; [INV]
>    else
>      goto <bb 8>; [INV]
>
>    <bb 7> :
>    c = 0;
>
>    <bb 8> :
>      goto <bb 10>; [INV]
>
>   We globally know that g_5 is [0,0][2,2]
> we also know that 10->4 , g_5 will be [0,0]
> Which means we also know that that the branch in bb_4 will never be
> taken, and will always go to bb 8.
> THis is all processed in EVRP, the branch is changed, and the next time
> VRP is called, we blow the block with c = 0 like we want...
>
> Unfortunately it doesnt happen within EVRP because when this updated
> range for g_5 is propagated in the cache, it was tripping over a shotcut
> which tried to avoid using lookups when it thinks it didnt matter, and
> would occasionally drop back to the global range.
>
> In particular, the cache had originally propagated [0,0][2,2] as the on
> entry range to bb8. when we rewrite the branch, we mark 4->7 and 7->8
> as unexecutable edges, then propagate the new range for g_5..  This
> requires recalculating the existing range on entry to bb8.
>
> It properly picked up [0,0] from 4->8, but when the cache query checked
> the range from 7->8, it discovered that no value was yet set, so instead
> of looking it up, it fell back to the global range of [0,0][2,2].  If it
> properly calculates that edge instead, it comes up with UNDEFINED (as it
> is unexecutable) and results in [0,0]... and EVRP then also removes the
> block is c = 0 in.
>
> By picking up the global value, it still thought 2 was a possibility
> later on, and a single VRP pass couldn't eliminate the branch ultimately
> leading to the store... it required a second one with the adjusted CFG
> to catch it.
>
> This patch tells the cache to always do a read-only scan of the
> dominator tree to find the nearest actual value and use that instead.
> This may solve other lingering weird propagation issues.
>
> I also ran a performance run on this change. It does slow VRP by down
> about 1%, but the overall change is nominal at around 0.05%.
>
> Bootstraps on x86_64-pc-linux-gnu with no regressions.  OK?

OK.

Thanks,
Richard.

> Andrew
>
  

Patch

From ddb9df254ed2bc5f11bdde213e1485d7971a25d9 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacleod@redhat.com>
Date: Tue, 10 Jan 2023 13:40:56 -0500
Subject: [PATCH] Ranger cache - always use range_from_dom when updating.

When updating an existing range, if we dont query the dom tree, we can
get the global range instead of a proper range on some incoming edges
which cause the range to not be refined properly.

	PR tree-optimization/108356
	gcc/
	* gimple-range-cache.cc (ranger_cache::range_on_edge): Always
	do a search of the DOM tree for a range.

	gcc/testsuite/
	* gcc.dg/pr108356.c: New.
---
 gcc/gimple-range-cache.cc       |  2 +-
 gcc/testsuite/gcc.dg/pr108356.c | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr108356.c

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index a8202f0e895..20c444bc4f4 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -998,7 +998,7 @@  bool
 ranger_cache::range_on_edge (vrange &r, edge e, tree expr)
 {
   if (gimple_range_ssa_p (expr))
-    return edge_range (r, e, expr, RFD_NONE);
+    return edge_range (r, e, expr, RFD_READ_ONLY);
   return get_tree_range (r, expr, NULL);
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr108356.c b/gcc/testsuite/gcc.dg/pr108356.c
new file mode 100644
index 00000000000..1df6b278e45
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108356.c
@@ -0,0 +1,23 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+char a;
+static char c = 3;
+char d;
+void foo();
+short(b)(short e, short f) { return e + f; }
+int main() {
+  unsigned g = 0;
+  if (c)
+    ;
+  else
+    foo();
+  for (; g < 2; g = b(g, 2)) {
+    d = g ? 0 : a;
+    if (g)
+      c = 0;
+  }
+}
+
+
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
-- 
2.39.0