tree-optimization/96881 - CD-DCE and CLOBBERs

Message ID 20220215144427.4F6D713C9F@imap2.suse-dmz.suse.de
State New
Headers
Series tree-optimization/96881 - CD-DCE and CLOBBERs |

Commit Message

Richard Biener Feb. 15, 2022, 2:44 p.m. UTC
  CD-DCE does not consider CLOBBERs as necessary in the attempt
to not prevent DCE of SSA defs it uses.  A side-effect of that
is that it also removes all its control dependences if they are
not made necessary by other means.  When we later try to preserve
as many CLOBBERs as possible we have to make sure we also
preserved the controlling conditions, otherwise a CLOBBER can
now appear on a path where it was not executed before, leading
to wrong code as seen in the testcase.

This removes 20570 clobbers early, and 2642 and 957 in the two
late CD-DCE passes during stage2 in gcc/ - note almost all
of those are direct clobbers, if only tracking indirect clobbers
lazily we remove 39, 12 and 5.

It FAILs the stack coalescing test g++.dg/opt/pr80032.C which
can be only recovered if we never DCE paths to direct clobbers
(and thus those clobbers).  The pattern there is
if (pred) D.2512 = CLOBBER; else D.2512 = CLOBBER;, basically
we have all paths leading to the same clobber but we could
safely cut some branches which we do not realize.

Handling indirect vs. direct clobbers differently feels wrong,
so the option would be to simply stop removing control flow
to clobbers.  Elsewhere we noticed though that we elide
all indirect clobbers only after the last CD-DCE.  Removing
conditional clobbers too early might also inhibit DSE after
inlining, so maybe we should rather keep them and try to
optimize things elsewhere?

The aggressive clobber removal variant patch also causes

In destructor 'auto_timevar::~auto_timevar()',
    inlined from 'void c_parser_declaration_or_fndef(c_parser*, bool, bool, bool, bool, bool, tree_node**, vec<c_token>*, bool, tree, oacc_routine_data*, bool*' at /home/rguenther/src/gcc3/gcc/c/c-parser.cc:2569:7:
/home/rguenther/src/gcc3/gcc/timevar.h:247:20: error: dangling pointer to 'at' may be used [-Werror=dangling-pointer=]
  247 |       m_timer->pop (m_tv);
      |       ~~~~~~~~~~~~~^~~~~~

and a lot more dangling-pointer diagnostics during bootstrap.

When trying to avoid control DCE for all clobber kinds we end
up miscompiling g++.dg/pr64191.C.  So the only working solution
I found is handling indirect clobbers different from direct ones
with regard to control DCE.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Thoughts?

Thanks,
Richard.

2022-02-15  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/96881
	* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Comment
	CLOBBER handling.
	(eliminate_unnecessary_stmts): Check that we preserved control
	parents before retaining a CLOBBER.
	(perform_tree_ssa_dce): Pass down aggressive flag
	to eliminate_unnecessary_stmts.

	* g++.dg/torture/pr96881-1.C: New testcase.
	* g++.dg/torture/pr96881-2.C: Likewise.
---
 gcc/testsuite/g++.dg/torture/pr96881-1.C | 37 ++++++++++++++++++++++++
 gcc/testsuite/g++.dg/torture/pr96881-2.C | 37 ++++++++++++++++++++++++
 gcc/tree-ssa-dce.cc                      | 13 ++++++---
 3 files changed, 83 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr96881-1.C
 create mode 100644 gcc/testsuite/g++.dg/torture/pr96881-2.C
  

Comments

Jan Hubicka Feb. 15, 2022, 3:22 p.m. UTC | #1
> @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator *gsi,
>     contributes nothing to the program, and can be deleted.  */
>  
>  static bool
> -eliminate_unnecessary_stmts (void)
> +eliminate_unnecessary_stmts (bool aggressive)
>  {
>    bool something_changed = false;
>    basic_block bb;
> @@ -1366,7 +1369,9 @@ eliminate_unnecessary_stmts (void)
>  			  break;
>  			}
>  		    }
> -		  if (!dead)
> +		  if (!dead
> +		      && (!aggressive
> +			  || bitmap_bit_p (visited_control_parents, bb->index)))

It seems to me that it may be worth to consider case where
visited_control_parents is 0 while all basic blocks in the CD relation
are live for different reasons.  I suppose this can happen in more
complex CFGs when the other arms of conditionals are live...

Honza
>  		    {
>  		      bitmap_clear (debug_seen);
>  		      continue;
> @@ -1876,7 +1881,7 @@ perform_tree_ssa_dce (bool aggressive)
>    propagate_necessity (aggressive);
>    BITMAP_FREE (visited);
>  
> -  something_changed |= eliminate_unnecessary_stmts ();
> +  something_changed |= eliminate_unnecessary_stmts (aggressive);
>    something_changed |= cfg_altered;
>  
>    /* We do not update postdominators, so free them unconditionally.  */
> -- 
> 2.34.1
  
Richard Biener Feb. 17, 2022, 7:10 a.m. UTC | #2
On Tue, 15 Feb 2022, Jan Hubicka wrote:

> > @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator *gsi,
> >     contributes nothing to the program, and can be deleted.  */
> >  
> >  static bool
> > -eliminate_unnecessary_stmts (void)
> > +eliminate_unnecessary_stmts (bool aggressive)
> >  {
> >    bool something_changed = false;
> >    basic_block bb;
> > @@ -1366,7 +1369,9 @@ eliminate_unnecessary_stmts (void)
> >  			  break;
> >  			}
> >  		    }
> > -		  if (!dead)
> > +		  if (!dead
> > +		      && (!aggressive
> > +			  || bitmap_bit_p (visited_control_parents, bb->index)))
> 
> It seems to me that it may be worth to consider case where
> visited_control_parents is 0 while all basic blocks in the CD relation
> are live for different reasons.  I suppose this can happen in more
> complex CFGs when the other arms of conditionals are live...

It's a bit difficult to do in this place though since we might already
have altered those blocks (and we need to check not for the block being
live but for its control stmt).  I suppose we could use the
last_stmt_necessary bitmap.  I'll do some statistics to see whether
this helps.

Richard.
  
Richard Biener Feb. 17, 2022, 8:28 a.m. UTC | #3
On Thu, 17 Feb 2022, Richard Biener wrote:

> On Tue, 15 Feb 2022, Jan Hubicka wrote:
> 
> > > @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator *gsi,
> > >     contributes nothing to the program, and can be deleted.  */
> > >  
> > >  static bool
> > > -eliminate_unnecessary_stmts (void)
> > > +eliminate_unnecessary_stmts (bool aggressive)
> > >  {
> > >    bool something_changed = false;
> > >    basic_block bb;
> > > @@ -1366,7 +1369,9 @@ eliminate_unnecessary_stmts (void)
> > >  			  break;
> > >  			}
> > >  		    }
> > > -		  if (!dead)
> > > +		  if (!dead
> > > +		      && (!aggressive
> > > +			  || bitmap_bit_p (visited_control_parents, bb->index)))
> > 
> > It seems to me that it may be worth to consider case where
> > visited_control_parents is 0 while all basic blocks in the CD relation
> > are live for different reasons.  I suppose this can happen in more
> > complex CFGs when the other arms of conditionals are live...
> 
> It's a bit difficult to do in this place though since we might already
> have altered those blocks (and we need to check not for the block being
> live but for its control stmt).  I suppose we could use the
> last_stmt_necessary bitmap.  I'll do some statistics to see whether
> this helps.

So it does help.  The visited_control_parents catches
44010 from 44033 candidates and the remaining 23 are catched by doing

                      EXECUTE_IF_SET_IN_BITMAP (cd->get_edges_dependent_on 
(bb->index),
                                                0, edge_number, bi)
                        {
                          basic_block cd_bb = cd->get_edge_src 
(edge_number);
                          if (cd_bb != bb
                              && !bitmap_bit_p (last_stmt_necessary, 
cd_bb->index))
                            {
                              dead = true;
                              break;
                            }
                        }

in addition to that simple check.  That means for all files in gcc/
the patch would be a no-op but it still fixes the problematical case.

I'll put an adjusted patch to testing.

Richard.
  

Patch

diff --git a/gcc/testsuite/g++.dg/torture/pr96881-1.C b/gcc/testsuite/g++.dg/torture/pr96881-1.C
new file mode 100644
index 00000000000..1c182e6a8b3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr96881-1.C
@@ -0,0 +1,37 @@ 
+/* { dg-do run } */
+
+struct S { int s; ~S () {} } s;
+
+void __attribute__((noipa))
+foo (struct S *s, int flag)
+{
+  s->s = 1;
+  // We have to makes sure to not make the inlined CLOBBER
+  // unconditional but we have to remove it to be able
+  // to elide the branch
+  if (!flag)
+    return;
+  s->~S();
+}
+
+void __attribute__((noipa))
+bar (struct S *s, int flag)
+{
+  s->s = 1;
+  // CD-DCE chooses an arbitrary path, try to present it
+  // with all variants
+  if (flag)
+    s->~S();
+}
+
+int
+main ()
+{
+  foo (&s, 0);
+  if (s.s != 1)
+    __builtin_abort ();
+  bar (&s, 0);
+  if (s.s != 1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/torture/pr96881-2.C b/gcc/testsuite/g++.dg/torture/pr96881-2.C
new file mode 100644
index 00000000000..35c788791e8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr96881-2.C
@@ -0,0 +1,37 @@ 
+/* { dg-do run } */
+
+struct S { int s; ~S () {} } s;
+
+void __attribute__((noipa))
+foo (int flag)
+{
+  s.s = 1;
+  // We have to makes sure to not make the inlined CLOBBER
+  // unconditional but we have to remove it to be able
+  // to elide the branch
+  if (!flag)
+    return;
+  s.~S();
+}
+
+void __attribute__((noipa))
+bar (int flag)
+{
+  s.s = 1;
+  // CD-DCE chooses an arbitrary path, try to present it
+  // with all variants
+  if (flag)
+    s.~S();
+}
+
+int
+main ()
+{
+  foo (0);
+  if (s.s != 1)
+    __builtin_abort ();
+  bar (0);
+  if (s.s != 1)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
index f1034878eaf..9aefdc1c8d3 100644
--- a/gcc/tree-ssa-dce.cc
+++ b/gcc/tree-ssa-dce.cc
@@ -284,7 +284,10 @@  mark_stmt_if_obviously_necessary (gimple *stmt, bool aggressive)
       break;
 
     case GIMPLE_ASSIGN:
-      if (gimple_clobber_p (stmt))
+      /* Mark indirect CLOBBERs to be lazily removed if their SSA operands
+	 do not prevail.  That also makes control flow leading to them
+	 not necessary in aggressive mode.  */
+      if (gimple_clobber_p (stmt) && !zero_ssa_operands (stmt, SSA_OP_USE))
 	return;
       break;
 
@@ -1272,7 +1275,7 @@  maybe_optimize_arith_overflow (gimple_stmt_iterator *gsi,
    contributes nothing to the program, and can be deleted.  */
 
 static bool
-eliminate_unnecessary_stmts (void)
+eliminate_unnecessary_stmts (bool aggressive)
 {
   bool something_changed = false;
   basic_block bb;
@@ -1366,7 +1369,9 @@  eliminate_unnecessary_stmts (void)
 			  break;
 			}
 		    }
-		  if (!dead)
+		  if (!dead
+		      && (!aggressive
+			  || bitmap_bit_p (visited_control_parents, bb->index)))
 		    {
 		      bitmap_clear (debug_seen);
 		      continue;
@@ -1876,7 +1881,7 @@  perform_tree_ssa_dce (bool aggressive)
   propagate_necessity (aggressive);
   BITMAP_FREE (visited);
 
-  something_changed |= eliminate_unnecessary_stmts ();
+  something_changed |= eliminate_unnecessary_stmts (aggressive);
   something_changed |= cfg_altered;
 
   /* We do not update postdominators, so free them unconditionally.  */