Maintain a validity flag for REG_UNUSED notes [PR112760] (was Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760])

Message ID mpty1e96fju.fsf_-_@arm.com
State New
Headers
Series Maintain a validity flag for REG_UNUSED notes [PR112760] (was Re: [PATCH] pro_and_epilogue: Call df_note_add_problem () if SHRINK_WRAPPING_ENABLED [PR112760]) |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Testing passed

Commit Message

Richard Sandiford Dec. 4, 2023, 5:30 p.m. UTC
  Richard Sandiford <richard.sandiford@arm.com> writes:
> Jakub Jelinek <jakub@redhat.com> writes:
>> On Sat, Dec 02, 2023 at 11:04:04AM +0000, Richard Sandiford wrote:
>>> I still maintain that so much stuff relies on the lack of false-positive
>>> REG_UNUSED notes that (whatever the intention might have been) we need
>>> to prevent the false positive.  Like Andrew says, any use of single_set
>>> is suspect if there's a REG_UNUSED note for something that is in fact used.
>>
>> The false positive REG_UNUSED in that case comes from
>> (insn 15 14 35 2 (set (reg:CCZ 17 flags)
>>         (compare:CCZ (reg:DI 0 ax [111])
>>             (reg:DI 1 dx [112]))) "pr112760.c":11:22 12 {*cmpdi_1}
>>      (expr_list:REG_UNUSED (reg:CCZ 17 flags)
>>         (nil)))
>> (insn 35 15 36 2 (set (reg:CCZ 17 flags)
>>         (compare:CCZ (reg:DI 0 ax [111])
>>             (reg:DI 1 dx [112]))) "pr112760.c":11:22 12 {*cmpdi_1}
>>      (expr_list:REG_DEAD (reg:DI 1 dx [112])
>>         (expr_list:REG_DEAD (reg:DI 0 ax [111])
>>             (nil))))
>> ...
>> use of flags
>> Haven't verified what causes the redundant comparison, but postreload cse
>> then does:
>> 110	      if (!count && cselib_redundant_set_p (body))
>> 111		{
>> 112		  if (check_for_inc_dec (insn))
>> 113		    delete_insn_and_edges (insn);
>> 114		  /* We're done with this insn.  */
>> 115		  goto done;
>> 116		}
>> So, we'd in such cases need to look up what instruction was the earlier
>> setter and if it has REG_UNUSED note, drop it.
>
> Hmm, OK.  I guess it's not as simple as I'd imagined.  cselib does have
> some code to track which instruction established which equivalence,
> but it doesn't currently record what we want, and it would be difficult
> to reuse that information here anyway.  Something "simple" like a map of
> register numbers to instructions, populated only for REG_UNUSED sets,
> would be enough, and low overhead.  But it's not very natural.
>
> Perhaps DF should maintain a flag to say "the current pass keeps
> notes up-to-date", with the assumption being that any pass that
> uses the notes problem does that.  Then single_set and the
> regcprop.cc uses can check that flag.
>
> I don't think it's worth adding the note problem to shrink-wrapping
> just for the regcprop code.  If we're prepared to take that compile-time
> hit, we might as well run a proper (fast) DCE.

Here's a patch that tries to do that.  Boostrapped & regression tested
on aarch64-linux-gnu.  Also tested on x86_64-linux-gnu for the testcase.
(I'll run full x86_64-linux-gnu testing overnight.)

OK to install if that passes?  Not an elegant fix, but it's probably
too much to hope for one of those.

Richard

--------

PR112760 is a miscompilation caused by a stale, false-positive
REG_UNUSED note.  There were originally two consecutive,
identical instructions that set the CC flags.  The first
originally had a REG_UNUSED note, but postreload later deleted
the second in favour of the first, based on cselib_redundant_set_p.

Although in principle it would be possible to remove the note
when making the optimisation, the required bookkeeping wouldn't
fit naturally into what cselib already does.  Doing that would also
arguably be a change of policy.

This patch instead adds a global flag that says whether REG_UNUSED
notes are trustworthy.  The assumption is that any pass that calls
df_note_add_problem cares about REG_UNUSED notes and will keep them
sufficiently up-to-date to support the pass's use of things like
single_set.

gcc/
	PR rtl-optimization/112760
	* df.h (df_d::can_trust_reg_unused_notes): New member variable.
	* df-problems.cc (df_note_add_problem): Set can_trust_reg_unused_notes
	to true.
	* passes.cc (execute_one_pass): Clear can_trust_reg_unused_notes
	after each pass.
	* rtlanal.cc (single_set_2): Check can_trust_reg_unused_notes.
	* regcprop.cc (copyprop_hardreg_forward_1): Likewise.

gcc/testsuite/
	* gcc.dg/pr112760.c: New test.
---
 gcc/df-problems.cc              |  1 +
 gcc/df.h                        |  4 ++++
 gcc/passes.cc                   |  3 +++
 gcc/regcprop.cc                 |  4 +++-
 gcc/rtlanal.cc                  |  8 ++++++--
 gcc/testsuite/gcc.dg/pr112760.c | 22 ++++++++++++++++++++++
 6 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr112760.c
  

Comments

Jakub Jelinek Dec. 4, 2023, 7:34 p.m. UTC | #1
On Mon, Dec 04, 2023 at 05:30:45PM +0000, Richard Sandiford wrote:
> > I don't think it's worth adding the note problem to shrink-wrapping
> > just for the regcprop code.  If we're prepared to take that compile-time
> > hit, we might as well run a proper (fast) DCE.
> 
> Here's a patch that tries to do that.  Boostrapped & regression tested
> on aarch64-linux-gnu.  Also tested on x86_64-linux-gnu for the testcase.
> (I'll run full x86_64-linux-gnu testing overnight.)
> 
> OK to install if that passes?  Not an elegant fix, but it's probably
> too much to hope for one of those.

Isn't this way too conservative though, basically limiting single_set
to ~ 15 out of the ~ 65 RTL passes (sure, it will still DTRT for
non-PARALLEL or just PARALLEL with clobbers/uses)?
Do we know about passes other than postreload which may invalidate
REG_UNUSED notes while not purging them altogether?
Given what postreload does, I bet cse/gcse might too.

If we add a RTL checking verification of the notes, we could know
immediately what other passes invalidate it.

So, couldn't we just set such flag at the end of such passes (or only if
they actually remove any redundant insns and thus potentially invalidate
them, perhaps during doing so)?

And on the x86 side, the question from the PR still stands, why is
vzeroupper pass placed exactly after reload and not postreload which
cleans stuff up after reload.

	Jakub
  

Patch

diff --git a/gcc/df-problems.cc b/gcc/df-problems.cc
index d2cfaf7f50f..d2eb95d35ad 100644
--- a/gcc/df-problems.cc
+++ b/gcc/df-problems.cc
@@ -3782,6 +3782,7 @@  void
 df_note_add_problem (void)
 {
   df_add_problem (&problem_NOTE);
+  df->can_trust_reg_unused_notes = true;
 }
 
 
diff --git a/gcc/df.h b/gcc/df.h
index 402657a7076..a405c000235 100644
--- a/gcc/df.h
+++ b/gcc/df.h
@@ -614,6 +614,10 @@  public:
   /* True if someone added or deleted something from regs_ever_live so
      that the entry and exit blocks need be reprocessed.  */
   bool redo_entry_and_exit;
+
+  /* True if REG_UNUSED notes are up-to-date enough to be free of false
+     positives.  */
+  bool can_trust_reg_unused_notes;
 };
 
 #define DF_SCAN_BB_INFO(BB) (df_scan_get_bb_info ((BB)->index))
diff --git a/gcc/passes.cc b/gcc/passes.cc
index 6f894a41d22..0313d0e3689 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -2640,6 +2640,9 @@  execute_one_pass (opt_pass *pass)
   /* Do it!  */
   todo_after = pass->execute (cfun);
 
+  if (df)
+    df->can_trust_reg_unused_notes = false;
+
   if (todo_after & TODO_discard_function)
     {
       /* Stop timevar.  */
diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
index a75af2ba7e3..03df9a7cf4f 100644
--- a/gcc/regcprop.cc
+++ b/gcc/regcprop.cc
@@ -820,6 +820,7 @@  copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
       if (set
 	  && !RTX_FRAME_RELATED_P (insn)
 	  && NONJUMP_INSN_P (insn)
+	  && df->can_trust_reg_unused_notes
 	  && !may_trap_p (set)
 	  && find_reg_note (insn, REG_UNUSED, SET_DEST (set))
 	  && !side_effects_p (SET_SRC (set))
@@ -878,7 +879,8 @@  copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
 	 would clobbers.  */
       for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
 	{
-	  if (REG_NOTE_KIND (link) == REG_UNUSED)
+	  if (df->can_trust_reg_unused_notes
+	      && REG_NOTE_KIND (link) == REG_UNUSED)
 	    {
 	      kill_value (XEXP (link, 0), vd);
 	      /* Furthermore, if the insn looked like a single-set,
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index 87267ee3b88..42b07cd9634 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -1572,7 +1572,9 @@  single_set_2 (const rtx_insn *insn, const_rtx pat)
 		 sets are found in the insn, we check them.  */
 	      if (!set_verified)
 		{
-		  if (find_reg_note (insn, REG_UNUSED, SET_DEST (set))
+		  if (df
+		      && df->can_trust_reg_unused_notes
+		      && find_reg_note (insn, REG_UNUSED, SET_DEST (set))
 		      && !side_effects_p (set))
 		    set = NULL;
 		  else
@@ -1580,7 +1582,9 @@  single_set_2 (const rtx_insn *insn, const_rtx pat)
 		}
 	      if (!set)
 		set = sub, set_verified = 0;
-	      else if (!find_reg_note (insn, REG_UNUSED, SET_DEST (sub))
+	      else if (!df
+		       || !df->can_trust_reg_unused_notes
+		       || !find_reg_note (insn, REG_UNUSED, SET_DEST (sub))
 		       || side_effects_p (sub))
 		return NULL_RTX;
 	      break;
diff --git a/gcc/testsuite/gcc.dg/pr112760.c b/gcc/testsuite/gcc.dg/pr112760.c
new file mode 100644
index 00000000000..b4ec70e4701
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr112760.c
@@ -0,0 +1,22 @@ 
+/* PR rtl-optimization/112760 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-dce -fno-guess-branch-probability --param=max-cse-insns=0" } */
+/* { dg-additional-options "-m8bit-idiv -mavx" { target i?86-*-* x86_64-*-* } } */
+
+unsigned g;
+
+__attribute__((__noipa__)) unsigned short
+foo (unsigned short a, unsigned short b)
+{
+  unsigned short x = __builtin_add_overflow_p (a, g, (unsigned short) 0);
+  g -= g / b;
+  return x;
+}
+
+int
+main ()
+{
+  unsigned short x = foo (40, 6);
+  if (x != 0)
+    __builtin_abort ();
+}