PR rtl-optimization/101885: Prevent combine from clobbering flags

Message ID 011001d818ee$01004e70$0300eb50$@nextmovesoftware.com
State New
Headers
Series PR rtl-optimization/101885: Prevent combine from clobbering flags |

Commit Message

Roger Sayle Feb. 3, 2022, 11:05 a.m. UTC
  This patch addresses PR rtl-optimization/101885 which is a P2 wrong code
regression.  In combine, if the resulting fused instruction is a parallel
of two sets which fails to be recognized by the backend, combine tries to
emit these as two sequential set instructions (known as split_i2i3).
As each set is recognized the backend may add any necessary "clobbers".
The code currently checks that any clobbers added to the first "set"
don't interfere with the second set, but doesn't currently handle the
case that clobbers added to the second set may interfere/kill the
destination of the first set (which must be live at this point).
The solution is to cut'n'paste the "clobber" logic from just a few
lines earlier, suitably adjusted for the second instruction.

One minor nit that may confuse a reviewer is that at this point in
the code we've lost track of which set was first and which was second
(combine chooses dynamically, and the recog processes that adds the
clobbers may have obfuscated the original SET_DEST) so the actual test
below is to confirm that any newly added clobbers (on the second set
instruction) don't overlap either set0's or set1's destination.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2022-02-03  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR rtl-optimization/101885
	* combine.c (try_combine): When splitting a parallel into two
	sequential sets, check not only that the first doesn't clobber
	the second but also that the second doesn't clobber the first.

gcc/testsuite/ChangeLog
	PR rtl-optimization/101885
	* gcc.dg/pr101885.c: New test case.


Thanks in advance,
Roger
--
  

Comments

Segher Boessenkool Feb. 4, 2022, 1:49 a.m. UTC | #1
Hi!

Please Cc: me on combine patches, I cannot keep up with gcc-patches@,
and even if I could I would miss things.

On Thu, Feb 03, 2022 at 11:05:48AM -0000, Roger Sayle wrote:
> This patch addresses PR rtl-optimization/101885 which is a P2 wrong code
> regression.  In combine, if the resulting fused instruction is a parallel
> of two sets which fails to be recognized by the backend, combine tries to
> emit these as two sequential set instructions (known as split_i2i3).
> As each set is recognized the backend may add any necessary "clobbers".
> The code currently checks that any clobbers added to the first "set"
> don't interfere with the second set, but doesn't currently handle the
> case that clobbers added to the second set may interfere/kill the
> destination of the first set (which must be live at this point).

The insns are inserted where the original I2 and I3 were, so "must be"
isn't obvious at all.

> The solution is to cut'n'paste the "clobber" logic from just a few
> lines earlier, suitably adjusted for the second instruction.
> 
> One minor nit that may confuse a reviewer is that at this point in
> the code we've lost track of which set was first and which was second
> (combine chooses dynamically, and the recog processes that adds the
> clobbers may have obfuscated the original SET_DEST) so the actual test
> below is to confirm that any newly added clobbers (on the second set
> instruction) don't overlap either set0's or set1's destination.

There is no "which was": they both were to be inserted at I3, as a
parallel.  In no case does combine backtrack, or try multiple solutions
to do something (this would be prohibitively expensive).

> gcc/ChangeLog
> 	PR rtl-optimization/101885
> 	* combine.c (try_combine): When splitting a parallel into two
> 	sequential sets, check not only that the first doesn't clobber
> 	the second but also that the second doesn't clobber the first.
> 
> gcc/testsuite/ChangeLog
> 	PR rtl-optimization/101885
> 	* gcc.dg/pr101885.c: New test case.

> index 2d40635..38e0369 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -4017,6 +4017,23 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
>  
>  	  insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
>  
> +	  /* Likewise, recog_for_combine might have added clobbers to NEWPAT.
> +	     Make sure NEWI2PAT2's original SET_DEST isn't clobbered.  */

"original SET_DEST"?  Please rephrase this, so it is what the code does
(and makes sense :-) )?  It is fine to talk about SET0 and SET1.

> +	  if (insn_code_number >= 0 && GET_CODE (newpat) == PARALLEL)
> +	    {
> +	      for (i = XVECLEN (newpat, 0) - 1; i >= 0; i--)
> +		if (GET_CODE (XVECEXP (newpat, 0, i)) == CLOBBER)
> +		  {
> +		    rtx reg = XEXP (XVECEXP (newpat, 0, i), 0);
> +		    if (reg_overlap_mentioned_p (reg, SET_DEST (set0))
> +			|| reg_overlap_mentioned_p (reg, SET_DEST (set1)))
> +		      {
> +			undo_all ();
> +			return 0;
> +		      }
> +		  }
> +	    }

The patch is okay for trunk.  Thank you!

Also okay for backports (after waiting for fallout a week or two).


Segher
  

Patch

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 2d40635..38e0369 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -4017,6 +4017,23 @@  try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 
 	  insn_code_number = recog_for_combine (&newpat, i3, &new_i3_notes);
 
+	  /* Likewise, recog_for_combine might have added clobbers to NEWPAT.
+	     Make sure NEWI2PAT2's original SET_DEST isn't clobbered.  */
+	  if (insn_code_number >= 0 && GET_CODE (newpat) == PARALLEL)
+	    {
+	      for (i = XVECLEN (newpat, 0) - 1; i >= 0; i--)
+		if (GET_CODE (XVECEXP (newpat, 0, i)) == CLOBBER)
+		  {
+		    rtx reg = XEXP (XVECEXP (newpat, 0, i), 0);
+		    if (reg_overlap_mentioned_p (reg, SET_DEST (set0))
+			|| reg_overlap_mentioned_p (reg, SET_DEST (set1)))
+		      {
+			undo_all ();
+			return 0;
+		      }
+		  }
+	    }
+
 	  if (insn_code_number >= 0)
 	    split_i2i3 = 1;
 	}
diff --git a/gcc/testsuite/gcc.dg/pr101885.c b/gcc/testsuite/gcc.dg/pr101885.c
new file mode 100644
index 0000000..05fd0ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101885.c
@@ -0,0 +1,31 @@ 
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+int a = 3, c;
+short b = 5, d, f;
+volatile short e;
+
+__attribute__((noipa)) void
+foo (void)
+{
+}
+
+int
+main ()
+{
+  for (f = 0; f != 2; f++)
+    {
+      int g = a;
+      if (b)
+	if (a)
+	  for (a = b = 0; b <= 3; b++)
+	    ;
+      for (c = 0; c != 16; ++c)
+	e;
+    }
+  b = d;
+  foo ();
+  if (a != 0)
+    __builtin_abort ();
+  return 0;
+}
+