[committed] CRIS: Improve bailing for eliminable compares for "addi" vs. "add"

Message ID 20230328012939.49ECF20417@pchp3.se.axis.com
State Committed
Commit 75be4badfeda2448383429f6061e095afd5a4ee5

Commit Message

Hans-Peter Nilsson March 28, 2023, 1:29 a.m. UTC
This patch affects a post-reload define_split for CRIS that transforms
a condition-code-clobbering addition into a non-clobbering addition.
(A "two-operand" addition between registers is the only insn that has
both a condition-code-clobbering and a non-clobbering variant for
CRIS.)  After this patch, many more "add.d" insns are replaced by
non-condition-code-clobbering "addi" insns, but most of those
transformations make no difference to the generated code's performance.

CRIS with LRA generated code that exposed a flaw in the original
patch: it bailed too easily, on *any* insn using the result of the
addition.  To wit, the code that avoids munging an insn sequence that
cmpelim is supposed to handle needs to inspect the user with more
effort than simply applying reg_mentioned_p.
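
As a rough illustration of the difference between the two bail-out rules,
here is a toy model in plain C.  It is not GCC code: the enum, the struct
and the function names are hypothetical stand-ins for the RTL of the insn
following the addition and for the checks in the define_split, and the
old rule's separate BARRIER test is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified description of the insn that follows
   the addition.  None of these names exist in GCC.  */
enum toy_kind { TOY_PLAIN_USE, TOY_COMPARE, TOY_CBRANCH, TOY_CSTORE };

struct toy_insn
{
  enum toy_kind kind;
  bool is_single_set;  /* Models single_set (i) != NULL_RTX.  */
  int lhs_regno;       /* Register used as the (first) operand.  */
  bool rhs_is_zero;    /* Second comparison operand is constant 0.  */
};

/* Old rule: give up if there is no next insn or if it mentions the
   result register in any way (models reg_mentioned_p).  */
bool
toy_old_bail (const struct toy_insn *next, int regno)
{
  return next == NULL || next->lhs_regno == regno;
}

/* New rule: give up only if the next insn is missing, is not a single
   set, or looks like an eliminable comparison of the result register
   against zero (a compare, or a not-yet-split cbranch or cstore).  */
bool
toy_new_bail (const struct toy_insn *next, int regno)
{
  assert (regno >= 0);

  if (next == NULL || !next->is_single_set)
    return true;

  if (next->kind == TOY_PLAIN_USE)
    return false;

  return next->lhs_regno == regno && next->rhs_is_zero;
}
```

In this model, a plain use of the result register makes the old rule
bail while the new rule lets the split go ahead.  A comparison of that
register against zero makes both rules bail.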

With this patch, the coremark score for CRIS (*with reload*) improves by
less than 0.01% (a single "nop" is eliminated in
core_state_transition, in an execution path that affects ~1/20 of all
of the 10240 calls).  However, the original motivation for this patch is
to avoid regressing gcc.target/cris/pr93372-44.c for LRA, where a
needless "cmpq" is otherwise emitted.  For CRIS with LRA, the performance
effect on coremark isn't even measurable; the only visible change is a
smaller executable, because non-called library code is affected.

	* config/cris/cris.md ("*add<mode>3_addi"): Improve to bail only
	for possible eliminable compares.
---
 gcc/config/cris/cris.md | 53 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 52 insertions(+), 1 deletion(-)
  

Patch

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 2bea480a0200..30ff7e75c1bf 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -1362,12 +1362,63 @@  (define_split ;; "*add<mode>3_addi"
 {
   rtx reg = operands[0];
   rtx_insn *i = next_nonnote_nondebug_insn_bb (curr_insn);
+  rtx x, src, dest;
 
   while (i != NULL_RTX && (!INSN_P (i) || DEBUG_INSN_P (i)))
     i = next_nonnote_nondebug_insn_bb (i);
 
-  if (i == NULL_RTX || reg_mentioned_p (reg, i) || BARRIER_P (i))
+  /* We don't want to strip the clobber if the next insn possibly uses the
+     zeroness of the result.  Preferably fail only if we see a compare insn
+     that looks eliminable and with the register "reg" compared.  With some
+     effort we could also check for an equality test (EQ, NE) in the post-split
+     user, just not for now.  */
+  if (i == NULL_RTX)
     FAIL;
+
+  x = single_set (i);
+
+  /* We explicitly need to bail on a BARRIER, but that's implied by a failing
+     single_set test.  */
+  if (x == NULL_RTX)
+    FAIL;
+
+  src = SET_SRC (x);
+  dest = SET_DEST (x);
+
+  /* Bail on (post-split) eliminable compares.  */
+  if (REG_P (dest) && REGNO (dest) == CRIS_CC0_REGNUM
+      && GET_CODE (src) == COMPARE)
+    {
+      rtx cop0 = XEXP (src, 0);
+
+      if (REG_P (cop0) && REGNO (cop0) == REGNO (reg)
+	  && XEXP (src, 1) == const0_rtx)
+	FAIL;
+    }
+
+  /* Bail out if we see a (pre-split) cbranch or cstore where the comparison
+     looks eliminable and uses the destination register in this addition.  We
+     don't need to look very deep: a single_set which is a parallel clobbers
+     something, and (one of) that something, is always CRIS_CC0_REGNUM here.
+     Also, the entities we're looking for are two-element parallels.  A
+     split-up cbranch or cstore doesn't clobber CRIS_CC0_REGNUM.  A cbranch has
+     if_then_else as its source with a comparison operator as the condition,
+     and a cstore has a source with the comparison operator directly.  That
+     also matches dstep, so look for pc as destination for the if_then_else.
+     We err on the safe side if we happen to catch other conditional entities
+     and FAIL, that just means the split won't happen.  */
+  if (GET_CODE (PATTERN (i)) == PARALLEL && XVECLEN (PATTERN (i), 0) == 2)
+    {
+      rtx cmp
+	= (GET_CODE (src) == IF_THEN_ELSE && dest == pc_rtx
+	   ? XEXP (src, 0)
+	   : (COMPARISON_P (src) ? src : NULL_RTX));
+      gcc_assert (cmp == NULL_RTX || COMPARISON_P (cmp));
+
+      if (cmp && REG_P (XEXP (cmp, 0)) && XEXP (cmp, 1) == const0_rtx
+	  && REGNO (XEXP (cmp, 0)) == REGNO (reg))
+	FAIL;
+    }
 })
 
 (define_insn "<u>mul<s><mode>3"