Improved handling of REG_UNUSED notes on PARALLEL in try_combine.

Message ID 002301d7edd1$f218fa70$d64aef50$@nextmovesoftware.com
State New

Commit Message

Roger Sayle Dec. 10, 2021, 2:26 p.m. UTC
  This patch is the middle-end piece of a set of patches for PR target/43892
that improve combine's ability to optimize instructions with multiple
side-effects, such as those that update explicit carry (flag) registers.

In RTL, an instruction that updates multiple registers is represented
as a PARALLEL of several SETs, such as PowerPC's subfc instruction:

(insn 80 79 81 4 (parallel [
            (set (reg:SI 143)
                (minus:SI (reg/v:SI 142 [ <retval> ])
                    (reg:SI 141 [ _16 ])))
            (set (reg:SI 98 ca)
                (leu:SI (reg:SI 141 [ _16 ])
                    (reg/v:SI 142 [ <retval> ])))
        ]) "../pr43892.c":8:6 104 {subfsi3_carry}
     (expr_list:REG_DEAD (reg:SI 141 [ _16 ])
        (expr_list:REG_UNUSED (reg:SI 143)
            (nil))))

As the example shows, it's relatively common for only one of the results
of these instructions to be used; the other destination register(s) are
ignored and annotated with a REG_UNUSED note.

This patch teaches combine to take advantage of these REG_UNUSED
annotations when trying to simplify instruction sequences.
Currently, these annotations are ignored and the useless SETs
preserved in try_combine's combination attempts:

Trying 79 -> 80:
   79: r142:SI=r139:SI+r141:SI
      REG_DEAD r139:SI
   80: {r143:SI=r142:SI-r141:SI;ca:SI=leu(r141:SI,r142:SI);}
      REG_DEAD r141:SI
      REG_UNUSED r143:SI
Failed to match this instruction:
(parallel [
        (set (reg:SI 143)
            (reg/v:SI 139 [ <retval> ]))
        (set (reg:SI 98 ca)
            (geu:SI (plus:SI (reg/v:SI 139 [ <retval> ])
                    (reg:SI 141 [ _16 ]))
                (reg/v:SI 139 [ <retval> ])))
        (set (reg/v:SI 142 [ <retval> ])
            (plus:SI (reg/v:SI 139 [ <retval> ])
                (reg:SI 141 [ _16 ])))
    ])
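As an aside on the dump above: insn 80 computes the carry as
leu (r141, r142), while the combined pattern shows
geu (r139 + r141, r139).  For wrapping unsigned addition both
comparisons express "the addition did not carry".  The following is only
a minimal sketch checking that equivalence exhaustively on 8-bit values;
the function names are hypothetical and nothing here is GCC code.

```c
#include <stdint.h>

/* Form used by insn 80: b <= (a + b), with a + b wrapping to 8 bits.  */
static int no_carry_leu (uint8_t a, uint8_t b)
{
  uint8_t sum = (uint8_t) (a + b);
  return b <= sum;
}

/* Form shown in the combined pattern: (a + b) >= a.  */
static int no_carry_geu (uint8_t a, uint8_t b)
{
  uint8_t sum = (uint8_t) (a + b);
  return sum >= a;
}

/* Count the 8-bit input pairs on which the two forms disagree.  */
static int count_mismatches (void)
{
  int mismatches = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      if (no_carry_leu ((uint8_t) a, (uint8_t) b)
          != no_carry_geu ((uint8_t) a, (uint8_t) b))
        mismatches++;
  return mismatches;
}
```

Calling count_mismatches () from a small main is enough to exercise it;
the 8-bit width is chosen only to keep the exhaustive loop cheap.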

Notice that the combined/fused instruction passed to recog contains
a (set (reg:SI 143) (reg/v:SI 139)), even though r143 was marked as
unused in the input sequence.  Fortunately, it's trivial to prune these
vestigial SETs, using the logic in single_set to determine that
only one of the SETs in a PARALLEL is useful or, expressed another
way, that the PARALLEL can be simplified to the single_set.
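The pruning idea can be illustrated with a toy model.  This is only a
sketch under simplifying assumptions: struct toy_set and
toy_single_useful_set are hypothetical names, not GCC's rtx
representation or its single_set, which also has to deal with CLOBBERs,
USEs and side effects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one SET inside a PARALLEL (hypothetical, not an rtx).  */
struct toy_set
{
  int dest_regno;    /* Register written by this SET.  */
  bool dest_unused;  /* True if the insn has a REG_UNUSED note for it.  */
};

/* Return the only SET whose destination is still used, or NULL when
   none or more than one of the SETs is needed.  */
static const struct toy_set *
toy_single_useful_set (const struct toy_set *sets, size_t n)
{
  const struct toy_set *found = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (sets[i].dest_unused)
        continue;       /* Result is ignored, so skip this SET.  */
      if (found)
        return NULL;    /* A second useful SET: cannot simplify.  */
      found = &sets[i];
    }
  return found;
}
```

Modelling insn 80 as { {143, true}, {98, false} } makes the helper return
the SET of register 98 (ca), which is the only SET worth keeping in the
combination attempt.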

This patch has been tested on x86_64-pc-linux-gnu with a make
bootstrap and make -k check with no new failures, and in
combination with other patches on powerpc64-unknown-linux-gnu
(cf. https://gcc.gnu.org/pipermail/gcc-patches/2021-December/585977.html).
I'll include a testcase for this functionality with the final
rs6000 backend patch in the series.

Ok for mainline?


2021-12-10  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* combine.c (try_combine): When I2 or I3 is PARALLEL without
	clobbers that is effectively just a single_set, just use that
	SET during the recombination/fusion attempt.


Thanks in advance,
Roger
--
  

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index 03e9a78..07f70b3 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2901,6 +2901,17 @@  try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
 		  alloc_insn_link (i1, regno, LOG_LINKS (i2)));
     }
 
+  /* If I2 is a PARALLEL with only one useful SET and without clobbers,
+     transform I2 into that SET.  */
+  if (GET_CODE (PATTERN (i2)) == PARALLEL
+      && GET_CODE (XVECEXP (PATTERN (i2), 0, XVECLEN (PATTERN (i2), 0) - 1))
+	 != CLOBBER)
+    {
+      rtx tmp = single_set (i2);
+      if (tmp)
+        SUBST (PATTERN (i2), tmp);
+    }
+
   /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
      make those two SETs separate I1 and I2 insns, and make an I0 that is
      the original I1.  */
@@ -3389,6 +3400,18 @@  try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
       int extra_sets = added_sets_0 + added_sets_1 + added_sets_2;
       combine_extras++;
 
+      /* If I3 was a PARALLEL with only one useful SET, we can discard
+	 the other SETs now before constructing the new PARALLEL.  */
+      if (GET_CODE (newpat) == PARALLEL
+	  && newpat == PATTERN (i3)
+          && GET_CODE (XVECEXP (newpat, 0, XVECLEN (newpat, 0) - 1))
+	     != CLOBBER)
+	{
+	  rtx tmp = single_set (i3);
+	  if (tmp)
+	    newpat = tmp;
+	}
+
       if (GET_CODE (newpat) == PARALLEL)
 	{
 	  rtvec old = XVEC (newpat, 0);