Improved handling of REG_UNUSED notes on PARALLEL in try_combine.
Commit Message
This patch is the middle-end piece of a set of patches for PR target/43892
that improve combine's ability to optimize instructions with multiple
side-effects, such as those that update explicit carry (flag) registers.
In RTL, an instruction that updates multiple registers is represented
as a PARALLEL of several SETs, such as PowerPC's subfc instruction:
(insn 80 79 81 4 (parallel [
            (set (reg:SI 143)
                (minus:SI (reg/v:SI 142 [ <retval> ])
                    (reg:SI 141 [ _16 ])))
            (set (reg:SI 98 ca)
                (leu:SI (reg:SI 141 [ _16 ])
                    (reg/v:SI 142 [ <retval> ])))
        ]) "../pr43892.c":8:6 104 {subfsi3_carry}
     (expr_list:REG_DEAD (reg:SI 141 [ _16 ])
        (expr_list:REG_UNUSED (reg:SI 143)
            (nil))))
As shown above, it's relatively common for only one of the results
of these instructions to be used, with the other destination register(s)
ignored and annotated with a REG_UNUSED note.
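To make the two side-effects of insn 80 concrete, here is a minimal C sketch of what the PARALLEL computes. It assumes 32-bit SImode operands and that subfc is defined as ~a + b + 1 with CA as the carry-out. The names subfc_result, model_subfc and carry_only are hypothetical and exist only for this illustration; they are not part of GCC or of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two values written by insn 80.  */
struct subfc_result
{
  uint32_t diff; /* models (set (reg:SI 143) (minus:SI r142 r141)) */
  uint32_t ca;   /* models (set (reg:SI 98 ca) (leu:SI r141 r142)) */
};

static struct subfc_result
model_subfc (uint32_t r141, uint32_t r142)
{
  /* Assumed definition: ~r141 + r142 + 1 evaluated in 64 bits, so that
     bit 32 holds the carry-out.  */
  uint64_t wide = (uint64_t) (uint32_t) ~r141 + (uint64_t) r142 + 1u;
  struct subfc_result res;
  res.diff = (uint32_t) wide;
  res.ca = (uint32_t) (wide >> 32);
  /* The carry-out agrees with the leu comparison used in the RTL,
     and the low 32 bits agree with the minus.  */
  assert (res.ca == (uint32_t) (r141 <= r142));
  assert (res.diff == (uint32_t) (r142 - r141));
  return res;
}

/* A consumer that reads only the carry.  The difference is still
   computed but never read, which is the situation that the
   REG_UNUSED note on reg 143 records.  */
static uint32_t
carry_only (uint32_t r141, uint32_t r142)
{
  return model_subfc (r141, r142).ca;
}
```

For example, carry_only (3, 5) yields 1 and carry_only (5, 3) yields 0, while the difference is discarded in both cases.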
This patch teaches combine to take advantage of these REG_UNUSED
annotations when trying to simplify instruction sequences.
Currently, these annotations are ignored and the useless SETs
preserved in try_combine's combination attempts:
Trying 79 -> 80:
   79: r142:SI=r139:SI+r141:SI
      REG_DEAD r139:SI
   80: {r143:SI=r142:SI-r141:SI;ca:SI=leu(r141:SI,r142:SI);}
      REG_DEAD r141:SI
      REG_UNUSED r143:SI
Failed to match this instruction:
(parallel [
        (set (reg:SI 143)
            (reg/v:SI 139 [ <retval> ]))
        (set (reg:SI 98 ca)
            (geu:SI (plus:SI (reg/v:SI 139 [ <retval> ])
                    (reg:SI 141 [ _16 ]))
                (reg/v:SI 139 [ <retval> ])))
        (set (reg/v:SI 142 [ <retval> ])
            (plus:SI (reg/v:SI 139 [ <retval> ])
                (reg:SI 141 [ _16 ])))
    ])
Notice that the combined/fused instruction passed to recog contains
a (set (reg:SI 143) (reg/v:SI 139)), even though r143 was marked as unused
in the input sequence. Fortunately, it's trivial to prune these
vestigial SETs, using the logic in single_set to determine that
only one of the SETs in a PARALLEL is useful, or expressed another
way, that the parallel can be simplified to the single_set.
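The pruning idea can be sketched outside of GCC with a toy model. This is not GCC's single_set implementation, which handles more cases (USEs, side effects in the unused SET, and so on). The types toy_code and toy_elt and the functions toy_single_set and toy_prune_parallel are hypothetical names invented for this illustration. The dest_unused field stands in for a REG_UNUSED note on the destination, and the guard on the last vector element mirrors the "without clobbers" test in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL; these are not GCC's real data structures.  */
enum toy_code { TOY_SET, TOY_CLOBBER };

struct toy_elt
{
  enum toy_code code;
  int dest_regno;
  bool dest_unused; /* models a REG_UNUSED note for dest_regno */
};

/* Return the only SET whose destination is used, or NULL if there is
   no such SET or more than one.  */
static const struct toy_elt *
toy_single_set (const struct toy_elt *vec, size_t n)
{
  assert (vec != NULL && n > 0);
  const struct toy_elt *found = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (vec[i].code != TOY_SET)
        continue; /* CLOBBERs are not candidate SETs */
      if (vec[i].dest_unused)
        continue; /* vestigial SET, result never read */
      if (found != NULL)
        return NULL; /* two useful SETs, so no single set */
      found = &vec[i];
    }
  return found;
}

/* Mirror the patch's guard: only simplify a PARALLEL whose last
   element is not a CLOBBER.  Returns NULL when nothing is pruned.  */
static const struct toy_elt *
toy_prune_parallel (const struct toy_elt *vec, size_t n)
{
  assert (vec != NULL && n > 0);
  if (vec[n - 1].code == TOY_CLOBBER)
    return NULL;
  return toy_single_set (vec, n);
}
```

Applied to a model of insn 80, that is { {TOY_SET, 143, true}, {TOY_SET, 98, false} }, toy_prune_parallel returns the element that sets register 98 (ca), so only that SET would be carried into the combination attempt.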
This patch has been tested on x86_64-pc-linux-gnu with a make
bootstrap and make -k check with no new failures, and in
combination with other patches on powerpc64-unknown-linux-gnu
(cf. https://gcc.gnu.org/pipermail/gcc-patches/2021-December/585977.html).
I'll include a testcase for this functionality with the final
rs6000 backend patch in the series.
Ok for mainline?
2021-12-10 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* combine.c (try_combine): When I2 or I3 is PARALLEL without
clobbers that is effectively just a single_set, just use that
SET during the recombination/fusion attempt.
Thanks in advance,
Roger
--
@@ -2901,6 +2901,17 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
           alloc_insn_link (i1, regno, LOG_LINKS (i2)));
     }
 
+  /* If I2 is a PARALLEL with only one useful SET and without clobbers,
+     transform I2 into that SET. */
+  if (GET_CODE (PATTERN (i2)) == PARALLEL
+      && GET_CODE (XVECEXP (PATTERN (i2), 0, XVECLEN (PATTERN (i2), 0) - 1))
+         != CLOBBER)
+    {
+      rtx tmp = single_set (i2);
+      if (tmp)
+        SUBST (PATTERN (i2), tmp);
+    }
+
   /* If I2 is a PARALLEL of two SETs of REGs (and perhaps some CLOBBERs),
      make those two SETs separate I1 and I2 insns, and make an I0 that is
      the original I1. */
@@ -3389,6 +3400,18 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
       int extra_sets = added_sets_0 + added_sets_1 + added_sets_2;
       combine_extras++;
 
+      /* If I3 was a PARALLEL with only one useful SET, we can discard
+         the other SETs now before constructing the new PARALLEL. */
+      if (GET_CODE (newpat) == PARALLEL
+          && newpat == PATTERN (i3)
+          && GET_CODE (XVECEXP (newpat, 0, XVECLEN (newpat, 0) - 1))
+             != CLOBBER)
+        {
+          rtx tmp = single_set (i3);
+          if (tmp)
+            newpat = tmp;
+        }
+
       if (GET_CODE (newpat) == PARALLEL)
         {
           rtvec old = XVEC (newpat, 0);