From patchwork Wed May 31 15:13:22 2023
X-Patchwork-Submitter: Hans-Peter Nilsson
X-Patchwork-Id: 70394
X-Original-To: gcc-patches@gcc.gnu.org
Subject: [PATCH] reload_cse_move2add: Handle trivial single_set:s
Message-ID: <20230531151322.4CE832041F@pchp3.se.axis.com>
Date: Wed, 31 May 2023 17:13:22 +0200
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Hans-Peter Nilsson via Gcc-patches
From: Hans-Peter Nilsson
Reply-To: Hans-Peter Nilsson

Tested cris-elf, bootstrapped & checked native
x86_64-pc-linux-gnu for good measure.  Ok to commit?

If it wasn't for there already being an auto_inc_dec pass, this
looks like a good place to put it, considering the framework
data.  (BTW, current auto-inc-dec generation is so poor that you
can replace half of what auto_inc_dec does with a few
peephole2s.)

brgds, H-P

-- >8 --
The reload_cse_move2add part of "postreload" handled only
insns whose PATTERN was a SET.  That excludes insns that
e.g. clobber a flags register, which it does only for
"simplicity".  The patch extends the "simplicity" to most
single_set insns.  For a subset of those insns there's still
an assumption: that the single_set of a PARALLEL insn is the
first element in the PARALLEL.  If the assumption fails,
it's no biggie; the optimization just isn't performed.
Don't let the name deceive you: this optimization doesn't
hit often, but it hits as often (or as rarely) for LRA as for
reload, at least on e.g. cris-elf, where the biggest effect
was seen in reducing repeated addresses in copies from
fixed-address arrays, like in gcc.c-torture/compile/pr78694.c.

	* postreload.cc (move2add_use_add2_insn): Handle
	trivial single_sets.  Rename variable PAT to SET.
	(move2add_use_add3_insn, reload_cse_move2add): Similar.
---
 gcc/postreload.cc | 67 +++++++++++++++++++++++++++--------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index fb392651e1b6..996206f589d3 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -1744,8 +1744,8 @@ static bool
 move2add_use_add2_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 			rtx_insn *insn)
 {
-  rtx pat = PATTERN (insn);
-  rtx src = SET_SRC (pat);
+  rtx set = single_set (insn);
+  rtx src = SET_SRC (set);
   int regno = REGNO (reg);
   rtx new_src = gen_int_mode (UINTVAL (off) - reg_offset[regno], mode);
   bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
@@ -1764,21 +1764,21 @@ move2add_use_add2_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 	 (reg)), would be discarded.  Maybe we should
 	 try a truncMN pattern?  */
       if (INTVAL (off) == reg_offset [regno])
-	changed = validate_change (insn, &SET_SRC (pat), reg, 0);
+	changed = validate_change (insn, &SET_SRC (set), reg, 0);
     }
   else
     {
       struct full_rtx_costs oldcst, newcst;
       rtx tem = gen_rtx_PLUS (mode, reg, new_src);
 
-      get_full_set_rtx_cost (pat, &oldcst);
-      SET_SRC (pat) = tem;
-      get_full_set_rtx_cost (pat, &newcst);
-      SET_SRC (pat) = src;
+      get_full_set_rtx_cost (set, &oldcst);
+      SET_SRC (set) = tem;
+      get_full_set_rtx_cost (set, &newcst);
+      SET_SRC (set) = src;
 
       if (costs_lt_p (&newcst, &oldcst, speed)
 	  && have_add2_insn (reg, new_src))
-	changed = validate_change (insn, &SET_SRC (pat), tem, 0);
+	changed = validate_change (insn, &SET_SRC (set), tem, 0);
       else if (sym == NULL_RTX && mode != BImode)
 	{
 	  scalar_int_mode narrow_mode;
@@ -1796,10 +1796,15 @@ move2add_use_add2_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 							    narrow_reg),
 				   narrow_src);
 		  get_full_set_rtx_cost (new_set, &newcst);
-		  if (costs_lt_p (&newcst, &oldcst, speed))
+
+		  /* We perform this replacement only if NEXT is either a
+		     naked SET, or else its single_set is the first element
+		     in a PARALLEL.  */
+		  rtx *setloc = GET_CODE (PATTERN (insn)) == PARALLEL
+		    ? &XEXP (PATTERN (insn), 0) : &PATTERN (insn);
+		  if (*setloc == set && costs_lt_p (&newcst, &oldcst, speed))
 		    {
-		      changed = validate_change (insn, &PATTERN (insn),
-						 new_set, 0);
+		      changed = validate_change (insn, setloc, new_set, 0);
 		      if (changed)
 			break;
 		    }
@@ -1825,8 +1830,8 @@ static bool
 move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 			rtx_insn *insn)
 {
-  rtx pat = PATTERN (insn);
-  rtx src = SET_SRC (pat);
+  rtx set = single_set (insn);
+  rtx src = SET_SRC (set);
   int regno = REGNO (reg);
   int min_regno = 0;
   bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
@@ -1836,10 +1841,10 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
   rtx plus_expr;
 
   init_costs_to_max (&mincst);
-  get_full_set_rtx_cost (pat, &oldcst);
+  get_full_set_rtx_cost (set, &oldcst);
 
   plus_expr = gen_rtx_PLUS (GET_MODE (reg), reg, const0_rtx);
-  SET_SRC (pat) = plus_expr;
+  SET_SRC (set) = plus_expr;
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     if (move2add_valid_value_p (i, mode)
@@ -1864,7 +1869,7 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 	else
 	  {
 	    XEXP (plus_expr, 1) = new_src;
-	    get_full_set_rtx_cost (pat, &newcst);
+	    get_full_set_rtx_cost (set, &newcst);
 
 	    if (costs_lt_p (&newcst, &mincst, speed))
 	      {
@@ -1873,7 +1878,7 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 	      }
 	  }
       }
-  SET_SRC (pat) = src;
+  SET_SRC (set) = src;
 
   if (costs_lt_p (&mincst, &oldcst, speed))
     {
@@ -1886,7 +1891,7 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
 				      GET_MODE (reg));
 	  tem = gen_rtx_PLUS (GET_MODE (reg), tem, new_src);
 	}
-      if (validate_change (insn, &SET_SRC (pat), tem, 0))
+      if (validate_change (insn, &SET_SRC (set), tem, 0))
 	changed = true;
     }
   reg_set_luid[regno] = move2add_luid;
@@ -1916,7 +1921,7 @@ reload_cse_move2add (rtx_insn *first)
   move2add_luid = 2;
   for (insn = first; insn; insn = NEXT_INSN (insn), move2add_luid++)
     {
-      rtx pat, note;
+      rtx set, note;
 
       if (LABEL_P (insn))
 	{
@@ -1929,17 +1934,17 @@ reload_cse_move2add (rtx_insn *first)
 	}
       if (! INSN_P (insn))
 	continue;
-      pat = PATTERN (insn);
+      set = single_set (insn);
       /* For simplicity, we only perform this optimization on
-	 straightforward SETs.  */
+	 single-sets.  */
       scalar_int_mode mode;
-      if (GET_CODE (pat) == SET
-	  && REG_P (SET_DEST (pat))
-	  && is_a <scalar_int_mode> (GET_MODE (SET_DEST (pat)), &mode))
+      if (set
+	  && REG_P (SET_DEST (set))
+	  && is_a <scalar_int_mode> (GET_MODE (SET_DEST (set)), &mode))
 	{
-	  rtx reg = SET_DEST (pat);
+	  rtx reg = SET_DEST (set);
 	  int regno = REGNO (reg);
-	  rtx src = SET_SRC (pat);
+	  rtx src = SET_SRC (set);
 
 	  /* Check if we have valid information on the contents of this
 	     register in the mode of REG.  */
@@ -2021,13 +2028,15 @@ reload_cse_move2add (rtx_insn *first)
 			  SET_SRC (set) = old_src;
 			  costs_add_n_insns (&oldcst, 1);
 
-			  if (costs_lt_p (&newcst, &oldcst, speed)
+			  rtx *setloc = GET_CODE (PATTERN (next)) == PARALLEL
+			    ? &XEXP (PATTERN (next), 0) : &PATTERN (next);
+			  if (*setloc == set
+			      && costs_lt_p (&newcst, &oldcst, speed)
 			      && have_add2_insn (reg, new_src))
 			    {
 			      rtx newpat = gen_rtx_SET (reg, tem);
 			      success
-				= validate_change (next, &PATTERN (next),
-						   newpat, 0);
+				= validate_change (next, setloc, newpat, 0);
 			    }
 			}
 		      if (success)