From patchwork Sat Jan 6 08:54:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lulu Cheng
X-Patchwork-Id: 83446
From: Lulu Cheng
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, xuchenghua@loongson.cn,
 chenglulu@loongson.cn, liwei
Subject: [PATCH 2/3] LoongArch: Redundant sign extension elimination
 optimization.
Date: Sat, 6 Jan 2024 16:54:08 +0800
Message-Id: <20240106085409.25985-2-chenglulu@loongson.cn>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20240106085409.25985-1-chenglulu@loongson.cn>
References: <20240106085409.25985-1-chenglulu@loongson.cn>

From: liwei

The combine pass in GCC currently cannot eliminate the redundant sign
extension in sequences such as the following:

(insn 77 76 78 5 (set (reg:SI 143)
        (plus:SI (subreg/s/u:SI (reg/v:DI 104 [ len ]) 0)
            (const_int 1 [0x1]))) {addsi3}
     (expr_list:REG_DEAD (reg/v:DI 104 [ len ])
        (nil)))
(insn 78 77 82 5 (set (reg/v:DI 104 [ len ])
        (sign_extend:DI (reg:SI 143))) {extendsidi2}
     (nil))

Because reg:SI 143 neither dies nor is set in insn 78, combine does not
merge the two insns.  Adjust the add patterns so that the redundant
sign extension is eliminated during the expand pass instead.

gcc/ChangeLog:

	* config/loongarch/loongarch.md (add<mode>3): Removed.
	(*addsi3): New.
	(addsi3): New.
	(adddi3): New.
	(*addsi3_extended): Removed.
	(addsi3_extended): New.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/sign-extend.c: Moved to...
	* gcc.target/loongarch/sign-extend-1.c: ...here.
	* gcc.target/loongarch/sign-extend-2.c: New test.
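
For readers unfamiliar with the instruction semantics, here is a minimal C
sketch of why the sign extension after a 32-bit add is redundant on a 64-bit
LoongArch target.  It is only a model under stated assumptions, not part of
the patch: the helper names model_add_w, model_sext_w, extension_is_redundant
and check_models are hypothetical, and the sketch assumes that add.w writes
the sign-extended 32-bit sum to the destination register and that the
explicit extension behaves like slli.w rd,rj,0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of add.w: add the low 32 bits of both registers and
   sign-extend the 32-bit sum to 64 bits.  The uint32_t -> int32_t
   conversion relies on GCC's modulo behaviour for out-of-range values.  */
static int64_t
model_add_w (int64_t rj, int64_t rk)
{
  uint32_t sum = (uint32_t) rj + (uint32_t) rk;
  return (int64_t) (int32_t) sum;
}

/* Hypothetical model of an explicit SImode -> DImode sign extension
   (assumed to be emitted as slli.w rd,rj,0): keep the low 32 bits of the
   register and sign-extend them to 64 bits.  */
static int64_t
model_sext_w (int64_t rj)
{
  return (int64_t) (int32_t) (uint32_t) rj;
}

/* Return 1 when extending the result of the modelled add.w changes
   nothing, i.e. the extension is redundant for these inputs.  */
static int
extension_is_redundant (int64_t rj, int64_t rk)
{
  int64_t sum = model_add_w (rj, rk);
  return model_sext_w (sum) == sum;
}

/* A few spot checks, including 32-bit overflow and garbage in the
   upper half of an input register.  */
static void
check_models (void)
{
  assert (model_add_w (0x7fffffff, 1) == INT32_MIN);
  assert (model_add_w (-1, 1) == 0);
  assert (extension_is_redundant (0x7fffffff, 1));
  assert (extension_is_redundant (0x123456789LL, 5));
}
```

In this model the result of the add is already in sign-extended form, so
applying the extension a second time is an identity.  That is the property
the new addsi3 expander appears to rely on when it emits addsi3_extended and
marks the SImode lowpart as a signed promoted subreg.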
---
 gcc/config/loongarch/loongarch.md             | 93 ++++++++++++++++---
 .../{sign-extend.c => sign-extend-1.c}        |  0
 .../gcc.target/loongarch/sign-extend-2.c      | 59 ++++++++++++
 3 files changed, 137 insertions(+), 15 deletions(-)
 rename gcc/testsuite/gcc.target/loongarch/{sign-extend.c => sign-extend-1.c} (100%)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/sign-extend-2.c

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 436b9a93235..17ec401f535 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -657,15 +657,15 @@ (define_insn "add<mode>3"
   [(set_attr "type" "fadd")
    (set_attr "mode" "<UNITMODE>")])
 
-(define_insn_and_split "add<mode>3"
-  [(set (match_operand:GPR 0 "register_operand" "=r,r,r,r,r,r,r")
-        (plus:GPR (match_operand:GPR 1 "register_operand" "r,r,r,r,r,r,r")
-                  (match_operand:GPR 2 "plus_<mode>_operand"
+(define_insn_and_split "*addsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r")
+        (plus:SI (match_operand:SI 1 "register_operand" "r,r,r,r,r,r,r")
+                  (match_operand:SI 2 "plus_si_operand"
                                        "r,I,La,Lb,Lc,Ld,Le")))]
   ""
   "@
-   add.<d>\t%0,%1,%2
-   addi.<d>\t%0,%1,%2
+   add.w\t%0,%1,%2
+   addi.w\t%0,%1,%2
    #
    * operands[2] = GEN_INT (INTVAL (operands[2]) / 65536); \
      return \"addu16i.d\t%0,%1,%2\";
@@ -674,25 +674,88 @@ (define_insn_and_split "add<mode>3"
    #"
   "CONST_INT_P (operands[2]) && !IMM12_INT (operands[2]) \
    && !ADDU16I_OPERAND (INTVAL (operands[2]))"
-  [(set (match_dup 0) (plus:GPR (match_dup 1) (match_dup 3)))
-   (set (match_dup 0) (plus:GPR (match_dup 0) (match_dup 4)))]
+  [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))]
   {
-    loongarch_split_plus_constant (&operands[2], <MODE>mode);
+    loongarch_split_plus_constant (&operands[2], SImode);
   }
   [(set_attr "alu_type" "add")
-   (set_attr "mode" "<MODE>")
+   (set_attr "mode" "SI")
    (set_attr "insn_count" "1,1,2,1,2,2,2")
    (set (attr "enabled")
       (cond
-        [(match_test "<MODE>mode != DImode && which_alternative == 4")
+        [(match_test "which_alternative == 4")
          (const_string "no")
-         (match_test "<MODE>mode != DImode && which_alternative == 5")
-         (const_string "no")
-         (match_test "<MODE>mode != SImode && which_alternative == 6")
+         (match_test "which_alternative == 5")
+         (const_string "no")]
+        (const_string "yes")))])
+
+(define_expand "addsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r")
+        (plus:SI (match_operand:SI 1 "register_operand" "r,r,r,r,r")
+                 (match_operand:SI 2 "plus_si_operand" "r,I,La,Le,Lb")))]
+  ""
+{
+  if (TARGET_64BIT)
+    {
+      if (CONST_INT_P (operands[2]) && !IMM12_INT (operands[2])
+          && ADDU16I_OPERAND (INTVAL (operands[2])))
+        {
+          rtx t1 = gen_reg_rtx (DImode);
+          rtx t2 = gen_reg_rtx (DImode);
+          rtx t3 = gen_reg_rtx (DImode);
+          emit_insn (gen_extend_insn (t1, operands[1], DImode, SImode, 0));
+          t2 = operands[2];
+          emit_insn (gen_adddi3 (t3, t1, t2));
+          t3 = gen_lowpart (SImode, t3);
+          emit_move_insn (operands[0], t3);
+          DONE;
+        }
+      else
+        {
+          rtx t = gen_reg_rtx (DImode);
+          emit_insn (gen_addsi3_extended (t, operands[1], operands[2]));
+          t = gen_lowpart (SImode, t);
+          SUBREG_PROMOTED_VAR_P (t) = 1;
+          SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+          emit_move_insn (operands[0], t);
+          DONE;
+        }
+    }
+})
+
+(define_insn_and_split "adddi3"
+  [(set (match_operand:DI 0 "register_operand" "=r,r,r,r,r,r,r")
+        (plus:DI (match_operand:DI 1 "register_operand" "r,r,r,r,r,r,r")
+                  (match_operand:DI 2 "plus_di_operand"
+                                       "r,I,La,Lb,Lc,Ld,Le")))]
+  ""
+  "@
+   add.d\t%0,%1,%2
+   addi.d\t%0,%1,%2
+   #
+   * operands[2] = GEN_INT (INTVAL (operands[2]) / 65536); \
+     return \"addu16i.d\t%0,%1,%2\";
+   #
+   #
+   #"
+  "CONST_INT_P (operands[2]) && !IMM12_INT (operands[2]) \
+   && !ADDU16I_OPERAND (INTVAL (operands[2]))"
+  [(set (match_dup 0) (plus:DI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (plus:DI (match_dup 0) (match_dup 4)))]
+  {
+    loongarch_split_plus_constant (&operands[2], DImode);
+  }
+  [(set_attr "alu_type" "add")
+   (set_attr "mode" "DI")
+   (set_attr "insn_count" "1,1,2,1,2,2,2")
+   (set (attr "enabled")
+      (cond
+        [(match_test "which_alternative == 6")
          (const_string "no")]
         (const_string "yes")))])
 
-(define_insn_and_split "*addsi3_extended"
+(define_insn_and_split "addsi3_extended"
   [(set (match_operand:DI 0 "register_operand" "=r,r,r,r")
         (sign_extend:DI
              (plus:SI (match_operand:SI 1 "register_operand" "r,r,r,r")
diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend.c b/gcc/testsuite/gcc.target/loongarch/sign-extend-1.c
similarity index 100%
rename from gcc/testsuite/gcc.target/loongarch/sign-extend.c
rename to gcc/testsuite/gcc.target/loongarch/sign-extend-1.c
diff --git a/gcc/testsuite/gcc.target/loongarch/sign-extend-2.c b/gcc/testsuite/gcc.target/loongarch/sign-extend-2.c
new file mode 100644
index 00000000000..a45dde4f73f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/sign-extend-2.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64d -O2" } */
+/* { dg-final { scan-assembler-times "slli.w\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,0" 1 } } */
+
+#include <stdint.h>
+#define my_min(x, y) ((x) < (y) ? (x) : (y))
+
+void
+bt_skip_func (const uint32_t len_limit, const uint32_t pos,
+              const uint8_t *const cur, uint32_t cur_match,
+              uint32_t *const son, const uint32_t cyclic_pos,
+              const uint32_t cyclic_size)
+{
+  uint32_t *ptr0 = son + (cyclic_pos << 1) + 1;
+  uint32_t *ptr1 = son + (cyclic_pos << 1);
+
+  uint32_t len0 = 0;
+  uint32_t len1 = 0;
+
+  while (1)
+    {
+      const uint32_t delta = pos - cur_match;
+      uint32_t *pair
+          = son
+            + ((cyclic_pos - delta + (delta > cyclic_pos ? cyclic_size : 0))
+               << 1);
+      const uint8_t *pb = cur - delta;
+      uint32_t len = my_min (len0, len1);
+
+      if (pb[len] == cur[len])
+        {
+          while (++len != len_limit)
+            if (pb[len] != cur[len])
+              break;
+
+          if (len == len_limit)
+            {
+              *ptr1 = pair[0];
+              *ptr0 = pair[1];
+              return;
+            }
+        }
+
+      if (pb[len] < cur[len])
+        {
+          *ptr1 = cur_match;
+          ptr1 = pair + 1;
+          cur_match = *ptr1;
+          len1 = len;
+        }
+      else
+        {
+          *ptr0 = cur_match;
+          ptr0 = pair;
+          cur_match = *ptr0;
+          len0 = len;
+        }
+    }
+}