From patchwork Tue Dec 6 09:11:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Wang
X-Patchwork-Id: 61492
From: Feng Wang
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, wangfeng
Subject: [PATCH v2 1/1] RISC-V: Optimize the reverse conditions of rotate shift
Date: Tue, 6 Dec 2022 09:11:53 +0000
Message-Id: <20221206091153.27281-2-wangfeng@eswincomputing.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221206091153.27281-1-wangfeng@eswincomputing.com>
References: <20221206091153.27281-1-wangfeng@eswincomputing.com>
List-Id: Gcc-patches mailing list

From: wangfeng

There is no immediate operand for the "rol" instruction in the B extension,
so a constant rotate amount would first have to be loaded into a register.
But we can convert such a rotate to "rori" or "roriw" instead, which removes
the extra immediate-load instruction.  This patch therefore adds conditions
for reversing the rotate-shift direction during RTL expansion and RTL
optimization.  The direction is reversed when the following two conditions
are met at the same time:

1. The current insn_code does not exist, or its operand does not match,
   or the shift amount is beyond half the size of the machine mode;
2. The reversed insn_code exists and its operand matches.

Please refer to the following use case:

unsigned long foo2(unsigned long rs1)
{
    return (rs1 << 10) | (rs1 >> 54);
}

The current compiler output is:

	li	a1,10
	rol	a0,a0,a1

With this patch a single instruction is generated instead:

	rori	a0,a0,54

The missing "roriw" instruction output is also added to the RTL pattern.

Passed the linux-rv32imafdc-ilp32d-medany, linux-rv64imafdc-lp64d-medany,
newlib-rv32imafc-ilp32f-medany and newlib-rv64imafdc-lp64d-medany
regression tests.

gcc/ChangeLog:

	* config/riscv/bitmanip.md: Add "roriw" insn output.
	* expmed.cc (expand_shift_1): Call reverse_rotate_by_imm_p to
	decide whether to reverse the rotate direction when expanding
	GIMPLE to RTL.
	* rtl.h (reverse_rotate_by_imm_p): Add function declaration.
	* simplify-rtx.cc (reverse_rotate_by_imm_p): New function to
	decide whether to reverse the rotate direction when simplifying
	RTL.  Reverse when the following two conditions are met at the
	same time:
	1. The current insn_code does not exist, or its operand does not
	   match, or the shift amount is beyond half the size of the
	   machine mode;
	2. The reversed insn_code exists and its operand matches.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zbb-rol-ror-04.c: New test.
	* gcc.target/riscv/zbb-rol-ror-05.c: New test.
	* gcc.target/riscv/zbb-rol-ror-06.c: New test.
	* gcc.target/riscv/zbb-rol-ror-07.c: New test.
---
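A note for readers, not part of the change itself: the rewrite is
value-preserving because rotating left by AMT is the same as rotating right
by BITS - AMT.  Below is a minimal, self-contained C sketch of that
identity; the helpers rotl64/rotr64 are hypothetical names used only for
illustration and do not come from the compiler sources.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Plain C rotate helpers, only for illustration.  */
static uint64_t rotl64 (uint64_t x, unsigned amt)
{
  amt &= 63;
  return amt ? (x << amt) | (x >> (64 - amt)) : x;
}

static uint64_t rotr64 (uint64_t x, unsigned amt)
{
  amt &= 63;
  return amt ? (x >> amt) | (x << (64 - amt)) : x;
}

int main (void)
{
  uint64_t x = 0x0123456789abcdefULL;
  /* Rotating left by 10 equals rotating right by 64 - 10 = 54, which is
     why "li a1,10; rol a0,a0,a1" can become the single "rori a0,a0,54".  */
  assert (rotl64 (x, 10) == rotr64 (x, 54));
  printf ("rotate identity holds\n");
  return 0;
}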
 gcc/config/riscv/bitmanip.md                  |  4 +-
 gcc/expmed.cc                                 | 14 ++--
 gcc/rtl.h                                     |  1 +
 gcc/simplify-rtx.cc                           | 49 ++++++++++----
 .../gcc.target/riscv/zbb-rol-ror-04.c         | 52 +++++++++++++++
 .../gcc.target/riscv/zbb-rol-ror-05.c         | 24 +++++++
 .../gcc.target/riscv/zbb-rol-ror-06.c         | 36 +++++++++++
 .../gcc.target/riscv/zbb-rol-ror-07.c         | 64 +++++++++++++++++++
 8 files changed, 219 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index d17133d58c1..ba69d0134b2 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -292,9 +292,9 @@
 (define_insn "rotrsi3_sext"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(sign_extend:DI (rotatert:SI (match_operand:SI 1 "register_operand" "r")
-				     (match_operand:QI 2 "register_operand" "r"))))]
+				     (match_operand:QI 2 "arith_operand" "rI"))))]
   "TARGET_64BIT && TARGET_ZBB"
-  "rorw\t%0,%1,%2"
+  "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
 (define_insn "rotlsi3"
diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index b12b0e000c2..907c259c624 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -2475,7 +2475,7 @@ expand_dec (rtx target, rtx dec)
   if (value != target)
     emit_move_insn (target, value);
 }
- 
+
 /* Output a shift instruction for expression code CODE,
    with SHIFTED being the rtx for the value to shift,
    and AMOUNT the rtx for the amount to shift by.
@@ -2535,17 +2535,13 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted,
       op1 = SUBREG_REG (op1);
     }
 
-  /* Canonicalize rotates by constant amount.  If op1 is bitsize / 2,
-     prefer left rotation, if op1 is from bitsize / 2 + 1 to
-     bitsize - 1, use other direction of rotate with 1 .. bitsize / 2 - 1
-     amount instead.  */
+  /* Canonicalize rotates by constant amount.  If the condition of
+     reversing direction is met, then reverse the direction. */
   if (rotate
-      && CONST_INT_P (op1)
-      && IN_RANGE (INTVAL (op1), GET_MODE_BITSIZE (scalar_mode) / 2 + left,
-		   GET_MODE_BITSIZE (scalar_mode) - 1))
+      && reverse_rotate_by_imm_p (scalar_mode, left, op1))
     {
       op1 = gen_int_shift_amount (mode, (GET_MODE_BITSIZE (scalar_mode)
-					 - INTVAL (op1)));
+					 - INTVAL (op1)));
       left = !left;
       code = left ? LROTATE_EXPR : RROTATE_EXPR;
     }
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 7a8c4709257..1daab747b6a 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3566,6 +3566,7 @@
 extern bool val_signbit_known_set_p (machine_mode, unsigned HOST_WIDE_INT);
 extern bool val_signbit_known_clear_p (machine_mode,
				       unsigned HOST_WIDE_INT);
+extern bool reverse_rotate_by_imm_p (machine_mode, unsigned int, rtx);
 
 /* In reginfo.cc */
 extern machine_mode choose_hard_reg_mode (unsigned int, unsigned int,
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index fc0d6c3ca2a..ed9399ce28e 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2726,6 +2726,32 @@ simplify_context::simplify_distributive_operation (rtx_code code,
   return NULL_RTX;
 }
 
+/* Reverse if the below two conditions are met at the same time,
+   1. The current insn_code doesn't exist or it's operand doesn't match,
+      or the shift amount is beyond the half size of the machine mode;
+   2. The reversed insn_code exists and it's operand matches.  */
+
+bool reverse_rotate_by_imm_p (machine_mode mode, unsigned int left, rtx op1)
+{
+  if (!CONST_INT_P (op1))
+    return false;
+
+  optab binoptab = left ? rotl_optab : rotr_optab;
+  optab re_binoptab = left ? rotr_optab : rotl_optab;
+
+  enum insn_code icode = optab_handler (binoptab, mode);
+  enum insn_code re_icode = optab_handler (re_binoptab, mode);
+  if (((icode == CODE_FOR_nothing)
+       || (!insn_operand_matches (icode, 2, op1))
+       || (IN_RANGE (INTVAL (op1),
+		     GET_MODE_UNIT_PRECISION (mode) / 2 + left,
+		     GET_MODE_UNIT_PRECISION (mode) - 1)))
+      && (re_icode != CODE_FOR_nothing)
+      && (insn_operand_matches (re_icode, 2, op1)))
+    return true;
+  return false;
+}
+
 /* Subroutine of simplify_binary_operation.  Simplify a binary operation
    CODE with result mode MODE, operating on OP0 and OP1.  If OP0 and/or
    OP1 are constant pool references, TRUEOP0 and TRUEOP1 represent the
@@ -4077,21 +4103,16 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
     case ROTATE:
       if (trueop1 == CONST0_RTX (mode))
	return op0;
-      /* Canonicalize rotates by constant amount.  If op1 is bitsize / 2,
-	 prefer left rotation, if op1 is from bitsize / 2 + 1 to
-	 bitsize - 1, use other direction of rotate with 1 .. bitsize / 2 - 1
-	 amount instead.  */
+      /* Canonicalize rotates by constant amount.  If the condition of
+	 reversing direction is met, then reverse the direction. */
 #if defined(HAVE_rotate) && defined(HAVE_rotatert)
-      if (CONST_INT_P (trueop1)
-	  && IN_RANGE (INTVAL (trueop1),
-		       GET_MODE_UNIT_PRECISION (mode) / 2 + (code == ROTATE),
-		       GET_MODE_UNIT_PRECISION (mode) - 1))
-	{
-	  int new_amount = GET_MODE_UNIT_PRECISION (mode) - INTVAL (trueop1);
-	  rtx new_amount_rtx = gen_int_shift_amount (mode, new_amount);
-	  return simplify_gen_binary (code == ROTATE ? ROTATERT : ROTATE,
-				      mode, op0, new_amount_rtx);
-	}
+      if (reverse_rotate_by_imm_p (mode, (code == ROTATE), trueop1))
+	{
+	  int new_amount = GET_MODE_UNIT_PRECISION (mode) - INTVAL (trueop1);
+	  rtx new_amount_rtx = gen_int_shift_amount (mode, new_amount);
+	  return simplify_gen_binary (code == ROTATE ? ROTATERT : ROTATE,
+				      mode, op0, new_amount_rtx);
+	}
 #endif
       /* FALLTHRU */
     case ASHIFTRT:
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
new file mode 100644
index 00000000000..08053484cb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+**	rori	a0,a0,34
+**	ret
+*/
+unsigned long foo1 (unsigned long rs1)
+{ return (rs1 >> (34)) | (rs1 << 30); }
+
+/*
+**foo2:
+**	rori	a0,a0,54
+**	ret
+*/
+unsigned long foo2(unsigned long rs1)
+{
+    return (rs1 << 10) | (rs1 >> 54);
+}
+
+/*
+**foo3:
+**	roriw	a0,a0,20
+**	ret
+*/
+unsigned int foo3(unsigned int rs1)
+{
+    return (rs1 >> 20) | (rs1 << 12);
+}
+
+/*
+**foo4:
+**	roriw	a0,a0,22
+**	ret
+*/
+unsigned int foo4(unsigned int rs1)
+{
+    return (rs1 << 10) | (rs1 >> 22);
+}
+
+/*
+**foo5:
+**	rorw	a0,a0,a1
+**	ret
+*/
+unsigned int foo5(unsigned int rs1, unsigned int rs2)
+{
+    return (rs1 >> rs2) | (rs1 << (32 - rs2));
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
new file mode 100644
index 00000000000..85090b1b0fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zbb -mabi=ilp32 -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+**	rori	a0,a0,20
+**	ret
+*/
+unsigned int foo1(unsigned int rs1)
+{
+    return (rs1 >> 20) | (rs1 << 12);
+}
+
+/*
+**foo2:
+**	rori	a0,a0,22
+**	ret
+*/
+unsigned int foo2(unsigned int rs1)
+{
+    return (rs1 << 10) | (rs1 >> 22);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c
new file mode 100644
index 00000000000..70b79abb6ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-06.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+**	roriw	a0,a0,14
+**	ret
+*/
+unsigned int foo1 (unsigned int rs1)
+{ return ((rs1 >> 14) | (rs1 << 18)); }
+
+/*
+**foo2:
+**	roriw	a0,a0,18
+**	ret
+*/
+unsigned int foo2 (unsigned int rs1)
+{ return ((rs1 >> 18) | (rs1 << 14)); }
+
+/*
+**foo3:
+**	roriw	a0,a0,18
+**	ret
+*/
+unsigned int foo3 (unsigned int rs1)
+{ return ((rs1 << 14) | (rs1 >> 18)); }
+
+/*
+**foo4:
+**	roriw	a0,a0,14
+**	ret
+*/
+unsigned int foo4 (unsigned int rs1)
+{ return ((rs1 << 18) | (rs1 >> 14)); }
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c
new file mode 100644
index 00000000000..3b6ab385a85
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-07.c
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -fno-lto -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**foo1:
+**	rori	a0,a0,34
+**	ret
+*/
+unsigned long foo1 (unsigned long rs1)
+{
+    unsigned long tempt;
+    tempt = rs1 >> 30;
+    tempt = tempt << 2;
+    tempt = tempt >> 6;
+    rs1 = tempt | (rs1 << 30);
+    return rs1 ;
+}
+
+/*
+**foo2:
+**	rori	a0,a0,24
+**	ret
+*/
+unsigned long foo2 (unsigned long rs1)
+{
+    unsigned long tempt;
+    tempt = rs1 >> 20;
+    tempt = tempt << 2;
+    tempt = tempt >> 6;
+    rs1 = tempt | (rs1 << 40);
+    return rs1 ;
+}
+
+/*
+**foo3:
+**	rori	a0,a0,40
+**	ret
+*/
+unsigned long foo3 (unsigned long rs1)
+{
+    unsigned long tempt;
+    tempt = rs1 << 20;
+    tempt = tempt >> 2;
+    tempt = tempt << 6;
+    rs1 = tempt | (rs1 >> 40);
+    return rs1 ;
+}
+
+/*
+**foo4:
+**	rori	a0,a0,20
+**	ret
+*/
+unsigned long foo4 (unsigned long rs1)
+{
+    unsigned long tempt;
+    tempt = rs1 << 40;
+    tempt = tempt >> 2;
+    tempt = tempt << 6;
+    rs1 = tempt | (rs1 >> 20);
+    return rs1 ;
+}
\ No newline at end of file
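
Not part of the patch: the following standalone C sketch models, under
simplifying assumptions, the decision made by the new reverse_rotate_by_imm_p.
The parameters direct_ok and reversed_ok are hypothetical stand-ins for the
optab_handler/insn_operand_matches checks the real function performs;
everything else mirrors the two conditions described in the commit message.

#include <stdbool.h>
#include <stdio.h>

/* Reverse the rotate direction when the direct form is unusable (no
   pattern, operand mismatch, or amount in the upper half of the mode)
   and the reversed form is usable.  */
static bool
should_reverse (unsigned bits, unsigned amount, unsigned left,
                bool direct_ok, bool reversed_ok)
{
  /* Mirrors IN_RANGE (amount, bits / 2 + left, bits - 1); the "+ left"
     keeps the existing preference for left rotation at exactly bits / 2.  */
  bool upper_half = amount >= bits / 2 + left && amount <= bits - 1;
  return (!direct_ok || upper_half) && reversed_ok;
}

int
main (void)
{
  /* rv64 with Zbb: "rol" has no immediate form but "rori" does, so a
     rotate left by 10 is reversed and emitted as rori by 64 - 10 = 54.  */
  printf ("reverse rol by 10: %d\n", should_reverse (64, 10, 1, false, true));

  /* A rotate right by 54 already matches rori; reversing it would need an
     immediate "rol", which does not exist, so it is kept as is.  */
  printf ("reverse ror by 54: %d\n", should_reverse (64, 54, 0, true, false));
  return 0;
}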