From patchwork Thu Jul 7 10:30:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Lulu Cheng X-Patchwork-Id: 55820 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BA940385C318 for ; Thu, 7 Jul 2022 10:31:17 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from loongson.cn (mail.loongson.cn [114.242.206.163]) by sourceware.org (Postfix) with ESMTP id EF0CA3858D32 for ; Thu, 7 Jul 2022 10:31:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org EF0CA3858D32 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn Received: from 5.5.5 (unknown [10.2.5.5]) by mail.loongson.cn (Coremail) with SMTP id AQAAf9DxD9HetcZiquoOAA--.2188S2; Thu, 07 Jul 2022 18:30:58 +0800 (CST) From: Lulu Cheng To: gcc-patches@gcc.gnu.org Subject: [PATCH v3] LoongArch: Modify fp_sp_offset and gp_sp_offset's calculation method when frame->mask or frame->fmask is zero. Date: Thu, 7 Jul 2022 18:30:50 +0800 Message-Id: <20220707103050.2849360-1-chenglulu@loongson.cn> X-Mailer: git-send-email 2.31.1 MIME-Version: 1.0 X-CM-TRANSID: AQAAf9DxD9HetcZiquoOAA--.2188S2 X-Coremail-Antispam: 1UD129KBjvJXoWxXF15KF4kJrWUZF47WF4rKrg_yoWrWryxp3 9xCwsIqr4kGFyxCrsFgry8ZFs5Jr9rG3y2gFZFqry0kwsxtry2qF1kKFn2ya48G3s5Aw4a vF1rA3ZIga1DAaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUkS14x267AKxVWUJVW8JwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK02 1l84ACjcxK6xIIjxv20xvE14v26F1j6w1UM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4U JVWxJr1l84ACjcxK6I8E87Iv67AKxVWxJr0_GcWl84ACjcxK6I8E87Iv6xkF7I0E14v26r xl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj 6xIIjxv20xvE14v26r1Y6r17McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr 0_Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7MxkIecxEwVCm-wCF 04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r 18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_JF0_Jw1lIxkGc2Ij64vI r41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Jr0_Gr 1lIxAIcVCF04k26cxKx2IYs7xG6Fyj6rWUJwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY 6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUYDGYDUUUU X-CM-SenderInfo: xfkh0wpoxo3qxorr0wxvrqhubq/ X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: xuchenghua@loongson.cn, Lulu Cheng , i@xen0n.name Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" update testsuite. -------------------------- Under the LA architecture, when the stack is dropped too far, the process of dropping the stack is divided into two steps. step1: After dropping the stack, save callee saved registers on the stack. step2: The rest of it. The stack drop operation is optimized when frame->total_size minus frame->sp_fp_offset is an integer multiple of 4096, can reduce the number of instructions required to drop the stack. However, this optimization is not effective because of the original calculation method The following case: int main() { char buf[1024 * 12]; printf ("%p\n", buf); return 0; } As you can see from the generated assembler, the old GCC has two more instructions than the new GCC, lines 14 and line 24. new old 10 main: │ 11 main: 11 addi.d $r3,$r3,-16 │ 12 lu12i.w $r13,-12288>>12 12 lu12i.w $r13,-12288>>12 │ 13 addi.d $r3,$r3,-2032 13 lu12i.w $r5,-12288>>12 │ 14 ori $r13,$r13,2016 14 lu12i.w $r12,12288>>12 │ 15 lu12i.w $r5,-12288>>12 15 st.d $r1,$r3,8 │ 16 lu12i.w $r12,12288>>12 16 add.d $r12,$r12,$r5 │ 17 st.d $r1,$r3,2024 17 add.d $r3,$r3,$r13 │ 18 add.d $r12,$r12,$r5 18 add.d $r5,$r12,$r3 │ 19 add.d $r3,$r3,$r13 19 la.local $r4,.LC0 │ 20 add.d $r5,$r12,$r3 20 bl %plt(printf) │ 21 la.local $r4,.LC0 21 lu12i.w $r13,12288>>12 │ 22 bl %plt(printf) 22 add.d $r3,$r3,$r13 │ 23 lu12i.w $r13,8192>>12 23 ld.d $r1,$r3,8 │ 24 ori $r13,$r13,2080 24 or $r4,$r0,$r0 │ 25 add.d $r3,$r3,$r13 25 addi.d $r3,$r3,16 │ 26 ld.d $r1,$r3,2024 26 jr $r1 │ 27 or $r4,$r0,$r0 │ 28 addi.d $r3,$r3,2032 │ 29 jr $r1 gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_compute_frame_info): Modify fp_sp_offset and gp_sp_offset's calculation method, when frame->mask or frame->fmask is zero, don't minus UNITS_PER_WORD or UNITS_PER_FP_REG. gcc/testsuite/ChangeLog: * gcc.target/loongarch/prolog-opt.c: New test. --- gcc/config/loongarch/loongarch.cc | 12 +++++++++--- gcc/testsuite/gcc.target/loongarch/prolog-opt.c | 15 +++++++++++++++ 2 files changed, 24 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.target/loongarch/prolog-opt.c diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index d72b256df51..5c9a33c14f7 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -917,8 +917,12 @@ loongarch_compute_frame_info (void) frame->frame_pointer_offset = offset; /* Next are the callee-saved FPRs. */ if (frame->fmask) - offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG); - frame->fp_sp_offset = offset - UNITS_PER_FP_REG; + { + offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG); + frame->fp_sp_offset = offset - UNITS_PER_FP_REG; + } + else + frame->fp_sp_offset = offset; /* Next are the callee-saved GPRs. */ if (frame->mask) { @@ -931,8 +935,10 @@ loongarch_compute_frame_info (void) frame->save_libcall_adjustment = x_save_size; offset += x_save_size; + frame->gp_sp_offset = offset - UNITS_PER_WORD; } - frame->gp_sp_offset = offset - UNITS_PER_WORD; + else + frame->gp_sp_offset = offset; /* The hard frame pointer points above the callee-saved GPRs. */ frame->hard_frame_pointer_offset = offset; /* Above the hard frame pointer is the callee-allocated varags save area. */ diff --git a/gcc/testsuite/gcc.target/loongarch/prolog-opt.c b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c new file mode 100644 index 00000000000..c7bd71dde93 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c @@ -0,0 +1,15 @@ +/* Test that LoongArch backend stack drop operation optimized. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mabi=lp64d" } */ +/* { dg-final { scan-assembler "addi.d\t\\\$r3,\\\$r3,-16" } } */ + +#include + +int main() +{ + char buf[1024 * 12]; + printf ("%p\n", buf); + return 0; +} +