From patchwork Thu Jul 7 10:09:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Lulu Cheng
X-Patchwork-Id: 55819
From: Lulu Cheng
To: gcc-patches@gcc.gnu.org
Cc: xuchenghua@loongson.cn, Lulu Cheng, i@xen0n.name
Subject: [PATCH v2] LoongArch: Modify fp_sp_offset and gp_sp_offset's
 calculation method when frame->mask or frame->fmask is zero.
Date: Thu, 7 Jul 2022 18:09:50 +0800
Message-Id: <20220707100950.2834573-1-chenglulu@loongson.cn>
X-Mailer: git-send-email 2.31.1
List-Id: Gcc-patches mailing list

On the LoongArch architecture, when the stack drop is too large, the process of
dropping the stack is divided into two steps.
step1: After dropping the stack, save callee saved registers on the stack.
step2: The rest of it.

The stack drop operation is optimized when frame->total_size minus
frame->fp_sp_offset is an integer multiple of 4096, which reduces the number
of instructions required to drop the stack.  However, this optimization was
not effective because of the original calculation method.

Consider the following case:

#include <stdio.h>

int main()
{
  char buf[1024 * 12];
  printf ("%p\n", buf);
  return 0;
}

As the generated assembly shows, the old GCC emits two more instructions
than the new GCC: lines 14 and 24 of the old output.

   new                                     old
 10 main:                             │ 11 main:
 11   addi.d  $r3,$r3,-16             │ 12   lu12i.w $r13,-12288>>12
 12   lu12i.w $r13,-12288>>12         │ 13   addi.d  $r3,$r3,-2032
 13   lu12i.w $r5,-12288>>12          │ 14   ori     $r13,$r13,2016
 14   lu12i.w $r12,12288>>12          │ 15   lu12i.w $r5,-12288>>12
 15   st.d    $r1,$r3,8               │ 16   lu12i.w $r12,12288>>12
 16   add.d   $r12,$r12,$r5           │ 17   st.d    $r1,$r3,2024
 17   add.d   $r3,$r3,$r13            │ 18   add.d   $r12,$r12,$r5
 18   add.d   $r5,$r12,$r3            │ 19   add.d   $r3,$r3,$r13
 19   la.local $r4,.LC0               │ 20   add.d   $r5,$r12,$r3
 20   bl      %plt(printf)            │ 21   la.local $r4,.LC0
 21   lu12i.w $r13,12288>>12          │ 22   bl      %plt(printf)
 22   add.d   $r3,$r3,$r13            │ 23   lu12i.w $r13,8192>>12
 23   ld.d    $r1,$r3,8               │ 24   ori     $r13,$r13,2080
 24   or      $r4,$r0,$r0             │ 25   add.d   $r3,$r3,$r13
 25   addi.d  $r3,$r3,16              │ 26   ld.d    $r1,$r3,2024
 26   jr      $r1                     │ 27   or      $r4,$r0,$r0
                                      │ 28   addi.d  $r3,$r3,2032
                                      │ 29   jr      $r1

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_compute_frame_info):
	Modify the calculation of fp_sp_offset and gp_sp_offset: when
	frame->mask or frame->fmask is zero, don't subtract UNITS_PER_WORD
	or UNITS_PER_FP_REG.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/prolog-opt.c: New test.
---
 gcc/config/loongarch/loongarch.cc             | 12 ++++++--
 .../gcc.target/loongarch/prolog-opt.c         | 29 +++++++++++++++++++
 2 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/prolog-opt.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index d72b256df51..5c9a33c14f7 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -917,8 +917,12 @@ loongarch_compute_frame_info (void)
   frame->frame_pointer_offset = offset;
   /* Next are the callee-saved FPRs.  */
   if (frame->fmask)
-    offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG);
-  frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
+    {
+      offset += LARCH_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG);
+      frame->fp_sp_offset = offset - UNITS_PER_FP_REG;
+    }
+  else
+    frame->fp_sp_offset = offset;
   /* Next are the callee-saved GPRs.  */
   if (frame->mask)
     {
@@ -931,8 +935,10 @@ loongarch_compute_frame_info (void)
 	frame->save_libcall_adjustment = x_save_size;
 
       offset += x_save_size;
+      frame->gp_sp_offset = offset - UNITS_PER_WORD;
     }
-  frame->gp_sp_offset = offset - UNITS_PER_WORD;
+  else
+    frame->gp_sp_offset = offset;
   /* The hard frame pointer points above the callee-saved GPRs.  */
   frame->hard_frame_pointer_offset = offset;
   /* Above the hard frame pointer is the callee-allocated varags save area.  */
diff --git a/gcc/testsuite/gcc.target/loongarch/prolog-opt.c b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c
new file mode 100644
index 00000000000..7f611370aa4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/prolog-opt.c
@@ -0,0 +1,29 @@
+/* Test that LoongArch backend stack drop operation optimized.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=lp64d" } */
+/* { dg-final { scan-assembler "addi.d\t\\\$r3,\\\$r3,-16" } } */
+
+struct test
+{
+  int empty1[0];
+  double empty2[0];
+  int : 0;
+  float x;
+  long empty3[0];
+  long : 0;
+  float y;
+  unsigned : 0;
+  char empty4[0];
+};
+
+extern void callee (struct test);
+
+void
+caller (void)
+{
+  struct test test;
+  test.x = 114;
+  test.y = 514;
+  callee (test);
+}
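
To illustrate why the change matters, the C sketch below models how the size of the first stack-drop step might be chosen from frame->total_size and frame->fp_sp_offset. It is a simplified model written for this note, not code taken from GCC: the helper names (align_up, first_stack_step, old_fp_sp_offset, new_fp_sp_offset), the constants, and the selection rule itself are assumptions, picked so that they reproduce the 2032-byte and 16-byte first steps visible in the old and new assembly in the commit message.

```c
#include <assert.h>

/* Illustrative constants, assumed for this sketch (not copied from GCC).  */
enum
{
  STACK_ALIGN = 16,     /* stack alignment in bytes */
  REG_SIZE = 8,         /* bytes per saved register on a 64-bit target */
  IMM_RANGE = 4096,     /* span covered by a signed 12-bit immediate */
  MAX_FIRST_STEP = IMM_RANGE / 2 - STACK_ALIGN  /* 2032 */
};

/* Round N up to the next multiple of STACK_ALIGN.  */
static long
align_up (long n)
{
  return (n + STACK_ALIGN - 1) & -(long) STACK_ALIGN;
}

/* Old rule: always subtract one register slot, even when no FPR is saved.  */
static long
old_fp_sp_offset (long offset, int num_f_saved)
{
  if (num_f_saved)
    offset += align_up ((long) num_f_saved * REG_SIZE);
  return offset - REG_SIZE;
}

/* New rule: subtract the slot only when at least one FPR is saved.  */
static long
new_fp_sp_offset (long offset, int num_f_saved)
{
  if (num_f_saved)
    {
      offset += align_up ((long) num_f_saved * REG_SIZE);
      return offset - REG_SIZE;
    }
  return offset;
}

/* Assumed selection rule for the first stack adjustment: prefer the low
   12 bits of the frame size when they still cover the register save area,
   so that the second adjustment is a multiple of 4096.  */
static long
first_stack_step (long total_size, long fp_sp_offset)
{
  /* A small frame is allocated in a single step.  */
  if (total_size < IMM_RANGE / 2)
    return total_size;

  long min_first_step = align_up (total_size - fp_sp_offset);
  assert (min_first_step <= MAX_FIRST_STEP);

  long low_bits = total_size % IMM_RANGE;
  long min_second_step = total_size - MAX_FIRST_STEP;
  if (min_second_step >= IMM_RANGE / 2
      && low_bits < IMM_RANGE / 2
      && low_bits >= min_first_step)
    return low_bits;

  return MAX_FIRST_STEP;
}
```

For the 12 KiB buffer in the example, the assembly implies a total frame of 12304 bytes (16 + 12288), with 12288 bytes below the register save area. Under this model, old_fp_sp_offset (12288, 0) yields 12280, which forces a 32-byte minimum first step and makes first_stack_step fall back to 2032. new_fp_sp_offset (12288, 0) yields 12288, so the 16 low-order bytes are enough and the first step is 16, leaving a second step of exactly 12288.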