Message ID | 00f501d7f666$26b93cd0$742bb670$@nextmovesoftware.com |
---|---|
State | New |
Headers | From: "Roger Sayle" <roger@nextmovesoftware.com> To: "'GCC Patches'" <gcc-patches@gcc.gnu.org> Subject: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to memory. Date: Tue, 21 Dec 2021 12:27:41 -0000 List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> |
Series | PR target/103773: Fix wrong-code with -Oz from pop to memory. |
Commit Message
Roger Sayle
Dec. 21, 2021, 12:27 p.m. UTC
My apologies for the inconvenience.  The new support for -Oz using
push/pop for small integer constants on x86_64 is only a win/correct
for loading registers.  Fixed by adding !MEM_P tests in the appropriate
locations.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?

2021-12-21  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/103773
        * config/i386/i386.md (*movdi_internal): Only use short
        push/pop sequence for register (non-memory) destinations.
        (*movsi_internal): Likewise.

gcc/testsuite/ChangeLog
        PR target/103773
        * gcc.target/i386/pr103773.c: New test case.

Roger
--
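The failure mode described here can be illustrated outside the compiler. The sketch below is not GCC code: `store_4_bytes` and `store_8_bytes` are hypothetical helpers that model, assuming a little-endian target such as x86_64, the difference between a 4-byte SImode store and the 8-byte write that a `pop{q}` to a memory destination would perform.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: overwrite only the low 4 bytes of a 64-bit
   object, as a correct SImode store to memory would.  */
static uint64_t
store_4_bytes (uint64_t old, int32_t value)
{
  memcpy (&old, &value, sizeof value);
  return old;
}

/* Hypothetical helper: overwrite all 8 bytes with the sign-extended
   value, modelling what "push imm8; pop{q} mem" would write.  */
static uint64_t
store_8_bytes (uint64_t old, int32_t value)
{
  int64_t wide = value;   /* push sign-extends the immediate to 64 bits */
  memcpy (&old, &wide, sizeof wide);
  return old;
}
```

On a little-endian host, `store_4_bytes (0, -1)` leaves `0xffffffff` in the 64-bit object, which is the value the new pr103773.c test checks for, while `store_8_bytes (0, -1)` sets all 64 bits.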
Comments
On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
> My apologies for the inconvenience.  The new support for -Oz using
> push/pop for small integer constants on x86_64 is only a win/correct
> for loading registers.  Fixed by adding !MEM_P tests in the appropriate
> locations.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
> 2021-12-21  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/103773
>         * config/i386/i386.md (*movdi_internal): Only use short
>         push/pop sequence for register (non-memory) destinations.
>         (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
>         PR target/103773
>         * gcc.target/i386/pr103773.c: New test case.

Ouch, as pointed out in the PR, this approach clobbers the red zone.

Please revert the original patch.

Thanks,
Uros.

>
> Roger
> --
>
On Wed, Dec 22, 2021 at 9:10 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
> > [...]
>
> Ouch, as pointed out in the PR, this approach clobbers the red zone.
>
> Please revert the original patch.

*Maybe* we can use frame->red_zone_size here, but the frame is
recalculated several times during the compilation. I think it is just
too dangerous to use push/pop w.r.t. red zone clobbering.

Uros.
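For readers unfamiliar with the concern: the x86-64 System V ABI lets a function keep data in the 128 bytes below the stack pointer (the red zone) without adjusting %rsp. The toy model below is only an illustration with hypothetical names, not GCC code; it shows how a push/pop pair overwrites a value stored just below the stack pointer while leaving one stored further down untouched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 256 };

/* Hypothetical toy stack: rsp is an offset into mem, and the stack
   grows toward offset 0.  */
struct toy_stack
{
  unsigned char mem[STACK_BYTES];
  size_t rsp;
};

/* Model "push imm8": decrement rsp by 8, then store the sign-extended
   immediate at the new rsp.  */
static void
push_q (struct toy_stack *s, int8_t imm)
{
  int64_t v = imm;
  s->rsp -= 8;
  memcpy (&s->mem[s->rsp], &v, sizeof v);
}

/* Model "pop reg": load 8 bytes from rsp, then increment rsp by 8.  */
static int64_t
pop_q (struct toy_stack *s)
{
  int64_t v;
  memcpy (&v, &s->mem[s->rsp], sizeof v);
  s->rsp += 8;
  return v;
}

/* Keep LOCAL in the red zone at rsp - BELOW without moving rsp, load
   IMM into a "register" via push/pop, and return what the local holds
   afterwards.  */
static int64_t
local_after_push_pop (size_t below, int64_t local, int8_t imm)
{
  struct toy_stack s = { .rsp = STACK_BYTES };
  int64_t reg, seen;

  memcpy (&s.mem[s.rsp - below], &local, sizeof local);
  push_q (&s, imm);
  reg = pop_q (&s);
  (void) reg;   /* the constant arrives in the register as intended */
  memcpy (&seen, &s.mem[s.rsp - below], sizeof seen);
  return seen;
}
```

In this model a local kept at `rsp - 8` comes back as the pushed constant, whereas a local at `rsp - 16` survives, because a single push only touches the 8 bytes immediately below the stack pointer.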
Hi Uros,
Would you consider the following variant that disables this optimization when a
red zone is used by the current function?  You're right that cfun's red_zone_size is
recalculated dynamically, but ix86_red_zone_used should be a better "gate" given
that this logic resides very late during compilation, in the output templates, where
whether or not a red zone is used is known.

On CSiBE, disabling this optimization in non-leaf functions that use a red zone costs
219 bytes, but remains a significant win over -Os.  (Alas the absolute numbers aren't
comparable as this testing included the 0/-1 write to memory changes).

Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k check
with no new failures.

2021-12-22  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/103773
        * config/i386/i386.md (*movdi_internal): Only use short
        push/pop sequence for register (non-memory) destinations
        when the current function doesn't make use of a red zone.
        (*movsi_internal): Likewise.

gcc/testsuite/ChangeLog
        PR target/103773
        * gcc.target/i386/pr103773.c: New test case.

Please let me know what you think.  I'll revert, if this tweak doesn't address
your concerns.

Roger
--

> -----Original Message-----
> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: 22 December 2021 08:20
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: GCC Patches <gcc-patches@gcc.gnu.org>
> Subject: Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to
> memory.
>
> On Wed, Dec 22, 2021 at 9:10 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > On Tue, Dec 21, 2021 at 1:27 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
> > > [...]
> >
> > Ouch, as pointed out in the PR, this approach clobbers the red zone.
> >
> > Please revert the original patch.
>
> *Maybe* we can use frame->red_zone_size here, but the frame is recalculated
> several times during the compilation. I think it is just too dangerous to use
> push/pop w.r.t. red zone clobbering.
>
> Uros.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d25453f..489cede 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2217,7 +2217,9 @@
       if (optimize_size > 1
           && TARGET_64BIT
           && CONST_INT_P (operands[1])
-          && IN_RANGE (INTVAL (operands[1]), -128, 127))
+          && IN_RANGE (INTVAL (operands[1]), -128, 127)
+          && !MEM_P (operands[0])
+          && !ix86_red_zone_used)
         return "push{q}\t%1\n\tpop{q}\t%0";
       return "mov{l}\t{%k1, %k0|%k0, %k1}";
     }
@@ -2440,7 +2442,9 @@
         return "lea{l}\t{%E1, %0|%0, %E1}";
       else if (optimize_size > 1
                && CONST_INT_P (operands[1])
-               && IN_RANGE (INTVAL (operands[1]), -128, 127))
+               && IN_RANGE (INTVAL (operands[1]), -128, 127)
+               && !MEM_P (operands[0])
+               && !ix86_red_zone_used)
         {
           if (TARGET_64BIT)
             return "push{q}\t%1\n\tpop{q}\t%q0";
On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
> Hi Uros,
> Would you consider the following variant that disables this optimization when a
> red zone is used by the current function?  You're right that cfun's red_zone_size is
> recalculated dynamically, but ix86_red_zone_used should be a better "gate" given
> that this logic resides very late during compilation, in the output templates, where
> whether or not a red zone is used is known.
>
> On CSiBE, disabling this optimization in non-leaf functions that use a red zone costs
> 219 bytes, but remains a significant win over -Os.  (Alas the absolute numbers aren't
> comparable as this testing included the 0/-1 write to memory changes).
>
> Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and make -k check
> with no new failures.
>
> 2021-12-22  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/103773
>         * config/i386/i386.md (*movdi_internal): Only use short
>         push/pop sequence for register (non-memory) destinations
>         when the current function doesn't make use of a red zone.
>         (*movsi_internal): Likewise.
>
> gcc/testsuite/ChangeLog
>         PR target/103773
>         * gcc.target/i386/pr103773.c: New test case.
>
> Please let me know what you think.  I'll revert, if this tweak doesn't address
> your concerns.

Yes, using ix86_red_zone_used looks safe.

OTOH, is there a reason the transformation is not implemented via
peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
peephole2 pass is instanced well after register allocation.

Uros.
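The gate agreed on in this exchange (small constant, register destination, no red zone in use) can be sketched as a plain C predicate. This is an illustration only: `struct move_info` and `use_push_pop` are hypothetical names that mirror the shape of the `*movdi_internal` condition in the proposed patch, and the `optimize_size` field stands in for GCC's variable of the same name, where the patch treats a value above 1 as -Oz.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one constant move; not a GCC type.  */
struct move_info
{
  int optimize_size;      /* the patch tests optimize_size > 1 */
  bool target_64bit;      /* stands in for TARGET_64BIT */
  bool src_is_const_int;  /* stands in for CONST_INT_P (operands[1]) */
  int64_t src_value;      /* stands in for INTVAL (operands[1]) */
  bool dest_is_mem;       /* stands in for MEM_P (operands[0]) */
  bool red_zone_used;     /* stands in for ix86_red_zone_used */
};

static bool
in_range (int64_t v, int64_t lo, int64_t hi)
{
  return v >= lo && v <= hi;
}

/* Return true when the short push/pop sequence would be chosen under
   the gated condition.  */
static bool
use_push_pop (const struct move_info *m)
{
  return m->optimize_size > 1
         && m->target_64bit
         && m->src_is_const_int
         && in_range (m->src_value, -128, 127)
         && !m->dest_is_mem
         && !m->red_zone_used;
}
```

Setting either `dest_is_mem` or `red_zone_used` makes the predicate false, so in this model the move falls back to the ordinary `mov` template.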
On Wed, Dec 22, 2021 at 11:26 AM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
> > [...]
>
> Yes, using ix86_red_zone_used looks safe.
>
> OTOH, is there a reason the transformation is not implemented via
> peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
> peephole2 pass is instanced well after register allocation.

Something like the attached patch.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 58b10643fcb..e5d603f0025 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2514,6 +2514,24 @@
            ]
            (symbol_ref "true")))])
 
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+        (match_operand:SWI48 1 "const_int_operand"))]
+  "optimize_insn_for_size_p () && optimize_size > 1
+   && IN_RANGE (INTVAL (operands[1]), -128, 127)
+   && !ix86_red_zone_used"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (match_dup 3))]
+{
+  if (GET_MODE (operands[0]) != word_mode)
+    operands[0] = gen_rtx_REG (word_mode, REGNO (operands[0]));
+
+  operands[2] = gen_rtx_MEM (word_mode,
+                             gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx));
+  operands[3] = gen_rtx_MEM (word_mode,
+                             gen_rtx_POST_INC (Pmode, stack_pointer_rtx));
+})
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
     "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")
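The size motivation behind rewriting a constant load as push/pop can be checked with simple byte counting. The sketch below relies on the standard x86-64 encodings (`mov $imm32, %r32` is opcode B8+r followed by a 4-byte immediate, `push $imm8` is 6A ib, `pop %r64` is 58+r, and registers r8 to r15 need one REX prefix byte). The function names are hypothetical, `regno` is the hardware register number 0 to 15, and the model deliberately ignores the longer `movq` form used for sign-extended 64-bit immediates.

```c
/* Bytes for "mov $imm32, %r32": one opcode byte plus a 4-byte
   immediate, plus a REX prefix for r8d..r15d.  */
static int
mov_imm32_bytes (int regno)
{
  return 1 + 4 + (regno >= 8);
}

/* Bytes for "push $imm8; pop %r64": two bytes for the push, one for
   the pop, plus a REX prefix on the pop for r8..r15.  */
static int
push_pop_bytes (int regno)
{
  return 2 + 1 + (regno >= 8);
}

/* Bytes saved by the push/pop sequence for a given register.  */
static int
bytes_saved (int regno)
{
  return mov_imm32_bytes (regno) - push_pop_bytes (regno);
}
```

Under these assumptions the push/pop form is two bytes shorter per constant load for every general register, which is consistent with the thread treating it as a size-only (-Oz) transformation.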
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index d25453f..e596f8b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2217,7 +2217,8 @@
       if (optimize_size > 1
           && TARGET_64BIT
           && CONST_INT_P (operands[1])
-          && IN_RANGE (INTVAL (operands[1]), -128, 127))
+          && IN_RANGE (INTVAL (operands[1]), -128, 127)
+          && !MEM_P (operands[0]))
         return "push{q}\t%1\n\tpop{q}\t%0";
       return "mov{l}\t{%k1, %k0|%k0, %k1}";
     }
@@ -2440,7 +2441,8 @@
         return "lea{l}\t{%E1, %0|%0, %E1}";
       else if (optimize_size > 1
                && CONST_INT_P (operands[1])
-               && IN_RANGE (INTVAL (operands[1]), -128, 127))
+               && IN_RANGE (INTVAL (operands[1]), -128, 127)
+               && !MEM_P (operands[0]))
         {
           if (TARGET_64BIT)
             return "push{q}\t%1\n\tpop{q}\t%q0";
diff --git a/gcc/testsuite/gcc.target/i386/pr103773.c b/gcc/testsuite/gcc.target/i386/pr103773.c
new file mode 100644
index 0000000..1e4b8ce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103773.c
@@ -0,0 +1,12 @@
+/* { dg-do run } */
+/* { dg-options "-Oz" } */
+
+unsigned long long x;
+
+int main (void)
+{
+  __builtin_memset (&x, 0xff, 4);
+  if (x != 0xffffffff)
+    __builtin_abort ();
+  return 0;
+}