From patchwork Thu Dec 23 09:35:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Roger Sayle X-Patchwork-Id: 49208 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E3D123858029 for ; Thu, 23 Dec 2021 09:36:05 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from server.nextmovesoftware.com (server.nextmovesoftware.com [162.254.253.69]) by sourceware.org (Postfix) with ESMTPS id 9ECEB385840E for ; Thu, 23 Dec 2021 09:35:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9ECEB385840E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=nextmovesoftware.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=nextmovesoftware.com DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=nextmovesoftware.com; s=default; h=Content-Type:MIME-Version:Message-ID: Date:Subject:Cc:To:From:Sender:Reply-To:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=UYO7+mljsTOYNR4t+IqAEJlYnQglcgzG94eZpGASEIc=; b=GKirzHMgzLYFYF9KZc/LDo0SrI VmlYG2kHgYwxSWLNCNyaiY0+oSJMVIVxoR90zYMXAaGfGUKSDB8cEMU08lBg4m8XAKiFawCheZ+Oy ejgnXvZk0HMNpUSOWl/cZ3+T6CNuFKOxikv8heepGvXOQ9ikAtwQ6PiWqeDjDeHVaaXXcsoK+QNic 8MnPSyRqMEdlZPX6ug7S6UVFd4BbYBfl5Eivet1A1HOeDjRvYCQBkotB32HuWDdldnWz+pXCOVE1b nPnx1pW9vY7kWmFg/OKVQwVKH30dkL43tV2o0sshqP91w8B4vdo403k6Mvbg2Ul81nLNa6uipJoob WQA3LAtw==; Received: from host86-160-23-130.range86-160.btcentralplus.com ([86.160.23.130]:52232 helo=Dell) by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1n0KVo-000233-QB; Thu, 23 Dec 2021 04:35:49 -0500 From: "Roger Sayle" To: "'Uros Bizjak'" Subject: [PATCH take #3] PR target/103773: Fix wrong-code with -Oz from pop to memory. Date: Thu, 23 Dec 2021 09:35:46 -0000 Message-ID: <01cc01d7f7e0$77a92df0$66fb89d0$@nextmovesoftware.com> MIME-Version: 1.0 X-Mailer: Microsoft Outlook 16.0 Thread-Index: Adf33baIjscdvGdXTSayvukn/sZ8Gg== Content-Language: en-gb X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com X-AntiAbuse: Original Domain - gcc.gnu.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - nextmovesoftware.com X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id: roger@nextmovesoftware.com X-Authenticated-Sender: server.nextmovesoftware.com: roger@nextmovesoftware.com X-Source: X-Source-Args: X-Source-Dir: X-Spam-Status: No, score=-9.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, LIKELY_SPAM_BODY, RCVD_IN_BARRACUDACENTRAL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: 'GCC Patches' Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Hi Uros, A huge thanks for the list of suggested improvements to the -Oz related patches. I've combined them altogether in the submission below, which makes sense now that everything is implemented using peephole2. The implementation of push/pop via peephole2 is exactly as you've suggested, also checking that the immediate value isn't zero (the value -1 is still a size win over OR), and extended to include HImode (where it is a win), but not QImode (where it isn't). For writes to memory, I've extended *mov_or to allow memory destinations and HImode, but I've introduced a new *mov_and for writing zero to memory, rather than complicate/overload *mov_xor (for example, it doesn't take an immediate). In this form, only a single peephole2 is needed, that adds a clobber to the instruction if the flags are dead. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check with no new failures, and the new testcase checked both with and without -m32. Ok for mainline? 2021-12-23 Roger Sayle Uroš Bizjak gcc/ChangeLog PR target/103773 * config/i386/i386.md (*mov_and): New define_insn for writing a zero to memory using AND. (*mov_or): Extend to allow memory destination and HImode. (*movdi_internal): Remove -Oz push/pop optimization from here. (*movsi_internal): Likewise. (peephole2): Perform -Oz push/pop optimization here, only for register destinations, values other than zero, and in functions that don't used the red zone. (peephole2): With -Oz, convert writes of 0 or -1 to memory into their clobber forms, i.e. *mov_and and *mov_or resp. gcc/testsuite/ChangeLog PR target/103773 * gcc.target/pr103773-2.c: New test case. * gcc.target/pr103773.c: New test case. Many thanks again for your help. Roger --- > -----Original Message----- > From: Uros Bizjak > Sent: 22 December 2021 15:24 > To: Roger Sayle > Subject: Re: [PATCH] PR target/103773: Fix wrong-code with -Oz from pop to > memory. > > On Wed, Dec 22, 2021 at 3:19 PM Uros Bizjak wrote: > > > > On Wed, Dec 22, 2021 at 2:57 PM Uros Bizjak wrote: > > > > > > On Wed, Dec 22, 2021 at 2:12 PM Roger Sayle > wrote: > > > > > > > > > > > > Hi Uros, > > > > I'm bootstrapping and regression testing your proposed patch now > > > > (including the removal/reversion of my pieces in *mov[sd]i2_internal). > > > > Many thanks for all of your help with this. > > > > > > Probably you want to avoid transformation of loads of 0 and -1, > > > which should still be implemented via xor %reg, %ref and or $-1, %eax. > > > > This constraint will result in optimal conversion approach: > > > > + "optimize_insn_for_size_p () && optimize_size > 1 > > + && operands[1] != const0_rtx && operands[1] != constm1_rtx > > + && IN_RANGE (INTVAL (operands[1]), -128, 127) > > + && !ix86_red_zone_used" > > I think we should also convert HImode and QImode initializations, the pattern > supports it by changing the mode iterator to SWI. > > Uros. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 58b1064..b709a3e 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2028,9 +2028,19 @@ (set_attr "mode" "SI") (set_attr "length_immediate" "0")]) +(define_insn "*mov_and" + [(set (match_operand:SWI248 0 "memory_operand" "=m") + (match_operand:SWI248 1 "const0_operand")) + (clobber (reg:CC FLAGS_REG))] + "reload_completed" + "and{}\t{%1, %0|%0, %1}" + [(set_attr "type" "alu1") + (set_attr "mode" "") + (set_attr "length_immediate" "1")]) + (define_insn "*mov_or" - [(set (match_operand:SWI48 0 "register_operand" "=r") - (match_operand:SWI48 1 "constm1_operand")) + [(set (match_operand:SWI248 0 "nonimmediate_operand" "=rm") + (match_operand:SWI248 1 "constm1_operand")) (clobber (reg:CC FLAGS_REG))] "reload_completed" "or{}\t{%1, %0|%0, %1}" @@ -2218,14 +2228,7 @@ case TYPE_IMOV: gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); if (get_attr_mode (insn) == MODE_SI) - { - if (optimize_size > 1 - && TARGET_64BIT - && CONST_INT_P (operands[1]) - && IN_RANGE (INTVAL (operands[1]), -128, 127)) - return "push{q}\t%1\n\tpop{q}\t%0"; - return "mov{l}\t{%k1, %k0|%k0, %k1}"; - } + return "mov{l}\t{%k1, %k0|%k0, %k1}"; else if (which_alternative == 4) return "movabs{q}\t{%1, %0|%0, %1}"; else if (ix86_use_lea_for_mov (insn, operands)) @@ -2443,14 +2446,6 @@ gcc_assert (!flag_pic || LEGITIMATE_PIC_OPERAND_P (operands[1])); if (ix86_use_lea_for_mov (insn, operands)) return "lea{l}\t{%E1, %0|%0, %E1}"; - else if (optimize_size > 1 - && CONST_INT_P (operands[1]) - && IN_RANGE (INTVAL (operands[1]), -128, 127)) - { - if (TARGET_64BIT) - return "push{q}\t%1\n\tpop{q}\t%q0"; - return "push{l}\t%1\n\tpop{l}\t%0"; - } else return "mov{l}\t{%1, %0|%0, %1}"; @@ -2514,6 +2509,34 @@ ] (symbol_ref "true")))]) +(define_peephole2 + [(set (match_operand:SWI248 0 "general_reg_operand") + (match_operand:SWI248 1 "const_int_operand"))] + "optimize_insn_for_size_p () && optimize_size > 1 + && operands[1] != const0_rtx + && IN_RANGE (INTVAL (operands[1]), -128, 127) + && !ix86_red_zone_used" + [(set (match_dup 2) (match_dup 1)) + (set (match_dup 0) (match_dup 3))] +{ + if (GET_MODE (operands[0]) != word_mode) + operands[0] = gen_rtx_REG (word_mode, REGNO (operands[0])); + + operands[2] = gen_rtx_MEM (word_mode, + gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx)); + operands[3] = gen_rtx_MEM (word_mode, + gen_rtx_POST_INC (Pmode, stack_pointer_rtx)); +}) + +(define_peephole2 + [(set (match_operand:SWI248 0 "memory_operand") + (match_operand:SWI248 1 "const_int_operand"))] + "(operands[1] == const0_rtx || operands[1] == constm1_rtx) + && optimize_insn_for_size_p () && optimize_size > 1 + && peep2_regno_dead_p (0, FLAGS_REG)" + [(parallel [(set (match_dup 0) (match_dup 1)) + (clobber (reg:CC FLAGS_REG))])]) + (define_insn "*movhi_internal" [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m") diff --git a/gcc/testsuite/gcc.target/i386/pr103773-2.c b/gcc/testsuite/gcc.target/i386/pr103773-2.c new file mode 100644 index 0000000..9dafebd --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr103773-2.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-Oz" } */ +short s; +int i; +long long l; + +void s0() { s = 0; } +void sm1() { s = -1; } +void i0() { i = 0; } +void im1() { i = -1; } +void l0() { l = 0; } +void lm1() { l = -1; } + +/* { dg-final { scan-assembler-not "\tmov\[wlq\]\t\\\$0," } } */ +/* { dg-final { scan-assembler-not "\tmov\[wlq\]\t\\\$-1," } } */ +/* { dg-final { scan-assembler "\tandw\t\\\$0," } } */ +/* { dg-final { scan-assembler "\torw\t\\\$-1," } } */ +/* { dg-final { scan-assembler "\torl\t\\\$-1," } } */ + diff --git a/gcc/testsuite/gcc.target/i386/pr103773.c b/gcc/testsuite/gcc.target/i386/pr103773.c new file mode 100644 index 0000000..1e4b8ce --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr103773.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-options "-Oz" } */ + +unsigned long long x; + +int main (void) +{ + __builtin_memset (&x, 0xff, 4); + if (x != 0xffffffff) + __builtin_abort (); + return 0; +}