Message ID | 053901d88209$015b0b10$04112130$@nextmovesoftware.com |
---|---|
State | New |
Series |
[rs6000] PR target/105991: Recognize PLUS and XOR forms of rldimi.
|
|
Commit Message
Roger Sayle
June 17, 2022, 5:13 a.m. UTC
This patch addresses PR target/105991, where a change to prefer representing shifts and adds at the tree level as multiplications causes problems for the rldimi patterns in the powerpc backend. The issue is that rs6000.md models this pattern using IOR, and some variants that have the equivalent PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns. This is fixed in this patch by adding a define_insn_and_split to locally canonicalize the PLUS and XOR forms to the backend's preferred IOR form.

An alternative fix might be for the RTL optimizers to define a canonical form for these plus_xor_ior equivalent expressions, but the logical choice might be plus (which may appear in an addressing mode), and such a change may require a number of tweaks to update various backends (i.e. a more intrusive change than the one proposed here).

Many thanks to Marek Polacek for bootstrapping and regression testing this change without problems. Hopefully the new testcase is portable across powerpc's effective-targets. Ok for mainline?

2022-06-17  Roger Sayle  <roger@nextmovesoftware.com>
            Marek Polacek  <polacek@redhat.com>

gcc/ChangeLog
        PR target/105991
        * config/rs6000/rs6000.md (plus_xor): New code iterator.
        (*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

gcc/testsuite/ChangeLog
        PR target/105991
        * gcc.target/powerpc/pr105991.c: New test case.

Thanks in advance,
Roger
--
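As an illustration of why the PLUS, XOR and IOR forms are interchangeable here (this is not part of the patch), the following C sketch models the insert on 64-bit values. The function names are hypothetical. It assumes the mask is exactly the low `shift` bits, i.e. `mask == (1 << shift) - 1`, so the two operands have no set bits in common and addition can never carry.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only; none of these exist in GCC.
   Each one combines the low bits of LO (selected by MASK) with HI shifted
   left by SHIFT.  Assumption: 0 < shift < 64 and mask == (1 << shift) - 1.  */

uint64_t
insert_ior (uint64_t lo, uint64_t hi, unsigned shift, uint64_t mask)
{
  return (lo & mask) | (hi << shift);
}

uint64_t
insert_plus (uint64_t lo, uint64_t hi, unsigned shift, uint64_t mask)
{
  /* No bit is set in both operands, so no carry can occur.  */
  return (lo & mask) + (hi << shift);
}

uint64_t
insert_xor (uint64_t lo, uint64_t hi, unsigned shift, uint64_t mask)
{
  /* No bit is set in both operands, so XOR never clears a bit.  */
  return (lo & mask) ^ (hi << shift);
}
```

With `shift == 32` and `mask == 0xffffffff`, all three helpers return the same value for any inputs, which is the property the new define_insn_and_split relies on when it rewrites PLUS and XOR to IOR.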
Comments
Hi!

On Fri, Jun 17, 2022 at 07:13:37AM +0200, Roger Sayle wrote:
> This patch addresses PR target/105991, where a change to prefer representing
> shifts and adds at the tree level as multiplications causes problems for
> the rldimi patterns in the powerpc backend.

Because it now is converted to different RTL at expand time. Which the generic expand code does some premature optimisation on, which makes us end up with the addition instead of data manipulation insns. Oh well.

> The issue is that rs6000.md
> models this pattern using IOR, and some variants that have the equivalent
> PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
> This is fixed in this patch by adding a define_insn_and_split to locally
> canonicalize the PLUS and XOR forms to the backend's preferred IOR form.

Okay.

> An alternative fix might be for the RTL optimizers to define a canonical
> form for these plus_xor_ior equivalent expressions, but the logical
> choice might be plus (which may appear in an addressing mode), and such
> a change may require a number of tweaks to update various backends
> (i.e. a more intrusive change than the one proposed here).

This does not make sense in an address at all, thankfully :-)

The only sane canonicalisation for this is something like VEC_DUPLICATE but for submodes of integer modes, instead of the component mode of a vector mode. I don't feel this is worth trying to handle in general though.

> Many thanks to Marek Polacek for bootstrapping and regression testing
> this change without problems.

You have an account on the cfarm, it is quick and easy to test there :-) I recommend gcc135, a 32 core p9, with oodles of disk space :-)

> +; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
> +(define_code_iterator plus_xor [plus xor])
> +
> +(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> +        (plus_xor:GPR
> +          (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
> +                   (match_operand:GPR 4 "const_int_operand" "n"))
> +          (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> +                      (match_operand:SI 2 "const_int_operand" "n"))))]
> +  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"

exact_log2 returns -1 if its argument is not a power of two. Please test it is > 0 explicitly here: I don't think this splitter will work correctly otherwise. There shouldn't really be a shift by 0 ever of course, but it isn't invalid RTL.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +unsigned long long
> +foo (unsigned long long value)
> +{
> +  value &= 0xffffffff;
> +  value |= value << 32;
> +  return value;
> +}
> +/* { dg-final { scan-assembler "rldimi" } } */

Write
/* { dg-final { scan-assembler {\mrldimi\M} } } */
please.

Okay for trunk with those changes. Thanks!

Segher
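To make the exact_log2 remark concrete, here is a small C model of the insn condition. It is a sketch, not GCC code: `exact_log2_sketch` is a hypothetical stand-in that mimics the documented behaviour of GCC's `exact_log2` (log2 of the argument if it is a power of two, otherwise -1), and `cond_guarded` is only one possible reading of "test it is > 0 explicitly".

```c
#include <stdint.h>

/* Hypothetical stand-in for GCC's exact_log2: returns log2 (x) when x is
   a power of two, and -1 otherwise (including x == 0).  */
int
exact_log2_sketch (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* The condition as posted: a shift count of -1 compares equal to the
   "not a power of two" result, and a shift by 0 with mask 0 is accepted.  */
int
cond_as_posted (int64_t shift, uint64_t mask)
{
  return shift == exact_log2_sketch (mask + 1);
}

/* Assumed shape of the requested fix: require a strictly positive log2
   before comparing it with the shift count.  */
int
cond_guarded (int64_t shift, uint64_t mask)
{
  int n = exact_log2_sketch (mask + 1);
  return n > 0 && shift == n;
}
```

Both versions accept the intended case (`shift == 32`, `mask == 0xffffffff`); only the guarded one rejects a mask whose successor is not a power of two and the degenerate shift by 0.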
on 2022/6/21 06:10, Segher Boessenkool wrote:
[...]
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
>> @@ -0,0 +1,11 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +unsigned long long
>> +foo (unsigned long long value)
>> +{
>> +  value &= 0xffffffff;
>> +  value |= value << 32;
>> +  return value;
>> +}
>> +/* { dg-final { scan-assembler "rldimi" } } */
>
> Write
> /* { dg-final { scan-assembler {\mrldimi\M} } } */
> please.
>

This case also needs effective-target keyword lp64, that is

/* { dg-require-effective-target lp64 } */

since with -m32, it gets:

        mr 3,4

with -m32 -mpowerpc64, it gets:

        rldicl 3,4,0,32

BR,
Kewen
On Tue, Jun 21, 2022 at 10:03:18AM +0800, Kewen.Lin wrote:
> This case also needs effective-target keyword lp64,
> that is /* { dg-require-effective-target lp64 } */

Good point. Yes. It would be nice to have just has_arch_ppc64 really.

> since with -m32, it gets:
> mr 3,4
>
> with -m32 -mpowerpc64, it gets:
> rldicl 3,4,0,32

Yes, and that is not lp64 -- both longs and pointers are 32 bits when you have -m32.

You get different code because parameter passing is different. The usual way to sidestep is to have the data in memory instead:

unsigned long long x;
void goo (void)
{
  unsigned long long value = x;
  value &= 0xffffffff;
  value |= value << 32;
  x = value;
}

but then the compiler tries to be smart and do code like

        addis 10,2,.LANCHOR0+4@toc@ha
        lwz 10,.LANCHOR0+4@toc@l(10)
        sldi 9,10,32
        add 9,9,10
        addis 10,2,.LANCHOR0@toc@ha
        std 9,.LANCHOR0@toc@l(10)
        blr

for -m64, and

        lis 9,x@ha
        la 10,x@l(9)
        lwz 10,4(10)
        stw 10,x@l(9)
        blr

for just -m32, but

        lis 10,x@ha
        la 9,x@l(10)
        la 10,x@l(10)
        ld 9,0(9)
        rldicl 8,9,0,32
        sldi 9,9,32
        add 9,9,8
        std 9,0(10)
        blr

for -m32 -mpowerpc64 (note it has not managed to do the splitter here; it gets

Failed to match this instruction:
(set (reg:DI 128)
    (plus:DI (ashift:DI (reg/v:DI 117 [ value ])
            (const_int 32 [0x20]))
        (zero_extend:DI (subreg:SI (reg/v:DI 117 [ value ]) 4))))

and then

Failed to match this instruction:
(set (reg:DI 128)
    (plus:DI (and:DI (reg/v:DI 117 [ value ])
            (const_int 4294967295 [0xffffffff]))
        (ashift:DI (reg/v:DI 117 [ value ])
            (const_int 32 [0x20]))))

but that is not enough).

So let's just do lp64, at least for now :-)

Segher
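Putting the two review comments on the test together, the test file might end up looking roughly like the sketch below. This is an assumption about the revised test, not the committed version: it only adds the `lp64` requirement asked for by Kewen and swaps in the word-boundary regexp asked for by Segher, leaving the body of `foo` unchanged.

```c
/* Sketch of pr105991.c with the review comments applied (assumed, not
   necessarily the committed form).  */
/* { dg-do compile } */
/* { dg-options "-O2" } */
/* { dg-require-effective-target lp64 } */
unsigned long long
foo (unsigned long long value)
{
  /* Keep the low 32 bits, then copy them into the high 32 bits.  */
  value &= 0xffffffff;
  value |= value << 32;
  return value;
}
/* { dg-final { scan-assembler {\mrldimi\M} } } */
```

The `\m` and `\M` anchors are Tcl regexp word boundaries, so the scan matches the `rldimi` mnemonic as a whole word rather than as a substring of some other token in the assembly output.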
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c55ee7e..695ec33 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4188,6 +4188,23 @@
 }
   [(set_attr "type" "insert")])
 
+; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
+(define_code_iterator plus_xor [plus xor])
+
+(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (plus_xor:GPR
+          (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
+                   (match_operand:GPR 4 "const_int_operand" "n"))
+          (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+                      (match_operand:SI 2 "const_int_operand" "n"))))]
+  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (ior:GPR (and:GPR (match_dup 3) (match_dup 4))
+                 (ashift:GPR (match_dup 1) (match_dup 2))))])
+
 (define_code_iterator plus_ior_xor [plus ior xor])
 
 (define_split
diff --git a/gcc/testsuite/gcc.target/powerpc/pr105991.c b/gcc/testsuite/gcc.target/powerpc/pr105991.c
new file mode 100644
index 0000000..e853e53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+unsigned long long
+foo (unsigned long long value)
+{
+  value &= 0xffffffff;
+  value |= value << 32;
+  return value;
+}
+/* { dg-final { scan-assembler "rldimi" } } */
+