From patchwork Sun Jun 26 20:56:07 2022
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 55415
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: gcc-patches@gcc.gnu.org
Subject: [rs6000 PATCH] Improve constant integer multiply using rldimi.
Date: Sun, 26 Jun 2022 21:56:07 +0100
Message-ID: <006101d8899f$295807b0$7c081710$@nextmovesoftware.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This patch tweaks the code generated on POWER for integer
multiplications by a constant, by making use of rldimi instructions.
Much like x86's lea instruction, rldimi can be used to implement a
shift and add pair in some circumstances.  For rldimi this is when
the shifted operand is known to have no bits in common with the
added operand.
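
As an illustration only (not part of the patch): the condition above can be checked in plain C. The sketch below assumes 32-bit unsigned operands, and every helper name in it is hypothetical. It shows that when all set bits of the added operand lie below the shift count, plus, ior and xor of the shifted value and the addend give the same result, which is what lets a single insert instruction stand in for the shift and add pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper with the same shape as the pattern's condition:
   true when every set bit of `addend` lies below bit `shift`, so it
   cannot overlap with (x << shift).  */
static int fits_below_shift(uint32_t addend, unsigned shift)
{
    return shift > 0 && shift < 32 && addend < (UINT32_C(1) << shift);
}

static uint32_t shift_plus(uint32_t x, unsigned shift, uint32_t addend)
{
    return (x << shift) + addend;
}

static uint32_t shift_ior(uint32_t x, unsigned shift, uint32_t addend)
{
    return (x << shift) | addend;
}

static uint32_t shift_xor(uint32_t x, unsigned shift, uint32_t addend)
{
    return (x << shift) ^ addend;
}

/* Returns 1 if plus, ior and xor agree for every sampled input whose
   addend fits below the shift, and 0 otherwise.  */
static int combiners_agree(unsigned shift)
{
    if (shift == 0 || shift >= 32)
        return 0;
    for (uint32_t x = 0; x < 1000; x++)
        for (uint32_t addend = 0; addend < (UINT32_C(1) << shift); addend += 37) {
            if (!fits_below_shift(addend, shift))
                return 0;
            uint32_t sum = shift_plus(x, shift, addend);
            if (sum != shift_ior(x, shift, addend)
                || sum != shift_xor(x, shift, addend))
                return 0;
        }
    return 1;
}
```

Calling combiners_agree (13) from a small main should return 1.  Once the addend has a bit at or above the shift count, for example an addend of 1 << 13 with a shift of 13, the sum carries and no longer matches the ior, so the rewrite would be invalid.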

Hence for the new testcase below:

int foo(int x)
{
  int t = x & 42;
  return t * 0x2001;
}

when compiled with -O2, GCC currently generates:

        andi. 3,3,0x2a
        slwi 9,3,13
        add 3,9,3
        extsw 3,3
        blr

with this patch, we now generate:

        andi. 3,3,0x2a
        rlwimi 3,3,13,0,31-13
        extsw 3,3
        blr

It turns out this optimization already exists in the form of a combine
splitter in rs6000.md, but the constraints on combine splitters,
requiring three or four input instructions (and generating one or two
output instructions), mean it doesn't get applied as often as it could.
This patch converts the define_split into a define_insn_and_split to
catch more cases (such as the one above).

The one bit that's tricky/controversial is the use of RTL's
nonzero_bits, which is accurate during the combine pass when this
pattern is first recognized, but is less precise (it is not kept up to
date) by the time this pattern is eventually split.  To support this,
I've used a "|| reload_completed" idiom, so that the insn remains
recognizable after reload even if nonzero_bits can no longer prove the
condition.  Does this approach seem reasonable?  [I've another patch
for x86 that uses the same idiom].

This patch has been tested on powerpc64le-unknown-linux-gnu with
make bootstrap and make -k check with no new failures.
Ok for mainline?


2022-06-26  Roger Sayle

gcc/ChangeLog
        * config/rs6000/rs6000.md (*rotl3_insert_3b_): New
        define_insn_and_split created from existing define_split.

gcc/testsuite/ChangeLog
        * gcc.target/powerpc/rldimi-3.c: New test case.

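
As a sanity check on the two assembly sequences (again an illustration, not part of the patch): the sketch below models rlwimi in C for the non-wrapping mask case and confirms that the insert form computes the same value as the multiply in the testcase.  The function names are hypothetical, and unsigned arithmetic is assumed in place of int, which is harmless here because t is at most 42 and t * 0x2001 cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Model of "rlwimi RA,RS,SH,MB,ME" for the non-wrapping case MB <= ME.
   Bits use IBM numbering: bit 0 is the most significant bit.  */
static uint32_t rlwimi_model(uint32_t ra, uint32_t rs,
                             unsigned sh, unsigned mb, unsigned me)
{
    uint32_t mask = (UINT32_MAX >> mb) & (UINT32_MAX << (31 - me));
    return (rotl32(rs, sh) & mask) | (ra & ~mask);
}

/* The testcase as written, using unsigned arithmetic.  */
static uint32_t foo_reference(uint32_t x)
{
    uint32_t t = x & 42;
    return t * 0x2001;
}

/* The sequence generated with the patch applied.  */
static uint32_t foo_with_insert(uint32_t x)
{
    uint32_t t = x & 42;                        /* andi.  3,3,0x2a */
    return rlwimi_model(t, t, 13, 0, 31 - 13);  /* rlwimi 3,3,13,0,31-13 */
}

/* Returns 1 if both forms agree for every value of the low byte of x;
   only bits covered by the mask 42 affect the result.  */
static int insert_matches_multiply(void)
{
    for (uint32_t x = 0; x < 256; x++)
        if (foo_reference(x) != foo_with_insert(x))
            return 0;
    return 1;
}
```

The two forms agree because t < 64, so t << 13 occupies only bits 13 and above while t itself stays within the low 13 bits that the insert mask leaves untouched.
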
Thanks in advance,
Roger

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 090dbcf..b8aada32 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4209,13 +4209,19 @@
 
 (define_code_iterator plus_ior_xor [plus ior xor])
 
-(define_split
-  [(set (match_operand:GPR 0 "gpc_reg_operand")
-        (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
-                                      (match_operand:SI 2 "const_int_operand"))
-                          (match_operand:GPR 3 "gpc_reg_operand")))]
-  "nonzero_bits (operands[3], mode)
-   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
+(define_insn_and_split "*rotl3_insert_3b_"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+        (plus_ior_xor:GPR
+          (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+                      (match_operand:SI 2 "const_int_operand" "n"))
+          (match_operand:GPR 3 "gpc_reg_operand" "0")))]
+  "INTVAL (operands[2]) > 0
+   && INTVAL (operands[2]) < 64
+   && ((nonzero_bits (operands[3], mode)
+        < HOST_WIDE_INT_1U << INTVAL (operands[2]))
+       || reload_completed)"
+  "#"
+  "&& 1"
   [(set (match_dup 0)
         (ior:GPR (and:GPR (match_dup 3)
                           (match_dup 4))
diff --git a/gcc/testsuite/gcc.target/powerpc/rldimi-3.c b/gcc/testsuite/gcc.target/powerpc/rldimi-3.c
new file mode 100644
index 0000000..80ff86e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rldimi-3.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2" } */
+
+int foo(int x)
+{
+  int t = x & 42;
+  return t * 0x2001;
+}
+/* { dg-final { scan-assembler {\mrldimi\M} } } */