From patchwork Thu Jun 2 10:19:38 2022
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 54739
From: "Roger Sayle"
To: "'GCC Patches'"
Subject: [PATCH/RFC] cprop_hardreg... Third time's a charm.
Date: Thu, 2 Jun 2022 11:19:38 +0100
Message-ID: <03c201d8766a$450d9a80$cf28cf80$@nextmovesoftware.com>

This middle-end patch proposes that the "hard register copy propagation"
pass (cprop_hardreg) be performed up to three times on each basic block
(up from the current two times) if the second pass successfully made
changes.

The motivation for three passes is to handle the "swap idiom" (i.e.
"t = x; x = y; y = t;" sequences) that get generated by register
allocation (reload).

Consider the x86_64 test case for __int128 addition recently discussed
on gcc-patches.  With that proposed patch, the input to the cprop_hardreg
pass looks like:

        movq    %rdi, %r8
        movq    %rsi, %rdi
        movq    %r8, %rsi
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rsi, %rax
        adcq    %rdi, %rdx
        ret

where the first three instructions effectively swap %rsi and %rdi.

On the first pass of cprop_hardreg, we notice that the third insn,
%rsi := %r8, is redundant and can be eliminated/propagated to produce:

        movq    %rdi, %r8
        movq    %rsi, %rdi
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %r8, %rax
        adcq    %rdi, %rdx
        ret

Because a successful propagation was found, cprop_hardreg then runs
a second pass/sweep on the affected basic blocks (using a worklist).
On this second pass it notices that the second instruction,
%rdi := %rsi, may now be propagated (before the first transform this
was blocked because %rsi was overwritten by the third insn), and after
the second pass we end up with:

        movq    %rdi, %r8
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %r8, %rax
        adcq    %rsi, %rdx
        ret

which is the current behaviour on mainline.  However, a third and final
pass would now notice that the first insn, "%r8 := %rdi", is also
eliminable, and a third iteration would produce optimal code:

        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret

The patch below creates an additional worklist, third_pass, that is
populated with the indices of the basic blocks that were updated during
the second pass.  Does the motivation for three passes (reload never
generates more or fewer than three instructions to swap a pair of
registers) seem reasonable for all targets?  If cprop_hardreg is
considered an expensive pass, this change could be gated on the basic
block count or similar.  Finally, I should point out that this is a
regression fix; GCC 4.8 generated optimal code with two moves (whereas
GCC 12 required 5 moves, up from GCC 11's 4 moves).

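The three-step walkthrough above can be reproduced with a small toy model.
This is only a sketch under simplifying assumptions, not regcprop.cc's
actual algorithm: each sweep forwards every source operand to the oldest
register still holding the same value, then drops moves whose destination
is dead.  The names Insn, Op, toy_pass and swap_idiom_block are
hypothetical, the block is straight-line code, and adcq is modelled as a
plain add.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy instruction set (hypothetical): Move is "dst := src", Add is "dst += src".
enum class Op { Move, Add };

struct Insn {
  Op op;
  std::string src, dst;  // same operand order as the AT&T listings
};

// One toy sweep over a straight-line block.  Returns true if anything changed.
bool toy_pass(std::vector<Insn>& block, const std::set<std::string>& live_out) {
  bool changed = false;
  std::map<std::string, int> value_of;              // register -> value number
  std::map<int, std::vector<std::string>> holders;  // value -> holders, oldest first
  int next_value = 0;

  // Value number currently held by reg (a fresh number on first sight).
  auto value = [&](const std::string& reg) {
    auto it = value_of.find(reg);
    if (it == value_of.end()) {
      it = value_of.emplace(reg, next_value).first;
      holders[next_value++].push_back(reg);
    }
    return it->second;
  };
  // Record that reg is overwritten and now holds value v.
  auto set_value = [&](const std::string& reg, int v) {
    std::vector<std::string>& old = holders[value(reg)];
    old.erase(std::remove(old.begin(), old.end(), reg), old.end());
    value_of[reg] = v;
    holders[v].push_back(reg);
  };

  // Forward walk: replace each source by the oldest register with the same value.
  for (Insn& insn : block) {
    int v = value(insn.src);
    assert(!holders[v].empty());
    std::string oldest = holders[v].front();
    if (oldest != insn.src) {
      insn.src = oldest;
      changed = true;
    }
    if (insn.op == Op::Move)
      set_value(insn.dst, v);             // copy: dst joins src's value class
    else
      set_value(insn.dst, next_value++);  // arithmetic: dst gets a fresh value
  }

  // Backward walk: delete moves whose destination is not live afterwards.
  std::set<std::string> live = live_out;
  std::vector<Insn> kept;
  for (auto it = block.rbegin(); it != block.rend(); ++it) {
    if (it->op == Op::Move) {
      if (!live.count(it->dst)) {
        changed = true;  // dead move, drop it
        continue;
      }
      live.erase(it->dst);  // dst is fully overwritten here
    } else {
      live.insert(it->dst);  // Add also reads its destination
    }
    live.insert(it->src);
    kept.push_back(*it);
  }
  std::reverse(kept.begin(), kept.end());
  block = kept;
  return changed;
}

// The seven non-ret instructions from the first listing above.
std::vector<Insn> swap_idiom_block() {
  return {{Op::Move, "rdi", "r8"},  {Op::Move, "rsi", "rdi"},
          {Op::Move, "r8", "rsi"},  {Op::Move, "rdx", "rax"},
          {Op::Move, "rcx", "rdx"}, {Op::Add, "rsi", "rax"},
          {Op::Add, "rdi", "rdx"}};
}
```

Calling toy_pass repeatedly on swap_idiom_block() with live_out set to
{"rax", "rdx"} removes one move of the swap per sweep, so in this model
the block only reaches a fixed point after the third sweep.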
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Thoughts?  Ok for mainline?


2022-06-02  Roger Sayle

gcc/ChangeLog
        * regcprop.cc (pass_cprop_hardreg::execute): Perform a third
        iteration over each basic block that was updated by the second
        iteration.


Thanks in advance,
Roger

diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
index 1fdc367..5555c4a 100644
--- a/gcc/regcprop.cc
+++ b/gcc/regcprop.cc
@@ -1384,6 +1384,7 @@ pass_cprop_hardreg::execute (function *fun)
   bitmap_clear (visited);
 
   auto_vec<int> worklist;
+  auto_vec<int> third_pass;
   bool any_debug_changes = false;
 
   /* We need accurate notes.  Earlier passes such as if-conversion may
@@ -1425,7 +1426,27 @@ pass_cprop_hardreg::execute (function *fun)
       for (int index : worklist)
         {
           bb = BASIC_BLOCK_FOR_FN (fun, index);
-          cprop_hardreg_bb (bb, all_vd, visited);
+          /* Perform a third pass, if the second pass changed anything.
+             Three passes are required for swaps: t = x; x = y; y = t.  */
+          if (cprop_hardreg_bb (bb, all_vd, visited))
+            third_pass.safe_push (bb->index);
+          if (all_vd[bb->index].n_debug_insn_changes)
+            any_debug_changes = true;
+        }
+
+      df_analyze ();
+      if (MAY_HAVE_DEBUG_BIND_INSNS && any_debug_changes)
+        cprop_hardreg_debug (fun, all_vd);
+    }
+
+  if (!third_pass.is_empty ())
+    {
+      any_debug_changes = false;
+      bitmap_clear (visited);
+      for (int index : third_pass)
+        {
+          bb = BASIC_BLOCK_FOR_FN (fun, index);
+          cprop_hardreg_bb (bb, all_vd, visited);
           if (all_vd[bb->index].n_debug_insn_changes)
             any_debug_changes = true;
         }
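
The worklist scheme in the diff (sweep every block once, then revisit only
the blocks that reported a change, with a hard cap on the number of sweeps)
can be sketched on its own.  This is an illustration under assumptions,
not GCC code: run_bounded_sweeps, process and max_sweeps are hypothetical
names, and the patch itself uses a separate third_pass vector rather than
a loop.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <utility>
#include <vector>

// Sweep all n_blocks once, then revisit only the blocks for which
// process() reported a change, for at most max_sweeps sweeps in total.
// Returns the number of sweeps actually performed.
int run_bounded_sweeps(int n_blocks, int max_sweeps,
                       const std::function<bool(int)>& process) {
  assert(n_blocks >= 0 && max_sweeps >= 0);
  std::vector<int> worklist(n_blocks);
  std::iota(worklist.begin(), worklist.end(), 0);  // first sweep: every block

  int sweeps = 0;
  while (!worklist.empty() && sweeps < max_sweeps) {
    std::vector<int> next;  // blocks changed by this sweep
    for (int index : worklist)
      if (process(index))
        next.push_back(index);
    worklist = std::move(next);
    ++sweeps;
  }
  return sweeps;
}
```

With max_sweeps set to 3 this mirrors the shape of the proposed change:
a block is visited a third time only if its second visit changed
something, so blocks that settle early cost nothing extra.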