From patchwork Fri Aug 12 12:01:45 2022
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 56704
Date: Fri, 12 Aug 2022 14:01:45 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Support threading of just the exit edge
Message-Id: <20220812120145.91C6513AAE@imap2.suse-dmz.suse.de>

This started with noticing we add ENTRY_BLOCK to our threads just
for the sake of simplifying the conditional at the end of the first
block in a function.  That's not really threading anything, and it
ends up duplicating the entry block and rewriting the result instead
of statically folding the jump.

The following tries to handle those cases by recording simplifications
of the exit conditional as a thread of length one.  That requires
special-casing them in the backward copier: if we do not have any
block to copy but instead modify the jump in place and remove the
not-taken edges, this confuses the hell out of the remaining threads.
So back_jt_path_registry::update_cfg now first marks all edges we
know are never taken and then prunes the threading candidates that
include such an edge.
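To make the situation concrete, here is a hypothetical testcase sketch (my own illustration, not taken from the patch or the testsuite): the conditional ending the function's first block depends only on a locally known constant, so any "thread" through it would merely duplicate the entry block where simply folding the exit branch in place suffices.

```c
/* Hypothetical illustration: the branch at the end of the entry block
   is statically determinable, so there is nothing to thread - the
   exit edge can be folded to its known destination in place.  */
int g;

int
f (void)
{
  int x = 1;
  if (x != 0)	/* always true */
    g = 2;
  else
    g = 3;
  return g;
}
```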
Then it makes sure to perform the unreachable edge removal first (so
we avoid copying those edges when other thread paths contain the
prevailing edge) before continuing to apply the remaining threads.
In the statistics you can see this avoids quite a bunch of useless
threads (I've investigated 3 random files from cc1files with dropped
stats in any of the thread passes).

Still, thinking about it, it would be nice to avoid the work of
discovering those candidates we have to throw away later.  That could
eventually be done by having the backward threader perform an RPO walk
over the CFG, skipping edges that can be statically determined as
never executed.  Below I'm abusing the path range query to statically
analyze the exit branch, but I assume there's a simpler way of folding
this stmt which could then better integrate with such a walk.  In any
case it seems worth handling the case of exit branches that simplify
without path-sensitive information more consciously.

The patch also restricts path discovery when we'd produce threads
we'll reject later during copying - the backward threader copying
cannot handle paths where the blocks to duplicate are not all from
exactly the same loop.  I'm probably going to split this part out.

Any thoughts?

	* gimple-range-path.cc (path_range_query::set_path): Adjust
	assert to allow paths of size one.
	* tree-ssa-threadbackward.cc (back_threader::maybe_register_path):
	Paths of size one are always profitable.
	(back_threader::find_paths_to_names): Likewise.  Do not walk
	further if we are leaving the current loop.
	(back_threader::find_taken_edge): Remove assert.  Do not walk
	to ENTRY_BLOCK.
	* tree-ssa-threadupdate.cc (back_jt_path_registry::update_cfg):
	Handle jump threads of just the exit edge by modifying the
	control statement in-place.
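The mark-and-prune scheme described above can be sketched outside of GCC's data structures roughly as follows (the simplified `Edge`/`Path` types and all names here are my own stand-ins, not GCC's; the real code uses `auto_edge_flag`, `FOR_EACH_EDGE` and `cancel_thread`):

```cpp
#include <algorithm>
#include <vector>

// Stand-in for a CFG edge: 'siblings' plays the role of the successor
// list of the edge's source block, 'never_taken' the auto_edge_flag.
struct Edge
{
  bool never_taken = false;
  std::vector<Edge *> *siblings = nullptr;
};

using Path = std::vector<Edge *>;

// First mark all sibling edges of length-one paths (pure jump
// simplifications) as never taken, then drop every path that would
// copy through such a dead edge.
void
mark_and_prune (std::vector<Path> &paths)
{
  for (Path &p : paths)
    if (p.size () == 1)
      for (Edge *e : *p[0]->siblings)
	if (e != p[0])
	  e->never_taken = true;

  paths.erase (std::remove_if (paths.begin (), paths.end (),
			       [] (const Path &p)
			       {
				 for (Edge *e : p)
				   if (e->never_taken)
				     return true;
				 return false;
			       }),
	       paths.end ());
}
```

After this runs, only the length-one simplification and the threads that avoid the dead edges survive, mirroring what update_cfg does before applying the remaining threads.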
---
 gcc/gimple-range-path.cc       |  2 +-
 gcc/tree-ssa-threadbackward.cc | 21 ++++++++-----
 gcc/tree-ssa-threadupdate.cc   | 54 ++++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 78146f5683e..a7d277c31b8 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -220,7 +220,7 @@ path_range_query::unreachable_path_p ()
 void
 path_range_query::set_path (const vec<basic_block> &path)
 {
-  gcc_checking_assert (path.length () > 1);
+  gcc_checking_assert (!path.is_empty ());
   m_path = path.copy ();
   m_pos = m_path.length () - 1;
   bitmap_clear (m_has_cache_entry);
diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index b886027fccf..669098e4ec3 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -241,8 +241,9 @@ back_threader::maybe_register_path ()
   else
     {
       bool irreducible = false;
-      if (m_profit.profitable_path_p (m_path, m_name, taken_edge,
-				      &irreducible)
+      if ((m_path.length () == 1
+	   || m_profit.profitable_path_p (m_path, m_name, taken_edge,
+					  &irreducible))
 	  && debug_counter ()
 	  && m_registry.register_path (m_path, taken_edge))
 	{
@@ -267,7 +268,6 @@ back_threader::maybe_register_path ()
 edge
 back_threader::find_taken_edge (const vec<basic_block> &path)
 {
-  gcc_checking_assert (path.length () > 1);
   switch (gimple_code (m_last_stmt))
     {
     case GIMPLE_COND:
@@ -350,9 +350,15 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting,
   m_path.safe_push (bb);
 
   // Try to resolve the path without looking back.
-  if (m_path.length () > 1
-      && (!m_profit.profitable_path_p (m_path, m_name, NULL)
-	  || maybe_register_path ()))
+  if ((m_path.length () > 1
+       && !m_profit.profitable_path_p (m_path, m_name, NULL))
+      || maybe_register_path ())
+    ;
+
+  // The backwards thread copier cannot copy blocks that do not belong
+  // to the same loop, so when the new source of the path entry no
+  // longer belongs to it we don't need to search further.
+  else if (m_path[0]->loop_father != bb->loop_father)
     ;
 
   // Continue looking for ways to extend the path but limit the
@@ -445,7 +451,8 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting,
   edge e;
   FOR_EACH_EDGE (e, iter, bb->preds)
     {
-      if (e->flags & EDGE_ABNORMAL
+      if ((e->flags & EDGE_ABNORMAL)
+	  || e->src->index == ENTRY_BLOCK
 	  // This is like path_crosses_loops in profitable_path_p but
 	  // more restrictive to avoid peeling off loop iterations (see
 	  // tree-ssa/pr14341.c for an example).
diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
index 59c268a3567..d40fa7c4cff 100644
--- a/gcc/tree-ssa-threadupdate.cc
+++ b/gcc/tree-ssa-threadupdate.cc
@@ -2613,6 +2613,60 @@ back_jt_path_registry::update_cfg (bool /*peel_loop_headers*/)
   bool retval = false;
   hash_set<edge> visited_starting_edges;
 
+  /* Mark never taken edges from paths that are just jump simplifications.  */
+  auto_edge_flag never_taken (cfun);
+  for (auto path : m_paths)
+    if (path->length () == 1)
+      {
+	edge_iterator ei;
+	edge e;
+	FOR_EACH_EDGE (e, ei, (*path)[0]->e->src->succs)
+	  if (e != (*path)[0]->e)
+	    e->flags |= never_taken;
+      }
+
+  /* And prune paths that contain such an edge before we remove them.  */
+  for (unsigned i = 0; i < m_paths.length ();)
+    {
+      bool remove = false;
+      for (auto je : *m_paths[i])
+	{
+	  if (je->e->flags & never_taken)
+	    {
+	      cancel_thread (m_paths[i],
+			     "Avoiding threading through unreachable edge");
+	      remove = true;
+	      break;
+	    }
+	}
+      if (!remove)
+	++i;
+      else
+	m_paths.unordered_remove (i);
+    }
+
+  /* Finally perform those threads first; this way we avoid copying the
+     dead outgoing edges when other threads contain the prevailing edge.  */
+  for (unsigned i = 0; i < m_paths.length ();)
+    {
+      vec<jump_thread_edge *> *path = m_paths[i];
+      if (path->length () != 1)
+	{
+	  ++i;
+	  continue;
+	}
+      edge exit = (*path)[0]->e;
+      remove_ctrl_stmt_and_useless_edges (exit->src, exit->dest);
+      exit->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE | EDGE_ABNORMAL);
+      exit->flags |= EDGE_FALLTHRU;
+      /* We do not update dominance info.  */
+      free_dominance_info (CDI_DOMINATORS);
+      retval = true;
+      m_num_threaded_edges++;
+      path->release ();
+      m_paths.unordered_remove (i);
+    }
+
   while (m_paths.length ())
     {
       vec<jump_thread_edge *> *path = m_paths[0];