From patchwork Sat Nov 27 08:32:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aldy Hernandez X-Patchwork-Id: 48220 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A9F58385842B for ; Sat, 27 Nov 2021 08:36:13 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A9F58385842B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1638002173; bh=Gnao9lPNC/5hKDASnR6s0mMDq59vGodjRuaj7AD1iec=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=PMMB2xw9LqZJg5Bd3lsJkVHHUZ5QE8nzNiVlFGXJdGrfQP7aCPIhbzLru+Z1rhHpu 5qoViEIwXYAdl5JGQNeO613Fagx+PQEfsF+n17+QmguZnm43lDXzdXu/wdNKkmz494 EXs+hcMSE8Bmia0n2Jh+oK6BGzglBFw2kD8lrFmY= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id D3D363857029 for ; Sat, 27 Nov 2021 08:32:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D3D363857029 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-205-RY3FO5hoN9OM8FLiQoWebQ-1; Sat, 27 Nov 2021 03:32:23 -0500 X-MC-Unique: RY3FO5hoN9OM8FLiQoWebQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 77C5B185302B for ; Sat, 27 Nov 2021 08:32:22 +0000 (UTC) Received: from abulafia.quesejoda.com (unknown [10.39.192.28]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 0A06819C79; Sat, 27 Nov 2021 08:32:21 +0000 (UTC) Received: from abulafia.quesejoda.com (localhost [127.0.0.1]) by abulafia.quesejoda.com (8.16.1/8.15.2) with ESMTPS id 1AR8WIDF2450213 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Sat, 27 Nov 2021 09:32:19 +0100 Received: (from aldyh@localhost) by abulafia.quesejoda.com (8.16.1/8.16.1/Submit) id 1AR8WILQ2450212; Sat, 27 Nov 2021 09:32:18 +0100 To: GCC patches Subject: [PATCH] path solver: Minimize exported ranges to subsequent blocks. Date: Sat, 27 Nov 2021 09:32:07 +0100 Message-Id: <20211127083206.2450093-1-aldyh@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Aldy Hernandez via Gcc-patches From: Aldy Hernandez Reply-To: Aldy Hernandez Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Currently we have a set of interesting names in the path that we solve for. However, solving *all* of them on exit from each block is unnecessary, even if an edge range is available. We can reduce the queried edges by only solving SSA names from the current block that will be needed in subsequent blocks. With this patch we avoid the explosion in PR103254, without the need for Andrew's patch (which is still independently needed). In my testbed we loose 6 threads in the non-resolving threading passes, but the overall thread count is the same, as we pick them up later. We could probably try harder to get them, but it's probably not worth the effort. Benchmarking this patch shows no difference in overall compilation speed and a tiny slowdown (< 0.80%) for some of the threading passes. Benchmarked for both -O2, -O3, and -O2 --param ranger-logical-depth=10. I'm unsure what the review requirements are in stage3, so... OK for trunk? p.s. I saw some cleanups that could be done (intersecting more bitmaps, putting the rest of the bitmaps in the new obstack, etc etc), but I avoided doing them to minimize the churn in stage3. Regstrapped on x86-64 & ppc64le Linux. PR tree-optimization/103254 gcc/ChangeLog: * gimple-range-path.cc (path_range_query::path_range_query): Initialize obstack. (path_range_query::~path_range_query): Release obstack. (path_range_query::needs_on_entry_for_block): New. (path_range_query::compute_needs_on_entry_set): New. (path_range_query::compute_ranges_in_block): Only calculate outgoing edges for m_needs_on_entry set. (path_range_query::compute_ranges): Call compute_needs_on_entry_set. * gimple-range-path.h: Add needs_on_entry_for_block, compute_needs_on_entry_set, m_bitmaps, and m_needs_on_entry. gcc/testsuite/ChangeLog: * gcc.dg/pr103254.c: Add -fdump-tree-vrp2-details. * gcc.dg/pr103254-2.c: New test. --- gcc/gimple-range-path.cc | 67 +++++++++++++++++++++++++++++-- gcc/gimple-range-path.h | 4 ++ gcc/testsuite/gcc.dg/pr103254-2.c | 29 +++++++++++++ gcc/testsuite/gcc.dg/pr103254.c | 6 ++- 4 files changed, 102 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr103254-2.c diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc index b9c71226c1c..32226df05b5 100644 --- a/gcc/gimple-range-path.cc +++ b/gcc/gimple-range-path.cc @@ -48,6 +48,7 @@ path_range_query::path_range_query (bool resolve, gimple_ranger *ranger) m_ranger = ranger; m_oracle = new path_oracle (m_ranger->oracle ()); + bitmap_obstack_initialize (&m_bitmaps); } path_range_query::~path_range_query () @@ -57,6 +58,7 @@ path_range_query::~path_range_query () delete m_ranger; BITMAP_FREE (m_has_cache_entry); delete m_cache; + bitmap_obstack_release (&m_bitmaps); } // Return TRUE if NAME is in the import bitmap. @@ -401,6 +403,62 @@ path_range_query::compute_ranges_in_phis (basic_block bb) } } +// Return the set of SSA names that would be useful to have resolved +// on entry to this block. These are the names that will ultimately +// help resolve the final conditional on the path. + +void +path_range_query::needs_on_entry_for_block (bitmap on_entry, unsigned bbn) +{ + gori_compute &g = m_ranger->gori (); + basic_block next = m_path[bbn]; + bitmap_copy (on_entry, g.imports (next)); + + // Any incoming PHI args are needed on entry, but are not in the + // g.imports() above, so add them manually. + for (auto iter = gsi_start_phis (next); !gsi_end_p (iter); gsi_next (&iter)) + { + gphi *phi = iter.phi (); + tree result = gimple_phi_result (phi); + + if (bitmap_bit_p (m_imports, SSA_NAME_VERSION (result))) + for (size_t i = 0; i < gimple_phi_num_args (phi); ++i) + { + tree arg = gimple_phi_arg (phi, i)->def; + + if (TREE_CODE (arg) == SSA_NAME + && m_path.contains (gimple_phi_arg_edge (phi, i)->src)) + bitmap_set_bit (on_entry, SSA_NAME_VERSION (arg)); + } + } + bitmap_and_into (on_entry, m_imports); +} + +// Compute the set of needed on-entry names for all path blocks. + +void +path_range_query::compute_needs_on_entry_set () +{ + unsigned len = m_path.length (); + + if (len > m_needs_on_entry.length ()) + m_needs_on_entry.safe_grow_cleared (len); + + // Iterate backwards through the path, excluding the entry block + // since we'll be asking the ranger for incoming values into the path. + for (unsigned i = 0; i + 1 < len; ++i) + { + if (!m_needs_on_entry[i]) + m_needs_on_entry[i] = BITMAP_ALLOC (&m_bitmaps); + + needs_on_entry_for_block (m_needs_on_entry[i], i); + + // Include the on-entry names for the subsequent blocks. + if (i > 0) + bitmap_ior_into (m_needs_on_entry[i], m_needs_on_entry[i - 1]); + } +} + // Compute ranges defined in the current block, or exported to the // next block. @@ -441,11 +499,12 @@ path_range_query::compute_ranges_in_block (basic_block bb) // Solve imports that are exported to the next block. basic_block next = next_bb (); edge e = find_edge (bb, next); - EXECUTE_IF_SET_IN_BITMAP (m_imports, 0, i, bi) + const_bitmap candidates = m_needs_on_entry[m_pos - 1]; + gori_compute &g = m_ranger->gori (); + bitmap exports = g.exports (bb); + EXECUTE_IF_SET_IN_BITMAP (candidates, 0, i, bi) { tree name = ssa_name (i); - gori_compute &g = m_ranger->gori (); - bitmap exports = g.exports (bb); if (bitmap_bit_p (exports, i)) { @@ -602,6 +661,8 @@ path_range_query::compute_ranges (const vec &path, else compute_imports (m_imports, exit_bb ()); + compute_needs_on_entry_set (); + if (m_resolve) get_path_oracle ()->reset_path (); diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h index 57a9ae9bdcd..f5601377209 100644 --- a/gcc/gimple-range-path.h +++ b/gcc/gimple-range-path.h @@ -61,6 +61,8 @@ private: void compute_ranges_in_phis (basic_block bb); void adjust_for_non_null_uses (basic_block bb); void ssa_range_in_phi (irange &r, gphi *phi); + void needs_on_entry_for_block (bitmap interesting, unsigned bbn); + void compute_needs_on_entry_set (); void compute_outgoing_relations (basic_block bb, basic_block next); void compute_phi_relations (basic_block bb, basic_block prev); void maybe_register_phi_relation (gphi *, tree arg); @@ -93,6 +95,8 @@ private: auto_bitmap m_imports; gimple_ranger *m_ranger; non_null_ref m_non_null; + bitmap_obstack m_bitmaps; + auto_vec m_needs_on_entry; // Current path position. unsigned m_pos; diff --git a/gcc/testsuite/gcc.dg/pr103254-2.c b/gcc/testsuite/gcc.dg/pr103254-2.c new file mode 100644 index 00000000000..1956d90a563 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr103254-2.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 --param ranger-logical-depth=99" } */ +/* { dg-timeout 10 } */ + +/* This is the same test as pr103254.c, but making sure the threader + succeeds regardless of the depth limit, since it should avoid + asking for unnecessary edges. */ + +short int i; + +void +foo (void) +{ + for (i = 1; i < 2; i += 4) + { + int j; + + for (j = 0; j < 5; j += 4) + { + int k; + + for (k = 0; k < 68; k += 4) + { + i &= j; + j &= i; + } + } + } +} diff --git a/gcc/testsuite/gcc.dg/pr103254.c b/gcc/testsuite/gcc.dg/pr103254.c index 62d2415f216..9d6828a6771 100644 --- a/gcc/testsuite/gcc.dg/pr103254.c +++ b/gcc/testsuite/gcc.dg/pr103254.c @@ -1,7 +1,11 @@ /* { dg-do compile } */ -/* { dg-options "-O3" } */ +/* { dg-options "-O3 -fdump-tree-vrp2-details" } */ /* { dg-timeout 10 } */ +/* The vrp2 dump above is needed to test that ranger can dump all the + known ranges in the IL and not blow up, which would happen for + --param ranger-logical-depth=99. */ + short int i; void