From patchwork Mon Oct 4 09:43:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aldy Hernandez X-Patchwork-Id: 45781 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4FEF23857C44 for ; Mon, 4 Oct 2021 09:44:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4FEF23857C44 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1633340689; bh=cQnMVO7YRBkdzk2jQP2N1JbJ1w5hjikC3V1XazcCh6U=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=ZiXpj8zBwwukLHcQ3uDeKn55mLhZLr2qWn27RO1GDGkLBkXicT+Ulcf0tDMSITZ+P 4b5rZGrBsIxo+hhBhqcamh+VfdwfnduycqzPCXaXQkqnTdI253L5/8msyGXUJLxoAg j9AvHd3pX2E6jVNdw69hJjAj+uk7kf2yd7TRMos4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 33A3F3858C3A for ; Mon, 4 Oct 2021 09:44:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 33A3F3858C3A Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-304-jkhJfq8jNoO337rQBcEbuA-1; Mon, 04 Oct 2021 05:44:18 -0400 X-MC-Unique: jkhJfq8jNoO337rQBcEbuA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D06A7835DE7; Mon, 4 Oct 2021 09:44:17 +0000 (UTC) Received: from abulafia.quesejoda.com (unknown [10.39.193.168]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 0A6F65DF2F; Mon, 4 Oct 2021 09:44:11 +0000 (UTC) Received: from abulafia.quesejoda.com (localhost [127.0.0.1]) by abulafia.quesejoda.com (8.16.1/8.15.2) with ESMTPS id 1949i8B41596619 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Mon, 4 Oct 2021 11:44:09 +0200 Received: (from aldyh@localhost) by abulafia.quesejoda.com (8.16.1/8.16.1/Submit) id 1949i8mF1596618; Mon, 4 Oct 2021 11:44:08 +0200 To: Jeff Law , Richard Biener , matz@suse.de Subject: [RFC] More jump threading restrictions in the presence of loops. Date: Mon, 4 Oct 2021 11:43:14 +0200 Message-Id: <20211004094313.1596556-1-aldyh@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Aldy Hernandez via Gcc-patches From: Aldy Hernandez Reply-To: Aldy Hernandez Cc: GCC patches Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" It is frustrating that virtually all the regressions with the hybrid threader for VRP, have not been with the engine itself, but with the independent restrictions we have agreed upon. The following patch is a collection of discussions with Richi, Jeff, and Michael Matz regarding jump threading limitations in the presence of loops, that I hope can lead to further refinements. As I have mentioned before, most of our threading tests are too fragile, so in this patch I have distilled various restrictions into gimple FE tests that I hope can help in maintaining the threader going forward. The goal is to have one test with no valid threads whatsover, and one with exclusively one valid thread per function. This should make it trivial to maintain this going forward. I would like to request the relevant experts to not only examine the patch, but review the tests in this patch, to make sure we agree upon these restrictions. I have distilled the smallest possible test for each restriction and have annotated said tests to make reviewing easy. Note that the test in ssa-thread-valid.c is a thread that Jeff has suggested should be an exception to the path crossing loops restriction, but I have not implemented it yet, because even if we could loosen the restriction, it would violate Richi's restriction of a path that rotates a loop. Comments are highly welcome. By the way, these restrictions trigger *a lot*. We seem to be rotating loops left and right. So I wouldn't be surprised if this requires (as usual) a lot of test tweaking. Untested patch follows. p.s. Note that I'm just facilitating the discussion. I'm highly dependent on the loop experts here ;-). gcc/ChangeLog: * tree-ssa-threadupdate.c (jt_path_registry::cancel_invalid_paths): Tweak. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-thread-invalid.c: New test. * gcc.dg/tree-ssa/ssa-thread-valid.c: New test. --- .../gcc.dg/tree-ssa/ssa-thread-invalid.c | 102 ++++++++++++++++++ .../gcc.dg/tree-ssa/ssa-thread-valid.c | 57 ++++++++++ gcc/tree-ssa-threadupdate.c | 29 ++++- 3 files changed, 186 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c new file mode 100644 index 00000000000..bd56a62a4b4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-invalid.c @@ -0,0 +1,102 @@ +// { dg-do compile } +// { dg-options "-O2 -fgimple -fdump-statistics" } +// +// This is a collection of seemingly threadble paths that should not be allowed. + +void foobar (int); + +// Possible thread from 2->4->3, but it would rotate the loop. +void __GIMPLE (ssa) +f1 () +{ + int i; + + // Pre-header. + __BB(2): + goto __BB4; + + // Latch. + __BB(3): + foobar (i_1); + i_5 = i_1 + 1; + goto __BB4; + + __BB(4,loop_header(1)): + i_1 = __PHI (__BB2: 0, __BB3: i_5); + if (i_1 != 101) + goto __BB3; + else + goto __BB5; + + __BB(5): + return; + +} + +// Possible thread from 2->3->5 but threading through the empty latch +// would create a non-empty latch. +void __GIMPLE (ssa) +f2 () +{ + int i; + + // Pre-header. + __BB(2): + goto __BB3; + + __BB(3,loop_header(1)): + i_8 = __PHI (__BB5: i_5, __BB2: 0); + foobar (i_8); + i_5 = i_8 + 1; + if (i_5 != 256) + goto __BB5; + else + goto __BB4; + + // Latch. + __BB(5): + goto __BB3; + + __BB(4): + return; + +} + +// Possible thread from 3->5->6->3 but this would thread through the +// header but not exit the loop. +int __GIMPLE (ssa) +f3 (int a) +{ + int i; + + __BB(2): + goto __BB6; + + __BB(3): + if (i_1 != 0) + goto __BB4; + else + goto __BB5; + + __BB(4): + foobar (5); + goto __BB5; + + // Latch. + __BB(5): + i_7 = i_1 + 1; + goto __BB6; + + __BB(6,loop_header(1)): + i_1 = __PHI (__BB2: 1, __BB5: i_7); + if (i_1 <= 99) + goto __BB3; + else + goto __BB7; + + __BB(7): + return; + +} + +// { dg-final { scan-tree-dump-not "Jumps threaded" "statistics" } } diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c new file mode 100644 index 00000000000..e10e877d01a --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-valid.c @@ -0,0 +1,57 @@ +// { dg-do compile } +// { dg-options "-O2 -fgimple -fdump-statistics" } +// +// This is a collection of threadable paths. To simplify maintenance, +// there should only be one threadable path per function. + +int global; + +// ** FIXME: This currently fails** +// +// There is a path crossing loops through 3->4->5. +// +// Jeff has stipulated that... +// +// This might be an exception since this goes from inside the loop to +// outside the loop without entering another loop. That is, we have +// found an early exit from the loop that doesn't require us to go +// through the latch. In fact, the whole threaded path is logically +// outside the loop. So the primary effect is to reduce the number of +// branches necessary when we exit the loop via this path. +// +// This has nice secondary effects. For example, objects on the +// threaded path will no longer necessarily be live throughout the +// loop -- so we can get register allocation improvements. The +// threaded path can physically move outside the loop resulting in +// better icache efficiency, etc. +// +// The question we'd have to answer is whether or not exposing this +// alternate exit path mucks up other loop optimizations, but if we +// restrict to after the loop optimizer is done, then that's a +// non-issue. +int __GIMPLE (ssa) +f1 (int x) +{ + int a; + + __BB(2): + a_4 = ~x_3(D); + goto __BB4; + + // Latch. + __BB(3): + global = a_1; + goto __BB4; + + __BB(4,loop_header(1)): + a_1 = __PHI (__BB2: a_4, __BB3: 0); + if (a_1 != 0) + goto __BB3; + else + goto __BB5; + + __BB(5): + return; +} + +// { dg-final { scan-tree-dump "Jumps threaded: 1" "statistics" { xfail *-*-* } } } diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c index dcabfdb30d2..414ebabe895 100644 --- a/gcc/tree-ssa-threadupdate.c +++ b/gcc/tree-ssa-threadupdate.c @@ -2766,10 +2766,12 @@ bool jt_path_registry::cancel_invalid_paths (vec &path) { gcc_checking_assert (!path.is_empty ()); - edge taken_edge = path[path.length () - 1]->e; - loop_p loop = taken_edge->src->loop_father; + edge entry = path[0]->e; + edge exit = path[path.length () - 1]->e; + loop_p loop = exit->src->loop_father; bool seen_latch = false; bool path_crosses_loops = false; + bool path_crosses_loop_header = false; for (unsigned int i = 0; i < path.length (); i++) { @@ -2793,6 +2795,14 @@ jt_path_registry::cancel_invalid_paths (vec &path) || e->dest->loop_father != loop) path_crosses_loops = true; + // ?? Avoid threading through loop headers that remain in the + // loop, as such threadings tend to create sub-loops which + // _might_ be OK ??. + if (e->dest->loop_father->header == e->dest + && !flow_loop_nested_p (exit->dest->loop_father, + e->dest->loop_father)) + path_crosses_loop_header = true; + if (flag_checking && !m_backedge_threads) gcc_assert ((path[i]->e->flags & EDGE_DFS_BACK) == 0); } @@ -2811,6 +2821,21 @@ jt_path_registry::cancel_invalid_paths (vec &path) cancel_thread (&path, "Path crosses loops"); return true; } + // The path should either start and end in the same loop or exit the + // loop it starts in but never enter a loop. This also catches + // creating irreducible loops, not only rotation. + if (entry->src->loop_father != exit->dest->loop_father + && !flow_loop_nested_p (exit->src->loop_father, + entry->dest->loop_father)) + { + cancel_thread (&path, "Path rotates loop"); + return true; + } + if (path_crosses_loop_header) + { + cancel_thread (&path, "Path crosses loop header but does not exit it"); + return true; + } return false; }