From patchwork Mon Oct 17 07:43:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 58929 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DFF363858C55 for ; Mon, 17 Oct 2022 07:44:16 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa4.mentor.iphmx.com (esa4.mentor.iphmx.com [68.232.137.252]) by sourceware.org (Postfix) with ESMTPS id 77A973858C74 for ; Mon, 17 Oct 2022 07:43:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 77A973858C74 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.95,190,1661846400"; d="scan'208,223";a="84861938" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa4.mentor.iphmx.com with ESMTP; 16 Oct 2022 23:43:48 -0800 IronPort-SDR: S7/b+VLbtkOduEGXja4p3z/QeVh+KMa23/foOddz5M805frYjfmPDwWngwhrHQIV4HPvhr63l4 fVOc9vpTJvmRt5o4MXUTrIl8A5BPirqIKFuvI2qeecfGRVmHcwXxpRsAPYedn3/OGqVAbLHX4z SXQ7kofWrAlf3DDrJ0rCeMwrxR3T8GDGCLVgyaYLVXgsiJQox9MqlQlGATb7G3sBvkuwpc8SiE MQ3tNC2GXDQAk2haOqwtBZuZDErz8T7roNv9JljhfDRpwarxvWMuYc7uS0AHO6caifpgbEwQGn 4rQ= From: Thomas Schwinge To: Aldy Hernandez , Subject: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.) In-Reply-To: <20221011083137.336470-1-aldyh@redhat.com> References: <20221011083137.336470-1-aldyh@redhat.com> User-Agent: Notmuch/0.29.3+94~g74c3f1b (https://notmuchmail.org) Emacs/27.1 (x86_64-pc-linux-gnu) Date: Mon, 17 Oct 2022 09:43:37 +0200 Message-ID: <878rlej3o6.fsf@euler.schwinge.homeip.net> MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-13.mgc.mentorg.com (139.181.222.13) To svr-ies-mbx-10.mgc.mentorg.com (139.181.222.10) X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, KAM_LOTSOFHASH, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Hi! On 2022-10-11T10:31:37+0200, Aldy Hernandez via Gcc-patches wrote: > When solving 0 = _15 & 1, we calculate _15 as: > > [irange] int [-INF, -2][0, +INF] NONZERO 0xfffffffe > > The known value of _15 is [0, 1] NONZERO 0x1 which is intersected with > the above, yielding: > > [0, 1] NONZERO 0x0 > > This eventually gets copied to a _Bool [0, 1] NONZERO 0x0. > > This is problematic because here we have a bool which is zero, but > returns false for irange::zero_p, since the latter does not look at > nonzero bits. This causes logical_combine to assume the range is > not-zero, and all hell breaks loose. > > I think we should just normalize a nonzero mask of 0 to [0, 0] at > creation, thus avoiding all this. 1. This commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04 "[PR107195] Set range to zero when nonzero mask is 0" broke a GCC/nvptx offloading test case: UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}" Same for C++. I'll later send a patch (for the test case!) to fix that up. 2. Looking into this, I found that this commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04 "[PR107195] Set range to zero when nonzero mask is 0" actually enables a code transformation/optimization that GCC apparently has not been doing before! I've tried to capture that in the attached "Add 'c-c++-common/torture/pr107195-1.c' [PR107195]". Will you please verify that one? In its current '#if 1' configuration, it's all-PASS after commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04 "[PR107195] Set range to zero when nonzero mask is 0", whereas before, we get two calls to 'foo', because GCC apparently didnn't understand the relation (optimization opportunity) between 'r *= 2;' and the subsequent 'if (r & 1)'. I've left in the other '#if' variants in case you'd like to experiment with these, but would otherwise clean that up before pushing. Where does one put such a test case? Should the file be named 'pr107195' or something else? Do we scan 'optimized', or an earlier dump? At '-O1', the actual code transformation is visible already in the 'dom2' dump: [local count: 536870913]: gimple_assign + gimple_assign + goto ; [100.00%] - [local count: 1073741824]: - # gimple_phi + [local count: 536870912]: + # gimple_phi gimple_assign gimple_cond - goto ; [50.00%] + goto ; [100.00%] else - goto ; [50.00%] + goto ; [0.00%] [local count: 536870913]: gimple_call gimple_assign [local count: 1073741824]: - # gimple_phi + # gimple_phi gimple_return And, the actual "avoid second call 'foo'" optimization is visiable starting 'dom3': [local count: 536870913]: gimple_assign + goto ; [100.00%] - [local count: 1073741824]: - # gimple_phi - gimple_assign + [local count: 536870912]: + gimple_assign gimple_cond - goto ; [50.00%] + goto ; [100.00%] else - goto ; [50.00%] + goto ; [0.00%] [local count: 536870913]: - gimple_call - gimple_assign + gimple_assign + gimple_assign [local count: 1073741824]: - # gimple_phi + # gimple_phi gimple_return ..., but I don't know if either of those would be stable/appropriate to scan instead of 'optimized'? Grüße Thomas > PR tree-optimization/107195 > > gcc/ChangeLog: > > * value-range.cc (irange::set_range_from_nonzero_bits): Set range > to [0,0] when nonzero mask is 0. > > gcc/testsuite/ChangeLog: > > * gcc.dg/tree-ssa/pr107195-1.c: New test. > * gcc.dg/tree-ssa/pr107195-2.c: New test. > --- > gcc/testsuite/gcc.dg/tree-ssa/pr107195-1.c | 15 +++++++++++++++ > gcc/testsuite/gcc.dg/tree-ssa/pr107195-2.c | 16 ++++++++++++++++ > gcc/value-range.cc | 5 +++++ > 3 files changed, 36 insertions(+) > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-1.c > create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr107195-2.c > > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-1.c > new file mode 100644 > index 00000000000..a0c20dbd4b1 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-1.c > @@ -0,0 +1,15 @@ > +// { dg-do run } > +// { dg-options "-O1 -fno-tree-ccp" } > + > +int a, b; > +int main() { > + int c = 0; > + if (a) > + c = 1; > + c = 1 & (a && c) && b; > + if (a) { > + b = c; > + __builtin_abort (); > + } > + return 0; > +} > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-2.c > new file mode 100644 > index 00000000000..d447c78bdd3 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-2.c > @@ -0,0 +1,16 @@ > +// { dg-do run } > +// { dg-options "-O1" } > + > +int a, b; > +int main() { > + int c = 0; > + long d; > + for (; b < 1; b++) { > + (c && d) & 3 || a; > + d = c; > + c = -1; > + if (d) > + __builtin_abort(); > + } > + return 0; > +} > diff --git a/gcc/value-range.cc b/gcc/value-range.cc > index a14f9bc4394..e07d2aa9a5b 100644 > --- a/gcc/value-range.cc > +++ b/gcc/value-range.cc > @@ -2903,6 +2903,11 @@ irange::set_range_from_nonzero_bits () > } > return true; > } > + else if (popcount == 0) > + { > + set_zero (type ()); > + return true; > + } > return false; > } > > -- > 2.37.3 ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 From dc4644dcef05a1f21a9ebc194689f31412811387 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 17 Oct 2022 09:10:03 +0200 Subject: [PATCH] Add 'c-c++-common/torture/pr107195-1.c' [PR107195] ... to display optimization performed as of recent commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04 "[PR107195] Set range to zero when nonzero mask is 0". PR tree-optimization/107195 gcc/testsuite/ * c-c++-common/torture/pr107195-1.c: New. --- .../c-c++-common/torture/pr107195-1.c | 41 +++++++++++++++++++ 1 file changed, 41 insertions(+) create mode 100644 gcc/testsuite/c-c++-common/torture/pr107195-1.c diff --git a/gcc/testsuite/c-c++-common/torture/pr107195-1.c b/gcc/testsuite/c-c++-common/torture/pr107195-1.c new file mode 100644 index 000000000000..1e201c1f5e6c --- /dev/null +++ b/gcc/testsuite/c-c++-common/torture/pr107195-1.c @@ -0,0 +1,41 @@ +/* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'. */ + +/* { dg-additional-options -fdump-tree-optimized-raw } + { dg-skip-if {} { *-*-* } { {-flto -fno-fat-lto-objects} } { } } */ + +#if 1 +extern int +__attribute__((const)) +foo (int); +#else +int +__attribute__((noinline)) +foo (int x) +{ + return x & 2; +} +#endif + +int f (int r) +{ + if (foo (r)) /* If this first 'if' holds... */ + r *= 2; /* ..., 'r' now has a zero-value lower-most bit... */ + + if (r & 1) /* ..., so this second 'if' can never hold... */ + { /* ..., so this is unreachable. */ +#if 1 + /* In constrast, if the first 'if' does not hold ('foo (r) == 0'), the + second 'if' may hold, but we know ('foo' being 'const') that + 'foo (r) == 0', so don't have to re-evaluate it here: */ + r += foo (r); + /* Thus, if optimizing, we only ever expect one call of 'foo'. + { dg-final { scan-tree-dump-times {gimple_call