From patchwork Sat Jan 28 01:04:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andrew MacLeod X-Patchwork-Id: 63829 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0259C3858433 for ; Sat, 28 Jan 2023 01:05:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0259C3858433 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1674867929; bh=O9/BU5c0SOrRHns0NzS3RnZ0g9vxAjqwsxSGTHy9KN4=; h=Date:To:Cc:Subject:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=Oa6v41tN8u3PKCVprrD0l6PElcx0f61Y8N9n11/6ZpDIsmY/s5NOb7KoQ2hopRplU 5gGtafAr74rjWxyhWyEqPlNiXC1nb4qLPsn3M5no1QM9DVWKxyjHdodTThE/ichdNU 28hmZHGCov/wUqc4TXdXwcuaMSOVREd+V80pt5TI= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 658BC3858C54 for ; Sat, 28 Jan 2023 01:04:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 658BC3858C54 Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-643-RDkHCobjONusLca2tiwKsA-1; Fri, 27 Jan 2023 20:04:53 -0500 X-MC-Unique: RDkHCobjONusLca2tiwKsA-1 Received: by mail-qv1-f72.google.com with SMTP id lp6-20020a056214590600b0053a07384c95so397584qvb.4 for ; Fri, 27 Jan 2023 17:04:53 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=subject:from:cc:to:content-language:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=l19Vz5csF6IDuH90rRYnNFHO+BB1gJwc4UTY0GQA3f8=; b=ZD9c1HyQb1Z0MCP3o7BeF7KD7laR5Z8KQf6g4G36ULlovPlVBIcy5F7IW1JdaT/NEV v+OthzkCS3ZyQT3qN1fDoH3bQvh2drRP51X2lQB1tLLyonYNri4A68EnVMt5Cdy7k3kZ 6NcdGUI4TKTIcukW/OR+1G3t2ev+crt6LJXYnyRbAToiThUYyi+S5Ih6nPtczN3uvDNi 6alVf6Jo9sbm+4FAyk8CPuHM7NHgZpxq0grk2lA1EX7bxVcnhGjv8Ip9zHudI9/bWT+K yoox4c+NsQcaF1BWuKfI+NU7p9pLiGSYJ7CRPZW15SpiK1HmgdmOJ69xt8pW4dx4aWWC kUUw== X-Gm-Message-State: AFqh2kq+AnKTEVP3fIrl3oqZA6XsiqPuc6jr/8Cg60u2k8ux6QGusWnT cvAHANb4kHsIxb9++rpw/nuCu9RTsUFSA9sbYfFthGLW6EjmQI+XGYL1Wxn1YFDeXkSbzmjQKPD 2K/Mwu5xOjlNpd6YjU7a6TtRGt4olEdRTxxyPbPQbFcndm/s9YzOPIQ+UMa60pUNT8kTPoQ== X-Received: by 2002:a0c:bed2:0:b0:534:26eb:a25d with SMTP id f18-20020a0cbed2000000b0053426eba25dmr57982848qvj.40.1674867892583; Fri, 27 Jan 2023 17:04:52 -0800 (PST) X-Google-Smtp-Source: AMrXdXtcpGCQNLpWm7mpAnVcgLcBWUrP+HhjE2Y6Ckq1BqNSoht+VdF+yyytc4zydzBOcsvH3eBNnw== X-Received: by 2002:a0c:bed2:0:b0:534:26eb:a25d with SMTP id f18-20020a0cbed2000000b0053426eba25dmr57982828qvj.40.1674867892259; Fri, 27 Jan 2023 17:04:52 -0800 (PST) Received: from ?IPV6:2607:fea8:a263:f600::fa90? ([2607:fea8:a263:f600::fa90]) by smtp.gmail.com with ESMTPSA id eb10-20020a05620a480a00b007112aa42c4fsm3774398qkb.135.2023.01.27.17.04.51 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 27 Jan 2023 17:04:51 -0800 (PST) Message-ID: Date: Fri, 27 Jan 2023 20:04:50 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.6.0 To: gcc-patches Cc: "hernandez, aldy" Subject: [PATCH 2/3] PR tree-optimization/108359 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Andrew MacLeod via Gcc-patches From: Andrew MacLeod Reply-To: Andrew MacLeod Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" If there exists an equivalence relationship between op1 and op2,any binary operation can be broken into individual operations and unioned if there are sufficiently few elements in the set. This depends on the first patch as we need to get the relation op1 == op2 correct in to the relation_trio set. There were various suggestions on how to "control" when we perform the sub-folds and when we continue using the aggregate. I settled on passing into wi_fold_in_parts_equiv a limit, which is currently either 0 or 8.  As long as there are 8 or less ranges n the subrange, we split it up, until the range we are accumulating into reaches 32 subranges... then we just do aggregates again. That should allow us to do a "good job" for a while, but bail before it becomes wasteful. Bootstraps on x86_64-pc-linux-gnu with no regressions,  OK for trunk? (3rd patch is delayed while I look into it further) Andrew From 409fc68cd85953c77e02588057a9eb0d21991475 Mon Sep 17 00:00:00 2001 From: Andrew MacLeod Date: Tue, 17 Jan 2023 11:14:41 -0500 Subject: [PATCH 2/4] Utilize op1 == op2 when invoking range-ops folding. If there exists an equivalence relationship between op1 and op2, any binary operation can be broken into individual operations and unioned if there are sufficently few elements in the set. PR tree-optimization/108359 gcc/ * range-op.cc (range_operator::wi_fold_in_parts_equiv): New. (range_operator::fold_range): If op1 is equivalent to op2 then invoke new fold_in_parts_equiv to operate on sub-components. * range-op.h (wi_fold_in_parts_equiv): New prototype. gcc/testsuite/ * gcc.dg/pr108359.c: New. --- gcc/range-op.cc | 54 +++++++++++++++++++++++++++++++++ gcc/range-op.h | 6 ++++ gcc/testsuite/gcc.dg/pr108359.c | 52 +++++++++++++++++++++++++++++++ 3 files changed, 112 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/pr108359.c diff --git a/gcc/range-op.cc b/gcc/range-op.cc index ed2dd1eb99c..f7c1e84e0bd 100644 --- a/gcc/range-op.cc +++ b/gcc/range-op.cc @@ -160,6 +160,38 @@ range_operator::wi_fold (irange &r, tree type, r.set_varying (type); } +// Call wi_fold when both op1 and op2 are equivalent. Further split small +// subranges into constants. This can provide better precision. +// For x + y, when x == y with a range of [0,4] instead of [0, 8] produce +// [0,0][2, 2][4,4][6, 6][8, 8] +// LIMIT is the maximum number of elements in range allowed before we +// do not processs them individually. + +void +range_operator::wi_fold_in_parts_equiv (irange &r, tree type, + const wide_int &lh_lb, + const wide_int &lh_ub, + unsigned limit) const +{ + int_range_max tmp; + widest_int lh_range = wi::sub (widest_int::from (lh_ub, TYPE_SIGN (type)), + widest_int::from (lh_lb, TYPE_SIGN (type))); + // if there are 1 to 8 values in the LH range, split them up. + r.set_undefined (); + if (lh_range >= 0 && lh_range < limit) + { + for (unsigned x = 0; x <= lh_range; x++) + { + wide_int val = lh_lb + x; + wi_fold (tmp, type, val, val, val, val); + r.union_ (tmp); + } + } + // Otherwise just call wi_fold. + else + wi_fold (r, type, lh_lb, lh_ub, lh_lb, lh_ub); +} + // Call wi_fold, except further split small subranges into constants. // This can provide better precision. For something 8 >> [0,1] // Instead of [8, 16], we will produce [8,8][16,16] @@ -234,6 +266,28 @@ range_operator::fold_range (irange &r, tree type, unsigned num_lh = lh.num_pairs (); unsigned num_rh = rh.num_pairs (); + // If op1 and op2 are equivalences, then we don't need a complete cross + // product, just pairs of matching elements. + if (relation_equiv_p (rel) && lh == rh) + { + int_range_max tmp; + r.set_undefined (); + for (unsigned x = 0; x < num_lh; ++x) + { + // If the number of subranges is too high, limit subrange creation. + unsigned limit = (r.num_pairs () > 32) ? 0 : 8; + wide_int lh_lb = lh.lower_bound (x); + wide_int lh_ub = lh.upper_bound (x); + wi_fold_in_parts_equiv (tmp, type, lh_lb, lh_ub, limit); + r.union_ (tmp); + if (r.varying_p ()) + break; + } + op1_op2_relation_effect (r, type, lh, rh, rel); + update_known_bitmask (r, m_code, lh, rh); + return true; + } + // If both ranges are single pairs, fold directly into the result range. // If the number of subranges grows too high, produce a summary result as the // loop becomes exponential with little benefit. See PR 103821. diff --git a/gcc/range-op.h b/gcc/range-op.h index b7b8a3b9473..f00b747f08a 100644 --- a/gcc/range-op.h +++ b/gcc/range-op.h @@ -109,6 +109,12 @@ protected: const wide_int &rh_lb, const wide_int &rh_ub) const; + // Called by fold range to split small subranges into parts when op1 == op2 + void wi_fold_in_parts_equiv (irange &r, tree type, + const wide_int &lb, + const wide_int &ub, + unsigned limit) const; + // Tree code of the range operator or ERROR_MARK if unknown. tree_code m_code; }; diff --git a/gcc/testsuite/gcc.dg/pr108359.c b/gcc/testsuite/gcc.dg/pr108359.c new file mode 100644 index 00000000000..7cf04f74132 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr108359.c @@ -0,0 +1,52 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ + +/* PR test case. */ +int b = 10; +int c; +char e; +void foo(); +static char(a)(char f, char g) { return f && g == 1 ? 0 : f % g; } +short(d)(short f, short g) { return f * g; } +int main() { + short h; + int i; + unsigned j; + h = d(b && c, 5); + j = h; + i = a(h, 237); + unsigned k = i; + e = i < 0 || k >= 32 ? 0 : i >> k; + if (e) { + c = 0; + foo(); + } +} + + +/* Also Check that small ranges are broken down and optimized properly + This function should never call foo (). */ + +int otherfunc (int x, int z) { + if (x < 1 || x > 6 ) + return 0; + + if (x == z) + { + if (x >> z > 0) + foo (); + if (x * z > 26 && x * z < 35) + foo (); + if (x + z == 5) + foo (); + if ((x + z) % 2 == 1) + foo (); + if (x / z != 1) + foo (); + + } + return 0; +} + + +/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */ -- 2.39.0