From patchwork Mon Dec 4 10:05:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Liu OS
X-Patchwork-Id: 81267
From: Hao Liu OS
To: "GCC-patches@gcc.gnu.org"
Subject: [PATCH] tree-optimization/PR112774 - SCEV: extend the chrec tree with a nonwrapping flag
Date: Mon, 4 Dec 2023 10:05:05 +0000

Loop vectorization cannot optimize the following case because SCEV fails
with "evolution of offset is not affine" (i + offset may overflow):

int A[1024 * 2];

int foo (unsigned offset, unsigned N)
{
  int sum = 0;

  for (unsigned i = 0; i < N; i++)
    sum += A[i + offset];

  return sum;
}

Actually, the niter pass can find nonwrapping induction variables (i.e.,
i + offset cannot overflow) by inferring from undefined behavior such as
out-of-bounds array accesses (see record_nonwrapping_iv).  But this
information is not yet shared with SCEV.  This patch adds a nonwrapping
flag to the chrec tree, which allows SCEV to reuse it to pass nonwrapping
checks such as scev_probably_wraps_p, so that loop vectorization can
finally succeed.

The new flag is defined as CHREC_NOWRAP(tree), and the dump output changes
from "{offset, +, 1}_1" to "{offset, +, 1}<nw>_1" (nw is short for
nonwrapping).  Two SCEV interfaces, record_nonwrapping_chrec and
nonwrapping_chrec_p, are added to set and check the flag respectively.
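To make the wrapping argument concrete, here is a minimal standalone sketch in plain C. It is not GCC code, and the helper names index_wraps, access_in_bounds, loop_is_defined and check_examples are hypothetical, introduced only for illustration. It shows that the unsigned sum i + offset can wrap, that a wrapped index can land back inside A, and that a loop whose accesses are all in bounds never wraps, which is the kind of fact niter infers from the array access.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Number of elements in the array A from the test case.  */
#define A_ELEMS (1024u * 2u)

/* Hypothetical helper: true if the unsigned sum i + offset wraps around,
   i.e. the mathematical sum exceeds UINT_MAX.  */
static bool
index_wraps (unsigned i, unsigned offset)
{
  return offset > UINT_MAX - i;
}

/* Hypothetical helper: true if A[i + offset] would be an in-bounds access.
   The addition is done in unsigned arithmetic, so it may wrap.  */
static bool
access_in_bounds (unsigned i, unsigned offset)
{
  return i + offset < A_ELEMS;
}

/* Hypothetical helper: true if every access A[i + offset] for
   i = 0 .. n-1 is in bounds, i.e. the loop has no undefined behavior.  */
static bool
loop_is_defined (unsigned offset, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    if (!access_in_bounds (i, offset))
      return false;
  return true;
}

/* Usage example: a few concrete cases.  */
static void
check_examples (void)
{
  /* 1 + UINT_MAX wraps to 0, which is a valid index into A ...  */
  assert (index_wraps (1u, UINT_MAX));
  assert (access_in_bounds (1u, UINT_MAX));
  /* ... but iteration 0 already accessed A[UINT_MAX], so the loop as a
     whole is not defined.  */
  assert (!loop_is_defined (UINT_MAX, 2u));
  /* A loop that stays in bounds never wraps.  */
  assert (loop_is_defined (10u, 100u));
  assert (!index_wraps (99u, 10u));
}
```

The second assertion is why an access that is not executed on every iteration needs extra care: a wrapped index may look valid later even though an earlier index was not.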

However, an extra problem is caused by resetting the SCEV cache (i.e.,
scev_reset or scev_reset_htab), which may not be synchronized with calls
to free_numbers_of_iterations_estimates.  That function sets
loop->estimate_state to EST_NOT_COMPUTED, which makes sure the above
inference from array accesses is run again.  In other words, the
nonwrapping flag could be cleared and lost by resetting the SCEV cache if
loop->estimate_state is not reset as well.  E.g.,
gimple_duplicate_loop_body_to_header_edge/flush_ssaname_freelist call
scev_reset/scev_reset_htab to clear the SCEV cache, but estimate_state is
still kept as EST_AVAILABLE, so the flag will not be set during loop
vectorization.

This patch uses a simple fix: it calls
free_numbers_of_iterations_estimates in vect_analyze_loop, which makes
sure the flag is always set appropriately in loop vectorization.  This fix
is a bit ad hoc (it works for loop vectorization only); if there is a more
reasonable method, I will revert the simple fix and try that.

This patch is bootstrapped and tested on aarch64-linux-gnu with no
regressions.  OK for trunk?

---
The patch is as follows:

[PATCH] SCEV: extend the chrec tree with a nonwrapping flag [PR112774]

The flag is defined as CHREC_NOWRAP(tree), and the dump output changes
from "{offset, +, 1}_1" to "{offset, +, 1}<nw>_1" (nw is short for
nonwrapping).  Two SCEV interfaces, record_nonwrapping_chrec and
nonwrapping_chrec_p, are added to set and check the flag respectively.

As resetting the SCEV cache (i.e., the chrec trees) may not reset
loop->estimate_state, free_numbers_of_iterations_estimates is called
explicitly in loop vectorization to make sure the flag can be calculated
appropriately by niter.

gcc/ChangeLog:

        PR tree-optimization/112774
        * tree-pretty-print.cc: If the nonwrapping flag is set, print the
        chrec with additional info.
        * tree-scalar-evolution.cc: Add record_nonwrapping_chrec and
        nonwrapping_chrec_p to set and check the new flag respectively.
        * tree-scalar-evolution.h: Likewise.
        * tree-ssa-loop-niter.cc (idx_infer_loop_bounds,
        infer_loop_bounds_from_pointer_arith,
        infer_loop_bounds_from_signedness, scev_probably_wraps_p): Call
        record_nonwrapping_chrec before record_nonwrapping_iv, call
        nonwrapping_chrec_p to check the flag is set and return false from
        scev_probably_wraps_p.
        * tree-vect-loop.cc (vect_analyze_loop): Call
        free_numbers_of_iterations_estimates explicitly.
        * tree.h: Add CHREC_NOWRAP(NODE), base.nothrow_flag is used to
        represent the nonwrapping info.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/scev-16.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/scev-16.c | 17 +++++++++++++++++
 gcc/tree-pretty-print.cc                |  2 +-
 gcc/tree-scalar-evolution.cc            | 24 ++++++++++++++++++++++++
 gcc/tree-scalar-evolution.h             |  2 ++
 gcc/tree-ssa-loop-niter.cc              | 21 ++++++++++++++++-----
 gcc/tree-vect-loop.cc                   |  4 ++++
 gcc/tree.h                              |  8 +++++---
 7 files changed, 69 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/scev-16.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c b/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
new file mode 100644
index 00000000000..96ea36e4c65
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-vect-scev" } */
+
+int A[1024 * 2];
+
+int foo (unsigned offset, unsigned N)
+{
+  int sum = 0;
+
+  for (unsigned i = 0; i < N; i++)
+    sum += A[i + offset];
+
+  return sum;
+}
+
+/* { dg-final { scan-tree-dump "vec_transform_loop" "vect" } } */
+/* { dg-final { scan-tree-dump-not "missed: failed: evolution of offset is not affine" "vect" } } */
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index 1fadd752d05..0dabb6d1580 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -3488,7 +3488,7 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       dump_generic_node (pp, CHREC_LEFT (node), spc, flags, false);
       pp_string (pp, ", +, ");
       dump_generic_node (pp, CHREC_RIGHT (node), spc, flags, false);
-      pp_string (pp, "}_");
+      pp_string (pp, !CHREC_NOWRAP (node) ? "}_" : "}<nw>_");
       pp_scalar (pp, "%u", CHREC_VARIABLE (node));
       is_stmt = false;
       break;
diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index 065bcd0743d..a9112572e0c 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -2050,6 +2050,30 @@ analyze_scalar_evolution (class loop *loop, tree var)
   return res;
 }
 
+/* If CHREC doesn't overflow, set the nonwrapping flag.  */
+
+void record_nonwrapping_chrec (tree chrec)
+{
+  CHREC_NOWRAP(chrec) = 1;
+
+  if (dump_file && (dump_flags & TDF_SCEV))
+    {
+      fprintf (dump_file, "(record_nonwrapping_chrec: ");
+      print_generic_expr (dump_file, chrec);
+      fprintf (dump_file, ")\n");
+    }
+}
+
+/* Return true if CHREC's nonwrapping flag is set.  */
+
+bool nonwrapping_chrec_p (tree chrec)
+{
+  if (!chrec || TREE_CODE(chrec) != POLYNOMIAL_CHREC)
+    return false;
+
+  return CHREC_NOWRAP(chrec);
+}
+
 /* Analyzes and returns the scalar evolution of VAR address in LOOP.  */
 
 static tree
diff --git a/gcc/tree-scalar-evolution.h b/gcc/tree-scalar-evolution.h
index a64ed78fe63..f57fde12ee2 100644
--- a/gcc/tree-scalar-evolution.h
+++ b/gcc/tree-scalar-evolution.h
@@ -43,6 +43,8 @@ extern bool simple_iv (class loop *, class loop *, tree, struct affine_iv *,
                        bool);
 extern bool iv_can_overflow_p (class loop *, tree, tree, tree);
 extern tree compute_overall_effect_of_inner_loop (class loop *, tree);
+extern void record_nonwrapping_chrec (tree);
+extern bool nonwrapping_chrec_p (tree);
 
 /* Returns the basic block preceding LOOP, or the CFG entry block when the
    loop is function's body.  */
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 2098bef9a97..d465e0ed7e1 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -4206,11 +4206,15 @@ idx_infer_loop_bounds (tree base, tree *idx, void *dta)
 
   /* If access is not executed on every iteration, we must ensure that overlow
      may not make the access valid later.  */
-  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt))
-      && scev_probably_wraps_p (NULL_TREE,
-                                initial_condition_in_loop_num (ev, loop->num),
-                                step, data->stmt, loop, true))
-    upper = false;
+  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, gimple_bb (data->stmt)))
+    {
+      if (scev_probably_wraps_p (NULL_TREE,
+                                 initial_condition_in_loop_num (ev, loop->num),
+                                 step, data->stmt, loop, true))
+        upper = false;
+    }
+  else
+    record_nonwrapping_chrec (ev);
 
   record_nonwrapping_iv (loop, init, step, data->stmt, low, high, false, upper);
   return true;
@@ -4324,6 +4328,7 @@ infer_loop_bounds_from_pointer_arith (class loop *loop, gimple *stmt)
   if (flag_delete_null_pointer_checks && int_cst_value (low) == 0)
     low = build_int_cstu (TREE_TYPE (low), TYPE_ALIGN_UNIT (TREE_TYPE (type)));
 
+  record_nonwrapping_chrec (scev);
   record_nonwrapping_iv (loop, base, step, stmt, low, high, false, true);
 }
 
@@ -4371,6 +4376,7 @@ infer_loop_bounds_from_signedness (class loop *loop, gimple *stmt)
       high = wide_int_to_tree (type, r.upper_bound ());
     }
 
+  record_nonwrapping_chrec (scev);
   record_nonwrapping_iv (loop, base, step, stmt, low, high, false, true);
 }
 
@@ -5505,6 +5511,11 @@ scev_probably_wraps_p (tree var, tree base, tree step,
   if (loop_exits_before_overflow (base, step, at_stmt, loop))
     return false;
 
+  /* Check the nonwrapping flag, which may be set by niter analysis (e.g., the
+     above loop exits before overflow).  */
+  if (var && nonwrapping_chrec_p (analyze_scalar_evolution (loop, var)))
+    return false;
+
   /* At this point we still don't have a proof that the iv does not
      overflow: give up.  */
   return true;
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index dd584ab4a42..6261cd1be1d 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -3570,6 +3570,10 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
          analysis are done under the assumptions.  */
       loop_constraint_set (loop, LOOP_C_FINITE);
     }
+  else
+    /* Clear the existing niter information to make sure the nonwrapping flag
+       will be calculated and set appropriately.  */
+    free_numbers_of_iterations_estimates (loop);
 
   auto_vector_modes vector_modes;
   /* Autodetect first vector size we try.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 086b55f0375..59af8920f02 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1438,9 +1438,11 @@ class auto_suppress_location_wrappers
 #define COND_EXPR_ELSE(NODE)    (TREE_OPERAND (COND_EXPR_CHECK (NODE), 2))
 
 /* Accessors for the chains of recurrences.  */
-#define CHREC_LEFT(NODE)          TREE_OPERAND (POLYNOMIAL_CHREC_CHECK (NODE), 0)
-#define CHREC_RIGHT(NODE)         TREE_OPERAND (POLYNOMIAL_CHREC_CHECK (NODE), 1)
-#define CHREC_VARIABLE(NODE)      POLYNOMIAL_CHREC_CHECK (NODE)->base.u.chrec_var
+#define CHREC_LEFT(NODE)     TREE_OPERAND (POLYNOMIAL_CHREC_CHECK (NODE), 0)
+#define CHREC_RIGHT(NODE)    TREE_OPERAND (POLYNOMIAL_CHREC_CHECK (NODE), 1)
+#define CHREC_VARIABLE(NODE) POLYNOMIAL_CHREC_CHECK (NODE)->base.u.chrec_var
+/* Nonzero if this chrec doesn't overflow (i.e., nonwrapping).  */
+#define CHREC_NOWRAP(NODE)   POLYNOMIAL_CHREC_CHECK (NODE)->base.nothrow_flag
 
 /* LABEL_EXPR accessor. This gives access to the label associated with
    the given label expression.  */