From patchwork Thu Apr  4 16:15:58 2024
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 88037
Date: Thu, 4 Apr 2024 17:15:58 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com
Subject: [PATCH]middle-end vect: adjust loop upper bounds when peeling for gaps and early break [PR114403]
List-Id: Gcc-patches mailing list

Hi All,

The report shows that we end up in a situation where the code has been peeled
for gaps and we have an early break.

The code for peeling for gaps assumes that a scalar loop needs to perform at
least one iteration.  However, this doesn't take into account early break,
where the scalar loop may not need to be executed.

That the early break loop can be partial is not accounted for in this
scenario.  Loop partiality is normally handled by setting bias_for_lowest to 1,
but when peeling for gaps we end up with 0, which means that the final partial
iteration is lost when the loop upper bounds are calculated:

Analyzing # of iterations of loop 1
  exit condition [8, + , 18446744073709551615] != 0
  bounds on difference of bases: -8 ... -8
  result:
    # of iterations 8, bounded by 8

and a VF=4 calculating:

Loop 1 iterates at most 1 times.
Loop 1 likely iterates at most 1 times.
Analyzing # of iterations of loop 1
  exit condition [1, + , 1](no_overflow) < bnd.5505_39
  bounds on difference of bases: 0 ... 4611686018427387902
Matching expression match.pd:2011, generic-match-8.cc:27
Applying pattern match.pd:2067, generic-match-1.cc:4813
  result:
    # of iterations bnd.5505_39 + 18446744073709551615, bounded by 4611686018427387902
Estimating sizes for loop 1
...
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_312 < bnd.5505_39)
   Exit condition will be eliminated in last copy.
size: 24-3, last_iteration: 24-5
  Loop size: 24
  Estimated size after unrolling: 26
;; Guessed iterations of loop 1 is 0.858446. New upper bound 1.

The upper bound should be 2, not 1.

This patch forces bias_for_lowest to be 1 when the loop has an early break,
even when peeling for gaps.

I have however not been able to write a standalone reproducer for this, so I
have no tests, but bootstrap and the LLVM build are fine now.

The testcase:

#define COUNT 9
#define SIZE COUNT * 4
#define TYPE unsigned long

TYPE x[SIZE], y[SIZE];

void __attribute__((noipa))
loop (TYPE val)
{
  for (int i = 0; i < COUNT; ++i)
    {
      if (x[i * 4] > val || x[i * 4 + 1] > val)
        return;
      x[i * 4] = y[i * 2] + 1;
      x[i * 4 + 1] = y[i * 2] + 2;
      x[i * 4 + 2] = y[i * 2 + 1] + 3;
      x[i * 4 + 3] = y[i * 2 + 1] + 4;
    }
}

does perform the peeling for gaps and early break; however, it creates a hybrid
loop, which works fine.  Adjusting the indices to be non-linear also works.  So
I'd like to submit the fix and work on a testcase separately if needed.

Bootstrapped and regtested on x86_64-pc-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR tree-optimization/114403
	* tree-vect-loop.cc (vect_transform_loop): Adjust upper bounds for when
	peeling for gaps and early break.

---
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 4375ebdcb493a90fd0501cbb4b07466077b525c3..bf1bb9b005c68fbb13ee1b1279424865b237245a 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -12139,7 +12139,8 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
   /* The minimum number of iterations performed by the epilogue.  This
      is 1 when peeling for gaps because we always need a final scalar
      iteration.  */
-  int min_epilogue_iters = LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) ? 1 : 0;
+  int min_epilogue_iters = LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
+                           && !LOOP_VINFO_EARLY_BREAKS (loop_vinfo) ? 1 : 0;
   /* +1 to convert latch counts to loop iteration counts,
      -min_epilogue_iters to remove iterations that cannot be performed
        by the vector code.  */
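
As a rough illustration of the arithmetic behind this change, here is a minimal
C sketch.  It assumes that the latch bound of the vector loop is derived as
ceil ((scalar latch bound + bias) / VF) - 1 when the final vector iteration may
be partial, and with a floor division otherwise.  The helper
vector_latch_bound and its parameters are hypothetical names chosen for this
example; this is a simplified model, not the code in tree-vect-loop.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code: models how a scalar latch bound is
   turned into a latch bound for the vector loop.  BIAS plays the role of
   bias_for_lowest, i.e. 1 - min_epilogue_iters.  */
static unsigned long
vector_latch_bound (unsigned long scalar_latch_bound, unsigned long vf,
                    unsigned long bias, bool final_iter_may_be_partial)
{
  assert (vf > 0);

  /* Adding the bias converts the latch count to an iteration count, minus
     any iterations reserved for the scalar epilogue.  */
  unsigned long iters = scalar_latch_bound + bias;

  /* Assumption: round up when the last vector iteration may be partial,
     round down otherwise.  */
  unsigned long vec_iters = final_iter_may_be_partial
                            ? (iters + vf - 1) / vf
                            : iters / vf;
  assert (vec_iters > 0);

  /* -1 converts the iteration count back to a latch count.  */
  return vec_iters - 1;
}
```

With the values from the dump (scalar latch bound 8, VF=4) and a final
iteration that may be partial, a bias of 0 gives a bound of 1, while a bias of
1 gives the expected bound of 2.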