From patchwork Wed Jan 10 23:42:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Xue OS
X-Patchwork-Id: 83803
From: Feng Xue OS
To: Richard Biener, "gcc-patches@gcc.gnu.org"
Subject: PING: [PATCH] Do not count unused scalar use when marking STMT_VINFO_LIVE_P [PR113091]
Date: Wed, 10 Jan 2024 23:42:45 +0000
List-Id: Gcc-patches mailing list

Hi, Richard,

  Would you please take a look at this patch?

Thanks,
Feng

diff --git a/gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c b/gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c
new file mode 100644
index 00000000000..ff822e90b4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bb-slp-pr113091.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -fdump-tree-slp-details -ftree-slp-vectorize" } */
+
+int test(unsigned array[8]);
+
+int foo(char *a, char *b)
+{
+  unsigned array[8];
+
+  array[0] = (a[0] - b[0]);
+  array[1] = (a[1] - b[1]);
+  array[2] = (a[2] - b[2]);
+  array[3] = (a[3] - b[3]);
+  array[4] = (a[4] - b[4]);
+  array[5] = (a[5] - b[5]);
+  array[6] = (a[6] - b[6]);
+  array[7] = (a[7] - b[7]);
+
+  return test(array);
+}
+
+/* { dg-final { scan-tree-dump-times "Basic block will be vectorized using SLP" 1 "slp2" } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index a82fca45161..d36ff37114e 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -6418,6 +6418,84 @@ vect_slp_analyze_node_operations (vec_info *vinfo, slp_tree node,
   return res;
 }
 
+/* Given a definition DEF, analyze if it will have any live scalar use after
+   performing SLP vectorization whose information is represented by BB_VINFO,
+   and record result into hash map SCALAR_USE_MAP as cache for later fast
+   check.  */
+
+static bool
+vec_slp_has_scalar_use (bb_vec_info bb_vinfo, tree def,
+                        hash_map<tree, bool> &scalar_use_map)
+{
+  imm_use_iterator use_iter;
+  gimple *use_stmt;
+
+  if (bool *res = scalar_use_map.get (def))
+    return *res;
+
+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, def)
+    {
+      if (is_gimple_debug (use_stmt))
+        continue;
+
+      stmt_vec_info use_stmt_info = bb_vinfo->lookup_stmt (use_stmt);
+
+      if (!use_stmt_info)
+        break;
+
+      if (PURE_SLP_STMT (vect_stmt_to_vectorize (use_stmt_info)))
+        continue;
+
+      /* Do not step forward when encountering a PHI statement, since it may
+         involve cyclic reference and cause infinite recursive invocation.  */
+      if (gimple_code (use_stmt) == GIMPLE_PHI)
+        break;
+
+      /* When pattern recognition is involved, a statement whose definition is
+         consumed in some pattern, may not be included in the final replacement
+         pattern statements, so would be skipped when building SLP graph.
+
+         * Original
+          char a_c = *(char *) a;
+          char b_c = *(char *) b;
+          unsigned short a_s = (unsigned short) a_c;
+          int a_i = (int) a_s;
+          int b_i = (int) b_c;
+          int r_i = a_i - b_i;
+
+         * After pattern replacement
+          a_s = (unsigned short) a_c;
+          a_i = (int) a_s;
+
+          patt_b_s = (unsigned short) b_c;    // b_i = (int) b_c
+          patt_b_i = (int) patt_b_s;          // b_i = (int) b_c
+
+          patt_r_s = widen_minus(a_c, b_c);   // r_i = a_i - b_i
+          patt_r_i = (int) patt_r_s;          // r_i = a_i - b_i
+
+         The definitions of a_i(original statement) and b_i(pattern statement)
+         are related to, but actually not part of widen_minus pattern.
+         Vectorizing the pattern does not cause these definition statements to
+         be marked as PURE_SLP.  For this case, we need to recursively check
+         whether their uses are all absorbed into vectorized code.  But there
+         is an exception that some use may participate in a vectorized
+         operation via an external SLP node containing that use as an element.
+         The parameter "scalar_use_map" tags such kind of SSA as having scalar
+         use in advance.  */
+      tree lhs = gimple_get_lhs (use_stmt);
+
+      if (!lhs || TREE_CODE (lhs) != SSA_NAME
+          || vec_slp_has_scalar_use (bb_vinfo, lhs, scalar_use_map))
+        break;
+    }
+
+  bool found = !end_imm_use_stmt_p (&use_iter);
+  bool added = scalar_use_map.put (def, found);
+
+  gcc_assert (!added);
+  return found;
+}
+
 /* Mark lanes of NODE that are live outside of the basic-block vectorized
    region and that can be vectorized using vectorizable_live_operation
    with STMT_VINFO_LIVE_P.  Not handled live operations will cause the
@@ -6427,6 +6505,7 @@ static void
 vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
                              slp_instance instance,
                              stmt_vector_for_cost *cost_vec,
+                             hash_map<tree, bool> &scalar_use_map,
                              hash_set<stmt_vec_info> &svisited,
                              hash_set<slp_tree> &visited)
 {
@@ -6451,32 +6530,22 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
       def_operand_p def_p;
       FOR_EACH_PHI_OR_STMT_DEF (def_p, orig_stmt, op_iter, SSA_OP_DEF)
         {
-          imm_use_iterator use_iter;
-          gimple *use_stmt;
-          stmt_vec_info use_stmt_info;
-          FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
-            if (!is_gimple_debug (use_stmt))
-              {
-                use_stmt_info = bb_vinfo->lookup_stmt (use_stmt);
-                if (!use_stmt_info
-                    || !PURE_SLP_STMT (vect_stmt_to_vectorize (use_stmt_info)))
-                  {
-                    STMT_VINFO_LIVE_P (stmt_info) = true;
-                    if (vectorizable_live_operation (bb_vinfo, stmt_info,
-                                                     node, instance, i,
-                                                     false, cost_vec))
-                      /* ???  So we know we can vectorize the live stmt
-                         from one SLP node.  If we cannot do so from all
-                         or none consistently we'd have to record which
-                         SLP node (and lane) we want to use for the live
-                         operation.  So make sure we can code-generate
-                         from all nodes.  */
-                      mark_visited = false;
-                    else
-                      STMT_VINFO_LIVE_P (stmt_info) = false;
-                    break;
-                  }
-              }
+          if (vec_slp_has_scalar_use (bb_vinfo, DEF_FROM_PTR (def_p),
+                                      scalar_use_map))
+            {
+              STMT_VINFO_LIVE_P (stmt_info) = true;
+              if (vectorizable_live_operation (bb_vinfo, stmt_info, node,
+                                               instance, i, false, cost_vec))
+                /* ???  So we know we can vectorize the live stmt from one SLP
+                   node.  If we cannot do so from all or none consistently
+                   we'd have to record which SLP node (and lane) we want to
+                   use for the live operation.  So make sure we can
+                   code-generate from all nodes.  */
+                mark_visited = false;
+              else
+                STMT_VINFO_LIVE_P (stmt_info) = false;
+            }
+
           /* We have to verify whether we can insert the lane extract
              before all uses.  The following is a conservative approximation.
              We cannot put this into vectorizable_live_operation because
@@ -6495,6 +6564,10 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
              from the latest stmt in a node.  So we compensate for this
              during code-generation, simply not replacing uses for those
              hopefully rare cases.  */
+          imm_use_iterator use_iter;
+          gimple *use_stmt;
+          stmt_vec_info use_stmt_info;
+
           if (STMT_VINFO_LIVE_P (stmt_info))
             FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
               if (!is_gimple_debug (use_stmt)
@@ -6517,8 +6590,56 @@ vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo, slp_tree node,
   slp_tree child;
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
     if (child && SLP_TREE_DEF_TYPE (child) == vect_internal_def)
-      vect_bb_slp_mark_live_stmts (bb_vinfo, child, instance,
-                                   cost_vec, svisited, visited);
+      vect_bb_slp_mark_live_stmts (bb_vinfo, child, instance, cost_vec,
+                                   scalar_use_map, svisited, visited);
+}
+
+/* Traverse all slp instances of BB_VINFO, and mark lanes of every node that
+   are live outside of the basic-block vectorized region and that can be
+   vectorized using vectorizable_live_operation with STMT_VINFO_LIVE_P.  */
+
+static void
+vect_bb_slp_mark_live_stmts (bb_vec_info bb_vinfo)
+{
+  if (bb_vinfo->slp_instances.is_empty ())
+    return;
+
+  hash_set<stmt_vec_info> svisited;
+  hash_set<slp_tree> visited;
+  hash_map<tree, bool> scalar_use_map;
+  auto_vec<slp_tree> worklist;
+
+  for (slp_instance instance : bb_vinfo->slp_instances)
+    if (!visited.add (SLP_INSTANCE_TREE (instance)))
+      worklist.safe_push (SLP_INSTANCE_TREE (instance));
+
+  do
+    {
+      slp_tree node = worklist.pop ();
+
+      if (SLP_TREE_DEF_TYPE (node) == vect_external_def)
+        {
+          for (tree op : SLP_TREE_SCALAR_OPS (node))
+            if (TREE_CODE (op) == SSA_NAME)
+              scalar_use_map.put (op, true);
+        }
+      else
+        {
+          for (slp_tree child : SLP_TREE_CHILDREN (node))
+            if (child && !visited.add (child))
+              worklist.safe_push (child);
+        }
+    } while (!worklist.is_empty ());
+
+  visited.empty ();
+
+  for (slp_instance instance : bb_vinfo->slp_instances)
+    {
+      vect_location = instance->location ();
+      vect_bb_slp_mark_live_stmts (bb_vinfo, SLP_INSTANCE_TREE (instance),
+                                   instance, &instance->cost_vec,
+                                   scalar_use_map, svisited, visited);
+    }
 }
 
 /* Determine whether we can vectorize the reduction epilogue for INSTANCE.  */
@@ -6684,17 +6805,7 @@ vect_slp_analyze_operations (vec_info *vinfo)
 
   /* Compute vectorizable live stmts.  */
   if (bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo))
-    {
-      hash_set<stmt_vec_info> svisited;
-      hash_set<slp_tree> visited;
-      for (i = 0; vinfo->slp_instances.iterate (i, &instance); ++i)
-        {
-          vect_location = instance->location ();
-          vect_bb_slp_mark_live_stmts (bb_vinfo, SLP_INSTANCE_TREE (instance),
-                                       instance, &instance->cost_vec, svisited,
-                                       visited);
-        }
-    }
+    vect_bb_slp_mark_live_stmts (bb_vinfo);
 
   return !vinfo->slp_instances.is_empty ();
 }
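
Illustration only, not part of the patch: the memoized recursive check that
vec_slp_has_scalar_use performs can be tried outside GCC on a toy def-use
graph. The sketch below is a simplified model under stated assumptions.
ToyUse, UseGraph, UseCache, has_scalar_use and widen_minus_example are all
hypothetical names, SSA names are modeled as strings, and the graph is assumed
to be acyclic (the patch avoids cycles by stopping at PHI statements).

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for one statement that uses a definition.  Every name in this
// sketch is hypothetical; none of it is GCC API.
struct ToyUse {
  bool in_region;   // false: the using statement lies outside the SLP region
  bool pure_slp;    // true: the use is fully absorbed into vector code
  bool is_phi;      // true: stop here, PHIs may form def-use cycles
  std::string lhs;  // name defined by the using statement, "" if none
};

using UseGraph = std::unordered_map<std::string, std::vector<ToyUse>>;
using UseCache = std::unordered_map<std::string, bool>;

// Returns true if DEF still has a live scalar use.  Results are memoized in
// CACHE; entries preset to true model operands of external SLP nodes.
bool has_scalar_use(const UseGraph &uses, const std::string &def,
                    UseCache &cache) {
  auto hit = cache.find(def);
  if (hit != cache.end())
    return hit->second;

  bool found = false;
  auto it = uses.find(def);
  if (it != uses.end()) {
    for (const ToyUse &use : it->second) {
      if (!use.in_region) {  // use outside the region keeps DEF live
        found = true;
        break;
      }
      if (use.pure_slp)      // use is vectorized, nothing scalar remains
        continue;
      // Otherwise the use counts only if its own result is still needed.
      if (use.is_phi || use.lhs.empty()
          || has_scalar_use(uses, use.lhs, cache)) {
        found = true;
        break;
      }
    }
  }

  // Like the gcc_assert in the patch: DEF must not have been cached already.
  bool inserted = cache.emplace(def, found).second;
  assert(inserted);
  (void)inserted;
  return found;
}

// Models the a_c -> a_s -> a_i chain from the comment in the patch, where
// the subtraction consuming a_i is replaced by a vectorized pattern.
UseGraph widen_minus_example() {
  UseGraph g;
  g["a_c"].push_back(ToyUse{true, false, false, "a_s"});      // a_s = (unsigned short) a_c
  g["a_c"].push_back(ToyUse{true, true, false, "patt_r_s"});  // widen_minus pattern
  g["a_s"].push_back(ToyUse{true, false, false, "a_i"});      // a_i = (int) a_s
  g["a_i"].push_back(ToyUse{true, true, false, "r_i"});       // r_i = a_i - b_i
  return g;
}
```

With an empty cache, a query for "a_c" on this graph reports no live scalar
use, since the whole conversion chain ends in vectorized code. Seeding the
cache with a_i = true before the query, which is how this sketch models an
operand of an external SLP node, makes the same query report a live use.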