From patchwork Tue Oct 1 12:28:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 98208 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BB980386F457 for ; Tue, 1 Oct 2024 12:28:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2a07:de40:b251:101:10:150:64:2]) by sourceware.org (Postfix) with ESMTPS id 16D503844778 for ; Tue, 1 Oct 2024 12:28:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 16D503844778 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 16D503844778 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a07:de40:b251:101:10:150:64:2 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1727785704; cv=none; b=vue36vkKHaY2jKburGH86JH0JqdGobFeTn4A1dJtq1juHNxdnCabdVPNljZGgegUd326GbwFbl3o4dyV2SzZ3AHFUTb9vYzqGQIH+l2olRfm9GLCNSWwnLGGBxcKLbVbxydOWPAZsTYcQC2YJdb8q2F3dstO0g7vZ+ApAQz+CL4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1727785704; c=relaxed/simple; bh=HiBxcP1L8C2Z6qnCneb4Ud1SNIDXKMeUu5EOW33Jeo0=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:MIME-Version; b=BVGvph80ypCqIzYIP723kE8OZv96qPeYnqnktIryXylTEf4UdUZw1bUcVirBJfpsNVumhMhl1zrh0ClNjXRppxVnnRdD8qS8wjjTcjJ0KY8grzO31oZthMDufLahqANn5PyUHMavnjewkBeGycVEtX/MANgUJ2sNPUjuptXNUXg= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from murzim.nue2.suse.org (unknown [10.168.4.243]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 1A97B1F7FB for ; Tue, 1 Oct 2024 12:28:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1727785701; h=from:from:reply-to:date:date:to:to:cc:mime-version:mime-version: content-type:content-type; bh=g6t46Axcbf44Hj/FDBHTsRIStNUT1CyPLPuf9D3O7XM=; b=iLPXer8itcj6b9cAHWvQiCcHUd8iJcazKg779Pdl7FZh0/ezunI1IsElITf1SXrnlwAD7l KAe2l7U3gE0ohGrh+1Ssb5vLP6c75pwfKdzxoZq/czyoZyFUqzOvA38ktz3YA0KKcWRaR2 PAy11MDWl+iAsUOmDW/HmJ3EiWz335w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1727785701; h=from:from:reply-to:date:date:to:to:cc:mime-version:mime-version: content-type:content-type; bh=g6t46Axcbf44Hj/FDBHTsRIStNUT1CyPLPuf9D3O7XM=; b=weXq/nDUuef9pf5P3UoSShRuiaqDW4L4OVWbJPxdBB44E6k1S1GuTBRwwGgYc5FNb0I5dt 2AqzA4Uz85DKHqAA== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1727785701; h=from:from:reply-to:date:date:to:to:cc:mime-version:mime-version: content-type:content-type; bh=g6t46Axcbf44Hj/FDBHTsRIStNUT1CyPLPuf9D3O7XM=; b=iLPXer8itcj6b9cAHWvQiCcHUd8iJcazKg779Pdl7FZh0/ezunI1IsElITf1SXrnlwAD7l KAe2l7U3gE0ohGrh+1Ssb5vLP6c75pwfKdzxoZq/czyoZyFUqzOvA38ktz3YA0KKcWRaR2 PAy11MDWl+iAsUOmDW/HmJ3EiWz335w= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1727785701; h=from:from:reply-to:date:date:to:to:cc:mime-version:mime-version: content-type:content-type; bh=g6t46Axcbf44Hj/FDBHTsRIStNUT1CyPLPuf9D3O7XM=; b=weXq/nDUuef9pf5P3UoSShRuiaqDW4L4OVWbJPxdBB44E6k1S1GuTBRwwGgYc5FNb0I5dt 2AqzA4Uz85DKHqAA== Date: Tue, 1 Oct 2024 14:28:21 +0200 (CEST) From: Richard Biener To: gcc-patches@gcc.gnu.org Subject: [PATCH] tree-optimization/116902 - vectorizer load hosting breaks UID order #2 MIME-Version: 1.0 X-Spam-Level: X-Spamd-Result: default: False [-1.13 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MISSING_MID(2.50)[]; NEURAL_HAM_LONG(-0.33)[-0.328]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; RCPT_COUNT_ONE(0.00)[1]; RCVD_COUNT_ZERO(0.00)[0]; ARC_NA(0.00)[]; MISSING_XM_UA(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_DN_NONE(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[] X-Spam-Score: -1.13 X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, MISSING_MID, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org Message-Id: <20241001122850.BB980386F457@sourceware.org> This is another case of load hoisting breaking UID order in the preheader, this time between two hoistings. The easiest way out is to do what we do for the main stmt - copy instead of move. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/116902 PR tree-optimization/116842 * tree-vect-stmts.cc (sort_after_uid): Remove again. (hoist_defs_of_uses): Copy defs instead of hoisting them so we can zero their UID. (vectorizable_load): Separate analysis and transform call, do transform on the stmt copy. * g++.dg/torture/pr116902.C: New testcase. --- gcc/testsuite/g++.dg/torture/pr116902.C | 20 +++++++++ gcc/tree-vect-stmts.cc | 54 ++++++++++++------------- 2 files changed, 45 insertions(+), 29 deletions(-) create mode 100644 gcc/testsuite/g++.dg/torture/pr116902.C diff --git a/gcc/testsuite/g++.dg/torture/pr116902.C b/gcc/testsuite/g++.dg/torture/pr116902.C new file mode 100644 index 00000000000..cf195c8ac02 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr116902.C @@ -0,0 +1,20 @@ +// { dg-do compile } +// { dg-additional-options "-ftree-vectorize" } + +template +inline const _Tp& +min(const _Tp& __a, const _Tp& __b) +{ + if (__b < __a) + return __b; + return __a; +} + +unsigned a; +void i(long b, char c[][4], long d[][4]) { + for (char e; e; e++) + for (char f = 0; f < 021; f = b) + for (signed char g; g < 7; g += ~0) + for (bool h = 0; h < bool(d[f][f]); h = 1) + a = c[2][1] - min(c[1][f + 1], c[2][f + 2]); +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index b880f050715..584be52f423 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -9788,20 +9788,6 @@ permute_vec_elements (vec_info *vinfo, return data_ref; } -/* Comparator for qsort, sorting after GIMPLE UID. */ - -static int -sort_after_uid (const void *a_, const void *b_) -{ - const gimple *a = *(const gimple * const *)a_; - const gimple *b = *(const gimple * const *)b_; - if (gimple_uid (a) < gimple_uid (b)) - return -1; - else if (gimple_uid (a) > gimple_uid (b)) - return 1; - return 0; -} - /* Hoist the definitions of all SSA uses on STMT_INFO out of the loop LOOP, inserting them on the loops preheader edge. Returns true if we were successful in doing so (and thus STMT_INFO can be moved then), @@ -9809,15 +9795,15 @@ sort_after_uid (const void *a_, const void *b_) definitions of all SSA uses, it would be false when we are costing. */ static bool -hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p) +hoist_defs_of_uses (gimple *stmt, class loop *loop, bool hoist_p) { ssa_op_iter i; - tree op; - auto_vec to_hoist; + use_operand_p use_p; + auto_vec to_hoist; - FOR_EACH_SSA_TREE_OPERAND (op, stmt_info->stmt, i, SSA_OP_USE) + FOR_EACH_SSA_USE_OPERAND (use_p, stmt, i, SSA_OP_USE) { - gimple *def_stmt = SSA_NAME_DEF_STMT (op); + gimple *def_stmt = SSA_NAME_DEF_STMT (USE_FROM_PTR (use_p)); if (!gimple_nop_p (def_stmt) && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))) { @@ -9827,7 +9813,9 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p) dependencies within them. */ tree op2; ssa_op_iter i2; - if (gimple_code (def_stmt) == GIMPLE_PHI) + if (gimple_code (def_stmt) == GIMPLE_PHI + || (single_ssa_def_operand (def_stmt, SSA_OP_DEF) + == NULL_DEF_OPERAND_P)) return false; FOR_EACH_SSA_TREE_OPERAND (op2, def_stmt, i2, SSA_OP_USE) { @@ -9836,7 +9824,7 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p) && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt2))) return false; } - to_hoist.safe_push (def_stmt); + to_hoist.safe_push (use_p); } } @@ -9846,14 +9834,21 @@ hoist_defs_of_uses (stmt_vec_info stmt_info, class loop *loop, bool hoist_p) if (!hoist_p) return true; - /* Preserve UID order, otherwise we break dominance checks. */ - to_hoist.qsort (sort_after_uid); - - for (gimple *def_stmt : to_hoist) + /* Instead of moving defs we copy them so we can zero their UID to not + confuse dominance queries in the preheader. */ + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); + for (use_operand_p use_p : to_hoist) { - gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt); - gsi_remove (&gsi, false); - gsi_insert_on_edge_immediate (loop_preheader_edge (loop), def_stmt); + gimple *def_stmt = SSA_NAME_DEF_STMT (USE_FROM_PTR (use_p)); + gimple *copy = gimple_copy (def_stmt); + gimple_set_uid (copy, 0); + def_operand_p def_p = single_ssa_def_operand (def_stmt, SSA_OP_DEF); + tree new_def = duplicate_ssa_name (DEF_FROM_PTR (def_p), copy); + update_stmt (copy); + def_p = single_ssa_def_operand (copy, SSA_OP_DEF); + SET_DEF (def_p, new_def); + SET_USE (use_p, new_def); + gsi_insert_before (&gsi, copy, GSI_SAME_STMT); } return true; @@ -10227,7 +10222,7 @@ vectorizable_load (vec_info *vinfo, transform time. */ bool hoist_p = (LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) && !nested_in_vect_loop - && hoist_defs_of_uses (stmt_info, loop, !costing_p)); + && hoist_defs_of_uses (stmt_info->stmt, loop, false)); if (costing_p) { enum vect_cost_model_location cost_loc @@ -10264,6 +10259,7 @@ vectorizable_load (vec_info *vinfo, gimple *new_stmt = gimple_build_assign (scalar_dest, rhs); gimple_set_vuse (new_stmt, vuse); gsi_insert_on_edge_immediate (pe, new_stmt); + hoist_defs_of_uses (new_stmt, loop, true); } /* These copies are all equivalent. */ if (hoist_p)