From patchwork Mon Jul 15 11:56:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 93939
Date: Mon, 15 Jul 2024 13:56:02 +0200 (CEST)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/115843 - fix wrong-code with fully-masked loop and peeling
List-Id: Gcc-patches mailing list
Message-Id: <20240715115637.8F5C1386480A@sourceware.org>

When AVX512 uses a fully masked loop and peeling, we in some cases fail
to create the correct initial loop mask when the mask is composed of
multiple components.  The following fixes this by properly applying
the bias for the component to the shift amount.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

        PR tree-optimization/115843
        * tree-vect-loop-manip.cc
        (vect_set_loop_condition_partial_vectors_avx512): Properly
        bias the shift of the initial mask for alignment peeling.

        * gcc.dg/vect/pr115843.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr115843.c | 40 ++++++++++++++++++++++++++++
 gcc/tree-vect-loop-manip.cc          |  8 ++++--
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr115843.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr115843.c b/gcc/testsuite/gcc.dg/vect/pr115843.c
new file mode 100644
index 00000000000..f829d90b1ad
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115843.c
@@ -0,0 +1,40 @@
+/* { dg-additional-options "-mavx512vl --param vect-partial-vector-usage=2" { target { avx512f_runtime && avx512vl } } } */
+
+#include "tree-vect.h"
+
+typedef __UINT64_TYPE__ BITBOARD;
+BITBOARD KingPressureMask1[64], KingSafetyMask1[64];
+
+void __attribute__((noinline))
+foo()
+{
+  for (int i = 0; i < 64; i++)
+    {
+      if ((i & 7) == 0)
+        KingPressureMask1[i] = KingSafetyMask1[i + 1];
+      else if ((i & 7) == 7)
+        KingPressureMask1[i] = KingSafetyMask1[i - 1];
+      else
+        KingPressureMask1[i] = KingSafetyMask1[i];
+    }
+}
+
+BITBOARD verify[64]
+  = {1, 1, 2, 3, 4, 5, 6, 6, 9, 9, 10, 11, 12, 13, 14, 14, 17, 17, 18, 19,
+    20, 21, 22, 22, 25, 25, 26, 27, 28, 29, 30, 30, 33, 33, 34, 35, 36, 37, 38,
+    38, 41, 41, 42, 43, 44, 45, 46, 46, 49, 49, 50, 51, 52, 53, 54, 54, 57, 57,
+    58, 59, 60, 61, 62, 62};
+
+int main()
+{
+  check_vect ();
+
+#pragma GCC novect
+  for (int i = 0; i < 64; ++i)
+    KingSafetyMask1[i] = i;
+  foo ();
+  for (int i = 0; i < 64; ++i)
+    if (KingPressureMask1[i] != verify[i])
+      __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index ac13873cd88..57dbcbe862c 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1149,10 +1149,14 @@ vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
               /* ???  But when the shift amount isn't constant this requires
                  a round-trip to GRPs.  We could apply the bias to either
                  side of the compare instead.  */
-              tree shift = gimple_build (&preheader_seq, MULT_EXPR,
+              tree shift = gimple_build (&preheader_seq, MINUS_EXPR,
                                          TREE_TYPE (niters_skip), niters_skip,
                                          build_int_cst (TREE_TYPE (niters_skip),
-                                                        rgc.max_nscalars_per_iter));
+                                                        bias));
+              shift = gimple_build (&preheader_seq, MULT_EXPR,
+                                    TREE_TYPE (niters_skip), shift,
+                                    build_int_cst (TREE_TYPE (niters_skip),
+                                                   rgc.max_nscalars_per_iter));
               init_ctrl = gimple_build (&preheader_seq, LSHIFT_EXPR,
                                         TREE_TYPE (init_ctrl), init_ctrl,
                                         shift);
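
For readers following the arithmetic, here is a minimal standalone C
sketch of the shift amount before and after the change.  It only mirrors
the two expressions in the tree-vect-loop-manip.cc hunk.  The function
names shift_before and shift_after and the plain unsigned long
parameters are made up for illustration and are not GCC internals.  The
sketch also assumes niters_skip >= bias, a case the patch text does not
discuss.

```c
#include <assert.h>

/* Hypothetical scalar model, not GCC code.  Shift amount as computed
   before the patch: the per-component bias is ignored.  */
unsigned long
shift_before (unsigned long niters_skip, unsigned long max_nscalars_per_iter)
{
  return niters_skip * max_nscalars_per_iter;
}

/* Shift amount as computed after the patch: the bias is subtracted
   from the number of skipped iterations first, then the difference is
   scaled by the number of scalars per iteration.  */
unsigned long
shift_after (unsigned long niters_skip, unsigned long bias,
             unsigned long max_nscalars_per_iter)
{
  /* Assumption of this sketch only: rule out unsigned wrap-around.  */
  assert (niters_skip >= bias);
  return (niters_skip - bias) * max_nscalars_per_iter;
}
```

With a bias of 0 both forms agree; they differ only where the bias is
nonzero.  For example, with niters_skip = 5, bias = 4 and
max_nscalars_per_iter = 2 the old form yields 10 and the new form
yields 2.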