From patchwork Tue Oct 26 09:19:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 46653 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DC5573858410 for ; Tue, 26 Oct 2021 09:20:15 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DC5573858410 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1635240015; bh=JwqdHKJUct5ilPB9aDFe+pib3ELbzEkP353RzQy9FU4=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=mUTFuE7pBFJWnCOcW1XsmEVX0PqPnruJB5EeK8xTOP0LPsxjJXUgSDL304RP9aVkZ Bw+fVr2WTVCAJZsRy76/l5ouqiKa2Szrph3w9fBXGFu5P0uQRsU6kPNWux+Ke45IJ/ ZR384iH16lKP2b9ARGUuaQieWIJqTRRNv3/uieks= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 157303858400 for ; Tue, 26 Oct 2021 09:19:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 157303858400 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id A39F721940 for ; Tue, 26 Oct 2021 09:19:44 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 8ED2713B1F for ; Tue, 26 Oct 2021 09:19:44 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id vqSnITDId2ESbwAAMHmgww (envelope-from ) for ; Tue, 26 Oct 2021 09:19:44 +0000 Date: Tue, 26 Oct 2021 11:19:44 +0200 (CEST) To: gcc-patches@gcc.gnu.org Subject: [PATCH] Move negative stride bias out of dr_misalignment Message-ID: <24qs317-4090-n6n6-1o2n-40r52no5313n@fhfr.qr> MIME-Version: 1.0 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Biener via Gcc-patches From: Richard Biener Reply-To: Richard Biener Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This moves applying of a bias for negative stride accesses out of dr_misalignment in favor of a more general optional offset argument. The negative bias is now computed by get_load_store_type and applied accordingly to determine the alignment support scheme. Likewise the peeling/versioning code is adjusted albeit that still assumes we'll end up with VMAT_CONTIGUOUS_DOWN or VMAT_CONTIGUOUS_REVERSE but at least when not so (VMAT_STRIDED_SLP is one possibility) then get_load_store_type will _not_ falsely report an aligned access but instead an access with known misalignment. This fixes PR96109. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2021-10-25 Richard Biener PR tree-optimization/96109 * tree-vectorizer.h (dr_misalignment): Add optional offset parameter. * tree-vect-data-refs.c (dr_misalignment): Likewise. Remove offset applied for negative stride accesses. (vect_enhance_data_refs_alignment): Compute negative stride access offset and pass it to dr_misalignment. * tree-vect-stmts.c (get_negative_load_store_type): Pass negative offset to dr_misalignment. (get_group_load_store_type): Likewise. (get_load_store_type): Likewise. (vectorizable_store): Remove asserts about alignment. (vectorizable_load): Likewise. --- gcc/tree-vect-data-refs.c | 58 ++++++++++++++++++++++++++------------- gcc/tree-vect-stmts.c | 19 +++++-------- gcc/tree-vectorizer.h | 3 +- 3 files changed, 48 insertions(+), 32 deletions(-) diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c index 46360c50bd4..cda89ef9725 100644 --- a/gcc/tree-vect-data-refs.c +++ b/gcc/tree-vect-data-refs.c @@ -887,10 +887,11 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, slp_instance instance) return res; } -/* Return the misalignment of DR_INFO accessed in VECTYPE. */ +/* Return the misalignment of DR_INFO accessed in VECTYPE with OFFSET + applied. */ int -dr_misalignment (dr_vec_info *dr_info, tree vectype) +dr_misalignment (dr_vec_info *dr_info, tree vectype, poly_int64 offset) { HOST_WIDE_INT diff = 0; /* Alignment is only analyzed for the first element of a DR group, @@ -919,13 +920,9 @@ dr_misalignment (dr_vec_info *dr_info, tree vectype) targetm.vectorize.preferred_vector_alignment (vectype))) return DR_MISALIGNMENT_UNKNOWN; - /* If this is a backward running DR then first access in the larger - vectype actually is N-1 elements before the address in the DR. - Adjust misalign accordingly. */ - poly_int64 misalignment = misalign + diff; - if (tree_int_cst_sgn (DR_STEP (dr_info->dr)) < 0) - misalignment += ((TYPE_VECTOR_SUBPARTS (vectype) - 1) - * -TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)))); + /* Apply the offset from the DR group start and the externally supplied + offset which can for example result from a negative stride access. */ + poly_int64 misalignment = misalign + diff + offset; /* vect_compute_data_ref_alignment will have ensured that target_alignment is constant and otherwise set misalign to DR_MISALIGNMENT_UNKNOWN. */ @@ -1549,8 +1546,15 @@ vect_get_peeling_costs_all_drs (loop_vec_info loop_vinfo, int misalignment; unsigned HOST_WIDE_INT alignment; + bool negative = tree_int_cst_compare (DR_STEP (dr_info->dr), + size_zero_node) < 0; + poly_int64 off = 0; + if (negative) + off = ((TYPE_VECTOR_SUBPARTS (vectype) - 1) + * -TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)))); + if (npeel == 0) - misalignment = dr_misalignment (dr_info, vectype); + misalignment = dr_misalignment (dr_info, vectype, off); else if (dr_info == dr0_info || vect_dr_aligned_if_peeled_dr_is (dr_info, dr0_info)) misalignment = 0; @@ -1560,7 +1564,7 @@ vect_get_peeling_costs_all_drs (loop_vec_info loop_vinfo, misalignment = DR_MISALIGNMENT_UNKNOWN; else { - misalignment = dr_misalignment (dr_info, vectype); + misalignment = dr_misalignment (dr_info, vectype, off); misalignment += npeel * TREE_INT_CST_LOW (DR_STEP (dr_info->dr)); misalignment &= alignment - 1; } @@ -1960,8 +1964,11 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) */ unsigned int target_align = DR_TARGET_ALIGNMENT (dr_info).to_constant (); - unsigned int dr_size = vect_get_scalar_dr_size (dr_info); - unsigned int mis = dr_misalignment (dr_info, vectype); + unsigned HOST_WIDE_INT dr_size = vect_get_scalar_dr_size (dr_info); + poly_int64 off = 0; + if (negative) + off = (TYPE_VECTOR_SUBPARTS (vectype) - 1) * -dr_size; + unsigned int mis = dr_misalignment (dr_info, vectype, off); mis = negative ? mis : -mis; if (mis != 0) npeel_tmp = (mis & (target_align - 1)) / dr_size; @@ -2238,8 +2245,13 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) updating DR_MISALIGNMENT values. The peeling factor is the vectorization factor minus the misalignment as an element count. */ + tree vectype = STMT_VINFO_VECTYPE (stmt_info); + poly_int64 off = 0; + if (negative) + off = ((TYPE_VECTOR_SUBPARTS (vectype) - 1) + * -TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)))); unsigned int mis - = dr_misalignment (dr0_info, STMT_VINFO_VECTYPE (stmt_info)); + = dr_misalignment (dr0_info, vectype, off); mis = negative ? mis : -mis; /* If known_alignment_for_access_p then we have set DR_MISALIGNMENT which is only done if we know it at compiler @@ -2383,19 +2395,27 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) FOR_EACH_VEC_ELT (datarefs, i, dr) { dr_vec_info *dr_info = loop_vinfo->lookup_dr (dr); - stmt_vec_info stmt_info = dr_info->stmt; - tree vectype = STMT_VINFO_VECTYPE (stmt_info); - int misalignment; - if ((misalignment = dr_misalignment (dr_info, vectype)) == 0 - || !vect_relevant_for_alignment_p (dr_info)) + if (!vect_relevant_for_alignment_p (dr_info)) continue; + stmt_vec_info stmt_info = dr_info->stmt; if (STMT_VINFO_STRIDED_P (stmt_info)) { do_versioning = false; break; } + tree vectype = STMT_VINFO_VECTYPE (stmt_info); + bool negative = tree_int_cst_compare (DR_STEP (dr), + size_zero_node) < 0; + poly_int64 off = 0; + if (negative) + off = ((TYPE_VECTOR_SUBPARTS (vectype) - 1) + * -TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)))); + int misalignment; + if ((misalignment = dr_misalignment (dr_info, vectype, off)) == 0) + continue; + enum dr_alignment_support supportable_dr_alignment = vect_supportable_dr_alignment (loop_vinfo, dr_info, vectype, misalignment); diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index 1f56e10709e..9a40493686c 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -2001,7 +2001,7 @@ get_negative_load_store_type (vec_info *vinfo, *poffset = ((-TYPE_VECTOR_SUBPARTS (vectype) + 1) * TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (vectype)))); - int misalignment = dr_misalignment (dr_info, vectype); + int misalignment = dr_misalignment (dr_info, vectype, *poffset); alignment_support_scheme = vect_supportable_dr_alignment (vinfo, dr_info, vectype, misalignment); if (alignment_support_scheme != dr_aligned @@ -2010,6 +2010,7 @@ get_negative_load_store_type (vec_info *vinfo, if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, "negative step but alignment required.\n"); + *poffset = 0; return VMAT_ELEMENTWISE; *poffset = 0; } @@ -2320,7 +2321,7 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, } else { - *misalignment = dr_misalignment (first_dr_info, vectype); + *misalignment = dr_misalignment (first_dr_info, vectype, *poffset); *alignment_support_scheme = vect_supportable_dr_alignment (vinfo, first_dr_info, vectype, *misalignment); @@ -2470,7 +2471,7 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, else *memory_access_type = VMAT_CONTIGUOUS; *misalignment = dr_misalignment (STMT_VINFO_DR_INFO (stmt_info), - vectype); + vectype, *poffset); *alignment_support_scheme = vect_supportable_dr_alignment (vinfo, STMT_VINFO_DR_INFO (stmt_info), @@ -8252,10 +8253,7 @@ vectorizable_store (vec_info *vinfo, align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info)); if (alignment_support_scheme == dr_aligned) - { - gcc_assert (aligned_access_p (first_dr_info, vectype)); - misalign = 0; - } + misalign = 0; else if (misalignment == DR_MISALIGNMENT_UNKNOWN) { align = dr_alignment (vect_dr_behavior (vinfo, first_dr_info)); @@ -8340,7 +8338,7 @@ vectorizable_store (vec_info *vinfo, ? dataref_offset : build_int_cst (ref_type, 0)); if (alignment_support_scheme == dr_aligned) - gcc_assert (aligned_access_p (first_dr_info, vectype)); + ; else TREE_TYPE (data_ref) = build_aligned_type (TREE_TYPE (data_ref), @@ -9589,10 +9587,7 @@ vectorizable_load (vec_info *vinfo, align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info)); if (alignment_support_scheme == dr_aligned) - { - gcc_assert (aligned_access_p (first_dr_info, vectype)); - misalign = 0; - } + misalign = 0; else if (misalignment == DR_MISALIGNMENT_UNKNOWN) { align = dr_alignment diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index 866d813a12c..52cdfbfae02 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -1606,7 +1606,8 @@ set_dr_misalignment (dr_vec_info *dr_info, int val) dr_info->misalignment = val; } -extern int dr_misalignment (dr_vec_info *dr_info, tree vectype); +extern int dr_misalignment (dr_vec_info *dr_info, tree vectype, + poly_int64 offset = 0); #define SET_DR_MISALIGNMENT(DR, VAL) set_dr_misalignment (DR, VAL)