From patchwork Wed Oct 23 10:45:13 2024
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li
Subject: [PATCH 2/5] Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer
Date: Wed, 23 Oct 2024 18:45:13 +0800
Message-ID: <20241023104516.2818244-2-pan2.li@intel.com>
In-Reply-To: <20241023104516.2818244-1-pan2.li@intel.com>
References: <20241023104516.2818244-1-pan2.li@intel.com>
From: Pan Li <pan2.li@intel.com>

This patch allows the loop vectorizer to generate
MASK_LEN_STRIDED_LOAD{STORE} IR for memory accesses with a
loop-invariant stride.  Take the below loop as an example:

void
foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

Before this patch:

  66 │ _73 = .SELECT_VL (ivtmp_71, POLY_INT_CST [4, 4]);
  67 │ _52 = _54 * _73;
  68 │ vect__5.16_61 = .MASK_LEN_GATHER_LOAD (vectp_b.14_59, _58, 4, { 0, ... }, { -1, ... }, _73, 0);
  69 │ vect__7.17_63 = vect__5.16_61 + { 100, ... };
  70 │ .MASK_LEN_SCATTER_STORE (vectp_a.18_67, _58, 4, vect__7.17_63, { -1, ... }, _73, 0);
  71 │ vectp_b.14_60 = vectp_b.14_59 + _52;
  72 │ vectp_a.18_68 = vectp_a.18_67 + _52;
  73 │ ivtmp_72 = ivtmp_71 - _73;

After this patch:

  60 │ _70 = .SELECT_VL (ivtmp_68, POLY_INT_CST [4, 4]);
  61 │ _52 = _54 * _70;
  62 │ vect__5.16_58 = .MASK_LEN_STRIDED_LOAD (vectp_b.14_56, _55, { 0, ... }, { -1, ... }, _70, 0);
  63 │ vect__7.17_60 = vect__5.16_58 + { 100, ... };
  64 │ .MASK_LEN_STRIDED_STORE (vectp_a.18_64, _55, vect__7.17_60, { -1, ... }, _70, 0);
  65 │ vectp_b.14_57 = vectp_b.14_56 + _52;
  66 │ vectp_a.18_65 = vectp_a.18_64 + _52;
  67 │ ivtmp_69 = ivtmp_68 - _70;
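Compared with the gather/scatter form, the offset vector { 0, ... } scaled
by 4 is gone: the loop-invariant byte stride _55 is passed as a single
scalar operand and no scale argument is needed.  As a reference for
reviewers, below is a minimal scalar sketch of the intended element-wise
semantics, assuming the operand order shown in the dump above (base,
stride, else/rhs vector, mask, length, bias) and the usual len/bias
convention where bias is 0 or -1; all function and parameter names here
are illustrative only, not GCC APIs:

#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t elt; /* Element type of the access; int in foo above.  */

/* res[i] = *(base + i * stride) for each active element; masked-off
   elements take the else value in this model.  Stride is in bytes.  */
static void
strided_load_model (elt *res, const char *base, ptrdiff_t stride,
                    const elt *els, const _Bool *mask, long len, long bias)
{
  for (long i = 0; i < len + bias; i++)
    if (mask[i])
      memcpy (&res[i], base + i * stride, sizeof (elt));
    else
      res[i] = els[i];
}

/* *(base + i * stride) = rhs[i] for each active element only.  */
static void
strided_store_model (char *base, ptrdiff_t stride, const elt *rhs,
                     const _Bool *mask, long len, long bias)
{
  for (long i = 0; i < len + bias; i++)
    if (mask[i])
      memcpy (base + i * stride, &rhs[i], sizeof (elt));
}

Because the only addressing operand is that scalar stride, the first hunk
in vect_get_strided_load_store_ops below can return the CSE'd DR_STEP of
the data reference directly instead of materializing an offset vector.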
The following test suites passed for this patch:
* The x86 bootstrap test.
* The x86 full regression test.
* The riscv full regression test.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_get_strided_load_store_ops): Use the
	invariant stride as the offset directly when
	MASK_LEN_STRIDED_LOAD{STORE} is supported.
	(vectorizable_store): Generate MASK_LEN_STRIDED_STORE when the
	offset of the scatter is not a vector type.
	(vectorizable_load): Ditto, but generate MASK_LEN_STRIDED_LOAD
	for the gather.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
---
 gcc/tree-vect-stmts.cc | 45 +++++++++++++++++++++++++++++++++---------
 1 file changed, 36 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index e7f14c3144c..78d66a4ef9d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2950,6 +2950,15 @@ vect_get_strided_load_store_ops (stmt_vec_info stmt_info,
       *dataref_bump = cse_and_gimplify_to_preheader (loop_vinfo, bump);
     }
 
+  internal_fn ifn
+    = DR_IS_READ (dr) ? IFN_MASK_LEN_STRIDED_LOAD : IFN_MASK_LEN_STRIDED_STORE;
+  if (direct_internal_fn_supported_p (ifn, vectype, OPTIMIZE_FOR_SPEED))
+    {
+      *vec_offset = cse_and_gimplify_to_preheader (loop_vinfo,
+						   unshare_expr (DR_STEP (dr)));
+      return;
+    }
+
   /* The offset given in GS_INFO can have pointer type, so use the element
      type of the vector instead.  */
   tree offset_type = TREE_TYPE (gs_info->offset_vectype);
@@ -9194,10 +9203,20 @@ vectorizable_store (vec_info *vinfo,
 
 	      gcall *call;
 	      if (final_len && final_mask)
-		call = gimple_build_call_internal
-			 (IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr,
-			  vec_offset, scale, vec_oprnd, final_mask,
-			  final_len, bias);
+		{
+		  if (VECTOR_TYPE_P (TREE_TYPE (vec_offset)))
+		    call = gimple_build_call_internal (
+		      IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr,
+		      vec_offset, scale, vec_oprnd, final_mask, final_len,
+		      bias);
+		  else
+		    /* A non-vector offset means we prefer to emit
+		       MASK_LEN_STRIDED_STORE, which takes the stride as a
+		       direct scalar argument, over MASK_LEN_SCATTER_STORE.  */
+		    call = gimple_build_call_internal (
+		      IFN_MASK_LEN_STRIDED_STORE, 6, dataref_ptr,
+		      vec_offset, vec_oprnd, final_mask, final_len, bias);
+		}
 	      else if (final_mask)
 		call = gimple_build_call_internal
 			 (IFN_MASK_SCATTER_STORE, 5, dataref_ptr,
@@ -11194,11 +11213,19 @@ vectorizable_load (vec_info *vinfo,
 
 	      gcall *call;
 	      if (final_len && final_mask)
-		call
-		  = gimple_build_call_internal (IFN_MASK_LEN_GATHER_LOAD, 7,
-						dataref_ptr, vec_offset,
-						scale, zero, final_mask,
-						final_len, bias);
+		{
+		  if (VECTOR_TYPE_P (TREE_TYPE (vec_offset)))
+		    call = gimple_build_call_internal (
+		      IFN_MASK_LEN_GATHER_LOAD, 7, dataref_ptr, vec_offset,
+		      scale, zero, final_mask, final_len, bias);
+		  else
+		    /* A non-vector offset means we prefer to emit
+		       MASK_LEN_STRIDED_LOAD, which takes the stride as a
+		       direct scalar argument, over MASK_LEN_GATHER_LOAD.  */
+		    call = gimple_build_call_internal (
+		      IFN_MASK_LEN_STRIDED_LOAD, 6, dataref_ptr, vec_offset,
+		      zero, final_mask, final_len, bias);
+		}
 	      else if (final_mask)
 		call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, 5,
 						   dataref_ptr, vec_offset,