From patchwork Wed Oct 23 10:45:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 99394 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 47A1F385843D for ; Wed, 23 Oct 2024 10:47:46 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) by sourceware.org (Postfix) with ESMTPS id 5BADE3858D21 for ; Wed, 23 Oct 2024 10:47:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5BADE3858D21 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5BADE3858D21 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680427; cv=none; b=TzHEDvtGC7qvxeHycBY5lL68N+uGHvy+6xLhzUH217qDnNtBophMSbG8uDEoQNxeNuwf7Rw0OUjZPQy8ez+MlcVrsw1w9eceYylfOVTDv5P+rjwJ1zJt29YYbkwn7zPPnMQW+YnqWtYkVItTU+Lzs8N0e3xN5LGkiq1pcIBVieo= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680427; c=relaxed/simple; bh=OCCjC7UA05WmCXLlOXDhaQ+BQ25sOZPpnlro5Mf19ng=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=UVq9WBSd+9Cdg5Sd1FZ3YqmD4wHh0mRp29dGor7Du8vk+mCTIvkiCptvJuF9isvaz0Uf8xclNZ7yXpMSb+6GlKmWfvxoRhswKveXUH/kQOIF3DhyPi7FfyrfitW/CWv3GuLBkiTQ0s+EKB31D8ce2cnLUtOoO4bGqG3FXsTaYYE= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729680424; x=1761216424; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; 
bh=OCCjC7UA05WmCXLlOXDhaQ+BQ25sOZPpnlro5Mf19ng=; b=fBMV15wadZg8pnHNzbDc2ilXzHUTQ1qgUkyaj3a+XLlZLj+KrPUsnwpC XdcQvqIWwBUzNsmsF/TKaJ0m2+KHWkp6Gy7QWS8sSx3EEvA/xmn36RGuk BgeDvPiCPdGbnhaF5D5oqO7Ji4augk8byFmNtw4Fj+Rapsr6HeXo5QpsP m1PTH6m3PtFjPBPmsxxrsKfPd18GP1D7Fr8TIyTe63WAHAiI7w9rj0El5 2HyQJEZiWb918WmIeL9rtHRY5zVrAmr9kAQtZfeJ/QzXXPAWVrTAnKmzB gQ3H8KAUQI/d65J3mmfoMKxfxisi43R8tzWTfxVWHlfVfdLeqtS0hcmbp A==; X-CSE-ConnectionGUID: rLVoNVAwQCOYsBo5ybxpVQ== X-CSE-MsgGUID: 3qTq5NlBRmecyDIzP37xfg== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="39808696" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="39808696" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Oct 2024 03:47:02 -0700 X-CSE-ConnectionGUID: PAT4Y29kRLCs9QJwDpIkQA== X-CSE-MsgGUID: llBKOOkSRAyjCQREyOnVRA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,225,1725346800"; d="scan'208";a="103436942" Received: from panli.sh.intel.com ([10.239.154.73]) by fmviesa002.fm.intel.com with ESMTP; 23 Oct 2024 03:47:00 -0700 From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li Subject: [PATCH 1/5] Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE} Date: Wed, 23 Oct 2024 18:45:12 +0800 Message-ID: <20241023104516.2818244-1-pan2.li@intel.com> X-Mailer: git-send-email 2.43.0 MIME-Version: 1.0 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org

From: Pan Li

This patch introduces new IFNs for strided load and store.

LOAD:  v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)

The IFNs target code similar to the example below:

void foo (int * a, int * b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i * stride] = b[i * stride];
}

The following test suites passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

	* internal-fn.cc (strided_load_direct): Add new define direct for
	strided load.
	(strided_store_direct): Ditto but for store.
	(expand_strided_load_optab_fn): Add new func to expand the IFN
	MASK_LEN_STRIDED_LOAD in middle-end.
	(expand_strided_store_optab_fn): Ditto but for store.
	(direct_strided_load_optab_supported_p): Add define for strided
	load optab supported.
	(direct_strided_store_optab_supported_p): Ditto but for store.
	(internal_fn_len_index): Add strided load/store len index.
	(internal_fn_mask_index): Ditto but for mask.
	(internal_fn_stored_value_index): Add strided store value index.
	* internal-fn.def (MASK_LEN_STRIDED_LOAD): Add new IFN for strided
	load.
	(MASK_LEN_STRIDED_STORE): Ditto but for store.
	* optabs.def (OPTAB_D): Add strided load/store optab.
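To make the intended semantics concrete, the description above can be sketched as a scalar model in C for 32-bit elements. This is only an illustration under stated assumptions, not the middle-end implementation: the names model_strided_load and model_strided_store, the byte-per-lane mask array, and the nunits parameter are hypothetical. The sketch assumes the stride is in bytes, that lane i is active when i < len + bias and its mask entry is nonzero, and that inactive lanes are left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of
     v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)
   Assumptions: STRIDE is in bytes, lane I is active when I < LEN + BIAS
   and MASK[I] is nonzero, and inactive lanes of V keep their old value.  */
static void
model_strided_load (int32_t *v, const void *ptr, ptrdiff_t stride,
                    const uint8_t *mask, size_t len, ptrdiff_t bias,
                    size_t nunits)
{
  size_t active = (size_t) ((ptrdiff_t) len + bias);
  assert (active <= nunits);
  for (size_t i = 0; i < active; i++)
    if (mask[i])
      memcpy (&v[i], (const char *) ptr + (ptrdiff_t) i * stride,
              sizeof v[i]);
}

/* Hypothetical scalar model of
     MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)
   under the same assumptions; inactive lanes write nothing.  */
static void
model_strided_store (void *ptr, ptrdiff_t stride, const int32_t *v,
                     const uint8_t *mask, size_t len, ptrdiff_t bias,
                     size_t nunits)
{
  size_t active = (size_t) ((ptrdiff_t) len + bias);
  assert (active <= nunits);
  for (size_t i = 0; i < active; i++)
    if (mask[i])
      memcpy ((char *) ptr + (ptrdiff_t) i * stride, &v[i], sizeof v[i]);
}
```

In this model, one vector iteration of foo with an all-ones mask would pass stride * sizeof (int) as the byte stride and the remaining trip count, capped at the vector length, as len.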
Signed-off-by: Pan Li Co-Authored-By: Juzhe-Zhong --- gcc/internal-fn.cc | 71 +++++++++++++++++++++++++++++++++++++++++++++ gcc/internal-fn.def | 6 ++++ gcc/optabs.def | 2 ++ 3 files changed, 79 insertions(+) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index d89a04fe412..bfbbba8e2dd 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -159,6 +159,7 @@ init_internal_fns () #define load_lanes_direct { -1, -1, false } #define mask_load_lanes_direct { -1, -1, false } #define gather_load_direct { 3, 1, false } +#define strided_load_direct { -1, -1, false } #define len_load_direct { -1, -1, false } #define mask_len_load_direct { -1, 4, false } #define mask_store_direct { 3, 2, false } @@ -168,6 +169,7 @@ init_internal_fns () #define vec_cond_mask_len_direct { 1, 1, false } #define vec_cond_direct { 2, 0, false } #define scatter_store_direct { 3, 1, false } +#define strided_store_direct { 1, 1, false } #define len_store_direct { 3, 3, false } #define mask_len_store_direct { 4, 5, false } #define vec_set_direct { 3, 3, false } @@ -3712,6 +3714,64 @@ expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab) assign_call_lhs (lhs, lhs_rtx, &ops[0]); } +/* Expand MASK_LEN_STRIDED_LOAD call CALL by optab OPTAB. 
*/ + +static void +expand_strided_load_optab_fn (ATTRIBUTE_UNUSED internal_fn, gcall *stmt, + direct_optab optab) +{ + tree lhs = gimple_call_lhs (stmt); + tree base = gimple_call_arg (stmt, 0); + tree stride = gimple_call_arg (stmt, 1); + + rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); + rtx base_rtx = expand_normal (base); + rtx stride_rtx = expand_normal (stride); + + unsigned i = 0; + class expand_operand ops[6]; + machine_mode mode = TYPE_MODE (TREE_TYPE (lhs)); + + create_output_operand (&ops[i++], lhs_rtx, mode); + create_address_operand (&ops[i++], base_rtx); + create_address_operand (&ops[i++], stride_rtx); + + i = add_mask_and_len_args (ops, i, stmt); + expand_insn (direct_optab_handler (optab, mode), i, ops); + + if (!rtx_equal_p (lhs_rtx, ops[0].value)) + emit_move_insn (lhs_rtx, ops[0].value); +} + +/* Expand MASK_LEN_STRIDED_STORE call CALL by optab OPTAB. */ + +static void +expand_strided_store_optab_fn (ATTRIBUTE_UNUSED internal_fn, gcall *stmt, + direct_optab optab) +{ + internal_fn fn = gimple_call_internal_fn (stmt); + int rhs_index = internal_fn_stored_value_index (fn); + + tree base = gimple_call_arg (stmt, 0); + tree stride = gimple_call_arg (stmt, 1); + tree rhs = gimple_call_arg (stmt, rhs_index); + + rtx base_rtx = expand_normal (base); + rtx stride_rtx = expand_normal (stride); + rtx rhs_rtx = expand_normal (rhs); + + unsigned i = 0; + class expand_operand ops[6]; + machine_mode mode = TYPE_MODE (TREE_TYPE (rhs)); + + create_address_operand (&ops[i++], base_rtx); + create_address_operand (&ops[i++], stride_rtx); + create_input_operand (&ops[i++], rhs_rtx, mode); + + i = add_mask_and_len_args (ops, i, stmt); + expand_insn (direct_optab_handler (optab, mode), i, ops); +} + /* Helper for expand_DIVMOD. Return true if the sequence starting with INSN contains any call insns or insns with {,U}{DIV,MOD} rtxes. 
*/ @@ -4101,6 +4161,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_gather_load_optab_supported_p convert_optab_supported_p +#define direct_strided_load_optab_supported_p direct_optab_supported_p #define direct_len_load_optab_supported_p direct_optab_supported_p #define direct_mask_len_load_optab_supported_p convert_optab_supported_p #define direct_mask_store_optab_supported_p convert_optab_supported_p @@ -4109,6 +4170,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, #define direct_vec_cond_mask_optab_supported_p convert_optab_supported_p #define direct_vec_cond_optab_supported_p convert_optab_supported_p #define direct_scatter_store_optab_supported_p convert_optab_supported_p +#define direct_strided_store_optab_supported_p direct_optab_supported_p #define direct_len_store_optab_supported_p direct_optab_supported_p #define direct_mask_len_store_optab_supported_p convert_optab_supported_p #define direct_while_optab_supported_p convert_optab_supported_p @@ -4808,6 +4870,8 @@ internal_fn_len_index (internal_fn fn) case IFN_COND_LEN_XOR: case IFN_COND_LEN_SHL: case IFN_COND_LEN_SHR: + case IFN_MASK_LEN_STRIDED_LOAD: + case IFN_MASK_LEN_STRIDED_STORE: return 4; case IFN_COND_LEN_NEG: @@ -4902,6 +4966,10 @@ internal_fn_mask_index (internal_fn fn) case IFN_MASK_LEN_STORE: return 2; + case IFN_MASK_LEN_STRIDED_LOAD: + case IFN_MASK_LEN_STRIDED_STORE: + return 3; + case IFN_MASK_GATHER_LOAD: case IFN_MASK_SCATTER_STORE: case IFN_MASK_LEN_GATHER_LOAD: @@ -4925,6 +4993,9 @@ internal_fn_stored_value_index (internal_fn fn) { switch (fn) { + case IFN_MASK_LEN_STRIDED_STORE: + return 2; + case IFN_MASK_STORE: case IFN_MASK_STORE_LANES: case IFN_SCATTER_STORE: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 23b4ab02b30..2d455938271 100644 --- 
a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -56,6 +56,7 @@ along with GCC; see the file COPYING3. If not see - mask_load_lanes: currently just vec_mask_load_lanes - mask_len_load_lanes: currently just vec_mask_len_load_lanes - gather_load: used for {mask_,mask_len_,}gather_load + - strided_load: currently just mask_len_strided_load - len_load: currently just len_load - mask_len_load: currently just mask_len_load @@ -64,6 +65,7 @@ along with GCC; see the file COPYING3. If not see - mask_store_lanes: currently just vec_mask_store_lanes - mask_len_store_lanes: currently just vec_mask_len_store_lanes - scatter_store: used for {mask_,mask_len_,}scatter_store + - strided_store: currently just mask_len_strided_store - len_store: currently just len_store - mask_len_store: currently just mask_len_store @@ -212,6 +214,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_GATHER_LOAD, ECF_PURE, mask_gather_load, gather_load) DEF_INTERNAL_OPTAB_FN (MASK_LEN_GATHER_LOAD, ECF_PURE, mask_len_gather_load, gather_load) +DEF_INTERNAL_OPTAB_FN (MASK_LEN_STRIDED_LOAD, ECF_PURE, + mask_len_strided_load, strided_load) DEF_INTERNAL_OPTAB_FN (LEN_LOAD, ECF_PURE, len_load, len_load) DEF_INTERNAL_OPTAB_FN (MASK_LEN_LOAD, ECF_PURE, mask_len_load, mask_len_load) @@ -221,6 +225,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_SCATTER_STORE, 0, mask_scatter_store, scatter_store) DEF_INTERNAL_OPTAB_FN (MASK_LEN_SCATTER_STORE, 0, mask_len_scatter_store, scatter_store) +DEF_INTERNAL_OPTAB_FN (MASK_LEN_STRIDED_STORE, 0, + mask_len_strided_store, strided_store) DEF_INTERNAL_OPTAB_FN (MASK_STORE, 0, maskstore, mask_store) DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_CONST, vec_store_lanes, store_lanes) diff --git a/gcc/optabs.def b/gcc/optabs.def index b48e2e5a5ac..90be40f74d5 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -548,6 +548,8 @@ OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES) OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a") OPTAB_D (len_load_optab, "len_load_$a") OPTAB_D (len_store_optab, "len_store_$a") 
+OPTAB_D (mask_len_strided_load_optab, "mask_len_strided_load_$a") +OPTAB_D (mask_len_strided_store_optab, "mask_len_strided_store_$a") OPTAB_D (select_vl_optab, "select_vl$a") OPTAB_D (andn_optab, "andn$a3") OPTAB_D (iorn_optab, "iorn$a3") From patchwork Wed Oct 23 10:45:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 99395 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E76E4385840E for ; Wed, 23 Oct 2024 10:47:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) by sourceware.org (Postfix) with ESMTPS id 9B7E63858D28 for ; Wed, 23 Oct 2024 10:47:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 9B7E63858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 9B7E63858D28 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680429; cv=none; b=b6WQLE+skNwQCvT/6a/UQt6EHkMltcie6wJv3lS7Ys/3aqcHEcrAfDyJlXM30b6+CKrY00ck8BYrrl2QUhA+mPXykR6dm/xEKHhA5I3cUZSz9Ol5V0+NUBch7fDfHz13x9j7tcvBey9yZX/gnIpZbfDLrqfxgrk9bI0eb84HnBc= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680429; c=relaxed/simple; bh=GyMt0bPi+3KR1J3jGjCQVHH//J1+llAnHMtkxvqMEmI=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=Z48YU0ZkScAsq92P7czq8aRDnLA4ItO3wM5ehsTtwHgZBRjMqAVvhZ/GL3FFxesj47MtUAV/2B5tmsI8zweUZ6iE7LvrqilfxlrC6GF2OVt7wBA2MezDkbNHq3fFYWxKwQpnbLE5or4VUQmUYJJnQK3RYabgi5pZ82OmuasOkCc= ARC-Authentication-Results: i=1; server2.sourceware.org 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729680425; x=1761216425; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GyMt0bPi+3KR1J3jGjCQVHH//J1+llAnHMtkxvqMEmI=; b=TQwrQjJYJei/SlRE2bGjgpNDG21SmF8QJPx02DmmIj2emYCLMq2DoOZA NANMYQo3VDJObIi3zKNBShFct2th7PWVHAB66Nb/ncR+ob8n2PXWIr/Kg cKKC5f5M5OB+TpdspEdPCgJdQBnTQQJ0bnmIhD61eZpogWBuPi29HRC6v wdJOm83zydSP425N7pMuaXru9Prkh6NSnpsn3lp+Koc9M8gQ8UW2nxCDV lqEPdeuPSp092ug8qovApJE2uKPLU5TmMmb2prMa+hhitpGe1gSzaVrwx XCP7ogqjTEPSHl2eUw/VEADqre4t1ndyyCBLlOP+utMEGYSayYjvJKDUg g==; X-CSE-ConnectionGUID: X7OYlNGnTWi2GI/3kHNeTg== X-CSE-MsgGUID: 2jLtHw8CQJWXcRywgs2hmg== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="39808701" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="39808701" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Oct 2024 03:47:04 -0700 X-CSE-ConnectionGUID: OLA3p3+TTrCWxnhco05WFQ== X-CSE-MsgGUID: dCJzDC+vSCCWQuEAGDaHGg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,225,1725346800"; d="scan'208";a="103436947" Received: from panli.sh.intel.com ([10.239.154.73]) by fmviesa002.fm.intel.com with ESMTP; 23 Oct 2024 03:47:02 -0700 From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li Subject: [PATCH 2/5] Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer Date: Wed, 23 Oct 2024 18:45:13 +0800 Message-ID: <20241023104516.2818244-2-pan2.li@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241023104516.2818244-1-pan2.li@intel.com> References: <20241023104516.2818244-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, 
SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org

From: Pan Li

This patch allows the loop vectorizer to generate MASK_LEN_STRIDED_LOAD{STORE} IR for memory accesses with an invariant stride. For example:

void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

Before this patch:

  _73 = .SELECT_VL (ivtmp_71, POLY_INT_CST [4, 4]);
  _52 = _54 * _73;
  vect__5.16_61 = .MASK_LEN_GATHER_LOAD (vectp_b.14_59, _58, 4, { 0, ... }, { -1, ... }, _73, 0);
  vect__7.17_63 = vect__5.16_61 + { 100, ... };
  .MASK_LEN_SCATTER_STORE (vectp_a.18_67, _58, 4, vect__7.17_63, { -1, ... }, _73, 0);
  vectp_b.14_60 = vectp_b.14_59 + _52;
  vectp_a.18_68 = vectp_a.18_67 + _52;
  ivtmp_72 = ivtmp_71 - _73;

After this patch:

  _70 = .SELECT_VL (ivtmp_68, POLY_INT_CST [4, 4]);
  _52 = _54 * _70;
  vect__5.16_58 = .MASK_LEN_STRIDED_LOAD (vectp_b.14_56, _55, { 0, ... }, { -1, ... }, _70, 0);
  vect__7.17_60 = vect__5.16_58 + { 100, ... };
  .MASK_LEN_STRIDED_STORE (vectp_a.18_64, _55, vect__7.17_60, { -1, ... }, _70, 0);
  vectp_b.14_57 = vectp_b.14_56 + _52;
  vectp_a.18_65 = vectp_a.18_64 + _52;
  ivtmp_69 = ivtmp_68 - _70;

The following test suites passed for this patch:
* The x86 bootstrap test.
* The x86 full regression test.
* The riscv full regression test.

gcc/ChangeLog:

	* tree-vect-stmts.cc (vect_get_strided_load_store_ops): Handle
	MASK_LEN_STRIDED_LOAD{STORE} after supported check.
	(vectorizable_store): Generate MASK_LEN_STRIDED_STORE when the
	scatter offset is not of vector type.
	(vectorizable_load): Ditto but for MASK_LEN_STRIDED_LOAD and the
	gather offset.

Signed-off-by: Pan Li
Co-Authored-By: Juzhe-Zhong
---
 gcc/tree-vect-stmts.cc | 45 +++++++++++++++++++++++++++++++++---------
 1 file changed, 36 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index e7f14c3144c..78d66a4ef9d 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -2950,6 +2950,15 @@ vect_get_strided_load_store_ops (stmt_vec_info stmt_info, *dataref_bump = cse_and_gimplify_to_preheader (loop_vinfo, bump); } + internal_fn ifn + = DR_IS_READ (dr) ? IFN_MASK_LEN_STRIDED_LOAD : IFN_MASK_LEN_STRIDED_STORE; + if (direct_internal_fn_supported_p (ifn, vectype, OPTIMIZE_FOR_SPEED)) + { + *vec_offset = cse_and_gimplify_to_preheader (loop_vinfo, + unshare_expr (DR_STEP (dr))); + return; + } + /* The offset given in GS_INFO can have pointer type, so use the element type of the vector instead. */ tree offset_type = TREE_TYPE (gs_info->offset_vectype); @@ -9194,10 +9203,20 @@ vectorizable_store (vec_info *vinfo, gcall *call; if (final_len && final_mask) - call = gimple_build_call_internal - (IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr, - vec_offset, scale, vec_oprnd, final_mask, - final_len, bias); + { + if (VECTOR_TYPE_P (TREE_TYPE (vec_offset))) + call = gimple_build_call_internal ( + IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr, + vec_offset, scale, vec_oprnd, final_mask, final_len, + bias); + else + /* Non-vector offset indicates that prefer to take + MASK_LEN_STRIDED_STORE instead of the + IFN_MASK_SCATTER_STORE with direct stride arg. 
*/ + call = gimple_build_call_internal ( + IFN_MASK_LEN_STRIDED_STORE, 6, dataref_ptr, + vec_offset, vec_oprnd, final_mask, final_len, bias); + } else if (final_mask) call = gimple_build_call_internal (IFN_MASK_SCATTER_STORE, 5, dataref_ptr, @@ -11194,11 +11213,19 @@ vectorizable_load (vec_info *vinfo, gcall *call; if (final_len && final_mask) - call - = gimple_build_call_internal (IFN_MASK_LEN_GATHER_LOAD, 7, - dataref_ptr, vec_offset, - scale, zero, final_mask, - final_len, bias); + { + if (VECTOR_TYPE_P (TREE_TYPE (vec_offset))) + call = gimple_build_call_internal ( + IFN_MASK_LEN_GATHER_LOAD, 7, dataref_ptr, vec_offset, + scale, zero, final_mask, final_len, bias); + else + /* Non-vector offset indicates that prefer to take + MASK_LEN_STRIDED_LOAD instead of the + MASK_LEN_GATHER_LOAD with direct stride arg. */ + call = gimple_build_call_internal ( + IFN_MASK_LEN_STRIDED_LOAD, 6, dataref_ptr, vec_offset, + zero, final_mask, final_len, bias); + } else if (final_mask) call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, 5, dataref_ptr, vec_offset, From patchwork Wed Oct 23 10:45:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 99396 X-Patchwork-Delegate: rdapp.gcc@gmail.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5F5483858282 for ; Wed, 23 Oct 2024 10:47:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) by sourceware.org (Postfix) with ESMTPS id 265763858D33 for ; Wed, 23 Oct 2024 10:47:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 265763858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; 
spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 265763858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680431; cv=none; b=cEU0qu+lFNN/+zNrunejx89MZpERf8xkATk71gIG+MmtVyKu6lC7wes6JDlYGiQUGtEqq8KA6J8FAkDmjcxvPFODV6KPbLPd8m+SKh+xykyKtXz5Ixo/SFpR8F7hoKdIbzzp8GA0tHx+Dj6VjF8nfntD7EbNM+y0KcFvLzVtbtM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680431; c=relaxed/simple; bh=4QEbMPTSVw9UIw9apHqLwJNl3Bs+OaJ95hmngclbbbc=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=RufSBvOz0cmOvxlKE3Ac4VD+CgnDEAhKPSUgsoVcggYQB3nuGy7jGiYCIBI+leJ1qXWRnFn8SNI8QnzH4C5PY8c0KCK87kcV0bVU2p6DVnbZoJo7b6KipMI99C5ob0z1UGtAOGVGgkRFcpRgK7zkaFqftNkLpFECi4QqJcGcvQY= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729680428; x=1761216428; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4QEbMPTSVw9UIw9apHqLwJNl3Bs+OaJ95hmngclbbbc=; b=oHFvJYgaTkFMv+ZQoLSTQSXGcEHLw/6njV072mj5ReLLaUB/dDyprVjz bbGC3UHmLo+VJuWVAaA54qKnN3vZGxijHMEX/Wbf57syyYx61MoX79pFS nDzM+9YQ/3zQR3Ns+6hQEJNoCK+/5e358U0UW+gUj+2z/E1dMTZKb+Dr5 u2tYc7XybKb+Gzk8/L/jlsShAh5hy/LE3NP2lC4VCMfbWI6xqzgpq4cjp KWktfi45a7wGhMt0VpoKWZAgci/2BIx/ytCeqi/I35UKjs9tWnwkeomi3 LErnQMvM9SRQej+QHq5/iVocqPBRIuLmIdpnDAT4d3qVtdLDuv1QyXH6q Q==; X-CSE-ConnectionGUID: d2VZU1BPThaAo7pepGODSg== X-CSE-MsgGUID: n2aqdMozTembBfOAYLVrVg== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="39808708" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="39808708" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Oct 2024 03:47:07 -0700 X-CSE-ConnectionGUID: s6qSPg9hRpWZu/jbBDCVqA== X-CSE-MsgGUID: jJnRLdhoSyqFNdu0RDZjyA== X-ExtLoop1: 1 X-IronPort-AV: 
E=Sophos;i="6.11,225,1725346800"; d="scan'208";a="103436959" Received: from panli.sh.intel.com ([10.239.154.73]) by fmviesa002.fm.intel.com with ESMTP; 23 Oct 2024 03:47:05 -0700 From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li Subject: [PATCH 3/5] RISC-V: Adjust the gather-scatter testcases due to middle-end change Date: Wed, 23 Oct 2024 18:45:14 +0800 Message-ID: <20241023104516.2818244-3-pan2.li@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241023104516.2818244-1-pan2.li@intel.com> References: <20241023104516.2818244-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org

From: Pan Li

Now that the middle-end has MASK_LEN_STRIDED_LOAD{STORE}, the IR checks in the strided test cases need to be adjusted.

The following test suites passed for this patch:
* The riscv full regression test.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: Adjust
	IR check for MASK_LEN_STRIDED_LOAD.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: Ditto
	but for store.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: Ditto.
Signed-off-by: Pan Li Co-Authored-By: Juzhe-Zhong --- .../riscv/rvv/autovec/gather-scatter/strided_load-1.c | 2 +- .../riscv/rvv/autovec/gather-scatter/strided_load-2.c | 2 +- .../riscv/rvv/autovec/gather-scatter/strided_store-1.c | 2 +- .../riscv/rvv/autovec/gather-scatter/strided_store-2.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c index 53263d16ae2..79b39f102bf 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c @@ -40,6 +40,6 @@ TEST_ALL (TEST_LOOP) -/* { dg-final { scan-tree-dump-times " \.MASK_LEN_GATHER_LOAD" 66 "optimized" } } */ +/* { dg-final { scan-tree-dump-times " \.MASK_LEN_STRIDED_LOAD " 66 "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c index 6fef474cf8e..8a452e547a3 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c @@ -40,6 +40,6 @@ TEST_ALL (TEST_LOOP) -/* { dg-final { scan-tree-dump-times " \.MASK_LEN_GATHER_LOAD" 33 "optimized" } } */ +/* { dg-final { scan-tree-dump-times " \.MASK_LEN_STRIDED_LOAD " 33 "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c index ad23ed42129..ec8c3a5c63a 
100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c @@ -40,6 +40,6 @@ TEST_ALL (TEST_LOOP) -/* { dg-final { scan-tree-dump-times " \.MASK_LEN_SCATTER_STORE" 66 "optimized" } } */ +/* { dg-final { scan-tree-dump-times " \.MASK_LEN_STRIDED_STORE" 66 "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c index 65f3f00b8c2..b433b5b5210 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c @@ -40,6 +40,6 @@ TEST_ALL (TEST_LOOP) -/* { dg-final { scan-tree-dump-times " \.MASK_LEN_SCATTER_STORE" 44 "optimized" } } */ +/* { dg-final { scan-tree-dump-times " \.MASK_LEN_STRIDED_STORE " 44 "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */ /* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */ From patchwork Wed Oct 23 10:45:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 99398 X-Patchwork-Delegate: rdapp.gcc@gmail.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3AA2D3858CDA for ; Wed, 23 Oct 2024 10:48:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) by sourceware.org (Postfix) with ESMTPS id 80A4F3858CD1 for ; Wed, 23 Oct 2024 10:47:09 +0000 
(GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 80A4F3858CD1 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 80A4F3858CD1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680432; cv=none; b=nmoaKghaIIQTSS/QC3US0Gas6JYjVRNboWBFQicTcGX2Tvyj8BZq6b3kE8Z0IUKGHe90mfRCXfTDpCWOFIk1WGYoH8VegJuLZKqeO742elWcv7m1WrUfVc9JRL+1uNzWW/CwfOEQY+NWQrSCPa8wwTQdM24Yp7MaOOmVxVLEDf0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680432; c=relaxed/simple; bh=KN1++Ldd/SzM9s/0xMQIVpfiD0FAyLhJaQvxIjpeubo=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=OxU+woBIMcPdVHQfI9uA1uphGpH6eipSGrJmyI+F1H6CqnD+MocsCTX6DxmGuHdqsvbHACa0sBrLlDD5Eri82tJRBmdBE05y8zdQalWsWObtnY/OtSJla6W0tkK2oK8XOhnFECqDGBD5mg5nz2ube0iEETxyC/rtA0TMSfYR8po= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729680429; x=1761216429; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KN1++Ldd/SzM9s/0xMQIVpfiD0FAyLhJaQvxIjpeubo=; b=Pd/XViwaTFPbjFUQZVv7U3ERN9UL55GuePg2z7M0jo0gRCPp7IxUsXif xVJPXa7kft0RtQKnKpVlbyT1c6HoEA2cxCWUtjuou9eMriocv7H2ioCMx guoGP3vNBmaWoPAztSdoT85f3j26EMBmjMgbO9plm9GWYFqX+ROkh0D+c WwNBcmQhjgcBCPPcvFDEXnDFWhv8cDrwEdA5xwpQDRKwIi5FrYD1Z9WEn ++NocQM7nW5klrFH0FDQt/f2V1deiQzADTwJbyfhhmgH5ZZ5Sl+RBpyIq HeF2NvLdjZcxIfSqv0etjFsVlb29Vu2YPWN266aXRuKr8tXkXQeFusTm5 w==; X-CSE-ConnectionGUID: /l1c87xIS8Ke1cav2KrXCA== X-CSE-MsgGUID: /otEK5ZcS6aAUIOpM6L5Bg== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="39808717" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="39808717" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by 
orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Oct 2024 03:47:09 -0700 X-CSE-ConnectionGUID: SnUW7myMS2eSfI/IexRYWA== X-CSE-MsgGUID: t4hWXtOST7ijGR+gT7kF/w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,225,1725346800"; d="scan'208";a="103436966" Received: from panli.sh.intel.com ([10.239.154.73]) by fmviesa002.fm.intel.com with ESMTP; 23 Oct 2024 03:47:07 -0700 From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li Subject: [PATCH 4/5] RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE} Date: Wed, 23 Oct 2024 18:45:15 +0800 Message-ID: <20241023104516.2818244-4-pan2.li@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241023104516.2818244-1-pan2.li@intel.com> References: <20241023104516.2818244-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_ASCII_DIVIDERS, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org

From: Pan Li

This patch implements MASK_LEN_STRIDED_LOAD{STORE} in the RISC-V backend by leveraging the vector strided load/store insns.
For example:

void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

Before this patch:

  vsetvli     a5,a3,e32,m1,ta,ma
  vluxei64.v  v1,(a1),v4
  mul         a4,a2,a5
  sub         a3,a3,a5
  vadd.vv     v1,v1,v2
  vsuxei64.v  v1,(a0),v4
  add         a1,a1,a4
  add         a0,a0,a4

After this patch:

  vsetvli     a5,a3,e32,m1,ta,ma
  vlse32.v    v1,0(a1),a2
  mul         a4,a2,a5
  sub         a3,a3,a5
  vadd.vv     v1,v1,v2
  vsse32.v    v1,0(a0),a2
  add         a1,a1,a4
  add         a0,a0,a4

The following test suites passed for this patch:
* The RISC-V full regression test.

gcc/ChangeLog:

	* config/riscv/autovec.md (mask_len_strided_load_<mode>): Add new
	pattern for MASK_LEN_STRIDED_LOAD.
	(mask_len_strided_store_<mode>): Ditto but for store.
	* config/riscv/riscv-protos.h (expand_strided_load): Add new func
	decl to expand strided load.
	(expand_strided_store): Ditto but for store.
	* config/riscv/riscv-v.cc (expand_strided_load): Add new func
	impl to expand strided load.
	(expand_strided_store): Ditto but for store.

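As a rough illustration of what the strided forms in the example above compute, here is a minimal scalar sketch in C. It is not part of the patch: the names ref_strided_load_i32, ref_strided_store_i32, ref_foo and REF_VL are hypothetical, the stride is taken in bytes (as the RVV vlse/vsse instructions do), and leaving masked-off elements untouched is a simplifying assumption of this model, not a statement about the tail/mask policy the backend selects.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REF_VL 4 /* hypothetical fixed vector length, in elements */

/* Model of a masked, length-limited strided load: element i comes from
   base + i * stride_bytes.  Inactive elements are left unchanged here
   (assumption made for this sketch only).  */
static void
ref_strided_load_i32 (int32_t *dest, const int32_t *base,
                      ptrdiff_t stride_bytes, const uint8_t *mask, size_t len)
{
  const unsigned char *p = (const unsigned char *) base;
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (&dest[i], p + (ptrdiff_t) i * stride_bytes, sizeof (int32_t));
}

/* Model of the matching store: element i goes to base + i * stride_bytes.  */
static void
ref_strided_store_i32 (int32_t *base, ptrdiff_t stride_bytes,
                       const int32_t *src, const uint8_t *mask, size_t len)
{
  unsigned char *p = (unsigned char *) base;
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (p + (ptrdiff_t) i * stride_bytes, &src[i], sizeof (int32_t));
}

/* Strip-mined version of foo () from the example, REF_VL elements per
   iteration, mirroring the load / add / store shape of the vector loop.  */
static void
ref_foo (int32_t *a, const int32_t *b, int stride, int n)
{
  const uint8_t all_active[REF_VL] = { 1, 1, 1, 1 };
  const ptrdiff_t stride_bytes
    = (ptrdiff_t) stride * (ptrdiff_t) sizeof (int32_t);

  for (int done = 0; done < n;)
    {
      size_t vl = (size_t) (n - done) < REF_VL ? (size_t) (n - done) : REF_VL;
      ptrdiff_t offset = (ptrdiff_t) done * stride;
      int32_t v[REF_VL];

      ref_strided_load_i32 (v, b + offset, stride_bytes, all_active, vl);
      for (size_t i = 0; i < vl; i++)
        v[i] += 100;
      ref_strided_store_i32 (a + offset, stride_bytes, v, all_active, vl);
      done += (int) vl;
    }
}
```

Calling ref_foo (a, b, 2, 5) on ten-element arrays touches only the even indices of a, which is the access pattern that a single scalar stride register can describe without materializing an index vector.
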
Signed-off-by: Pan Li Co-Authored-By: Juzhe-Zhong --- gcc/config/riscv/autovec.md | 29 ++++++++++++++++++ gcc/config/riscv/riscv-protos.h | 2 ++ gcc/config/riscv/riscv-v.cc | 52 +++++++++++++++++++++++++++++++++ 3 files changed, 83 insertions(+) diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index a34f63c9651..85a915bd65f 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -2855,3 +2855,32 @@ (define_expand "v3" DONE; } ) + +;; ========================================================================= +;; == Strided Load/Store +;; ========================================================================= +(define_expand "mask_len_strided_load_<mode>" + [(match_operand:V 0 "register_operand") + (match_operand 1 "pmode_reg_or_0_operand") + (match_operand 2 "pmode_reg_or_0_operand") + (match_operand:<VM> 3 "vector_mask_operand") + (match_operand 4 "autovec_length_operand") + (match_operand 5 "const_0_operand")] + "TARGET_VECTOR" + { + riscv_vector::expand_strided_load (<MODE>mode, operands); + DONE; + }) + +(define_expand "mask_len_strided_store_<mode>" + [(match_operand 0 "pmode_reg_or_0_operand") + (match_operand 1 "pmode_reg_or_0_operand") + (match_operand:V 2 "register_operand") + (match_operand:<VM> 3 "vector_mask_operand") + (match_operand 4 "autovec_length_operand") + (match_operand 5 "const_0_operand")] + "TARGET_VECTOR" + { + riscv_vector::expand_strided_store(<MODE>mode, operands); + DONE; + }) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index d690162bb0c..47c9494ff2b 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -696,6 +696,8 @@ bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool); void emit_vec_extract (rtx, rtx, rtx); bool expand_vec_setmem (rtx, rtx, rtx); bool expand_vec_cmpmem (rtx, rtx, rtx, rtx); +void expand_strided_load (machine_mode, rtx *); +void expand_strided_store (machine_mode, rtx *); /* Rounding mode bitfield for fixed point VXRM. 
*/ enum fixed_point_rounding_mode diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 630fbd80e94..ae028e8928a 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -3833,6 +3833,58 @@ expand_load_store (rtx *ops, bool is_load) } } +/* Expand MASK_LEN_STRIDED_LOAD. */ +void +expand_strided_load (machine_mode mode, rtx *ops) +{ + rtx v_reg = ops[0]; + rtx base = ops[1]; + rtx stride = ops[2]; + rtx mask = ops[3]; + rtx len = ops[4]; + poly_int64 len_val; + + insn_code icode = code_for_pred_strided_load (mode); + rtx emit_ops[] = {v_reg, mask, gen_rtx_MEM (mode, base), stride}; + + if (poly_int_rtx_p (len, &len_val) + && known_eq (len_val, GET_MODE_NUNITS (mode))) + emit_vlmax_insn (icode, BINARY_OP_TAMA, emit_ops); + else + { + len = satisfies_constraint_K (len) ? len : force_reg (Pmode, len); + emit_nonvlmax_insn (icode, BINARY_OP_TAMA, emit_ops, len); + } +} + +/* Expand MASK_LEN_STRIDED_STORE. */ +void +expand_strided_store (machine_mode mode, rtx *ops) +{ + rtx v_reg = ops[2]; + rtx base = ops[0]; + rtx stride = ops[1]; + rtx mask = ops[3]; + rtx len = ops[4]; + poly_int64 len_val; + rtx vl_type; + + if (poly_int_rtx_p (len, &len_val) + && known_eq (len_val, GET_MODE_NUNITS (mode))) + { + len = gen_reg_rtx (Pmode); + emit_vlmax_vsetvl (mode, len); + vl_type = get_avl_type_rtx (VLMAX); + } + else + { + len = satisfies_constraint_K (len) ? len : force_reg (Pmode, len); + vl_type = get_avl_type_rtx (NONVLMAX); + } + + emit_insn (gen_pred_strided_store (mode, gen_rtx_MEM (mode, base), + mask, stride, v_reg, len, vl_type)); +} /* Return true if the operation is the floating-point operation need FRM. 
*/ static bool From patchwork Wed Oct 23 10:45:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Li, Pan2" X-Patchwork-Id: 99397 X-Patchwork-Delegate: rdapp.gcc@gmail.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5881E3858C98 for ; Wed, 23 Oct 2024 10:47:58 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.11]) by sourceware.org (Postfix) with ESMTPS id 2A55D3858C42 for ; Wed, 23 Oct 2024 10:47:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2A55D3858C42 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2A55D3858C42 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.11 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680443; cv=none; b=m9ya525LMZC6Vhu811A9GNQvmziB8RkwKYzD2T7cXgtEsJH86vsaCHCa9q80vP86m4zHvGFoPZqTGUX6tCvNd6V9t9y3EwhfjOUBjP4g0+pq6cjpuyYRaysGrm7pI9fGdDRMZnT9QuTJUMtCAVx/OWgvMWaq+ZSXi0lCpWt/7+U= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1729680443; c=relaxed/simple; bh=JFXk3k+BU9MAxavynFp6j9XPFtfEnzZOztvriw01XBE=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=gSgDRHgnqKnRU9ag2r2xMeXl4yL81vFUdI9ORkHexP0v7QhPCeAQ7bUJuF4wHe+LlvGX5sCBLNE3Drf0LDzcjftzaY+OI9AnOtVc3w9r3JhX1XFxZeObmhtckJ8vR5pXz7rb4CwujV4PYLKko+jqPBfJ8FYmkTX/Pqh2t711CdM= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729680433; x=1761216433; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=JFXk3k+BU9MAxavynFp6j9XPFtfEnzZOztvriw01XBE=; b=aemSc0FUc4l8b6lrLc8ua9jvRMMof5xTn/KugBwiDNISqqwShfs70t5R 6z5J+RIH3Lm1aV9sSSS5CgAVYbrLCvJJ5rN4dPZoqC0CekAkEyiWtZbHM T8h6HNzkRRHE2l18SD1i8BbMAjCQ7RIqYAa8Nt4PmrCs+F+OGeUDm3LIo /Q/HLgjB4VxC4zBVeBZZTIggYZ3kHPhpjmLg/kywcC+uoCQRtOtmYMzJw gTZNsOKI9nqlcKSZZsyrqbCfbtDHimlFOzcqmkMp1XjC9+PCxzPd9Q/n/ LZDhcfyo1MX9Yx1U9nala6ooCaHGB4uXUG4B8FSKNArokbbnHOYf5zQxf g==; X-CSE-ConnectionGUID: y4s6lJzdQ4mkoWUZqCGYlA== X-CSE-MsgGUID: o1kc/gtqT9erJynabl1w4A== X-IronPort-AV: E=McAfee;i="6700,10204,11222"; a="39808727" X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="39808727" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Oct 2024 03:47:12 -0700 X-CSE-ConnectionGUID: A68fcnmpSF2qFnALOgFv+A== X-CSE-MsgGUID: F8XkljHMSq2rGYBVtC8WWg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,225,1725346800"; d="scan'208";a="103436979" Received: from panli.sh.intel.com ([10.239.154.73]) by fmviesa002.fm.intel.com with ESMTP; 23 Oct 2024 03:47:09 -0700 From: pan2.li@intel.com To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li Subject: [PATCH 5/5] RISC-V: Add testcases for form 1 of MASK_LEN_STRIDED_LOAD{STORE} Date: Wed, 23 Oct 2024 18:45:16 +0800 Message-ID: <20241023104516.2818244-5-pan2.li@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241023104516.2818244-1-pan2.li@intel.com> References: <20241023104516.2818244-1-pan2.li@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-9.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_LOTSOFHASH, KAM_SHORT, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_35_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham 
autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org From: Pan Li Form 1: void __attribute__((noinline)) \ vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \ long stride, size_t size) \ { \ for (size_t i = 0; i < size; i++) \ out[i * stride] = in[i * stride]; \ } The following test suites passed for this patch: * The RISC-V full regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.exp: Add strided folder. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c: New test. 
* gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h: New test. * gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h: New test. Signed-off-by: Pan Li Co-Authored-By: Juzhe-Zhong --- .../rvv/autovec/strided/strided_ld_st-1-f16.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-f32.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-f64.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-i16.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-i32.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-i64.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-i8.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-u16.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-u32.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-u64.c | 11 + .../rvv/autovec/strided/strided_ld_st-1-u8.c | 11 + .../autovec/strided/strided_ld_st-run-1-f16.c | 15 + .../autovec/strided/strided_ld_st-run-1-f32.c | 15 + .../autovec/strided/strided_ld_st-run-1-f64.c | 15 + .../autovec/strided/strided_ld_st-run-1-i16.c | 15 + .../autovec/strided/strided_ld_st-run-1-i32.c | 15 + .../autovec/strided/strided_ld_st-run-1-i64.c | 15 + .../autovec/strided/strided_ld_st-run-1-i8.c | 15 + .../autovec/strided/strided_ld_st-run-1-u16.c | 15 + .../autovec/strided/strided_ld_st-run-1-u32.c | 15 + .../autovec/strided/strided_ld_st-run-1-u64.c | 15 + .../autovec/strided/strided_ld_st-run-1-u8.c | 15 + .../riscv/rvv/autovec/strided/strided_ld_st.h | 22 + .../rvv/autovec/strided/strided_ld_st_data.h | 1145 +++++++++++++++++ 
.../rvv/autovec/strided/strided_ld_st_run.h | 27 + gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 2 + 26 files changed, 1482 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c new file mode 100644 index 00000000000..41fe2b20a98 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f16.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(_Float16) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse16.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse16.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c new file mode 100644 index 00000000000..650b5fce4e8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f32.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(float) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE 
" 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse32.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse32.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c new file mode 100644 index 00000000000..c0559a9265e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-f64.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(double) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse64.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse64.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c new file mode 100644 index 00000000000..641eaf14977 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i16.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(int16_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse16.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse16.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c new file mode 100644 index 00000000000..5fc1ea91c5b --- /dev/null +++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i32.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(int32_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse32.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse32.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c new file mode 100644 index 00000000000..1819941cc36 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i64.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(int64_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse64.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse64.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c new file mode 100644 index 00000000000..119a6d75fba --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-i8.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(int8_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times 
".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse8.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse8.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c new file mode 100644 index 00000000000..19d4f6edc87 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u16.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(uint16_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse16.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse16.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c new file mode 100644 index 00000000000..10b1d4fefb5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u32.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(uint32_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse32.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse32.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c new file mode 100644 index 00000000000..b1654b8c80f --- /dev/null +++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u64.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(uint64_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse64.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse64.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c new file mode 100644 index 00000000000..273dcb83b0b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-1-u8.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -fdump-rtl-expand-details" } */ + +#include "strided_ld_st.h" + +DEF_STRIDED_LD_ST_FORM_1(uint8_t) + +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_LOAD " 4 "expand" } } */ +/* { dg-final { scan-rtl-dump-times ".MASK_LEN_STRIDED_STORE " 4 "expand" } } */ +/* { dg-final { scan-assembler-times {vlse8.v} 1 } } */ +/* { dg-final { scan-assembler-times {vsse8.v} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c new file mode 100644 index 00000000000..21dd7dc6d7f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f16.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T _Float16 + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define 
RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c new file mode 100644 index 00000000000..da3932bfa86 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f32.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T float + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c new file mode 100644 index 00000000000..4e7ec6eb0f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-f64.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T double + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c new file mode 100644 index 00000000000..64791064b72 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i16.c @@ -0,0 +1,15 @@ +/* { dg-do run { 
target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T int16_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c new file mode 100644 index 00000000000..223a894a5fe --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i32.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T int32_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c new file mode 100644 index 00000000000..1835b419048 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i64.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T int64_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c new file mode 100644 index 00000000000..cee43db9a0d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-i8.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T int8_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c new file mode 100644 index 00000000000..6fe2a579325 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u16.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T uint16_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c new file mode 100644 index 00000000000..f2a7ece4839 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u32.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include 
"strided_ld_st_data.h" + +#define T uint32_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c new file mode 100644 index 00000000000..19490a39709 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u64.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T uint64_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c new file mode 100644 index 00000000000..e9c3b9e40db --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st-run-1-u8.c @@ -0,0 +1,15 @@ +/* { dg-do run { target { riscv_v } } } */ +/* { dg-additional-options "-std=c99 -fno-vect-cost-model" } */ + +#include "strided_ld_st.h" +#include "strided_ld_st_data.h" + +#define T uint8_t + +DEF_STRIDED_LD_ST_FORM_1_WRAP(T) + +#define DATA TEST_STRIDED_LD_ST_DATA_WRAP(T) +#define RUN_STRIDED_LD_ST(out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) + +#include "strided_ld_st_run.h" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h new file mode 100644 index 00000000000..8c86deca839 --- /dev/null 
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st.h @@ -0,0 +1,22 @@ +#ifndef HAVE_DEFINED_STRIDED_H +#define HAVE_DEFINED_STRIDED_H + +#include +#include +#include + +#define DEF_STRIDED_LD_ST_FORM_1(T) \ + void __attribute__((noinline)) \ + vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \ + long stride, size_t size) \ + { \ + for (size_t i = 0; i < size; i++) \ + out[i * stride] = in[i * stride]; \ + } +#define DEF_STRIDED_LD_ST_FORM_1_WRAP(T) DEF_STRIDED_LD_ST_FORM_1(T) +#define RUN_STRIDED_LD_ST_FORM_1(T, out, in, stride, size) \ + vec_strided_load_store_##T##_form_1 (out, in, stride, size) +#define RUN_STRIDED_LD_ST_FORM_1_WRAP(T, out, in, stride, size) \ + RUN_STRIDED_LD_ST_FORM_1(T, out, in, stride, size) + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h new file mode 100644 index 00000000000..ebf81970ef0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_data.h @@ -0,0 +1,1145 @@ +#ifndef HAVE_DEFINED_STRIDED_DATA_H +#define HAVE_DEFINED_STRIDED_DATA_H + +#include +#include +#include + +#define N 32 +#define TEST_STRIDED_LD_ST_DATA(T) test_strided_ld_st_##T##_data +#define TEST_STRIDED_LD_ST_DATA_WRAP(T) TEST_STRIDED_LD_ST_DATA(T) + +int8_t TEST_STRIDED_LD_ST_DATA(int8_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, 
+ { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 127, 127, 127, 127, + 127, 127, 127, 127, + 127, 127, 127, 127, + 127, 127, 127, 127, + -128, -128, -128, -128, + -128, -128, -128, -128, + -128, -128, -128, -128, + -128, -128, -128, -128, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 127, 0, 0, 0, + 127, 0, 0, 0, + 127, 0, 0, 0, + 127, 0, 0, 0, + -128, 0, 0, 0, + -128, 0, 0, 0, + -128, 0, 0, 0, + -128, 0, 0, 0, + }, + }, +}; + +int16_t TEST_STRIDED_LD_ST_DATA(int16_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 32767, 32767, 32767, 32767, + 32767, 32767, 32767, 32767, + 32767, 32767, 32767, 32767, + 32767, 32767, 32767, 32767, + -32768, -32768, -32768, -32768, + -32768, -32768, -32768, -32768, + -32768, -32768, -32768, -32768, + 
-32768, -32768, -32768, -32768, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 32767, 0, 0, 0, + 32767, 0, 0, 0, + 32767, 0, 0, 0, + 32767, 0, 0, 0, + -32768, 0, 0, 0, + -32768, 0, 0, 0, + -32768, 0, 0, 0, + -32768, 0, 0, 0, + }, + }, +}; + +int32_t TEST_STRIDED_LD_ST_DATA(int32_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 2147483647, 2147483647, 2147483647, 2147483647, + 2147483647, 2147483647, 2147483647, 2147483647, + 2147483647, 2147483647, 2147483647, 2147483647, + 2147483647, 2147483647, 2147483647, 2147483647, + -2147483648, -2147483648, -2147483648, -2147483648, + -2147483648, -2147483648, -2147483648, -2147483648, + -2147483648, -2147483648, -2147483648, -2147483648, + -2147483648, -2147483648, -2147483648, -2147483648, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2147483647, 0, 0, 0, + 2147483647, 0, 0, 0, + 2147483647, 0, 0, 0, + 2147483647, 0, 0, 0, + -2147483648, 0, 0, 
0, + -2147483648, 0, 0, 0, + -2147483648, 0, 0, 0, + -2147483648, 0, 0, 0, + }, + }, +}; + +int64_t TEST_STRIDED_LD_ST_DATA(int64_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, + 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, + 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, + 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, 9223372036854775807ll, + -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, + -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, + -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, + -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, -9223372036854775808ull, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 
9223372036854775807ll, 0, 0, 0, + 9223372036854775807ll, 0, 0, 0, + 9223372036854775807ll, 0, 0, 0, + 9223372036854775807ll, 0, 0, 0, + -9223372036854775808ull, 0, 0, 0, + -9223372036854775808ull, 0, 0, 0, + -9223372036854775808ull, 0, 0, 0, + -9223372036854775808ull, 0, 0, 0, + }, + }, +}; + +uint8_t TEST_STRIDED_LD_ST_DATA(uint8_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 255, 255, 255, 255, + 254, 254, 254, 254, + 254, 254, 254, 254, + 254, 254, 254, 254, + 254, 254, 254, 254, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 255, 0, 0, 0, + 255, 0, 0, 0, + 255, 0, 0, 0, + 255, 0, 0, 0, + 254, 0, 0, 0, + 254, 0, 0, 0, + 254, 0, 0, 0, + 254, 0, 0, 0, + }, + }, +}; + +uint16_t TEST_STRIDED_LD_ST_DATA(uint16_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 
0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65535, 65535, 65535, 65535, + 65534, 65534, 65534, 65534, + 65534, 65534, 65534, 65534, + 65534, 65534, 65534, 65534, + 65534, 65534, 65534, 65534, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 65535, 0, 0, 0, + 65535, 0, 0, 0, + 65535, 0, 0, 0, + 65535, 0, 0, 0, + 65534, 0, 0, 0, + 65534, 0, 0, 0, + 65534, 0, 0, 0, + 65534, 0, 0, 0, + }, + }, +}; + +uint32_t TEST_STRIDED_LD_ST_DATA(uint32_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, 
+ 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967295, 4294967295, 4294967295, 4294967295, + 4294967294, 4294967294, 4294967294, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967294, + 4294967294, 4294967294, 4294967294, 4294967294, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 4294967295, 0, 0, 0, + 4294967295, 0, 0, 0, + 4294967295, 0, 0, 0, + 4294967295, 0, 0, 0, + 4294967294, 0, 0, 0, + 4294967294, 0, 0, 0, + 4294967294, 0, 0, 0, + 4294967294, 0, 0, 0, + }, + }, +}; + +uint64_t TEST_STRIDED_LD_ST_DATA(uint64_t)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + 1, 0, 0, 0, + 0, 1, 0, 0, + 0, 0, 1, 0, + 0, 0, 0, 1, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + 2, 3, 9, 7, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + 2, 0, 9, 0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 18446744073709551615ull, 18446744073709551615ull, 
18446744073709551615ull, 18446744073709551615ull, + 18446744073709551615ull, 18446744073709551615ull, 18446744073709551615ull, 18446744073709551615ull, + 18446744073709551615ull, 18446744073709551615ull, 18446744073709551615ull, 18446744073709551615ull, + 18446744073709551615ull, 18446744073709551615ull, 18446744073709551615ull, 18446744073709551615ull, + 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, + 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, + 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, + 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, 18446744073709551614ull, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 18446744073709551615ull, 0, 0, 0, + 18446744073709551615ull, 0, 0, 0, + 18446744073709551615ull, 0, 0, 0, + 18446744073709551615ull, 0, 0, 0, + 18446744073709551614ull, 0, 0, 0, + 18446744073709551614ull, 0, 0, 0, + 18446744073709551614ull, 0, 0, 0, + 18446744073709551614ull, 0, 0, 0, + }, + }, +}; + +_Float16 TEST_STRIDED_LD_ST_DATA(_Float16)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + }, + { /* output */ + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + }, + { /* expect */ + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 
3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + }, + { /* output */ + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + }, + { /* expect */ + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 127.8, 127.8, 127.8, 127.8, + 127.8, 127.8, 127.8, 127.8, + 127.8, 127.8, 127.8, 127.8, + 127.8, 127.8, 127.8, 127.8, + -128.2, -128.2, -128.2, -128.2, + -128.2, -128.2, -128.2, -128.2, + -128.2, -128.2, -128.2, -128.2, + -128.2, -128.2, -128.2, -128.2, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 127.8, 0, 0, 0, + 127.8, 0, 0, 0, + 127.8, 0, 0, 0, + 127.8, 0, 0, 0, + -128.2, 0, 0, 0, + -128.2, 0, 0, 0, + -128.2, 0, 0, 0, + -128.2, 0, 0, 0, + }, + }, +}; + +float TEST_STRIDED_LD_ST_DATA(float)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + }, + { /* output */ + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + }, + { /* expect */ + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 
2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + }, + { /* output */ + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + }, + { /* expect */ + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, + 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, + 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, + 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, 148885872271752691712.0, + -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, + -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, + -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, + -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, -639460027801474761417333669888.0, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 148885872271752691712.0, 0, 0, 0, + 148885872271752691712.0, 0, 0, 0, + 148885872271752691712.0, 0, 0, 0, + 148885872271752691712.0, 0, 0, 0, + -639460027801474761417333669888.0, 0, 0, 0, + -639460027801474761417333669888.0, 0, 0, 0, + -639460027801474761417333669888.0, 0, 0, 0, + -639460027801474761417333669888.0, 0, 0, 0, + }, + }, +}; + +double 
TEST_STRIDED_LD_ST_DATA(double)[][4][N] = +{ + { + { 1 }, /* stride */ + { /* input */ + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + }, + { /* output */ + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + }, + { /* expect */ + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + 1.4, 0.2, 0.8, 0.8, + 0.4, 1.2, 0.8, 0.8, + 0.4, 0.2, 1.8, 0.8, + 0.4, 0.2, 0.8, 1.8, + }, + }, + { + { 2 }, /* stride */ + { /* input */ + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + 2.6, 3.1, 9.4, 7.8, + }, + { /* output */ + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0, 0.0, 0.0, + }, + { /* expect */ + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + 2.6, 0.0, 9.4, 0.0, + }, + }, + { + { 4 }, /* stride */ + { /* input */ + 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, + 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, + 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, + 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, 98789784453484056064183762944.0, + -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, + 
-1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, + -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, + -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, -1507412482505555054690304.0, + }, + { /* output */ + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0, 0, + }, + { /* expect */ + 98789784453484056064183762944.0, 0, 0, 0, + 98789784453484056064183762944.0, 0, 0, 0, + 98789784453484056064183762944.0, 0, 0, 0, + 98789784453484056064183762944.0, 0, 0, 0, + -1507412482505555054690304.0, 0, 0, 0, + -1507412482505555054690304.0, 0, 0, 0, + -1507412482505555054690304.0, 0, 0, 0, + -1507412482505555054690304.0, 0, 0, 0, + }, + }, +}; + +#endif diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h new file mode 100644 index 00000000000..2549dad103d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/strided/strided_ld_st_run.h @@ -0,0 +1,27 @@ +#ifndef HAVE_DEFINE_STRIDED_LD_ST_H +#define HAVE_DEFINE_STRIDED_LD_ST_H + +int +main () +{ + unsigned i, k; + + for (i = 0; i < sizeof (DATA) / sizeof (DATA[0]); i++) + { + T stride = DATA[i][0][0]; + T *in = DATA[i][1]; + T *out = DATA[i][2]; + T *expect = DATA[i][3]; + + RUN_STRIDED_LD_ST (out, in, stride, N / stride); + + for (k = 0; k < N; k = k + stride) + if (out[k] != expect[k]) + __builtin_abort (); + } + + return 0; +} + +#endif + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp index 8c4e916d5b1..12002dd51bf 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp @@ -120,6 +120,8 @@ set AUTOVEC_TEST_OPTS [list \ foreach op $AUTOVEC_TEST_OPTS { dg-runtest [lsort [glob 
-nocomplain $srcdir/$subdir/autovec/gather-scatter/*.\[cS\]]] \ "" "$op" + dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/strided/*.\[cS\]]] \ + "" "$op" } # All done.
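A note on how the "expect" rows in strided_ld_st_data.h relate to the loop in strided_ld_st.h: positions that are multiples of the stride receive the input value, and every other position keeps its initial zero. The following is a minimal scalar sketch of that relationship, fixed to int8_t for brevity. The helper names ref_strided_copy and check_stride are hypothetical and not part of the patch; N and the loop body are taken from the test headers above, and the sketch assumes stride > 0. Unlike the loop in strided_ld_st_run.h, which compares only the positions at multiples of the stride, this variant also checks that untouched positions stay zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 32 /* same row length as in strided_ld_st_data.h */

/* Hypothetical scalar reference: same loop body as
   DEF_STRIDED_LD_ST_FORM_1, specialised to int8_t.  */
static void
ref_strided_copy (int8_t *restrict out, const int8_t *restrict in,
                  long stride, size_t size)
{
  for (size_t i = 0; i < size; i++)
    out[i * stride] = in[i * stride];
}

/* Hypothetical checker: run the reference on a zeroed output row and
   verify every position.  A position k is written only when k is a
   multiple of stride and k / stride < size; all others must stay 0.
   Returns 1 on success, 0 on mismatch.  Assumes stride > 0.  */
static int
check_stride (const int8_t *in, long stride)
{
  int8_t out[N] = { 0 };
  size_t size = N / stride;

  ref_strided_copy (out, in, stride, size);

  for (long k = 0; k < N; k++)
    {
      int touched = (k % stride == 0) && ((size_t) (k / stride) < size);
      int8_t want = touched ? in[k] : 0;

      if (out[k] != want)
        return 0;
    }

  return 1;
}
```

With the stride-2 input row from the data header (2, 3, 9, 7 repeated), the reference leaves 2, 0, 9, 0 repeated in the output row, which matches the corresponding "expect" row.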