From patchwork Sun Nov 13 09:59:52 2022
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 60507
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 01/16] aarch64: Add arm_streaming(_compatible) attributes
From: Richard Sandiford
Date: Sun, 13 Nov 2022 09:59:52 +0000

This patch adds support for recognising the SME arm_streaming and arm_streaming_compatible attributes. The arm_streaming attribute says that the processor is definitely in "streaming mode" (PSTATE.SM==1), arm_streaming_compatible says that we don't know at compile time either way, and the absence of both attributes says that the processor is definitely not in streaming mode (PSTATE.SM==0).

As far as the compiler is concerned, this effectively creates three ISA submodes: streaming mode enables things that are not available in non-streaming mode, non-streaming mode enables things that are not available in streaming mode, and streaming-compatible mode has to stick to the common subset. This means that some instructions are conditional on PSTATE.SM==1 and some are conditional on PSTATE.SM==0.
I wondered about recording the streaming state in a new variable. However, the set of available instructions is also influenced by PSTATE.ZA (added later), so I think it makes sense to view this as an instance of a more general mechanism. Also, keeping the PSTATE.SM state in the same flag variable as the other ISA features makes it possible to sum up the requirements of an ACLE function in a single value. The patch therefore adds a new set of feature flags called "ISA modes". Unlike the other two sets of flags (optional features and architecture- level features), these ISA modes are not controlled directly by command-line parameters or "target" attributes. arm_streaming and arm_streaming_compatible are function type attributes rather than function declaration attributes. This means that we need to find somewhere to copy the type information across to a function's target options. The patch does this in aarch64_set_current_function. We also need to record which ISA mode a callee expects/requires to be active on entry. (The same mode is then active on return.) The patch extends the current UNSPEC_CALLEE_ABI cookie to include this information, as well as the PCS variant that it recorded previously. gcc/ * config/aarch64/aarch64-isa-modes.def: New file. * config/aarch64/aarch64.h: Include it in the feature enumerations. (AARCH64_FL_SM_STATE, AARCH64_FL_ISA_MODES): New constants. (AARCH64_FL_DEFAULT_ISA_MODE): Likewise. (AARCH64_ISA_MODE): New macro. (CUMULATIVE_ARGS): Add an isa_mode field. * config/aarch64/aarch64-protos.h (aarch64_gen_callee_cookie): Declare. (aarch64_tlsdesc_abi_id): Return an arm_pcs. * config/aarch64/aarch64.cc (attr_streaming_exclusions): New variable. (aarch64_attribute_table): Add arm_streaming and arm_streaming_compatible. (aarch64_fntype_sm_state, aarch64_fntype_isa_mode): New functions. (aarch64_fndecl_sm_state, aarch64_fndecl_isa_mode): Likewise. (aarch64_gen_callee_cookie, aarch64_callee_abi): Likewise. (aarch64_insn_callee_cookie, aarch64_insn_callee_abi): Use them. (aarch64_function_arg, aarch64_output_mi_thunk): Likewise. (aarch64_init_cumulative_args): Initialize the isa_mode field. (aarch64_override_options): Add the ISA mode to the feature set. (aarch64_temporary_target::copy_from_fndecl): Likewise. (aarch64_fndecl_options, aarch64_handle_attr_arch): Likewise. (aarch64_set_current_function): Maintain the correct ISA mode. (aarch64_tlsdesc_abi_id): Return an arm_pcs. (aarch64_comp_type_attributes): Handle arm_streaming and arm_streaming_compatible. * config/aarch64/aarch64.md (tlsdesc_small_): Use aarch64_gen_callee_cookie to get the ABI cookie. * config/aarch64/t-aarch64 (TM_H): Add all feature-related .def files. gcc/testsuite/ * gcc.target/aarch64/sme/aarch64-sme.exp: New harness. * gcc.target/aarch64/sme/streaming_mode_1.c: New test. * gcc.target/aarch64/sme/streaming_mode_2.c: Likewise. * gcc.target/aarch64/auto-init-1.c: Only expect the call insn to contain 1 (const_int 0), not 2. 
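For illustration (not part of the patch itself): a sketch of how the attributes are written, mirroring the new tests added below, followed by a simplified stand-alone model of the aarch64_gen_callee_cookie/aarch64_callee_abi encoding. The names NUM_ISA_MODES, encode_cookie, cookie_pcs and cookie_isa_mode are invented for this sketch; they are not the GCC code.

    /* Attribute usage, mirroring the new streaming_mode_2.c test.  */
    void __attribute__((arm_streaming)) s_fn (void);             /* PSTATE.SM known to be 1 */
    void __attribute__((arm_streaming_compatible)) sc_fn (void); /* PSTATE.SM unknown */
    void ns_fn (void);                                           /* PSTATE.SM known to be 0 */

    __attribute__((arm_streaming)) void (*s_ptr) (void) = s_fn;  /* OK: function types match */
    void (*ns_ptr) (void) = s_fn;           /* warning: incompatible pointer type */

    /* Simplified model of the UNSPEC_CALLEE_ABI cookie: the callee's ISA mode
       sits in the low AARCH64_NUM_ISA_MODES bits, the PCS variant above it.  */
    #define NUM_ISA_MODES 2   /* SM_ON and SM_OFF in this patch.  */

    static inline unsigned int
    encode_cookie (unsigned int isa_mode, unsigned int pcs)
    {
      return isa_mode | (pcs << NUM_ISA_MODES);
    }

    static inline unsigned int
    cookie_pcs (unsigned int cookie)
    {
      return cookie >> NUM_ISA_MODES;
    }

    static inline unsigned int
    cookie_isa_mode (unsigned int cookie)
    {
      return cookie & ((1u << NUM_ISA_MODES) - 1);
    }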
--- gcc/config/aarch64/aarch64-isa-modes.def | 35 ++++ gcc/config/aarch64/aarch64-protos.h | 3 +- gcc/config/aarch64/aarch64.cc | 194 +++++++++++++++--- gcc/config/aarch64/aarch64.h | 24 ++- gcc/config/aarch64/aarch64.md | 3 +- gcc/config/aarch64/t-aarch64 | 5 +- .../gcc.target/aarch64/auto-init-1.c | 3 +- .../gcc.target/aarch64/sme/aarch64-sme.exp | 41 ++++ .../gcc.target/aarch64/sme/streaming_mode_1.c | 106 ++++++++++ .../gcc.target/aarch64/sme/streaming_mode_2.c | 25 +++ 10 files changed, 403 insertions(+), 36 deletions(-) create mode 100644 gcc/config/aarch64/aarch64-isa-modes.def create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c diff --git a/gcc/config/aarch64/aarch64-isa-modes.def b/gcc/config/aarch64/aarch64-isa-modes.def new file mode 100644 index 00000000000..fba8eafbae1 --- /dev/null +++ b/gcc/config/aarch64/aarch64-isa-modes.def @@ -0,0 +1,35 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +/* This file defines a set of "ISA modes"; in other words, it defines + various bits of runtime state that control the set of available + instructions or that affect the semantics of instructions in some way. + + Before using #include to read this file, define a macro: + + DEF_AARCH64_ISA_MODE(NAME) + + where NAME is the name of the mode. */ + +/* Indicates that PSTATE.SM is known to be 1 or 0 respectively. These + modes are mutually exclusive. If neither mode is active then the state + of PSTATE.SM is not known at compile time. 
*/ +DEF_AARCH64_ISA_MODE(SM_ON) +DEF_AARCH64_ISA_MODE(SM_OFF) + +#undef DEF_AARCH64_ISA_MODE diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 866d68ad4d7..06b926b42d6 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -771,6 +771,7 @@ bool aarch64_const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, bool aarch64_constant_address_p (rtx); bool aarch64_emit_approx_div (rtx, rtx, rtx); bool aarch64_emit_approx_sqrt (rtx, rtx, bool); +rtx aarch64_gen_callee_cookie (aarch64_feature_flags, arm_pcs); void aarch64_expand_call (rtx, rtx, rtx, bool); bool aarch64_expand_cpymem (rtx *); bool aarch64_expand_setmem (rtx *); @@ -849,7 +850,7 @@ int aarch64_movk_shift (const wide_int_ref &, const wide_int_ref &); bool aarch64_use_return_insn_p (void); const char *aarch64_output_casesi (rtx *); -unsigned int aarch64_tlsdesc_abi_id (); +arm_pcs aarch64_tlsdesc_abi_id (); enum aarch64_symbol_type aarch64_classify_symbol (rtx, HOST_WIDE_INT); enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx); enum reg_class aarch64_regno_regclass (unsigned); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index a40ac6fd903..a2e910daddf 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -2731,6 +2731,16 @@ handle_aarch64_vector_pcs_attribute (tree *node, tree name, tree, gcc_unreachable (); } +/* Mutually-exclusive function type attributes for controlling PSTATE.SM. */ +static const struct attribute_spec::exclusions attr_streaming_exclusions[] = +{ + /* Attribute name exclusion applies to: + function, type, variable */ + { "arm_streaming", false, false, false }, + { "arm_streaming_compatible", false, true, false }, + { NULL, false, false, false } +}; + /* Table of machine attributes. */ static const struct attribute_spec aarch64_attribute_table[] = { @@ -2738,6 +2748,10 @@ static const struct attribute_spec aarch64_attribute_table[] = affects_type_identity, handler, exclude } */ { "aarch64_vector_pcs", 0, 0, false, true, true, true, handle_aarch64_vector_pcs_attribute, NULL }, + { "arm_streaming", 0, 0, false, true, true, true, + NULL, attr_streaming_exclusions }, + { "arm_streaming_compatible", 0, 0, false, true, true, true, + NULL, attr_streaming_exclusions }, { "arm_sve_vector_bits", 1, 1, false, true, false, true, aarch64_sve::handle_arm_sve_vector_bits_attribute, NULL }, @@ -4048,6 +4062,47 @@ aarch64_fntype_abi (const_tree fntype) return default_function_abi; } +/* Return the state of PSTATE.SM on entry to functions of type FNTYPE. */ + +static aarch64_feature_flags +aarch64_fntype_sm_state (const_tree fntype) +{ + if (lookup_attribute ("arm_streaming", TYPE_ATTRIBUTES (fntype))) + return AARCH64_FL_SM_ON; + + if (lookup_attribute ("arm_streaming_compatible", TYPE_ATTRIBUTES (fntype))) + return 0; + + return AARCH64_FL_SM_OFF; +} + +/* Return the ISA mode on entry to functions of type FNTYPE. */ + +static aarch64_feature_flags +aarch64_fntype_isa_mode (const_tree fntype) +{ + return aarch64_fntype_sm_state (fntype); +} + +/* Return the state of PSTATE.SM when compiling the body of + function FNDECL. This might be different from the state of + PSTATE.SM on entry. */ + +static aarch64_feature_flags +aarch64_fndecl_sm_state (const_tree fndecl) +{ + return aarch64_fntype_sm_state (TREE_TYPE (fndecl)); +} + +/* Return the ISA mode that should be used to compile the body of + function FNDECL. 
*/ + +static aarch64_feature_flags +aarch64_fndecl_isa_mode (const_tree fndecl) +{ + return aarch64_fndecl_sm_state (fndecl); +} + /* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P. */ static bool @@ -4110,17 +4165,46 @@ aarch64_reg_save_mode (unsigned int regno) gcc_unreachable (); } -/* Implement TARGET_INSN_CALLEE_ABI. */ +/* Given the ISA mode on entry to a callee and the ABI of the callee, + return the CONST_INT that should be placed in an UNSPEC_CALLEE_ABI rtx. */ -const predefined_function_abi & -aarch64_insn_callee_abi (const rtx_insn *insn) +rtx +aarch64_gen_callee_cookie (aarch64_feature_flags isa_mode, arm_pcs pcs_variant) +{ + return gen_int_mode ((unsigned int) isa_mode + | (unsigned int) pcs_variant << AARCH64_NUM_ISA_MODES, + DImode); +} + +/* COOKIE is a CONST_INT from an UNSPEC_CALLEE_ABI rtx. Return the + callee's ABI. */ + +static const predefined_function_abi & +aarch64_callee_abi (rtx cookie) +{ + return function_abis[UINTVAL (cookie) >> AARCH64_NUM_ISA_MODES]; +} + +/* INSN is a call instruction. Return the CONST_INT stored in its + UNSPEC_CALLEE_ABI rtx. */ + +static rtx +aarch64_insn_callee_cookie (const rtx_insn *insn) { rtx pat = PATTERN (insn); gcc_assert (GET_CODE (pat) == PARALLEL); rtx unspec = XVECEXP (pat, 0, 1); gcc_assert (GET_CODE (unspec) == UNSPEC && XINT (unspec, 1) == UNSPEC_CALLEE_ABI); - return function_abis[INTVAL (XVECEXP (unspec, 0, 0))]; + return XVECEXP (unspec, 0, 0); +} + +/* Implement TARGET_INSN_CALLEE_ABI. */ + +const predefined_function_abi & +aarch64_insn_callee_abi (const rtx_insn *insn) +{ + return aarch64_callee_abi (aarch64_insn_callee_cookie (insn)); } /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED. The callee only saves @@ -7861,7 +7945,7 @@ aarch64_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg) || pcum->pcs_variant == ARM_PCS_SVE); if (arg.end_marker_p ()) - return gen_int_mode (pcum->pcs_variant, DImode); + return aarch64_gen_callee_cookie (pcum->isa_mode, pcum->pcs_variant); aarch64_layout_arg (pcum_v, arg); return pcum->aapcs_reg; @@ -7882,9 +7966,15 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_nextnvrn = 0; pcum->aapcs_nextnprn = 0; if (fntype) - pcum->pcs_variant = (arm_pcs) fntype_abi (fntype).id (); + { + pcum->pcs_variant = (arm_pcs) fntype_abi (fntype).id (); + pcum->isa_mode = aarch64_fntype_isa_mode (fntype); + } else - pcum->pcs_variant = ARM_PCS_AAPCS64; + { + pcum->pcs_variant = ARM_PCS_AAPCS64; + pcum->isa_mode = AARCH64_FL_DEFAULT_ISA_MODE; + } pcum->aapcs_reg = NULL_RTX; pcum->aapcs_arg_processed = false; pcum->aapcs_stack_words = 0; @@ -10372,7 +10462,9 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, } funexp = XEXP (DECL_RTL (function), 0); funexp = gen_rtx_MEM (FUNCTION_MODE, funexp); - rtx callee_abi = gen_int_mode (fndecl_abi (function).id (), DImode); + auto isa_mode = aarch64_fntype_isa_mode (TREE_TYPE (function)); + auto pcs_variant = arm_pcs (fndecl_abi (function).id ()); + rtx callee_abi = aarch64_gen_callee_cookie (isa_mode, pcs_variant); insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, callee_abi)); SIBLING_CALL_P (insn) = 1; @@ -18315,6 +18407,7 @@ aarch64_override_options (void) SUBTARGET_OVERRIDE_OPTIONS; #endif + auto isa_mode = AARCH64_FL_DEFAULT_ISA_MODE; if (cpu && arch) { /* If both -mcpu and -march are specified, warn if they are not @@ -18327,25 +18420,25 @@ aarch64_override_options (void) } selected_arch = arch->arch; - aarch64_set_asm_isa_flags (arch_isa); + aarch64_set_asm_isa_flags (arch_isa | isa_mode); } else if 
(cpu) { selected_arch = cpu->arch; - aarch64_set_asm_isa_flags (cpu_isa); + aarch64_set_asm_isa_flags (cpu_isa | isa_mode); } else if (arch) { cpu = &all_cores[arch->ident]; selected_arch = arch->arch; - aarch64_set_asm_isa_flags (arch_isa); + aarch64_set_asm_isa_flags (arch_isa | isa_mode); } else { /* No -mcpu or -march specified, so use the default CPU. */ cpu = &all_cores[TARGET_CPU_DEFAULT]; selected_arch = cpu->arch; - aarch64_set_asm_isa_flags (cpu->flags); + aarch64_set_asm_isa_flags (cpu->flags | isa_mode); } selected_tune = tune ? tune->ident : cpu->ident; @@ -18518,6 +18611,21 @@ aarch64_save_restore_target_globals (tree new_tree) TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts (); } +/* Return the target_option_node for FNDECL, or the current options + if FNDECL is null. */ + +static tree +aarch64_fndecl_options (tree fndecl) +{ + if (!fndecl) + return target_option_current_node; + + if (tree options = DECL_FUNCTION_SPECIFIC_TARGET (fndecl)) + return options; + + return target_option_default_node; +} + /* Implement TARGET_SET_CURRENT_FUNCTION. Unpack the codegen decisions like tuning and ISA features from the DECL_FUNCTION_SPECIFIC_TARGET of the function, if such exists. This function may be called multiple @@ -18527,25 +18635,24 @@ aarch64_save_restore_target_globals (tree new_tree) static void aarch64_set_current_function (tree fndecl) { - if (!fndecl || fndecl == aarch64_previous_fndecl) - return; + tree old_tree = aarch64_fndecl_options (aarch64_previous_fndecl); + tree new_tree = aarch64_fndecl_options (fndecl); - tree old_tree = (aarch64_previous_fndecl - ? DECL_FUNCTION_SPECIFIC_TARGET (aarch64_previous_fndecl) - : NULL_TREE); - - tree new_tree = DECL_FUNCTION_SPECIFIC_TARGET (fndecl); - - /* If current function has no attributes but the previous one did, - use the default node. */ - if (!new_tree && old_tree) - new_tree = target_option_default_node; + auto new_isa_mode = (fndecl + ? aarch64_fndecl_isa_mode (fndecl) + : AARCH64_FL_DEFAULT_ISA_MODE); + auto isa_flags = TREE_TARGET_OPTION (new_tree)->x_aarch64_isa_flags; /* If nothing to do, return. #pragma GCC reset or #pragma GCC pop to the default have been handled by aarch64_save_restore_target_globals from aarch64_pragma_target_parse. */ - if (old_tree == new_tree) - return; + if (old_tree == new_tree + && (!fndecl || aarch64_previous_fndecl) + && (isa_flags & AARCH64_FL_ISA_MODES) == new_isa_mode) + { + gcc_assert (AARCH64_ISA_MODE == new_isa_mode); + return; + } aarch64_previous_fndecl = fndecl; @@ -18553,7 +18660,28 @@ aarch64_set_current_function (tree fndecl) cl_target_option_restore (&global_options, &global_options_set, TREE_TARGET_OPTION (new_tree)); + /* The ISA mode can vary based on function type attributes and + function declaration attributes. Make sure that the target + options correctly reflect these attributes. 
*/ + if ((isa_flags & AARCH64_FL_ISA_MODES) != new_isa_mode) + { + auto base_flags = (aarch64_asm_isa_flags & ~AARCH64_FL_ISA_MODES); + aarch64_set_asm_isa_flags (base_flags | new_isa_mode); + + aarch64_override_options_internal (&global_options); + new_tree = build_target_option_node (&global_options, + &global_options_set); + DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_tree; + + tree new_optimize = build_optimization_node (&global_options, + &global_options_set); + if (new_optimize != optimization_default_node) + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) = new_optimize; + } + aarch64_save_restore_target_globals (new_tree); + + gcc_assert (AARCH64_ISA_MODE == new_isa_mode); } /* Enum describing the various ways we can handle attributes. @@ -18603,7 +18731,7 @@ aarch64_handle_attr_arch (const char *str) { gcc_assert (tmp_arch); selected_arch = tmp_arch->arch; - aarch64_set_asm_isa_flags (tmp_flags); + aarch64_set_asm_isa_flags (tmp_flags | AARCH64_ISA_MODE); return true; } @@ -18644,7 +18772,7 @@ aarch64_handle_attr_cpu (const char *str) gcc_assert (tmp_cpu); selected_tune = tmp_cpu->ident; selected_arch = tmp_cpu->arch; - aarch64_set_asm_isa_flags (tmp_flags); + aarch64_set_asm_isa_flags (tmp_flags | AARCH64_ISA_MODE); return true; } @@ -18744,7 +18872,7 @@ aarch64_handle_attr_isa_flags (char *str) features if the user wants to handpick specific features. */ if (strncmp ("+nothing", str, 8) == 0) { - isa_flags = 0; + isa_flags = AARCH64_ISA_MODE; str += 8; } @@ -19237,7 +19365,7 @@ aarch64_can_inline_p (tree caller, tree callee) /* Return the ID of the TLDESC ABI, initializing the descriptor if hasn't been already. */ -unsigned int +arm_pcs aarch64_tlsdesc_abi_id () { predefined_function_abi &tlsdesc_abi = function_abis[ARM_PCS_TLSDESC]; @@ -19251,7 +19379,7 @@ aarch64_tlsdesc_abi_id () SET_HARD_REG_BIT (full_reg_clobbers, regno); tlsdesc_abi.initialize (ARM_PCS_TLSDESC, full_reg_clobbers); } - return tlsdesc_abi.id (); + return ARM_PCS_TLSDESC; } /* Return true if SYMBOL_REF X binds locally. */ @@ -26956,6 +27084,10 @@ aarch64_comp_type_attributes (const_tree type1, const_tree type2) return 0; if (!check_attr ("SVE sizeless type")) return 0; + if (!check_attr ("arm_streaming")) + return 0; + if (!check_attr ("arm_streaming_compatible")) + return 0; return 1; } diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index e60f9bce023..1ac37b902bf 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -157,10 +157,13 @@ #ifndef USED_FOR_TARGET -/* Define an enum of all features (architectures and extensions). */ +/* Define an enum of all features (ISA modes, architectures and extensions). + The ISA modes must come first. 
*/ enum class aarch64_feature : unsigned char { +#define DEF_AARCH64_ISA_MODE(IDENT) IDENT, #define AARCH64_OPT_EXTENSION(A, IDENT, C, D, E, F) IDENT, #define AARCH64_ARCH(A, B, IDENT, D, E) IDENT, +#include "aarch64-isa-modes.def" #include "aarch64-option-extensions.def" #include "aarch64-arches.def" }; @@ -169,16 +172,34 @@ enum class aarch64_feature : unsigned char { #define HANDLE(IDENT) \ constexpr auto AARCH64_FL_##IDENT \ = aarch64_feature_flags (1) << int (aarch64_feature::IDENT); +#define DEF_AARCH64_ISA_MODE(IDENT) HANDLE (IDENT) #define AARCH64_OPT_EXTENSION(A, IDENT, C, D, E, F) HANDLE (IDENT) #define AARCH64_ARCH(A, B, IDENT, D, E) HANDLE (IDENT) +#include "aarch64-isa-modes.def" #include "aarch64-option-extensions.def" #include "aarch64-arches.def" #undef HANDLE +constexpr auto AARCH64_FL_SM_STATE = AARCH64_FL_SM_ON | AARCH64_FL_SM_OFF; + +constexpr unsigned int AARCH64_NUM_ISA_MODES = (0 +#define DEF_AARCH64_ISA_MODE(IDENT) + 1 +#include "aarch64-isa-modes.def" +); + +/* The mask of all ISA modes. */ +constexpr auto AARCH64_FL_ISA_MODES + = (aarch64_feature_flags (1) << AARCH64_NUM_ISA_MODES) - 1; + +/* The default ISA mode, for functions with no attributes that specify + something to the contrary. */ +constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; + #endif /* Macros to test ISA flags. */ +#define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO) #define AARCH64_ISA_FP (aarch64_isa_flags & AARCH64_FL_FP) @@ -895,6 +916,7 @@ enum arm_pcs typedef struct { enum arm_pcs pcs_variant; + aarch64_feature_flags isa_mode; int aapcs_arg_processed; /* No need to lay out this argument again. */ int aapcs_ncrn; /* Next Core register number. */ int aapcs_nextncrn; /* Next next core register number. */ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index ca2e618d9b9..cd6d5e5000c 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -7088,7 +7088,8 @@ (define_expand "tlsdesc_small_" { if (TARGET_SVE) { - rtx abi = gen_int_mode (aarch64_tlsdesc_abi_id (), DImode); + rtx abi = aarch64_gen_callee_cookie (AARCH64_ISA_MODE, + aarch64_tlsdesc_abi_id ()); rtx_insn *call = emit_call_insn (gen_tlsdesc_small_sve_ (operands[0], abi)); RTL_CONST_CALL_P (call) = 1; diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64 index ba74abc0a43..47a753c5f1b 100644 --- a/gcc/config/aarch64/t-aarch64 +++ b/gcc/config/aarch64/t-aarch64 @@ -18,7 +18,10 @@ # along with GCC; see the file COPYING3. If not see # . -TM_H += $(srcdir)/config/aarch64/aarch64-cores.def +TM_H += $(srcdir)/config/aarch64/aarch64-cores.def \ + $(srcdir)/config/aarch64/aarch64-isa-modes.def \ + $(srcdir)/config/aarch64/aarch64-option-extensions.def \ + $(srcdir)/config/aarch64/aarch64-arches.def OPTIONS_H_EXTRA += $(srcdir)/config/aarch64/aarch64-cores.def \ $(srcdir)/config/aarch64/aarch64-arches.def \ $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \ diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-1.c b/gcc/testsuite/gcc.target/aarch64/auto-init-1.c index 0fa470880bf..48c5bb6a45c 100644 --- a/gcc/testsuite/gcc.target/aarch64/auto-init-1.c +++ b/gcc/testsuite/gcc.target/aarch64/auto-init-1.c @@ -29,4 +29,5 @@ void foo() return; } -/* { dg-final { scan-rtl-dump-times "const_int 0" 11 "expand" } } */ +/* Includes 1 for the call instruction and one for a nop. 
*/ +/* { dg-final { scan-rtl-dump-times "const_int 0" 10 "expand" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp new file mode 100644 index 00000000000..c542912e14a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp @@ -0,0 +1,41 @@ +# Specific regression driver for AArch64 SME. +# Copyright (C) 2009-2022 Free Software Foundation, Inc. +# Contributed by ARM Ltd. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } then { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +# Initialize `dg'. +dg-init + +aarch64-with-arch-dg-options "" { + # Main loop. + dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ + "" "" +} + +# All done. +dg-finish diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c new file mode 100644 index 00000000000..22d4a8bcc97 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_1.c @@ -0,0 +1,106 @@ +// { dg-options "" } + +void __attribute__((arm_streaming_compatible)) sc_a (); +void sc_a (); // { dg-error "conflicting types" } + +void sc_b (); +void __attribute__((arm_streaming_compatible)) sc_b (); // { dg-error "conflicting types" } + +void __attribute__((arm_streaming_compatible)) sc_c (); +void sc_c () {} // Inherits attribute from declaration (confusingly). + +void sc_d (); +void __attribute__((arm_streaming_compatible)) sc_d () {} // { dg-error "conflicting types" } + +void __attribute__((arm_streaming_compatible)) sc_e () {} +void sc_e (); // { dg-error "conflicting types" } + +void sc_f () {} +void __attribute__((arm_streaming_compatible)) sc_f (); // { dg-error "conflicting types" } + +extern void (*sc_g) (); +extern __attribute__((arm_streaming_compatible)) void (*sc_g) (); // { dg-error "conflicting types" } + +extern __attribute__((arm_streaming_compatible)) void (*sc_h) (); +extern void (*sc_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_streaming)) s_a (); +void s_a (); // { dg-error "conflicting types" } + +void s_b (); +void __attribute__((arm_streaming)) s_b (); // { dg-error "conflicting types" } + +void __attribute__((arm_streaming)) s_c (); +void s_c () {} // Inherits attribute from declaration (confusingly). 
+ +void s_d (); +void __attribute__((arm_streaming)) s_d () {} // { dg-error "conflicting types" } + +void __attribute__((arm_streaming)) s_e () {} +void s_e (); // { dg-error "conflicting types" } + +void s_f () {} +void __attribute__((arm_streaming)) s_f (); // { dg-error "conflicting types" } + +extern void (*s_g) (); +extern __attribute__((arm_streaming)) void (*s_g) (); // { dg-error "conflicting types" } + +extern __attribute__((arm_streaming)) void (*s_h) (); +extern void (*s_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_streaming)) mixed_a (); +void __attribute__((arm_streaming_compatible)) mixed_a (); // { dg-error "conflicting types" } +// { dg-warning "ignoring attribute" "" { target *-*-* } .-1 } + +void __attribute__((arm_streaming_compatible)) mixed_b (); +void __attribute__((arm_streaming)) mixed_b (); // { dg-error "conflicting types" } +// { dg-warning "ignoring attribute" "" { target *-*-* } .-1 } + +void __attribute__((arm_streaming)) mixed_c (); +void __attribute__((arm_streaming_compatible)) mixed_c () {} // { dg-warning "ignoring attribute" } + +void __attribute__((arm_streaming_compatible)) mixed_d (); +void __attribute__((arm_streaming)) mixed_d () {} // { dg-warning "ignoring attribute" } + +void __attribute__((arm_streaming)) mixed_e () {} +void __attribute__((arm_streaming_compatible)) mixed_e (); // { dg-error "conflicting types" } +// { dg-warning "ignoring attribute" "" { target *-*-* } .-1 } + +void __attribute__((arm_streaming_compatible)) mixed_f () {} +void __attribute__((arm_streaming)) mixed_f (); // { dg-error "conflicting types" } +// { dg-warning "ignoring attribute" "" { target *-*-* } .-1 } + +extern __attribute__((arm_streaming_compatible)) void (*mixed_g) (); +extern __attribute__((arm_streaming)) void (*mixed_g) (); // { dg-error "conflicting types" } + +extern __attribute__((arm_streaming)) void (*mixed_h) (); +extern __attribute__((arm_streaming_compatible)) void (*mixed_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_streaming, arm_streaming_compatible)) contradiction_1(); // { dg-warning "conflicts with attribute" } +void __attribute__((arm_streaming_compatible, arm_streaming)) contradiction_2(); // { dg-warning "conflicts with attribute" } + +int __attribute__((arm_streaming_compatible)) int_attr; // { dg-warning "only applies to function types" } +void *__attribute__((arm_streaming)) ptr_attr; // { dg-warning "only applies to function types" } + +typedef void __attribute__((arm_streaming)) s_callback (); +typedef void __attribute__((arm_streaming_compatible)) sc_callback (); + +void (*__attribute__((arm_streaming)) s_callback_ptr) (); +void (*__attribute__((arm_streaming_compatible)) sc_callback_ptr) (); + +typedef void __attribute__((arm_streaming, arm_streaming_compatible)) contradiction_callback_1 (); // { dg-warning "conflicts with attribute" } +typedef void __attribute__((arm_streaming_compatible, arm_streaming)) contradiction_callback_2 (); // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_streaming, arm_streaming_compatible)) (*contradiction_callback_ptr_1) (); // { dg-warning "conflicts with attribute" } +void __attribute__((arm_streaming_compatible, arm_streaming)) (*contradiction_callback_ptr_2) (); // { dg-warning "conflicts with attribute" } + +struct s { + void __attribute__((arm_streaming, 
arm_streaming_compatible)) (*contradiction_callback_ptr_1) (); // { dg-warning "conflicts with attribute" } + void __attribute__((arm_streaming_compatible, arm_streaming)) (*contradiction_callback_ptr_2) (); // { dg-warning "conflicts with attribute" } +}; diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c new file mode 100644 index 00000000000..448ddb5feb1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_2.c @@ -0,0 +1,25 @@ +// { dg-options "" } + +void __attribute__((arm_streaming_compatible)) sc_fn (); +void __attribute__((arm_streaming)) s_fn (); +void ns_fn (); + +__attribute__((arm_streaming_compatible)) void (*sc_fn_ptr) (); +__attribute__((arm_streaming)) void (*s_fn_ptr) (); +void (*ns_fn_ptr) (); + +void +f () +{ + sc_fn_ptr = sc_fn; + sc_fn_ptr = s_fn; // { dg-warning "incompatible pointer type" } + sc_fn_ptr = ns_fn; // { dg-warning "incompatible pointer type" } + + s_fn_ptr = sc_fn; // { dg-warning "incompatible pointer type" } + s_fn_ptr = s_fn; + s_fn_ptr = ns_fn; // { dg-warning "incompatible pointer type" } + + ns_fn_ptr = sc_fn; // { dg-warning "incompatible pointer type" } + ns_fn_ptr = s_fn; // { dg-warning "incompatible pointer type" } + ns_fn_ptr = ns_fn; +}

From patchwork Sun Nov 13 10:00:09 2022
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 60508
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 02/16] aarch64: Add +sme
From: Richard Sandiford
Date: Sun, 13 Nov 2022 10:00:09 +0000

This patch adds the +sme ISA feature and requires it to be present when compiling arm_streaming code. (arm_streaming_compatible code does not necessarily assume the presence of SME. It just has to work when SME is present and streaming mode is enabled.)

gcc/ * doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst: Document SME. * doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst: Document aarch64_sme. * config/aarch64/aarch64-option-extensions.def (sme): Define. * config/aarch64/aarch64.h (AARCH64_ISA_SME): New macro. * config/aarch64/aarch64.cc (aarch64_override_options_internal): Ensure that SME is present when compiling streaming code. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_aarch64_sme): New target test. * gcc.target/aarch64/sme/aarch64-sme.exp: Force SME to be enabled if it isn't by default. * gcc.target/aarch64/sme/streaming_mode_3.c: New test. * gcc.target/aarch64/sme/streaming_mode_4.c: Likewise.
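For illustration (a sketch mirroring the new streaming_mode_3.c test, not part of the patch): defining a streaming function is rejected while SME is disabled, and enabling SME, for example with -march=armv8.2-a+sme or with the target pragma shown below, makes it valid.

    /* Illustrative sketch only; the error text matches the new diagnostic.  */
    #pragma GCC target "+nosme"
    void __attribute__((arm_streaming)) s_bad (void) {} /* error: streaming functions require the ISA extension 'sme' */

    #pragma GCC target "+sme"
    void __attribute__((arm_streaming)) s_ok (void) {}  /* accepted: SME is enabled */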
SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This patch adds the +sme ISA feature and requires it to be present when compiling arm_streaming code. (arm_streaming_compatible code does not necessarily assume the presence of SME. It just has to work when SME is present and streaming mode is enabled.) gcc/ * doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst: Document SME. * doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst: Document aarch64_sve. * config/aarch64/aarch64-option-extensions.def (sme): Define. * config/aarch64/aarch64.h (AARCH64_ISA_SME): New macro. * config/aarch64/aarch64.cc (aarch64_override_options_internal): Ensure that SME is present when compiling streaming code. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_aarch64_sme): New target test. * gcc.target/aarch64/sme/aarch64-sme.exp: Force SME to be enabled if it isn't by default. * gcc.target/aarch64/sme/streaming_mode_3.c: New test. --- .../aarch64/aarch64-option-extensions.def | 2 + gcc/config/aarch64/aarch64.cc | 33 ++++++++++ gcc/config/aarch64/aarch64.h | 1 + .../aarch64-options.rst | 3 + .../keywords-describing-target-attributes.rst | 3 + .../gcc.target/aarch64/sme/aarch64-sme.exp | 10 ++- .../gcc.target/aarch64/sme/streaming_mode_3.c | 63 +++++++++++++++++++ .../gcc.target/aarch64/sme/streaming_mode_4.c | 22 +++++++ gcc/testsuite/lib/target-supports.exp | 12 ++++ 9 files changed, 147 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def index bdf4baf309c..402a9832f87 100644 --- a/gcc/config/aarch64/aarch64-option-extensions.def +++ b/gcc/config/aarch64/aarch64-option-extensions.def @@ -129,6 +129,8 @@ AARCH64_OPT_EXTENSION("sve2-sha3", SVE2_SHA3, (SVE2, SHA3), (), (), "svesha3") AARCH64_OPT_EXTENSION("sve2-bitperm", SVE2_BITPERM, (SVE2), (), (), "svebitperm") +AARCH64_OPT_EXTENSION("sme", SME, (SVE2), (), (), "sme") + AARCH64_OPT_EXTENSION("tme", TME, (), (), (), "") AARCH64_OPT_EXTENSION("i8mm", I8MM, (SIMD), (), (), "i8mm") diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index a2e910daddf..fc6f0bc208a 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -11374,6 +11374,23 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2) return true; } +/* Implement TARGET_START_CALL_ARGS. */ + +static void +aarch64_start_call_args (cumulative_args_t ca_v) +{ + CUMULATIVE_ARGS *ca = get_cumulative_args (ca_v); + + if (!TARGET_SME && (ca->isa_mode & AARCH64_FL_SM_ON)) + { + error ("calling a streaming function requires the ISA extension %qs", + "sme"); + inform (input_location, "you can enable %qs using the command-line" + " option %<-march%>, or by using the %" + " attribute or pragma", "sme"); + } +} + /* This function is used by the call expanders of the machine description. RESULT is the register in which the result is returned. 
It's NULL for "call" and "sibcall". @@ -17865,6 +17882,19 @@ aarch64_override_options_internal (struct gcc_options *opts) && !fixed_regs[R18_REGNUM]) error ("%<-fsanitize=shadow-call-stack%> requires %<-ffixed-x18%>"); + if ((opts->x_aarch64_isa_flags & AARCH64_FL_SM_ON) + && !(opts->x_aarch64_isa_flags & AARCH64_FL_SME)) + { + error ("streaming functions require the ISA extension %qs", "sme"); + inform (input_location, "you can enable %qs using the command-line" + " option %<-march%>, or by using the %" + " attribute or pragma", "sme"); + opts->x_target_flags &= ~MASK_GENERAL_REGS_ONLY; + auto new_flags = (opts->x_aarch64_asm_isa_flags + | feature_deps::SME ().enable); + aarch64_set_asm_isa_flags (opts, new_flags); + } + initialize_aarch64_code_model (opts); initialize_aarch64_tls_size (opts); @@ -27721,6 +27751,9 @@ aarch64_run_selftests (void) #undef TARGET_FUNCTION_VALUE_REGNO_P #define TARGET_FUNCTION_VALUE_REGNO_P aarch64_function_value_regno_p +#undef TARGET_START_CALL_ARGS +#define TARGET_START_CALL_ARGS aarch64_start_call_args + #undef TARGET_GIMPLE_FOLD_BUILTIN #define TARGET_GIMPLE_FOLD_BUILTIN aarch64_gimple_fold_builtin diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 1ac37b902bf..c47f27eefec 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -214,6 +214,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define AARCH64_ISA_SVE2_BITPERM (aarch64_isa_flags & AARCH64_FL_SVE2_BITPERM) #define AARCH64_ISA_SVE2_SHA3 (aarch64_isa_flags & AARCH64_FL_SVE2_SHA3) #define AARCH64_ISA_SVE2_SM4 (aarch64_isa_flags & AARCH64_FL_SVE2_SM4) +#define AARCH64_ISA_SME (aarch64_isa_flags & AARCH64_FL_SME) #define AARCH64_ISA_V8_3A (aarch64_isa_flags & AARCH64_FL_V8_3A) #define AARCH64_ISA_DOTPROD (aarch64_isa_flags & AARCH64_FL_DOTPROD) #define AARCH64_ISA_AES (aarch64_isa_flags & AARCH64_FL_AES) diff --git a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst index c2b23a6ee97..f6d82f4435b 100644 --- a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst +++ b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst @@ -544,6 +544,9 @@ the following and their inverses no :samp:`{feature}` : :samp:`pauth` Enable the Pointer Authentication Extension. +:samp:`sme` + Enable the Scalable Matrix Extension. + Feature ``crypto`` implies ``aes``, ``sha2``, and ``simd``, which implies ``fp``. Conversely, ``nofp`` implies ``nosimd``, which implies diff --git a/gcc/doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst b/gcc/doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst index 709e4ea2b90..84822b4335c 100644 --- a/gcc/doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst +++ b/gcc/doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst @@ -886,6 +886,9 @@ AArch64-specific attributes AArch64 target that is able to generate and execute armv8.3-a FJCVTZS instruction. +``aarch64_sme`` + AArch64 target that generates instructions for SME. 
+ MIPS-specific attributes ~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp index c542912e14a..b3ad2ea4c5e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp +++ b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme.exp @@ -31,10 +31,16 @@ load_lib gcc-dg.exp # Initialize `dg'. dg-init -aarch64-with-arch-dg-options "" { +if { [check_effective_target_aarch64_sme] } { + set sme_flags "" +} else { + set sme_flags "-march=armv8.2-a+sme" +} + +aarch64-with-arch-dg-options $sme_flags { # Main loop. dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \ - "" "" + "" $sme_flags } # All done. diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c new file mode 100644 index 00000000000..926ffa24e45 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_3.c @@ -0,0 +1,63 @@ +// { dg-options "" } + +#pragma GCC target "+nosme" + +void __attribute__((arm_streaming_compatible)) sc_a () {} +void __attribute__((arm_streaming)) s_a () {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void ns_a () {} + +void __attribute__((arm_streaming_compatible)) sc_b () {} +void ns_b () {} +void __attribute__((arm_streaming)) s_b () {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void __attribute__((arm_streaming_compatible)) sc_c () {} +void __attribute__((arm_streaming_compatible)) sc_d () {} + +void __attribute__((arm_streaming)) s_c () {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void __attribute__((arm_streaming)) s_d () {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void ns_c () {} +void ns_d () {} + +void __attribute__((arm_streaming_compatible)) sc_e (); +void __attribute__((arm_streaming)) s_e (); +void ns_e (); + +#pragma GCC target "+sme" + +void __attribute__((arm_streaming_compatible)) sc_f () {} +void __attribute__((arm_streaming)) s_f () {} +void ns_f () {} + +void __attribute__((arm_streaming_compatible)) sc_g () {} +void ns_g () {} +void __attribute__((arm_streaming)) s_g () {} + +void __attribute__((arm_streaming_compatible)) sc_h () {} +void __attribute__((arm_streaming_compatible)) sc_i () {} + +void __attribute__((arm_streaming)) s_h () {} +void __attribute__((arm_streaming)) s_i () {} + +void ns_h () {} +void ns_i () {} + +void __attribute__((arm_streaming_compatible)) sc_j (); +void __attribute__((arm_streaming)) s_j (); +void ns_j (); + +#pragma GCC target "+sme" + +void __attribute__((arm_streaming_compatible)) sc_k () {} + +#pragma GCC target "+nosme" +#pragma GCC target "+sme" + +void __attribute__((arm_streaming)) s_k () {} + +#pragma GCC target "+nosme" +#pragma GCC target "+sme" + +void ns_k () {} + +#pragma GCC target "+nosme" diff --git a/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c new file mode 100644 index 00000000000..d777d7ee0d9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/streaming_mode_4.c @@ -0,0 +1,22 @@ +// { dg-options "-mgeneral-regs-only" } + +void __attribute__((arm_streaming_compatible)) sc_a () {} +void __attribute__((arm_streaming)) s_a () {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void ns_a () {} + +void __attribute__((arm_streaming_compatible)) sc_b () {} +void ns_b () {} +void __attribute__((arm_streaming)) s_b () {} // { dg-error "streaming 
functions require the ISA extension 'sme'" } + +void __attribute__((arm_streaming_compatible)) sc_c () {} +void __attribute__((arm_streaming_compatible)) sc_d () {} + +void __attribute__((arm_streaming)) s_c () {} // { dg-error "streaming functions require the ISA extension 'sme'" } +void __attribute__((arm_streaming)) s_d () {} // { dg-error "streaming functions require the ISA extension 'sme'" } + +void ns_c () {} +void ns_d () {} + +void __attribute__((arm_streaming_compatible)) sc_e (); +void __attribute__((arm_streaming)) s_e (); +void ns_e (); diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index c7f583d6d14..f6cb16521b3 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -3967,6 +3967,18 @@ proc aarch64_sve_bits { } { }] } +# Return 1 if this is an AArch64 target that generates instructions for SME. +proc check_effective_target_aarch64_sme { } { + if { ![istarget aarch64*-*-*] } { + return 0 + } + return [check_no_compiler_messages aarch64_sme assembly { + #if !defined (__ARM_FEATURE_SME) + #error FOO + #endif + }] +} + # Return 1 if this is a compiler supporting ARC atomic operations proc check_effective_target_arc_atomic { } { return [check_no_compiler_messages arc_atomic assembly { From patchwork Sun Nov 13 10:00:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60509 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A00A1388B683 for ; Sun, 13 Nov 2022 10:01:52 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A00A1388B683 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333712; bh=J0q2U764hhewRK+e0IjhelrBV3TOd2lCO5+OgLfwcxM=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=uFZzKgnb4pnu6sHkW52SGlW8TxRbguLwNRGJF8Rfj7jWRuatd1w3gLFQDHAy36Xos GrV2K70wK5AiEixgLoIbUGvEnMzRfYGjec+UvJJzk27RAjoq0GmEbMHBb1rjyzq+em IPl8CtvDk9zubeC4iQbAyzJzShharGI63IC1pMMc= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id A5E303889E36 for ; Sun, 13 Nov 2022 10:00:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A5E303889E36 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9E24D23A for ; Sun, 13 Nov 2022 02:00:33 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C90193F73D for ; Sun, 13 Nov 2022 02:00:26 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 03/16] aarch64: Distinguish streaming-compatible AdvSIMD insns References: Date: Sun, 13 Nov 2022 10:00:25 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-42.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no 
From patchwork Sun Nov 13 10:00:25 2022
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 60509
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 03/16] aarch64: Distinguish streaming-compatible AdvSIMD insns
From: Richard Sandiford
Date: Sun, 13 Nov 2022 10:00:25 +0000

The vast majority of Advanced SIMD instructions are not available in streaming mode, but some of the load/store/move instructions are. This patch adds a new target feature macro called TARGET_BASE_SIMD for this streaming-compatible subset. The vector-to-vector move instructions are not streaming-compatible, so we need to use the SVE move instructions where enabled, or fall back to the nofp16 handling otherwise. I haven't found a good way of testing the SVE EXT alternative in aarch64_simd_mov_from_high, but I'd rather provide it than not.

gcc/ * config/aarch64/aarch64.h (TARGET_BASE_SIMD): New macro. (TARGET_SIMD): Require PSTATE.SM to be 0. (AARCH64_ISA_SM_OFF): New macro. * config/aarch64/aarch64.cc (aarch64_array_mode_supported_p): Allow Advanced SIMD structure modes for TARGET_BASE_SIMD. (aarch64_print_operand): Support '%Z'. (aarch64_secondary_reload): Expect SVE moves to be used for Advanced SIMD modes if SVE is enabled and non-streaming Advanced SIMD isn't. (aarch64_register_move_cost): Likewise. (aarch64_simd_container_mode): Extend Advanced SIMD mode handling to TARGET_BASE_SIMD. (aarch64_expand_cpymem): Expand commentary. * config/aarch64/aarch64.md (arches): Add base_simd. (arch_enabled): Handle it. (*mov_aarch64): Extend UMOV alternative to TARGET_BASE_SIMD. (*movti_aarch64): Use an SVE move instruction if non-streaming SIMD isn't available. (*mov_aarch64): Likewise. (load_pair_dw_tftf): Extend to TARGET_BASE_SIMD. (store_pair_dw_tftf): Likewise. (loadwb_pair_): Likewise. (storewb_pair_): Likewise. * config/aarch64/aarch64-simd.md (*aarch64_simd_mov): Allow UMOV in streaming mode. (*aarch64_simd_mov): Use an SVE move instruction if non-streaming SIMD isn't available. (aarch64_store_lane0): Depend on TARGET_FLOAT rather than TARGET_SIMD. (aarch64_simd_mov_from_low): Likewise. Use fmov if Advanced SIMD is completely disabled. (aarch64_simd_mov_from_high): Use SVE EXT instructions if non-streaming SIMD isn't available. gcc/testsuite/ * gcc.target/aarch64/movdf_2.c: New test. * gcc.target/aarch64/movdi_3.c: Likewise. * gcc.target/aarch64/movhf_2.c: Likewise. * gcc.target/aarch64/movhi_2.c: Likewise. * gcc.target/aarch64/movqi_2.c: Likewise. * gcc.target/aarch64/movsf_2.c: Likewise. * gcc.target/aarch64/movsi_2.c: Likewise. * gcc.target/aarch64/movtf_3.c: Likewise. * gcc.target/aarch64/movtf_4.c: Likewise. * gcc.target/aarch64/movti_3.c: Likewise. * gcc.target/aarch64/movti_4.c: Likewise. * gcc.target/aarch64/movv16qi_4.c: Likewise. * gcc.target/aarch64/movv16qi_5.c: Likewise. * gcc.target/aarch64/movv8qi_4.c: Likewise. * gcc.target/aarch64/sme/arm_neon_1.c: Likewise. * gcc.target/aarch64/sme/arm_neon_2.c: Likewise. * gcc.target/aarch64/sme/arm_neon_3.c: Likewise.
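For orientation, a rough user-level sketch (hypothetical; loosely modelled on the new movti/movtf tests listed above, not code taken from the patch): 128-bit copies still have to work in functions that may execute in streaming mode, using only the base subset plus the SVE or general-register fallbacks that this patch adds for register-to-register moves.

    /* Hypothetical example: 128-bit copies in a streaming-compatible function.
       These must not rely on non-streaming Advanced SIMD; the patch provides
       SVE and general-register alternatives for the register-to-register cases.  */
    void __attribute__((arm_streaming_compatible))
    copy128 (__int128 *di, const __int128 *si, long double *df, const long double *sf)
    {
      *di = *si;  /* TImode move */
      *df = *sf;  /* TFmode move */
    }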
--- gcc/config/aarch64/aarch64-simd.md | 43 ++++++---- gcc/config/aarch64/aarch64.cc | 22 +++-- gcc/config/aarch64/aarch64.h | 12 ++- gcc/config/aarch64/aarch64.md | 45 +++++----- gcc/testsuite/gcc.target/aarch64/movdf_2.c | 51 +++++++++++ gcc/testsuite/gcc.target/aarch64/movdi_3.c | 59 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movhf_2.c | 53 ++++++++++++ gcc/testsuite/gcc.target/aarch64/movhi_2.c | 61 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movqi_2.c | 59 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movsf_2.c | 51 +++++++++++ gcc/testsuite/gcc.target/aarch64/movsi_2.c | 59 +++++++++++++ gcc/testsuite/gcc.target/aarch64/movtf_3.c | 81 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movtf_4.c | 78 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movti_3.c | 86 +++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movti_4.c | 83 ++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movv16qi_4.c | 82 ++++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movv16qi_5.c | 79 +++++++++++++++++ gcc/testsuite/gcc.target/aarch64/movv8qi_4.c | 55 ++++++++++++ .../gcc.target/aarch64/sme/arm_neon_1.c | 13 +++ .../gcc.target/aarch64/sme/arm_neon_2.c | 11 +++ .../gcc.target/aarch64/sme/arm_neon_3.c | 11 +++ 21 files changed, 1047 insertions(+), 47 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/movdf_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movdi_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movhf_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movhi_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movqi_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movsf_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movsi_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movtf_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movtf_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movti_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movti_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movv16qi_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movv16qi_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/movv8qi_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 5386043739a..b6313cba172 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -133,7 +133,7 @@ (define_insn "*aarch64_simd_mov" return "mov\t%0., %1."; return "fmov\t%d0, %d1"; case 4: - if (TARGET_SIMD) + if (TARGET_BASE_SIMD) return "umov\t%0, %1.d[0]"; return "fmov\t%x0, %d1"; case 5: return "fmov\t%d0, %1"; @@ -152,9 +152,9 @@ (define_insn "*aarch64_simd_mov" (define_insn "*aarch64_simd_mov" [(set (match_operand:VQMOV 0 "nonimmediate_operand" - "=w, Umn, m, w, ?r, ?w, ?r, w, w") + "=w, Umn, m, w, w, ?r, ?w, ?r, w, w") (match_operand:VQMOV 1 "general_operand" - "m, Dz, w, w, w, r, r, Dn, Dz"))] + "m, Dz, w, w, w, w, r, r, Dn, Dz"))] "TARGET_FLOAT && (register_operand (operands[0], mode) || aarch64_simd_reg_or_zero (operands[1], mode))" @@ -170,22 +170,24 @@ (define_insn "*aarch64_simd_mov" case 3: return "mov\t%0., %1."; case 4: + return "mov\t%Z0.d, %Z1.d"; case 5: case 6: - return "#"; case 7: - return aarch64_output_simd_mov_immediate (operands[1], 128); + return "#"; case 8: + return aarch64_output_simd_mov_immediate (operands[1], 128); + 
case 9: return "fmov\t%d0, xzr"; default: gcc_unreachable (); } } [(set_attr "type" "neon_load1_1reg, store_16, neon_store1_1reg,\ - neon_logic, multiple, multiple,\ - multiple, neon_move, fmov") - (set_attr "length" "4,4,4,4,8,8,8,4,4") - (set_attr "arch" "*,*,*,simd,*,*,*,simd,*")] + neon_logic, *, multiple, multiple,\ + multiple, neon_move, f_mcr") + (set_attr "length" "4,4,4,4,4,8,8,8,4,4") + (set_attr "arch" "*,*,*,simd,sve,*,*,*,simd,*")] ) ;; When storing lane zero we can use the normal STR and its more permissive @@ -195,7 +197,7 @@ (define_insn "aarch64_store_lane0" [(set (match_operand: 0 "memory_operand" "=m") (vec_select: (match_operand:VALL_F16 1 "register_operand" "w") (parallel [(match_operand 2 "const_int_operand" "n")])))] - "TARGET_SIMD + "TARGET_FLOAT && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0" "str\\t%1, %0" [(set_attr "type" "neon_store1_1reg")] @@ -353,35 +355,38 @@ (define_expand "aarch64_get_high" ) (define_insn_and_split "aarch64_simd_mov_from_low" - [(set (match_operand: 0 "register_operand" "=w,?r") + [(set (match_operand: 0 "register_operand" "=w,?r,?r") (vec_select: - (match_operand:VQMOV_NO2E 1 "register_operand" "w,w") + (match_operand:VQMOV_NO2E 1 "register_operand" "w,w,w") (match_operand:VQMOV_NO2E 2 "vect_par_cnst_lo_half" "")))] - "TARGET_SIMD" + "TARGET_FLOAT" "@ # - umov\t%0, %1.d[0]" + umov\t%0, %1.d[0] + fmov\t%0, %d1" "&& reload_completed && aarch64_simd_register (operands[0], mode)" [(set (match_dup 0) (match_dup 1))] { operands[1] = aarch64_replace_reg_mode (operands[1], mode); } - [(set_attr "type" "mov_reg,neon_to_gp") + [(set_attr "type" "mov_reg,neon_to_gp,f_mrc") + (set_attr "arch" "simd,base_simd,*") (set_attr "length" "4")] ) (define_insn "aarch64_simd_mov_from_high" - [(set (match_operand: 0 "register_operand" "=w,?r,?r") + [(set (match_operand: 0 "register_operand" "=w,w,?r,?r") (vec_select: - (match_operand:VQMOV_NO2E 1 "register_operand" "w,w,w") + (match_operand:VQMOV_NO2E 1 "register_operand" "w,0,w,w") (match_operand:VQMOV_NO2E 2 "vect_par_cnst_hi_half" "")))] "TARGET_FLOAT" "@ dup\t%d0, %1.d[1] + ext\t%Z0.b, %Z0.b, %Z0.b, #8 umov\t%0, %1.d[1] fmov\t%0, %1.d[1]" - [(set_attr "type" "neon_dup,neon_to_gp,f_mrc") - (set_attr "arch" "simd,simd,*") + [(set_attr "type" "neon_dup,*,neon_to_gp,f_mrc") + (set_attr "arch" "simd,sve,simd,*") (set_attr "length" "4")] ) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index fc6f0bc208a..36ef0435b4e 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -3726,7 +3726,7 @@ static bool aarch64_array_mode_supported_p (machine_mode mode, unsigned HOST_WIDE_INT nelems) { - if (TARGET_SIMD + if (TARGET_BASE_SIMD && (AARCH64_VALID_SIMD_QREG_MODE (mode) || AARCH64_VALID_SIMD_DREG_MODE (mode)) && (nelems >= 2 && nelems <= 4)) @@ -11876,6 +11876,10 @@ sizetochar (int size) 'N': Take the duplicated element in a vector constant and print the negative of it in decimal. 'b/h/s/d/q': Print a scalar FP/SIMD register name. + 'Z': Same for SVE registers. ('z' was already taken.) + Note that it is not necessary to use %Z for operands + that have SVE modes. The convention is to use %Z + only for non-SVE (or potentially non-SVE) modes. 'S/T/U/V': Print a FP/SIMD register name for a register list. The register printed is the FP/SIMD register name of X + 0/1/2/3 for S/T/U/V. 
@@ -12048,6 +12052,8 @@ aarch64_print_operand (FILE *f, rtx x, int code) case 's': case 'd': case 'q': + case 'Z': + code = TOLOWER (code); if (!REG_P (x) || !FP_REGNUM_P (REGNO (x))) { output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code); @@ -12702,8 +12708,8 @@ aarch64_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x, return NO_REGS; } - /* Without the TARGET_SIMD instructions we cannot move a Q register - to a Q register directly. We need a scratch. */ + /* Without the TARGET_SIMD or TARGET_SVE instructions we cannot move a + Q register to a Q register directly. We need a scratch. */ if (REG_P (x) && (mode == TFmode || mode == TImode @@ -15273,7 +15279,7 @@ aarch64_register_move_cost (machine_mode mode, secondary reload. A general register is used as a scratch to move the upper DI value and the lower DI value is moved directly, hence the cost is the sum of three moves. */ - if (! TARGET_SIMD) + if (!TARGET_SIMD && !TARGET_SVE) return regmove_cost->GP2FP + regmove_cost->FP2GP + regmove_cost->FP2FP; return regmove_cost->FP2FP; @@ -20773,7 +20779,7 @@ aarch64_simd_container_mode (scalar_mode mode, poly_int64 width) return aarch64_full_sve_mode (mode).else_mode (word_mode); gcc_assert (known_eq (width, 64) || known_eq (width, 128)); - if (TARGET_SIMD) + if (TARGET_BASE_SIMD) { if (known_eq (width, 128)) return aarch64_vq_mode (mode).else_mode (word_mode); @@ -24908,7 +24914,11 @@ aarch64_expand_cpymem (rtx *operands) int copy_bits = 256; /* Default to 256-bit LDP/STP on large copies, however small copies, no SIMD - support or slow 256-bit LDP/STP fall back to 128-bit chunks. */ + support or slow 256-bit LDP/STP fall back to 128-bit chunks. + + ??? Although it would be possible to use LDP/STP Qn in streaming mode + (so using TARGET_BASE_SIMD instead of TARGET_SIMD), it isn't clear + whether that would improve performance. */ if (size <= 24 || !TARGET_SIMD || (aarch64_tune_params.extra_tuning_flags diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index c47f27eefec..398cc03fd1f 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -61,8 +61,15 @@ #define WORDS_BIG_ENDIAN (BYTES_BIG_ENDIAN) /* AdvSIMD is supported in the default configuration, unless disabled by - -mgeneral-regs-only or by the +nosimd extension. */ -#define TARGET_SIMD (AARCH64_ISA_SIMD) + -mgeneral-regs-only or by the +nosimd extension. The set of available + instructions is then subdivided into: + + - the "base" set, available both in SME streaming mode and in + non-streaming mode + + - the full set, available only in non-streaming mode. */ +#define TARGET_BASE_SIMD (AARCH64_ISA_SIMD) +#define TARGET_SIMD (AARCH64_ISA_SIMD && AARCH64_ISA_SM_OFF) #define TARGET_FLOAT (AARCH64_ISA_FP) #define UNITS_PER_WORD 8 @@ -199,6 +206,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* Macros to test ISA flags. */ +#define AARCH64_ISA_SM_OFF (aarch64_isa_flags & AARCH64_FL_SM_OFF) #define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index cd6d5e5000c..3dc877ba9fe 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -374,7 +374,7 @@ (define_constants ;; As a convenience, "fp_q" means "fp" + the ability to move between ;; Q registers and is equivalent to "simd". 
-(define_enum "arches" [ any rcpc8_4 fp fp_q simd sve fp16]) +(define_enum "arches" [any rcpc8_4 fp fp_q base_simd simd sve fp16]) (define_enum_attr "arch" "arches" (const_string "any")) @@ -402,6 +402,9 @@ (define_attr "arch_enabled" "no,yes" (and (eq_attr "arch" "fp") (match_test "TARGET_FLOAT")) + (and (eq_attr "arch" "base_simd") + (match_test "TARGET_BASE_SIMD")) + (and (eq_attr "arch" "fp_q, simd") (match_test "TARGET_SIMD")) @@ -1215,7 +1218,7 @@ (define_insn "*mov_aarch64" case 8: return "str\t%1, %0"; case 9: - return TARGET_SIMD ? "umov\t%w0, %1.[0]" : "fmov\t%w0, %s1"; + return TARGET_BASE_SIMD ? "umov\t%w0, %1.[0]" : "fmov\t%w0, %s1"; case 10: return TARGET_SIMD ? "dup\t%0., %w1" : "fmov\t%s0, %w1"; case 11: @@ -1395,9 +1398,9 @@ (define_expand "movti" (define_insn "*movti_aarch64" [(set (match_operand:TI 0 - "nonimmediate_operand" "= r,w,w,w, r,w,r,m,m,w,m") + "nonimmediate_operand" "= r,w,w,w, r,w,w,r,m,m,w,m") (match_operand:TI 1 - "aarch64_movti_operand" " rUti,Z,Z,r, w,w,m,r,Z,m,w"))] + "aarch64_movti_operand" " rUti,Z,Z,r, w,w,w,m,r,Z,m,w"))] "(register_operand (operands[0], TImode) || aarch64_reg_or_zero (operands[1], TImode))" "@ @@ -1407,16 +1410,17 @@ (define_insn "*movti_aarch64" # # mov\\t%0.16b, %1.16b + mov\\t%Z0.d, %Z1.d ldp\\t%0, %H0, %1 stp\\t%1, %H1, %0 stp\\txzr, xzr, %0 ldr\\t%q0, %1 str\\t%q1, %0" - [(set_attr "type" "multiple,neon_move,f_mcr,f_mcr,f_mrc,neon_logic_q, \ + [(set_attr "type" "multiple,neon_move,f_mcr,f_mcr,f_mrc,neon_logic_q,*,\ load_16,store_16,store_16,\ load_16,store_16") - (set_attr "length" "8,4,4,8,8,4,4,4,4,4,4") - (set_attr "arch" "*,simd,*,*,*,simd,*,*,*,fp,fp")] + (set_attr "length" "8,4,4,8,8,4,4,4,4,4,4,4") + (set_attr "arch" "*,simd,*,*,*,simd,sve,*,*,*,fp,fp")] ) ;; Split a TImode register-register or register-immediate move into @@ -1552,13 +1556,14 @@ (define_split (define_insn "*mov_aarch64" [(set (match_operand:TFD 0 - "nonimmediate_operand" "=w,?r ,w ,?r,w,?w,w,m,?r,m ,m") + "nonimmediate_operand" "=w,w,?r ,w ,?r,w,?w,w,m,?r,m ,m") (match_operand:TFD 1 - "general_operand" " w,?rY,?r,w ,Y,Y ,m,w,m ,?r,Y"))] + "general_operand" " w,w,?rY,?r,w ,Y,Y ,m,w,m ,?r,Y"))] "TARGET_FLOAT && (register_operand (operands[0], mode) || aarch64_reg_or_fp_zero (operands[1], mode))" "@ mov\\t%0.16b, %1.16b + mov\\t%Z0.d, %Z1.d # # # @@ -1569,10 +1574,10 @@ (define_insn "*mov_aarch64" ldp\\t%0, %H0, %1 stp\\t%1, %H1, %0 stp\\txzr, xzr, %0" - [(set_attr "type" "logic_reg,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\ + [(set_attr "type" "logic_reg,*,multiple,f_mcr,f_mrc,neon_move_q,f_mcr,\ f_loadd,f_stored,load_16,store_16,store_16") - (set_attr "length" "4,8,8,8,4,4,4,4,4,4,4") - (set_attr "arch" "simd,*,*,*,simd,*,*,*,*,*,*")] + (set_attr "length" "4,4,8,8,8,4,4,4,4,4,4,4") + (set_attr "arch" "simd,sve,*,*,*,simd,*,*,*,*,*,*")] ) (define_split @@ -1756,7 +1761,7 @@ (define_insn "load_pair_dw_tftf" (match_operand:TF 1 "aarch64_mem_pair_operand" "Ump")) (set (match_operand:TF 2 "register_operand" "=w") (match_operand:TF 3 "memory_operand" "m"))] - "TARGET_SIMD + "TARGET_BASE_SIMD && rtx_equal_p (XEXP (operands[3], 0), plus_constant (Pmode, XEXP (operands[1], 0), @@ -1806,11 +1811,11 @@ (define_insn "store_pair_dw_tftf" (match_operand:TF 1 "register_operand" "w")) (set (match_operand:TF 2 "memory_operand" "=m") (match_operand:TF 3 "register_operand" "w"))] - "TARGET_SIMD && - rtx_equal_p (XEXP (operands[2], 0), - plus_constant (Pmode, - XEXP (operands[0], 0), - GET_MODE_SIZE (TFmode)))" + "TARGET_BASE_SIMD + && rtx_equal_p (XEXP (operands[2], 0), + 
plus_constant (Pmode, + XEXP (operands[0], 0), + GET_MODE_SIZE (TFmode)))" "stp\\t%q1, %q3, %z0" [(set_attr "type" "neon_stp_q") (set_attr "fp" "yes")] @@ -1858,7 +1863,7 @@ (define_insn "loadwb_pair_" (set (match_operand:TX 3 "register_operand" "=w") (mem:TX (plus:P (match_dup 1) (match_operand:P 5 "const_int_operand" "n"))))])] - "TARGET_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)" + "TARGET_BASE_SIMD && INTVAL (operands[5]) == GET_MODE_SIZE (mode)" "ldp\\t%q2, %q3, [%1], %4" [(set_attr "type" "neon_ldp_q")] ) @@ -1908,7 +1913,7 @@ (define_insn "storewb_pair_" (set (mem:TX (plus:P (match_dup 0) (match_operand:P 5 "const_int_operand" "n"))) (match_operand:TX 3 "register_operand" "w"))])] - "TARGET_SIMD + "TARGET_BASE_SIMD && INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (mode)" "stp\\t%q2, %q3, [%0, %4]!" diff --git a/gcc/testsuite/gcc.target/aarch64/movdf_2.c b/gcc/testsuite/gcc.target/aarch64/movdf_2.c new file mode 100644 index 00000000000..c2454d2c83e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movdf_2.c @@ -0,0 +1,51 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** fpr_to_fpr: +** fmov d0, d1 +** ret +*/ +double __attribute__((arm_streaming_compatible)) +fpr_to_fpr (double q0, double q1) +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov d0, x0 +** ret +*/ +double __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register double x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +double __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return 0; +} + +/* +** fpr_to_gpr: +** fmov x0, d0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (double q0) +{ + register double x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movdi_3.c b/gcc/testsuite/gcc.target/aarch64/movdi_3.c new file mode 100644 index 00000000000..5d369b27356 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movdi_3.c @@ -0,0 +1,59 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +/* +** fpr_to_fpr: +** fmov d0, d1 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_fpr (void) +{ + register uint64_t q0 asm ("q0"); + register uint64_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov d0, x0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +gpr_to_fpr (uint64_t x0) +{ + register uint64_t q0 asm ("q0"); + q0 = x0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +void __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + register uint64_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** fmov x0, d0 +** ret +*/ +uint64_t __attribute__((arm_streaming_compatible)) +fpr_to_gpr () +{ + register uint64_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movhf_2.c b/gcc/testsuite/gcc.target/aarch64/movhf_2.c new file mode 100644 index 00000000000..cf3af357b84 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movhf_2.c @@ -0,0 +1,53 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nothing+simd" + +/* +** 
fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +_Float16 __attribute__((arm_streaming_compatible)) +fpr_to_fpr (_Float16 q0, _Float16 q1) +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +_Float16 __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register _Float16 w0 asm ("w0"); + asm volatile ("" : "=r" (w0)); + return w0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +_Float16 __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return 0; +} + +/* +** fpr_to_gpr: +** fmov w0, s0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (_Float16 q0) +{ + register _Float16 w0 asm ("w0"); + w0 = q0; + asm volatile ("" :: "r" (w0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movhi_2.c b/gcc/testsuite/gcc.target/aarch64/movhi_2.c new file mode 100644 index 00000000000..108923449b9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movhi_2.c @@ -0,0 +1,61 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nothing+simd" + +#include + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_fpr (void) +{ + register uint16_t q0 asm ("q0"); + register uint16_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +gpr_to_fpr (uint16_t w0) +{ + register uint16_t q0 asm ("q0"); + q0 = w0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +void __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + register uint16_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** umov w0, v0.h\[0\] +** ret +*/ +uint16_t __attribute__((arm_streaming_compatible)) +fpr_to_gpr () +{ + register uint16_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movqi_2.c b/gcc/testsuite/gcc.target/aarch64/movqi_2.c new file mode 100644 index 00000000000..a28547d2ba3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movqi_2.c @@ -0,0 +1,59 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_fpr (void) +{ + register uint8_t q0 asm ("q0"); + register uint8_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +gpr_to_fpr (uint8_t w0) +{ + register uint8_t q0 asm ("q0"); + q0 = w0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +void __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + register uint8_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** umov w0, v0.b\[0\] +** ret +*/ +uint8_t __attribute__((arm_streaming_compatible)) +fpr_to_gpr () +{ + register uint8_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movsf_2.c b/gcc/testsuite/gcc.target/aarch64/movsf_2.c new file mode 100644 index 00000000000..53abd380510 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movsf_2.c @@ -0,0 +1,51 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { 
dg-final { check-function-bodies "**" "" "" } } */ + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +float __attribute__((arm_streaming_compatible)) +fpr_to_fpr (float q0, float q1) +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +float __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register float w0 asm ("w0"); + asm volatile ("" : "=r" (w0)); + return w0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +float __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return 0; +} + +/* +** fpr_to_gpr: +** fmov w0, s0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (float q0) +{ + register float w0 asm ("w0"); + w0 = q0; + asm volatile ("" :: "r" (w0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movsi_2.c b/gcc/testsuite/gcc.target/aarch64/movsi_2.c new file mode 100644 index 00000000000..a0159d3fc1e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movsi_2.c @@ -0,0 +1,59 @@ +/* { dg-do assemble } */ +/* { dg-options "-O --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +/* +** fpr_to_fpr: +** fmov s0, s1 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_fpr (void) +{ + register uint32_t q0 asm ("q0"); + register uint32_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: +** fmov s0, w0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +gpr_to_fpr (uint32_t w0) +{ + register uint32_t q0 asm ("q0"); + q0 = w0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +void __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + register uint32_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: +** fmov w0, s0 +** ret +*/ +uint32_t __attribute__((arm_streaming_compatible)) +fpr_to_gpr () +{ + register uint32_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movtf_3.c b/gcc/testsuite/gcc.target/aarch64/movtf_3.c new file mode 100644 index 00000000000..d38f59e2a1f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movtf_3.c @@ -0,0 +1,81 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target large_long_double } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +/* +** fpr_to_fpr: +** sub sp, sp, #16 +** str q1, \[sp\] +** ldr q0, \[sp\] +** add sp, sp, #?16 +** ret +*/ +long double __attribute__((arm_streaming_compatible)) +fpr_to_fpr (long double q0, long double q1) +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +long double __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register long double x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +long double __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return 0; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +void 
__attribute__((arm_streaming_compatible)) +fpr_to_gpr (long double q0) +{ + register long double x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movtf_4.c b/gcc/testsuite/gcc.target/aarch64/movtf_4.c new file mode 100644 index 00000000000..5b7486c7887 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movtf_4.c @@ -0,0 +1,78 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target large_long_double } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+sve" + +/* +** fpr_to_fpr: +** mov z0.d, z1.d +** ret +*/ +long double __attribute__((arm_streaming_compatible)) +fpr_to_fpr (long double q0, long double q1) +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +long double __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register long double x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov s0, wzr +** ret +*/ +long double __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return 0; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (long double q0) +{ + register long double x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movti_3.c b/gcc/testsuite/gcc.target/aarch64/movti_3.c new file mode 100644 index 00000000000..d846b09497e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movti_3.c @@ -0,0 +1,86 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +/* +** fpr_to_fpr: +** sub sp, sp, #16 +** str q1, \[sp\] +** ldr q0, \[sp\] +** add sp, sp, #?16 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_fpr (void) +{ + register __int128_t q0 asm ("q0"); + register __int128_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +gpr_to_fpr (__int128_t x0) +{ + register __int128_t q0 asm ("q0"); + q0 = x0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +void __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + register __int128_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +__int128_t __attribute__((arm_streaming_compatible)) +fpr_to_gpr () +{ + register __int128_t q0 asm ("q0"); + asm volatile ("" : 
"=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movti_4.c b/gcc/testsuite/gcc.target/aarch64/movti_4.c new file mode 100644 index 00000000000..01e5537e88f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movti_4.c @@ -0,0 +1,83 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+sve" + +/* +** fpr_to_fpr: +** mov z0\.d, z1\.d +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_fpr (void) +{ + register __int128_t q0 asm ("q0"); + register __int128_t q1 asm ("q1"); + asm volatile ("" : "=w" (q1)); + q0 = q1; + asm volatile ("" :: "w" (q0)); +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +gpr_to_fpr (__int128_t x0) +{ + register __int128_t q0 asm ("q0"); + q0 = x0; + asm volatile ("" :: "w" (q0)); +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +void __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + register __int128_t q0 asm ("q0"); + q0 = 0; + asm volatile ("" :: "w" (q0)); +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** fmov x0, d0 +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** fmov x0, d0 +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** fmov x1, d0 +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** fmov x1, d0 +** ) +** ret +*/ +__int128_t __attribute__((arm_streaming_compatible)) +fpr_to_gpr () +{ + register __int128_t q0 asm ("q0"); + asm volatile ("" : "=w" (q0)); + return q0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/movv16qi_4.c b/gcc/testsuite/gcc.target/aarch64/movv16qi_4.c new file mode 100644 index 00000000000..f0f8cb95750 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movv16qi_4.c @@ -0,0 +1,82 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +typedef unsigned char v16qi __attribute__((vector_size(16))); + +/* +** fpr_to_fpr: +** sub sp, sp, #16 +** str q1, \[sp\] +** ldr q0, \[sp\] +** add sp, sp, #?16 +** ret +*/ +v16qi __attribute__((arm_streaming_compatible)) +fpr_to_fpr (v16qi q0, v16qi q1) +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +v16qi __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register v16qi x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +v16qi __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return (v16qi) {}; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** umov x0, v0.d\[0\] +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** umov x0, v0.d\[0\] +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** umov x1, v0.d\[0\] +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** umov x1, v0.d\[0\] +** ) +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (v16qi q0) +{ + register v16qi x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movv16qi_5.c 
b/gcc/testsuite/gcc.target/aarch64/movv16qi_5.c new file mode 100644 index 00000000000..db59f01376e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movv16qi_5.c @@ -0,0 +1,79 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+sve" + +typedef unsigned char v16qi __attribute__((vector_size(16))); + +/* +** fpr_to_fpr: +** mov z0.d, z1.d +** ret +*/ +v16qi __attribute__((arm_streaming_compatible)) +fpr_to_fpr (v16qi q0, v16qi q1) +{ + return q1; +} + +/* +** gpr_to_fpr: { target aarch64_little_endian } +** fmov d0, x0 +** fmov v0.d\[1\], x1 +** ret +*/ +/* +** gpr_to_fpr: { target aarch64_big_endian } +** fmov d0, x1 +** fmov v0.d\[1\], x0 +** ret +*/ +v16qi __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register v16qi x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +v16qi __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return (v16qi) {}; +} + +/* +** fpr_to_gpr: { target aarch64_little_endian } +** ( +** umov x0, v0.d\[0\] +** fmov x1, v0.d\[1\] +** | +** fmov x1, v0.d\[1\] +** umov x0, v0.d\[0\] +** ) +** ret +*/ +/* +** fpr_to_gpr: { target aarch64_big_endian } +** ( +** umov x1, v0.d\[0\] +** fmov x0, v0.d\[1\] +** | +** fmov x0, v0.d\[1\] +** umov x1, v0.d\[0\] +** ) +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (v16qi q0) +{ + register v16qi x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/movv8qi_4.c b/gcc/testsuite/gcc.target/aarch64/movv8qi_4.c new file mode 100644 index 00000000000..49eb2d31910 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/movv8qi_4.c @@ -0,0 +1,55 @@ +/* { dg-do assemble } */ +/* { dg-options "-O -mtune=neoverse-v1 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#pragma GCC target "+nosve" + +typedef unsigned char v8qi __attribute__((vector_size(8))); + +/* +** fpr_to_fpr: +** fmov d0, d1 +** ret +*/ +v8qi __attribute__((arm_streaming_compatible)) +fpr_to_fpr (v8qi q0, v8qi q1) +{ + return q1; +} + +/* +** gpr_to_fpr: +** fmov d0, x0 +** ret +*/ +v8qi __attribute__((arm_streaming_compatible)) +gpr_to_fpr () +{ + register v8qi x0 asm ("x0"); + asm volatile ("" : "=r" (x0)); + return x0; +} + +/* +** zero_to_fpr: +** fmov d0, xzr +** ret +*/ +v8qi __attribute__((arm_streaming_compatible)) +zero_to_fpr () +{ + return (v8qi) {}; +} + +/* +** fpr_to_gpr: +** umov x0, v0\.d\[0\] +** ret +*/ +void __attribute__((arm_streaming_compatible)) +fpr_to_gpr (v8qi q0) +{ + register v8qi x0 asm ("x0"); + x0 = q0; + asm volatile ("" :: "r" (x0)); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c new file mode 100644 index 00000000000..4a526e7d125 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_1.c @@ -0,0 +1,13 @@ +// { dg-options "" } + +#include + +#pragma GCC target "+nosme" + +// { dg-error {inlining failed.*'vaddq_s32'} "" { target *-*-* } 0 } + +int32x4_t __attribute__((arm_streaming_compatible)) +foo (int32x4_t x, int32x4_t y) +{ + return vaddq_s32 (x, y); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c new file mode 100644 index 00000000000..e7183caa6f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_2.c @@ -0,0 +1,11 @@ +// { dg-options "" } + +#include + +// { 
dg-error {inlining failed.*'vaddq_s32'} "" { target *-*-* } 0 } + +int32x4_t __attribute__((arm_streaming_compatible)) +foo (int32x4_t x, int32x4_t y) +{ + return vaddq_s32 (x, y); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c new file mode 100644 index 00000000000..e11570e41d1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/arm_neon_3.c @@ -0,0 +1,11 @@ +// { dg-options "" } + +#include <arm_neon.h> + +// { dg-error {inlining failed.*'vaddq_s32'} "" { target *-*-* } 0 } + +int32x4_t __attribute__((arm_streaming)) +foo (int32x4_t x, int32x4_t y) +{ + return vaddq_s32 (x, y); +}

From patchwork Sun Nov 13 10:00:41 2022
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 60510
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 04/16] aarch64: Mark relevant SVE instructions as non-streaming
Date: Sun, 13 Nov 2022 10:00:41 +0000
From: Richard Sandiford

Following on from the previous Advanced SIMD patch, this one divides SVE instructions into non-streaming and streaming-compatible groups.
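As a rough sketch of what this split means for ACLE code (illustrative only, not part of the patch), a streaming-compatible function can keep using intrinsics whose instructions exist in streaming mode, while intrinsics that this patch marks as non-streaming only, such as svadda (FADDA), are expected to be diagnosed. The function names below are made up, and the code assumes SVE is enabled:

#include <arm_sve.h>

/* OK: vector FADD is available in streaming mode, so the intrinsic
   remains usable in streaming-compatible code.  */
svfloat32_t __attribute__((arm_streaming_compatible))
sc_add (svbool_t pg, svfloat32_t x, svfloat32_t y)
{
  return svadd_f32_x (pg, x, y);
}

/* Expected to be rejected: FADDA requires PSTATE.SM==0, so svadda is
   restricted to (known) non-streaming callers by this patch.  */
float32_t __attribute__((arm_streaming_compatible))
sc_reduce (svbool_t pg, float32_t init, svfloat32_t x)
{
  return svadda_f32 (pg, init, x);
}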
gcc/ * config/aarch64/aarch64.h (TARGET_NON_STREAMING): New macro. (TARGET_SVE2_AES, TARGET_SVE2_BITPERM): Use it. (TARGET_SVE2_SHA3, TARGET_SVE2_SM4): Likewise. * config/aarch64/aarch64-sve-builtins-base.def: Separate out the functions that require PSTATE.SM to be 0 and guard them with AARCH64_FL_SM_OFF. * config/aarch64/aarch64-sve-builtins-sve2.def: Likewise. * config/aarch64/aarch64-sve-builtins.cc (check_required_extensions): Enforce AARCH64_FL_SM_OFF requirements. * config/aarch64/aarch64-sve.md (aarch64_wrffr): Require TARGET_NON_STREAMING (aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest): Likewise. (*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc) (@aarch64_ldf1): Likewise. (@aarch64_ldf1_) (gather_load): Likewise (mask_gather_load): Likewise. (mask_gather_load): Likewise. (*mask_gather_load_xtw_unpacked): Likewise. (*mask_gather_load_sxtw): Likewise. (*mask_gather_load_uxtw): Likewise. (@aarch64_gather_load_) (@aarch64_gather_load_ ): Likewise. (*aarch64_gather_load_ _xtw_unpacked) (*aarch64_gather_load_ _sxtw): Likewise. (*aarch64_gather_load_ _uxtw): Likewise. (@aarch64_ldff1_gather, @aarch64_ldff1_gather): Likewise. (*aarch64_ldff1_gather_sxtw): Likewise. (*aarch64_ldff1_gather_uxtw): Likewise. (@aarch64_ldff1_gather_ ): Likewise. (@aarch64_ldff1_gather_ ): Likewise. (*aarch64_ldff1_gather_ _sxtw): Likewise. (*aarch64_ldff1_gather_ _uxtw): Likewise. (@aarch64_sve_gather_prefetch) (@aarch64_sve_gather_prefetch) (*aarch64_sve_gather_prefetch_sxtw) (*aarch64_sve_gather_prefetch_uxtw) (scatter_store): Likewise. (mask_scatter_store): Likewise. (*mask_scatter_store_xtw_unpacked) (*mask_scatter_store_sxtw): Likewise. (*mask_scatter_store_uxtw): Likewise. (@aarch64_scatter_store_trunc) (@aarch64_scatter_store_trunc) (*aarch64_scatter_store_trunc_sxtw) (*aarch64_scatter_store_trunc_uxtw) (@aarch64_sve_ld1ro, @aarch64_adr): Likewise. (*aarch64_adr_sxtw, *aarch64_adr_uxtw_unspec): Likewise. (*aarch64_adr_uxtw_and, @aarch64_adr_shift): Likewise. (*aarch64_adr_shift, *aarch64_adr_shift_sxtw): Likewise. (*aarch64_adr_shift_uxtw, @aarch64_sve_add_): Likewise. (@aarch64_sve_, fold_left_plus_): Likewise. (mask_fold_left_plus_, @aarch64_sve_compact): Likewise. * config/aarch64/aarch64-sve2.md (@aarch64_gather_ldnt) (@aarch64_gather_ldnt_ ): Likewise. (@aarch64_sve2_histcnt, @aarch64_sve2_histseg): Likewise. (@aarch64_pred_): Likewise. (*aarch64_pred__cc): Likewise. (*aarch64_pred__ptest): Likewise. * config/aarch64/iterators.md (SVE_FP_UNARY_INT): Make FEXPA depend on TARGET_NON_STREAMING. (SVE_BFLOAT_TERNARY_LONG): Likewise BFMMLA. gcc/testsuite/ * g++.target/aarch64/sve/aarch64-ssve.exp: New harness. * g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Add -DSTREAMING_COMPATIBLE to the list of options. * g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise. * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise. * gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp: Likewise. Fix pasto in variable name. * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Mark functions as streaming-compatible if STREAMING_COMPATIBLE is defined. * gcc.target/aarch64/sve/acle/asm/adda_f16.c: Disable for streaming-compatible code. * gcc.target/aarch64/sve/acle/asm/adda_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adda_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrb.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrd.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrh.c: Likewise. * gcc.target/aarch64/sve/acle/asm/adrw.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/compact_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/expa_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/expa_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/expa_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/ldff1_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/mmla_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfb_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfd_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfh_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/prfw_gather.c: Likewise. * gcc.target/aarch64/sve/acle/asm/rdffr_1.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tmad_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tmad_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tmad_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tsmul_f16.c: Likewise. 
* gcc.target/aarch64/sve/acle/asm/tsmul_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tsmul_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tssel_f16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tssel_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/tssel_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/usmmla_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aesd_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aese_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bdep_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bext_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histseg_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/histseg_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/match_u16.c: Likewise. 
* gcc.target/aarch64/sve2/acle/asm/match_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/rax1_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/rax1_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c: Likewise. --- .../aarch64/aarch64-sve-builtins-base.def | 150 +++++---- .../aarch64/aarch64-sve-builtins-sve2.def | 65 ++-- gcc/config/aarch64/aarch64-sve-builtins.cc | 7 + gcc/config/aarch64/aarch64-sve.md | 124 +++---- gcc/config/aarch64/aarch64-sve2.md | 11 +- gcc/config/aarch64/aarch64.h | 11 +- gcc/config/aarch64/iterators.md | 4 +- .../g++.target/aarch64/sve/aarch64-ssve.exp | 309 ++++++++++++++++++ .../aarch64/sve/acle/aarch64-sve-acle-asm.exp | 1 + .../sve2/acle/aarch64-sve2-acle-asm.exp | 1 + .../aarch64/sve/acle/aarch64-sve-acle-asm.exp | 1 + .../aarch64/sve/acle/asm/adda_f16.c | 1 + .../aarch64/sve/acle/asm/adda_f32.c | 1 + .../aarch64/sve/acle/asm/adda_f64.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrb.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrd.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrh.c | 1 + .../gcc.target/aarch64/sve/acle/asm/adrw.c | 1 + .../aarch64/sve/acle/asm/bfmmla_f32.c | 1 + .../aarch64/sve/acle/asm/compact_f32.c | 1 + .../aarch64/sve/acle/asm/compact_f64.c | 1 + .../aarch64/sve/acle/asm/compact_s32.c | 1 + .../aarch64/sve/acle/asm/compact_s64.c | 1 + .../aarch64/sve/acle/asm/compact_u32.c | 1 + .../aarch64/sve/acle/asm/compact_u64.c | 1 + .../aarch64/sve/acle/asm/expa_f16.c | 1 + .../aarch64/sve/acle/asm/expa_f32.c | 1 + .../aarch64/sve/acle/asm/expa_f64.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_f32.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_f64.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_bf16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_f16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_f32.c | 1 + .../aarch64/sve/acle/asm/ld1ro_f64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s16.c | 1 + 
.../aarch64/sve/acle/asm/ld1ro_s32.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_s8.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u16.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u32.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u64.c | 1 + .../aarch64/sve/acle/asm/ld1ro_u8.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1sb_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1sh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1sw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1sw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1ub_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ld1uh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ld1uw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ld1uw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1_bf16.c | 1 + .../aarch64/sve/acle/asm/ldff1_f16.c | 1 + .../aarch64/sve/acle/asm/ldff1_f32.c | 1 + .../aarch64/sve/acle/asm/ldff1_f64.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_f32.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_f64.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1_s16.c | 1 + .../aarch64/sve/acle/asm/ldff1_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1_s8.c | 1 + .../aarch64/sve/acle/asm/ldff1_u16.c | 1 + .../aarch64/sve/acle/asm/ldff1_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1_u8.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_s16.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_u16.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sb_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1sh_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1sw_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_s16.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_s64.c | 1 + 
.../aarch64/sve/acle/asm/ldff1ub_u16.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1ub_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_s32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_u32.c | 1 + .../aarch64/sve/acle/asm/ldff1uh_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_gather_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_gather_u64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_s64.c | 1 + .../aarch64/sve/acle/asm/ldff1uw_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_bf16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_f16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_f32.c | 1 + .../aarch64/sve/acle/asm/ldnf1_f64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_s8.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u16.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1_u8.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_s16.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_u16.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sb_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1sh_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sw_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1sw_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_s16.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_u16.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1ub_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_s32.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_u32.c | 1 + .../aarch64/sve/acle/asm/ldnf1uh_u64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uw_s64.c | 1 + .../aarch64/sve/acle/asm/ldnf1uw_u64.c | 1 + .../aarch64/sve/acle/asm/mmla_f32.c | 1 + .../aarch64/sve/acle/asm/mmla_f64.c | 1 + .../aarch64/sve/acle/asm/mmla_s32.c | 1 + .../aarch64/sve/acle/asm/mmla_u32.c | 1 + .../aarch64/sve/acle/asm/prfb_gather.c | 1 + .../aarch64/sve/acle/asm/prfd_gather.c | 1 + .../aarch64/sve/acle/asm/prfh_gather.c | 1 + .../aarch64/sve/acle/asm/prfw_gather.c | 1 + .../gcc.target/aarch64/sve/acle/asm/rdffr_1.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_f32.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_f64.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_s32.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_u32.c | 1 + .../aarch64/sve/acle/asm/st1_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_s32.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_u32.c | 1 + .../aarch64/sve/acle/asm/st1b_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_s32.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_u32.c | 1 + .../aarch64/sve/acle/asm/st1h_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/st1w_scatter_s64.c | 1 + .../aarch64/sve/acle/asm/st1w_scatter_u64.c | 1 + .../aarch64/sve/acle/asm/test_sve_acle.h | 11 +- 
.../aarch64/sve/acle/asm/tmad_f16.c | 1 + .../aarch64/sve/acle/asm/tmad_f32.c | 1 + .../aarch64/sve/acle/asm/tmad_f64.c | 1 + .../aarch64/sve/acle/asm/tsmul_f16.c | 1 + .../aarch64/sve/acle/asm/tsmul_f32.c | 1 + .../aarch64/sve/acle/asm/tsmul_f64.c | 1 + .../aarch64/sve/acle/asm/tssel_f16.c | 1 + .../aarch64/sve/acle/asm/tssel_f32.c | 1 + .../aarch64/sve/acle/asm/tssel_f64.c | 1 + .../aarch64/sve/acle/asm/usmmla_s32.c | 1 + .../sve2/acle/aarch64-sve2-acle-asm.exp | 3 +- .../aarch64/sve2/acle/asm/aesd_u8.c | 1 + .../aarch64/sve2/acle/asm/aese_u8.c | 1 + .../aarch64/sve2/acle/asm/aesimc_u8.c | 1 + .../aarch64/sve2/acle/asm/aesmc_u8.c | 1 + .../aarch64/sve2/acle/asm/bdep_u16.c | 1 + .../aarch64/sve2/acle/asm/bdep_u32.c | 1 + .../aarch64/sve2/acle/asm/bdep_u64.c | 1 + .../aarch64/sve2/acle/asm/bdep_u8.c | 1 + .../aarch64/sve2/acle/asm/bext_u16.c | 1 + .../aarch64/sve2/acle/asm/bext_u32.c | 1 + .../aarch64/sve2/acle/asm/bext_u64.c | 1 + .../aarch64/sve2/acle/asm/bext_u8.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u16.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u32.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u64.c | 1 + .../aarch64/sve2/acle/asm/bgrp_u8.c | 1 + .../aarch64/sve2/acle/asm/histcnt_s32.c | 1 + .../aarch64/sve2/acle/asm/histcnt_s64.c | 1 + .../aarch64/sve2/acle/asm/histcnt_u32.c | 1 + .../aarch64/sve2/acle/asm/histcnt_u64.c | 1 + .../aarch64/sve2/acle/asm/histseg_s8.c | 1 + .../aarch64/sve2/acle/asm/histseg_u8.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_f32.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_f64.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_s32.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_s64.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_u32.c | 1 + .../aarch64/sve2/acle/asm/ldnt1_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1sb_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1sh_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1sw_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1sw_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1ub_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_s32.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_u32.c | 1 + .../sve2/acle/asm/ldnt1uh_gather_u64.c | 1 + .../sve2/acle/asm/ldnt1uw_gather_s64.c | 1 + .../sve2/acle/asm/ldnt1uw_gather_u64.c | 1 + .../aarch64/sve2/acle/asm/match_s16.c | 1 + .../aarch64/sve2/acle/asm/match_s8.c | 1 + .../aarch64/sve2/acle/asm/match_u16.c | 1 + .../aarch64/sve2/acle/asm/match_u8.c | 1 + .../aarch64/sve2/acle/asm/nmatch_s16.c | 1 + .../aarch64/sve2/acle/asm/nmatch_s8.c | 1 + .../aarch64/sve2/acle/asm/nmatch_u16.c | 1 + .../aarch64/sve2/acle/asm/nmatch_u8.c | 1 + .../aarch64/sve2/acle/asm/pmullb_pair_u64.c | 1 + .../aarch64/sve2/acle/asm/pmullt_pair_u64.c | 1 + .../aarch64/sve2/acle/asm/rax1_s64.c | 1 + .../aarch64/sve2/acle/asm/rax1_u64.c | 1 + .../aarch64/sve2/acle/asm/sm4e_u32.c | 1 + .../aarch64/sve2/acle/asm/sm4ekey_u32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_f32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_f64.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_s32.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_s64.c | 1 + .../aarch64/sve2/acle/asm/stnt1_scatter_u32.c | 1 + 
.../aarch64/sve2/acle/asm/stnt1_scatter_u64.c | 1 + .../sve2/acle/asm/stnt1b_scatter_s32.c | 1 + .../sve2/acle/asm/stnt1b_scatter_s64.c | 1 + .../sve2/acle/asm/stnt1b_scatter_u32.c | 1 + .../sve2/acle/asm/stnt1b_scatter_u64.c | 1 + .../sve2/acle/asm/stnt1h_scatter_s32.c | 1 + .../sve2/acle/asm/stnt1h_scatter_s64.c | 1 + .../sve2/acle/asm/stnt1h_scatter_u32.c | 1 + .../sve2/acle/asm/stnt1h_scatter_u64.c | 1 + .../sve2/acle/asm/stnt1w_scatter_s64.c | 1 + .../sve2/acle/asm/stnt1w_scatter_u64.c | 1 + 279 files changed, 799 insertions(+), 165 deletions(-) create mode 100644 gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def index ffdf7cb4c32..a2d0cea6c5b 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def @@ -25,12 +25,7 @@ DEF_SVE_FUNCTION (svacgt, compare_opt_n, all_float, implicit) DEF_SVE_FUNCTION (svacle, compare_opt_n, all_float, implicit) DEF_SVE_FUNCTION (svaclt, compare_opt_n, all_float, implicit) DEF_SVE_FUNCTION (svadd, binary_opt_n, all_arith, mxz) -DEF_SVE_FUNCTION (svadda, fold_left, all_float, implicit) DEF_SVE_FUNCTION (svaddv, reduction_wide, all_arith, implicit) -DEF_SVE_FUNCTION (svadrb, adr_offset, none, none) -DEF_SVE_FUNCTION (svadrd, adr_index, none, none) -DEF_SVE_FUNCTION (svadrh, adr_index, none, none) -DEF_SVE_FUNCTION (svadrw, adr_index, none, none) DEF_SVE_FUNCTION (svand, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svand, binary_opt_n, b, z) DEF_SVE_FUNCTION (svandv, reduction, all_integer, implicit) @@ -75,7 +70,6 @@ DEF_SVE_FUNCTION (svcnth_pat, count_pat, none, none) DEF_SVE_FUNCTION (svcntp, count_pred, all_pred, implicit) DEF_SVE_FUNCTION (svcntw, count_inherent, none, none) DEF_SVE_FUNCTION (svcntw_pat, count_pat, none, none) -DEF_SVE_FUNCTION (svcompact, unary, sd_data, implicit) DEF_SVE_FUNCTION (svcreate2, create, all_data, none) DEF_SVE_FUNCTION (svcreate3, create, all_data, none) DEF_SVE_FUNCTION (svcreate4, create, all_data, none) @@ -93,7 +87,6 @@ DEF_SVE_FUNCTION (svdupq_lane, binary_uint64_n, all_data, none) DEF_SVE_FUNCTION (sveor, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (sveor, binary_opt_n, b, z) DEF_SVE_FUNCTION (sveorv, reduction, all_integer, implicit) -DEF_SVE_FUNCTION (svexpa, unary_uint, all_float, none) DEF_SVE_FUNCTION (svext, ext, all_data, none) DEF_SVE_FUNCTION (svextb, unary, hsd_integer, mxz) DEF_SVE_FUNCTION (svexth, unary, sd_integer, mxz) @@ -106,51 +99,13 @@ DEF_SVE_FUNCTION (svinsr, binary_n, all_data, none) DEF_SVE_FUNCTION (svlasta, reduction, all_data, implicit) DEF_SVE_FUNCTION (svlastb, reduction, all_data, implicit) DEF_SVE_FUNCTION (svld1, load, all_data, implicit) -DEF_SVE_FUNCTION (svld1_gather, load_gather_sv, sd_data, implicit) -DEF_SVE_FUNCTION (svld1_gather, load_gather_vs, sd_data, implicit) DEF_SVE_FUNCTION (svld1rq, load_replicate, all_data, implicit) DEF_SVE_FUNCTION (svld1sb, load_ext, hsd_integer, implicit) -DEF_SVE_FUNCTION (svld1sb_gather, load_ext_gather_offset, sd_integer, implicit) DEF_SVE_FUNCTION (svld1sh, load_ext, sd_integer, implicit) -DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_offset, sd_integer, implicit) -DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_index, sd_integer, implicit) DEF_SVE_FUNCTION (svld1sw, load_ext, d_integer, implicit) -DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_offset, d_integer, implicit) -DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_index, d_integer, implicit) 
DEF_SVE_FUNCTION (svld1ub, load_ext, hsd_integer, implicit) -DEF_SVE_FUNCTION (svld1ub_gather, load_ext_gather_offset, sd_integer, implicit) DEF_SVE_FUNCTION (svld1uh, load_ext, sd_integer, implicit) -DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_offset, sd_integer, implicit) -DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_index, sd_integer, implicit) DEF_SVE_FUNCTION (svld1uw, load_ext, d_integer, implicit) -DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_offset, d_integer, implicit) -DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_index, d_integer, implicit) -DEF_SVE_FUNCTION (svldff1, load, all_data, implicit) -DEF_SVE_FUNCTION (svldff1_gather, load_gather_sv, sd_data, implicit) -DEF_SVE_FUNCTION (svldff1_gather, load_gather_vs, sd_data, implicit) -DEF_SVE_FUNCTION (svldff1sb, load_ext, hsd_integer, implicit) -DEF_SVE_FUNCTION (svldff1sb_gather, load_ext_gather_offset, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1sh, load_ext, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_offset, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_index, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1sw, load_ext, d_integer, implicit) -DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_offset, d_integer, implicit) -DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_index, d_integer, implicit) -DEF_SVE_FUNCTION (svldff1ub, load_ext, hsd_integer, implicit) -DEF_SVE_FUNCTION (svldff1ub_gather, load_ext_gather_offset, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1uh, load_ext, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_offset, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_index, sd_integer, implicit) -DEF_SVE_FUNCTION (svldff1uw, load_ext, d_integer, implicit) -DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_offset, d_integer, implicit) -DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_index, d_integer, implicit) -DEF_SVE_FUNCTION (svldnf1, load, all_data, implicit) -DEF_SVE_FUNCTION (svldnf1sb, load_ext, hsd_integer, implicit) -DEF_SVE_FUNCTION (svldnf1sh, load_ext, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnf1sw, load_ext, d_integer, implicit) -DEF_SVE_FUNCTION (svldnf1ub, load_ext, hsd_integer, implicit) -DEF_SVE_FUNCTION (svldnf1uh, load_ext, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnf1uw, load_ext, d_integer, implicit) DEF_SVE_FUNCTION (svldnt1, load, all_data, implicit) DEF_SVE_FUNCTION (svld2, load, all_data, implicit) DEF_SVE_FUNCTION (svld3, load, all_data, implicit) @@ -173,7 +128,6 @@ DEF_SVE_FUNCTION (svmla, ternary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svmla_lane, ternary_lane, all_float, none) DEF_SVE_FUNCTION (svmls, ternary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svmls_lane, ternary_lane, all_float, none) -DEF_SVE_FUNCTION (svmmla, mmla, none, none) DEF_SVE_FUNCTION (svmov, unary, b, z) DEF_SVE_FUNCTION (svmsb, ternary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svmul, binary_opt_n, all_arith, mxz) @@ -197,13 +151,9 @@ DEF_SVE_FUNCTION (svpfalse, inherent_b, b, none) DEF_SVE_FUNCTION (svpfirst, unary, b, implicit) DEF_SVE_FUNCTION (svpnext, unary_pred, all_pred, implicit) DEF_SVE_FUNCTION (svprfb, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfb_gather, prefetch_gather_offset, none, implicit) DEF_SVE_FUNCTION (svprfd, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfd_gather, prefetch_gather_index, none, implicit) DEF_SVE_FUNCTION (svprfh, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfh_gather, prefetch_gather_index, none, implicit) 
DEF_SVE_FUNCTION (svprfw, prefetch, none, implicit) -DEF_SVE_FUNCTION (svprfw_gather, prefetch_gather_index, none, implicit) DEF_SVE_FUNCTION (svptest_any, ptest, none, implicit) DEF_SVE_FUNCTION (svptest_first, ptest, none, implicit) DEF_SVE_FUNCTION (svptest_last, ptest, none, implicit) @@ -244,7 +194,6 @@ DEF_SVE_FUNCTION (svqincw_pat, inc_dec_pat, s_integer, none) DEF_SVE_FUNCTION (svqincw_pat, inc_dec_pat, sd_integer, none) DEF_SVE_FUNCTION (svqsub, binary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svrbit, unary, all_integer, mxz) -DEF_SVE_FUNCTION (svrdffr, rdffr, none, z_or_none) DEF_SVE_FUNCTION (svrecpe, unary, all_float, none) DEF_SVE_FUNCTION (svrecps, binary, all_float, none) DEF_SVE_FUNCTION (svrecpx, unary, all_float, mxz) @@ -269,20 +218,12 @@ DEF_SVE_FUNCTION (svsel, binary, b, implicit) DEF_SVE_FUNCTION (svset2, set, all_data, none) DEF_SVE_FUNCTION (svset3, set, all_data, none) DEF_SVE_FUNCTION (svset4, set, all_data, none) -DEF_SVE_FUNCTION (svsetffr, setffr, none, none) DEF_SVE_FUNCTION (svsplice, binary, all_data, implicit) DEF_SVE_FUNCTION (svsqrt, unary, all_float, mxz) DEF_SVE_FUNCTION (svst1, store, all_data, implicit) -DEF_SVE_FUNCTION (svst1_scatter, store_scatter_index, sd_data, implicit) -DEF_SVE_FUNCTION (svst1_scatter, store_scatter_offset, sd_data, implicit) DEF_SVE_FUNCTION (svst1b, store, hsd_integer, implicit) -DEF_SVE_FUNCTION (svst1b_scatter, store_scatter_offset, sd_integer, implicit) DEF_SVE_FUNCTION (svst1h, store, sd_integer, implicit) -DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_index, sd_integer, implicit) -DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_offset, sd_integer, implicit) DEF_SVE_FUNCTION (svst1w, store, d_integer, implicit) -DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_index, d_integer, implicit) -DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_offset, d_integer, implicit) DEF_SVE_FUNCTION (svst2, store, all_data, implicit) DEF_SVE_FUNCTION (svst3, store, all_data, implicit) DEF_SVE_FUNCTION (svst4, store, all_data, implicit) @@ -290,13 +231,10 @@ DEF_SVE_FUNCTION (svstnt1, store, all_data, implicit) DEF_SVE_FUNCTION (svsub, binary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svsubr, binary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svtbl, binary_uint, all_data, none) -DEF_SVE_FUNCTION (svtmad, tmad, all_float, none) DEF_SVE_FUNCTION (svtrn1, binary, all_data, none) DEF_SVE_FUNCTION (svtrn1, binary_pred, all_pred, none) DEF_SVE_FUNCTION (svtrn2, binary, all_data, none) DEF_SVE_FUNCTION (svtrn2, binary_pred, all_pred, none) -DEF_SVE_FUNCTION (svtsmul, binary_uint, all_float, none) -DEF_SVE_FUNCTION (svtssel, binary_uint, all_float, none) DEF_SVE_FUNCTION (svundef, inherent, all_data, none) DEF_SVE_FUNCTION (svundef2, inherent, all_data, none) DEF_SVE_FUNCTION (svundef3, inherent, all_data, none) @@ -311,13 +249,78 @@ DEF_SVE_FUNCTION (svuzp2, binary, all_data, none) DEF_SVE_FUNCTION (svuzp2, binary_pred, all_pred, none) DEF_SVE_FUNCTION (svwhilele, compare_scalar, while, none) DEF_SVE_FUNCTION (svwhilelt, compare_scalar, while, none) -DEF_SVE_FUNCTION (svwrffr, setffr, none, implicit) DEF_SVE_FUNCTION (svzip1, binary, all_data, none) DEF_SVE_FUNCTION (svzip1, binary_pred, all_pred, none) DEF_SVE_FUNCTION (svzip2, binary, all_data, none) DEF_SVE_FUNCTION (svzip2, binary_pred, all_pred, none) #undef REQUIRED_EXTENSIONS +#define REQUIRED_EXTENSIONS AARCH64_FL_SM_OFF +DEF_SVE_FUNCTION (svadda, fold_left, all_float, implicit) +DEF_SVE_FUNCTION (svadrb, adr_offset, none, none) +DEF_SVE_FUNCTION (svadrd, adr_index, none, none) 
+DEF_SVE_FUNCTION (svadrh, adr_index, none, none) +DEF_SVE_FUNCTION (svadrw, adr_index, none, none) +DEF_SVE_FUNCTION (svcompact, unary, sd_data, implicit) +DEF_SVE_FUNCTION (svexpa, unary_uint, all_float, none) +DEF_SVE_FUNCTION (svld1_gather, load_gather_sv, sd_data, implicit) +DEF_SVE_FUNCTION (svld1_gather, load_gather_vs, sd_data, implicit) +DEF_SVE_FUNCTION (svld1sb_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1sh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svld1sw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svld1ub_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1uh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svld1uw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1, load, all_data, implicit) +DEF_SVE_FUNCTION (svldff1_gather, load_gather_sv, sd_data, implicit) +DEF_SVE_FUNCTION (svldff1_gather, load_gather_vs, sd_data, implicit) +DEF_SVE_FUNCTION (svldff1sb, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sb_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1sw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1sw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1ub, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldff1ub_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uh_gather, load_ext_gather_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svldff1uw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svldff1uw_gather, load_ext_gather_index, d_integer, implicit) +DEF_SVE_FUNCTION (svldnf1, load, all_data, implicit) +DEF_SVE_FUNCTION (svldnf1sb, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1sh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1sw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svldnf1ub, load_ext, hsd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1uh, load_ext, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnf1uw, load_ext, d_integer, implicit) +DEF_SVE_FUNCTION (svmmla, mmla, none, none) +DEF_SVE_FUNCTION (svprfb_gather, prefetch_gather_offset, none, implicit) +DEF_SVE_FUNCTION (svprfd_gather, prefetch_gather_index, none, implicit) +DEF_SVE_FUNCTION (svprfh_gather, prefetch_gather_index, none, implicit) +DEF_SVE_FUNCTION (svprfw_gather, prefetch_gather_index, none, implicit) +DEF_SVE_FUNCTION (svrdffr, rdffr, none, z_or_none) +DEF_SVE_FUNCTION (svsetffr, setffr, none, none) +DEF_SVE_FUNCTION (svst1_scatter, store_scatter_index, sd_data, implicit) +DEF_SVE_FUNCTION (svst1_scatter, store_scatter_offset, sd_data, implicit) 
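The DEF_SVE_FUNCTION entries gathered under this REQUIRED_EXTENSIONS value (the ADR, COMPACT, FEXPA, FADDA, gather/scatter, first-faulting/non-faulting load, FFR and matrix-multiply builtins) now list AARCH64_FL_SM_OFF among their requirements, so they are only considered available where the compiler knows PSTATE.SM==0.  A minimal sketch of the user-visible effect, assuming SVE and SME are enabled via -march or a target pragma; the diagnostic wording comes from the aarch64-sve-builtins.cc hunk later in this patch:

  #include <arm_sve.h>

  /* OK: a normal non-streaming function can still use svcompact.  */
  svint32_t use_compact (svbool_t pg, svint32_t x)
  {
    return svcompact (pg, x);
  }

  /* Expected to be rejected: inside an arm_streaming function,
     AARCH64_FL_SM_OFF is not part of the active ISA flags, so the new
     check reports that the ACLE function cannot be called when SME
     streaming mode is enabled.  */
  svint32_t __attribute__((arm_streaming)) use_compact_sm (svbool_t pg, svint32_t x)
  {
    return svcompact (pg, x);
  }
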
+DEF_SVE_FUNCTION (svst1b_scatter, store_scatter_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_index, sd_integer, implicit) +DEF_SVE_FUNCTION (svst1h_scatter, store_scatter_offset, sd_integer, implicit) +DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_index, d_integer, implicit) +DEF_SVE_FUNCTION (svst1w_scatter, store_scatter_offset, d_integer, implicit) +DEF_SVE_FUNCTION (svtmad, tmad, all_float, none) +DEF_SVE_FUNCTION (svtsmul, binary_uint, all_float, none) +DEF_SVE_FUNCTION (svtssel, binary_uint, all_float, none) +DEF_SVE_FUNCTION (svwrffr, setffr, none, implicit) +#undef REQUIRED_EXTENSIONS + #define REQUIRED_EXTENSIONS AARCH64_FL_BF16 DEF_SVE_FUNCTION (svbfdot, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfdot_lane, ternary_bfloat_lanex2, s_float, none) @@ -325,27 +328,31 @@ DEF_SVE_FUNCTION (svbfmlalb, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfmlalb_lane, ternary_bfloat_lane, s_float, none) DEF_SVE_FUNCTION (svbfmlalt, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfmlalt_lane, ternary_bfloat_lane, s_float, none) -DEF_SVE_FUNCTION (svbfmmla, ternary_bfloat, s_float, none) DEF_SVE_FUNCTION (svcvt, unary_convert, cvt_bfloat, mxz) DEF_SVE_FUNCTION (svcvtnt, unary_convert_narrowt, cvt_bfloat, mx) #undef REQUIRED_EXTENSIONS +#define REQUIRED_EXTENSIONS AARCH64_FL_BF16 | AARCH64_FL_SM_OFF +DEF_SVE_FUNCTION (svbfmmla, ternary_bfloat, s_float, none) +#undef REQUIRED_EXTENSIONS + #define REQUIRED_EXTENSIONS AARCH64_FL_I8MM -DEF_SVE_FUNCTION (svmmla, mmla, s_integer, none) -DEF_SVE_FUNCTION (svusmmla, ternary_uintq_intq, s_signed, none) DEF_SVE_FUNCTION (svsudot, ternary_intq_uintq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svsudot_lane, ternary_intq_uintq_lane, s_signed, none) DEF_SVE_FUNCTION (svusdot, ternary_uintq_intq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svusdot_lane, ternary_uintq_intq_lane, s_signed, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_F32MM +#define REQUIRED_EXTENSIONS AARCH64_FL_I8MM | AARCH64_FL_SM_OFF +DEF_SVE_FUNCTION (svmmla, mmla, s_integer, none) +DEF_SVE_FUNCTION (svusmmla, ternary_uintq_intq, s_signed, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_F32MM | AARCH64_FL_SM_OFF DEF_SVE_FUNCTION (svmmla, mmla, s_float, none) #undef REQUIRED_EXTENSIONS #define REQUIRED_EXTENSIONS AARCH64_FL_F64MM -DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) -DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) DEF_SVE_FUNCTION (svtrn1q, binary, all_data, none) DEF_SVE_FUNCTION (svtrn2q, binary, all_data, none) DEF_SVE_FUNCTION (svuzp1q, binary, all_data, none) @@ -353,3 +360,8 @@ DEF_SVE_FUNCTION (svuzp2q, binary, all_data, none) DEF_SVE_FUNCTION (svzip1q, binary, all_data, none) DEF_SVE_FUNCTION (svzip2q, binary, all_data, none) #undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_F64MM | AARCH64_FL_SM_OFF +DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) +DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index 635089ffc58..4e0466b4cf8 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -51,24 +51,9 @@ DEF_SVE_FUNCTION (sveor3, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (sveorbt, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (sveortb, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svhadd, 
binary_opt_n, all_integer, mxz) -DEF_SVE_FUNCTION (svhistcnt, binary_to_uint, sd_integer, z) -DEF_SVE_FUNCTION (svhistseg, binary_to_uint, b_integer, none) DEF_SVE_FUNCTION (svhsub, binary_opt_n, all_integer, mxz) DEF_SVE_FUNCTION (svhsubr, binary_opt_n, all_integer, mxz) -DEF_SVE_FUNCTION (svldnt1_gather, load_gather_sv_restricted, sd_data, implicit) -DEF_SVE_FUNCTION (svldnt1_gather, load_gather_vs, sd_data, implicit) -DEF_SVE_FUNCTION (svldnt1sb_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_offset_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_index_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svldnt1ub_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_offset_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_index_restricted, d_integer, implicit) DEF_SVE_FUNCTION (svlogb, unary_to_int, all_float, mxz) -DEF_SVE_FUNCTION (svmatch, compare, bh_integer, implicit) DEF_SVE_FUNCTION (svmaxp, binary, all_arith, mx) DEF_SVE_FUNCTION (svmaxnmp, binary, all_float, mx) DEF_SVE_FUNCTION (svmla_lane, ternary_lane, hsd_integer, none) @@ -91,7 +76,6 @@ DEF_SVE_FUNCTION (svmullb_lane, binary_long_lane, sd_integer, none) DEF_SVE_FUNCTION (svmullt, binary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svmullt_lane, binary_long_lane, sd_integer, none) DEF_SVE_FUNCTION (svnbsl, ternary_opt_n, all_integer, none) -DEF_SVE_FUNCTION (svnmatch, compare, bh_integer, implicit) DEF_SVE_FUNCTION (svpmul, binary_opt_n, b_unsigned, none) DEF_SVE_FUNCTION (svpmullb, binary_long_opt_n, hd_unsigned, none) DEF_SVE_FUNCTION (svpmullb_pair, binary_opt_n, bs_unsigned, none) @@ -164,13 +148,6 @@ DEF_SVE_FUNCTION (svsli, ternary_shift_left_imm, all_integer, none) DEF_SVE_FUNCTION (svsqadd, binary_int_opt_n, all_unsigned, mxz) DEF_SVE_FUNCTION (svsra, ternary_shift_right_imm, all_integer, none) DEF_SVE_FUNCTION (svsri, ternary_shift_right_imm, all_integer, none) -DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_index_restricted, sd_data, implicit) -DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_offset_restricted, sd_data, implicit) -DEF_SVE_FUNCTION (svstnt1b_scatter, store_scatter_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_index_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_offset_restricted, sd_integer, implicit) -DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_index_restricted, d_integer, implicit) -DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_offset_restricted, d_integer, implicit) DEF_SVE_FUNCTION (svsubhnb, binary_narrowb_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svsubhnt, binary_narrowt_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svsublb, binary_long_opt_n, hsd_integer, none) @@ -189,7 +166,35 @@ DEF_SVE_FUNCTION (svwhilewr, compare_ptr, all_data, none) DEF_SVE_FUNCTION (svxar, ternary_shift_right_imm, all_integer, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES) +#define 
REQUIRED_EXTENSIONS AARCH64_FL_SVE2 | AARCH64_FL_SM_OFF +DEF_SVE_FUNCTION (svhistcnt, binary_to_uint, sd_integer, z) +DEF_SVE_FUNCTION (svhistseg, binary_to_uint, b_integer, none) +DEF_SVE_FUNCTION (svldnt1_gather, load_gather_sv_restricted, sd_data, implicit) +DEF_SVE_FUNCTION (svldnt1_gather, load_gather_vs, sd_data, implicit) +DEF_SVE_FUNCTION (svldnt1sb_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sh_gather, load_ext_gather_index_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_offset_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1sw_gather, load_ext_gather_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1ub_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uh_gather, load_ext_gather_index_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_offset_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svldnt1uw_gather, load_ext_gather_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svmatch, compare, bh_integer, implicit) +DEF_SVE_FUNCTION (svnmatch, compare, bh_integer, implicit) +DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_index_restricted, sd_data, implicit) +DEF_SVE_FUNCTION (svstnt1_scatter, store_scatter_offset_restricted, sd_data, implicit) +DEF_SVE_FUNCTION (svstnt1b_scatter, store_scatter_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_index_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svstnt1h_scatter, store_scatter_offset_restricted, sd_integer, implicit) +DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_index_restricted, d_integer, implicit) +DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_offset_restricted, d_integer, implicit) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_AES \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svaesd, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaese, binary, b_unsigned, none) DEF_SVE_FUNCTION (svaesmc, unary, b_unsigned, none) @@ -198,17 +203,23 @@ DEF_SVE_FUNCTION (svpmullb_pair, binary_opt_n, d_unsigned, none) DEF_SVE_FUNCTION (svpmullt_pair, binary_opt_n, d_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_BITPERM) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_BITPERM \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svbdep, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbext, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbgrp, binary_opt_n, all_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_SHA3) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_SHA3 \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svrax1, binary, d_integer, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 | AARCH64_FL_SVE2_SM4) +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ + | AARCH64_FL_SVE2_SM4 \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svsm4e, binary, s_unsigned, none) DEF_SVE_FUNCTION (svsm4ekey, binary, s_unsigned, none) #undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index e168c83344a..a6de1068da9 100644 --- 
a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -700,6 +700,13 @@ check_required_extensions (location_t location, tree fndecl, if (missing_extensions == 0) return check_required_registers (location, fndecl); + if (missing_extensions & AARCH64_FL_SM_OFF) + { + error_at (location, "ACLE function %qD cannot be called when" + " SME streaming mode is enabled", fndecl); + return false; + } + static const struct { aarch64_feature_flags flag; const char *name; diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index b8cc47ef5fc..e98fbcbeb0e 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -1082,7 +1082,7 @@ (define_insn "aarch64_wrffr" (match_operand:VNx16BI 0 "aarch64_simd_reg_or_minus_one" "Dm, Upa")) (set (reg:VNx16BI FFRT_REGNUM) (unspec:VNx16BI [(match_dup 0)] UNSPEC_WRFFR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ setffr wrffr\t%0.b" @@ -1123,7 +1123,7 @@ (define_insn "aarch64_copy_ffr_to_ffrt" (define_insn "aarch64_rdffr" [(set (match_operand:VNx16BI 0 "register_operand" "=Upa") (reg:VNx16BI FFRT_REGNUM))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffr\t%0.b" ) @@ -1133,7 +1133,7 @@ (define_insn "aarch64_rdffr_z" (and:VNx16BI (reg:VNx16BI FFRT_REGNUM) (match_operand:VNx16BI 1 "register_operand" "Upa")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffr\t%0.b, %1/z" ) @@ -1149,7 +1149,7 @@ (define_insn "*aarch64_rdffr_z_ptest" (match_dup 1))] UNSPEC_PTEST)) (clobber (match_scratch:VNx16BI 0 "=Upa"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1163,7 +1163,7 @@ (define_insn "*aarch64_rdffr_ptest" (reg:VNx16BI FFRT_REGNUM)] UNSPEC_PTEST)) (clobber (match_scratch:VNx16BI 0 "=Upa"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1182,7 +1182,7 @@ (define_insn "*aarch64_rdffr_z_cc" (and:VNx16BI (reg:VNx16BI FFRT_REGNUM) (match_dup 1)))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1197,7 +1197,7 @@ (define_insn "*aarch64_rdffr_cc" UNSPEC_PTEST)) (set (match_operand:VNx16BI 0 "register_operand" "=Upa") (reg:VNx16BI FFRT_REGNUM))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "rdffrs\t%0.b, %1/z" ) @@ -1327,7 +1327,7 @@ (define_insn "@aarch64_ldf1" (match_operand:SVE_FULL 1 "aarch64_sve_ldf1_operand" "Ut") (reg:VNx16BI FFRT_REGNUM)] SVE_LDFF1_LDNF1))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "ldf1\t%0., %2/z, %1" ) @@ -1361,7 +1361,9 @@ (define_insn_and_rewrite "@aarch64_ldf1_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "ldf1\t%0., %2/z, %1" "&& !CONSTANT_P (operands[3])" { @@ -1409,7 +1411,7 @@ (define_expand "gather_load" (match_operand:DI 4 "aarch64_gather_scale_operand_") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[5] = aarch64_ptrue_reg (mode); } @@ -1427,7 +1429,7 @@ (define_insn "mask_gather_load" (match_operand:DI 4 "aarch64_gather_scale_operand_" "Ui1, Ui1, Ui1, Ui1, i, i") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ld1\t%0.s, %5/z, [%2.s] ld1\t%0.s, %5/z, [%2.s, #%1] @@ -1449,7 +1451,7 @@ (define_insn "mask_gather_load" (match_operand:DI 4 "aarch64_gather_scale_operand_" "Ui1, Ui1, Ui1, i") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ld1\t%0.d, %5/z, [%2.d] ld1\t%0.d, %5/z, [%2.d, #%1] @@ -1472,7 +1474,7 
@@ (define_insn_and_rewrite "*mask_gather_load_xtw_unpac (match_operand:DI 4 "aarch64_gather_scale_operand_" "Ui1, i") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ld1\t%0.d, %5/z, [%1, %2.d, xtw] ld1\t%0.d, %5/z, [%1, %2.d, xtw %p4]" @@ -1499,7 +1501,7 @@ (define_insn_and_rewrite "*mask_gather_load_sxtw" (match_operand:DI 4 "aarch64_gather_scale_operand_" "Ui1, i") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ld1\t%0.d, %5/z, [%1, %2.d, sxtw] ld1\t%0.d, %5/z, [%1, %2.d, sxtw %p4]" @@ -1523,7 +1525,7 @@ (define_insn "*mask_gather_load_uxtw" (match_operand:DI 4 "aarch64_gather_scale_operand_" "Ui1, i") (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ld1\t%0.d, %5/z, [%1, %2.d, uxtw] ld1\t%0.d, %5/z, [%1, %2.d, uxtw %p4]" @@ -1557,7 +1559,9 @@ (define_insn_and_rewrite "@aarch64_gather_load_ (mem:BLK (scratch))] UNSPEC_LD1_GATHER))] UNSPEC_PRED_X))] - "TARGET_SVE && (~ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "@ ld1\t%0.s, %5/z, [%2.s] ld1\t%0.s, %5/z, [%2.s, #%1] @@ -1587,7 +1591,9 @@ (define_insn_and_rewrite "@aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "@ ld1\t%0.d, %5/z, [%2.d] ld1\t%0.d, %5/z, [%2.d, #%1] @@ -1618,7 +1624,9 @@ (define_insn_and_rewrite "*aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "@ ld1\t%0.d, %5/z, [%1, %2.d, xtw] ld1\t%0.d, %5/z, [%1, %2.d, xtw %p4]" @@ -1650,7 +1658,9 @@ (define_insn_and_rewrite "*aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "@ ld1\t%0.d, %5/z, [%1, %2.d, sxtw] ld1\t%0.d, %5/z, [%1, %2.d, sxtw %p4]" @@ -1679,7 +1689,9 @@ (define_insn_and_rewrite "*aarch64_gather_load_ & ) == 0" + "TARGET_SVE + && TARGET_NON_STREAMING + && (~ & ) == 0" "@ ld1\t%0.d, %5/z, [%1, %2.d, uxtw] ld1\t%0.d, %5/z, [%1, %2.d, uxtw %p4]" @@ -1710,7 +1722,7 @@ (define_insn "@aarch64_ldff1_gather" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ldff1w\t%0.s, %5/z, [%2.s] ldff1w\t%0.s, %5/z, [%2.s, #%1] @@ -1733,7 +1745,7 @@ (define_insn "@aarch64_ldff1_gather" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ldff1d\t%0.d, %5/z, [%2.d] ldff1d\t%0.d, %5/z, [%2.d, #%1] @@ -1758,7 +1770,7 @@ (define_insn_and_rewrite "*aarch64_ldff1_gather_sxtw" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ldff1d\t%0.d, %5/z, [%1, %2.d, sxtw] ldff1d\t%0.d, %5/z, [%1, %2.d, sxtw %p4]" @@ -1782,7 +1794,7 @@ (define_insn "*aarch64_ldff1_gather_uxtw" (mem:BLK (scratch)) (reg:VNx16BI FFRT_REGNUM)] UNSPEC_LDFF1_GATHER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ ldff1d\t%0.d, %5/z, [%1, %2.d, uxtw] ldff1d\t%0.d, %5/z, [%1, %2.d, uxtw %p4]" @@ -1817,7 +1829,7 @@ (define_insn_and_rewrite "@aarch64_ldff1_gather_\t%0.s, %5/z, [%2.s] ldff1\t%0.s, %5/z, [%2.s, #%1] @@ -1848,7 +1860,7 @@ (define_insn_and_rewrite "@aarch64_ldff1_gather_\t%0.d, %5/z, [%2.d] ldff1\t%0.d, %5/z, [%2.d, #%1] @@ -1881,7 +1893,7 @@ (define_insn_and_rewrite "*aarch64_ldff1_gather_\t%0.d, %5/z, [%1, %2.d, sxtw] ldff1\t%0.d, %5/z, [%1, %2.d, sxtw %p4]" @@ -1910,7 +1922,7 @@ (define_insn_and_rewrite "*aarch64_ldff1_gather_\t%0.d, %5/z, [%1, %2.d, uxtw] ldff1\t%0.d, %5/z, [%1, %2.d, 
uxtw %p4]" @@ -1985,7 +1997,7 @@ (define_insn "@aarch64_sve_gather_prefetch" UNSPEC_SVE_PREFETCH_GATHER) (match_operand:DI 7 "const_int_operand") (match_operand:DI 8 "const_int_operand"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { static const char *const insns[][2] = { "prf", "%0, [%2.s]", @@ -2014,7 +2026,7 @@ (define_insn "@aarch64_sve_gather_prefetch" UNSPEC_SVE_PREFETCH_GATHER) (match_operand:DI 7 "const_int_operand") (match_operand:DI 8 "const_int_operand"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { static const char *const insns[][2] = { "prf", "%0, [%2.d]", @@ -2045,7 +2057,7 @@ (define_insn_and_rewrite "*aarch64_sve_gather_prefetch_ux UNSPEC_SVE_PREFETCH_GATHER) (match_operand:DI 7 "const_int_operand") (match_operand:DI 8 "const_int_operand"))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { static const char *const insns[][2] = { "prfb", "%0, [%1, %2.d, uxtw]", @@ -2242,7 +2254,7 @@ (define_expand "scatter_store" (match_operand:DI 3 "aarch64_gather_scale_operand_") (match_operand:SVE_24 4 "register_operand")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[5] = aarch64_ptrue_reg (mode); } @@ -2260,7 +2272,7 @@ (define_insn "mask_scatter_store" (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, Ui1, Ui1, Ui1, i, i") (match_operand:SVE_4 4 "register_operand" "w, w, w, w, w, w")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.s, %5, [%1.s] st1\t%4.s, %5, [%1.s, #%0] @@ -2282,7 +2294,7 @@ (define_insn "mask_scatter_store" (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, Ui1, Ui1, i") (match_operand:SVE_2 4 "register_operand" "w, w, w, w")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.d, %5, [%1.d] st1\t%4.d, %5, [%1.d, #%0] @@ -2305,7 +2317,7 @@ (define_insn_and_rewrite "*mask_scatter_store_xtw_unp (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, i") (match_operand:SVE_2 4 "register_operand" "w, w")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.d, %5, [%0, %1.d, xtw] st1\t%4.d, %5, [%0, %1.d, xtw %p3]" @@ -2332,7 +2344,7 @@ (define_insn_and_rewrite "*mask_scatter_store_sxtw" (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, i") (match_operand:SVE_2 4 "register_operand" "w, w")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.d, %5, [%0, %1.d, sxtw] st1\t%4.d, %5, [%0, %1.d, sxtw %p3]" @@ -2356,7 +2368,7 @@ (define_insn "*mask_scatter_store_uxtw" (match_operand:DI 3 "aarch64_gather_scale_operand_" "Ui1, i") (match_operand:SVE_2 4 "register_operand" "w, w")] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.d, %5, [%0, %1.d, uxtw] st1\t%4.d, %5, [%0, %1.d, uxtw %p3]" @@ -2384,7 +2396,7 @@ (define_insn "@aarch64_scatter_store_trunc" (truncate:VNx4_NARROW (match_operand:VNx4_WIDE 4 "register_operand" "w, w, w, w, w, w"))] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.s, %5, [%1.s] st1\t%4.s, %5, [%1.s, #%0] @@ -2407,7 +2419,7 @@ (define_insn "@aarch64_scatter_store_trunc" (truncate:VNx2_NARROW (match_operand:VNx2_WIDE 4 "register_operand" "w, w, w, w"))] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.d, %5, [%1.d] st1\t%4.d, %5, [%1.d, #%0] @@ -2432,7 +2444,7 @@ (define_insn_and_rewrite "*aarch64_scatter_store_trunc\t%4.d, %5, [%0, %1.d, sxtw] st1\t%4.d, %5, [%0, %1.d, sxtw %p3]" @@ -2456,7 +2468,7 @@ 
(define_insn "*aarch64_scatter_store_trunc_uxt (truncate:VNx2_NARROW (match_operand:VNx2_WIDE 4 "register_operand" "w, w"))] UNSPEC_ST1_SCATTER))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ st1\t%4.d, %5, [%0, %1.d, uxtw] st1\t%4.d, %5, [%0, %1.d, uxtw %p3]" @@ -2602,7 +2614,7 @@ (define_insn "@aarch64_sve_ld1ro" (match_operand:OI 1 "aarch64_sve_ld1ro_operand_" "UO")] UNSPEC_LD1RO))] - "TARGET_SVE_F64MM" + "TARGET_SVE_F64MM && TARGET_NON_STREAMING" { operands[1] = gen_rtx_MEM (mode, XEXP (operands[1], 0)); return "ld1ro\t%0., %2/z, %1"; @@ -3834,7 +3846,7 @@ (define_insn "@aarch64_adr" [(match_operand:SVE_FULL_SDI 1 "register_operand" "w") (match_operand:SVE_FULL_SDI 2 "register_operand" "w")] UNSPEC_ADR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0., [%1., %2.]" ) @@ -3850,7 +3862,7 @@ (define_insn_and_rewrite "*aarch64_adr_sxtw" (match_operand:VNx2DI 2 "register_operand" "w")))] UNSPEC_PRED_X)] UNSPEC_ADR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, sxtw]" "&& !CONSTANT_P (operands[3])" { @@ -3867,7 +3879,7 @@ (define_insn "*aarch64_adr_uxtw_unspec" (match_operand:VNx2DI 2 "register_operand" "w") (match_operand:VNx2DI 3 "aarch64_sve_uxtw_immediate"))] UNSPEC_ADR))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, uxtw]" ) @@ -3879,7 +3891,7 @@ (define_insn "*aarch64_adr_uxtw_and" (match_operand:VNx2DI 2 "register_operand" "w") (match_operand:VNx2DI 3 "aarch64_sve_uxtw_immediate")) (match_operand:VNx2DI 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, uxtw]" ) @@ -3894,7 +3906,7 @@ (define_expand "@aarch64_adr_shift" (match_operand:SVE_FULL_SDI 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:SVE_FULL_SDI 1 "register_operand")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[4] = CONSTM1_RTX (mode); } @@ -3910,7 +3922,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift" (match_operand:SVE_24I 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:SVE_24I 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0., [%1., %2., lsl %3]" "&& !CONSTANT_P (operands[4])" { @@ -3934,7 +3946,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift_sxtw" (match_operand:VNx2DI 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:VNx2DI 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, sxtw %3]" "&& (!CONSTANT_P (operands[4]) || !CONSTANT_P (operands[5]))" { @@ -3955,7 +3967,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift_uxtw" (match_operand:VNx2DI 3 "const_1_to_3_operand"))] UNSPEC_PRED_X) (match_operand:VNx2DI 1 "register_operand" "w")))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "adr\t%0.d, [%1.d, %2.d, uxtw %3]" "&& !CONSTANT_P (operands[5])" { @@ -6967,7 +6979,7 @@ (define_insn "@aarch64_sve_add_" (match_operand: 3 "register_operand" "w, w")] MATMUL) (match_operand:VNx4SI_ONLY 1 "register_operand" "0, w")))] - "TARGET_SVE_I8MM" + "TARGET_SVE_I8MM && TARGET_NON_STREAMING" "@ mmla\\t%0.s, %2.b, %3.b movprfx\t%0, %1\;mmla\\t%0.s, %2.b, %3.b" @@ -7538,7 +7550,7 @@ (define_insn "@aarch64_sve_" (match_operand:SVE_MATMULF 3 "register_operand" "w, w") (match_operand:SVE_MATMULF 1 "register_operand" "0, w")] FMMLA))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "@ \\t%0., %2., %3. movprfx\t%0, %1\;\\t%0., %2., %3." 
@@ -8601,7 +8613,7 @@ (define_expand "fold_left_plus_" (match_operand: 1 "register_operand") (match_operand:SVE_FULL_F 2 "register_operand")] UNSPEC_FADDA))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" { operands[3] = aarch64_ptrue_reg (mode); } @@ -8614,7 +8626,7 @@ (define_insn "mask_fold_left_plus_" (match_operand: 1 "register_operand" "0") (match_operand:SVE_FULL_F 2 "register_operand" "w")] UNSPEC_FADDA))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "fadda\t%0, %3, %0, %2." ) @@ -8668,7 +8680,7 @@ (define_insn "@aarch64_sve_compact" [(match_operand: 1 "register_operand" "Upl") (match_operand:SVE_FULL_SD 2 "register_operand" "w")] UNSPEC_SVE_COMPACT))] - "TARGET_SVE" + "TARGET_SVE && TARGET_NON_STREAMING" "compact\t%0., %1, %2." ) diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 5df38e3f951..033520740cd 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -109,7 +109,7 @@ (define_insn "@aarch64_gather_ldnt" (match_operand: 3 "register_operand" "w, w") (mem:BLK (scratch))] UNSPEC_LDNT1_GATHER))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "@ ldnt1\t%0., %1/z, [%3.] ldnt1\t%0., %1/z, [%3., %2]" @@ -129,6 +129,7 @@ (define_insn_and_rewrite "@aarch64_gather_ldnt_ & ) == 0" "@ ldnt1\t%0., %1/z, [%3.] @@ -2426,7 +2427,7 @@ (define_insn "@aarch64_sve2_histcnt" (match_operand:SVE_FULL_SDI 2 "register_operand" "w") (match_operand:SVE_FULL_SDI 3 "register_operand" "w")] UNSPEC_HISTCNT))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "histcnt\t%0., %1/z, %2., %3." ) @@ -2436,7 +2437,7 @@ (define_insn "@aarch64_sve2_histseg" [(match_operand:VNx16QI_ONLY 1 "register_operand" "w") (match_operand:VNx16QI_ONLY 2 "register_operand" "w")] UNSPEC_HISTSEG))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "histseg\t%0., %1., %2." ) @@ -2460,7 +2461,7 @@ (define_insn "@aarch64_pred_" SVE2_MATCH)] UNSPEC_PRED_Z)) (clobber (reg:CC_NZC CC_REGNUM))] - "TARGET_SVE2" + "TARGET_SVE2 && TARGET_NON_STREAMING" "\t%0., %1/z, %3., %4." ) @@ -2491,6 +2492,7 @@ (define_insn_and_rewrite "*aarch64_pred__cc" SVE2_MATCH)] UNSPEC_PRED_Z))] "TARGET_SVE2 + && TARGET_NON_STREAMING && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])" "\t%0., %1/z, %2., %3." "&& !rtx_equal_p (operands[4], operands[6])" @@ -2518,6 +2520,7 @@ (define_insn_and_rewrite "*aarch64_pred__ptest" UNSPEC_PTEST)) (clobber (match_scratch: 0 "=Upa"))] "TARGET_SVE2 + && TARGET_NON_STREAMING && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])" "\t%0., %1/z, %2., %3." "&& !rtx_equal_p (operands[4], operands[6])" diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 398cc03fd1f..8359cf709c1 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -252,6 +252,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define AARCH64_ISA_MOPS (aarch64_isa_flags & AARCH64_FL_MOPS) #define AARCH64_ISA_LS64 (aarch64_isa_flags & AARCH64_FL_LS64) +/* The current function is a normal non-streaming function. */ +#define TARGET_NON_STREAMING (AARCH64_ISA_SM_OFF) + /* Crypto is an optional extension to AdvSIMD. */ #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO) @@ -290,16 +293,16 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define TARGET_SVE2 (AARCH64_ISA_SVE2) /* SVE2 AES instructions, enabled through +sve2-aes. 
*/ -#define TARGET_SVE2_AES (AARCH64_ISA_SVE2_AES) +#define TARGET_SVE2_AES (AARCH64_ISA_SVE2_AES && TARGET_NON_STREAMING) /* SVE2 BITPERM instructions, enabled through +sve2-bitperm. */ -#define TARGET_SVE2_BITPERM (AARCH64_ISA_SVE2_BITPERM) +#define TARGET_SVE2_BITPERM (AARCH64_ISA_SVE2_BITPERM && TARGET_NON_STREAMING) /* SVE2 SHA3 instructions, enabled through +sve2-sha3. */ -#define TARGET_SVE2_SHA3 (AARCH64_ISA_SVE2_SHA3) +#define TARGET_SVE2_SHA3 (AARCH64_ISA_SVE2_SHA3 && TARGET_NON_STREAMING) /* SVE2 SM4 instructions, enabled through +sve2-sm4. */ -#define TARGET_SVE2_SM4 (AARCH64_ISA_SVE2_SM4) +#define TARGET_SVE2_SM4 (AARCH64_ISA_SVE2_SM4 && TARGET_NON_STREAMING) /* ARMv8.3-A features. */ #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3A) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index a8ad4e5ff21..8d65fadbdf6 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -2709,7 +2709,7 @@ (define_int_iterator SVE_INT_UNARY [UNSPEC_RBIT UNSPEC_REVB (define_int_iterator SVE_FP_UNARY [UNSPEC_FRECPE UNSPEC_RSQRTE]) -(define_int_iterator SVE_FP_UNARY_INT [UNSPEC_FEXPA]) +(define_int_iterator SVE_FP_UNARY_INT [(UNSPEC_FEXPA "TARGET_NON_STREAMING")]) (define_int_iterator SVE_INT_SHIFT_IMM [UNSPEC_ASRD (UNSPEC_SQSHLU "TARGET_SVE2") @@ -2723,7 +2723,7 @@ (define_int_iterator SVE_FP_BINARY_INT [UNSPEC_FTSMUL UNSPEC_FTSSEL]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG [UNSPEC_BFDOT UNSPEC_BFMLALB UNSPEC_BFMLALT - UNSPEC_BFMMLA]) + (UNSPEC_BFMMLA "TARGET_NON_STREAMING")]) (define_int_iterator SVE_BFLOAT_TERNARY_LONG_LANE [UNSPEC_BFDOT UNSPEC_BFMLALB diff --git a/gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp b/gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp new file mode 100644 index 00000000000..23f23f8ec42 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sve/aarch64-ssve.exp @@ -0,0 +1,309 @@ +# Specific regression driver for AArch64 SME. +# Copyright (C) 2009-2022 Free Software Foundation, Inc. +# Contributed by ARM Ltd. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# Test whether certain SVE instructions are accepted or rejected in +# SME streaming mode. + +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } then { + return +} + +load_lib gcc-defs.exp + +gcc_parallel_test_enable 0 + +# Code shared by all tests. 
+set preamble { +#include + +#pragma GCC target "+i8mm+f32mm+f64mm+sve2+sve2-bitperm+sve2-sm4+sve2-aes+sve2-sha3+sme" + +extern svbool_t &pred; + +extern svint8_t &s8; +extern svint32_t &s32; + +extern svuint8_t &u8; +extern svuint16_t &u16; +extern svuint32_t &u32; +extern svuint64_t &u64; + +extern svbfloat16_t &bf16; +extern svfloat32_t &f32; + +extern void *void_ptr; + +extern int8_t *s8_ptr; +extern int16_t *s16_ptr; +extern int32_t *s32_ptr; + +extern uint8_t *u8_ptr; +extern uint16_t *u16_ptr; +extern uint32_t *u32_ptr; +extern uint64_t *u64_ptr; + +extern uint64_t indx; +} + +# Wrap a standalone call in a streaming-compatible function. +set sc_harness { +void __attribute__((arm_streaming_compatible)) +foo () +{ + $CALL; +} +} + +# HARNESS is some source code that should be appended to the preamble +# variable defined above. It includes the string "$CALL", which should be +# replaced by the function call in CALL. The result after both steps is +# a complete C++ translation unit. +# +# Try compiling the C++ code and see what output GCC produces. +# The expected output is either: +# +# - empty, if SHOULD_PASS is true +# - a message rejecting CALL in streaming mode, if SHOULD_PASS is false +# +# CALL is simple enough that it can be used in test names. +proc check_ssve_call { harness name call should_pass } { + global preamble + + set filename test-[pid] + set fd [open $filename.cc w] + puts $fd $preamble + puts -nonewline $fd [string map [list {$CALL} $call] $harness] + close $fd + remote_download host $filename.cc + + set test "streaming SVE call $name" + + set gcc_output [g++_target_compile $filename.cc $filename.s assembly ""] + remote_file build delete $filename.cc $filename.s + + if { [string equal $gcc_output ""] } { + if { $should_pass } { + pass $test + } else { + fail $test + } + return + } + + set lines [split $gcc_output "\n"] + set error_text "cannot be called when SME streaming mode is enabled" + if { [llength $lines] == 3 + && [string first "In function" [lindex $lines 0]] >= 0 + && [string first $error_text [lindex $lines 1]] >= 0 + && [string equal [lindex $lines 2] ""] } { + if { $should_pass } { + fail $test + } else { + pass $test + } + return + } + + verbose -log "$test: unexpected output" + fail $test +} + +# Apply check_ssve_call to each line in CALLS. The other arguments are +# as for check_ssve_call. +proc check_ssve_calls { harness calls should_pass } { + foreach line [split $calls "\n"] { + set call [string trim $line] + if { [string equal $call ""] } { + continue + } + check_ssve_call $harness "$call" $call $should_pass + } +} + +# A small selection of things that are valid in streaming mode. +set streaming_ok { + s8 = svadd_x (pred, s8, s8) + s8 = svld1 (pred, s8_ptr) +} + +# This order follows the list in the SME manual. 
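For concreteness, here is a minimal sketch of the translation unit that check_ssve_call assembles for one non-streaming-only entry: the preamble (abridged to the declarations this call needs) followed by sc_harness with $CALL substituted.  The #include target in the preamble above was stripped in the archive and is assumed here to be <arm_sve.h>:

  #include <arm_sve.h>

  #pragma GCC target "+i8mm+f32mm+f64mm+sve2+sve2-bitperm+sve2-sm4+sve2-aes+sve2-sha3+sme"

  extern svbool_t &pred;
  extern svuint32_t &u32;

  void __attribute__((arm_streaming_compatible))
  foo ()
  {
    u32 = svcompact (pred, u32);
  }

Each line in the nonstreaming_only list below is wrapped in exactly this way and must be rejected with the "cannot be called when SME streaming mode is enabled" diagnostic, while the streaming_ok calls above must compile cleanly.
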
+set nonstreaming_only { + u32 = svadrb_offset (u32, u32) + u64 = svadrb_offset (u64, u64) + u32 = svadrh_index (u32, u32) + u64 = svadrh_index (u64, u64) + u32 = svadrw_index (u32, u32) + u64 = svadrw_index (u64, u64) + u32 = svadrd_index (u32, u32) + u64 = svadrd_index (u64, u64) + u8 = svaesd (u8, u8) + u8 = svaese (u8, u8) + u8 = svaesimc (u8) + u8 = svaesmc (u8) + u8 = svbdep (u8, u8) + u8 = svbext (u8, u8) + f32 = svbfmmla (f32, bf16, bf16) + u8 = svbgrp (u8, u8) + u32 = svcompact (pred, u32) + f32 = svadda (pred, 1.0f, f32) + f32 = svexpa (u32) + f32 = svmmla (f32, f32, f32) + f32 = svtmad (f32, f32, 0) + f32 = svtsmul (f32, u32) + f32 = svtssel (f32, u32) + u32 = svhistcnt_z (pred, u32, u32) + u8 = svhistseg (u8, u8) + u32 = svld1ub_gather_offset_u32 (pred, u8_ptr, u32) + u32 = svld1ub_gather_offset_u32 (pred, u32, 1) + u64 = svld1_gather_index (pred, u64_ptr, u64) + u64 = svld1_gather_index_u64 (pred, u64, 1) + u32 = svld1uh_gather_index_u32 (pred, u16_ptr, u32) + u32 = svld1uh_gather_index_u32 (pred, u32, 1) + u8 = svld1ro (pred, u8_ptr + indx) + u8 = svld1ro (pred, u8_ptr + 1) + u16 = svld1ro (pred, u16_ptr + indx) + u16 = svld1ro (pred, u16_ptr + 1) + u32 = svld1ro (pred, u32_ptr + indx) + u32 = svld1ro (pred, u32_ptr + 1) + u64 = svld1ro (pred, u64_ptr + indx) + u64 = svld1ro (pred, u64_ptr + 1) + u32 = svld1sb_gather_offset_u32 (pred, s8_ptr, u32) + u32 = svld1sb_gather_offset_u32 (pred, u32, 1) + u32 = svld1sh_gather_index_u32 (pred, s16_ptr, u32) + u32 = svld1sh_gather_index_u32 (pred, u32, 1) + u64 = svld1sw_gather_index_u64 (pred, s32_ptr, u64) + u64 = svld1sw_gather_index_u64 (pred, u64, 1) + u64 = svld1uw_gather_index_u64 (pred, u32_ptr, u64) + u64 = svld1uw_gather_index_u64 (pred, u64, 1) + u32 = svld1_gather_index (pred, u32_ptr, u32) + u32 = svld1_gather_index_u32 (pred, u32, 1) + u8 = svldff1(pred, u8_ptr) + u16 = svldff1ub_u16(pred, u8_ptr) + u32 = svldff1ub_u32(pred, u8_ptr) + u64 = svldff1ub_u64(pred, u8_ptr) + u32 = svldff1ub_gather_offset_u32 (pred, u8_ptr, u32) + u32 = svldff1ub_gather_offset_u32 (pred, u32, 1) + u64 = svldff1(pred, u64_ptr) + u64 = svldff1_gather_index (pred, u64_ptr, u64) + u64 = svldff1_gather_index_u64 (pred, u64, 1) + u16 = svldff1(pred, u16_ptr) + u32 = svldff1uh_u32(pred, u16_ptr) + u64 = svldff1uh_u64(pred, u16_ptr) + u32 = svldff1uh_gather_offset_u32 (pred, u16_ptr, u32) + u32 = svldff1uh_gather_offset_u32 (pred, u32, 1) + u16 = svldff1sb_u16(pred, s8_ptr) + u32 = svldff1sb_u32(pred, s8_ptr) + u64 = svldff1sb_u64(pred, s8_ptr) + u32 = svldff1sb_gather_offset_u32 (pred, s8_ptr, u32) + u32 = svldff1sb_gather_offset_u32 (pred, u32, 1) + u32 = svldff1sh_u32(pred, s16_ptr) + u64 = svldff1sh_u64(pred, s16_ptr) + u32 = svldff1sh_gather_offset_u32 (pred, s16_ptr, u32) + u32 = svldff1sh_gather_offset_u32 (pred, u32, 1) + u64 = svldff1sw_u64(pred, s32_ptr) + u64 = svldff1sw_gather_offset_u64 (pred, s32_ptr, u64) + u64 = svldff1sw_gather_offset_u64 (pred, u64, 1) + u32 = svldff1(pred, u32_ptr) + u32 = svldff1_gather_index (pred, u32_ptr, u32) + u32 = svldff1_gather_index_u32 (pred, u32, 1) + u64 = svldff1uw_u64(pred, u32_ptr) + u64 = svldff1uw_gather_offset_u64 (pred, u32_ptr, u64) + u64 = svldff1uw_gather_offset_u64 (pred, u64, 1) + u8 = svldnf1(pred, u8_ptr) + u16 = svldnf1ub_u16(pred, u8_ptr) + u32 = svldnf1ub_u32(pred, u8_ptr) + u64 = svldnf1ub_u64(pred, u8_ptr) + u64 = svldnf1(pred, u64_ptr) + u16 = svldnf1(pred, u16_ptr) + u32 = svldnf1uh_u32(pred, u16_ptr) + u64 = svldnf1uh_u64(pred, u16_ptr) + u16 = svldnf1sb_u16(pred, s8_ptr) + u32 = 
svldnf1sb_u32(pred, s8_ptr) + u64 = svldnf1sb_u64(pred, s8_ptr) + u32 = svldnf1sh_u32(pred, s16_ptr) + u64 = svldnf1sh_u64(pred, s16_ptr) + u64 = svldnf1sw_u64(pred, s32_ptr) + u32 = svldnf1(pred, u32_ptr) + u64 = svldnf1uw_u64(pred, u32_ptr) + u32 = svldnt1ub_gather_offset_u32 (pred, u8_ptr, u32) + u32 = svldnt1ub_gather_offset_u32 (pred, u32, 1) + u64 = svldnt1_gather_index (pred, u64_ptr, u64) + u64 = svldnt1_gather_index_u64 (pred, u64, 1) + u32 = svldnt1uh_gather_offset_u32 (pred, u16_ptr, u32) + u32 = svldnt1uh_gather_offset_u32 (pred, u32, 1) + u32 = svldnt1sb_gather_offset_u32 (pred, s8_ptr, u32) + u32 = svldnt1sb_gather_offset_u32 (pred, u32, 1) + u32 = svldnt1sh_gather_offset_u32 (pred, s16_ptr, u32) + u32 = svldnt1sh_gather_offset_u32 (pred, u32, 1) + u64 = svldnt1sw_gather_offset_u64 (pred, s32_ptr, u64) + u64 = svldnt1sw_gather_offset_u64 (pred, u64, 1) + u64 = svldnt1uw_gather_offset_u64 (pred, u32_ptr, u64) + u64 = svldnt1uw_gather_offset_u64 (pred, u64, 1) + u32 = svldnt1_gather_offset (pred, u32_ptr, u32) + u32 = svldnt1_gather_offset_u32 (pred, u32, 1) + pred = svmatch (pred, u8, u8) + pred = svnmatch (pred, u8, u8) + u64 = svpmullb_pair (u64, u64) + u64 = svpmullt_pair (u64, u64) + svprfb_gather_offset (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfb_gather_offset (pred, u64, 1, SV_PLDL1KEEP) + svprfd_gather_index (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfd_gather_index (pred, u64, 1, SV_PLDL1KEEP) + svprfh_gather_index (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfh_gather_index (pred, u64, 1, SV_PLDL1KEEP) + svprfw_gather_index (pred, void_ptr, u64, SV_PLDL1KEEP) + svprfw_gather_index (pred, u64, 1, SV_PLDL1KEEP) + u64 = svrax1 (u64, u64) + pred = svrdffr () + pred = svrdffr_z (pred) + svsetffr () + u32 = svsm4e (u32, u32) + u32 = svsm4ekey (u32, u32) + s32 = svmmla (s32, s8, s8) + svst1b_scatter_offset (pred, u8_ptr, u32, u32) + svst1b_scatter_offset (pred, u32, 1, u32) + svst1_scatter_index (pred, u64_ptr, u64, u64) + svst1_scatter_index (pred, u64, 1, u64) + svst1h_scatter_index (pred, u16_ptr, u32, u32) + svst1h_scatter_index (pred, u32, 1, u32) + svst1w_scatter_index (pred, u32_ptr, u64, u64) + svst1w_scatter_index (pred, u64, 1, u64) + svst1_scatter_index (pred, u32_ptr, u32, u32) + svst1_scatter_index (pred, u32, 1, u32) + svstnt1b_scatter_offset (pred, u8_ptr, u32, u32) + svstnt1b_scatter_offset (pred, u32, 1, u32) + svstnt1_scatter_offset (pred, u64_ptr, u64, u64) + svstnt1_scatter_offset (pred, u64, 1, u64) + svstnt1h_scatter_offset (pred, u16_ptr, u32, u32) + svstnt1h_scatter_offset (pred, u32, 1, u32) + svstnt1w_scatter_offset (pred, u32_ptr, u64, u64) + svstnt1w_scatter_offset (pred, u64, 1, u64) + svstnt1_scatter_offset (pred, u32_ptr, u32, u32) + svstnt1_scatter_offset (pred, u32, 1, u32) + u32 = svmmla (u32, u8, u8) + s32 = svusmmla (s32, u8, s8) + svwrffr (pred) +} + +check_ssve_calls $sc_harness $streaming_ok 1 +check_ssve_calls $sc_harness $nonstreaming_only 0 + +gcc_parallel_test_enable 1 diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp b/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp index 38140413a97..1f49c98f077 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp @@ -50,6 +50,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } { torture-init set-torture-options { "-std=c++98 -O0 -g" + "-std=c++98 -O0 -DSTREAMING_COMPATIBLE" "-std=c++98 -O1 -g" "-std=c++11 -O2 -g" "-std=c++14 -O3 -g" diff --git 
a/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp b/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp index 78e8ecae729..8d562171a01 100644 --- a/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp +++ b/gcc/testsuite/g++.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp @@ -53,6 +53,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } { torture-init set-torture-options { "-std=c++98 -O0 -g" + "-std=c++98 -O0 -DSTREAMING_COMPATIBLE" "-std=c++98 -O1 -g" "-std=c++11 -O2 -g" "-std=c++14 -O3 -g" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp b/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp index a271f1793f4..8cb2b9bb4fc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp @@ -50,6 +50,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } { torture-init set-torture-options { "-std=c90 -O0 -g" + "-std=c90 -O0 -DSTREAMING_COMPATIBLE" "-std=c90 -O1 -g" "-std=c99 -O2 -g" "-std=c11 -O3 -g" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c index 6c6bfa1c294..4d6ec2d65f7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c index 8b2a1dd1c68..04afbcee6c0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c index 90a56420a6a..8b4c7d1ff7f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adda_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c index a61eec9712e..5dcdc54b007 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrb.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c index 970485bd67d..d9d16ce3f7d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrd.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c index d06f51fe35b..a358c240389 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrh.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c index b23f25a1125..bd1e9af0a6d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/adrw.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c index b1d98fbf536..4bb2912a45a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/bfmmla_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-additional-options "-march=armv8.2-a+sve+bf16" } */ /* { dg-require-effective-target aarch64_asm_bf16_ok } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c index 2e80d6830ca..d261ec00b92 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c index e0bc33efec2..024b0510faa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c index e4634982bf6..0b32dfb609c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c index 71cb97b8a2a..38688dbca73 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c index 954329a0b2f..a3e89cc97a1 100644 --- 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c index ec664845f4a..602ab048c99 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/compact_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c index 5a5411e46cb..87c26e6ea6b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c index 4ded1c5756e..5e9839537c7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c index c31f9ccb5b2..b117df2a4b1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/expa_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c index 00b68ff290c..8b972f61b49 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c index 47127960c0d..413d4d62d4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c index 9b6335547f5..b3df7d154cf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c index c9cea3ad8c7..0da1e52966b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c index 2cccc8d4906..a3304c4197a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c index 6ee1d48ab0c..73ef94805dc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c index cb1801778d4..fe909b666c9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_bf16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c index 86081edbd65..30ba3063900 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c index c8df00f8a02..cf62fada91a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c index 2fb9d5b7486..b9fde4dac69 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c index 3cd211b1646..35b7dd1d27e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c index 44b16ed5f72..57b6a6567c0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c index 3aa9a15eeee..bd7e28478e2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c index 49aff5146f2..1438000038e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c index 00bf9e129f5..145b0b7f3aa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c index 9e9b3290a12..9f150631b94 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c index 64ec628714b..8dd75d13607 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c index 22701320bf7..f154545868b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ /* { dg-additional-options "-march=armv8.6-a+f64mm" } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c index 16a5316a9e4..06249ad4c5c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c index 3f953247ea1..8d141e133e6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c index 424de65a6fe..77836cbf652 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c index aa375bea2e3..f4b24ab419a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sb_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c index ed07b4dfcfa..1b978236845 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c index 20ca4272059..2009dec812e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c index e3a85a23fb6..0e1d4896665 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c index 3a0094fba59..115d7d3a996 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c index 4d076b4861a..5dc44421ca4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c index ffa85eb3e73..fac4ec41c00 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1sw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c index a9c4182659e..f57df42266d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c index 99af86ddf82..0c069fa4f44 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c index 77c7e0a2dff..98102e01393 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c index b605f8b67e3..f86a34d1248 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ub_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c index 84fb5c335d7..13937187895 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c index 44700179322..f0338aae6b4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c index 09d3cc8c298..5810bc0accb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c index f3dcf03cd81..52e95abb9b4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c index f4e9d5db970..0889eefdddd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c index 854d19233f5..fb144d756ab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1uw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c index 80f6468700e..1f997480ea8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_bf16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c index 13ce863c96a..60405d0a0ed 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c index 2fcc633906c..225e9969dd2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c index cc15b927aba..366e36afdbe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c index 7e330c04221..b84b9bcdda7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c index d0e47f0bf19..e779b071283 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c index 66bf0f74630..17e0f9aa2d8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c index faf71bf9dd5..030f187b152 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c index 41c7dc9cf31..fb86530166f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c index 8b53ce94f85..5be30a2d842 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c index 1d5fde0e639..61d242c074b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c index 97a36e88499..afe748ef939 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c index c018a4c1ca6..bee22285539 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c index cf620d1f4b0..ccaac2ca4eb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c index 1fa819296cb..c8416f99df9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c index 5224ec40ac8..ec26a82ca19 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c index 18e87f2b805..e211f179486 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c index 83883fca43a..24dfe452f03 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c index c2a676807a5..f7e3977bfcf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c index 2f2a04d24bb..7f2a829a8e4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c index e3e83a205cb..685f628088d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c index 769f2c266e9..49a7a85367f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c index e0a748c6a6b..1d30c7ba618 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c index 86716da9ba1..c2b3f42cb5b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c index e7a4aa6e93d..585a6241e0b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c index 69ba96d52e2..ebb2f0f66f0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c index e1a1873f0a4..f4ea96cf91c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c index 0a49cbcc07f..e3735239c4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sb_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c index b633335dc71..67e70361b5c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c index 32a4309b633..5755c79bc1a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c index 73a9be8923b..a5848999573 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c index 94ea73b6306..b1875120980 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c index 81b64e836b8..bffac936527 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c index 453b3ff244a..a4acb1e5ea9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c index bbbed79dc35..828288cd825 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c index 5430e256b46..e3432c46c27 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c index e5da8a83dc3..78aa34ec055 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c index 41142875673..9dad1212c81 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c index d795ace6391..33b6c10ddc5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c index 6caf2f5045d..e8c9c845f95 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1sw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c index af0be08d21c..b1c9c81357f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c index 43124dd8930..9ab776a218f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c index 90c4e58a275..745740dfa3f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c index 302623a400b..3a7bd6a436b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c index 88ad2d1dc61..ade0704f7ad 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c index e8e06411f98..5d3e0ce95e5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c index 21d02ddb721..08ae802ee26 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c index 904cb027e3e..d8dc5e15738 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c index a400123188b..042ae5a9f02 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c index a9a98a68362..d0844fa5197 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1ub_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c index d02e443428a..12460105d0e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c index 663a73d2715..536331371b0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c index 5e0ef067f54..602e6a686e6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c index 1cfae1b9532..4b307b3416e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c index abb3d769a74..db205b1ef7b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c index 6e330e8e8a8..0eac877eb82 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c index 4eb5323e957..266ecf167fe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c index ebac26e7d37..bdd725e4a35 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c index 6c0daea52b5..ab2c79da782 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c index 0e400c6790f..361d7de05d8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c index ac97798991c..8adcec3d512 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c index c7ab0617106..781fc1a9c66 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1uw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c index 947a896e778..93b4425ecb5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_bf16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c index cf017868839..d47d748c76c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c index 83b73ec8e09..e390d685797 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c index 778096e826b..97a0e39e7c8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c index 592c8237de3..21008d7f9ca 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c index 634092af8ea..8a3d795b309 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c index 4a03f66767a..c0b57a2f3fc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c index 162ee176ad5..6714152d93c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c index e920ac43b45..3df404d77bb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c index 65e28c5c206..e899a4a6ff4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c index 70d3f27d87a..ab69656cfa8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c index 5c29f1d196a..5d7b074973e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c index e04b9a7887f..5b53c885d6a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c index 0553fc98da4..992eba7cc2f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c index 61a474fdf52..99e0f8bd091 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c index be63d8bf9b2..fe23913f23c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c index 4f52490b4a8..6deb39770a1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c index 73f50d182a5..e76457da6cd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sb_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c index 08c7dc6dd4d..e49a7f8ed49 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c index 6a41bc26b7f..00b40281c24 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c index 2f7718730f1..41560af330f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c index d7f1a68a4cd..0acf4b34916 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c index 5b483e4aa1d..5782128982c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c index 62121ce0a44..8249c4c3f79 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1sw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c index 8fe13411f31..e59c451f790 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c index 50122e3b786..d788576e275 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c index d7cce11b60c..b21fdb96491 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c index 7bf82c3b6c0..1ae41b002ff 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c index e2fef064b47..e3d8fb3b5f0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c index 57c61e122ac..df9a0c07fa7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1ub_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c index ed9686c4ed5..c3467d84675 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c index a3107f562b8..bf3355e9986 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c index 93d5abaf76e..bcc3eb3fd8f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c index 32d36a84ce3..4c01c13ac3f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uh_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c index 373922791d0..3c655659115 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c index b3c3be1d01f..b222a0dc648 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1uw_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c index f66dbf397c4..e1c7f47dc96 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_f32mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+f32mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c index 49dc0607cff..c45caa70001 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_f64mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+f64mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c index e7ce009acfc..dc155461c61 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_i8mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+sve+i8mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c index 81f5166fbf9..43d601a471d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/mmla_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_i8mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+sve+i8mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c index c4bfbbbf7d7..f32cfbfcb19 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfb_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c index a84acb1a106..8a4293b6253 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfd_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c index 04b7a15758c..6beca4b8e0f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfh_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c index 2bbae1b9e02..6af44ac8290 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/prfw_gather.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c index 5564e967fcf..7e28ef6412f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rdffr_1.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c index cb6774ad04f..1efd4344532 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c index fe978bbe5f1..f50c43e8309 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c index d244e701a81..bb6fb10b83f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c index 5c4ebf440bc..19ec78e9e6e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c index fe3f7259f24..57fbb91b0ef 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c index 23212356625..60018be5b80 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c index d59033356be..fb1bb29dbe2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c index c7a35f1b470..65ee9a071fd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c index e098cb9b77e..ceec6193952 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c index 058d1313fc2..aeedbc6d7a7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1b_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c index 2a23d41f3a1..2d69d085bc0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c index 6a1adb05609..3e5733ef9bb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c index 12197315d09..5cd330a3dec 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c index 7021ea68f49..0ee9948cb4e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1h_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c index 2363f592b19..f18bedce1ca 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c
index 767c009b4f7..6850865ec9a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1w_scatter_u64.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
index fbf392b3ed4..5ee272e270c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
@@ -11,10 +11,17 @@
 #error "Please define -DTEST_OVERLOADS or -DTEST_FULL"
 #endif
 
+#ifdef STREAMING_COMPATIBLE
+#define ATTR __attribute__ ((arm_streaming_compatible))
+#else
+#define ATTR
+#endif
+
 #ifdef __cplusplus
-#define PROTO(NAME, RET, ARGS) extern "C" RET NAME ARGS; RET NAME ARGS
+#define PROTO(NAME, RET, ARGS) \
+  extern "C" RET ATTR NAME ARGS; RET ATTR NAME ARGS
 #else
-#define PROTO(NAME, RET, ARGS) RET NAME ARGS
+#define PROTO(NAME, RET, ARGS) RET ATTR NAME ARGS
 #endif
 
 #define TEST_UNIFORM_Z(NAME, TYPE, CODE1, CODE2) \
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c
index 3a00716e37f..c0b03a0d331 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f16.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c
index b73d420fbac..8eef8a12ca8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f32.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c
index fc31928a6c3..5c96c55796c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tmad_f64.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c
index 94bc696eb07..9deed667f89 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f16.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c
index d0ec91882d2..749ea8664be 100644
---
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c index 23e0da3f7a0..053abcb26e9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tsmul_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c index e7c3ea03b81..3ab251fe04a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c index 022573a191d..6c6471c5e56 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c index ffcdf4224b3..9559e0f352d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tssel_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c index 9440f3fd919..a0dd7e334aa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/usmmla_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-require-effective-target aarch64_asm_i8mm_ok } */ /* { dg-additional-options "-march=armv8.2-a+sve+i8mm" } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp b/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp index e08cd612190..41fd283fdbc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/aarch64-sve2-acle-asm.exp @@ -39,7 +39,7 @@ if { [check_effective_target_aarch64_sve2] } { # Turn off any codegen tweaks by default that may affect expected assembly. # Tests relying on those should turn them on explicitly. 
-set sve_flags "$sve_flags -mtune=generic -moverride=tune=none"
+set sve2_flags "$sve2_flags -mtune=generic -moverride=tune=none"
 
 lappend extra_flags "-fno-ipa-icf"
 
@@ -52,6 +52,7 @@ if { [info exists gcc_runtest_parallelize_limit_minor] } {
 torture-init
 set-torture-options {
     "-std=c90 -O0 -g"
+    "-std=c90 -O0 -DSTREAMING_COMPATIBLE"
     "-std=c90 -O1 -g"
     "-std=c99 -O2 -g"
     "-std=c11 -O3 -g"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c
index 622f5cf4609..484f7251f75 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesd_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c
index 6555bbb1de7..6869bbd0527 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aese_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c
index 4630595ff20..534ffe06f35 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesimc_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c
index 6e8acf48f2a..1660a8eaf01 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/aesmc_u8.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c
index 14230850f70..c1a4e10614f 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u16.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c
index 7f08df4baa2..4f14cc4c432 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u32.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
 /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
 
 #include "test_sve_acle.h"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c
index 7f7cbbeebad..091253ec60b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u64.c
@@
-1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c index b420323b906..deb1ad27d90 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bdep_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c index 50a647918e5..9efa501efa8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c index 9f98b843c1a..18963da5bd3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c index 9dbaec1b762..91591f93b88 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c index 81ed5a463a0..1211587ef41 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bext_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c index 70aeae3f329..72868bea7f6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c index 6e19e38d897..c8923816fe4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c index 27fa40f4777..86989529faf 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c index b667e03e3a4..5cd941a7a6e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/bgrp_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c index 7bf783a7c18..53d6c5c5636 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c index 001f5f0f187..c6d9862e31f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c index d93091adc55..cb11a00261b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c index 3b889802395..0bb06cdb45d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histcnt_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c index 380ccdf85a5..ce3458e5ef6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c index f43292f0ccd..7b1eff811c5 
100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/histseg_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c index 102810e25c8..17e3673a4a7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c index a0ed71227e8..8ce32e9f9ff 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c index 94c64971c77..b7e1d7a99c8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c index a0aa6703f9c..b0789ad21ce 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c index e1479684e82..df09eaa7680 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c index 77cdcfebafe..5f185ea824b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c index bb729483fcd..71fece575d9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c index de5b693140c..1183e72f0fb 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c index d01ec18e442..4d5e6e7716f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c index b96e94353f1..ed329a23f19 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sb_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c index 1dcfbc0fb95..6dbd6cea0f6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c index 4166ed0a6c8..4ea3335a29f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c index 7680344da28..d5545151994 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c index 2427c83ab67..18c8ca44e7b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c index 2f538e847c2..41bff31d021 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c index ace1c2f2fe5..30b8f6948f7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1sw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c index d3b29eb193d..8750d11af0f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c index 3bc406620d7..f7981991a6a 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c index 0af4b40b851..4d5ee4ef4ef 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c index fe28d78ed46..005c29c0644 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1ub_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c index 985432615ca..92613b16685 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c index 3c5baeee60e..be2e6d126e8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c index 4d945e9f994..4d122059f72 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c index 680238ac4f7..e3bc1044cd7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uh_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c index 787ae9defb2..9efa4b2cbf0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c index 4810bc3c45c..4ded4454df1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/ldnt1uw_gather_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c index baebc7693c6..d0ce8129475 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c index f35a753791d..03473906aa2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c index 0bdf4462f3d..2a8b4d250ab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c index 6d78692bdb4..8409276d905 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/match_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c index 935b19a1040..044ba1de397 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c index 8a00b30f308..6c2d890fa41 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_s8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c index 868c20a11e5..863e31054e2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u16.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c index af6b5816513..a62783db763 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/nmatch_u8.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c index 944609214a1..1fd85e0ce80 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullb_pair_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c index 90e2e991f9b..300d885abb0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/pmullt_pair_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c index ea80d40dbdf..9dbc7183992 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c index b237c7edd5a..5caa2a5443b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/rax1_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c index 
0ff5746d814..14194eef6c4 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4e_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c index 58ad33c5ddb..e72384108e6 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/sm4ekey_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c index 3f928e20eac..75539f6928f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c index 8a35c76b90a..c0d47d0c13f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_f64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c index bd600268228..80fb3e8695b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c index 0bfa2616ef5..edd2bc41832 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c index fbfa008c1d5..a6e5059def9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c index c283135c4ec..067e5b109c3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c index bf6ba597362..498fe82e5c2 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c index a24d0c89c76..614f5fb1a49 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c index 2b05a7720bd..ce2c482afbd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c index a13c5f5bb9d..593dc193975 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1b_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c index 4e012f61f34..b9d06c1c5ab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c index e934a708d89..006e0e24dec 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c index db21821eb58..8cd7cb86ab3 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u32.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c index 53f930da1fc..972ee36896b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1h_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c index ec6c837d907..368a17c4769 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_s64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ #include "test_sve_acle.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c index 3c5d96de4f8..57d60a350de 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/stnt1w_scatter_u64.c @@ -1,3 +1,4 @@ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ /* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ #include "test_sve_acle.h"

From patchwork Sun Nov 13 10:00:58 2022
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 60513
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 05/16] aarch64: Switch PSTATE.SM around calls
Date: Sun, 13 Nov 2022 10:00:58 +0000
From: Richard Sandiford

This patch adds support for switching to the appropriate SME mode for
each call.  Switching to streaming mode requires an SMSTART SM
instruction and switching to non-streaming mode requires an SMSTOP SM
instruction.  If the call is being made from streaming-compatible code,
these switches are conditional on the current mode being the opposite
of the one that the call needs.

Since changing PSTATE.SM changes the vector length and effectively
changes the ISA, the code to do the switching has to be emitted late.
The patch does this using a new pass that runs next to late
prologue/epilogue insertion.  (It doesn't use md_reorg because later
additions need the CFG.)
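As a concrete illustration, here is a minimal sketch of the kind of
source code that forces these switches.  It assumes the arm_streaming
and arm_streaming_compatible type attributes added by an earlier patch
in this series, and the instruction sequences in the comments show what
the new pass conceptually emits rather than exact compiler output:

/* Sketch only: the attribute spellings are the ones added earlier in
   this series; the asm in the comments is illustrative.  */

void sm_callee (void) __attribute__ ((arm_streaming));

/* Non-streaming caller: PSTATE.SM is known to be 0 here, so the call
   needs an unconditional switch into streaming mode and back:

	smstart	sm
	bl	sm_callee
	smstop	sm  */

void
n_caller (void)
{
  sm_callee ();
}

/* Streaming-compatible caller: PSTATE.SM is unknown at compile time,
   so the prologue saves the incoming SVCR value and each switch is
   guarded by a test of bit 0 of that saved value, e.g.:

	tbnz	x16, #0, 1f	// skip if already in streaming mode
	smstart	sm
1:	bl	sm_callee
	...			// matching guarded smstop sm afterwards  */

void sc_caller (void) __attribute__ ((arm_streaming_compatible));

void
sc_caller (void)
{
  sm_callee ();
}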
If a streaming-compatible function needs to switch mode for a call, it must restore the original mode afterwards. The old mode must therefore be available immediately after the call. The easiest way of ensuring this is to force the use of a hard frame pointer and ensure that the old state is saved at an in-range offset from there. Changing modes clobbers the Z and P registers, so we need to save and restore live Z and P state around each mode switch. However, mode switches are not expected to be performance critical, so it seemed better to err on the side of being correct rather than trying to optimise the save and restore with surrounding code. gcc/ * config/aarch64/aarch64-passes.def (pass_late_thread_prologue_and_epilogue): New pass. * config/aarch64/aarch64-sme.md: New file. * config/aarch64/aarch64.md: Include it. (*tb1): Rename to... (@aarch64_tb): ...this. (call, call_value, sibcall, sibcall_value): Don't require operand 2 to be a CONST_INT. * config/aarch64/aarch64-protos.h (aarch64_emit_call_insn): Return the insn. (make_pass_switch_sm_state): Declare. * config/aarch64/aarch64.h (TARGET_STREAMING_COMPATIBLE): New macro. (TARGET_SME): Likewise. (aarch64_frame::old_svcr_offset): New member variable. (machine_function::call_switches_sm_state): Likewise. (CUMULATIVE_ARGS::num_sme_mode_switch_args): Likewise. (CUMULATIVE_ARGS::sme_mode_switch_args): Likewise. * config/aarch64/aarch64.cc: Include tree-pass.h and cfgbuild.h. (aarch64_cfun_incoming_sm_state): New function. (aarch64_call_switches_sm_state): Likewise. (aarch64_callee_isa_mode): Likewise. (aarch64_insn_callee_isa_mode): Likewise. (aarch64_guard_switch_pstate_sm): Likewise. (aarch64_switch_pstate_sm): Likewise. (aarch64_sme_mode_switch_regs): New class. (aarch64_record_sme_mode_switch_args): New function. (aarch64_finish_sme_mode_switch_args): Likewise. (aarch64_function_arg): Handle the end marker by returning a PARALLEL that contains the ABI cookie that we used previously alongside the result of aarch64_finish_sme_mode_switch_args. (aarch64_init_cumulative_args): Initialize num_sme_mode_switch_args. (aarch64_function_arg_advance): If a call would switch SM state, record all argument registers that would need to be saved around the mode switch. (aarch64_need_old_pstate_sm): New function. (aarch64_layout_frame): Decide whether the frame needs to store the incoming value of PSTATE.SM and allocate a save slot for it if so. (aarch64_old_svcr_mem): New function. (aarch64_read_old_svcr): Likewise. (aarch64_guard_switch_pstate_sm): Likewise. (aarch64_expand_prologue): Initialize any SVCR save slot. (aarch64_expand_call): Allow the cookie to be PARALLEL that contains both the UNSPEC_CALLEE_ABI value and a list of registers that need to be preserved across a change to PSTATE.SM. If the call does involve such a change to PSTATE.SM, record the registers that would be clobbered by this process. Update call_switches_sm_state accordingly. (aarch64_emit_call_insn): Return the emitted instruction. (aarch64_frame_pointer_required): New function. (aarch64_switch_sm_state_for_call): Likewise. (pass_data_switch_sm_state): New pass variable. (pass_switch_sm_state): New pass class. (make_pass_switch_sm_state): New function. (TARGET_FRAME_POINTER_REQUIRED): Define. * config/aarch64/t-aarch64 (s-check-sve-md): Add aarch64-sme.md. gcc/testsuite/ * gcc.target/aarch64/sme/call_sm_switch_1.c: New test. * gcc.target/aarch64/sme/call_sm_switch_2.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_3.c: Likewise. 
* gcc.target/aarch64/sme/call_sm_switch_4.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_5.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_6.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_7.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_8.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_9.c: Likewise. * gcc.target/aarch64/sme/call_sm_switch_10.c: Likewise. --- gcc/config/aarch64/aarch64-passes.def | 1 + gcc/config/aarch64/aarch64-protos.h | 3 +- gcc/config/aarch64/aarch64-sme.md | 133 +++ gcc/config/aarch64/aarch64.cc | 815 +++++++++++++++++- gcc/config/aarch64/aarch64.h | 25 + gcc/config/aarch64/aarch64.md | 13 +- gcc/config/aarch64/t-aarch64 | 3 +- .../gcc.target/aarch64/sme/call_sm_switch_1.c | 195 +++++ .../aarch64/sme/call_sm_switch_10.c | 37 + .../gcc.target/aarch64/sme/call_sm_switch_2.c | 43 + .../gcc.target/aarch64/sme/call_sm_switch_3.c | 156 ++++ .../gcc.target/aarch64/sme/call_sm_switch_4.c | 43 + .../gcc.target/aarch64/sme/call_sm_switch_5.c | 308 +++++++ .../gcc.target/aarch64/sme/call_sm_switch_6.c | 45 + .../gcc.target/aarch64/sme/call_sm_switch_7.c | 516 +++++++++++ .../gcc.target/aarch64/sme/call_sm_switch_8.c | 87 ++ .../gcc.target/aarch64/sme/call_sm_switch_9.c | 103 +++ 17 files changed, 2512 insertions(+), 14 deletions(-) create mode 100644 gcc/config/aarch64/aarch64-sme.md create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c diff --git a/gcc/config/aarch64/aarch64-passes.def b/gcc/config/aarch64/aarch64-passes.def index a2babc112c3..0bd558001e4 100644 --- a/gcc/config/aarch64/aarch64-passes.def +++ b/gcc/config/aarch64/aarch64-passes.def @@ -20,6 +20,7 @@ INSERT_PASS_AFTER (pass_regrename, 1, pass_fma_steering); INSERT_PASS_BEFORE (pass_reorder_blocks, 1, pass_track_speculation); +INSERT_PASS_BEFORE (pass_late_thread_prologue_and_epilogue, 1, pass_switch_sm_state); INSERT_PASS_AFTER (pass_machine_reorg, 1, pass_tag_collision_avoidance); INSERT_PASS_BEFORE (pass_shorten_branches, 1, pass_insert_bti); INSERT_PASS_AFTER (pass_if_after_combine, 1, pass_cc_fusion); diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 06b926b42d6..0f686fba4bd 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -910,7 +910,7 @@ void aarch64_sve_expand_vector_init (rtx, rtx); void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx, const_tree, unsigned, bool = false); void aarch64_init_expanders (void); -void aarch64_emit_call_insn (rtx); +rtx_insn *aarch64_emit_call_insn (rtx); void aarch64_register_pragmas (void); void aarch64_relayout_simd_types (void); void aarch64_reset_previous_fndecl (void); @@ -1051,6 +1051,7 @@ rtl_opt_pass *make_pass_track_speculation (gcc::context *); rtl_opt_pass *make_pass_tag_collision_avoidance (gcc::context *); rtl_opt_pass *make_pass_insert_bti 
(gcc::context *ctxt); rtl_opt_pass *make_pass_cc_fusion (gcc::context *ctxt); +rtl_opt_pass *make_pass_switch_sm_state (gcc::context *ctxt); poly_uint64 aarch64_regmode_natural_size (machine_mode); diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md new file mode 100644 index 00000000000..88f1526fa34 --- /dev/null +++ b/gcc/config/aarch64/aarch64-sme.md @@ -0,0 +1,133 @@ +;; Machine description for AArch64 SME. +;; Copyright (C) 2022 Free Software Foundation, Inc. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . + +;; The file is organised into the following sections (search for the full +;; line): +;; +;; == State management +;; ---- Test current state +;; ---- PSTATE.SM management + +;; ========================================================================= +;; == State management +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Test current state +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_GET_SME_STATE + UNSPEC_READ_SVCR +]) + +(define_insn "aarch64_get_sme_state" + [(set (reg:TI R0_REGNUM) + (unspec_volatile:TI [(const_int 0)] UNSPEC_GET_SME_STATE)) + (clobber (reg:DI R16_REGNUM)) + (clobber (reg:DI R17_REGNUM)) + (clobber (reg:DI R18_REGNUM)) + (clobber (reg:DI R30_REGNUM)) + (clobber (reg:CC CC_REGNUM))] + "" + "bl\t__arm_sme_state" +) + +(define_insn "aarch64_read_svcr" + [(set (match_operand:DI 0 "register_operand" "=r") + (unspec_volatile:DI [(const_int 0)] UNSPEC_READ_SVCR))] + "TARGET_SME" + "mrs\t%0, svcr" +) + +;; ------------------------------------------------------------------------- +;; ---- PSTATE.SM management +;; ------------------------------------------------------------------------- +;; Includes +;; - SMSTART SM +;; - SMSTOP SM +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SMSTART_SM + UNSPEC_SMSTOP_SM +]) + +;; Doesn't depend on a TARGET_* since (a) the instruction is always +;; emitted under direct control of aarch64 code and (b) it is sometimes +;; used conditionally. 
+(define_insn "aarch64_smstart_sm" + [(unspec_volatile [(const_int 0)] UNSPEC_SMSTART_SM) + (clobber (reg:V4x16QI V0_REGNUM)) + (clobber (reg:V4x16QI V4_REGNUM)) + (clobber (reg:V4x16QI V8_REGNUM)) + (clobber (reg:V4x16QI V12_REGNUM)) + (clobber (reg:V4x16QI V16_REGNUM)) + (clobber (reg:V4x16QI V20_REGNUM)) + (clobber (reg:V4x16QI V24_REGNUM)) + (clobber (reg:V4x16QI V28_REGNUM)) + (clobber (reg:VNx16BI P0_REGNUM)) + (clobber (reg:VNx16BI P1_REGNUM)) + (clobber (reg:VNx16BI P2_REGNUM)) + (clobber (reg:VNx16BI P3_REGNUM)) + (clobber (reg:VNx16BI P4_REGNUM)) + (clobber (reg:VNx16BI P5_REGNUM)) + (clobber (reg:VNx16BI P6_REGNUM)) + (clobber (reg:VNx16BI P7_REGNUM)) + (clobber (reg:VNx16BI P8_REGNUM)) + (clobber (reg:VNx16BI P9_REGNUM)) + (clobber (reg:VNx16BI P10_REGNUM)) + (clobber (reg:VNx16BI P11_REGNUM)) + (clobber (reg:VNx16BI P12_REGNUM)) + (clobber (reg:VNx16BI P13_REGNUM)) + (clobber (reg:VNx16BI P14_REGNUM)) + (clobber (reg:VNx16BI P15_REGNUM))] + "" + "smstart\tsm" +) + +(define_insn "aarch64_smstop_sm" + [(unspec_volatile [(const_int 0)] UNSPEC_SMSTOP_SM) + (clobber (reg:V4x16QI V0_REGNUM)) + (clobber (reg:V4x16QI V4_REGNUM)) + (clobber (reg:V4x16QI V8_REGNUM)) + (clobber (reg:V4x16QI V12_REGNUM)) + (clobber (reg:V4x16QI V16_REGNUM)) + (clobber (reg:V4x16QI V20_REGNUM)) + (clobber (reg:V4x16QI V24_REGNUM)) + (clobber (reg:V4x16QI V28_REGNUM)) + (clobber (reg:VNx16BI P0_REGNUM)) + (clobber (reg:VNx16BI P1_REGNUM)) + (clobber (reg:VNx16BI P2_REGNUM)) + (clobber (reg:VNx16BI P3_REGNUM)) + (clobber (reg:VNx16BI P4_REGNUM)) + (clobber (reg:VNx16BI P5_REGNUM)) + (clobber (reg:VNx16BI P6_REGNUM)) + (clobber (reg:VNx16BI P7_REGNUM)) + (clobber (reg:VNx16BI P8_REGNUM)) + (clobber (reg:VNx16BI P9_REGNUM)) + (clobber (reg:VNx16BI P10_REGNUM)) + (clobber (reg:VNx16BI P11_REGNUM)) + (clobber (reg:VNx16BI P12_REGNUM)) + (clobber (reg:VNx16BI P13_REGNUM)) + (clobber (reg:VNx16BI P14_REGNUM)) + (clobber (reg:VNx16BI P15_REGNUM))] + "" + "smstop\tsm" +) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 36ef0435b4e..d8310eb8597 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -82,6 +82,8 @@ #include "tree-dfa.h" #include "asan.h" #include "aarch64-feature-deps.h" +#include "tree-pass.h" +#include "cfgbuild.h" /* This file should be included last. */ #include "target-def.h" @@ -4103,6 +4105,26 @@ aarch64_fndecl_isa_mode (const_tree fndecl) return aarch64_fndecl_sm_state (fndecl); } +/* Return the state of PSTATE.SM on entry to the current function. + This might be different from the state of PSTATE.SM in the function + body. */ + +static aarch64_feature_flags +aarch64_cfun_incoming_sm_state () +{ + return aarch64_fntype_sm_state (TREE_TYPE (cfun->decl)); +} + +/* Return true if a call from the current function to a function with + ISA mode CALLEE_MODE would involve a change to PSTATE.SM around + the BL instruction. */ + +static bool +aarch64_call_switches_sm_state (aarch64_feature_flags callee_mode) +{ + return (callee_mode & ~AARCH64_ISA_MODE & AARCH64_FL_SM_STATE) != 0; +} + /* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P. */ static bool @@ -4185,6 +4207,16 @@ aarch64_callee_abi (rtx cookie) return function_abis[UINTVAL (cookie) >> AARCH64_NUM_ISA_MODES]; } +/* COOKIE is a CONST_INT from an UNSPEC_CALLEE_ABI rtx. Return the + required ISA mode on entry to the callee, which is also the ISA + mode on return from the callee. 
*/ + +static aarch64_feature_flags +aarch64_callee_isa_mode (rtx cookie) +{ + return UINTVAL (cookie) & AARCH64_FL_ISA_MODES; +} + /* INSN is a call instruction. Return the CONST_INT stored in its UNSPEC_CALLEE_ABI rtx. */ @@ -4207,6 +4239,15 @@ aarch64_insn_callee_abi (const rtx_insn *insn) return aarch64_callee_abi (aarch64_insn_callee_cookie (insn)); } +/* INSN is a call instruction. Return the required ISA mode on entry to + the callee, which is also the ISA mode on return from the callee. */ + +static aarch64_feature_flags +aarch64_insn_callee_isa_mode (const rtx_insn *insn) +{ + return aarch64_callee_isa_mode (aarch64_insn_callee_cookie (insn)); +} + /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED. The callee only saves the lower 64 bits of a 128-bit register. Tell the compiler the callee clobbers the top 64 bits when restoring the bottom 64 bits. */ @@ -6394,6 +6435,428 @@ aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, bool frame_related_p, temp1, temp2, frame_related_p, emit_move_imm); } +/* A streaming-compatible function needs to switch temporarily to the known + PSTATE.SM mode described by LOCAL_MODE. The low bit of OLD_SVCR contains + the runtime state of PSTATE.SM in the streaming-compatible code, before + the start of the switch to LOCAL_MODE. + + Emit instructions to branch around the mode switch if PSTATE.SM already + matches LOCAL_MODE. Return the label that the branch jumps to. */ + +static rtx_insn * +aarch64_guard_switch_pstate_sm (rtx old_svcr, aarch64_feature_flags local_mode) +{ + local_mode &= AARCH64_FL_SM_STATE; + gcc_assert (local_mode != 0); + auto already_ok_cond = (local_mode & AARCH64_FL_SM_ON ? NE : EQ); + auto *label = gen_label_rtx (); + auto *jump = emit_jump_insn (gen_aarch64_tb (already_ok_cond, DImode, + old_svcr, const0_rtx, label)); + JUMP_LABEL (jump) = label; + return label; +} + +/* Emit code to switch from the PSTATE.SM state in OLD_MODE to the PSTATE.SM + state in NEW_MODE. This is known to involve either an SMSTART SM or + an SMSTOP SM. */ + +static void +aarch64_switch_pstate_sm (aarch64_feature_flags old_mode, + aarch64_feature_flags new_mode) +{ + old_mode &= AARCH64_FL_SM_STATE; + new_mode &= AARCH64_FL_SM_STATE; + gcc_assert (old_mode != new_mode); + + if ((new_mode & AARCH64_FL_SM_ON) + || (new_mode == 0 && (old_mode & AARCH64_FL_SM_OFF))) + emit_insn (gen_aarch64_smstart_sm ()); + else + emit_insn (gen_aarch64_smstop_sm ()); +} + +/* As a side-effect, SMSTART SM and SMSTOP SM clobber the contents of all + FP and predicate registers. This class emits code to preserve any + necessary registers around the mode switch. + + The class uses four approaches to saving and restoring contents, enumerated + by group_type: + + - GPR: save and restore the contents of FP registers using GPRs. + This is used if the FP register contains no more than 64 significant + bits. The registers used are FIRST_GPR onwards. + + - MEM_128: save and restore 128-bit SIMD registers using memory. + + - MEM_SVE_PRED: save and restore full SVE predicate registers using memory. + + - MEM_SVE_DATA: save and restore full SVE vector registers using memory. + + The save slots within each memory group are consecutive, with the + MEM_SVE_PRED slots occupying a region below the MEM_SVE_DATA slots. + + There will only be two mode switches for each use of SME, so they should + not be particularly performance-sensitive. It's also rare for SIMD, SVE + or predicate registers to be live across mode switches. 
We therefore + don't preallocate the save slots but instead allocate them locally on + demand. This makes the code emitted by the class self-contained. */ + +class aarch64_sme_mode_switch_regs +{ +public: + static const unsigned int FIRST_GPR = R10_REGNUM; + + void add_reg (machine_mode, unsigned int); + void add_call_args (rtx_call_insn *); + void add_call_result (rtx_call_insn *); + + void emit_prologue (); + void emit_epilogue (); + + /* The number of GPRs needed to save FP registers, starting from + FIRST_GPR. */ + unsigned int num_gprs () { return m_group_count[GPR]; } + +private: + enum sequence { PROLOGUE, EPILOGUE }; + enum group_type { GPR, MEM_128, MEM_SVE_PRED, MEM_SVE_DATA, NUM_GROUPS }; + + /* Information about the save location for one FP, SIMD, SVE data, or + SVE predicate register. */ + struct save_location { + /* The register to be saved. */ + rtx reg; + + /* Which group the save location belongs to. */ + group_type group; + + /* A zero-based index of the register within the group. */ + unsigned int index; + }; + + unsigned int sve_data_headroom (); + rtx get_slot_mem (machine_mode, poly_int64); + void emit_stack_adjust (sequence, poly_int64); + void emit_mem_move (sequence, const save_location &, poly_int64); + + void emit_gpr_moves (sequence); + void emit_mem_128_moves (sequence); + void emit_sve_sp_adjust (sequence); + void emit_sve_pred_moves (sequence); + void emit_sve_data_moves (sequence); + + /* All save locations, in no particular order. */ + auto_vec m_save_locations; + + /* The number of registers in each group. */ + unsigned int m_group_count[NUM_GROUPS] = {}; +}; + +/* Record that (reg:MODE REGNO) needs to be preserved around the mode + switch. */ + +void +aarch64_sme_mode_switch_regs::add_reg (machine_mode mode, unsigned int regno) +{ + if (!FP_REGNUM_P (regno) && !PR_REGNUM_P (regno)) + return; + + unsigned int end_regno = end_hard_regno (mode, regno); + unsigned int vec_flags = aarch64_classify_vector_mode (mode); + gcc_assert ((vec_flags & VEC_STRUCT) || end_regno == regno + 1); + for (; regno < end_regno; regno++) + { + machine_mode submode = mode; + if (vec_flags & VEC_STRUCT) + { + if (vec_flags & VEC_SVE_DATA) + submode = SVE_BYTE_MODE; + else if (vec_flags & VEC_PARTIAL) + submode = V8QImode; + else + submode = V16QImode; + } + save_location loc; + loc.reg = gen_rtx_REG (submode, regno); + if (vec_flags == VEC_SVE_PRED) + { + gcc_assert (PR_REGNUM_P (regno)); + loc.group = MEM_SVE_PRED; + } + else + { + gcc_assert (FP_REGNUM_P (regno)); + if (known_le (GET_MODE_SIZE (submode), 8)) + loc.group = GPR; + else if (known_eq (GET_MODE_SIZE (submode), 16)) + loc.group = MEM_128; + else + loc.group = MEM_SVE_DATA; + } + loc.index = m_group_count[loc.group]++; + m_save_locations.quick_push (loc); + } +} + +/* Record that the arguments to CALL_INSN need to be preserved around + the mode switch. */ + +void +aarch64_sme_mode_switch_regs::add_call_args (rtx_call_insn *call_insn) +{ + for (rtx node = CALL_INSN_FUNCTION_USAGE (call_insn); + node; node = XEXP (node, 1)) + { + rtx item = XEXP (node, 0); + if (GET_CODE (item) != USE) + continue; + item = XEXP (item, 0); + if (!REG_P (item)) + continue; + add_reg (GET_MODE (item), REGNO (item)); + } +} + +/* Record that the return value from CALL_INSN (if any) needs to be + preserved around the mode switch. 
*/ + +void +aarch64_sme_mode_switch_regs::add_call_result (rtx_call_insn *call_insn) +{ + rtx pat = PATTERN (call_insn); + gcc_assert (GET_CODE (pat) == PARALLEL); + pat = XVECEXP (pat, 0, 0); + if (GET_CODE (pat) == CALL) + return; + rtx dest = SET_DEST (pat); + add_reg (GET_MODE (dest), REGNO (dest)); +} + +/* Emit code to save registers before the mode switch. */ + +void +aarch64_sme_mode_switch_regs::emit_prologue () +{ + emit_sve_sp_adjust (PROLOGUE); + emit_sve_pred_moves (PROLOGUE); + emit_sve_data_moves (PROLOGUE); + emit_mem_128_moves (PROLOGUE); + emit_gpr_moves (PROLOGUE); +} + +/* Emit code to restore registers after the mode switch. */ + +void +aarch64_sme_mode_switch_regs::emit_epilogue () +{ + emit_gpr_moves (EPILOGUE); + emit_mem_128_moves (EPILOGUE); + emit_sve_pred_moves (EPILOGUE); + emit_sve_data_moves (EPILOGUE); + emit_sve_sp_adjust (EPILOGUE); +} + +/* The SVE predicate registers are stored below the SVE data registers, + with the predicate save area being padded to a data-register-sized + boundary. Return the size of this padded area as a whole number + of data register slots. */ + +unsigned int +aarch64_sme_mode_switch_regs::sve_data_headroom () +{ + return CEIL (m_group_count[MEM_SVE_PRED], 8); +} + +/* Return a memory reference of mode MODE to OFFSET bytes from the + stack pointer. */ + +rtx +aarch64_sme_mode_switch_regs::get_slot_mem (machine_mode mode, + poly_int64 offset) +{ + rtx addr = plus_constant (Pmode, stack_pointer_rtx, offset); + return gen_rtx_MEM (mode, addr); +} + +/* Allocate or deallocate SIZE bytes of stack space: SEQ decides which. */ + +void +aarch64_sme_mode_switch_regs::emit_stack_adjust (sequence seq, + poly_int64 size) +{ + if (seq == PROLOGUE) + size = -size; + emit_insn (gen_rtx_SET (stack_pointer_rtx, + plus_constant (Pmode, stack_pointer_rtx, size))); +} + +/* Save or restore the register in LOC, whose slot is OFFSET bytes from + the stack pointer. SEQ chooses between saving and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_mem_move (sequence seq, + const save_location &loc, + poly_int64 offset) +{ + rtx mem = get_slot_mem (GET_MODE (loc.reg), offset); + if (seq == PROLOGUE) + emit_move_insn (mem, loc.reg); + else + emit_move_insn (loc.reg, mem); +} + +/* Emit instructions to save or restore the GPR group. SEQ chooses between + saving and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_gpr_moves (sequence seq) +{ + for (auto &loc : m_save_locations) + if (loc.group == GPR) + { + gcc_assert (loc.index < 8); + rtx gpr = gen_rtx_REG (GET_MODE (loc.reg), FIRST_GPR + loc.index); + if (seq == PROLOGUE) + emit_move_insn (gpr, loc.reg); + else + emit_move_insn (loc.reg, gpr); + } +} + +/* Emit instructions to save or restore the MEM_128 group. SEQ chooses + between saving and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_mem_128_moves (sequence seq) +{ + HOST_WIDE_INT count = m_group_count[MEM_128]; + if (count == 0) + return; + + auto sp = stack_pointer_rtx; + auto sp_adjust = (seq == PROLOGUE ? -count : count) * 16; + + /* Pick a common mode that supports LDR & STR with pre/post-modification + and LDP & STP with pre/post-modification. */ + auto mode = TFmode; + + /* An instruction pattern that should be emitted at the end. */ + rtx last_pat = NULL_RTX; + + /* A previous MEM_128 location that hasn't been handled yet. */ + save_location *prev_loc = nullptr; + + /* Look for LDP/STPs and record any leftover LDR/STR in PREV_LOC. 
*/ + for (auto &loc : m_save_locations) + if (loc.group == MEM_128) + { + if (!prev_loc) + { + prev_loc = &loc; + continue; + } + gcc_assert (loc.index == prev_loc->index + 1); + + /* The offset of the base of the save area from the current + stack pointer. */ + HOST_WIDE_INT bias = 0; + if (prev_loc->index == 0 && seq == PROLOGUE) + bias = sp_adjust; + + /* Get the two sets in the LDP/STP. */ + rtx ops[] = { + gen_rtx_REG (mode, REGNO (prev_loc->reg)), + get_slot_mem (mode, prev_loc->index * 16 + bias), + gen_rtx_REG (mode, REGNO (loc.reg)), + get_slot_mem (mode, loc.index * 16 + bias) + }; + unsigned int lhs = (seq == PROLOGUE); + rtx set1 = gen_rtx_SET (ops[lhs], ops[1 - lhs]); + rtx set2 = gen_rtx_SET (ops[lhs + 2], ops[3 - lhs]); + + /* Combine the sets with any stack allocation/deallocation. */ + rtvec vec; + if (prev_loc->index == 0) + { + rtx plus_sp = plus_constant (Pmode, sp, sp_adjust); + vec = gen_rtvec (3, gen_rtx_SET (sp, plus_sp), set1, set2); + } + else + vec = gen_rtvec (2, set1, set2); + rtx pat = gen_rtx_PARALLEL (VOIDmode, vec); + + /* Queue a deallocation to the end, otherwise emit the + instruction now. */ + if (seq == EPILOGUE && prev_loc->index == 0) + last_pat = pat; + else + emit_insn (pat); + prev_loc = nullptr; + } + + /* Handle any leftover LDR/STR. */ + if (prev_loc) + { + rtx reg = gen_rtx_REG (mode, REGNO (prev_loc->reg)); + rtx addr; + if (prev_loc->index != 0) + addr = plus_constant (Pmode, sp, prev_loc->index * 16); + else if (seq == PROLOGUE) + { + rtx allocate = plus_constant (Pmode, sp, -count * 16); + addr = gen_rtx_PRE_MODIFY (Pmode, sp, allocate); + } + else + { + rtx deallocate = plus_constant (Pmode, sp, count * 16); + addr = gen_rtx_POST_MODIFY (Pmode, sp, deallocate); + } + rtx mem = gen_rtx_MEM (mode, addr); + if (seq == PROLOGUE) + emit_move_insn (mem, reg); + else + emit_move_insn (reg, mem); + } + + if (last_pat) + emit_insn (last_pat); +} + +/* Allocate or deallocate the stack space needed by the SVE groups. + SEQ chooses between allocating and deallocating. */ + +void +aarch64_sme_mode_switch_regs::emit_sve_sp_adjust (sequence seq) +{ + if (unsigned int count = m_group_count[MEM_SVE_DATA] + sve_data_headroom ()) + emit_stack_adjust (seq, count * BYTES_PER_SVE_VECTOR); +} + +/* Save or restore the MEM_SVE_DATA group. SEQ chooses between saving + and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_sve_data_moves (sequence seq) +{ + for (auto &loc : m_save_locations) + if (loc.group == MEM_SVE_DATA) + { + auto index = loc.index + sve_data_headroom (); + emit_mem_move (seq, loc, index * BYTES_PER_SVE_VECTOR); + } +} + +/* Save or restore the MEM_SVE_PRED group. SEQ chooses between saving + and restoring. */ + +void +aarch64_sme_mode_switch_regs::emit_sve_pred_moves (sequence seq) +{ + for (auto &loc : m_save_locations) + if (loc.group == MEM_SVE_PRED) + emit_mem_move (seq, loc, loc.index * BYTES_PER_SVE_PRED); +} + /* Set DEST to (vec_series BASE STEP). */ static void @@ -7934,6 +8397,40 @@ on_stack: return; } +/* Add the current argument register to the set of those that need + to be saved and restored around a change to PSTATE.SM. 
*/ + +static void +aarch64_record_sme_mode_switch_args (CUMULATIVE_ARGS *pcum) +{ + subrtx_var_iterator::array_type array; + FOR_EACH_SUBRTX_VAR (iter, array, pcum->aapcs_reg, NONCONST) + { + rtx x = *iter; + if (REG_P (x) && (FP_REGNUM_P (REGNO (x)) || PR_REGNUM_P (REGNO (x)))) + { + unsigned int i = pcum->num_sme_mode_switch_args++; + gcc_assert (i < ARRAY_SIZE (pcum->sme_mode_switch_args)); + pcum->sme_mode_switch_args[i] = x; + } + } +} + +/* Return a parallel that contains all the registers that need to be + saved around a change to PSTATE.SM. Return const0_rtx if there is + no such mode switch, or if no registers need to be saved. */ + +static rtx +aarch64_finish_sme_mode_switch_args (CUMULATIVE_ARGS *pcum) +{ + if (!pcum->num_sme_mode_switch_args) + return const0_rtx; + + auto argvec = gen_rtvec_v (pcum->num_sme_mode_switch_args, + pcum->sme_mode_switch_args); + return gen_rtx_PARALLEL (VOIDmode, argvec); +} + /* Implement TARGET_FUNCTION_ARG. */ static rtx @@ -7945,7 +8442,13 @@ aarch64_function_arg (cumulative_args_t pcum_v, const function_arg_info &arg) || pcum->pcs_variant == ARM_PCS_SVE); if (arg.end_marker_p ()) - return aarch64_gen_callee_cookie (pcum->isa_mode, pcum->pcs_variant); + { + rtx abi_cookie = aarch64_gen_callee_cookie (pcum->isa_mode, + pcum->pcs_variant); + rtx sme_mode_switch_args = aarch64_finish_sme_mode_switch_args (pcum); + return gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, abi_cookie, + sme_mode_switch_args)); + } aarch64_layout_arg (pcum_v, arg); return pcum->aapcs_reg; @@ -7980,6 +8483,7 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_stack_words = 0; pcum->aapcs_stack_size = 0; pcum->silent_p = silent_p; + pcum->num_sme_mode_switch_args = 0; if (!silent_p && !TARGET_FLOAT @@ -8020,6 +8524,10 @@ aarch64_function_arg_advance (cumulative_args_t pcum_v, aarch64_layout_arg (pcum_v, arg); gcc_assert ((pcum->aapcs_reg != NULL_RTX) != (pcum->aapcs_stack_words != 0)); + if (pcum->aapcs_reg + && aarch64_call_switches_sm_state (pcum->isa_mode)) + aarch64_record_sme_mode_switch_args (pcum); + pcum->aapcs_arg_processed = false; pcum->aapcs_ncrn = pcum->aapcs_nextncrn; pcum->aapcs_nvrn = pcum->aapcs_nextnvrn; @@ -8457,6 +8965,30 @@ aarch64_needs_frame_chain (void) return aarch64_use_frame_pointer; } +/* Return true if the current function needs to record the incoming + value of PSTATE.SM. */ +static bool +aarch64_need_old_pstate_sm () +{ + /* Exit early if the incoming value of PSTATE.SM is known at + compile time. */ + if (aarch64_cfun_incoming_sm_state () != 0) + return false; + + if (cfun->machine->call_switches_sm_state) + for (auto insn = get_insns (); insn; insn = NEXT_INSN (insn)) + if (auto *call = dyn_cast (insn)) + if (!SIBLING_CALL_P (call)) + { + /* Return true if there is call to a non-streaming-compatible + function. */ + auto callee_isa_mode = aarch64_insn_callee_isa_mode (call); + if (aarch64_call_switches_sm_state (callee_isa_mode)) + return true; + } + return false; +} + /* Mark the registers that need to be saved by the callee and calculate the size of the callee-saved registers area and frame record (both FP and LR may be omitted). */ @@ -8486,6 +9018,7 @@ aarch64_layout_frame (void) /* First mark all the registers that really need to be saved... */ for (regno = 0; regno <= LAST_SAVED_REGNUM; regno++) frame.reg_offset[regno] = SLOT_NOT_REQUIRED; + frame.old_svcr_offset = SLOT_NOT_REQUIRED; /* ... that includes the eh data registers (if needed)... 
*/ if (crtl->calls_eh_return) @@ -8612,6 +9145,12 @@ aarch64_layout_frame (void) offset += UNITS_PER_WORD; } + if (aarch64_need_old_pstate_sm ()) + { + frame.old_svcr_offset = offset; + offset += UNITS_PER_WORD; + } + poly_int64 max_int_offset = offset; offset = aligned_upper_bound (offset, STACK_BOUNDARY / BITS_PER_UNIT); bool has_align_gap = maybe_ne (offset, max_int_offset); @@ -9908,6 +10447,48 @@ aarch64_epilogue_uses (int regno) return 0; } +/* The current function's frame has a save slot for the incoming state + of SVCR. Return a legitimate memory for the slot, based on the hard + frame pointer. */ + +static rtx +aarch64_old_svcr_mem () +{ + gcc_assert (frame_pointer_needed + && known_ge (cfun->machine->frame.old_svcr_offset, 0)); + rtx base = hard_frame_pointer_rtx; + poly_int64 offset = (/* hard fp -> top of frame. */ + cfun->machine->frame.hard_fp_offset + /* top of frame -> bottom of frame. */ + - cfun->machine->frame.frame_size + /* bottom of frame -> save slot. */ + + cfun->machine->frame.old_svcr_offset); + return gen_frame_mem (DImode, plus_constant (Pmode, base, offset)); +} + +/* The current function's frame has a save slot for the incoming state + of SVCR. Load the slot into register REGNO and return the register. */ + +static rtx +aarch64_read_old_svcr (unsigned int regno) +{ + rtx svcr = gen_rtx_REG (DImode, regno); + emit_move_insn (svcr, aarch64_old_svcr_mem ()); + return svcr; +} + +/* Like the rtx version of aarch64_guard_switch_pstate_sm, but first + load the incoming value of SVCR from its save slot into temporary + register REGNO. */ + +static rtx_insn * +aarch64_guard_switch_pstate_sm (unsigned int regno, + aarch64_feature_flags local_mode) +{ + rtx old_svcr = aarch64_read_old_svcr (regno); + return aarch64_guard_switch_pstate_sm (old_svcr, local_mode); +} + /* AArch64 stack frames generated by this compiler look like: +-------------------------------+ @@ -10141,6 +10722,40 @@ aarch64_expand_prologue (void) that is assumed by the called. */ aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, final_adjust, !frame_pointer_needed, true); + + /* Save the incoming value of PSTATE.SM, if required. */ + if (known_ge (cfun->machine->frame.old_svcr_offset, 0)) + { + rtx mem = aarch64_old_svcr_mem (); + MEM_VOLATILE_P (mem) = 1; + if (TARGET_SME) + { + rtx reg = gen_rtx_REG (DImode, IP0_REGNUM); + emit_insn (gen_aarch64_read_svcr (reg)); + emit_move_insn (mem, reg); + } + else + { + rtx old_r0 = NULL_RTX, old_r1 = NULL_RTX; + auto &args = crtl->args.info; + if (args.aapcs_ncrn > 0) + { + old_r0 = gen_rtx_REG (DImode, PROBE_STACK_FIRST_REGNUM); + emit_move_insn (old_r0, gen_rtx_REG (DImode, R0_REGNUM)); + } + if (args.aapcs_ncrn > 1) + { + old_r1 = gen_rtx_REG (DImode, PROBE_STACK_SECOND_REGNUM); + emit_move_insn (old_r1, gen_rtx_REG (DImode, R1_REGNUM)); + } + emit_insn (gen_aarch64_get_sme_state ()); + emit_move_insn (mem, gen_rtx_REG (DImode, R0_REGNUM)); + if (old_r0) + emit_move_insn (gen_rtx_REG (DImode, R0_REGNUM), old_r0); + if (old_r1) + emit_move_insn (gen_rtx_REG (DImode, R1_REGNUM), old_r1); + } + } } /* Return TRUE if we can use a simple_return insn. @@ -11395,17 +12010,33 @@ aarch64_start_call_args (cumulative_args_t ca_v) RESULT is the register in which the result is returned. It's NULL for "call" and "sibcall". MEM is the location of the function call. - CALLEE_ABI is a const_int that gives the arm_pcs of the callee. + COOKIE is either: + - a const_int that gives the argument to the call's UNSPEC_CALLEE_ABI. 
+ - a PARALLEL that contains such a const_int as its first element. + The second element is a PARALLEL that lists all the argument + registers that need to be saved and restored around a change + in PSTATE.SM, or const0_rtx if no such switch is needed. SIBCALL indicates whether this function call is normal call or sibling call. It will generate different pattern accordingly. */ void -aarch64_expand_call (rtx result, rtx mem, rtx callee_abi, bool sibcall) +aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall) { rtx call, callee, tmp; rtvec vec; machine_mode mode; + rtx callee_abi = cookie; + rtx sme_mode_switch_args = const0_rtx; + if (GET_CODE (cookie) == PARALLEL) + { + callee_abi = XVECEXP (cookie, 0, 0); + sme_mode_switch_args = XVECEXP (cookie, 0, 1); + } + + gcc_assert (CONST_INT_P (callee_abi)); + auto callee_isa_mode = aarch64_callee_isa_mode (callee_abi); + gcc_assert (MEM_P (mem)); callee = XEXP (mem, 0); mode = GET_MODE (callee); @@ -11430,26 +12061,67 @@ aarch64_expand_call (rtx result, rtx mem, rtx callee_abi, bool sibcall) else tmp = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (Pmode, LR_REGNUM)); - gcc_assert (CONST_INT_P (callee_abi)); callee_abi = gen_rtx_UNSPEC (DImode, gen_rtvec (1, callee_abi), UNSPEC_CALLEE_ABI); vec = gen_rtvec (3, call, callee_abi, tmp); call = gen_rtx_PARALLEL (VOIDmode, vec); - aarch64_emit_call_insn (call); + auto call_insn = aarch64_emit_call_insn (call); + + /* Check whether the call requires a change to PSTATE.SM. We can't + emit the instructions to change PSTATE.SM yet, since they involve + a change in vector length and a change in instruction set, which + cannot be represented in RTL. + + For now, just record which registers will be clobbered by the + changes to PSTATE.SM. */ + if (!sibcall && aarch64_call_switches_sm_state (callee_isa_mode)) + { + aarch64_sme_mode_switch_regs args_switch; + if (sme_mode_switch_args != const0_rtx) + { + unsigned int num_args = XVECLEN (sme_mode_switch_args, 0); + for (unsigned int i = 0; i < num_args; ++i) + { + rtx x = XVECEXP (sme_mode_switch_args, 0, i); + args_switch.add_reg (GET_MODE (x), REGNO (x)); + } + } + + aarch64_sme_mode_switch_regs result_switch; + if (result) + result_switch.add_reg (GET_MODE (result), REGNO (result)); + + unsigned int num_gprs = MAX (args_switch.num_gprs (), + result_switch.num_gprs ()); + for (unsigned int i = 0; i < num_gprs; ++i) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (DImode, args_switch.FIRST_GPR + i)); + + for (int regno = V0_REGNUM; regno < V0_REGNUM + 32; regno += 4) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (V4x16QImode, regno)); + + for (int regno = P0_REGNUM; regno < P0_REGNUM + 16; regno += 1) + clobber_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (VNx16BImode, regno)); + + cfun->machine->call_switches_sm_state = true; + } } /* Emit call insn with PAT and do aarch64-specific handling. */ -void +rtx_insn * aarch64_emit_call_insn (rtx pat) { - rtx insn = emit_call_insn (pat); + auto insn = emit_call_insn (pat); rtx *fusage = &CALL_INSN_FUNCTION_USAGE (insn); clobber_reg (fusage, gen_rtx_REG (word_mode, IP0_REGNUM)); clobber_reg (fusage, gen_rtx_REG (word_mode, IP1_REGNUM)); + return insn; } machine_mode @@ -12761,6 +13433,16 @@ aarch64_secondary_memory_needed (machine_mode mode, reg_class_t class1, return false; } +/* Implement TARGET_FRAME_POINTER_REQUIRED. 
*/ + +static bool +aarch64_frame_pointer_required () +{ + /* If the function needs to record the incoming value of PSTATE.SM, + make sure that the slot is accessible from the frame pointer. */ + return aarch64_need_old_pstate_sm (); +} + static bool aarch64_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to) { @@ -27496,6 +28178,122 @@ aarch64_indirect_call_asm (rtx addr) return ""; } +/* If CALL involves a change in PSTATE.SM, emit the instructions needed + to switch to the new mode and the instructions needed to restore the + original mode. Return true if something changed. */ +static bool +aarch64_switch_sm_state_for_call (rtx_call_insn *call) +{ + /* Mode switches for sibling calls are handled via the epilogue. */ + if (SIBLING_CALL_P (call)) + return false; + + auto callee_isa_mode = aarch64_insn_callee_isa_mode (call); + if (!aarch64_call_switches_sm_state (callee_isa_mode)) + return false; + + /* Switch mode before the call, preserving any argument registers + across the switch. */ + start_sequence (); + rtx_insn *args_guard_label = nullptr; + if (TARGET_STREAMING_COMPATIBLE) + args_guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + callee_isa_mode); + aarch64_sme_mode_switch_regs args_switch; + args_switch.add_call_args (call); + args_switch.emit_prologue (); + aarch64_switch_pstate_sm (AARCH64_ISA_MODE, callee_isa_mode); + args_switch.emit_epilogue (); + if (args_guard_label) + emit_label (args_guard_label); + auto args_seq = get_insns (); + end_sequence (); + emit_insn_before (args_seq, call); + + if (find_reg_note (call, REG_NORETURN, NULL_RTX)) + return true; + + /* Switch mode after the call, preserving any return registers across + the switch. */ + start_sequence (); + rtx_insn *return_guard_label = nullptr; + if (TARGET_STREAMING_COMPATIBLE) + return_guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + callee_isa_mode); + aarch64_sme_mode_switch_regs return_switch; + return_switch.add_call_result (call); + return_switch.emit_prologue (); + aarch64_switch_pstate_sm (callee_isa_mode, AARCH64_ISA_MODE); + return_switch.emit_epilogue (); + if (return_guard_label) + emit_label (return_guard_label); + auto result_seq = get_insns (); + end_sequence (); + emit_insn_after (result_seq, call); + return true; +} + +namespace { + +const pass_data pass_data_switch_sm_state = +{ + RTL_PASS, // type + "smstarts", // name + OPTGROUP_NONE, // optinfo_flags + TV_NONE, // tv_id + 0, // properties_required + 0, // properties_provided + 0, // properties_destroyed + 0, // todo_flags_start + TODO_df_finish, // todo_flags_finish +}; + +class pass_switch_sm_state : public rtl_opt_pass +{ +public: + pass_switch_sm_state (gcc::context *ctxt) + : rtl_opt_pass (pass_data_switch_sm_state, ctxt) + {} + + // opt_pass methods: + bool gate (function *) override final; + unsigned int execute (function *) override final; +}; + +bool +pass_switch_sm_state::gate (function *) +{ + return cfun->machine->call_switches_sm_state; +} + +/* Emit any instructions needed to switch PSTATE.SM. 
*/ +unsigned int +pass_switch_sm_state::execute (function *fn) +{ + basic_block bb; + + auto_sbitmap blocks (last_basic_block_for_fn (cfun)); + bitmap_clear (blocks); + FOR_EACH_BB_FN (bb, fn) + { + rtx_insn *insn; + FOR_BB_INSNS (bb, insn) + if (auto *call = dyn_cast (insn)) + if (aarch64_switch_sm_state_for_call (call)) + bitmap_set_bit (blocks, bb->index); + } + find_many_sub_basic_blocks (blocks); + return 0; +} + +} + +rtl_opt_pass * +make_pass_switch_sm_state (gcc::context *ctxt) +{ + return new pass_switch_sm_state (ctxt); +} + /* Target-specific selftests. */ #if CHECKING_P @@ -27683,6 +28481,9 @@ aarch64_run_selftests (void) #undef TARGET_CALLEE_COPIES #define TARGET_CALLEE_COPIES hook_bool_CUMULATIVE_ARGS_arg_info_false +#undef TARGET_FRAME_POINTER_REQUIRED +#define TARGET_FRAME_POINTER_REQUIRED aarch64_frame_pointer_required + #undef TARGET_CAN_ELIMINATE #define TARGET_CAN_ELIMINATE aarch64_can_eliminate diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 8359cf709c1..f23edea35f5 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -255,6 +255,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* The current function is a normal non-streaming function. */ #define TARGET_NON_STREAMING (AARCH64_ISA_SM_OFF) +/* The current function has a streaming-compatible body. */ +#define TARGET_STREAMING_COMPATIBLE \ + ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0) + /* Crypto is an optional extension to AdvSIMD. */ #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO) @@ -304,6 +308,10 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* SVE2 SM4 instructions, enabled through +sve2-sm4. */ #define TARGET_SVE2_SM4 (AARCH64_ISA_SVE2_SM4 && TARGET_NON_STREAMING) +/* SME instructions, enabled through +sme. Note that this does not + imply anything about the state of PSTATE.SM. */ +#define TARGET_SME (AARCH64_ISA_SME) + /* ARMv8.3-A features. */ #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3A) @@ -802,6 +810,13 @@ struct GTY (()) aarch64_frame STACK_BOUNDARY. */ poly_int64 locals_offset; + /* The offset from the base of the frame of a 64-bit slot whose low + bit contains the incoming value of PSTATE.SM. This slot must be + within reach of the hard frame pointer. + + The offset is -1 if such a slot isn't needed. */ + poly_int64 old_svcr_offset; + /* Offset from the base of the frame (incomming SP) to the hard_frame_pointer. This value is always a multiple of STACK_BOUNDARY. */ @@ -884,6 +899,10 @@ typedef struct GTY (()) machine_function /* One entry for each general purpose register. */ rtx call_via[SP_REGNUM]; bool label_is_assembled; + /* True if we've expanded at least one call to a function that changes + PSTATE.SM. This should only be used for saving compile time: false + guarantees that no such mode switch exists. */ + bool call_switches_sm_state; } machine_function; #endif @@ -948,6 +967,12 @@ typedef struct stack arg area so far. */ bool silent_p; /* True if we should act silently, rather than raise an error for invalid calls. */ + + /* A list of registers that need to be saved and restored around a + change to PSTATE.SM. An auto_vec would be more convenient, but those + can't be copied. 
*/ + unsigned int num_sme_mode_switch_args; + rtx sme_mode_switch_args[12]; } CUMULATIVE_ARGS; #endif diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 3dc877ba9fe..991f46fbc80 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -940,7 +940,7 @@ (define_insn "*cb1" (const_int 1)))] ) -(define_insn "*tb1" +(define_insn "@aarch64_tb" [(set (pc) (if_then_else (EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r") (const_int 1) @@ -1027,7 +1027,7 @@ (define_expand "call" [(parallel [(call (match_operand 0 "memory_operand") (match_operand 1 "general_operand")) - (unspec:DI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 2)] UNSPEC_CALLEE_ABI) (clobber (reg:DI LR_REGNUM))])] "" " @@ -1053,7 +1053,7 @@ (define_expand "call_value" [(set (match_operand 0 "") (call (match_operand 1 "memory_operand") (match_operand 2 "general_operand"))) - (unspec:DI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 3)] UNSPEC_CALLEE_ABI) (clobber (reg:DI LR_REGNUM))])] "" " @@ -1080,7 +1080,7 @@ (define_expand "sibcall" [(parallel [(call (match_operand 0 "memory_operand") (match_operand 1 "general_operand")) - (unspec:DI [(match_operand 2 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 2)] UNSPEC_CALLEE_ABI) (return)])] "" { @@ -1094,7 +1094,7 @@ (define_expand "sibcall_value" [(set (match_operand 0 "") (call (match_operand 1 "memory_operand") (match_operand 2 "general_operand"))) - (unspec:DI [(match_operand 3 "const_int_operand")] UNSPEC_CALLEE_ABI) + (unspec:DI [(match_operand 3)] UNSPEC_CALLEE_ABI) (return)])] "" { @@ -7783,3 +7783,6 @@ (define_insn "st64bv0" ;; SVE2. (include "aarch64-sve2.md") + +;; SME and extensions +(include "aarch64-sme.md") diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64 index 47a753c5f1b..c1c8f5c7dae 100644 --- a/gcc/config/aarch64/t-aarch64 +++ b/gcc/config/aarch64/t-aarch64 @@ -186,7 +186,8 @@ MULTILIB_DIRNAMES = $(subst $(comma), ,$(TM_MULTILIB_CONFIG)) insn-conditions.md: s-check-sve-md s-check-sve-md: $(srcdir)/config/aarch64/check-sve-md.awk \ $(srcdir)/config/aarch64/aarch64-sve.md \ - $(srcdir)/config/aarch64/aarch64-sve2.md + $(srcdir)/config/aarch64/aarch64-sve2.md \ + $(srcdir)/config/aarch64/aarch64-sme.md $(AWK) -f $(srcdir)/config/aarch64/check-sve-md.awk \ $(srcdir)/config/aarch64/aarch64-sve.md $(AWK) -f $(srcdir)/config/aarch64/check-sve-md.awk \ diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c new file mode 100644 index 00000000000..b4931c1bc37 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_1.c @@ -0,0 +1,195 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void ns_callee (); +__attribute__((arm_streaming)) void s_callee (); +__attribute__((arm_streaming_compatible)) void sc_callee (); + +struct callbacks { + void (*ns_ptr) (); + __attribute__((arm_streaming)) void (*s_ptr) (); + __attribute__((arm_streaming_compatible)) void (*sc_ptr) (); +}; + +/* +** n_caller: { target lp64 } +** stp (x19|x2[0-8]), x30, \[sp, #?-80\]! 
+** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** mov \1, x0 +** bl ns_callee +** smstart sm +** bl s_callee +** smstop sm +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** blr \2 +** ldr (x[0-9]+), \[\1, #?8\] +** smstart sm +** blr \3 +** smstop sm +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** ldp \1, x30, \[sp\], #?80 +** ret +*/ +void +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** s_caller: { target lp64 } +** stp (x19|x2[0-8]), x30, \[sp, #?-80\]! +** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** mov \1, x0 +** smstop sm +** bl ns_callee +** smstart sm +** bl s_callee +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** smstop sm +** blr \2 +** smstart sm +** ldr (x[0-9]+), \[\1, #?8\] +** blr \3 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** ldp \1, x30, \[sp\], #?80 +** ret +*/ +void __attribute__((arm_streaming)) +s_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** sc_caller_sme: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstop sm +** bl ns_callee +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstart sm +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl s_callee +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** bl sc_callee +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +sc_caller_sme () +{ + ns_callee (); + s_callee (); + sc_callee (); +} + +#pragma GCC target "+nosme" + +/* +** sc_caller: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** bl __arm_sme_state +** str x0, \[x29, #?16\] +** ... +** bl sc_callee +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void __attribute__((arm_streaming_compatible)) +sc_caller () +{ + ns_callee (); + sc_callee (); +} + +/* +** sc_caller_x0: +** ... +** mov x10, x0 +** bl __arm_sme_state +** ... +** str wzr, \[x10\] +** ... +*/ +void __attribute__((arm_streaming_compatible)) +sc_caller_x0 (int *ptr) +{ + *ptr = 0; + ns_callee (); + sc_callee (); +} + +/* +** sc_caller_x1: +** ... +** mov x10, x0 +** mov x11, x1 +** bl __arm_sme_state +** ... +** str w11, \[x10\] +** ... 
+*/ +void __attribute__((arm_streaming_compatible)) +sc_caller_x1 (int *ptr, int a) +{ + *ptr = a; + ns_callee (); + sc_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c new file mode 100644 index 00000000000..f70378541a4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_10.c @@ -0,0 +1,37 @@ +// { dg-options "" } + +#pragma GCC target "+nosme" + +void ns_callee (); +__attribute__((arm_streaming)) void s_callee (); +__attribute__((arm_streaming_compatible)) void sc_callee (); + +struct callbacks { + void (*ns_ptr) (); + __attribute__((arm_streaming)) void (*s_ptr) (); + __attribute__((arm_streaming_compatible)) void (*sc_ptr) (); +}; + +void +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + c->sc_ptr (); +} + +void __attribute__((arm_streaming_compatible)) +sc_caller_sme (struct callbacks *c) +{ + ns_callee (); + s_callee (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); // { dg-error "calling a streaming function requires the ISA extension 'sme'" } + c->sc_ptr (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c new file mode 100644 index 00000000000..9a1b646a2f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_2.c @@ -0,0 +1,43 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } + +void ns_callee (); +__attribute__((arm_streaming)) void s_callee (); +__attribute__((arm_streaming_compatible)) void sc_callee (); + +struct callbacks { + void (*ns_ptr) (); + __attribute__((arm_streaming)) void (*s_ptr) (); + __attribute__((arm_streaming_compatible)) void (*sc_ptr) (); +}; + +void +n_caller (struct callbacks *c) +{ + ns_callee (); + sc_callee (); + + c->ns_ptr (); + c->sc_ptr (); +} + +void __attribute__((arm_streaming)) +s_caller (struct callbacks *c) +{ + s_callee (); + sc_callee (); + + c->s_ptr (); + c->sc_ptr (); +} + +void __attribute__((arm_streaming_compatible)) +sc_caller (struct callbacks *c) +{ + sc_callee (); + + c->sc_ptr (); +} + +// { dg-final { scan-assembler-not {[dpqz][0-9]+,} } } +// { dg-final { scan-assembler-not {smstart\tsm} } } +// { dg-final { scan-assembler-not {smstop\tsm} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c new file mode 100644 index 00000000000..9ad6b2c1fff --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_3.c @@ -0,0 +1,156 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +__attribute__((aarch64_vector_pcs)) void ns_callee (); +__attribute__((arm_streaming, aarch64_vector_pcs)) void s_callee (); +__attribute__((arm_streaming_compatible, aarch64_vector_pcs)) void sc_callee (); + +struct callbacks { + __attribute__((aarch64_vector_pcs)) void (*ns_ptr) (); + __attribute__((arm_streaming, aarch64_vector_pcs)) void (*s_ptr) (); + __attribute__((arm_streaming_compatible, aarch64_vector_pcs)) void (*sc_ptr) (); +}; + +/* +** n_caller: { target lp64 } +** stp (x19|x2[0-8]), x30, \[sp, #?-272\]! 
+** stp q8, q9, \[sp, #?16\] +** stp q10, q11, \[sp, #?48\] +** stp q12, q13, \[sp, #?80\] +** stp q14, q15, \[sp, #?112\] +** stp q16, q17, \[sp, #?144\] +** stp q18, q19, \[sp, #?176\] +** stp q20, q21, \[sp, #?208\] +** stp q22, q23, \[sp, #?240\] +** mov \1, x0 +** bl ns_callee +** smstart sm +** bl s_callee +** smstop sm +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** blr \2 +** ldr (x[0-9]+), \[\1, #?8\] +** smstart sm +** blr \3 +** smstop sm +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp q8, q9, \[sp, #?16\] +** ldp q10, q11, \[sp, #?48\] +** ldp q12, q13, \[sp, #?80\] +** ldp q14, q15, \[sp, #?112\] +** ldp q16, q17, \[sp, #?144\] +** ldp q18, q19, \[sp, #?176\] +** ldp q20, q21, \[sp, #?208\] +** ldp q22, q23, \[sp, #?240\] +** ldp \1, x30, \[sp\], #?272 +** ret +*/ +void __attribute__((aarch64_vector_pcs)) +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** s_caller: { target lp64 } +** stp (x19|x2[0-8]), x30, \[sp, #?-272\]! +** stp q8, q9, \[sp, #?16\] +** stp q10, q11, \[sp, #?48\] +** stp q12, q13, \[sp, #?80\] +** stp q14, q15, \[sp, #?112\] +** stp q16, q17, \[sp, #?144\] +** stp q18, q19, \[sp, #?176\] +** stp q20, q21, \[sp, #?208\] +** stp q22, q23, \[sp, #?240\] +** mov \1, x0 +** smstop sm +** bl ns_callee +** smstart sm +** bl s_callee +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** smstop sm +** blr \2 +** smstart sm +** ldr (x[0-9]+), \[\1, #?8\] +** blr \3 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldp q8, q9, \[sp, #?16\] +** ldp q10, q11, \[sp, #?48\] +** ldp q12, q13, \[sp, #?80\] +** ldp q14, q15, \[sp, #?112\] +** ldp q16, q17, \[sp, #?144\] +** ldp q18, q19, \[sp, #?176\] +** ldp q20, q21, \[sp, #?208\] +** ldp q22, q23, \[sp, #?240\] +** ldp \1, x30, \[sp\], #?272 +** ret +*/ +void __attribute__((arm_streaming, aarch64_vector_pcs)) +s_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + c->sc_ptr (); +} + +/* +** sc_caller: +** stp x29, x30, \[sp, #?-288\]! 
+** mov x29, sp +** stp q8, q9, \[sp, #?32\] +** stp q10, q11, \[sp, #?64\] +** stp q12, q13, \[sp, #?96\] +** stp q14, q15, \[sp, #?128\] +** stp q16, q17, \[sp, #?160\] +** stp q18, q19, \[sp, #?192\] +** stp q20, q21, \[sp, #?224\] +** stp q22, q23, \[sp, #?256\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstop sm +** bl ns_callee +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstart sm +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl s_callee +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** bl sc_callee +** ldp q8, q9, \[sp, #?32\] +** ldp q10, q11, \[sp, #?64\] +** ldp q12, q13, \[sp, #?96\] +** ldp q14, q15, \[sp, #?128\] +** ldp q16, q17, \[sp, #?160\] +** ldp q18, q19, \[sp, #?192\] +** ldp q20, q21, \[sp, #?224\] +** ldp q22, q23, \[sp, #?256\] +** ldp x29, x30, \[sp\], #?288 +** ret +*/ +void __attribute__((arm_streaming_compatible, aarch64_vector_pcs)) +sc_caller () +{ + ns_callee (); + s_callee (); + sc_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c new file mode 100644 index 00000000000..1dd1eeb2439 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_4.c @@ -0,0 +1,43 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } + +__attribute__((aarch64_vector_pcs)) void ns_callee (); +__attribute__((aarch64_vector_pcs, arm_streaming)) void s_callee (); +__attribute__((aarch64_vector_pcs, arm_streaming_compatible)) void sc_callee (); + +struct callbacks { + __attribute__((aarch64_vector_pcs)) void (*ns_ptr) (); + __attribute__((aarch64_vector_pcs, arm_streaming)) void (*s_ptr) (); + __attribute__((aarch64_vector_pcs, arm_streaming_compatible)) void (*sc_ptr) (); +}; + +void __attribute__((aarch64_vector_pcs)) +n_caller (struct callbacks *c) +{ + ns_callee (); + sc_callee (); + + c->ns_ptr (); + c->sc_ptr (); +} + +void __attribute__((aarch64_vector_pcs, arm_streaming)) +s_caller (struct callbacks *c) +{ + s_callee (); + sc_callee (); + + c->s_ptr (); + c->sc_ptr (); +} + +void __attribute__((aarch64_vector_pcs, arm_streaming_compatible)) +sc_caller (struct callbacks *c) +{ + sc_callee (); + + c->sc_ptr (); +} + +// { dg-final { scan-assembler-not {[dpqz][0-9]+,} } } +// { dg-final { scan-assembler-not {smstart\tsm} } } +// { dg-final { scan-assembler-not {smstop\tsm} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c new file mode 100644 index 00000000000..e9f7af16445 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_5.c @@ -0,0 +1,308 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +#include + +svbool_t ns_callee (); +__attribute__((arm_streaming)) svbool_t s_callee (); +__attribute__((arm_streaming_compatible)) svbool_t sc_callee (); + +struct callbacks { + svbool_t (*ns_ptr) (); + __attribute__((arm_streaming)) svbool_t (*s_ptr) (); + __attribute__((arm_streaming_compatible)) svbool_t (*sc_ptr) (); +}; + +/* +** n_caller: { target lp64 } +** stp (x19|x2[0-8]), x30, \[sp, #?-16\]! 
+** addvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** mov \1, x0 +** bl ns_callee +** smstart sm +** bl s_callee +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** blr \2 +** ldr (x[0-9]+), \[\1, #?8\] +** smstart sm +** blr \3 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addvl sp, sp, #18 +** ldp \1, x30, \[sp\], #?16 +** ret +*/ +svbool_t +n_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + return c->sc_ptr (); +} + +/* +** s_caller: { target lp64 } +** stp (x19|x2[0-8]), x30, \[sp, #?-16\]! 
+** addvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** mov \1, x0 +** smstop sm +** bl ns_callee +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl s_callee +** bl sc_callee +** ldr (x[0-9]+), \[\1\] +** smstop sm +** blr \2 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ldr (x[0-9]+), \[\1, #?8\] +** blr \3 +** ldr (x[0-9]+), \[\1, #?16\] +** blr \4 +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addvl sp, sp, #18 +** ldp \1, x30, \[sp\], #?16 +** ret +*/ +svbool_t __attribute__((arm_streaming)) +s_caller (struct callbacks *c) +{ + ns_callee (); + s_callee (); + sc_callee (); + + c->ns_ptr (); + c->s_ptr (); + return c->sc_ptr (); +} + +/* +** sc_caller: +** stp x29, x30, \[sp, #?-32\]! 
+** mov x29, sp +** addvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** smstop sm +** bl ns_callee +** ldr x16, \[x29, #?16\] +** tbz x16, 0, .* +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl s_callee +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl sc_callee +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addvl sp, sp, #18 +** ldp x29, x30, \[sp\], #?32 +** ret +*/ +svbool_t __attribute__((arm_streaming_compatible)) +sc_caller () +{ + ns_callee (); + s_callee (); + return sc_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c new file mode 100644 index 00000000000..507e2856a8c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_6.c @@ -0,0 +1,45 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } + +#include + +svbool_t ns_callee (); +__attribute__((arm_streaming)) svbool_t s_callee (); +__attribute__((arm_streaming_compatible)) svbool_t sc_callee (); + +struct callbacks { + svbool_t (*ns_ptr) (); + __attribute__((arm_streaming)) svbool_t (*s_ptr) (); + __attribute__((arm_streaming_compatible)) svbool_t (*sc_ptr) (); +}; + +svbool_t +n_caller (struct callbacks *c) +{ + ns_callee (); + sc_callee (); + + c->ns_ptr (); + return c->sc_ptr (); +} + +svbool_t __attribute__((arm_streaming)) +s_caller (struct callbacks *c) +{ + s_callee (); + sc_callee (); + + c->s_ptr (); + return c->sc_ptr (); +} + +svbool_t __attribute__((arm_streaming_compatible)) +sc_caller (struct callbacks *c) +{ + 
sc_callee (); + + return c->sc_ptr (); +} + +// { dg-final { scan-assembler-not {[dpqz][0-9]+,} } } +// { dg-final { scan-assembler-not {smstart\tsm} } } +// { dg-final { scan-assembler-not {smstop\tsm} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c new file mode 100644 index 00000000000..af4d64b26ec --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_7.c @@ -0,0 +1,516 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +#include +#include + +double produce_d0 (); +void consume_d0 (double); + +/* +** test_d0: +** ... +** smstop sm +** bl produce_d0 +** fmov x10, d0 +** smstart sm +** fmov d0, x10 +** fmov x10, d0 +** smstop sm +** fmov d0, x10 +** bl consume_d0 +** ... +*/ +void __attribute__((arm_streaming)) +test_d0 () +{ + double res = produce_d0 (); + asm volatile (""); + consume_d0 (res); +} + +int8x8_t produce_d0_vec (); +void consume_d0_vec (int8x8_t); + +/* +** test_d0_vec: +** ... +** smstop sm +** bl produce_d0_vec +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstart sm +** fmov d0, x10 +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstop sm +** fmov d0, x10 +** bl consume_d0_vec +** ... +*/ +void __attribute__((arm_streaming)) +test_d0_vec () +{ + int8x8_t res = produce_d0_vec (); + asm volatile (""); + consume_d0_vec (res); +} + +int8x16_t produce_q0 (); +void consume_q0 (int8x16_t); + +/* +** test_q0: +** ... +** smstop sm +** bl produce_q0 +** str q0, \[sp, #?-16\]! +** smstart sm +** ldr q0, \[sp\], #?16 +** str q0, \[sp, #?-16\]! +** smstop sm +** ldr q0, \[sp\], #?16 +** bl consume_q0 +** ... +*/ +void __attribute__((arm_streaming)) +test_q0 () +{ + int8x16_t res = produce_q0 (); + asm volatile (""); + consume_q0 (res); +} + +int8x16x2_t produce_q1 (); +void consume_q1 (int8x16x2_t); + +/* +** test_q1: +** ... +** smstop sm +** bl produce_q1 +** stp q0, q1, \[sp, #?-32\]! +** smstart sm +** ldp q0, q1, \[sp\], #?32 +** stp q0, q1, \[sp, #?-32\]! +** smstop sm +** ldp q0, q1, \[sp\], #?32 +** bl consume_q1 +** ... +*/ +void __attribute__((arm_streaming)) +test_q1 () +{ + int8x16x2_t res = produce_q1 (); + asm volatile (""); + consume_q1 (res); +} + +int8x16x3_t produce_q2 (); +void consume_q2 (int8x16x3_t); + +/* +** test_q2: +** ... +** smstop sm +** bl produce_q2 +** stp q0, q1, \[sp, #?-48\]! +** str q2, \[sp, #?32\] +** smstart sm +** ldr q2, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?48 +** stp q0, q1, \[sp, #?-48\]! +** str q2, \[sp, #?32\] +** smstop sm +** ldr q2, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?48 +** bl consume_q2 +** ... +*/ +void __attribute__((arm_streaming)) +test_q2 () +{ + int8x16x3_t res = produce_q2 (); + asm volatile (""); + consume_q2 (res); +} + +int8x16x4_t produce_q3 (); +void consume_q3 (int8x16x4_t); + +/* +** test_q3: +** ... +** smstop sm +** bl produce_q3 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstart sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** bl consume_q3 +** ... +*/ +void __attribute__((arm_streaming)) +test_q3 () +{ + int8x16x4_t res = produce_q3 (); + asm volatile (""); + consume_q3 (res); +} + +svint8_t produce_z0 (); +void consume_z0 (svint8_t); + +/* +** test_z0: +** ... 
+** smstop sm +** bl produce_z0 +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstart sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstop sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** bl consume_z0 +** ... +*/ +void __attribute__((arm_streaming)) +test_z0 () +{ + svint8_t res = produce_z0 (); + asm volatile (""); + consume_z0 (res); +} + +svint8x4_t produce_z3 (); +void consume_z3 (svint8x4_t); + +/* +** test_z3: +** ... +** smstop sm +** bl produce_z3 +** addvl sp, sp, #-4 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstart sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** addvl sp, sp, #4 +** addvl sp, sp, #-4 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** addvl sp, sp, #4 +** bl consume_z3 +** ... +*/ +void __attribute__((arm_streaming)) +test_z3 () +{ + svint8x4_t res = produce_z3 (); + asm volatile (""); + consume_z3 (res); +} + +svbool_t produce_p0 (); +void consume_p0 (svbool_t); + +/* +** test_p0: +** ... +** smstop sm +** bl produce_p0 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** bl consume_p0 +** ... +*/ +void __attribute__((arm_streaming)) +test_p0 () +{ + svbool_t res = produce_p0 (); + asm volatile (""); + consume_p0 (res); +} + +void consume_d7 (double, double, double, double, double, double, double, + double); + +/* +** test_d7: +** ... +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** smstop sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** bl consume_d7 +** ... +*/ +void __attribute__((arm_streaming)) +test_d7 () +{ + consume_d7 (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0); +} + +void consume_d7_vec (int8x8_t, int8x8_t, int8x8_t, int8x8_t, int8x8_t, + int8x8_t, int8x8_t, int8x8_t); + +/* +** test_d7_vec: +** ... +** ( +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** | +** umov x10, v0.d\[0\] +** umov x11, v1.d\[0\] +** umov x12, v2.d\[0\] +** umov x13, v3.d\[0\] +** umov x14, v4.d\[0\] +** umov x15, v5.d\[0\] +** umov x16, v6.d\[0\] +** umov x17, v7.d\[0\] +** ) +** smstop sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** bl consume_d7_vec +** ... +*/ +void __attribute__((arm_streaming)) +test_d7_vec (int8x8_t *ptr) +{ + consume_d7_vec (*ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr); +} + +void consume_q7 (int8x16_t, int8x16_t, int8x16_t, int8x16_t, int8x16_t, + int8x16_t, int8x16_t, int8x16_t); + +/* +** test_q7: +** ... +** stp q0, q1, \[sp, #?-128\]! +** stp q2, q3, \[sp, #?32\] +** stp q4, q5, \[sp, #?64\] +** stp q6, q7, \[sp, #?96\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q4, q5, \[sp, #?64\] +** ldp q6, q7, \[sp, #?96\] +** ldp q0, q1, \[sp\], #?128 +** bl consume_q7 +** ... 
+*/ +void __attribute__((arm_streaming)) +test_q7 (int8x16_t *ptr) +{ + consume_q7 (*ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr); +} + +void consume_z7 (svint8_t, svint8_t, svint8_t, svint8_t, svint8_t, + svint8_t, svint8_t, svint8_t); + +/* +** test_z7: +** ... +** addvl sp, sp, #-8 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** str z4, \[sp, #4, mul vl\] +** str z5, \[sp, #5, mul vl\] +** str z6, \[sp, #6, mul vl\] +** str z7, \[sp, #7, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ldr z4, \[sp, #4, mul vl\] +** ldr z5, \[sp, #5, mul vl\] +** ldr z6, \[sp, #6, mul vl\] +** ldr z7, \[sp, #7, mul vl\] +** addvl sp, sp, #8 +** bl consume_z7 +** ... +*/ +void __attribute__((arm_streaming)) +test_z7 (svint8_t *ptr) +{ + consume_z7 (*ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr, *ptr); +} + +void consume_p3 (svbool_t, svbool_t, svbool_t, svbool_t); + +/* +** test_p3: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** smstop sm +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** addvl sp, sp, #1 +** bl consume_p3 +** ... +*/ +void __attribute__((arm_streaming)) +test_p3 (svbool_t *ptr) +{ + consume_p3 (*ptr, *ptr, *ptr, *ptr); +} + +void consume_mixed (float, double, float32x4_t, svfloat32_t, + float, double, float64x2_t, svfloat64_t, + svbool_t, svbool_t, svbool_t, svbool_t); + +/* +** test_mixed: +** ... +** addvl sp, sp, #-3 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** str z3, \[sp, #1, mul vl\] +** str z7, \[sp, #2, mul vl\] +** stp q2, q6, \[sp, #?-32\]! +** fmov w10, s0 +** fmov x11, d1 +** fmov w12, s4 +** fmov x13, d5 +** smstop sm +** fmov s0, w10 +** fmov d1, x11 +** fmov s4, w12 +** fmov d5, x13 +** ldp q2, q6, \[sp\], #?32 +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** ldr z3, \[sp, #1, mul vl\] +** ldr z7, \[sp, #2, mul vl\] +** addvl sp, sp, #3 +** bl consume_mixed +** ... +*/ +void __attribute__((arm_streaming)) +test_mixed (float32x4_t *float32x4_ptr, + svfloat32_t *svfloat32_ptr, + float64x2_t *float64x2_ptr, + svfloat64_t *svfloat64_ptr, + svbool_t *svbool_ptr) +{ + consume_mixed (1.0f, 2.0, *float32x4_ptr, *svfloat32_ptr, + 3.0f, 4.0, *float64x2_ptr, *svfloat64_ptr, + *svbool_ptr, *svbool_ptr, *svbool_ptr, *svbool_ptr); +} + +void consume_varargs (float, ...); + +/* +** test_varargs: +** ... +** stp q3, q7, \[sp, #?-32\]! +** fmov w10, s0 +** fmov x11, d1 +** ( +** fmov x12, d2 +** | +** umov x12, v2.d\[0\] +** ) +** fmov x13, d4 +** fmov x14, d5 +** ( +** fmov x15, d6 +** | +** umov x15, v6.d\[0\] +** ) +** smstop sm +** fmov s0, w10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d4, x13 +** fmov d5, x14 +** fmov d6, x15 +** ldp q3, q7, \[sp\], #?32 +** bl consume_varargs +** ... 
+*/ +void __attribute__((arm_streaming)) +test_varargs (float32x2_t *float32x2_ptr, + float32x4_t *float32x4_ptr, + float64x1_t *float64x1_ptr, + float64x2_t *float64x2_ptr) +{ + consume_varargs (1.0f, 2.0, *float32x2_ptr, *float32x4_ptr, + 3.0f, 4.0, *float64x1_ptr, *float64x2_ptr); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c new file mode 100644 index 00000000000..1a28da795de --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_8.c @@ -0,0 +1,87 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls -msve-vector-bits=128" } +// { dg-final { check-function-bodies "**" "" } } + +#include + +svint8_t produce_z0 (); +void consume_z0 (svint8_t); + +/* +** test_z0: +** ... +** smstop sm +** bl produce_z0 +** str q0, \[sp, #?-16\]! +** smstart sm +** ldr q0, \[sp\], #?16 +** str q0, \[sp, #?-16\]! +** smstop sm +** ldr q0, \[sp\], #?16 +** bl consume_z0 +** ... +*/ +void __attribute__((arm_streaming)) +test_z0 () +{ + svint8_t res = produce_z0 (); + asm volatile (""); + consume_z0 (res); +} + +svint8x4_t produce_z3 (); +void consume_z3 (svint8x4_t); + +/* +** test_z3: +** ... +** smstop sm +** bl produce_z3 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstart sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** bl consume_z3 +** ... +*/ +void __attribute__((arm_streaming)) +test_z3 () +{ + svint8x4_t res = produce_z3 (); + asm volatile (""); + consume_z3 (res); +} + +svbool_t produce_p0 (); +void consume_p0 (svbool_t); + +/* +** test_p0: +** ... +** smstop sm +** bl produce_p0 +** sub sp, sp, #?16 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** add sp, sp, #?16 +** sub sp, sp, #?16 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** add sp, sp, #?16 +** bl consume_p0 +** ... +*/ +void __attribute__((arm_streaming)) +test_p0 () +{ + svbool_t res = produce_p0 (); + asm volatile (""); + consume_p0 (res); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c new file mode 100644 index 00000000000..fd880ee7931 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/call_sm_switch_9.c @@ -0,0 +1,103 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls -msve-vector-bits=256" } +// { dg-final { check-function-bodies "**" "" } } + +#include + +svint8_t produce_z0 (); +void consume_z0 (svint8_t); + +/* +** test_z0: +** ... +** smstop sm +** bl produce_z0 +** sub sp, sp, #?32 +** str z0, \[sp\] +** smstart sm +** ldr z0, \[sp\] +** add sp, sp, #?32 +** sub sp, sp, #?32 +** str z0, \[sp\] +** smstop sm +** ldr z0, \[sp\] +** add sp, sp, #?32 +** bl consume_z0 +** ... +*/ +void __attribute__((arm_streaming)) +test_z0 () +{ + svint8_t res = produce_z0 (); + asm volatile (""); + consume_z0 (res); +} + +svint8x4_t produce_z3 (); +void consume_z3 (svint8x4_t); + +/* +** test_z3: +** ... 
+** smstop sm +** bl produce_z3 +** sub sp, sp, #?128 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstart sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** add sp, sp, #?128 +** sub sp, sp, #?128 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** add sp, sp, #?128 +** bl consume_z3 +** ... +*/ +void __attribute__((arm_streaming)) +test_z3 () +{ + svint8x4_t res = produce_z3 (); + asm volatile (""); + consume_z3 (res); +} + +svbool_t produce_p0 (); +void consume_p0 (svbool_t); + +/* +** test_p0: +** ... +** smstop sm +** bl produce_p0 +** sub sp, sp, #?32 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** add sp, sp, #?32 +** sub sp, sp, #?32 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** add sp, sp, #?32 +** bl consume_p0 +** ... +*/ +void __attribute__((arm_streaming)) +test_p0 () +{ + svbool_t res = produce_p0 (); + asm volatile (""); + consume_p0 (res); +} From patchwork Sun Nov 13 10:01:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60517 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BE3A3393BA4D for ; Sun, 13 Nov 2022 10:04:47 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BE3A3393BA4D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333887; bh=JU+/ARvJx/qEP/6kwWoB5TP8nJQGnan9v6qNoMIh0OQ=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=vjIjU18bxd2zIaFzW5cv5x6dOUC11FLrfAxyKNA07u54fL044W7b3uQap2dJFB2W/ +nMgx9V5dtNmmQczgywRZyNnuwaglEQ7OBxXhIHaN1QBu/VPgvMpXXwVd8tzYBkGko x+t6+sITPBoiNNsrZF8cgJ7mnxxP9cZy8l5cE6v8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 7CB463887F52 for ; Sun, 13 Nov 2022 10:01:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7CB463887F52 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7306A23A for ; Sun, 13 Nov 2022 02:01:20 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 836FC3F73D for ; Sun, 13 Nov 2022 02:01:13 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 06/16] aarch64: Add support for SME ZA attributes References: Date: Sun, 13 Nov 2022 10:01:12 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-40.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, KAM_STOCKGEN, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) 
on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" SME has an array called ZA that can be enabled and disabled separately from streaming mode. A status bit called PSTATE.ZA indicates whether ZA is currently enabled or not. In C and C++, the state of PSTATE.ZA is controlled using function attributes. If a function's type has an arm_shared_za attribute, PSTATE.ZA==1 on entry to the function and on return from the function, and the function shares the contents of ZA with its caller. Otherwise, the caller and callee have separate ZA contexts; they do not use ZA to share data. Although normal non-arm_shared_za functions have a separate ZA context from their callers, nested uses of ZA are expected to be rare. The ABI therefore defines a cooperative lazy saving scheme that allows saves and restore of ZA to be kept to a minimum. (Callers still have the option of doing a full save and restore if they prefer.) Functions that want to use ZA internally have an arm_new_za attribute, which tells the compiler to enable PSTATE.ZA for the duration of the function body. It also tells the compiler to commit any lazy save initiated by a caller. There is also a function type attribute called arm_preserves_za, which a function can use to guarantee to callers that it doesn't change ZA (and so that callers don't need to save and restore it). A known flaw is that it should be possible to assign preserves-ZA functions to normal function pointers, but currently that results in a diagnostic. (The opposite way is invalid and rightly rejected.) gcc/ * config/aarch64/aarch64-isa-modes.def (ZA_ON): New ISA mode. * config/aarch64/aarch64-protos.h (aarch64_rdsvl_immediate_p) (aarch64_output_rdsvl, aarch64_restore_za): Declare. * config/aarch64/constraints.md (UsR): New constraint. * config/aarch64/aarch64.md (ZA_REGNUM, OLD_ZA_REGNUM): New constants. (UNSPEC_SME_VQ): New unspec. (arches): Add sme. (arch_enabled): Handle it. (*cb1): Rename to... (aarch64_cb1): ...this. (*movsi_aarch64): Add an alernative for RDSVL. (*movdi_aarch64): Likewise. * config/aarch64/aarch64-sme.md (UNSPEC_SMSTART_ZA, UNSPEC_SMSTOP_ZA) (UNSPEC_TPIDR2_SAVE, UNSPEC_TPIDR2_RESTORE, UNSPEC_READ_TPIDR2) (UNSPEC_CLEAR_TPIDR2): New unspecs. (aarch64_smstart_za, aarch64_smstop_za, aarch64_tpidr2_save) (aarch64_tpidr2_restore, aarch64_read_tpidr2, aarch64_clear_tpidr2) (aarch64_save_za, aarch64_restore_za): New patterns. * config/aarch64/aarch64.h (AARCH64_ISA_ZA_ON, TARGET_ZA): New macros. (FIXED_REGISTERS, REGISTER_NAMES): Add the ZA registers. (CALL_USED_REGISTERS): Replace with... (CALL_REALLY_USED_REGISTERS): ...this and add the ZA registers. (FIRST_PSEUDO_REGISTER): Bump to include ZA registers. (ZA_REGS): New register class. (REG_CLASS_NAMES): Update accordingly. (REG_CLASS_CONTENTS): Likewise. (aarch64_frame::has_new_za_state): New member variable. (machine_function::tpidr2_block): Likewise. (machine_function::tpidr2_block_ptr): Likewise. (machine_function::za_save_buffer): Likewise. (CUMULATIVE_ARGS::preserves_za): Likewise. * config/aarch64/aarch64.cc (handle_arm_new_za_attribute): New function. (attr_arm_new_za_exclusions): New variable. (attr_no_arm_new_za): Likewise. 
(aarch64_attribute_table): Add arm_new_za, arm_shared_za, and arm_preserves_za. (aarch64_hard_regno_nregs): Handle the ZA registers. (aarch64_hard_regno_mode_ok): Likewise. (aarch64_regno_regclass): Likewise. (aarch64_class_max_nregs): Likewise. (aarch64_conditional_register_usage): Likewise. (aarch64_fntype_za_state): New function. (aarch64_fntype_isa_mode): Call it. (aarch64_fntype_preserves_za): New function. (aarch64_fndecl_has_new_za_state): Likewise. (aarch64_fndecl_za_state): Likewise. (aarch64_fndecl_isa_mode): Call it. (aarch64_fndecl_preserves_za): New function. (aarch64_cfun_incoming_za_state): Likewise. (aarch64_cfun_has_new_za_state): Likewise. (aarch64_sme_vq_immediate): Likewise. (aarch64_sme_vq_unspec_p): Likewise. (aarch64_rdsvl_immediate_p): Likewise. (aarch64_output_rdsvl): Likewise. (aarch64_expand_mov_immediate): Handle RDSVL immediates. (aarch64_mov_operand_p): Likewise. (aarch64_init_cumulative_args): Record whether the call preserves ZA. (aarch64_layout_frame): Check whether the current function creates new ZA state. Record that it clobbers LR if so. (aarch64_epilogue_uses): Handle ZA_REGNUM. (aarch64_expand_prologue): Handle functions that create new ZA state. (aarch64_expand_epilogue): Likewise. (aarch64_create_tpidr2_block): New function. (aarch64_restore_za): Likewise. (aarch64_start_call_args): Disallow calls to shared-ZA functions from functions that have no ZA state. Set up a lazy save if the call might clobber the caller's ZA state. (aarch64_expand_call): Record the shared-ZA functions use ZA_REGNUM. (aarch64_end_call_args): New function. (aarch64_override_options_internal): Require TARGET_SME for functions that have ZA state. (aarch64_comp_type_attributes): Handle arm_shared_za and arm_preserves_za. (aarch64_merge_decl_attributes): New function. (TARGET_END_CALL_ARGS, TARGET_MERGE_DECL_ATTRIBUTES): Define. (TARGET_MD_ASM_ADJUST): Use aarch64_md_asm_adjust. gcc/testsuite/ * gcc.target/aarch64/sme/za_state_1.c: New test. * gcc.target/aarch64/sme/za_state_2.c: Likewise. * gcc.target/aarch64/sme/za_state_3.c: Likewise. * gcc.target/aarch64/sme/za_state_4.c: Likewise. * gcc.target/aarch64/sme/za_state_5.c: Likewise. * gcc.target/aarch64/sme/za_state_6.c: Likewise. * gcc.target/aarch64/sme/za_state_7.c: Likewise. 
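
As a quick illustration of how the attributes described above are intended to fit together, here is a hand-written sketch based purely on the description in this message; it is not taken from the patch or its testsuite, and the function names (step, query, compute) are invented for illustration.  The attribute spelling follows the arm_streaming tests earlier in the series:

  /* Shares ZA with its caller: PSTATE.ZA is 1 on entry and on return,
     and the function operates on the caller's ZA contents.  */
  __attribute__((arm_shared_za)) void step ();

  /* Guarantees not to change ZA, so callers do not need to set up a
     lazy save around calls to it.  */
  __attribute__((arm_preserves_za)) void query ();

  /* Creates new ZA state for the duration of its body: the compiler
     enables PSTATE.ZA on entry and commits any lazy save started by a
     caller.  Requires SME to be available (+sme).  */
  __attribute__((arm_new_za)) void compute ()
  {
    step ();   /* ZA is shared with step; no save/restore needed.  */
    query ();  /* query preserves ZA, so again no lazy save.  */
  }

Calling a plain function (none of these attributes) from compute would instead cause the compiler to set up the cooperative lazy save described above, since such a callee has a separate ZA context.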
--- gcc/config/aarch64/aarch64-isa-modes.def | 5 + gcc/config/aarch64/aarch64-protos.h | 4 + gcc/config/aarch64/aarch64-sme.md | 138 ++++++ gcc/config/aarch64/aarch64.cc | 408 +++++++++++++++++- gcc/config/aarch64/aarch64.h | 43 +- gcc/config/aarch64/aarch64.md | 39 +- gcc/config/aarch64/constraints.md | 6 + .../gcc.target/aarch64/sme/za_state_1.c | 102 +++++ .../gcc.target/aarch64/sme/za_state_2.c | 96 +++++ .../gcc.target/aarch64/sme/za_state_3.c | 27 ++ .../gcc.target/aarch64/sme/za_state_4.c | 277 ++++++++++++ .../gcc.target/aarch64/sme/za_state_5.c | 241 +++++++++++ .../gcc.target/aarch64/sme/za_state_6.c | 132 ++++++ .../gcc.target/aarch64/sme/za_state_7.c | 55 +++ 14 files changed, 1548 insertions(+), 25 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c diff --git a/gcc/config/aarch64/aarch64-isa-modes.def b/gcc/config/aarch64/aarch64-isa-modes.def index fba8eafbae1..001ef54a59b 100644 --- a/gcc/config/aarch64/aarch64-isa-modes.def +++ b/gcc/config/aarch64/aarch64-isa-modes.def @@ -32,4 +32,9 @@ DEF_AARCH64_ISA_MODE(SM_ON) DEF_AARCH64_ISA_MODE(SM_OFF) +/* Indicates that PSTATE.ZA is known to be 1. The converse is that + PSTATE.ZA might be 0 or 1, depending on whether there is an uncommitted + lazy save. */ +DEF_AARCH64_ISA_MODE(ZA_ON) + #undef DEF_AARCH64_ISA_MODE diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 0f686fba4bd..97a84f616a2 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -807,6 +807,8 @@ bool aarch64_sve_addvl_addpl_immediate_p (rtx); bool aarch64_sve_vector_inc_dec_immediate_p (rtx); int aarch64_add_offset_temporaries (rtx); void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx); +bool aarch64_rdsvl_immediate_p (const_rtx); +char *aarch64_output_rdsvl (const_rtx); bool aarch64_mov_operand_p (rtx, machine_mode); rtx aarch64_reverse_mask (machine_mode, unsigned int); bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64); @@ -1077,4 +1079,6 @@ const char *aarch64_indirect_call_asm (rtx); extern bool aarch64_harden_sls_retbr_p (void); extern bool aarch64_harden_sls_blr_p (void); +void aarch64_restore_za (); + #endif /* GCC_AARCH64_PROTOS_H */ diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md index 88f1526fa34..55fb00db12d 100644 --- a/gcc/config/aarch64/aarch64-sme.md +++ b/gcc/config/aarch64/aarch64-sme.md @@ -23,6 +23,7 @@ ;; == State management ;; ---- Test current state ;; ---- PSTATE.SM management +;; ---- PSTATE.ZA management ;; ========================================================================= ;; == State management @@ -131,3 +132,140 @@ (define_insn "aarch64_smstop_sm" "" "smstop\tsm" ) + +;; ------------------------------------------------------------------------- +;; ---- PSTATE.ZA management +;; ------------------------------------------------------------------------- +;; Includes +;; - SMSTART ZA +;; - SMSTOP ZA +;; plus calls to support routines. 
+;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SMSTART_ZA + UNSPEC_SMSTOP_ZA + UNSPEC_TPIDR2_SAVE + UNSPEC_TPIDR2_RESTORE + UNSPEC_READ_TPIDR2 + UNSPEC_CLEAR_TPIDR2 +]) + +;; Enable ZA, starting with fresh ZA contents. This is only valid when +;; SME is present, but the pattern does not depend on TARGET_SME since +;; it can be used conditionally. +(define_insn "aarch64_smstart_za" + [(unspec_volatile [(const_int 0)] UNSPEC_SMSTART_ZA) + (clobber (reg:VNx16QI ZA_REGNUM))] + "" + "smstart\tza" +) + +;; Disable ZA and discard its current contents. This is only valid when +;; SME is present, but the pattern does not depend on TARGET_SME since +;; it can be used conditionally. +;; +;; The ABI says that the ZA save buffer must be null whenever PSTATE.ZA +;; is zero. This instruction is therefore sequenced wrt writes to +;; OLD_ZA_REGNUM. +(define_insn "aarch64_smstop_za" + [(unspec_volatile [(reg:VNx16QI OLD_ZA_REGNUM)] UNSPEC_SMSTOP_ZA) + (clobber (reg:VNx16QI ZA_REGNUM))] + "" + "smstop\tza" +) + +;; Use the ABI-defined routine to commit any uncommitted lazy save. +(define_insn "aarch64_tpidr2_save" + [(unspec_volatile:DI [(reg:VNx16QI OLD_ZA_REGNUM) + (reg:VNx16QI ZA_REGNUM)] UNSPEC_TPIDR2_SAVE) + (clobber (reg:DI R14_REGNUM)) + (clobber (reg:DI R15_REGNUM)) + (clobber (reg:DI R16_REGNUM)) + (clobber (reg:DI R17_REGNUM)) + (clobber (reg:DI R18_REGNUM)) + (clobber (reg:DI R30_REGNUM)) + (clobber (reg:CC CC_REGNUM))] + "" + "bl\t__arm_tpidr2_save" +) + +;; Use the ABI-defined routine to restore lazy-saved ZA contents +;; from the TPIDR2 block pointed to by X0. +(define_insn "aarch64_tpidr2_restore" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI [(reg:VNx16QI OLD_ZA_REGNUM) + (reg:DI R0_REGNUM)] UNSPEC_TPIDR2_RESTORE)) + (clobber (reg:DI R14_REGNUM)) + (clobber (reg:DI R15_REGNUM)) + (clobber (reg:DI R16_REGNUM)) + (clobber (reg:DI R17_REGNUM)) + (clobber (reg:DI R18_REGNUM)) + (clobber (reg:DI R30_REGNUM)) + (clobber (reg:CC CC_REGNUM))] + "" + "bl\t__arm_tpidr2_restore" +) + +;; Check whether a lazy save of ZA is active. This is only valid when +;; SME is present, but the pattern does not depend on TARGET_SME since +;; it can be used conditionally. +(define_insn "aarch64_read_tpidr2" + [(set (match_operand:DI 0 "register_operand" "=r") + (unspec:DI [(reg:VNx16QI OLD_ZA_REGNUM)] UNSPEC_READ_TPIDR2))] + "" + "mrs\t%0, tpidr2_el0" +) + +;; Clear TPIDR2_EL0, cancelling any uncommitted lazy save. This is only +;; valid when SME is present, but the pattern does not depend on TARGET_SME +;; since it can be used conditionally. +(define_insn "aarch64_clear_tpidr2" + [(set (reg:VNx16QI OLD_ZA_REGNUM) + (unspec:VNx16QI [(const_int 0)] UNSPEC_CLEAR_TPIDR2))] + "" + "msr\ttpidr2_el0, xzr" +) + +;; Set up a lazy save of ZA. Operand 0 points to the TPIDR2 block and +;; operand 1 is the contents of that block. Operand 1 exists only to +;; provide dependency information: the TPIDR2 block must be valid +;; before TPIDR2_EL0 is updated. +(define_insn "aarch64_save_za" + [(set (reg:VNx16QI OLD_ZA_REGNUM) + (reg:VNx16QI ZA_REGNUM)) + (use (match_operand 0 "pmode_register_operand" "r")) + (use (match_operand:V16QI 1 "memory_operand" "m"))] + "" + "msr\ttpidr2_el0, %0" +) + +;; Check whether a lazy save set up by aarch64_save_za was committed +;; and restore the saved contents if so. 
+(define_insn_and_split "aarch64_restore_za" + [(set (reg:VNx16QI ZA_REGNUM) + (reg:VNx16QI OLD_ZA_REGNUM)) + (clobber (reg:DI R14_REGNUM)) + (clobber (reg:DI R15_REGNUM)) + (clobber (reg:DI R16_REGNUM)) + (clobber (reg:DI R17_REGNUM)) + (clobber (reg:DI R18_REGNUM)) + (clobber (reg:DI R30_REGNUM)) + (clobber (reg:CC CC_REGNUM)) + (clobber (reg:VNx16QI OLD_ZA_REGNUM))] + "" + "#" + "&& epilogue_completed" + [(const_int 0)] + { + auto label = gen_label_rtx (); + auto tpidr2 = gen_rtx_REG (DImode, R16_REGNUM); + emit_insn (gen_aarch64_read_tpidr2 (tpidr2)); + auto jump = emit_jump_insn (gen_aarch64_cbnedi1 (tpidr2, label)); + JUMP_LABEL (jump) = label; + aarch64_restore_za (); + emit_label (label); + emit_insn (gen_aarch64_clear_tpidr2 ()); + DONE; + } +) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index d8310eb8597..b200d2a9f80 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -2733,6 +2733,22 @@ handle_aarch64_vector_pcs_attribute (tree *node, tree name, tree, gcc_unreachable (); } +/* Check whether an 'arm_new_za' attribute is valid. */ + +static tree +handle_arm_new_za_attribute (tree *node, tree name, tree, + int, bool *no_add_attrs) +{ + tree decl = *node; + if (TREE_CODE (decl) != FUNCTION_DECL) + { + error_at (DECL_SOURCE_LOCATION (decl), + "%qE attribute applies only to functions", name); + *no_add_attrs = true; + } + return NULL_TREE; +} + /* Mutually-exclusive function type attributes for controlling PSTATE.SM. */ static const struct attribute_spec::exclusions attr_streaming_exclusions[] = { @@ -2743,6 +2759,26 @@ static const struct attribute_spec::exclusions attr_streaming_exclusions[] = { NULL, false, false, false } }; +/* Function type attributes that are mutually-exclusive with arm_new_za. */ +static const struct attribute_spec::exclusions attr_arm_new_za_exclusions[] = +{ + /* Attribute name exclusion applies to: + function, type, variable */ + { "arm_preserves_za", true, false, false }, + { "arm_shared_za", true, false, false }, + { NULL, false, false, false } +}; + +/* Used by function type attributes that are mutually-exclusive with + arm_new_za. */ +static const struct attribute_spec::exclusions attr_no_arm_new_za[] = +{ + /* Attribute name exclusion applies to: + function, type, variable */ + { "arm_new_za", true, false, false }, + { NULL, false, false, false } +}; + /* Table of machine attributes. 
*/ static const struct attribute_spec aarch64_attribute_table[] = { @@ -2754,6 +2790,13 @@ static const struct attribute_spec aarch64_attribute_table[] = NULL, attr_streaming_exclusions }, { "arm_streaming_compatible", 0, 0, false, true, true, true, NULL, attr_streaming_exclusions }, + { "arm_new_za", 0, 0, true, false, false, false, + handle_arm_new_za_attribute, + attr_arm_new_za_exclusions }, + { "arm_shared_za", 0, 0, false, true, true, true, + NULL, attr_no_arm_new_za }, + { "arm_preserves_za", 0, 0, false, true, true, true, + NULL, attr_no_arm_new_za }, { "arm_sve_vector_bits", 1, 1, false, true, false, true, aarch64_sve::handle_arm_sve_vector_bits_attribute, NULL }, @@ -3929,6 +3972,7 @@ aarch64_hard_regno_nregs (unsigned regno, machine_mode mode) case PR_HI_REGS: case FFR_REGS: case PR_AND_FFR_REGS: + case ZA_REGS: return 1; default: return CEIL (lowest_size, UNITS_PER_WORD); @@ -3959,6 +4003,9 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode) if (pr_or_ffr_regnum_p (regno)) return false; + if (regno == ZA_REGNUM || regno == OLD_ZA_REGNUM) + return true; + if (regno == SP_REGNUM) /* The purpose of comparing with ptr_mode is to support the global register variable associated with the stack pointer @@ -4078,12 +4125,41 @@ aarch64_fntype_sm_state (const_tree fntype) return AARCH64_FL_SM_OFF; } +/* Return the state of PSTATE.ZA on entry to functions of type FNTYPE. */ + +static aarch64_feature_flags +aarch64_fntype_za_state (const_tree fntype) +{ + if (lookup_attribute ("arm_shared_za", TYPE_ATTRIBUTES (fntype))) + return AARCH64_FL_ZA_ON; + + return 0; +} + /* Return the ISA mode on entry to functions of type FNTYPE. */ static aarch64_feature_flags aarch64_fntype_isa_mode (const_tree fntype) { - return aarch64_fntype_sm_state (fntype); + return (aarch64_fntype_sm_state (fntype) + | aarch64_fntype_za_state (fntype)); +} + +/* Return true if functions of type FNTYPE preserve the contents of ZA. */ + +static bool +aarch64_fntype_preserves_za (const_tree fntype) +{ + return lookup_attribute ("arm_preserves_za", TYPE_ATTRIBUTES (fntype)); +} + +/* Return true if FNDECL creates new ZA state (as opposed to sharing + ZA with its callers or ignoring ZA altogether). */ + +static bool +aarch64_fndecl_has_new_za_state (const_tree fndecl) +{ + return lookup_attribute ("arm_new_za", DECL_ATTRIBUTES (fndecl)); } /* Return the state of PSTATE.SM when compiling the body of @@ -4096,13 +4172,34 @@ aarch64_fndecl_sm_state (const_tree fndecl) return aarch64_fntype_sm_state (TREE_TYPE (fndecl)); } +/* Return the state of PSTATE.ZA when compiling the body of function FNDECL. + This might be different from the state of PSTATE.ZA on entry. */ + +static aarch64_feature_flags +aarch64_fndecl_za_state (const_tree fndecl) +{ + if (aarch64_fndecl_has_new_za_state (fndecl)) + return AARCH64_FL_ZA_ON; + + return aarch64_fntype_za_state (TREE_TYPE (fndecl)); +} + /* Return the ISA mode that should be used to compile the body of function FNDECL. */ static aarch64_feature_flags aarch64_fndecl_isa_mode (const_tree fndecl) { - return aarch64_fndecl_sm_state (fndecl); + return (aarch64_fndecl_sm_state (fndecl) + | aarch64_fndecl_za_state (fndecl)); +} + +/* Return true if function FNDECL preserves the contents of ZA. */ + +static bool +aarch64_fndecl_preserves_za (const_tree fndecl) +{ + return aarch64_fntype_preserves_za (TREE_TYPE (fndecl)); } /* Return the state of PSTATE.SM on entry to the current function. 
@@ -4115,6 +4212,25 @@ aarch64_cfun_incoming_sm_state () return aarch64_fntype_sm_state (TREE_TYPE (cfun->decl)); } +/* Return the state of PSTATE.ZA on entry to the current function + (which might be different from the state of PSTATE.ZA in the + function body). */ + +static aarch64_feature_flags +aarch64_cfun_incoming_za_state () +{ + return aarch64_fntype_za_state (TREE_TYPE (cfun->decl)); +} + +/* Return true if the current function creates new ZA state (as opposed + to sharing ZA with its callers or ignoring ZA altogether). */ + +static bool +aarch64_cfun_has_new_za_state () +{ + return aarch64_fndecl_has_new_za_state (cfun->decl); +} + /* Return true if a call from the current function to a function with ISA mode CALLEE_MODE would involve a change to PSTATE.SM around the BL instruction. */ @@ -5678,6 +5794,74 @@ aarch64_output_sve_vector_inc_dec (const char *operands, rtx x) factor, nelts_per_vq); } +/* Return a constant that represents FACTOR multiplied by the + number of 128-bit quadwords in an SME vector. ISA_MODE is the + ISA mode in which the calculation is being performed. */ + +static rtx +aarch64_sme_vq_immediate (machine_mode mode, HOST_WIDE_INT factor, + aarch64_feature_flags isa_mode) +{ + gcc_assert (aarch64_sve_rdvl_factor_p (factor)); + if (isa_mode & AARCH64_FL_SM_ON) + /* We're in streaming mode, so we can use normal poly-int values. */ + return gen_int_mode ({ factor, factor }, mode); + + rtvec vec = gen_rtvec (1, gen_int_mode (factor, SImode)); + rtx unspec = gen_rtx_UNSPEC (mode, vec, UNSPEC_SME_VQ); + return gen_rtx_CONST (mode, unspec); +} + +/* Return true if X is a constant that represents some number X + multiplied by the number of quadwords in an SME vector. Store this X + in *FACTOR if so. */ + +static bool +aarch64_sme_vq_unspec_p (const_rtx x, HOST_WIDE_INT *factor) +{ + if (!TARGET_SME || GET_CODE (x) != CONST) + return false; + + x = XEXP (x, 0); + if (GET_CODE (x) != UNSPEC + || XINT (x, 1) != UNSPEC_SME_VQ + || XVECLEN (x, 0) != 1) + return false; + + x = XVECEXP (x, 0, 0); + if (!CONST_INT_P (x)) + return false; + + *factor = INTVAL (x); + return true; +} + +/* Return true if X is a constant that represents some number X + multiplied by the number of quadwords in an SME vector, and if + that X is in the range of RDSVL. */ + +bool +aarch64_rdsvl_immediate_p (const_rtx x) +{ + HOST_WIDE_INT factor; + return (aarch64_sme_vq_unspec_p (x, &factor) + && aarch64_sve_rdvl_factor_p (factor)); +} + +/* Return the asm string for an RDSVL instruction that calculates X, + which is a constant that satisfies aarch64_rdsvl_immediate_p. */ + +char * +aarch64_output_rdsvl (const_rtx x) +{ + gcc_assert (aarch64_rdsvl_immediate_p (x)); + static char buffer[sizeof ("rdsvl\t%x0, #-") + 3 * sizeof (int)]; + x = XVECEXP (XEXP (x, 0), 0, 0); + snprintf (buffer, sizeof (buffer), "rdsvl\t%%x0, #%d", + (int) INTVAL (x) / 16); + return buffer; +} + /* Multipliers for repeating bitmasks of width 32, 16, 8, 4, and 2. */ static const unsigned HOST_WIDE_INT bitmask_imm_mul[] = @@ -7457,6 +7641,15 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) return; } + if (aarch64_rdsvl_immediate_p (base)) + { + /* We could handle non-constant offsets if they are ever + generated. 
*/ + gcc_assert (const_offset == 0); + emit_insn (gen_rtx_SET (dest, imm)); + return; + } + sty = aarch64_classify_symbol (base, const_offset); switch (sty) { @@ -8458,7 +8651,7 @@ void aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, const_tree fntype, rtx libname ATTRIBUTE_UNUSED, - const_tree fndecl ATTRIBUTE_UNUSED, + const_tree fndecl, unsigned n_named ATTRIBUTE_UNUSED, bool silent_p) { @@ -8483,6 +8676,9 @@ aarch64_init_cumulative_args (CUMULATIVE_ARGS *pcum, pcum->aapcs_stack_words = 0; pcum->aapcs_stack_size = 0; pcum->silent_p = silent_p; + pcum->preserves_za = (fndecl ? aarch64_fndecl_preserves_za (fndecl) + : fntype ? aarch64_fntype_preserves_za (fntype) + : false); pcum->num_sme_mode_switch_args = 0; if (!silent_p @@ -9015,6 +9211,12 @@ aarch64_layout_frame (void) frame.wb_push_candidate2 = INVALID_REGNUM; frame.spare_pred_reg = INVALID_REGNUM; + frame.has_new_za_state = (aarch64_cfun_has_new_za_state () + && DF_REG_USE_COUNT (ZA_REGNUM) > 0); + if (frame.has_new_za_state) + /* Saving any old ZA state involves a call to __arm_tpidr2_save. */ + df_set_regs_ever_live (R30_REGNUM, true); + /* First mark all the registers that really need to be saved... */ for (regno = 0; regno <= LAST_SAVED_REGNUM; regno++) frame.reg_offset[regno] = SLOT_NOT_REQUIRED; @@ -10443,7 +10645,11 @@ aarch64_epilogue_uses (int regno) { if (regno == LR_REGNUM) return 1; + if (regno == ZA_REGNUM) + return 1; } + if (regno == ZA_REGNUM && aarch64_cfun_incoming_za_state ()) + return 1; return 0; } @@ -10756,6 +10962,27 @@ aarch64_expand_prologue (void) emit_move_insn (gen_rtx_REG (DImode, R1_REGNUM), old_r1); } } + + if (cfun->machine->frame.has_new_za_state) + { + /* Commit any uncommitted lazy save and turn ZA on. The sequence is: + + mrs , tpidr2_el0 + cbz , no_save + bl __arm_tpidr2_save + msr tpidr2_el0, xzr + no_save: + smstart za */ + auto label = gen_label_rtx (); + auto tmp_reg = gen_rtx_REG (DImode, STACK_CLASH_SVE_CFA_REGNUM); + emit_insn (gen_aarch64_read_tpidr2 (tmp_reg)); + auto jump = emit_jump_insn (gen_aarch64_cbeqdi1 (tmp_reg, label)); + JUMP_LABEL (jump) = label; + emit_insn (gen_aarch64_tpidr2_save ()); + emit_insn (gen_aarch64_clear_tpidr2 ()); + emit_label (label); + emit_insn (gen_aarch64_smstart_za ()); + } } /* Return TRUE if we can use a simple_return insn. @@ -10829,6 +11056,11 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) = maybe_ne (get_frame_size () + cfun->machine->frame.saved_varargs_size, 0); + if (cfun->machine->frame.has_new_za_state) + /* Turn ZA off before returning. TPIDR2_EL0 is already null at + this point. */ + emit_insn (gen_aarch64_smstop_za ()); + /* Emit a barrier to prevent loads from a deallocated stack. */ if (maybe_gt (final_adjust, crtl->outgoing_args_size) || cfun->calls_alloca @@ -11989,6 +12221,66 @@ aarch64_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2) return true; } +/* Make the start of the current function allocate a ZA lazy save buffer + and associated TPIDR2 block. Also make it initialize the TPIDR2 block + to point to the ZA save buffer. */ + +static void +aarch64_create_tpidr2_block () +{ + if (cfun->machine->tpidr2_block) + return; + + start_sequence (); + NO_DEFER_POP; + + /* The TPIDR2 block is 16 bytes in size and must be aligned to a 128-bit + boundary. */ + rtx block = assign_stack_local (V16QImode, 16, 128); + + /* We use the block by moving its address into TPIDR2_EL0, so we need + a simple register pointer to it rather than a general address. 
*/ + rtx ptr = force_reg (Pmode, XEXP (block, 0)); + cfun->machine->tpidr2_block_ptr = ptr; + cfun->machine->tpidr2_block = replace_equiv_address (block, ptr); + + /* The ZA save buffer is SVL.B*SVL.B bytes in size. */ + rtx svl_bytes = aarch64_sme_vq_immediate (Pmode, 16, AARCH64_ISA_MODE); + rtx za_size = expand_simple_binop (Pmode, MULT, svl_bytes, svl_bytes, + NULL, 0, OPTAB_LIB_WIDEN); + rtx za_save_buffer = allocate_dynamic_stack_space (za_size, 128, 128, + -1, true); + za_save_buffer = force_reg (Pmode, za_save_buffer); + cfun->machine->za_save_buffer = za_save_buffer; + + /* The first word of the block points to the save buffer and the second + word is the number of ZA slices to save. */ + rtx block_0 = adjust_address (block, DImode, 0); + rtx block_8 = adjust_address (block, DImode, 8); + emit_insn (gen_store_pair_dw_didi (block_0, za_save_buffer, + block_8, force_reg (DImode, svl_bytes))); + + OK_DEFER_POP; + auto insns = get_insns (); + end_sequence (); + + emit_insn_after (insns, parm_birth_insn); +} + +/* Restore the contents of ZA from the lazy save buffer. PSTATE.ZA is + known to be 0 and TPIDR2_EL0 is known to be null. */ + +void +aarch64_restore_za () +{ + gcc_assert (cfun->machine->tpidr2_block); + + emit_insn (gen_aarch64_smstart_za ()); + emit_move_insn (gen_rtx_REG (Pmode, R0_REGNUM), + cfun->machine->tpidr2_block_ptr); + emit_insn (gen_aarch64_tpidr2_restore ()); +} + /* Implement TARGET_START_CALL_ARGS. */ static void @@ -12004,6 +12296,23 @@ aarch64_start_call_args (cumulative_args_t ca_v) " option %<-march%>, or by using the %" " attribute or pragma", "sme"); } + + if (!TARGET_ZA && (ca->isa_mode & AARCH64_FL_ZA_ON)) + error ("call to an % function from a function" + " that has no ZA state"); + + /* Set up a lazy save buffer if the current function has ZA state + that is not shared with the callee and if the callee might + clobber the state. */ + if (TARGET_ZA + && !(ca->isa_mode & AARCH64_FL_ZA_ON) + && !ca->preserves_za) + { + if (!cfun->machine->tpidr2_block) + aarch64_create_tpidr2_block (); + emit_insn (gen_aarch64_save_za (cfun->machine->tpidr2_block_ptr, + cfun->machine->tpidr2_block)); + } } /* This function is used by the call expanders of the machine description. @@ -12109,6 +12418,27 @@ aarch64_expand_call (rtx result, rtx mem, rtx cookie, bool sibcall) cfun->machine->call_switches_sm_state = true; } + + /* If the callee is a shared ZA function, record that it uses the + current value of ZA. */ + if (callee_isa_mode & AARCH64_FL_ZA_ON) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), + gen_rtx_REG (VNx16BImode, ZA_REGNUM)); +} + +/* Implement TARGET_END_CALL_ARGS. */ + +static void +aarch64_end_call_args (cumulative_args_t ca_v) +{ + CUMULATIVE_ARGS *ca = get_cumulative_args (ca_v); + + /* If we set up a ZA lazy save before the call, check whether the save + was committed. Restore the contents of ZA from the buffer is so. */ + if (TARGET_ZA + && !(ca->isa_mode & AARCH64_FL_ZA_ON) + && !ca->preserves_za) + emit_insn (gen_aarch64_restore_za ()); } /* Emit call insn with PAT and do aarch64-specific handling. */ @@ -13246,6 +13576,9 @@ aarch64_regno_regclass (unsigned regno) if (regno == FFR_REGNUM || regno == FFRT_REGNUM) return FFR_REGS; + if (regno == ZA_REGNUM || regno == OLD_ZA_REGNUM) + return ZA_REGS; + return NO_REGS; } @@ -13601,12 +13934,14 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode) return (vec_flags & VEC_ADVSIMD ? 
CEIL (lowest_size, UNITS_PER_VREG) : CEIL (lowest_size, UNITS_PER_WORD)); + case STACK_REG: case PR_REGS: case PR_LO_REGS: case PR_HI_REGS: case FFR_REGS: case PR_AND_FFR_REGS: + case ZA_REGS: return 1; case NO_REGS: @@ -18570,10 +18905,13 @@ aarch64_override_options_internal (struct gcc_options *opts) && !fixed_regs[R18_REGNUM]) error ("%<-fsanitize=shadow-call-stack%> requires %<-ffixed-x18%>"); - if ((opts->x_aarch64_isa_flags & AARCH64_FL_SM_ON) + if ((opts->x_aarch64_isa_flags & (AARCH64_FL_SM_ON | AARCH64_FL_ZA_ON)) && !(opts->x_aarch64_isa_flags & AARCH64_FL_SME)) { - error ("streaming functions require the ISA extension %qs", "sme"); + if (opts->x_aarch64_isa_flags & AARCH64_FL_SM_ON) + error ("streaming functions require the ISA extension %qs", "sme"); + else + error ("functions with ZA state require the ISA extension %qs", "sme"); inform (input_location, "you can enable %qs using the command-line" " option %<-march%>, or by using the %" " attribute or pragma", "sme"); @@ -20900,9 +21238,11 @@ aarch64_conditional_register_usage (void) call_used_regs[i] = 1; } - /* Only allow the FFR and FFRT to be accessed via special patterns. */ + /* Only allow these registers to be accessed via special patterns. */ CLEAR_HARD_REG_BIT (operand_reg_set, FFR_REGNUM); CLEAR_HARD_REG_BIT (operand_reg_set, FFRT_REGNUM); + CLEAR_HARD_REG_BIT (operand_reg_set, ZA_REGNUM); + CLEAR_HARD_REG_BIT (operand_reg_set, OLD_ZA_REGNUM); /* When tracking speculation, we need a couple of call-clobbered registers to track the speculation state. It would be nice to just use @@ -22359,6 +22699,9 @@ aarch64_mov_operand_p (rtx x, machine_mode mode) || aarch64_sve_rdvl_immediate_p (x))) return true; + if (aarch64_rdsvl_immediate_p (x)) + return true; + return aarch64_classify_symbolic_expression (x) == SYMBOL_TINY_ABSOLUTE; } @@ -27810,9 +28153,36 @@ aarch64_comp_type_attributes (const_tree type1, const_tree type2) return 0; if (!check_attr ("arm_streaming_compatible")) return 0; + if (!check_attr ("arm_shared_za")) + return 0; + if (!check_attr ("arm_preserves_za")) + return 0; return 1; } +/* Implement TARGET_MERGE_DECL_ATTRIBUTES. */ + +static tree +aarch64_merge_decl_attributes (tree olddecl, tree newdecl) +{ + tree attrs = merge_attributes (DECL_ATTRIBUTES (olddecl), + DECL_ATTRIBUTES (newdecl)); + + if (DECL_INITIAL (olddecl)) + for (auto name : { "arm_new_za" }) + if (!lookup_attribute (name, DECL_ATTRIBUTES (olddecl)) + && lookup_attribute (name, DECL_ATTRIBUTES (newdecl))) + { + error ("cannot apply attribute %qs to %q+D after the function" + " has been defined", name, newdecl); + inform (DECL_SOURCE_LOCATION (olddecl), "%q+D defined here", + newdecl); + attrs = remove_attribute (name, attrs); + } + + return attrs; +} + /* Implement TARGET_GET_MULTILIB_ABI_NAME */ static const char * @@ -28178,6 +28548,24 @@ aarch64_indirect_call_asm (rtx addr) return ""; } +/* Implement TARGET_MD_ASM_ADJUST. */ + +rtx_insn * +aarch64_md_asm_adjust (vec &outputs, vec &inputs, + vec &input_modes, + vec &constraints, + vec &uses, vec &clobbers, + HARD_REG_SET &clobbered_regs, location_t loc) +{ + /* "za" in the clobber list is defined to mean that the asm can read + from and write to ZA. 
*/ + if (TEST_HARD_REG_BIT (clobbered_regs, ZA_REGNUM)) + uses.safe_push (gen_rtx_REG (VNx16QImode, ZA_REGNUM)); + + return arm_md_asm_adjust (outputs, inputs, input_modes, constraints, + uses, clobbers, clobbered_regs, loc); +} + /* If CALL involves a change in PSTATE.SM, emit the instructions needed to switch to the new mode and the instructions needed to restore the original mode. Return true if something changed. */ @@ -28565,6 +28953,9 @@ aarch64_run_selftests (void) #undef TARGET_START_CALL_ARGS #define TARGET_START_CALL_ARGS aarch64_start_call_args +#undef TARGET_END_CALL_ARGS +#define TARGET_END_CALL_ARGS aarch64_end_call_args + #undef TARGET_GIMPLE_FOLD_BUILTIN #define TARGET_GIMPLE_FOLD_BUILTIN aarch64_gimple_fold_builtin @@ -28926,6 +29317,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_COMP_TYPE_ATTRIBUTES #define TARGET_COMP_TYPE_ATTRIBUTES aarch64_comp_type_attributes +#undef TARGET_MERGE_DECL_ATTRIBUTES +#define TARGET_MERGE_DECL_ATTRIBUTES aarch64_merge_decl_attributes + #undef TARGET_GET_MULTILIB_ABI_NAME #define TARGET_GET_MULTILIB_ABI_NAME aarch64_get_multilib_abi_name @@ -28947,7 +29341,7 @@ aarch64_libgcc_floating_mode_supported_p #define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true #undef TARGET_MD_ASM_ADJUST -#define TARGET_MD_ASM_ADJUST arm_md_asm_adjust +#define TARGET_MD_ASM_ADJUST aarch64_md_asm_adjust #undef TARGET_ASM_FILE_END #define TARGET_ASM_FILE_END aarch64_asm_file_end diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index f23edea35f5..b5877e7e61e 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -207,6 +207,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* Macros to test ISA flags. */ #define AARCH64_ISA_SM_OFF (aarch64_isa_flags & AARCH64_FL_SM_OFF) +#define AARCH64_ISA_ZA_ON (aarch64_isa_flags & AARCH64_FL_ZA_ON) #define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) #define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO) @@ -259,6 +260,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define TARGET_STREAMING_COMPATIBLE \ ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0) +/* PSTATE.ZA is enabled in the current function body. */ +#define TARGET_ZA (AARCH64_ISA_ZA_ON) + /* Crypto is an optional extension to AdvSIMD. */ #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO) @@ -445,7 +449,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; 1, 1, 1, 1, /* SFP, AP, CC, VG */ \ 0, 0, 0, 0, 0, 0, 0, 0, /* P0 - P7 */ \ 0, 0, 0, 0, 0, 0, 0, 0, /* P8 - P15 */ \ - 1, 1 /* FFR and FFRT */ \ + 1, 1, /* FFR and FFRT */ \ + 1, 1 /* TPIDR2 and ZA */ \ } /* X30 is marked as caller-saved which is in line with regular function call @@ -455,7 +460,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; true but not until function epilogues have been generated. This ensures that X30 is available for use in leaf functions if needed. 
*/ -#define CALL_USED_REGISTERS \ +#define CALL_REALLY_USED_REGISTERS \ { \ 1, 1, 1, 1, 1, 1, 1, 1, /* R0 - R7 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* R8 - R15 */ \ @@ -468,7 +473,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; 1, 1, 1, 1, /* SFP, AP, CC, VG */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* P0 - P7 */ \ 1, 1, 1, 1, 1, 1, 1, 1, /* P8 - P15 */ \ - 1, 1 /* FFR and FFRT */ \ + 1, 1, /* FFR and FFRT */ \ + 1, 0 /* TPIDR2 and ZA */ \ } #define REGISTER_NAMES \ @@ -484,7 +490,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; "sfp", "ap", "cc", "vg", \ "p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", \ "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15", \ - "ffr", "ffrt" \ + "ffr", "ffrt", \ + "za", "old_za" \ } /* Generate the register aliases for core register N */ @@ -533,7 +540,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define FRAME_POINTER_REGNUM SFP_REGNUM #define STACK_POINTER_REGNUM SP_REGNUM #define ARG_POINTER_REGNUM AP_REGNUM -#define FIRST_PSEUDO_REGISTER (FFRT_REGNUM + 1) +#define FIRST_PSEUDO_REGISTER (OLD_ZA_REGNUM + 1) /* The number of argument registers available for each class. */ #define NUM_ARG_REGS 8 @@ -673,6 +680,7 @@ enum reg_class PR_REGS, FFR_REGS, PR_AND_FFR_REGS, + ZA_REGS, ALL_REGS, LIM_REG_CLASSES /* Last */ }; @@ -696,6 +704,7 @@ enum reg_class "PR_REGS", \ "FFR_REGS", \ "PR_AND_FFR_REGS", \ + "ZA_REGS", \ "ALL_REGS" \ } @@ -716,6 +725,7 @@ enum reg_class { 0x00000000, 0x00000000, 0x000ffff0 }, /* PR_REGS */ \ { 0x00000000, 0x00000000, 0x00300000 }, /* FFR_REGS */ \ { 0x00000000, 0x00000000, 0x003ffff0 }, /* PR_AND_FFR_REGS */ \ + { 0x00000000, 0x00000000, 0x00c00000 }, /* ZA_REGS */ \ { 0xffffffff, 0xffffffff, 0x000fffff } /* ALL_REGS */ \ } @@ -889,16 +899,36 @@ struct GTY (()) aarch64_frame /* True if shadow call stack should be enabled for the current function. */ bool is_scs_enabled; + + /* True if the function has an arm_new_za attribute and if ZA is + actually used by the function. */ + bool has_new_za_state; }; typedef struct GTY (()) machine_function { struct aarch64_frame frame; + /* One entry for each hard register. */ bool reg_is_wrapped_separately[LAST_SAVED_REGNUM]; + /* One entry for each general purpose register. */ rtx call_via[SP_REGNUM]; + + /* A MEM for the whole of the function's TPIDR2 block, or null if the + function doesn't have a TPIDR2 block. */ + rtx tpidr2_block; + + /* A pseudo register that points to the function's TPIDR2 block, or null + if the function doesn't have a TPIDR2 block. */ + rtx tpidr2_block_ptr; + + /* A pseudo register that points to the function's ZA save buffer, + or null if none. */ + rtx za_save_buffer; + bool label_is_assembled; + /* True if we've expanded at least one call to a function that changes PSTATE.SM. This should only be used for saving compile time: false guarantees that no such mode switch exists. */ @@ -968,6 +998,9 @@ typedef struct bool silent_p; /* True if we should act silently, rather than raise an error for invalid calls. */ + /* True if the call preserves ZA. */ + bool preserves_za; + /* A list of registers that need to be saved and restored around a change to PSTATE.SM. An auto_vec would be more convenient, but those can't be copied. 
*/ diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 991f46fbc80..3ebe8690c31 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -111,6 +111,11 @@ (define_constants ;; "FFR token": a fake register used for representing the scheduling ;; restrictions on FFR-related operations. (FFRT_REGNUM 85) + (ZA_REGNUM 86) + ;; Represents a lazy-populated back-up of the ZA contents, as managed + ;; by TPIDR2_EL0. Modelling this as a simple register allows the RTL + ;; optimizers to remove redundant saves and restores. + (OLD_ZA_REGNUM 87) ;; The pair of scratch registers used for stack probing with -fstack-check. ;; Leave R9 alone as a possible choice for the static chain. ;; Note that the use of these registers is mutually exclusive with the use @@ -303,6 +308,9 @@ (define_c_enum "unspec" [ UNSPEC_TAG_SPACE ; Translate address to MTE tag address space. UNSPEC_LD1RO UNSPEC_SALT_ADDR + ;; Wraps a constant integer that should be multiplied by the number + ;; of quadwords in an SME vector. + UNSPEC_SME_VQ ]) (define_c_enum "unspecv" [ @@ -374,7 +382,7 @@ (define_constants ;; As a convenience, "fp_q" means "fp" + the ability to move between ;; Q registers and is equivalent to "simd". -(define_enum "arches" [any rcpc8_4 fp fp_q base_simd simd sve fp16]) +(define_enum "arches" [any rcpc8_4 fp fp_q base_simd simd sve fp16 sme]) (define_enum_attr "arch" "arches" (const_string "any")) @@ -412,7 +420,10 @@ (define_attr "arch_enabled" "no,yes" (match_test "TARGET_FP_F16INST")) (and (eq_attr "arch" "sve") - (match_test "TARGET_SVE"))) + (match_test "TARGET_SVE")) + + (and (eq_attr "arch" "sme") + (match_test "TARGET_SME"))) (const_string "yes") (const_string "no"))) @@ -915,7 +926,7 @@ (define_insn "simple_return" (set_attr "sls_length" "retbr")] ) -(define_insn "*cb1" +(define_insn "aarch64_cb1" [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r") (const_int 0)) (label_ref (match_operand 1 "" "")) @@ -1268,8 +1279,8 @@ (define_expand "mov" ) (define_insn_and_split "*movsi_aarch64" - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r, r, r,r,w, m,m, r, r, r, w,r,w, w") - (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,Usv,Usr,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))] + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r, r, r, r,r,w, m,m, r, r, r, w,r,w, w") + (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,Usv,Usr,UsR,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Ds"))] "(register_operand (operands[0], SImode) || aarch64_reg_or_zero (operands[1], SImode))" "@ @@ -1280,6 +1291,7 @@ (define_insn_and_split "*movsi_aarch64" # * return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]); * return aarch64_output_sve_rdvl (operands[1]); + * return aarch64_output_rdsvl (operands[1]); ldr\\t%w0, %1 ldr\\t%s0, %1 str\\t%w1, %0 @@ -1300,17 +1312,17 @@ (define_insn_and_split "*movsi_aarch64" }" ;; The "mov_imm" type for CNT is just a placeholder. 
[(set_attr "type" "mov_reg,mov_reg,mov_reg, - mov_imm,mov_imm,mov_imm,mov_imm, + mov_imm,mov_imm,mov_imm,mov_imm,mov_imm, load_4,load_4,store_4,store_4,load_4, adr,adr,f_mcr,f_mrc,fmov,neon_move") - (set_attr "arch" "*,*,*,*,*,sve,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd") - (set_attr "length" "4,4,4,4,*, 4, 4,4, 4,4, 4,8,4,4, 4, 4, 4, 4") + (set_attr "arch" "*,*,*,*,*,sve,sve,sme,*,fp,*,fp,*,*,*,fp,fp,fp,simd") + (set_attr "length" "4,4,4,4,*, 4, 4, 4,4, 4,4, 4,8,4,4, 4, 4, 4, 4") ] ) (define_insn_and_split "*movdi_aarch64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r, r,r,w, m,m, r, r, r, w,r,w, w") - (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,M,n,Usv,Usr,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Dd"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r, r, r, r,r,w, m,m, r, r, r, w,r,w, w") + (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,M,n,Usv,Usr,UsR,m,m,rZ,w,Usw,Usa,Ush,rZ,w,w,Dd"))] "(register_operand (operands[0], DImode) || aarch64_reg_or_zero (operands[1], DImode))" "@ @@ -1322,6 +1334,7 @@ (define_insn_and_split "*movdi_aarch64" # * return aarch64_output_sve_cnt_immediate (\"cnt\", \"%x0\", operands[1]); * return aarch64_output_sve_rdvl (operands[1]); + * return aarch64_output_rdsvl (operands[1]); ldr\\t%x0, %1 ldr\\t%d0, %1 str\\t%x1, %0 @@ -1342,11 +1355,11 @@ (define_insn_and_split "*movdi_aarch64" }" ;; The "mov_imm" type for CNTD is just a placeholder. [(set_attr "type" "mov_reg,mov_reg,mov_reg, - mov_imm,mov_imm,mov_imm,mov_imm,mov_imm, + mov_imm,mov_imm,mov_imm,mov_imm,mov_imm,mov_imm, load_8,load_8,store_8,store_8,load_8, adr,adr,f_mcr,f_mrc,fmov,neon_move") - (set_attr "arch" "*,*,*,*,*,*,sve,sve,*,fp,*,fp,*,*,*,fp,fp,fp,simd") - (set_attr "length" "4,4,4,4,4,*, 4, 4,4, 4,4, 4,8,4,4, 4, 4, 4, 4")] + (set_attr "arch" "*,*,*,*,*,*,sve,sve,sme,*,fp,*,fp,*,*,*,fp,fp,fp,simd") + (set_attr "length" "4,4,4,4,4,*, 4, 4, 4,4, 4,4, 4,8,4,4, 4, 4, 4, 4")] ) (define_insn "insv_imm" diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 3664e4dbdd6..8d4393f30a1 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -215,6 +215,12 @@ (define_constraint "Usr" (and (match_code "const_poly_int") (match_test "aarch64_sve_rdvl_immediate_p (op)"))) +(define_constraint "UsR" + "@internal + A constraint that matches a value produced by RDSVL." + (and (match_code "const") + (match_test "aarch64_rdsvl_immediate_p (op)"))) + (define_constraint "Usv" "@internal A constraint that matches a VG-based constant that can be loaded by diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c new file mode 100644 index 00000000000..cd3cfd0cf4d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_1.c @@ -0,0 +1,102 @@ +// { dg-options "" } + +void __attribute__((arm_shared_za)) shared_a (); +void shared_a (); // { dg-error "conflicting types" } + +void shared_b (); +void __attribute__((arm_shared_za)) shared_b (); // { dg-error "conflicting types" } + +void __attribute__((arm_shared_za)) shared_c (); +void shared_c () {} // Inherits attribute from declaration (confusingly). 
+ +void shared_d (); +void __attribute__((arm_shared_za)) shared_d () {} // { dg-error "conflicting types" } + +void __attribute__((arm_shared_za)) shared_e () {} +void shared_e (); // { dg-error "conflicting types" } + +void shared_f () {} +void __attribute__((arm_shared_za)) shared_f (); // { dg-error "conflicting types" } + +extern void (*shared_g) (); +extern __attribute__((arm_shared_za)) void (*shared_g) (); // { dg-error "conflicting types" } + +extern __attribute__((arm_shared_za)) void (*shared_h) (); +extern void (*shared_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_preserves_za)) preserves_a (); +void preserves_a (); // { dg-error "conflicting types" } + +void preserves_b (); +void __attribute__((arm_preserves_za)) preserves_b (); // { dg-error "conflicting types" } + +void __attribute__((arm_preserves_za)) preserves_c (); +void preserves_c () {} // Inherits attribute from declaration (confusingly). + +void preserves_d (); +void __attribute__((arm_preserves_za)) preserves_d () {} // { dg-error "conflicting types" } + +void __attribute__((arm_preserves_za)) preserves_e () {} +void preserves_e (); // { dg-error "conflicting types" } + +void preserves_f () {} +void __attribute__((arm_preserves_za)) preserves_f (); // { dg-error "conflicting types" } + +extern void (*preserves_g) (); +extern __attribute__((arm_preserves_za)) void (*preserves_g) (); // { dg-error "conflicting types" } + +extern __attribute__((arm_preserves_za)) void (*preserves_h) (); +extern void (*preserves_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_preserves_za)) mixed_a (); +void __attribute__((arm_shared_za)) mixed_a (); // { dg-error "conflicting types" } + +void __attribute__((arm_shared_za)) mixed_b (); +void __attribute__((arm_preserves_za)) mixed_b (); // { dg-error "conflicting types" } + +void __attribute__((arm_preserves_za)) mixed_c (); +void __attribute__((arm_shared_za)) mixed_c () {} // { dg-error "conflicting types" } + +void __attribute__((arm_shared_za)) mixed_d (); +void __attribute__((arm_preserves_za)) mixed_d () {} // { dg-error "conflicting types" } + +void __attribute__((arm_preserves_za)) mixed_e () {} +void __attribute__((arm_shared_za)) mixed_e (); // { dg-error "conflicting types" } + +void __attribute__((arm_shared_za)) mixed_f () {} +void __attribute__((arm_preserves_za)) mixed_f (); // { dg-error "conflicting types" } + +extern __attribute__((arm_shared_za)) void (*mixed_g) (); +extern __attribute__((arm_preserves_za)) void (*mixed_g) (); // { dg-error "conflicting types" } + +extern __attribute__((arm_preserves_za)) void (*mixed_h) (); +extern __attribute__((arm_shared_za)) void (*mixed_h) (); // { dg-error "conflicting types" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_preserves_za, arm_shared_za)) complementary_1(); +void __attribute__((arm_shared_za, arm_preserves_za)) complementary_2(); + +int __attribute__((arm_shared_za)) int_attr; // { dg-warning "only applies to function types" } +void *__attribute__((arm_preserves_za)) ptr_attr; // { dg-warning "only applies to function types" } + +typedef void __attribute__((arm_preserves_za)) preserves_callback (); +typedef void __attribute__((arm_shared_za)) shared_callback (); + +void (*__attribute__((arm_preserves_za)) preserves_callback_ptr) (); +void 
(*__attribute__((arm_shared_za)) shared_callback_ptr) (); + +typedef void __attribute__((arm_preserves_za, arm_shared_za)) complementary_callback_1 (); +typedef void __attribute__((arm_shared_za, arm_preserves_za)) complementary_callback_2 (); + +void __attribute__((arm_preserves_za, arm_shared_za)) (*complementary_callback_ptr_1) (); +void __attribute__((arm_shared_za, arm_preserves_za)) (*complementary_callback_ptr_2) (); + +struct s { + void __attribute__((arm_preserves_za, arm_shared_za)) (*complementary_callback_ptr_1) (); + void __attribute__((arm_shared_za, arm_preserves_za)) (*complementary_callback_ptr_2) (); +}; diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c new file mode 100644 index 00000000000..261c500aff1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_2.c @@ -0,0 +1,96 @@ +// { dg-options "" } + +void __attribute__((arm_new_za)) new_za_a (); +void new_za_a (); + +void new_za_b (); +void __attribute__((arm_new_za)) new_za_b (); + +void __attribute__((arm_new_za)) new_za_c (); +void new_za_c () {} + +void new_za_d (); +void __attribute__((arm_new_za)) new_za_d () {} + +void __attribute__((arm_new_za)) new_za_e () {} +void new_za_e (); + +void new_za_f () {} +void __attribute__((arm_new_za)) new_za_f (); // { dg-error "cannot apply attribute 'arm_new_za' to 'new_za_f' after the function has been defined" } + +extern void (*new_za_g) (); +extern __attribute__((arm_new_za)) void (*new_za_g) (); // { dg-error "applies only to functions" } + +extern __attribute__((arm_new_za)) void (*new_za_h) (); // { dg-error "applies only to functions" } +extern void (*new_za_h) (); + +//---------------------------------------------------------------------------- + +void __attribute__((arm_new_za)) shared_a (); +void __attribute__((arm_shared_za)) shared_a (); // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_shared_za)) shared_b (); +void __attribute__((arm_new_za)) shared_b (); // { dg-error "conflicting types" } +// { dg-warning "conflicts with attribute" "" { target *-*-* } .-1 } + +void __attribute__((arm_new_za)) shared_c (); +void __attribute__((arm_shared_za)) shared_c () {} // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_shared_za)) shared_d (); +void __attribute__((arm_new_za)) shared_d () {} // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_new_za)) shared_e () {} +void __attribute__((arm_shared_za)) shared_e (); // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_shared_za)) shared_f () {} +void __attribute__((arm_new_za)) shared_f (); // { dg-error "conflicting types" } +// { dg-warning "conflicts with attribute" "" { target *-*-* } .-1 } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_new_za)) preserves_a (); +void __attribute__((arm_preserves_za)) preserves_a (); // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_preserves_za)) preserves_b (); +void __attribute__((arm_new_za)) preserves_b (); // { dg-error "conflicting types" } +// { dg-warning "conflicts with attribute" "" { target *-*-* } .-1 } + +void __attribute__((arm_new_za)) preserves_c (); +void __attribute__((arm_preserves_za)) preserves_c () {} // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_preserves_za)) preserves_d (); +void __attribute__((arm_new_za)) preserves_d () {} // { dg-warning "conflicts with attribute" } + +void 
__attribute__((arm_new_za)) preserves_e () {} +void __attribute__((arm_preserves_za)) preserves_e (); // { dg-warning "conflicts with attribute" } + +void __attribute__((arm_preserves_za)) preserves_f () {} +void __attribute__((arm_new_za)) preserves_f (); // { dg-error "conflicting types" } +// { dg-warning "conflicts with attribute" "" { target *-*-* } .-1 } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_new_za, arm_shared_za)) contradiction_1(); // { dg-warning "conflicts with attribute" } +void __attribute__((arm_shared_za, arm_new_za)) contradiction_2(); // { dg-warning "conflicts with attribute" } +void __attribute__((arm_new_za, arm_preserves_za)) contradiction_3(); // { dg-warning "conflicts with attribute" } +void __attribute__((arm_preserves_za, arm_new_za)) contradiction_4(); // { dg-warning "conflicts with attribute" } + +int __attribute__((arm_new_za)) int_attr; // { dg-error "applies only to functions" } +typedef __attribute__((arm_new_za)) int int_typdef; // { dg-error "applies only to functions" } +typedef void __attribute__((arm_new_za)) new_za_callback (); // { dg-error "applies only to functions" } + +//---------------------------------------------------------------------------- + +void __attribute__((arm_streaming, arm_new_za)) complementary_1 () {} +void __attribute__((arm_new_za, arm_streaming)) complementary_2 () {} +void __attribute__((arm_streaming_compatible, arm_new_za)) complementary_3 () {} +void __attribute__((arm_new_za, arm_streaming_compatible)) complementary_4 () {} + +//---------------------------------------------------------------------------- + +#pragma GCC target "+nosme" + +void __attribute__((arm_new_za)) bereft_1 (); +void __attribute__((arm_new_za)) bereft_2 () {} // { dg-error "functions with ZA state require the ISA extension 'sme'" } +void __attribute__((arm_shared_za)) bereft_3 (); +void __attribute__((arm_shared_za)) bereft_4 () {} // { dg-error "functions with ZA state require the ISA extension 'sme'" } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c new file mode 100644 index 00000000000..fc5771070e4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_3.c @@ -0,0 +1,27 @@ +// { dg-options "" } + +void normal_callee (); +__attribute__((arm_shared_za)) void shared_callee (); +__attribute__((arm_preserves_za)) void preserves_callee (); +__attribute__((arm_shared_za, arm_preserves_za)) void shared_preserves_callee (); + +struct callbacks { + void (*normal_ptr) (); + __attribute__((arm_shared_za)) void (*shared_ptr) (); + __attribute__((arm_preserves_za)) void (*preserves_ptr) (); + __attribute__((arm_shared_za, arm_preserves_za)) void (*shared_preserves_ptr) (); +}; + +void +normal_caller (struct callbacks *c) +{ + normal_callee (); + shared_callee (); // { dg-error "call to an 'arm_shared_za' function from a function that has no ZA state" } + preserves_callee (); + shared_preserves_callee (); // { dg-error "call to an 'arm_shared_za' function from a function that has no ZA state" } + + c->normal_ptr (); + c->shared_ptr (); // { dg-error "call to an 'arm_shared_za' function from a function that has no ZA state" } + c->preserves_ptr (); + c->shared_preserves_ptr (); // { dg-error "call to an 'arm_shared_za' function from a function that has no ZA state" } +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c new file mode 100644 index 
00000000000..34369101085 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_4.c @@ -0,0 +1,277 @@ +// { dg-options "-O -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void ns_normal_callee (); +__attribute__((arm_shared_za)) void ns_shared_callee (); +__attribute__((arm_preserves_za)) void ns_preserves_callee (); +__attribute__((arm_shared_za, arm_preserves_za)) void ns_shared_preserves_callee (); + +__attribute__((arm_streaming)) void s_normal_callee (); +__attribute__((arm_streaming, arm_shared_za)) void s_shared_callee (); +__attribute__((arm_streaming, arm_preserves_za)) void s_preserves_callee (); +__attribute__((arm_streaming, arm_shared_za, arm_preserves_za)) void s_shared_preserves_callee (); + +__attribute__((arm_streaming_compatible)) void sc_normal_callee (); +__attribute__((arm_streaming_compatible, arm_shared_za)) void sc_shared_callee (); +__attribute__((arm_streaming_compatible, arm_preserves_za)) void sc_preserves_callee (); +__attribute__((arm_streaming_compatible, arm_shared_za, arm_preserves_za)) void sc_shared_preserves_callee (); + +struct callbacks { + void (*normal_ptr) (); + __attribute__((arm_shared_za)) void (*shared_ptr) (); + __attribute__((arm_preserves_za)) void (*preserves_ptr) (); + __attribute__((arm_shared_za, arm_preserves_za)) void (*shared_preserves_ptr) (); +}; + +/* +** ns_caller1: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** smstart za +** ... +** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl ns_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl ns_shared_callee +** bl ns_preserves_callee +** bl ns_shared_preserves_callee +** msr tpidr2_el0, \1 +** ldr x[0-9]+, .* +** blr x[0-9]+ +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** ldr x[0-9]+, .* +** blr x[0-9]+ +** ldr x[0-9]+, .* +** blr x[0-9]+ +** ldr x[0-9]+, .* +** blr x[0-9]+ +** smstop za +** ... +*/ +void __attribute__((arm_new_za)) +ns_caller1 (struct callbacks *c) +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + c->normal_ptr (); + c->shared_ptr (); + c->preserves_ptr (); + c->shared_preserves_ptr (); +} + +/* +** ns_caller2: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** bl ns_shared_callee +** smstop za +** ... +*/ +void __attribute__((arm_new_za)) +ns_caller2 (struct callbacks *c) +{ + ns_shared_callee (); +} + +/* +** ns_caller3: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** bl ns_preserves_callee +** bl ns_shared_callee +** bl ns_shared_preserves_callee +** smstop za +** ... +*/ +void __attribute__((arm_new_za)) +ns_caller3 (struct callbacks *c) +{ + ns_preserves_callee (); + ns_shared_callee (); + ns_shared_preserves_callee (); +} + +/* +** ns_caller4: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** smstart za +** ... 
+** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** smstart sm +** bl s_normal_callee +** smstop sm +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** smstart sm +** bl s_shared_callee +** smstop sm +** smstart sm +** bl s_preserves_callee +** smstop sm +** smstart sm +** bl s_shared_preserves_callee +** smstop sm +** smstop za +** ... +*/ +void __attribute__((arm_new_za)) +ns_caller4 (struct callbacks *c) +{ + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); +} + +/* +** ns_caller5: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** smstart za +** ... +** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl sc_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl sc_shared_callee +** bl sc_preserves_callee +** bl sc_shared_preserves_callee +** smstop za +** ... +*/ +void __attribute__((arm_new_za)) +ns_caller5 (struct callbacks *c) +{ + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +/* +** s_caller1: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** smstart za +** ... +** add (x[0-9]+), x29, .* +** cntb (x[0-9]+) +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl s_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl s_shared_callee +** bl s_preserves_callee +** bl s_shared_preserves_callee +** smstop za +** ... +*/ +void __attribute__((arm_new_za, arm_streaming)) +s_caller1 (struct callbacks *c) +{ + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); +} + +/* +** sc_caller1: +** ... +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** smstart za +** ... +** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl sc_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl sc_shared_callee +** bl sc_preserves_callee +** bl sc_shared_preserves_callee +** smstop za +** ... 
+*/ +void __attribute__((arm_new_za, arm_streaming_compatible)) +sc_caller1 (struct callbacks *c) +{ + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c new file mode 100644 index 00000000000..b18d3fff652 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_5.c @@ -0,0 +1,241 @@ +// { dg-options "-O -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +void ns_normal_callee (); +__attribute__((arm_shared_za)) void ns_shared_callee (); +__attribute__((arm_preserves_za)) void ns_preserves_callee (); +__attribute__((arm_shared_za, arm_preserves_za)) void ns_shared_preserves_callee (); + +__attribute__((arm_streaming)) void s_normal_callee (); +__attribute__((arm_streaming, arm_shared_za)) void s_shared_callee (); +__attribute__((arm_streaming, arm_preserves_za)) void s_preserves_callee (); +__attribute__((arm_streaming, arm_shared_za, arm_preserves_za)) void s_shared_preserves_callee (); + +__attribute__((arm_streaming_compatible)) void sc_normal_callee (); +__attribute__((arm_streaming_compatible, arm_shared_za)) void sc_shared_callee (); +__attribute__((arm_streaming_compatible, arm_preserves_za)) void sc_preserves_callee (); +__attribute__((arm_streaming_compatible, arm_shared_za, arm_preserves_za)) void sc_shared_preserves_callee (); + +struct callbacks { + void (*normal_ptr) (); + __attribute__((arm_shared_za)) void (*shared_ptr) (); + __attribute__((arm_preserves_za)) void (*preserves_ptr) (); + __attribute__((arm_shared_za, arm_preserves_za)) void (*shared_preserves_ptr) (); +}; + +/* +** ns_caller1: +** ... +** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl ns_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl ns_shared_callee +** bl ns_preserves_callee +** bl ns_shared_preserves_callee +** msr tpidr2_el0, \1 +** ldr x[0-9]+, .* +** blr x[0-9]+ +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** ldr x[0-9]+, .* +** blr x[0-9]+ +** ldr x[0-9]+, .* +** blr x[0-9]+ +** ldr x[0-9]+, .* +** blr x[0-9]+ +** ... +*/ +void __attribute__((arm_shared_za)) +ns_caller1 (struct callbacks *c) +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + c->normal_ptr (); + c->shared_ptr (); + c->preserves_ptr (); + c->shared_preserves_ptr (); +} + +/* +** ns_caller2: +** stp x29, x30, \[sp, #?-16\]! +** mov x29, sp +** bl ns_shared_callee +** ldp x29, x30, \[sp\], #?16 +** ret +*/ +void __attribute__((arm_shared_za)) +ns_caller2 (struct callbacks *c) +{ + ns_shared_callee (); +} + +/* +** ns_caller3: +** stp x29, x30, \[sp, #?-16\]! +** mov x29, sp +** bl ns_preserves_callee +** bl ns_shared_callee +** bl ns_shared_preserves_callee +** ldp x29, x30, \[sp\], #?16 +** ret +*/ +void __attribute__((arm_shared_za)) +ns_caller3 (struct callbacks *c) +{ + ns_preserves_callee (); + ns_shared_callee (); + ns_shared_preserves_callee (); +} + +/* +** ns_caller4: +** ... 
+** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** smstart sm +** bl s_normal_callee +** smstop sm +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** smstart sm +** bl s_shared_callee +** smstop sm +** smstart sm +** bl s_preserves_callee +** smstop sm +** smstart sm +** bl s_shared_preserves_callee +** smstop sm +** ... +*/ +void __attribute__((arm_shared_za)) +ns_caller4 (struct callbacks *c) +{ + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); +} + +/* +** ns_caller5: +** ... +** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl sc_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl sc_shared_callee +** bl sc_preserves_callee +** bl sc_shared_preserves_callee +** ... +*/ +void __attribute__((arm_shared_za)) +ns_caller5 (struct callbacks *c) +{ + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +/* +** s_caller1: +** ... +** add (x[0-9]+), x29, .* +** cntb (x[0-9]+) +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl s_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl s_shared_callee +** bl s_preserves_callee +** bl s_shared_preserves_callee +** ... +*/ +void __attribute__((arm_shared_za, arm_streaming)) +s_caller1 (struct callbacks *c) +{ + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); +} + +/* +** sc_caller1: +** ... +** add (x[0-9]+), x29, .* +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), sp +** msub (x[0-9]+), \2, \2, \3 +** mov sp, \4 +** stp \4, \2, .* +** msr tpidr2_el0, \1 +** bl sc_normal_callee +** mrs x16, tpidr2_el0 +** cbnz x16, .* +** smstart za +** mov x0, \1 +** bl __arm_tpidr2_restore +** msr tpidr2_el0, xzr +** bl sc_shared_callee +** bl sc_preserves_callee +** bl sc_shared_preserves_callee +** ... 
+*/ +void __attribute__((arm_shared_za, arm_streaming_compatible)) +sc_caller1 (struct callbacks *c) +{ + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +// { dg-final { scan-assembler-not {\tsmstop\tza} } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c new file mode 100644 index 00000000000..c0b9e2275f2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_6.c @@ -0,0 +1,132 @@ +// { dg-options "-O -fno-optimize-sibling-calls" } + +void ns_normal_callee (); +__attribute__((arm_shared_za)) void ns_shared_callee (); +__attribute__((arm_preserves_za)) void ns_preserves_callee (); +__attribute__((arm_shared_za, arm_preserves_za)) void ns_shared_preserves_callee (); + +__attribute__((arm_streaming)) void s_normal_callee (); +__attribute__((arm_streaming, arm_shared_za)) void s_shared_callee (); +__attribute__((arm_streaming, arm_preserves_za)) void s_preserves_callee (); +__attribute__((arm_streaming, arm_shared_za, arm_preserves_za)) void s_shared_preserves_callee (); + +__attribute__((arm_streaming_compatible)) void sc_normal_callee (); +__attribute__((arm_streaming_compatible, arm_shared_za)) void sc_shared_callee (); +__attribute__((arm_streaming_compatible, arm_preserves_za)) void sc_preserves_callee (); +__attribute__((arm_streaming_compatible, arm_shared_za, arm_preserves_za)) void sc_shared_preserves_callee (); + +void __attribute__((arm_new_za)) +caller1 () +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); + + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +void __attribute__((arm_shared_za)) +caller2 () +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); + + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +void __attribute__((arm_new_za, arm_streaming)) +caller3 () +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); + + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +void __attribute__((arm_shared_za, arm_streaming)) +caller4 () +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); + + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +void __attribute__((arm_new_za, arm_streaming_compatible)) +caller5 () +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + ns_shared_preserves_callee (); + + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); + + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +void __attribute__((arm_shared_za, arm_streaming_compatible)) +caller6 () +{ + ns_normal_callee (); + ns_shared_callee (); + ns_preserves_callee (); + 
ns_shared_preserves_callee (); + + s_normal_callee (); + s_shared_callee (); + s_preserves_callee (); + s_shared_preserves_callee (); + + sc_normal_callee (); + sc_shared_callee (); + sc_preserves_callee (); + sc_shared_preserves_callee (); +} + +// { dg-final { scan-assembler-times {\tmsr\ttpidr2_el0, xzr} 18 } } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c b/gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c new file mode 100644 index 00000000000..4b588517cb1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/za_state_7.c @@ -0,0 +1,55 @@ +// { dg-options "-O -fno-optimize-sibling-calls -fomit-frame-pointer" } +// { dg-final { check-function-bodies "**" "" } } + +/* +** za1: +** mov w0, #?1 +** ret +*/ +int __attribute__((arm_new_za)) +za1 () +{ + asm (""); + return 1; +} + +/* +** za2: +** str x30, \[sp, #?-16\]! +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** mov w0, #?1 +** smstop za +** ldr x30, \[sp\], #?16 +** ret +*/ +int __attribute__((arm_new_za)) +za2 () +{ + asm ("" ::: "za"); + return 1; +} + +/* +** za3: +** str x30, \[sp, #?-16\]! +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** mov w0, w2 +** smstop za +** ldr x30, \[sp\], #?16 +** ret +*/ +int __attribute__((arm_new_za)) +za3 () +{ + register int ret asm ("x2"); + asm ("" : "=r" (ret) :: "za"); + return ret; +} From patchwork Sun Nov 13 10:01:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60512 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 25C0D3898510 for ; Sun, 13 Nov 2022 10:03:09 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 25C0D3898510 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333789; bh=huMi7uiM5o/hSAJAiXgs1eIdOdlSvVDdknAAbp8U9U0=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=nBip2mIyexHXMCAFyP1rNr5JWounp+vpqIXdwYq/Botkxi4IBIAfcdx4aFcTcY6J0 XE31tjJQcvL/gucpe1ZsMoyJpNeJuHdeJiR+Q3BY9LLpSrR8NRx1sXbfwVgkuwKRRk B7AJYDc+1ku/a9dxhY87XyA3drkgCmPAnaNVSttI= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 9D8A53893663 for ; Sun, 13 Nov 2022 10:01:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9D8A53893663 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8CD7123A for ; Sun, 13 Nov 2022 02:01:34 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id F19D93F73D for ; Sun, 13 Nov 2022 02:01:27 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 07/16] aarch64: Add a register class for w12-w15 References: Date: Sun, 13 Nov 2022 10:01:26 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, 
KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_NUMSUBJECT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Some SME instructions use w12-w15 to index ZA. This patch adds a register class for that range. gcc/ * config/aarch64/aarch64.h (ZA_INDEX_REGNUM_P): New macro. (ZA_INDEX_REGS): New register class. (REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add entries for it. * config/aarch64/aarch64.cc (aarch64_regno_regclass) (aarch64_class_max_nregs, aarch64_register_move_cost): Handle ZA_INDEX_REGS. --- gcc/config/aarch64/aarch64.cc | 12 +++++++----- gcc/config/aarch64/aarch64.h | 6 ++++++ 2 files changed, 13 insertions(+), 5 deletions(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index b200d2a9f80..d29cfefee6b 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -13553,6 +13553,9 @@ aarch64_label_mentioned_p (rtx x) enum reg_class aarch64_regno_regclass (unsigned regno) { + if (ZA_INDEX_REGNUM_P (regno)) + return ZA_INDEX_REGS; + if (STUB_REGNUM_P (regno)) return STUB_REGS; @@ -13917,6 +13920,7 @@ aarch64_class_max_nregs (reg_class_t regclass, machine_mode mode) unsigned int nregs, vec_flags; switch (regclass) { + case ZA_INDEX_REGS: case STUB_REGS: case TAILCALL_ADDR_REGS: case POINTER_REGS: @@ -16252,13 +16256,11 @@ aarch64_register_move_cost (machine_mode mode, const struct cpu_regmove_cost *regmove_cost = aarch64_tune_params.regmove_cost; - /* Caller save and pointer regs are equivalent to GENERAL_REGS. */ - if (to == TAILCALL_ADDR_REGS || to == POINTER_REGS - || to == STUB_REGS) + /* Treat any subset of GENERAL_REGS as though it were GENERAL_REGS. */ + if (reg_class_subset_p (to, GENERAL_REGS)) to = GENERAL_REGS; - if (from == TAILCALL_ADDR_REGS || from == POINTER_REGS - || from == STUB_REGS) + if (reg_class_subset_p (from, GENERAL_REGS)) from = GENERAL_REGS; /* Make RDFFR very expensive.
In particular, if we know that the FFR diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index b5877e7e61e..bfa28726221 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -643,6 +643,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; && (REGNO) != R17_REGNUM \ && (REGNO) != R30_REGNUM) \ +#define ZA_INDEX_REGNUM_P(REGNO) \ + IN_RANGE (REGNO, R12_REGNUM, R15_REGNUM) + #define FP_REGNUM_P(REGNO) \ (((unsigned) (REGNO - V0_REGNUM)) <= (V31_REGNUM - V0_REGNUM)) @@ -666,6 +669,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; enum reg_class { NO_REGS, + ZA_INDEX_REGS, TAILCALL_ADDR_REGS, STUB_REGS, GENERAL_REGS, @@ -690,6 +694,7 @@ enum reg_class #define REG_CLASS_NAMES \ { \ "NO_REGS", \ + "ZA_INDEX_REGS", \ "TAILCALL_ADDR_REGS", \ "STUB_REGS", \ "GENERAL_REGS", \ @@ -711,6 +716,7 @@ enum reg_class #define REG_CLASS_CONTENTS \ { \ { 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \ + { 0x0000f000, 0x00000000, 0x00000000 }, /* ZA_INDEX_REGS */ \ { 0x00030000, 0x00000000, 0x00000000 }, /* TAILCALL_ADDR_REGS */\ { 0x3ffcffff, 0x00000000, 0x00000000 }, /* STUB_REGS */ \ { 0x7fffffff, 0x00000000, 0x00000003 }, /* GENERAL_REGS */ \ From patchwork Sun Nov 13 10:01:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60511 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 30BD838983BE for ; Sun, 13 Nov 2022 10:03:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 30BD838983BE DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333787; bh=T2AEtLsSjWwIlbYzbd/AAAgy+AYd/I4EAcomt0pWi0k=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=pYNNpIOXmpmZSkLEec0BWdnExDkXVCzZ2+L+Du8BJbdPp9zuGk4Nab2Y6m0w+/OGC tyvaS4HPPxcLRUW0fgyu1pijziskaZPmL4pw+uVp+tjiDsbPul+5mPnEJeGhdv/u9b Z0a7xDT3bRhGc9i998kmDj1WZIGZv+otLgWEdREk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 7E2D43888C41 for ; Sun, 13 Nov 2022 10:01:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7E2D43888C41 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 82EEE23A for ; Sun, 13 Nov 2022 02:01:46 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E77C03F73D for ; Sun, 13 Nov 2022 02:01:39 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 08/16] aarch64: Add a VNx1TI mode References: Date: Sun, 13 Nov 2022 10:01:38 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: 
gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Although TI isn't really a native SVE element mode, it's convenient for SME if we define VNx1TI anyway, so that it can be used to distinguish .Q ZA operations from others. It's purely an RTL convenience and isn't (yet) a valid storage mode. gcc/ * config/aarch64/aarch64-modes.def: Add VNx1TI. --- gcc/config/aarch64/aarch64-modes.def | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/gcc/config/aarch64/aarch64-modes.def b/gcc/config/aarch64/aarch64-modes.def index 0fd4c32ad0b..e960b649a6b 100644 --- a/gcc/config/aarch64/aarch64-modes.def +++ b/gcc/config/aarch64/aarch64-modes.def @@ -148,7 +148,7 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2) for 8-bit, 16-bit, 32-bit and 64-bit elements respectively. It isn't strictly necessary to set the alignment here, since the default would be clamped to BIGGEST_ALIGNMENT anyhow, but it seems clearer. */ -#define SVE_MODES(NVECS, VB, VH, VS, VD) \ +#define SVE_MODES(NVECS, VB, VH, VS, VD, VT) \ VECTOR_MODES_WITH_PREFIX (VNx, INT, 16 * NVECS, NVECS == 1 ? 1 : 4); \ VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 16 * NVECS, NVECS == 1 ? 1 : 4); \ \ @@ -156,6 +156,7 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2) ADJUST_NUNITS (VH##HI, aarch64_sve_vg * NVECS * 4); \ ADJUST_NUNITS (VS##SI, aarch64_sve_vg * NVECS * 2); \ ADJUST_NUNITS (VD##DI, aarch64_sve_vg * NVECS); \ + ADJUST_NUNITS (VT##TI, exact_div (aarch64_sve_vg * NVECS, 2)); \ ADJUST_NUNITS (VH##BF, aarch64_sve_vg * NVECS * 4); \ ADJUST_NUNITS (VH##HF, aarch64_sve_vg * NVECS * 4); \ ADJUST_NUNITS (VS##SF, aarch64_sve_vg * NVECS * 2); \ @@ -165,17 +166,23 @@ ADV_SIMD_Q_REG_STRUCT_MODES (4, V4x16, V4x8, V4x4, V4x2) ADJUST_ALIGNMENT (VH##HI, 16); \ ADJUST_ALIGNMENT (VS##SI, 16); \ ADJUST_ALIGNMENT (VD##DI, 16); \ + ADJUST_ALIGNMENT (VT##TI, 16); \ ADJUST_ALIGNMENT (VH##BF, 16); \ ADJUST_ALIGNMENT (VH##HF, 16); \ ADJUST_ALIGNMENT (VS##SF, 16); \ ADJUST_ALIGNMENT (VD##DF, 16); -/* Give SVE vectors the names normally used for 256-bit vectors. - The actual number depends on command-line flags. */ -SVE_MODES (1, VNx16, VNx8, VNx4, VNx2) -SVE_MODES (2, VNx32, VNx16, VNx8, VNx4) -SVE_MODES (3, VNx48, VNx24, VNx12, VNx6) -SVE_MODES (4, VNx64, VNx32, VNx16, VNx8) +/* Give SVE vectors names of the form VNxX, where X describes what is + stored in each 128-bit unit. The actual size of the mode depends + on command-line flags. + + VNx1TI isn't really a native SVE mode, but it can be useful in some + limited situations. 
*/ +VECTOR_MODE_WITH_PREFIX (VNx, INT, TI, 1, 1); +SVE_MODES (1, VNx16, VNx8, VNx4, VNx2, VNx1) +SVE_MODES (2, VNx32, VNx16, VNx8, VNx4, VNx2) +SVE_MODES (3, VNx48, VNx24, VNx12, VNx6, VNx3) +SVE_MODES (4, VNx64, VNx32, VNx16, VNx8, VNx4) /* Partial SVE vectors: From patchwork Sun Nov 13 10:01:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60519 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C7BC33858291 for ; Sun, 13 Nov 2022 10:05:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C7BC33858291 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333949; bh=BRQLyfjA1cs3BE3T3Z+kqTN6+WjX8hwQ0raEMNoakYw=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=ZqzMDyqt97+nXQHEMj4W5jr9aZxomKipG4BboQMW/66x6HzEX6W/RgdUguFZneUON PxVh4202LVVKAuIHfo/jVWoOrMp5X6sKRcWCEAWDFQF2ihWpVtPWGblUcvQrH+YTyw d+Wo/F6pDt5ifXpzbodyd0YfNXq5o1U4OufRvhAE= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id BE7DF3896C1B for ; Sun, 13 Nov 2022 10:01:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BE7DF3896C1B Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AC6F723A for ; Sun, 13 Nov 2022 02:02:01 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1CCF83F73D for ; Sun, 13 Nov 2022 02:01:55 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 09/16] aarch64: Make AARCH64_FL_SVE requirements explicit References: Date: Sun, 13 Nov 2022 10:01:53 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" So far, all intrinsics covered by the aarch64-sve-builtins* framework have (naturally enough) required at least SVE. However, arm_sme.h defines a couple of intrinsics that can be called by any code. It's therefore necessary to make the implicit SVE requirement explicit. gcc/ * config/aarch64/aarch64-sve-builtins.cc (function_groups): Remove implied requirement on SVE. * config/aarch64/aarch64-sve-builtins-base.def: Explicitly require SVE. * config/aarch64/aarch64-sve-builtins-sve2.def: Likewise. 
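The effect of dropping the implicit flag can be sketched as follows.  The helper below is purely illustrative and is not code from the patch (the real change simply stops OR-ing AARCH64_FL_SVE into REQUIRED_EXTENSIONS, as the final hunk shows); it just makes the availability rule explicit: a group is usable exactly when every flag it names is enabled, so an SME-only group can later omit the SVE flag entirely.

#include <stdint.h>

/* uint64_t stands in for the real feature-flag type.  A group's intrinsics
   are registered only if all of its required extension bits are enabled;
   nothing is implied on top of what the group names.  */
static inline bool
group_available_p (uint64_t enabled_flags, uint64_t required_extensions)
{
  return (enabled_flags & required_extensions) == required_extensions;
}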
--- .../aarch64/aarch64-sve-builtins-base.def | 26 ++++++++++++------- .../aarch64/aarch64-sve-builtins-sve2.def | 18 ++++++++----- gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +- 3 files changed, 30 insertions(+), 16 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def index a2d0cea6c5b..d35cdffe20f 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def @@ -17,7 +17,7 @@ along with GCC; see the file COPYING3. If not see . */ -#define REQUIRED_EXTENSIONS 0 +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE DEF_SVE_FUNCTION (svabd, binary_opt_n, all_arith, mxz) DEF_SVE_FUNCTION (svabs, unary, all_float_and_signed, mxz) DEF_SVE_FUNCTION (svacge, compare_opt_n, all_float, implicit) @@ -255,7 +255,7 @@ DEF_SVE_FUNCTION (svzip2, binary, all_data, none) DEF_SVE_FUNCTION (svzip2, binary_pred, all_pred, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_SM_OFF +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_SM_OFF DEF_SVE_FUNCTION (svadda, fold_left, all_float, implicit) DEF_SVE_FUNCTION (svadrb, adr_offset, none, none) DEF_SVE_FUNCTION (svadrd, adr_index, none, none) @@ -321,7 +321,7 @@ DEF_SVE_FUNCTION (svtssel, binary_uint, all_float, none) DEF_SVE_FUNCTION (svwrffr, setffr, none, implicit) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_BF16 +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_BF16 DEF_SVE_FUNCTION (svbfdot, ternary_bfloat_opt_n, s_float, none) DEF_SVE_FUNCTION (svbfdot_lane, ternary_bfloat_lanex2, s_float, none) DEF_SVE_FUNCTION (svbfmlalb, ternary_bfloat_opt_n, s_float, none) @@ -332,27 +332,33 @@ DEF_SVE_FUNCTION (svcvt, unary_convert, cvt_bfloat, mxz) DEF_SVE_FUNCTION (svcvtnt, unary_convert_narrowt, cvt_bfloat, mx) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_BF16 | AARCH64_FL_SM_OFF +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_BF16 \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svbfmmla, ternary_bfloat, s_float, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_I8MM +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_I8MM DEF_SVE_FUNCTION (svsudot, ternary_intq_uintq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svsudot_lane, ternary_intq_uintq_lane, s_signed, none) DEF_SVE_FUNCTION (svusdot, ternary_uintq_intq_opt_n, s_signed, none) DEF_SVE_FUNCTION (svusdot_lane, ternary_uintq_intq_lane, s_signed, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_I8MM | AARCH64_FL_SM_OFF +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_I8MM \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svmmla, mmla, s_integer, none) DEF_SVE_FUNCTION (svusmmla, ternary_uintq_intq, s_signed, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_F32MM | AARCH64_FL_SM_OFF +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_F32MM \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svmmla, mmla, s_float, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_F64MM +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_F64MM DEF_SVE_FUNCTION (svtrn1q, binary, all_data, none) DEF_SVE_FUNCTION (svtrn2q, binary, all_data, none) DEF_SVE_FUNCTION (svuzp1q, binary, all_data, none) @@ -361,7 +367,9 @@ DEF_SVE_FUNCTION (svzip1q, binary, all_data, none) DEF_SVE_FUNCTION (svzip2q, binary, all_data, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_F64MM | AARCH64_FL_SM_OFF +#define 
REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_F64MM \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svld1ro, load_replicate, all_data, implicit) DEF_SVE_FUNCTION (svmmla, mmla, d_float, none) #undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index 4e0466b4cf8..3c0a0e072f2 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -17,7 +17,7 @@ along with GCC; see the file COPYING3. If not see . */ -#define REQUIRED_EXTENSIONS AARCH64_FL_SVE2 +#define REQUIRED_EXTENSIONS AARCH64_FL_SVE | AARCH64_FL_SVE2 DEF_SVE_FUNCTION (svaba, ternary_opt_n, all_integer, none) DEF_SVE_FUNCTION (svabalb, ternary_long_opt_n, hsd_integer, none) DEF_SVE_FUNCTION (svabalt, ternary_long_opt_n, hsd_integer, none) @@ -166,7 +166,9 @@ DEF_SVE_FUNCTION (svwhilewr, compare_ptr, all_data, none) DEF_SVE_FUNCTION (svxar, ternary_shift_right_imm, all_integer, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS AARCH64_FL_SVE2 | AARCH64_FL_SM_OFF +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ + | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svhistcnt, binary_to_uint, sd_integer, z) DEF_SVE_FUNCTION (svhistseg, binary_to_uint, b_integer, none) DEF_SVE_FUNCTION (svldnt1_gather, load_gather_sv_restricted, sd_data, implicit) @@ -192,7 +194,8 @@ DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_index_restricted, d_integer, i DEF_SVE_FUNCTION (svstnt1w_scatter, store_scatter_offset_restricted, d_integer, implicit) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ | AARCH64_FL_SVE2_AES \ | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svaesd, binary, b_unsigned, none) @@ -203,7 +206,8 @@ DEF_SVE_FUNCTION (svpmullb_pair, binary_opt_n, d_unsigned, none) DEF_SVE_FUNCTION (svpmullt_pair, binary_opt_n, d_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ | AARCH64_FL_SVE2_BITPERM \ | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svbdep, binary_opt_n, all_unsigned, none) @@ -211,13 +215,15 @@ DEF_SVE_FUNCTION (svbext, binary_opt_n, all_unsigned, none) DEF_SVE_FUNCTION (svbgrp, binary_opt_n, all_unsigned, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ | AARCH64_FL_SVE2_SHA3 \ | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svrax1, binary, d_integer, none) #undef REQUIRED_EXTENSIONS -#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE2 \ +#define REQUIRED_EXTENSIONS (AARCH64_FL_SVE \ + | AARCH64_FL_SVE2 \ | AARCH64_FL_SVE2_SM4 \ | AARCH64_FL_SM_OFF) DEF_SVE_FUNCTION (svsm4e, binary, s_unsigned, none) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index a6de1068da9..cb3eb76dd77 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -525,7 +525,7 @@ static const predication_index preds_z[] = { PRED_z, NUM_PREDS }; static CONSTEXPR const function_group_info function_groups[] = { #define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \ { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, preds_##PREDS, \ - REQUIRED_EXTENSIONS | AARCH64_FL_SVE }, + REQUIRED_EXTENSIONS }, #include "aarch64-sve-builtins.def" }; From patchwork Sun Nov 13 10:02:08 2022 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60515 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8D8AE395383A for ; Sun, 13 Nov 2022 10:04:16 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8D8AE395383A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333856; bh=Nld6WZVqC11SvvzcAWXj07U1oYQAt+6Pi8o5nRo8mv0=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=AYRz5Vx82zF21VjEDSzb4pBJ2vq4UrgpbA1PVlwOcK50JyYXVaUc11CepdTqm5u4Q zEpsBxyprZGiNE8VAd9HtDTmnW61FbLdbsf9/bIkGnZW+KTErSfWsP8OS295K3fwNC MpgCvu1XST7NgIcV134dWvwqZQ5J+q9VWclr6SXg= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id C4E863898C6C for ; Sun, 13 Nov 2022 10:02:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C4E863898C6C Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B6C6923A for ; Sun, 13 Nov 2022 02:02:16 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 276413F73D for ; Sun, 13 Nov 2022 02:02:10 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 10/16] aarch64: Generalise unspec_based_function_base References: Date: Sun, 13 Nov 2022 10:02:08 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Until now, SVE intrinsics that map directly to unspecs have always used type suffix 0 to distinguish between signed integers, unsigned integers, and floating-point values. SME adds functions that need to use type suffix 1 instead. This patch generalises the classes accordingly. gcc/ * config/aarch64/aarch64-sve-builtins-functions.h (unspec_based_function_base): Allow type suffix 1 to determine the mode of the operation. (unspec_based_fused_function): Update accordingly. (unspec_based_fused_lane_function): Likewise. 
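The change boils down to parameterising which type suffix drives the signed/unsigned/floating-point choice.  A stand-alone sketch of that selection follows; the struct and field names are hypothetical stand-ins rather than the real builtin classes, and only the selection logic mirrors the patch:

/* Illustrative only: suffix index 0 matches the old behaviour, while SME
   intrinsics whose operation type comes from the second suffix use 1.  */
struct example_suffix_info { bool integer_p, unsigned_p; };

struct example_function_base
{
  int unspec_for_sint, unspec_for_uint, unspec_for_fp;
  unsigned int m_suffix_index;

  int unspec_for (const example_suffix_info *suffixes) const
  {
    const example_suffix_info &suffix = suffixes[m_suffix_index];
    return (!suffix.integer_p ? unspec_for_fp
	    : suffix.unsigned_p ? unspec_for_uint
	    : unspec_for_sint);
  }
};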
--- .../aarch64/aarch64-sve-builtins-functions.h | 29 ++++++++++++------- 1 file changed, 18 insertions(+), 11 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h index 472e26c17ff..2fd135aab07 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h @@ -229,18 +229,21 @@ class unspec_based_function_base : public function_base public: CONSTEXPR unspec_based_function_base (int unspec_for_sint, int unspec_for_uint, - int unspec_for_fp) + int unspec_for_fp, + unsigned int suffix_index = 0) : m_unspec_for_sint (unspec_for_sint), m_unspec_for_uint (unspec_for_uint), - m_unspec_for_fp (unspec_for_fp) + m_unspec_for_fp (unspec_for_fp), + m_suffix_index (suffix_index) {} /* Return the unspec code to use for INSTANCE, based on type suffix 0. */ int unspec_for (const function_instance &instance) const { - return (!instance.type_suffix (0).integer_p ? m_unspec_for_fp - : instance.type_suffix (0).unsigned_p ? m_unspec_for_uint + auto &suffix = instance.type_suffix (m_suffix_index); + return (!suffix.integer_p ? m_unspec_for_fp + : suffix.unsigned_p ? m_unspec_for_uint : m_unspec_for_sint); } @@ -249,6 +252,9 @@ public: int m_unspec_for_sint; int m_unspec_for_uint; int m_unspec_for_fp; + + /* Which type suffix is used to choose between the unspecs. */ + unsigned int m_suffix_index; }; /* A function_base for functions that have an associated unspec code. @@ -301,7 +307,8 @@ public: rtx expand (function_expander &e) const override { - return e.use_exact_insn (CODE (unspec_for (e), e.vector_mode (0))); + return e.use_exact_insn (CODE (unspec_for (e), + e.vector_mode (m_suffix_index))); } }; @@ -355,16 +362,16 @@ public: { int unspec = unspec_for (e); insn_code icode; - if (e.type_suffix (0).float_p) + if (e.type_suffix (m_suffix_index).float_p) { /* Put the operands in the normal (fma ...) order, with the accumulator last. This fits naturally since that's also the unprinted operand in the asm output. */ e.rotate_inputs_left (0, e.pred != PRED_none ? 4 : 3); - icode = code_for_aarch64_sve (unspec, e.vector_mode (0)); + icode = code_for_aarch64_sve (unspec, e.vector_mode (m_suffix_index)); } else - icode = INT_CODE (unspec, e.vector_mode (0)); + icode = INT_CODE (unspec, e.vector_mode (m_suffix_index)); return e.use_exact_insn (icode); } }; @@ -385,16 +392,16 @@ public: { int unspec = unspec_for (e); insn_code icode; - if (e.type_suffix (0).float_p) + if (e.type_suffix (m_suffix_index).float_p) { /* Put the operands in the normal (fma ...) order, with the accumulator last. This fits naturally since that's also the unprinted operand in the asm output. */ e.rotate_inputs_left (0, e.pred != PRED_none ? 
5 : 4); - icode = code_for_aarch64_lane (unspec, e.vector_mode (0)); + icode = code_for_aarch64_lane (unspec, e.vector_mode (m_suffix_index)); } else - icode = INT_CODE (unspec, e.vector_mode (0)); + icode = INT_CODE (unspec, e.vector_mode (m_suffix_index)); return e.use_exact_insn (icode); } }; From patchwork Sun Nov 13 10:02:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60518 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 744863887F7A for ; Sun, 13 Nov 2022 10:05:14 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 744863887F7A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333914; bh=mf9etWSat6mzorzKo80DLxzwyBnLR2wRRxPE7MHG/gI=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=Khr3u/3uX+fuQ1YlVhf6GXQJhd8ecipzATZHHK1YVRcNPNQ23ZoNqROzFd+1kUOST g1Raa+pBHGHBBPw2RQlrVYKJTrBhGNGNAdHev/mZXDEKrDbdK/Wsn48zY86TqorYpS FqvyXRTbogpV8kmeYMcrfo0O2MLjtbj/Y9Rhn570= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 67EF5389EC62 for ; Sun, 13 Nov 2022 10:02:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 67EF5389EC62 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8389623A for ; Sun, 13 Nov 2022 02:02:30 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E84833F73D for ; Sun, 13 Nov 2022 02:02:23 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 11/16] aarch64: Generalise _m rules for SVE intrinsics References: Date: Sun, 13 Nov 2022 10:02:22 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" In SVE there was a simple rule that unary merging (_m) intrinsics had a separate initial argument to specify the values of inactive lanes, whereas other merging functions took inactive lanes from the first operand to the operation. That rule began to break down in SVE2, and it continues to do so in SME. This patch therefore adds a virtual function to specify whether the separate initial argument is present or not. The old rule is still the default. 
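For concreteness, the two conventions look like this at the ACLE level.  This example only uses existing arm_sve.h intrinsics and is not part of the patch; it simply shows the difference that has_merge_argument_p now describes:

#include <arm_sve.h>

svint32_t
merge_conventions (svbool_t pg, svint32_t inactive, svint32_t a, svint32_t b)
{
  /* Unary _m: the first argument is a separate vector that supplies the
     values of inactive lanes.  */
  svint32_t x = svabs_s32_m (inactive, pg, a);
  /* Non-unary _m: inactive lanes are taken from the first operand (A);
     there is no separate merge argument.  */
  svint32_t y = svadd_s32_m (pg, a, b);
  return svorr_s32_m (pg, x, y);
}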
gcc/ * config/aarch64/aarch64-sve-builtins.h (function_shape::has_merge_argument_p): New member function. * config/aarch64/aarch64-sve-builtins.cc: (function_resolver::check_gp_argument): Use it. (function_expander::get_fallback_value): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (apply_predication): Likewise. (unary_convert_narrowt_def::has_merge_argument_p): New function. --- gcc/config/aarch64/aarch64-sve-builtins-shapes.cc | 10 ++++++++-- gcc/config/aarch64/aarch64-sve-builtins.cc | 4 ++-- gcc/config/aarch64/aarch64-sve-builtins.h | 13 +++++++++++++ 3 files changed, 23 insertions(+), 4 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 8e26bd8a60f..5b47dff0b41 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -66,8 +66,8 @@ apply_predication (const function_instance &instance, tree return_type, the same type as the result. For unary_convert_narrowt it also provides the "bottom" half of active elements, and is present for all types of predication. */ - if ((argument_types.length () == 2 && instance.pred == PRED_m) - || instance.shape == shapes::unary_convert_narrowt) + auto nargs = argument_types.length () - 1; + if (instance.shape->has_merge_argument_p (instance, nargs)) argument_types.quick_insert (0, return_type); } } @@ -3238,6 +3238,12 @@ SHAPE (unary_convert) predicate. */ struct unary_convert_narrowt_def : public overloaded_base<1> { + bool + has_merge_argument_p (const function_instance &, unsigned int) const override + { + return true; + } + void build (function_builder &b, const function_group_info &group) const override { diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index cb3eb76dd77..450a8d958a8 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -2152,7 +2152,7 @@ function_resolver::check_gp_argument (unsigned int nops, if (pred != PRED_none) { /* Unary merge operations should use resolve_unary instead. */ - gcc_assert (nops != 1 || pred != PRED_m); + gcc_assert (!shape->has_merge_argument_p (*this, nops)); nargs = nops + 1; if (!check_num_arguments (nargs) || !require_vector_type (i, VECTOR_TYPE_svbool_t)) @@ -2790,7 +2790,7 @@ function_expander::get_fallback_value (machine_mode mode, unsigned int nops, gcc_assert (pred == PRED_m || pred == PRED_x); if (merge_argno == DEFAULT_MERGE_ARGNO) - merge_argno = nops == 1 && pred == PRED_m ? 0 : 1; + merge_argno = shape->has_merge_argument_p (*this, nops) ? 0 : 1; if (merge_argno == 0) return args[argno++]; diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 0d130b871d0..623b9e3a07b 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -636,6 +636,9 @@ public: class function_shape { public: + virtual bool has_merge_argument_p (const function_instance &, + unsigned int) const; + virtual bool explicit_type_suffix_p (unsigned int) const = 0; /* Define all functions associated with the given group. */ @@ -877,6 +880,16 @@ function_base::call_properties (const function_instance &instance) const return flags; } +/* Return true if INSTANCE (which has NARGS arguments) has an initial + vector argument whose only purpose is to specify the values of + inactive lanes. 
*/ +inline bool +function_shape::has_merge_argument_p (const function_instance &instance, + unsigned int nargs) const +{ + return nargs == 1 && instance.pred == PRED_m; +} + } #endif From patchwork Sun Nov 13 10:02:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60521 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A8E103953820 for ; Sun, 13 Nov 2022 10:06:49 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A8E103953820 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668334009; bh=sRcecaMDUsmhMrsIMaz0hhS1FaUR4mAWxjee5teQa1c=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=TAGc/k2iaCT+tjxgrwBQuxFUsRBQQAgtQQj3bXIq0tmBJjmq4XTmz151OSltGo1ug sY927/BwvYlH2yJ56sY41s/LgoUT1yQnl7mBLCgB1cg8UzIuLSTQY4db9SBrECZ9qZ 76CmamNXjchbWsHsvXkOr964jXlrHWxztrtjscYI= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id BA99E389EC4B for ; Sun, 13 Nov 2022 10:02:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BA99E389EC4B Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BCC8423A for ; Sun, 13 Nov 2022 02:02:42 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2C7173F73D for ; Sun, 13 Nov 2022 02:02:36 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 12/16] aarch64: Tweaks to function_resolver::resolve_to References: Date: Sun, 13 Nov 2022 10:02:34 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This patch adds a new interface to function_resolver::resolve_to in which the mode suffix stays the same (which is the common case). It then moves the handling of explicit first type suffixes from function_resolver::resolve_unary to this new function. This makes things slightly simpler for existing code. However, the main reason for doing it is that it helps require_derived_vector_type handle explicit type suffixes correctly, which in turn improves the error messages generated by the manual C overloading code in a follow-up SME patch. 
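A rough sketch of what this looks like in a typical shape's resolver is below.  The shape name is made up and the body is illustrative; the pattern simply mirrors the mechanical s/resolve_to (r.mode_suffix_id, type)/resolve_to (type)/ changes in the diff:

tree
example_shape_def::resolve (function_resolver &r) const
{
  unsigned int i, nargs;
  type_suffix_index type;
  if (!r.check_gp_argument (1, i, nargs)
      || (type = r.infer_vector_type (i)) == NUM_TYPE_SUFFIXES)
    return error_mark_node;

  /* Previously: return r.resolve_to (r.mode_suffix_id, type);
     The new overload keeps the mode suffix and also copes with an
     explicit first type suffix, if the function has one.  */
  return r.resolve_to (type);
}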
gcc/ * config/aarch64/aarch64-sve-builtins.h (function_resolver::resolve_to): Add an overload that takes only the type suffixes. * config/aarch64/aarch64-sve-builtins.cc (function_resolver::resolve_to): Likewise. Handle explicit type suffixes here rather than... (function_resolver::resolve_unary): ...here. (function_resolver::require_derived_vector_type): Simplify accordingly. (function_resolver::finish_opt_n_resolution): Likewise. (function_resolver::resolve_uniform): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (binary_imm_narrowt_base::resolve): Likewise. (load_contiguous_base::resolve): Likewise. (mmla_def::resolve): Likewise. (ternary_resize2_base::resolve): Likewise. (ternary_resize2_lane_base::resolve): Likewise. (unary_narrowt_base::resolve): Likewise. (binary_n_def::resolve): Likewise. (binary_uint_def::resolve): Likewise. (binary_uint_n_def::resolve): Likewise. (binary_uint64_n_def::resolve): Likewise. (binary_wide_def::resolve): Likewise. (compare_ptr_def::resolve): Likewise. (compare_scalar_def::resolve): Likewise. (fold_left_def::resolve): Likewise. (get_def::resolve): Likewise. (inc_dec_pred_def::resolve): Likewise. (inc_dec_pred_scalar_def::resolve): Likewise. (set_def::resolve): Likewise. (store_def::resolve): Likewise. (tbl_tuple_def::resolve): Likewise. (ternary_qq_lane_rotate_def::resolve): Likewise. (ternary_qq_rotate_def::resolve): Likewise. (ternary_uint_def::resolve): Likewise. (unary_def::resolve): Likewise. (unary_widen_def::resolve): Likewise. --- .../aarch64/aarch64-sve-builtins-shapes.cc | 48 +++++++++---------- gcc/config/aarch64/aarch64-sve-builtins.cc | 34 +++++++++---- gcc/config/aarch64/aarch64-sve-builtins.h | 1 + 3 files changed, 49 insertions(+), 34 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 5b47dff0b41..df2d5414c07 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -550,7 +550,7 @@ struct binary_imm_narrowt_base : public overloaded_base<0> || !r.require_integer_immediate (i + 2)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; @@ -649,7 +649,7 @@ struct load_contiguous_base : public overloaded_base<0> || (vnum_p && !r.require_scalar_type (i + 1, "int64_t"))) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; @@ -739,7 +739,7 @@ struct mmla_def : public overloaded_base<0> /* Make sure that the function exists now, since not all forms follow a set pattern after this point. 
*/ - tree res = r.resolve_to (r.mode_suffix_id, type); + tree res = r.resolve_to (type); if (res == error_mark_node) return res; @@ -896,7 +896,7 @@ struct ternary_resize2_base : public overloaded_base<0> MODIFIER)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; @@ -921,7 +921,7 @@ struct ternary_resize2_lane_base : public overloaded_base<0> || !r.require_integer_immediate (i + 3)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; @@ -1012,7 +1012,7 @@ struct unary_narrowt_base : public overloaded_base<0> || !r.require_derived_vector_type (i, i + 1, type, CLASS, r.HALF_SIZE)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; @@ -1218,7 +1218,7 @@ struct binary_n_def : public overloaded_base<0> || !r.require_derived_scalar_type (i + 1, r.SAME_TYPE_CLASS)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (binary_n) @@ -1399,7 +1399,7 @@ struct binary_uint_def : public overloaded_base<0> || !r.require_derived_vector_type (i + 1, i, type, TYPE_unsigned)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (binary_uint) @@ -1427,7 +1427,7 @@ struct binary_uint_n_def : public overloaded_base<0> || !r.require_derived_scalar_type (i + 1, TYPE_unsigned)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (binary_uint_n) @@ -1484,7 +1484,7 @@ struct binary_uint64_n_def : public overloaded_base<0> || !r.require_scalar_type (i + 1, "uint64_t")) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (binary_uint64_n) @@ -1539,7 +1539,7 @@ struct binary_wide_def : public overloaded_base<0> r.HALF_SIZE)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (binary_wide) @@ -1671,7 +1671,7 @@ struct compare_ptr_def : public overloaded_base<0> || !r.require_matching_pointer_type (i + 1, i, type)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (compare_ptr) @@ -1700,7 +1700,7 @@ struct compare_scalar_def : public overloaded_base<1> || !r.require_matching_integer_scalar_type (i + 1, i, type)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, r.type_suffix_ids[0], type); + return r.resolve_to (type); } }; SHAPE (compare_scalar) @@ -1877,7 +1877,7 @@ struct fold_left_def : public overloaded_base<0> || (type = r.infer_vector_type (i + 1)) == NUM_TYPE_SUFFIXES) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (fold_left) @@ -1905,7 +1905,7 @@ struct get_def : public overloaded_base<0> || !r.require_integer_immediate (i + 1)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } bool @@ -1987,7 +1987,7 @@ struct inc_dec_pred_def : public overloaded_base<0> || !r.require_vector_type (i + 1, VECTOR_TYPE_svbool_t)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (inc_dec_pred) @@ -2014,7 +2014,7 @@ struct inc_dec_pred_scalar_def : public overloaded_base<2> || !r.require_vector_type (i + 1, VECTOR_TYPE_svbool_t)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type, 
r.type_suffix_ids[1]); + return r.resolve_to (type, r.type_suffix_ids[1]); } }; SHAPE (inc_dec_pred_scalar) @@ -2419,7 +2419,7 @@ struct set_def : public overloaded_base<0> || !r.require_derived_vector_type (i + 2, i, type)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } bool @@ -2594,7 +2594,7 @@ struct store_def : public overloaded_base<0> || ((type = r.infer_tuple_type (nargs - 1)) == NUM_TYPE_SUFFIXES)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (store) @@ -2714,7 +2714,7 @@ struct tbl_tuple_def : public overloaded_base<0> || !r.require_derived_vector_type (i + 1, i, type, TYPE_unsigned)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (tbl_tuple) @@ -2959,7 +2959,7 @@ struct ternary_qq_lane_rotate_def : public overloaded_base<0> || !r.require_integer_immediate (i + 4)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } bool @@ -3018,7 +3018,7 @@ struct ternary_qq_rotate_def : public overloaded_base<0> || !r.require_integer_immediate (i + 3)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } bool @@ -3107,7 +3107,7 @@ struct ternary_uint_def : public overloaded_base<0> || !r.require_derived_vector_type (i + 2, i, type, TYPE_unsigned)) return error_mark_node; - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); } }; SHAPE (ternary_uint) @@ -3437,7 +3437,7 @@ struct unary_widen_def : public overloaded_base<0> /* There is only a single form for predicates. */ if (type == TYPE_SUFFIX_b) - return r.resolve_to (r.mode_suffix_id, type); + return r.resolve_to (type); if (type_suffixes[type].integer_p && type_suffixes[type].element_bits < 64) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 450a8d958a8..e50a58dcc0a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -1255,6 +1255,25 @@ function_resolver::resolve_to (mode_suffix_index mode, return res; } +/* A cut-down interface to the function above that keeps the mode suffix + unchanged. As a convenience, resolve_to (TYPE0) can be used for functions + whose first type suffix is explicit, with TYPE0 then describing the + second type suffix rather than the first. */ +tree +function_resolver::resolve_to (type_suffix_index type0, + type_suffix_index type1) +{ + /* Handle convert-like functions in which the first type suffix is + explicit. */ + if (type_suffix_ids[0] != NUM_TYPE_SUFFIXES && type1 == NUM_TYPE_SUFFIXES) + { + type1 = type0; + type0 = type_suffix_ids[0]; + } + + return resolve_to (mode_suffix_id, type0, type1); +} + /* Require argument ARGNO to be a 32-bit or 64-bit scalar integer type. Return the associated type suffix on success, otherwise report an error and return NUM_TYPE_SUFFIXES. */ @@ -1636,7 +1655,7 @@ require_derived_vector_type (unsigned int argno, /* Make sure that FIRST_TYPE itself is sensible before using it as a basis for an error message. */ - if (resolve_to (mode_suffix_id, first_type) == error_mark_node) + if (resolve_to (first_type) == error_mark_node) return false; /* If the arguments have consistent type classes, but a link between @@ -2202,7 +2221,7 @@ finish_opt_n_resolution (unsigned int argno, unsigned int first_argno, /* Check the vector form normally. 
If that succeeds, raise an error about having no corresponding _n form. */ - tree res = resolve_to (mode_suffix_id, inferred_type); + tree res = resolve_to (inferred_type); if (res != error_mark_node) error_at (location, "passing %qT to argument %d of %qE, but its" " %qT form does not accept scalars", @@ -2222,7 +2241,7 @@ finish_opt_n_resolution (unsigned int argno, unsigned int first_argno, expected_tclass, expected_bits)) return error_mark_node; - return resolve_to (mode_suffix_id, inferred_type); + return resolve_to (inferred_type); } /* Resolve a (possibly predicated) unary function. If the function uses @@ -2279,12 +2298,7 @@ function_resolver::resolve_unary (type_class_index merge_tclass, return error_mark_node; } - /* Handle convert-like functions in which the first type suffix is - explicit. */ - if (type_suffix_ids[0] != NUM_TYPE_SUFFIXES) - return resolve_to (mode_suffix_id, type_suffix_ids[0], type); - - return resolve_to (mode_suffix_id, type); + return resolve_to (type); } /* Resolve a (possibly predicated) function that takes NOPS like-typed @@ -2309,7 +2323,7 @@ function_resolver::resolve_uniform (unsigned int nops, unsigned int nimm) if (!require_integer_immediate (i)) return error_mark_node; - return resolve_to (mode_suffix_id, type); + return resolve_to (type); } /* Resolve a (possibly predicated) function that offers a choice between diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 623b9e3a07b..479b248bef1 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -394,6 +394,7 @@ public: tree resolve_to (mode_suffix_index, type_suffix_index = NUM_TYPE_SUFFIXES, type_suffix_index = NUM_TYPE_SUFFIXES); + tree resolve_to (type_suffix_index, type_suffix_index = NUM_TYPE_SUFFIXES); type_suffix_index infer_integer_scalar_type (unsigned int); type_suffix_index infer_pointer_type (unsigned int, bool = false); From patchwork Sun Nov 13 10:02:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60522 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D43073888C7F for ; Sun, 13 Nov 2022 10:08:17 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D43073888C7F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668334097; bh=Sx0QTC4pyBDCOJlEEK8kvAmWQQbiWAyaI60sBJNpHJs=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=swiGZXaJGRmK/AxIGmpzu13NKgXT0faQW30FVXYCidLU7KDkL3GdS74yLRFKMawZL tKgleOkray2IMdXMiqx2FLx+24VGL9Loqp1oO9jhjlbuhDWsxLOiWVCjSh9uBjTJCI X7xZa/acWQbyvtbYYmW2a1gz/3hg8nVZcu8MkdXQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id E0A9838983AE for ; Sun, 13 Nov 2022 10:03:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E0A9838983AE Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D73DB23A for ; Sun, 13 Nov 2022 02:03:07 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AB73F3F73D 
for ; Sun, 13 Nov 2022 02:03:00 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 13/16] aarch64: Add support for References: Date: Sun, 13 Nov 2022 10:02:59 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This adds support for the SME parts of arm_sme.h. The SME2 parts will follow at a later date. The patch doesn't add the ACLE-defined __ARM_FEATURE macros though. I'm planning to do this later, once we're sure everything has been implemented. The feature names sme-i16i64 and sme-f64f64 are different from the ones that GAS currently expects, but we thought it would be better to make the compiler names predictable from the architecture FEAT_* names. I'm going to add sme-i16i64 and sme-f64f64 aliases to GAS soon. gcc/ * doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst: Document +sme-i16i64 and +sme-f64f64. * config.gcc (aarch64*-*-*): Add arm_sme.h to the list of headers to install and aarch64-sve-builtins-sme.o to the list of objects to build. * config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64): Handle arm_sme.h. * config/aarch64/aarch64-option-extensions.def (sme-i16i64) (sme-f64f64): New extensions. * config/aarch64/aarch64-protos.h (aarch64_sme_vq_immediate) (aarch64_addsvl_addspl_immediate_p, aarch64_output_addsvl_addspl) (aarch64_output_sme_zero): Declare. (aarch64_output_move_struct): Delete. (aarch64_sme_ldr_vnum_offset): Declare. (aarch64_sve::handle_arm_sme_h): Likewise. * config/aarch64/aarch64.h (AARCH64_ISA_SM_ON): New macro. (AARCH64_ISA_SME_I16I64, AARCH64_ISA_SME_F64F64): Likewise. (TARGET_STREAMING, TARGET_STREAMING_SME): Likewise. (TARGET_SME_I16I64, TARGET_SME_F64F64): Likewise. * config/aarch64/aarch64.cc (aarch64_sve_rdvl_factor_p): Rename to... (aarch64_sve_rdvl_addvl_factor_p): ...this. (aarch64_sve_rdvl_immediate_p): Update accordingly. (aarch64_rdsvl_immediate_p, aarch64_add_offset): Likewise. (aarch64_sme_vq_immediate): Likewise. Make public. (aarch64_sve_addpl_factor_p): New function. (aarch64_sve_addvl_addpl_immediate_p): Use aarch64_sve_rdvl_addvl_factor_p and aarch64_sve_addpl_factor_p. (aarch64_addsvl_addspl_immediate_p): New function. (aarch64_output_addsvl_addspl): Likewise. (aarch64_cannot_force_const_mem): Return true for RDSVL immediates. (aarch64_classify_index): Handle .Q scaling for VNx1TImode. (aarch64_classify_address): Likewise for vnum offsets. (aarch64_output_sme_zero): New function. (aarch64_sme_ldr_vnum_offset_p): Likewise. * config/aarch64/predicates.md (aarch64_addsvl_addspl_immediate): New predicate. (aarch64_pluslong_operand): Include it for SME. * config/aarch64/constraints.md (Uci, Uav): New constraints. 
* config/aarch64/iterators.md (VNx1TI_ONLY): New mode iterator. (SME_ZA_I, SME_ZA_SDI, SME_MOP_BHI, SME_MOP_HSDF): Likewise. (UNSPEC_SME_ADDHA, UNSPEC_SME_ADDVA, UNSPEC_SME_FMOPA) (UNSPEC_SME_FMOPS, UNSPEC_SME_LD1_HOR, UNSPEC_SME_LD1_VER) (UNSPEC_SME_READ_HOR, UNSPEC_SME_READ_VER, UNSPEC_SME_SMOPA) (UNSPEC_SME_SMOPS, UNSPEC_SME_ST1_HOR, UNSPEC_SME_ST1_VER) (UNSPEC_SME_SUMOPA, UNSPEC_SME_SUMOPS, UNSPEC_SME_UMOPA) (UNSPEC_SME_UMOPS, UNSPEC_SME_USMOPA, UNSPEC_SME_USMOPS) (UNSPEC_SME_WRITE_HOR, UNSPEC_SME_WRITE_VER): New unspecs. (Vetype, Vesize, VPRED): Handle VNx1TI. (V4xWIDE, V4xWIDE_PRED, V4xwetype): New mode attributes. (SME_FMOP_WIDE, SME_FMOP_WIDE_PRED, sme_fmop_wide_etype, b): Likewise. (SME_LD1, SME_READ, SME_ST1, SME_WRITE, SME_UNARY_SDI, SME_INT_MOP) (SME_FP_MOP): New int iterators. (optab): Handle SME unspecs. (hv): New int attribute. * config/aarch64/aarch64.md (*add3_aarch64): Handle ADDSVL and ADDSPL. * config/aarch64/aarch64-sme.md (UNSPEC_SME_LDR): New unspec. (@aarch64_sme_, *aarch64_sme__plus) (aarch64_sme_ldr0, *aarch64_sme_ldrn): New patterns. (UNSPEC_SME_STR): New unspec. (@aarch64_sme_, *aarch64_sme__plus) (aarch64_sme_str0, *aarch64_sme_strn): New patterns. (@aarch64_sme_): Likewise. (*aarch64_sme__plus): Likewise. (@aarch64_sme_): Likewise. (@aarch64_sme_): Likewise. (*aarch64_sme__plus): Likewise. (@aarch64_sme_): Likewise. (UNSPEC_SME_ZERO): New unspec. (aarch64_sme_zero): New pattern. (@aarch64_sme_): Likewise. (@aarch64_sme_): Likewise. (@aarch64_sme_): Likewise. * config/aarch64/aarch64-sve-builtins.def: Add ZA type suffixes. * config/aarch64/aarch64-sve-builtins.h (CP_READ_ZA): New call property. (CP_WRITE_ZA): Likewise. (PRED_za_m): New predication type. (type_suffix_info): Add vector_p and za_p fields. (function_instance::shared_za_p): New member function. (function_instance::preserves_za_p): Likewise. (function_instance::num_za_tiles): Likewise. (function_expander::get_contiguous_base): Take a base argument number, a vnum argument number, and an argument that indicates whether the vnum parameter is a factor of the SME vector length or the prevailing vector length. (function_expander::add_integer_operand): Take a poly_int64. (sve_switcher::sve_switcher): Take a base set of flags. (sme_switcher): New class. (scalar_types): Add a null entry for NUM_VECTOR_TYPES. * config/aarch64/aarch64-sve-builtins.cc: Include aarch64-sve-builtins-sme.h. (pred_suffixes): Add an entry for PRED_za_m. (type_suffixes): Initialize vector_p and za_p. Handle ZA suffixes. (TYPES_all_za, TYPES_all_za_data, TYPES_s_za_integer) (TYPES_d_za_integer, TYPES_mop_base, TYPES_mop_base_signed) (TYPES_mop_base_unsigned, TYPES_mop_i16i64, TYPES_mop_i16i64_signed) (TYPES_mop_i16i64_unsigned, TYPES_mop_f64f64, TYPES_za): New type suffix macros. (preds_m, preds_za_m): New predication lists. (scalar_types): Add an entry for NUM_VECTOR_TYPES. (find_type_suffix_for_scalar_type): Check positively for vectors rather than negatively for predicates. (check_required_extensions): Handle arm_streaming and arm_shared_za requirements. (function_instance::reads_global_state_p): Return true for functions that read ZA. (function_instance::modifies_global_state_p): Return true for functions that write to ZA. (function_instance::shared_za_p): New function. (function_instance::preserves_za_p): Likewise. (sve_switcher::sve_switcher): Add a base flags argument. (function_builder::get_name): Handle "__arm_" prefixes. (function_builder::get_attributes): Add arm_shared_za and arm_preserved_za attributes where appropriate. 
(function_resolver::check_gp_argument): Assert that the predication isn't ZA _m predication. (function_checker::function_checker): Don't bias the argument number for ZA _m predication. (function_expander::get_contiguous_base): Add arguments that specify the base argument number, the vnum argument number, and an argument that indicates whether the vnum parameter is a factor of the SME vector length or the prevailing vector length. Handle the SME case. (function_expander::add_integer_operand): Take a poly_int64. (init_builtins): Call handle_arm_sme_h for LTO. (handle_arm_sve_h): Skip SME intrinsics. (handle_arm_sme_h): New function. * config/aarch64/aarch64-sve-builtins-functions.h (read_write_za, write_za): New classes. (unspec_based_sme_function, za_arith_function): New using aliases. (quiet_za_arith_function): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.h (binary_za_int_m, binary_za_m, binary_za_uint_m, bool_inherent) (inherent_za, inherent_mask_za, ldr_za, load_za, read_za, store_za) (str_za, unary_za_m, write_za): Declare. * config/aarch64/aarch64-sve-builtins-shapes.cc (apply_predication): Expect za_m functions to have an existing governing predicate. (binary_za_m_base, binary_za_int_m_def, binary_za_m_def) (binary_za_uint_m_def, bool_inherent_def, inherent_za_def) (inherent_mask_za_def, ldr_za_def, load_za_def, read_za_def) (store_za_def, str_za_def, unary_za_m_def, write_za_def): New classes. * config/aarch64/aarch64-sve-builtins-base.cc (svundef_impl::call_properties): New function. Handle ZA suffixes. (svundef_impl::expand): Handle ZA suffixes here too. * config/aarch64/arm_sme.h: New file. * config/aarch64/aarch64-sve-builtins-sme.h: Likewise. * config/aarch64/aarch64-sve-builtins-sme.cc: Likewise. * config/aarch64/aarch64-sve-builtins-sme.def: Likewise. * config/aarch64/t-aarch64 (aarch64-sve-builtins.o): Depend on aarch64-sve-builtins-sme.def and aarch64-sve-builtins-sme.h. (aarch64-sve-builtins-sme.o): New rule. gcc/testsuite/ * lib/target-supports.exp: Add sme and sme-i16i64 features. * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Allow functions to be marked as arm_streaming and/or arm_shared_za. * g++.target/aarch64/sve/acle/general-c++/func_redef_4.c: Mark the function as arm_preserves_za. * g++.target/aarch64/sve/acle/general-c++/func_redef_5.c: Likewise. * g++.target/aarch64/sve/acle/general-c++/func_redef_7.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/func_redef_4.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/func_redef_5.c: Likewise. * g++.target/aarch64/sme/aarch64-sme-acle-asm.exp: New test harness. * gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp: Likewise. * gcc.target/aarch64/sme/acle-asm/addha_za32.c: New test. * gcc.target/aarch64/sme/acle-asm/addha_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/addva_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/addva_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/arm_has_sme_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_ns.c: Likewise. * gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsb_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsb_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsd_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsd_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsh_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsh_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/cntsw_s.c: Likewise. 
* gcc.target/aarch64/sme/acle-asm/cntsw_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ld1_hor_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ldr_za_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/ldr_za_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/mopa_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/mopa_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/mops_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/mops_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_hor_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_hor_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_hor_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_hor_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_hor_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_ver_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_ver_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_ver_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_ver_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/read_ver_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/st1_hor_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/str_vnum_za_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/str_vnum_za_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/str_za_s.c: Likewise. * gcc.target/aarch64/sme/acle-asm/str_za_sc.c: Likewise. * gcc.target/aarch64/sme/acle-asm/sumopa_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/sumopa_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/sumops_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/sumops_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/test_sme_acle.h: Likewise. * gcc.target/aarch64/sme/acle-asm/undef_za.c: Likewise. * gcc.target/aarch64/sme/acle-asm/usmopa_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/usmopa_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/usmops_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/usmops_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_hor_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_hor_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_hor_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_hor_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_hor_za8.c: Likewise. 
* gcc.target/aarch64/sme/acle-asm/write_ver_za128.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_ver_za16.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_ver_za32.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_ver_za64.c: Likewise. * gcc.target/aarch64/sme/acle-asm/write_ver_za8.c: Likewise. * gcc.target/aarch64/sme/acle-asm/zero_mask_za.c: Likewise. * gcc.target/aarch64/sme/acle-asm/zero_za.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c: Likewise. * gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c: Likewise. --- gcc/config.gcc | 4 +- gcc/config/aarch64/aarch64-c.cc | 2 + .../aarch64/aarch64-option-extensions.def | 4 + gcc/config/aarch64/aarch64-protos.h | 8 +- gcc/config/aarch64/aarch64-sme.md | 335 ++++++++++++++++ .../aarch64/aarch64-sve-builtins-base.cc | 13 +- .../aarch64/aarch64-sve-builtins-functions.h | 39 ++ .../aarch64/aarch64-sve-builtins-shapes.cc | 280 ++++++++++++- .../aarch64/aarch64-sve-builtins-shapes.h | 13 + .../aarch64/aarch64-sve-builtins-sme.cc | 351 +++++++++++++++++ .../aarch64/aarch64-sve-builtins-sme.def | 83 ++++ gcc/config/aarch64/aarch64-sve-builtins-sme.h | 56 +++ gcc/config/aarch64/aarch64-sve-builtins.cc | 275 +++++++++++-- gcc/config/aarch64/aarch64-sve-builtins.def | 15 + gcc/config/aarch64/aarch64-sve-builtins.h | 46 ++- gcc/config/aarch64/aarch64.cc | 143 ++++++- gcc/config/aarch64/aarch64.h | 15 + gcc/config/aarch64/aarch64.md | 14 +- gcc/config/aarch64/arm_sme.h | 46 +++ gcc/config/aarch64/constraints.md | 9 + gcc/config/aarch64/iterators.md | 98 +++++ gcc/config/aarch64/predicates.md | 8 +- gcc/config/aarch64/t-aarch64 | 17 +- .../aarch64-options.rst | 6 + .../aarch64/sme/aarch64-sme-acle-asm.exp | 86 ++++ .../sve/acle/general-c++/func_redef_4.c | 2 +- .../sve/acle/general-c++/func_redef_5.c | 2 +- .../sve/acle/general-c++/func_redef_7.c | 2 +- .../aarch64/sme/aarch64-sme-acle-asm.exp | 82 ++++ .../aarch64/sme/acle-asm/addha_za32.c | 48 +++ .../aarch64/sme/acle-asm/addha_za64.c | 50 +++ .../aarch64/sme/acle-asm/addva_za32.c | 48 +++ .../aarch64/sme/acle-asm/addva_za64.c | 50 +++ .../aarch64/sme/acle-asm/arm_has_sme_sc.c | 25 ++ .../sme/acle-asm/arm_in_streaming_mode_ns.c | 11 + .../sme/acle-asm/arm_in_streaming_mode_s.c | 11 + .../sme/acle-asm/arm_in_streaming_mode_sc.c | 26 ++ .../gcc.target/aarch64/sme/acle-asm/cntsb_s.c | 310 +++++++++++++++ .../aarch64/sme/acle-asm/cntsb_sc.c | 12 + .../gcc.target/aarch64/sme/acle-asm/cntsd_s.c | 277 +++++++++++++ .../aarch64/sme/acle-asm/cntsd_sc.c | 13 + .../gcc.target/aarch64/sme/acle-asm/cntsh_s.c | 279 +++++++++++++ .../aarch64/sme/acle-asm/cntsh_sc.c | 13 + .../gcc.target/aarch64/sme/acle-asm/cntsw_s.c | 278 +++++++++++++ .../aarch64/sme/acle-asm/cntsw_sc.c | 13 + .../aarch64/sme/acle-asm/ld1_hor_vnum_za128.c | 46 +++ .../aarch64/sme/acle-asm/ld1_hor_vnum_za16.c | 46 +++ .../aarch64/sme/acle-asm/ld1_hor_vnum_za32.c | 46 +++ .../aarch64/sme/acle-asm/ld1_hor_vnum_za64.c | 46 +++ .../aarch64/sme/acle-asm/ld1_hor_vnum_za8.c | 46 +++ .../aarch64/sme/acle-asm/ld1_hor_za128.c | 63 +++ .../aarch64/sme/acle-asm/ld1_hor_za16.c | 94 +++++ .../aarch64/sme/acle-asm/ld1_hor_za32.c | 93 +++++ .../aarch64/sme/acle-asm/ld1_hor_za64.c | 73 ++++ 
.../aarch64/sme/acle-asm/ld1_hor_za8.c | 63 +++ .../aarch64/sme/acle-asm/ld1_ver_vnum_za128.c | 0 .../aarch64/sme/acle-asm/ld1_ver_vnum_za16.c | 0 .../aarch64/sme/acle-asm/ld1_ver_vnum_za32.c | 0 .../aarch64/sme/acle-asm/ld1_ver_vnum_za64.c | 0 .../aarch64/sme/acle-asm/ld1_ver_vnum_za8.c | 0 .../aarch64/sme/acle-asm/ld1_ver_za128.c | 0 .../aarch64/sme/acle-asm/ld1_ver_za16.c | 0 .../aarch64/sme/acle-asm/ld1_ver_za32.c | 0 .../aarch64/sme/acle-asm/ld1_ver_za64.c | 0 .../aarch64/sme/acle-asm/ld1_ver_za8.c | 0 .../aarch64/sme/acle-asm/ldr_vnum_za_s.c | 121 ++++++ .../aarch64/sme/acle-asm/ldr_vnum_za_sc.c | 166 ++++++++ .../aarch64/sme/acle-asm/ldr_za_s.c | 104 +++++ .../aarch64/sme/acle-asm/ldr_za_sc.c | 51 +++ .../aarch64/sme/acle-asm/mopa_za32.c | 102 +++++ .../aarch64/sme/acle-asm/mopa_za64.c | 70 ++++ .../aarch64/sme/acle-asm/mops_za32.c | 102 +++++ .../aarch64/sme/acle-asm/mops_za64.c | 70 ++++ .../aarch64/sme/acle-asm/read_hor_za128.c | 367 ++++++++++++++++++ .../aarch64/sme/acle-asm/read_hor_za16.c | 171 ++++++++ .../aarch64/sme/acle-asm/read_hor_za32.c | 164 ++++++++ .../aarch64/sme/acle-asm/read_hor_za64.c | 154 ++++++++ .../aarch64/sme/acle-asm/read_hor_za8.c | 97 +++++ .../aarch64/sme/acle-asm/read_ver_za128.c | 367 ++++++++++++++++++ .../aarch64/sme/acle-asm/read_ver_za16.c | 171 ++++++++ .../aarch64/sme/acle-asm/read_ver_za32.c | 164 ++++++++ .../aarch64/sme/acle-asm/read_ver_za64.c | 154 ++++++++ .../aarch64/sme/acle-asm/read_ver_za8.c | 97 +++++ .../aarch64/sme/acle-asm/st1_hor_vnum_za128.c | 46 +++ .../aarch64/sme/acle-asm/st1_hor_vnum_za16.c | 46 +++ .../aarch64/sme/acle-asm/st1_hor_vnum_za32.c | 46 +++ .../aarch64/sme/acle-asm/st1_hor_vnum_za64.c | 46 +++ .../aarch64/sme/acle-asm/st1_hor_vnum_za8.c | 46 +++ .../aarch64/sme/acle-asm/st1_hor_za128.c | 63 +++ .../aarch64/sme/acle-asm/st1_hor_za16.c | 94 +++++ .../aarch64/sme/acle-asm/st1_hor_za32.c | 93 +++++ .../aarch64/sme/acle-asm/st1_hor_za64.c | 73 ++++ .../aarch64/sme/acle-asm/st1_hor_za8.c | 63 +++ .../aarch64/sme/acle-asm/st1_ver_vnum_za128.c | 0 .../aarch64/sme/acle-asm/st1_ver_vnum_za16.c | 0 .../aarch64/sme/acle-asm/st1_ver_vnum_za32.c | 0 .../aarch64/sme/acle-asm/st1_ver_vnum_za64.c | 0 .../aarch64/sme/acle-asm/st1_ver_vnum_za8.c | 0 .../aarch64/sme/acle-asm/st1_ver_za128.c | 0 .../aarch64/sme/acle-asm/st1_ver_za16.c | 0 .../aarch64/sme/acle-asm/st1_ver_za32.c | 0 .../aarch64/sme/acle-asm/st1_ver_za64.c | 0 .../aarch64/sme/acle-asm/st1_ver_za8.c | 0 .../aarch64/sme/acle-asm/str_vnum_za_s.c | 121 ++++++ .../aarch64/sme/acle-asm/str_vnum_za_sc.c | 166 ++++++++ .../aarch64/sme/acle-asm/str_za_s.c | 104 +++++ .../aarch64/sme/acle-asm/str_za_sc.c | 51 +++ .../aarch64/sme/acle-asm/sumopa_za32.c | 30 ++ .../aarch64/sme/acle-asm/sumopa_za64.c | 32 ++ .../aarch64/sme/acle-asm/sumops_za32.c | 30 ++ .../aarch64/sme/acle-asm/sumops_za64.c | 32 ++ .../aarch64/sme/acle-asm/test_sme_acle.h | 62 +++ .../aarch64/sme/acle-asm/undef_za.c | 33 ++ .../aarch64/sme/acle-asm/usmopa_za32.c | 30 ++ .../aarch64/sme/acle-asm/usmopa_za64.c | 32 ++ .../aarch64/sme/acle-asm/usmops_za32.c | 30 ++ .../aarch64/sme/acle-asm/usmops_za64.c | 32 ++ .../aarch64/sme/acle-asm/write_hor_za128.c | 173 +++++++++ .../aarch64/sme/acle-asm/write_hor_za16.c | 113 ++++++ .../aarch64/sme/acle-asm/write_hor_za32.c | 123 ++++++ .../aarch64/sme/acle-asm/write_hor_za64.c | 113 ++++++ .../aarch64/sme/acle-asm/write_hor_za8.c | 73 ++++ .../aarch64/sme/acle-asm/write_ver_za128.c | 173 +++++++++ .../aarch64/sme/acle-asm/write_ver_za16.c | 113 ++++++ 
.../aarch64/sme/acle-asm/write_ver_za32.c | 123 ++++++ .../aarch64/sme/acle-asm/write_ver_za64.c | 113 ++++++ .../aarch64/sme/acle-asm/write_ver_za8.c | 73 ++++ .../aarch64/sme/acle-asm/zero_mask_za.c | 130 +++++++ .../gcc.target/aarch64/sme/acle-asm/zero_za.c | 11 + .../aarch64/sve/acle/asm/test_sve_acle.h | 16 +- .../sve/acle/general-c/binary_za_int_m_1.c | 48 +++ .../sve/acle/general-c/binary_za_m_1.c | 48 +++ .../sve/acle/general-c/binary_za_m_2.c | 11 + .../sve/acle/general-c/binary_za_uint_m_1.c | 48 +++ .../aarch64/sve/acle/general-c/func_redef_4.c | 2 +- .../aarch64/sve/acle/general-c/func_redef_5.c | 2 +- .../aarch64/sve/acle/general-c/read_za_m_1.c | 47 +++ .../aarch64/sve/acle/general-c/unary_za_m_1.c | 47 +++ .../aarch64/sve/acle/general-c/write_za_m_1.c | 47 +++ gcc/testsuite/lib/target-supports.exp | 3 +- 140 files changed, 9806 insertions(+), 71 deletions(-) create mode 100644 gcc/config/aarch64/aarch64-sve-builtins-sme.cc create mode 100644 gcc/config/aarch64/aarch64-sve-builtins-sme.def create mode 100644 gcc/config/aarch64/aarch64-sve-builtins-sme.h create mode 100644 gcc/config/aarch64/arm_sme.h create mode 100644 gcc/testsuite/g++.target/aarch64/sme/aarch64-sme-acle-asm.exp create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_has_sme_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_ns.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za128.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za128.c create mode 100644 
gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_s.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_sc.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/test_sme_acle.h create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/undef_za.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za128.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za16.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za32.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za64.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_za.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c diff --git a/gcc/config.gcc b/gcc/config.gcc index b5eda046033..79673619bd4 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -323,11 +323,11 @@ m32c*-*-*) ;; aarch64*-*-*) cpu_type=aarch64 - extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h" + extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h arm_sme.h" c_target_objs="aarch64-c.o" cxx_target_objs="aarch64-c.o" d_target_objs="aarch64-d.o" - extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o 
aarch64-sve-builtins-sve2.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch64-bti-insert.o aarch64-cc-fusion.o" + extra_objs="aarch64-builtins.o aarch-common.o aarch64-sve-builtins.o aarch64-sve-builtins-shapes.o aarch64-sve-builtins-base.o aarch64-sve-builtins-sve2.o aarch64-sve-builtins-sme.o cortex-a57-fma-steering.o aarch64-speculation.o falkor-tag-collision-avoidance.o aarch64-bti-insert.o aarch64-cc-fusion.o" target_gtfiles="\$(srcdir)/config/aarch64/aarch64-builtins.cc \$(srcdir)/config/aarch64/aarch64-sve-builtins.h \$(srcdir)/config/aarch64/aarch64-sve-builtins.cc" target_has_targetm_common=yes ;; diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc index e296c73350f..db2705ac6d2 100644 --- a/gcc/config/aarch64/aarch64-c.cc +++ b/gcc/config/aarch64/aarch64-c.cc @@ -288,6 +288,8 @@ aarch64_pragma_aarch64 (cpp_reader *) const char *name = TREE_STRING_POINTER (x); if (strcmp (name, "arm_sve.h") == 0) aarch64_sve::handle_arm_sve_h (); + else if (strcmp (name, "arm_sme.h") == 0) + aarch64_sve::handle_arm_sme_h (); else if (strcmp (name, "arm_neon.h") == 0) handle_arm_neon_h (); else if (strcmp (name, "arm_acle.h") == 0) diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def index 402a9832f87..cf55742ae60 100644 --- a/gcc/config/aarch64/aarch64-option-extensions.def +++ b/gcc/config/aarch64/aarch64-option-extensions.def @@ -131,6 +131,10 @@ AARCH64_OPT_EXTENSION("sve2-bitperm", SVE2_BITPERM, (SVE2), (), (), AARCH64_OPT_EXTENSION("sme", SME, (SVE2), (), (), "sme") +AARCH64_OPT_EXTENSION("sme-i16i64", SME_I16I64, (SME), (), (), "") + +AARCH64_OPT_EXTENSION("sme-f64f64", SME_F64F64, (SME), (), (), "") + AARCH64_OPT_EXTENSION("tme", TME, (), (), (), "") AARCH64_OPT_EXTENSION("i8mm", I8MM, (SIMD), (), (), "i8mm") diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 97a84f616a2..700d5fb1c77 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -808,7 +808,11 @@ bool aarch64_sve_vector_inc_dec_immediate_p (rtx); int aarch64_add_offset_temporaries (rtx); void aarch64_split_add_offset (scalar_int_mode, rtx, rtx, rtx, rtx, rtx); bool aarch64_rdsvl_immediate_p (const_rtx); +rtx aarch64_sme_vq_immediate (machine_mode mode, HOST_WIDE_INT, + aarch64_feature_flags); char *aarch64_output_rdsvl (const_rtx); +bool aarch64_addsvl_addspl_immediate_p (const_rtx); +char *aarch64_output_addsvl_addspl (rtx); bool aarch64_mov_operand_p (rtx, machine_mode); rtx aarch64_reverse_mask (machine_mode, unsigned int); bool aarch64_offset_7bit_signed_scaled_p (machine_mode, poly_int64); @@ -851,6 +855,7 @@ bool aarch64_uimm12_shift (HOST_WIDE_INT); int aarch64_movk_shift (const wide_int_ref &, const wide_int_ref &); bool aarch64_use_return_insn_p (void); const char *aarch64_output_casesi (rtx *); +const char *aarch64_output_sme_zero (rtx); arm_pcs aarch64_tlsdesc_abi_id (); enum aarch64_symbol_type aarch64_classify_symbol (rtx, HOST_WIDE_INT); @@ -865,7 +870,6 @@ int aarch64_uxt_size (int, HOST_WIDE_INT); int aarch64_vec_fpconst_pow_of_2 (rtx); rtx aarch64_eh_return_handler_rtx (void); rtx aarch64_mask_from_zextract_ops (rtx, rtx); -const char *aarch64_output_move_struct (rtx *operands); rtx aarch64_return_addr_rtx (void); rtx aarch64_return_addr (int, rtx); rtx aarch64_simd_gen_const_vector_dup (machine_mode, HOST_WIDE_INT); @@ -879,6 +883,7 @@ bool aarch64_sve_ldnf1_operand_p (rtx); bool aarch64_sve_ldr_operand_p 
(rtx); bool aarch64_sve_prefetch_operand_p (rtx, machine_mode); bool aarch64_sve_struct_memory_operand_p (rtx); +bool aarch64_sme_ldr_vnum_offset_p (rtx, rtx); rtx aarch64_simd_vect_par_cnst_half (machine_mode, int, bool); rtx aarch64_gen_stepped_int_parallel (unsigned int, int, int); bool aarch64_stepped_int_parallel_p (rtx, int); @@ -996,6 +1001,7 @@ void handle_arm_neon_h (void); namespace aarch64_sve { void init_builtins (); void handle_arm_sve_h (); + void handle_arm_sme_h (); tree builtin_decl (unsigned, bool); bool builtin_type_p (const_tree); bool builtin_type_p (const_tree, unsigned int *, unsigned int *); diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md index 55fb00db12d..7b3ccea2e11 100644 --- a/gcc/config/aarch64/aarch64-sme.md +++ b/gcc/config/aarch64/aarch64-sme.md @@ -24,6 +24,18 @@ ;; ---- Test current state ;; ---- PSTATE.SM management ;; ---- PSTATE.ZA management +;; +;; == Loads, stores and moves +;; ---- Single-vector loads +;; ---- Single-vector stores +;; ---- Single-vector moves +;; ---- Zeroing +;; +;; == Unary operations +;; ---- Single vector input +;; +;; == Binary operations +;; ---- Sum of outer products ;; ========================================================================= ;; == State management @@ -269,3 +281,326 @@ (define_insn_and_split "aarch64_restore_za" DONE; } ) +;; ========================================================================= +;; == Loads, stores and moves +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Single-vector loads +;; ------------------------------------------------------------------------- +;; Includes: +;; - LD1 +;; - LDR +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SME_LDR +]) + +(define_insn "@aarch64_sme_" + [(set (reg:SME_ZA_I ZA_REGNUM) + (unspec:SME_ZA_I + [(reg:SME_ZA_I ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand:SI 1 "register_operand" "Uci") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_ZA_I 3 "aarch64_sve_ldff1_operand" "Utf")] + SME_LD1))] + "TARGET_STREAMING_SME" + "ld1\t{ za%0.[%w1, 0] }, %2/z, %3" +) + +(define_insn "*aarch64_sme__plus" + [(set (reg:SME_ZA_I ZA_REGNUM) + (unspec:SME_ZA_I + [(reg:SME_ZA_I ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (plus:SI (match_operand:SI 1 "register_operand" "Uci") + (match_operand:SI 2 "const_int_operand")) + (match_operand: 3 "register_operand" "Upl") + (match_operand:SME_ZA_I 4 "aarch64_sve_ldff1_operand" "Utf")] + SME_LD1))] + "TARGET_STREAMING_SME + && IN_RANGE (UINTVAL (operands[2]), 0, + 15 / GET_MODE_UNIT_SIZE (mode))" + "ld1\t{ za%0.[%w1, %2] }, %3/z, %4" +) + +(define_insn "aarch64_sme_ldr0" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI + [(reg:VNx16QI ZA_REGNUM) + (match_operand:SI 0 "register_operand" "Uci") + (match_operand:VNx16QI 1 "aarch64_sync_memory_operand" "Q")] + UNSPEC_SME_LDR))] + "TARGET_SME" + "ldr\tza[%w0, 0], %1" +) + +(define_insn "*aarch64_sme_ldrn" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI + [(reg:VNx16QI ZA_REGNUM) + (plus:SI (match_operand:SI 0 "register_operand" "Uci") + (match_operand:SI 1 "const_int_operand")) + (mem:VNx16QI + (plus:P (match_operand:P 2 "register_operand" "rk") + (match_operand 3)))] + UNSPEC_SME_LDR))] + "TARGET_SME + && aarch64_sme_ldr_vnum_offset_p (operands[1], operands[3])" + "ldr\tza[%w0, %1], [%2, #%1, mul vl]" +) + +;; 
------------------------------------------------------------------------- +;; ---- Single-vector stores +;; ------------------------------------------------------------------------- +;; Includes: +;; - ST1 +;; - STR +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SME_STR +]) + +(define_insn "@aarch64_sme_" + [(set (match_operand:SME_ZA_I 0 "aarch64_sve_ldff1_operand" "+Utf") + (unspec:SME_ZA_I + [(match_dup 0) + (match_operand:DI 1 "const_int_operand") + (match_operand:SI 2 "register_operand" "Uci") + (match_operand: 3 "register_operand" "Upl") + (reg:SME_ZA_I ZA_REGNUM)] + SME_ST1))] + "TARGET_STREAMING_SME" + "st1\t{ za%1.[%w2, 0] }, %3/z, %0" +) + +(define_insn "*aarch64_sme__plus" + [(set (match_operand:SME_ZA_I 0 "aarch64_sve_ldff1_operand" "+Utf") + (unspec:SME_ZA_I + [(match_dup 0) + (match_operand:DI 1 "const_int_operand") + (plus:SI (match_operand:SI 2 "register_operand" "Uci") + (match_operand:SI 3 "const_int_operand")) + (match_operand: 4 "register_operand" "Upl") + (reg:SME_ZA_I ZA_REGNUM)] + SME_ST1))] + "TARGET_STREAMING_SME + && IN_RANGE (UINTVAL (operands[3]), 0, + 15 / GET_MODE_UNIT_SIZE (mode))" + "st1\t{ za%1.[%w2, %3] }, %4/z, %0" +) + +(define_insn "aarch64_sme_str0" + [(set (match_operand:VNx16QI 0 "aarch64_sync_memory_operand" "+Q") + (unspec:VNx16QI + [(match_dup 0) + (match_operand:SI 1 "register_operand" "Uci") + (reg:VNx16QI ZA_REGNUM)] + UNSPEC_SME_STR))] + "TARGET_SME" + "str\tza[%w1, 0], %0" +) + +(define_insn "*aarch64_sme_strn" + [(set (mem:VNx16QI + (plus:P (match_operand:P 2 "register_operand" "rk") + (match_operand 3))) + (unspec:VNx16QI + [(mem:VNx16QI (plus:P (match_dup 2) (match_dup 3))) + (plus:SI (match_operand:SI 0 "register_operand" "Uci") + (match_operand:SI 1 "const_int_operand")) + (reg:VNx16QI ZA_REGNUM)] + UNSPEC_SME_STR))] + "TARGET_SME + && aarch64_sme_ldr_vnum_offset_p (operands[1], operands[3])" + "str\tza[%w0, %1], [%2, #%1, mul vl]" +) + +;; ------------------------------------------------------------------------- +;; ---- Single-vector moves +;; ------------------------------------------------------------------------- +;; Includes: +;; - MOVA +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(match_operand:SVE_FULL 1 "register_operand" "0") + (match_operand: 2 "register_operand" "Upl") + (match_operand:DI 3 "const_int_operand") + (match_operand:SI 4 "register_operand" "Uci") + (reg: ZA_REGNUM)] + SME_READ))] + "TARGET_STREAMING_SME" + "mova\t%0., %2/m, za%3.[%w4, 0]" +) + +(define_insn "*aarch64_sme__plus" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(match_operand:SVE_FULL 1 "register_operand" "0") + (match_operand: 2 "register_operand" "Upl") + (match_operand:DI 3 "const_int_operand") + (plus:SI (match_operand:SI 4 "register_operand" "Uci") + (match_operand:SI 5 "const_int_operand")) + (reg: ZA_REGNUM)] + SME_READ))] + "TARGET_STREAMING_SME + && IN_RANGE (UINTVAL (operands[5]), 0, + 15 / GET_MODE_UNIT_SIZE (mode))" + "mova\t%0., %2/m, za%3.[%w4, %5]" +) + +(define_insn "@aarch64_sme_" + [(set (match_operand:SVE_FULL 0 "register_operand" "=w") + (unspec:SVE_FULL + [(match_operand:SVE_FULL 1 "register_operand" "0") + (match_operand:VNx2BI 2 "register_operand" "Upl") + (match_operand:DI 3 "const_int_operand") + (match_operand:SI 4 "register_operand" "Uci") + (reg:VNx1TI_ONLY ZA_REGNUM)] + 
SME_READ))] + "TARGET_STREAMING_SME" + "mova\t%0.q, %2/m, za%3.q[%w4, 0]" +) + +(define_insn "@aarch64_sme_" + [(set (reg: ZA_REGNUM) + (unspec: + [(reg:SVE_FULL ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand:SI 1 "register_operand" "Uci") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SVE_FULL 3 "register_operand" "w")] + SME_WRITE))] + "TARGET_STREAMING_SME" + "mova\tza%0.[%w1, 0], %2/m, %3." +) + +(define_insn "*aarch64_sme__plus" + [(set (reg: ZA_REGNUM) + (unspec: + [(reg:SVE_FULL ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (plus:SI (match_operand:SI 1 "register_operand" "Uci") + (match_operand:SI 2 "const_int_operand")) + (match_operand: 3 "register_operand" "Upl") + (match_operand:SVE_FULL 4 "register_operand" "w")] + SME_WRITE))] + "TARGET_STREAMING_SME + && IN_RANGE (UINTVAL (operands[2]), 0, + 15 / GET_MODE_UNIT_SIZE (mode))" + "mova\tza%0.[%w1, %2], %3/m, %4." +) + +(define_insn "@aarch64_sme_" + [(set (reg:VNx1TI_ONLY ZA_REGNUM) + (unspec:VNx1TI_ONLY + [(reg:VNx1TI_ONLY ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand:SI 1 "register_operand" "Uci") + (match_operand:VNx2BI 2 "register_operand" "Upl") + (match_operand:SVE_FULL 3 "register_operand" "w")] + SME_WRITE))] + "TARGET_STREAMING_SME" + "mova\tza%0.q[%w1, 0], %2/m, %3.q" +) + +;; ------------------------------------------------------------------------- +;; ---- Zeroing +;; ------------------------------------------------------------------------- +;; Includes +;; - ZERO +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [UNSPEC_SME_ZERO]) + +(define_insn "aarch64_sme_zero" + [(set (reg:VNx16QI ZA_REGNUM) + (unspec:VNx16QI [(reg:VNx16QI ZA_REGNUM) + (match_operand:DI 0 "const_int_operand")] + UNSPEC_SME_ZERO))] + "TARGET_SME" + { + return aarch64_output_sme_zero (operands[0]); + } +) + +;; ========================================================================= +;; == Unary operations +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Single vector input +;; ------------------------------------------------------------------------- +;; Includes +;; - ADDHA +;; - ADDVA +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (reg:SME_ZA_SDI ZA_REGNUM) + (unspec:SME_ZA_SDI + [(reg:SME_ZA_SDI ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_ZA_SDI 3 "register_operand" "w")] + SME_UNARY_SDI))] + "TARGET_STREAMING_SME" + "\tza%0., %1/m, %2/m, %3." 
+) + +;; ========================================================================= +;; == Binary operations +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Sum of outer products +;; ------------------------------------------------------------------------- +;; Includes +;; - BFMOPA +;; - BFMOPS +;; - FMOPA +;; - FMOPS +;; - SMOPA +;; - SMOPS +;; - SUMOPA +;; - SUMOPS +;; - UMOPA +;; - UMOPS +;; - USMOPA +;; - USMOPS +;; ------------------------------------------------------------------------- + +(define_insn "@aarch64_sme_" + [(set (reg: ZA_REGNUM) + (unspec: + [(reg: ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_MOP_BHI 3 "register_operand" "w") + (match_operand:SME_MOP_BHI 4 "register_operand" "w")] + SME_INT_MOP))] + "TARGET_STREAMING_SME" + "\tza%0., %1/m, %2/m, %3., %4." +) + +(define_insn "@aarch64_sme_" + [(set (reg: ZA_REGNUM) + (unspec: + [(reg: ZA_REGNUM) + (match_operand:DI 0 "const_int_operand") + (match_operand: 1 "register_operand" "Upl") + (match_operand: 2 "register_operand" "Upl") + (match_operand:SME_MOP_HSDF 3 "register_operand" "w") + (match_operand:SME_MOP_HSDF 4 "register_operand" "w")] + SME_FP_MOP))] + "TARGET_STREAMING_SME" + "\tza%0., %1/m, %2/m, %3., %4." +) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 6347407555f..f4765e6e541 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -2332,10 +2332,21 @@ class svundef_impl : public quiet public: using quiet::quiet; + unsigned int + call_properties (const function_instance &fi) const override + { + auto base = quiet::call_properties (fi); + if (fi.type_suffix (0).za_p) + base |= CP_WRITE_ZA; + return base; + } + rtx expand (function_expander &e) const override { - rtx target = e.get_reg_target (); + rtx target = (e.type_suffix (0).za_p + ? gen_rtx_REG (VNx16QImode, ZA_REGNUM) + : e.get_reg_target ()); emit_clobber (copy_rtx (target)); return target; } diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h index 2fd135aab07..70cfb6a7c23 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h @@ -39,6 +39,36 @@ public: } }; +/* Wrap T, which is derived from function_base, and indicate that the + function reads from and writes to ZA. */ +template +class read_write_za : public T +{ +public: + using T::T; + + unsigned int + call_properties (const function_instance &fi) const override + { + return T::call_properties (fi) | CP_READ_ZA | CP_WRITE_ZA; + } +}; + +/* Wrap T, which is derived from function_base, and indicate that the + function writes to ZA (but does not read from it). */ +template +class write_za : public T +{ +public: + using T::T; + + unsigned int + call_properties (const function_instance &fi) const override + { + return T::call_properties (fi) | CP_WRITE_ZA; + } +}; + /* A function_base that sometimes or always operates on tuples of vectors. */ class multi_vector_function : public function_base @@ -348,6 +378,15 @@ typedef unspec_based_function_exact_insn typedef unspec_based_function_exact_insn unspec_based_sub_lane_function; +/* General SME unspec-based functions. 
*/ +typedef unspec_based_function_exact_insn + unspec_based_sme_function; + +/* SME functions that read from and write to ZA. */ +typedef read_write_za za_arith_function; +typedef read_write_za> + quiet_za_arith_function; + /* A function that acts like unspec_based_function_exact_insn when operating on integers, but that expands to an (fma ...)-style aarch64_sve* operation when applied to floats. */ diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index df2d5414c07..69c5304a8ba 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -59,7 +59,10 @@ static void apply_predication (const function_instance &instance, tree return_type, vec &argument_types) { - if (instance.pred != PRED_none) + /* There are currently no SME ZA instructions that have both merging and + unpredicated forms, so for simplicity, the predicates are always included + in the original format string. */ + if (instance.pred != PRED_none && instance.pred != PRED_za_m) { argument_types.quick_insert (0, get_svbool_t ()); /* For unary merge operations, the first argument is a vector with @@ -584,6 +587,32 @@ struct binary_imm_long_base : public overloaded_base<0> } }; +template +struct binary_za_m_base : public overloaded_base<1> +{ + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (5) + || !r.require_integer_immediate (0) + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_vector_type (2, VECTOR_TYPE_svbool_t) + || (type = r.infer_vector_type (3)) == NUM_TYPE_SUFFIXES + || !r.require_derived_vector_type (4, 3, type, TCLASS, BITS)) + return error_mark_node; + + return r.resolve_to (type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; + /* Base class for inc_dec and inc_dec_pat. */ struct inc_dec_base : public overloaded_base<0> { @@ -1571,6 +1600,61 @@ struct binary_wide_opt_n_def : public overloaded_base<0> }; SHAPE (binary_wide_opt_n) +/* void svfoo_t0[_t1](uint64_t, svbool_t, svbool_t, sv_t, + sv_t). */ +struct binary_za_int_m_def : public binary_za_m_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,vp,vp,t1,ts1", group, MODE_none); + } +}; +SHAPE (binary_za_int_m) + +/* void svfoo_t0[_t1](uint64_t, svbool_t, svbool_t, sv_t, sv_t). */ +struct binary_za_m_def : public binary_za_m_base<> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + /* Allow the overloaded form to be specified seperately, with just + a single suffix. This is necessary for the 64-bit SME MOP intrinsics, + which have some forms dependent on FEAT_SME_I16I64 and some forms + dependent on FEAT_SME_F64F64. The resolver needs to be defined + for base SME. */ + if (group.types[0][1] != NUM_TYPE_SUFFIXES) + build_all (b, "_,su64,vp,vp,t1,t1", group, MODE_none); + } +}; +SHAPE (binary_za_m) + +/* void svfoo_t0[_t1](uint64_t, svbool_t, svbool_t, sv_t, + sv_t). 
*/ +struct binary_za_uint_m_def : public binary_za_m_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,vp,vp,t1,tu1", group, MODE_none); + } +}; +SHAPE (binary_za_uint_m) + +/* bool svfoo(). */ +struct bool_inherent_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "sp", group, MODE_none); + } +}; +SHAPE (bool_inherent) + /* sv_t svfoo[_t0](sv_t, sv_t) _t svfoo[_n_t0](_t, sv_t). */ struct clast_def : public overloaded_base<0> @@ -2050,6 +2134,41 @@ struct inherent_b_def : public overloaded_base<0> }; SHAPE (inherent_b) +/* void svfoo_t0(). */ +struct inherent_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_", group, MODE_none); + } +}; +SHAPE (inherent_za) + +/* void svfoo_t0(uint64_t). */ +struct inherent_mask_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su64", group, MODE_none); + } +}; +SHAPE (inherent_mask_za) + +/* void svfoo_t0(uint32_t, const void *) + void svfoo_vnum_t0(uint32_t, const void *, int64_t). */ +struct ldr_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su32,al", group, MODE_none); + build_all (b, "_,su32,al,ss64", group, MODE_vnum); + } +}; +SHAPE (ldr_za) + /* sv[xN]_t svfoo[_t0](const _t *) sv[xN]_t svfoo_vnum[_t0](const _t *, int64_t). */ struct load_def : public load_contiguous_base @@ -2260,6 +2379,27 @@ struct load_replicate_def : public load_contiguous_base }; SHAPE (load_replicate) +/* void svfoo_t0(uint64_t, uint32_t, svbool_t, const void *) + void svfoo_vnum_t0(uint64_t, uint32_t, svbool_t, const void *, int64_t) + + where the first two fields form a (ZA tile, slice) pair. */ +struct load_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su64,su32,vp,al", group, MODE_none); + build_all (b, "_,su64,su32,vp,al,ss64", group, MODE_vnum); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (load_za) + /* svbool_t svfoo(enum svpattern). */ struct pattern_pred_def : public nonoverloaded_base { @@ -2354,6 +2494,46 @@ struct rdffr_def : public nonoverloaded_base }; SHAPE (rdffr) +/* sv_t svfoo_t0[_t1](uint64_t, uint32_t). 
*/ +struct read_za_def : public overloaded_base<1> +{ + bool + has_merge_argument_p (const function_instance &, unsigned int) const override + { + return true; + } + + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "t1,su64,su32", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + gcc_assert (r.pred == PRED_m); + type_suffix_index type; + if (!r.check_num_arguments (4) + || (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_integer_immediate (2) + || !r.require_scalar_type (3, "uint32_t")) + return error_mark_node; + + return r.resolve_to (type); + } + + bool + check (function_checker &c) const override + { + gcc_assert (c.pred == PRED_m); + return c.require_immediate_range (1, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (read_za) + /* _t svfoo[_t0](sv_t). */ struct reduction_def : public overloaded_base<0> { @@ -2694,6 +2874,40 @@ struct store_scatter_offset_restricted_def : public store_scatter_base }; SHAPE (store_scatter_offset_restricted) +/* void svfoo_t0(uint64_t, uint32_t, svbool_t, void *) + void svfoo_vnum_t0(uint64_t, uint32_t, svbool_t, void *, int64_t) + + where the first two fields form a (ZA tile, slice) pair. */ +struct store_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su64,su32,vp,as", group, MODE_none); + build_all (b, "_,su64,su32,vp,as,ss64", group, MODE_vnum); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (store_za) + +/* void svfoo_t0(uint32_t, void *) + void svfoo_vnum_t0(uint32_t, void *, int64_t). */ +struct str_za_def : public nonoverloaded_base +{ + void + build (function_builder &b, const function_group_info &group) const override + { + build_all (b, "_,su32,as", group, MODE_none); + build_all (b, "_,su32,as,ss64", group, MODE_vnum); + } +}; +SHAPE (str_za) + /* sv_t svfoo[_t0](svxN_t, sv_t). */ struct tbl_tuple_def : public overloaded_base<0> { @@ -3454,4 +3668,68 @@ struct unary_widen_def : public overloaded_base<0> }; SHAPE (unary_widen) +/* void svfoo_t0[_t1](uint64_t, svbool_t, svbool_t, sv_t). */ +struct unary_za_m_def : public overloaded_base<1> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,vp,vp,t1", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (4) + || !r.require_integer_immediate (0) + || !r.require_vector_type (1, VECTOR_TYPE_svbool_t) + || !r.require_vector_type (2, VECTOR_TYPE_svbool_t) + || (type = r.infer_vector_type (3)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + return r.resolve_to (type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (unary_za_m) + +/* void svfoo_t0[_t1](uint64_t, uint32_t, svbool_t, sv_t). 
*/ +struct write_za_def : public overloaded_base<1> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + b.add_overloaded_functions (group, MODE_none); + build_all (b, "_,su64,su32,vp,t1", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (4) + || !r.require_integer_immediate (0) + || !r.require_scalar_type (1, "uint32_t") + || !r.require_vector_type (2, VECTOR_TYPE_svbool_t) + || (type = r.infer_vector_type (3)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + return r.resolve_to (type); + } + + bool + check (function_checker &c) const override + { + return c.require_immediate_range (0, 0, c.num_za_tiles () - 1); + } +}; +SHAPE (write_za) + } diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h index 3b0025f85db..f7f9cdd3351 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h @@ -93,6 +93,10 @@ namespace aarch64_sve extern const function_shape *const binary_uint64_opt_n; extern const function_shape *const binary_wide; extern const function_shape *const binary_wide_opt_n; + extern const function_shape *const binary_za_int_m; + extern const function_shape *const binary_za_m; + extern const function_shape *const binary_za_uint_m; + extern const function_shape *const bool_inherent; extern const function_shape *const clast; extern const function_shape *const compare; extern const function_shape *const compare_opt_n; @@ -114,6 +118,9 @@ namespace aarch64_sve extern const function_shape *const inc_dec_pred_scalar; extern const function_shape *const inherent; extern const function_shape *const inherent_b; + extern const function_shape *const inherent_za; + extern const function_shape *const inherent_mask_za; + extern const function_shape *const ldr_za; extern const function_shape *const load; extern const function_shape *const load_ext; extern const function_shape *const load_ext_gather_index; @@ -124,6 +131,7 @@ namespace aarch64_sve extern const function_shape *const load_gather_sv_restricted; extern const function_shape *const load_gather_vs; extern const function_shape *const load_replicate; + extern const function_shape *const load_za; extern const function_shape *const mmla; extern const function_shape *const pattern_pred; extern const function_shape *const prefetch; @@ -131,6 +139,7 @@ namespace aarch64_sve extern const function_shape *const prefetch_gather_offset; extern const function_shape *const ptest; extern const function_shape *const rdffr; + extern const function_shape *const read_za; extern const function_shape *const reduction; extern const function_shape *const reduction_wide; extern const function_shape *const set; @@ -147,6 +156,8 @@ namespace aarch64_sve extern const function_shape *const store_scatter_index_restricted; extern const function_shape *const store_scatter_offset; extern const function_shape *const store_scatter_offset_restricted; + extern const function_shape *const store_za; + extern const function_shape *const str_za; extern const function_shape *const tbl_tuple; extern const function_shape *const ternary_bfloat; extern const function_shape *const ternary_bfloat_lane; @@ -185,6 +196,8 @@ namespace aarch64_sve extern const function_shape *const unary_to_uint; extern const function_shape *const unary_uint; extern const function_shape *const unary_widen; + extern const function_shape *const unary_za_m; + 
extern const function_shape *const write_za; } } diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.cc b/gcc/config/aarch64/aarch64-sve-builtins-sme.cc new file mode 100644 index 00000000000..fa6683c0088 --- /dev/null +++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.cc @@ -0,0 +1,351 @@ +/* ACLE support for AArch64 SVE (__ARM_FEATURE_SVE2 intrinsics) + Copyright (C) 2020-2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "tm_p.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "recog.h" +#include "expr.h" +#include "basic-block.h" +#include "function.h" +#include "fold-const.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimplify.h" +#include "explow.h" +#include "emit-rtl.h" +#include "aarch64-sve-builtins.h" +#include "aarch64-sve-builtins-shapes.h" +#include "aarch64-sve-builtins-base.h" +#include "aarch64-sve-builtins-sme.h" +#include "aarch64-sve-builtins-functions.h" + +using namespace aarch64_sve; + +namespace { + +class load_store_za_base : public function_base +{ +public: + tree + memory_scalar_type (const function_instance &) const override + { + return void_type_node; + } +}; + +class read_write_za_base : public function_base +{ +public: + constexpr read_write_za_base (int unspec) : m_unspec (unspec) {} + + rtx + expand (function_expander &e) const override + { + auto za_mode = e.vector_mode (0); + auto z_mode = e.vector_mode (1); + auto icode = (za_mode == VNx1TImode + ? 
code_for_aarch64_sme (m_unspec, za_mode, z_mode) + : code_for_aarch64_sme (m_unspec, z_mode, z_mode)); + return e.use_exact_insn (icode); + } + + int m_unspec; +}; + +class load_za_base : public load_store_za_base +{ +public: + unsigned int + call_properties (const function_instance &) const override + { + return CP_READ_MEMORY | CP_WRITE_ZA; + } +}; + +class store_za_base : public load_store_za_base +{ +public: + unsigned int + call_properties (const function_instance &) const override + { + return CP_WRITE_MEMORY | CP_READ_ZA; + } +}; + +static void +add_load_store_operand (function_expander &e, unsigned int base_argno) +{ + auto mode = e.vector_mode (0); + rtx base = e.get_contiguous_base (mode, base_argno, base_argno + 1, + AARCH64_FL_SM_ON); + auto mem = gen_rtx_MEM (mode, force_reg (Pmode, base)); + set_mem_align (mem, BITS_PER_UNIT); + e.add_fixed_operand (mem); +} + +class arm_has_sme_impl : public function_base +{ + gimple * + fold (gimple_folder &f) const override + { + if (TARGET_SME) + return f.fold_to_cstu (1); + return nullptr; + } + + rtx + expand (function_expander &e) const override + { + if (TARGET_SME) + return const1_rtx; + emit_insn (gen_aarch64_get_sme_state ()); + return expand_simple_binop (DImode, LSHIFTRT, + gen_rtx_REG (DImode, R0_REGNUM), + gen_int_mode (63, QImode), + e.possible_target, true, OPTAB_LIB_WIDEN); + } +}; + +class arm_in_streaming_mode_impl : public function_base +{ + gimple * + fold (gimple_folder &f) const override + { + if (TARGET_STREAMING) + return f.fold_to_cstu (1); + if (TARGET_NON_STREAMING) + return f.fold_to_cstu (0); + return nullptr; + } + + rtx + expand (function_expander &e) const override + { + if (TARGET_STREAMING) + return const1_rtx; + + if (TARGET_NON_STREAMING) + return const0_rtx; + + rtx reg; + if (TARGET_SME) + { + reg = gen_reg_rtx (DImode); + emit_insn (gen_aarch64_read_svcr (reg)); + } + else + { + emit_insn (gen_aarch64_get_sme_state ()); + reg = gen_rtx_REG (DImode, R0_REGNUM); + } + return expand_simple_binop (DImode, AND, reg, gen_int_mode (1, DImode), + e.possible_target, true, OPTAB_LIB_WIDEN); + } +}; + +/* Implements svcnts[bhwd]. */ +class svcnts_bhwd_impl : public function_base +{ +public: + constexpr svcnts_bhwd_impl (machine_mode ref_mode) : m_ref_mode (ref_mode) {} + + unsigned int + get_shift () const + { + return exact_log2 (GET_MODE_UNIT_SIZE (m_ref_mode)); + } + + gimple * + fold (gimple_folder &f) const override + { + if (TARGET_STREAMING) + return f.fold_to_cstu (GET_MODE_NUNITS (m_ref_mode)); + return nullptr; + } + + rtx + expand (function_expander &e) const override + { + rtx cntsb = aarch64_sme_vq_immediate (DImode, 16, AARCH64_ISA_MODE); + auto shift = get_shift (); + if (!shift) + return cntsb; + + return expand_simple_binop (DImode, LSHIFTRT, cntsb, + gen_int_mode (shift, QImode), + e.possible_target, true, OPTAB_LIB_WIDEN); + } + + /* The mode of the vector associated with the [bhwd] suffix. 
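+
+     For example, svcntsh uses VNx8HImode, so get_shift () returns 1 and
+     expand halves the SVL byte count to give the number of 16-bit
+     elements.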
*/ + machine_mode m_ref_mode; +}; + +class svld1_impl : public load_za_base +{ +public: + constexpr svld1_impl (int unspec) : m_unspec (unspec) {} + + rtx + expand (function_expander &e) const override + { + auto icode = code_for_aarch64_sme (m_unspec, e.vector_mode (0)); + for (int i = 0; i < 3; ++i) + e.add_input_operand (icode, e.args[i]); + add_load_store_operand (e, 3); + return e.generate_insn (icode); + } + + int m_unspec; +}; + +class svldr_impl : public load_za_base +{ +public: + rtx + expand (function_expander &e) const override + { + auto icode = CODE_FOR_aarch64_sme_ldr0; + e.add_input_operand (icode, e.args[0]); + add_load_store_operand (e, 1); + return e.generate_insn (icode); + } +}; + +class svread_impl : public read_write_za_base +{ +public: + using read_write_za_base::read_write_za_base; + + unsigned int + call_properties (const function_instance &) const override + { + return CP_READ_ZA; + } +}; + +class svst1_impl : public store_za_base +{ +public: + constexpr svst1_impl (int unspec) : m_unspec (unspec) {} + + rtx + expand (function_expander &e) const override + { + auto icode = code_for_aarch64_sme (m_unspec, e.vector_mode (0)); + add_load_store_operand (e, 3); + for (int i = 0; i < 3; ++i) + e.add_input_operand (icode, e.args[i]); + return e.generate_insn (icode); + } + + int m_unspec; +}; + +class svstr_impl : public store_za_base +{ +public: + rtx + expand (function_expander &e) const override + { + auto icode = CODE_FOR_aarch64_sme_str0; + add_load_store_operand (e, 1); + e.add_input_operand (icode, e.args[0]); + return e.generate_insn (icode); + } +}; + +class svwrite_impl : public read_write_za_base +{ +public: + using read_write_za_base::read_write_za_base; + + unsigned int + call_properties (const function_instance &) const override + { + return CP_WRITE_ZA; + } +}; + +class svzero_impl : public write_za +{ +public: + rtx + expand (function_expander &) const override + { + emit_insn (gen_aarch64_sme_zero (gen_int_mode (0xff, SImode))); + return const0_rtx; + } +}; + +class svzero_mask_impl : public write_za +{ +public: + rtx + expand (function_expander &e) const override + { + return e.use_exact_insn (CODE_FOR_aarch64_sme_zero); + } +}; + +} /* end anonymous namespace */ + +namespace aarch64_sve { + +FUNCTION (arm_has_sme, arm_has_sme_impl, ) +FUNCTION (arm_in_streaming_mode, arm_in_streaming_mode_impl, ) +FUNCTION (svaddha, za_arith_function, (UNSPEC_SME_ADDHA, + UNSPEC_SME_ADDHA, -1, 1)) +FUNCTION (svaddva, za_arith_function, (UNSPEC_SME_ADDVA, + UNSPEC_SME_ADDVA, -1, 1)) +FUNCTION (svcntsb, svcnts_bhwd_impl, (VNx16QImode)) +FUNCTION (svcntsd, svcnts_bhwd_impl, (VNx2DImode)) +FUNCTION (svcntsh, svcnts_bhwd_impl, (VNx8HImode)) +FUNCTION (svcntsw, svcnts_bhwd_impl, (VNx4SImode)) +FUNCTION (svld1_hor, svld1_impl, (UNSPEC_SME_LD1_HOR)) +FUNCTION (svld1_ver, svld1_impl, (UNSPEC_SME_LD1_VER)) +FUNCTION (svldr, svldr_impl, ) +FUNCTION (svmopa, quiet_za_arith_function, (UNSPEC_SME_SMOPA, + UNSPEC_SME_UMOPA, + UNSPEC_SME_FMOPA, 1)) +FUNCTION (svmops, quiet_za_arith_function, (UNSPEC_SME_SMOPS, + UNSPEC_SME_UMOPS, + UNSPEC_SME_FMOPS, 1)) +FUNCTION (svread_hor, svread_impl, (UNSPEC_SME_READ_HOR)) +FUNCTION (svread_ver, svread_impl, (UNSPEC_SME_READ_VER)) +FUNCTION (svst1_hor, svst1_impl, (UNSPEC_SME_ST1_HOR)) +FUNCTION (svst1_ver, svst1_impl, (UNSPEC_SME_ST1_VER)) +FUNCTION (svsumopa, quiet_za_arith_function, (UNSPEC_SME_SUMOPA, -1, -1, 1)) +FUNCTION (svsumops, quiet_za_arith_function, (UNSPEC_SME_SUMOPS, -1, -1, 1)) +FUNCTION (svusmopa, quiet_za_arith_function, (-1, 
UNSPEC_SME_USMOPA, -1, 1)) +FUNCTION (svusmops, quiet_za_arith_function, (-1, UNSPEC_SME_USMOPS, -1, 1)) +FUNCTION (svstr, svstr_impl, ) +FUNCTION (svwrite_hor, svwrite_impl, (UNSPEC_SME_WRITE_HOR)) +FUNCTION (svwrite_ver, svwrite_impl, (UNSPEC_SME_WRITE_VER)) +FUNCTION (svzero, svzero_impl, ) +FUNCTION (svzero_mask, svzero_mask_impl, ) + +} /* end namespace aarch64_sve */ diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.def b/gcc/config/aarch64/aarch64-sve-builtins-sme.def new file mode 100644 index 00000000000..a1d496dd809 --- /dev/null +++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.def @@ -0,0 +1,83 @@ +/* ACLE support for AArch64 SVE (__ARM_FEATURE_SVE intrinsics) + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#define REQUIRED_EXTENSIONS 0 +DEF_SVE_FUNCTION (arm_has_sme, bool_inherent, none, none) +DEF_SVE_FUNCTION (arm_in_streaming_mode, bool_inherent, none, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_SME +DEF_SVE_FUNCTION (svcntsb, count_inherent, none, none) +DEF_SVE_FUNCTION (svcntsd, count_inherent, none, none) +DEF_SVE_FUNCTION (svcntsh, count_inherent, none, none) +DEF_SVE_FUNCTION (svcntsw, count_inherent, none, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS AARCH64_FL_SME | AARCH64_FL_ZA_ON +DEF_SVE_FUNCTION (svldr, ldr_za, za, none) +DEF_SVE_FUNCTION (svstr, str_za, za, none) +DEF_SVE_FUNCTION (svundef, inherent_za, za, none) +DEF_SVE_FUNCTION (svzero, inherent_za, za, none) +DEF_SVE_FUNCTION (svzero_mask, inherent_mask_za, za, none) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SME \ + | AARCH64_FL_SM_ON \ + | AARCH64_FL_ZA_ON) +DEF_SVE_FUNCTION (svaddha, unary_za_m, s_za_integer, za_m) +DEF_SVE_FUNCTION (svaddva, unary_za_m, s_za_integer, za_m) +DEF_SVE_FUNCTION (svld1_hor, load_za, all_za, none) +DEF_SVE_FUNCTION (svld1_ver, load_za, all_za, none) +DEF_SVE_FUNCTION (svmopa, binary_za_m, mop_base, za_m) +DEF_SVE_FUNCTION (svmopa, binary_za_m, d_za, za_m) +DEF_SVE_FUNCTION (svmops, binary_za_m, mop_base, za_m) +DEF_SVE_FUNCTION (svmops, binary_za_m, d_za, za_m) +DEF_SVE_FUNCTION (svread_hor, read_za, all_za_data, m) +DEF_SVE_FUNCTION (svread_ver, read_za, all_za_data, m) +DEF_SVE_FUNCTION (svst1_hor, store_za, all_za, none) +DEF_SVE_FUNCTION (svst1_ver, store_za, all_za, none) +DEF_SVE_FUNCTION (svsumopa, binary_za_uint_m, mop_base_signed, za_m) +DEF_SVE_FUNCTION (svsumops, binary_za_uint_m, mop_base_signed, za_m) +DEF_SVE_FUNCTION (svusmopa, binary_za_int_m, mop_base_unsigned, za_m) +DEF_SVE_FUNCTION (svusmops, binary_za_int_m, mop_base_unsigned, za_m) +DEF_SVE_FUNCTION (svwrite_hor, write_za, all_za_data, za_m) +DEF_SVE_FUNCTION (svwrite_ver, write_za, all_za_data, za_m) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SME \ + | AARCH64_FL_SME_I16I64 \ + | AARCH64_FL_SM_ON \ + | AARCH64_FL_ZA_ON) +DEF_SVE_FUNCTION (svaddha, unary_za_m, d_za_integer, 
za_m) +DEF_SVE_FUNCTION (svaddva, unary_za_m, d_za_integer, za_m) +DEF_SVE_FUNCTION (svmopa, binary_za_m, mop_i16i64, za_m) +DEF_SVE_FUNCTION (svmops, binary_za_m, mop_i16i64, za_m) +DEF_SVE_FUNCTION (svsumopa, binary_za_uint_m, mop_i16i64_signed, za_m) +DEF_SVE_FUNCTION (svsumops, binary_za_uint_m, mop_i16i64_signed, za_m) +DEF_SVE_FUNCTION (svusmopa, binary_za_int_m, mop_i16i64_unsigned, za_m) +DEF_SVE_FUNCTION (svusmops, binary_za_int_m, mop_i16i64_unsigned, za_m) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS (AARCH64_FL_SME \ + | AARCH64_FL_SME_F64F64 \ + | AARCH64_FL_SM_ON \ + | AARCH64_FL_ZA_ON) +DEF_SVE_FUNCTION (svmopa, binary_za_m, mop_f64f64, za_m) +DEF_SVE_FUNCTION (svmops, binary_za_m, mop_f64f64, za_m) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sme.h b/gcc/config/aarch64/aarch64-sve-builtins-sme.h new file mode 100644 index 00000000000..952e6867e9f --- /dev/null +++ b/gcc/config/aarch64/aarch64-sve-builtins-sme.h @@ -0,0 +1,56 @@ +/* ACLE support for AArch64 SVE (__ARM_FEATURE_SVE intrinsics) + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#ifndef GCC_AARCH64_SVE_BUILTINS_SME_H +#define GCC_AARCH64_SVE_BUILTINS_SME_H + +namespace aarch64_sve +{ + namespace functions + { + extern const function_base *const arm_has_sme; + extern const function_base *const arm_in_streaming_mode; + extern const function_base *const svaddha; + extern const function_base *const svaddva; + extern const function_base *const svcntsb; + extern const function_base *const svcntsd; + extern const function_base *const svcntsh; + extern const function_base *const svcntsw; + extern const function_base *const svld1_hor; + extern const function_base *const svld1_ver; + extern const function_base *const svldr; + extern const function_base *const svmopa; + extern const function_base *const svmops; + extern const function_base *const svread_hor; + extern const function_base *const svread_ver; + extern const function_base *const svst1_hor; + extern const function_base *const svst1_ver; + extern const function_base *const svstr; + extern const function_base *const svsumopa; + extern const function_base *const svsumops; + extern const function_base *const svusmopa; + extern const function_base *const svusmops; + extern const function_base *const svwrite_hor; + extern const function_base *const svwrite_ver; + extern const function_base *const svzero; + extern const function_base *const svzero_mask; + } +} + +#endif diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index e50a58dcc0a..c8e8bbcdc50 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -51,6 +51,7 @@ #include "aarch64-sve-builtins.h" #include "aarch64-sve-builtins-base.h" #include "aarch64-sve-builtins-sve2.h" +#include "aarch64-sve-builtins-sme.h" #include "aarch64-sve-builtins-shapes.h" 
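+/* A rough usage sketch for the SME ACLE functions registered for
+   arm_sme.h below (illustrative only: the attribute spellings and the
+   intrinsic signatures are the ones assumed by this series, with
+   svld1_hor_za8 and svst1_hor_za8 taking a (tile, slice, predicate,
+   pointer) argument list):
+
+     #include <arm_sme.h>
+
+     __attribute__((arm_streaming, arm_shared_za))
+     void copy_row (void *dst, const void *src, uint32_t slice)
+     {
+       svbool_t pg = svptrue_b8 ();
+       svld1_hor_za8 (0, slice, pg, src);
+       svst1_hor_za8 (0, slice, pg, dst);
+     }
+*/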
namespace aarch64_sve { @@ -112,6 +113,7 @@ static const char *const pred_suffixes[NUM_PREDS + 1] = { "_m", "_x", "_z", + "_m", "" }; @@ -136,12 +138,28 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \ TYPE_##CLASS == TYPE_unsigned, \ TYPE_##CLASS == TYPE_float, \ + TYPE_##CLASS != TYPE_bool, \ TYPE_##CLASS == TYPE_bool, \ + false, \ + 0, \ + MODE }, +#define DEF_SME_ZA_SUFFIX(NAME, BITS, MODE) \ + { "_" #NAME, \ + NUM_VECTOR_TYPES, \ + NUM_TYPE_CLASSES, \ + BITS, \ + BITS / BITS_PER_UNIT, \ + false, \ + false, \ + false, \ + false, \ + false, \ + true, \ 0, \ MODE }, #include "aarch64-sve-builtins.def" { "", NUM_VECTOR_TYPES, TYPE_bool, 0, 0, false, false, false, false, - 0, VOIDmode } + false, false, 0, VOIDmode } }; /* Define a TYPES_ macro for each combination of type @@ -415,6 +433,73 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { TYPES_while1 (D, b32), \ TYPES_while1 (D, b64) +/* _za8 _za16 _za32 _za64 _za128. */ +#define TYPES_all_za(S, D) \ + S (za8), S (za16), S (za32), S (za64), S (za128) + +/* _za64. */ +#define TYPES_d_za(S, D) \ + S (za64) + +/* { _za8 } x { _s8 _u8 } + + { _za16 } x { _bf16 _f16 _s16 _u16 } + + { _za32 } x { _f32 _s32 _u32 } + + { _za64 } x { _f64 _s64 _u64 } + + { _za128 } x { _bf16 } + { _f16 _f32 _f64 } + { _s8 _s16 _s32 _s64 } + { _u8 _u16 _u32 _u64 }. */ +#define TYPES_all_za_data(S, D) \ + D (za8, s8), D (za8, u8), \ + D (za16, bf16), D (za16, f16), D (za16, s16), D (za16, u16), \ + D (za32, f32), D (za32, s32), D (za32, u32), \ + D (za64, f64), D (za64, s64), D (za64, u64), \ + TYPES_reinterpret1 (D, za128) + +/* _za32 x { _s32 _u32 }. */ +#define TYPES_s_za_integer(S, D) \ + D (za32, s32), D (za32, u32) + +/* _za64 x { _s64 _u64 }. */ +#define TYPES_d_za_integer(S, D) \ + D (za64, s64), D (za64, u64) + +/* _za32 x { _s8 _u8 _bf16 _f16 _f32 }. */ +#define TYPES_mop_base(S, D) \ + D (za32, s8), D (za32, u8), D (za32, bf16), D (za32, f16), D (za32, f32) + +/* _za32_s8. */ +#define TYPES_mop_base_signed(S, D) \ + D (za32, s8) + +/* _za32_u8. */ +#define TYPES_mop_base_unsigned(S, D) \ + D (za32, u8) + +/* _za64 x { _s16 _u16 }. */ +#define TYPES_mop_i16i64(S, D) \ + D (za64, s16), D (za64, u16) + +/* _za64_s16. */ +#define TYPES_mop_i16i64_signed(S, D) \ + D (za64, s16) + +/* _za64_u16. */ +#define TYPES_mop_i16i64_unsigned(S, D) \ + D (za64, u16) + +/* _za64 x { _f64 _f64 }. */ +#define TYPES_mop_f64f64(S, D) \ + D (za64, f64) + +/* _za. */ +#define TYPES_za(S, D) \ + S (za) + /* Describe a pair of type suffixes in which only the first is used. */ #define DEF_VECTOR_TYPE(X) { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES } @@ -482,6 +567,19 @@ DEF_SVE_TYPES_ARRAY (cvt_narrow); DEF_SVE_TYPES_ARRAY (inc_dec_n); DEF_SVE_TYPES_ARRAY (reinterpret); DEF_SVE_TYPES_ARRAY (while); +DEF_SVE_TYPES_ARRAY (all_za); +DEF_SVE_TYPES_ARRAY (d_za); +DEF_SVE_TYPES_ARRAY (all_za_data); +DEF_SVE_TYPES_ARRAY (s_za_integer); +DEF_SVE_TYPES_ARRAY (d_za_integer); +DEF_SVE_TYPES_ARRAY (mop_base); +DEF_SVE_TYPES_ARRAY (mop_base_signed); +DEF_SVE_TYPES_ARRAY (mop_base_unsigned); +DEF_SVE_TYPES_ARRAY (mop_i16i64); +DEF_SVE_TYPES_ARRAY (mop_i16i64_signed); +DEF_SVE_TYPES_ARRAY (mop_i16i64_unsigned); +DEF_SVE_TYPES_ARRAY (mop_f64f64); +DEF_SVE_TYPES_ARRAY (za); /* Used by functions that have no governing predicate. 
*/ static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; @@ -490,6 +588,9 @@ static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; explicit suffix. */ static const predication_index preds_implicit[] = { PRED_implicit, NUM_PREDS }; +/* Used by functions that only support "_m" predication. */ +static const predication_index preds_m[] = { PRED_m, NUM_PREDS }; + /* Used by functions that allow merging and "don't care" predication, but are not suitable for predicated MOVPRFX. */ static const predication_index preds_mx[] = { @@ -521,6 +622,9 @@ static const predication_index preds_z_or_none[] = { /* Used by (mostly predicate) functions that only support "_z" predication. */ static const predication_index preds_z[] = { PRED_z, NUM_PREDS }; +/* Used by SME instructions that always merge into ZA. */ +static const predication_index preds_za_m[] = { PRED_za_m, NUM_PREDS }; + /* A list of all SVE ACLE functions. */ static CONSTEXPR const function_group_info function_groups[] = { #define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \ @@ -530,8 +634,8 @@ static CONSTEXPR const function_group_info function_groups[] = { }; /* The scalar type associated with each vector type. */ -extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; -tree scalar_types[NUM_VECTOR_TYPES]; +extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES + 1]; +tree scalar_types[NUM_VECTOR_TYPES + 1]; /* The single-predicate and single-vector types, with their built-in "__SV..._t" name. Allow an index of NUM_VECTOR_TYPES, which always @@ -639,7 +743,7 @@ find_type_suffix_for_scalar_type (const_tree type) /* A linear search should be OK here, since the code isn't hot and the number of types is only small. */ for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i) - if (!type_suffixes[suffix_i].bool_p) + if (type_suffixes[suffix_i].vector_p) { vector_type_index vector_i = type_suffixes[suffix_i].vector_type; if (matches_type_p (scalar_types[vector_i], type)) @@ -707,6 +811,20 @@ check_required_extensions (location_t location, tree fndecl, return false; } + if (missing_extensions & AARCH64_FL_SM_ON) + { + error_at (location, "ACLE function %qD can only be called when" + " SME streaming mode is enabled", fndecl); + return false; + } + + if (missing_extensions & AARCH64_FL_ZA_ON) + { + error_at (location, "ACLE function %qD can only be called from" + " a function that has ZA state", fndecl); + return false; + } + static const struct { aarch64_feature_flags flag; const char *name; @@ -742,9 +860,13 @@ report_out_of_range (location_t location, tree fndecl, unsigned int argno, HOST_WIDE_INT actual, HOST_WIDE_INT min, HOST_WIDE_INT max) { - error_at (location, "passing %wd to argument %d of %qE, which expects" - " a value in the range [%wd, %wd]", actual, argno + 1, fndecl, - min, max); + if (min == max) + error_at (location, "passing %wd to argument %d of %qE, which expects" + " the value %wd", actual, argno + 1, fndecl, min); + else + error_at (location, "passing %wd to argument %d of %qE, which expects" + " a value in the range [%wd, %wd]", actual, argno + 1, fndecl, + min, max); } /* Report that LOCATION has a call to FNDECL in which argument ARGNO has @@ -830,7 +952,7 @@ function_instance::reads_global_state_p () const return true; /* Handle direct reads of global state. 
*/ - return flags & (CP_READ_MEMORY | CP_READ_FFR); + return flags & (CP_READ_MEMORY | CP_READ_FFR | CP_READ_ZA); } /* Return true if calls to the function could modify some form of @@ -851,7 +973,7 @@ function_instance::modifies_global_state_p () const return true; /* Handle direct modifications of global state. */ - return flags & (CP_WRITE_MEMORY | CP_WRITE_FFR); + return flags & (CP_WRITE_MEMORY | CP_WRITE_FFR | CP_WRITE_ZA); } /* Return true if calls to the function could raise a signal. */ @@ -871,6 +993,20 @@ function_instance::could_trap_p () const return false; } +/* Return true if the function shares ZA state with its caller. */ +bool +function_instance::shared_za_p () const +{ + return (call_properties () & (CP_READ_ZA | CP_WRITE_ZA)) != 0; +} + +/* Return true if the function preserves ZA. */ +bool +function_instance::preserves_za_p () const +{ + return (call_properties () & CP_WRITE_ZA) == 0; +} + inline hashval_t registered_function_hasher::hash (value_type value) { @@ -883,8 +1019,8 @@ registered_function_hasher::equal (value_type value, const compare_type &key) return value->instance == key; } -sve_switcher::sve_switcher () - : aarch64_simd_switcher (AARCH64_FL_F16 | AARCH64_FL_SVE) +sve_switcher::sve_switcher (aarch64_feature_flags flags) + : aarch64_simd_switcher (AARCH64_FL_F16 | AARCH64_FL_SVE | flags) { /* Changing the ISA flags and have_regs_of_mode should be enough here. We shouldn't need to pay the compile-time cost of a full target @@ -940,6 +1076,10 @@ char * function_builder::get_name (const function_instance &instance, bool overloaded_p) { + /* __arm_* functions are listed as arm_*, so that the associated GCC + code is not in the implementation namespace. */ + if (strncmp (instance.base_name, "arm_", 4) == 0) + append_name ("__"); append_name (instance.base_name); if (overloaded_p) switch (instance.displacement_units ()) @@ -981,6 +1121,11 @@ function_builder::get_attributes (const function_instance &instance) { tree attrs = NULL_TREE; + if (instance.shared_za_p ()) + attrs = add_attribute ("arm_shared_za", attrs); + if (instance.preserves_za_p ()) + attrs = add_attribute ("arm_preserves_za", attrs); + if (!instance.modifies_global_state_p ()) { if (instance.reads_global_state_p ()) @@ -1236,12 +1381,24 @@ function_resolver::lookup_form (mode_suffix_index mode, /* Resolve the function to one with the mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1. Return its function decl on - success, otherwise report an error and return error_mark_node. */ + success, otherwise report an error and return error_mark_node. + + As a convenience, resolve_to (MODE, TYPE0) can be used for functions + whose first type suffix is explicit, with TYPE0 then describing the + second type suffix rather than the first. */ tree function_resolver::resolve_to (mode_suffix_index mode, type_suffix_index type0, type_suffix_index type1) { + /* Handle convert-like functions in which the first type suffix is + explicit. 
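+     (The SME read_za shape relies on this: it passes the inferred
+     vector type, which then becomes the second suffix after the
+     explicit _za suffix.)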
*/ + if (type_suffix_ids[0] != NUM_TYPE_SUFFIXES && type0 != type_suffix_ids[0]) + { + type1 = type0; + type0 = type_suffix_ids[0]; + } + tree res = lookup_form (mode, type0, type1); if (!res) { @@ -2167,6 +2324,7 @@ bool function_resolver::check_gp_argument (unsigned int nops, unsigned int &i, unsigned int &nargs) { + gcc_assert (pred != PRED_za_m); i = 0; if (pred != PRED_none) { @@ -2367,9 +2525,7 @@ function_checker::function_checker (location_t location, unsigned int nargs, tree *args) : function_call_info (location, instance, fndecl), m_fntype (fntype), m_nargs (nargs), m_args (args), - /* We don't have to worry about unary _m operations here, since they - never have arguments that need checking. */ - m_base_arg (pred != PRED_none ? 1 : 0) + m_base_arg (pred != PRED_none && pred != PRED_za_m ? 1 : 0) { } @@ -2762,21 +2918,51 @@ function_expander::convert_to_pmode (rtx x) } /* Return the base address for a contiguous load or store function. - MEM_MODE is the mode of the addressed memory. */ + MEM_MODE is the mode of the addressed memory, BASE_ARGNO is + the index of the base argument, and VNUM_ARGNO is the index of + the vnum offset argument (if any). VL_ISA_MODE is AARCH64_FL_SM_ON + if the vnum argument is a factor of the SME vector length, 0 if it + is a factor of the current prevailing vector length. */ rtx -function_expander::get_contiguous_base (machine_mode mem_mode) +function_expander::get_contiguous_base (machine_mode mem_mode, + unsigned int base_argno, + unsigned int vnum_argno, + aarch64_feature_flags vl_isa_mode) { - rtx base = convert_to_pmode (args[1]); + rtx base = convert_to_pmode (args[base_argno]); if (mode_suffix_id == MODE_vnum) { - /* Use the size of the memory mode for extending loads and truncating - stores. Use the size of a full vector for non-extending loads - and non-truncating stores (including svld[234] and svst[234]). */ - poly_int64 size = ordered_min (GET_MODE_SIZE (mem_mode), - BYTES_PER_SVE_VECTOR); - rtx offset = gen_int_mode (size, Pmode); - offset = simplify_gen_binary (MULT, Pmode, args[2], offset); - base = simplify_gen_binary (PLUS, Pmode, base, offset); + rtx vnum = args[vnum_argno]; + if (vnum != const0_rtx) + { + /* Use the size of the memory mode for extending loads and truncating + stores. Use the size of a full vector for non-extending loads + and non-truncating stores (including svld[234] and svst[234]). */ + poly_int64 size = ordered_min (GET_MODE_SIZE (mem_mode), + BYTES_PER_SVE_VECTOR); + rtx offset; + if ((vl_isa_mode & AARCH64_FL_SM_ON) + && !TARGET_STREAMING + && !size.is_constant ()) + { + gcc_assert (known_eq (size, BYTES_PER_SVE_VECTOR)); + if (CONST_INT_P (vnum) && IN_RANGE (INTVAL (vnum), -32, 31)) + offset = aarch64_sme_vq_immediate (Pmode, INTVAL (vnum) * 16, + AARCH64_ISA_MODE); + else + { + offset = aarch64_sme_vq_immediate (Pmode, 16, + AARCH64_ISA_MODE); + offset = simplify_gen_binary (MULT, Pmode, vnum, offset); + } + } + else + { + offset = gen_int_mode (size, Pmode); + offset = simplify_gen_binary (MULT, Pmode, vnum, offset); + } + base = simplify_gen_binary (PLUS, Pmode, base, offset); + } } return base; } @@ -2883,7 +3069,7 @@ function_expander::add_input_operand (insn_code icode, rtx x) /* Add an integer operand with value X to the instruction. 
*/ void -function_expander::add_integer_operand (HOST_WIDE_INT x) +function_expander::add_integer_operand (poly_int64 x) { m_ops.safe_grow (m_ops.length () + 1, true); create_integer_operand (&m_ops.last (), x); @@ -3428,7 +3614,10 @@ init_builtins () sve_switcher sve; register_builtin_types (); if (in_lto_p) - handle_arm_sve_h (); + { + handle_arm_sve_h (); + handle_arm_sme_h (); + } } /* Register vector type TYPE under its arm_sve.h name. */ @@ -3578,7 +3767,8 @@ handle_arm_sve_h () function_table = new hash_table (1023); function_builder builder; for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i) - builder.register_function_group (function_groups[i]); + if (!(function_groups[i].required_extensions & AARCH64_FL_SME)) + builder.register_function_group (function_groups[i]); } /* Return the function decl with SVE function subcode CODE, or error_mark_node @@ -3591,6 +3781,33 @@ builtin_decl (unsigned int code, bool) return (*registered_functions)[code]->decl; } +/* Implement #pragma GCC aarch64 "arm_sme.h". */ +void +handle_arm_sme_h () +{ + if (!function_table) + { + error ("%qs defined without first defining %qs", + "arm_sme.h", "arm_sve.h"); + return; + } + + static bool initialized_p; + if (initialized_p) + { + error ("duplicate definition of %qs", "arm_sme.h"); + return; + } + initialized_p = true; + + sme_switcher sme; + + function_builder builder; + for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i) + if (function_groups[i].required_extensions & AARCH64_FL_SME) + builder.register_function_group (function_groups[i]); +} + /* If we're implementing manual overloading, check whether the SVE function with subcode CODE is overloaded, and if so attempt to determine the corresponding non-overloaded function. The call diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def index 6e4dcdbc97e..39ef94dc936 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.def +++ b/gcc/config/aarch64/aarch64-sve-builtins.def @@ -29,6 +29,10 @@ #define DEF_SVE_TYPE_SUFFIX(A, B, C, D, E) #endif +#ifndef DEF_SME_ZA_SUFFIX +#define DEF_SME_ZA_SUFFIX(A, B, C) +#endif + #ifndef DEF_SVE_FUNCTION #define DEF_SVE_FUNCTION(A, B, C, D) #endif @@ -95,10 +99,21 @@ DEF_SVE_TYPE_SUFFIX (u16, svuint16_t, unsigned, 16, VNx8HImode) DEF_SVE_TYPE_SUFFIX (u32, svuint32_t, unsigned, 32, VNx4SImode) DEF_SVE_TYPE_SUFFIX (u64, svuint64_t, unsigned, 64, VNx2DImode) +/* Arbitrarily associate _za with bytes (by analogy with char's role in C). 
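+
+   The sized suffixes below record the element size in bits, so their
+   element_bytes value (1, 2, 4, 8 or 16) also matches the number of ZA
+   tiles with that element size; num_za_tiles relies on this.  Together
+   with the vector type suffixes they produce names such as
+   svld1_hor_za8 and svmopa_za32_f16_m.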
*/ +DEF_SME_ZA_SUFFIX (za, 8, VNx16QImode) + +DEF_SME_ZA_SUFFIX (za8, 8, VNx16QImode) +DEF_SME_ZA_SUFFIX (za16, 16, VNx8HImode) +DEF_SME_ZA_SUFFIX (za32, 32, VNx4SImode) +DEF_SME_ZA_SUFFIX (za64, 64, VNx2DImode) +DEF_SME_ZA_SUFFIX (za128, 128, VNx1TImode) + #include "aarch64-sve-builtins-base.def" #include "aarch64-sve-builtins-sve2.def" +#include "aarch64-sve-builtins-sme.def" #undef DEF_SVE_FUNCTION +#undef DEF_SME_ZA_SUFFIX #undef DEF_SVE_TYPE_SUFFIX #undef DEF_SVE_TYPE #undef DEF_SVE_MODE diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 479b248bef1..f5d66987be3 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -97,6 +97,8 @@ const unsigned int CP_PREFETCH_MEMORY = 1U << 3; const unsigned int CP_WRITE_MEMORY = 1U << 4; const unsigned int CP_READ_FFR = 1U << 5; const unsigned int CP_WRITE_FFR = 1U << 6; +const unsigned int CP_READ_ZA = 1U << 7; +const unsigned int CP_WRITE_ZA = 1U << 8; /* Enumerates the SVE predicate and (data) vector types, together called "vector types" for brevity. */ @@ -142,6 +144,10 @@ enum predication_index /* Zero predication: set inactive lanes of the vector result to zero. */ PRED_z, + /* Merging predication for SME's ZA: merge into slices of the array + instead of overwriting the whole slices. */ + PRED_za_m, + NUM_PREDS }; @@ -176,6 +182,8 @@ enum type_suffix_index { #define DEF_SVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) \ TYPE_SUFFIX_ ## NAME, +#define DEF_SME_ZA_SUFFIX(NAME, BITS, MODE) \ + TYPE_SUFFIX_ ## NAME, #include "aarch64-sve-builtins.def" NUM_TYPE_SUFFIXES }; @@ -229,9 +237,13 @@ struct type_suffix_info unsigned int unsigned_p : 1; /* True if the suffix is for a floating-point type. */ unsigned int float_p : 1; + /* True if the suffix is for a vector type (integer or float). */ + unsigned int vector_p : 1; /* True if the suffix is for a boolean type. */ unsigned int bool_p : 1; - unsigned int spare : 12; + /* True if the suffix is for SME's ZA. */ + unsigned int za_p : 1; + unsigned int spare : 10; /* The associated vector or predicate mode. 
*/ machine_mode vector_mode : 16; @@ -283,6 +295,8 @@ public: bool reads_global_state_p () const; bool modifies_global_state_p () const; bool could_trap_p () const; + bool shared_za_p () const; + bool preserves_za_p () const; unsigned int vectors_per_tuple () const; tree memory_scalar_type () const; @@ -293,11 +307,13 @@ public: tree displacement_vector_type () const; units_index displacement_units () const; + unsigned int num_za_tiles () const; + const type_suffix_info &type_suffix (unsigned int) const; tree scalar_type (unsigned int) const; tree vector_type (unsigned int) const; tree tuple_type (unsigned int) const; - unsigned int elements_per_vq (unsigned int i) const; + unsigned int elements_per_vq (unsigned int) const; machine_mode vector_mode (unsigned int) const; machine_mode gp_mode (unsigned int) const; @@ -532,7 +548,8 @@ public: bool overlaps_input_p (rtx); rtx convert_to_pmode (rtx); - rtx get_contiguous_base (machine_mode); + rtx get_contiguous_base (machine_mode, unsigned int = 1, unsigned int = 2, + aarch64_feature_flags = 0); rtx get_fallback_value (machine_mode, unsigned int, unsigned int, unsigned int &); rtx get_reg_target (); @@ -540,7 +557,7 @@ public: void add_output_operand (insn_code); void add_input_operand (insn_code, rtx); - void add_integer_operand (HOST_WIDE_INT); + void add_integer_operand (poly_int64); void add_mem_operand (machine_mode, rtx); void add_address_operand (rtx); void add_fixed_operand (rtx); @@ -660,7 +677,7 @@ public: class sve_switcher : public aarch64_simd_switcher { public: - sve_switcher (); + sve_switcher (aarch64_feature_flags = 0); ~sve_switcher (); private: @@ -668,10 +685,17 @@ private: bool m_old_have_regs_of_mode[MAX_MACHINE_MODE]; }; +/* Extends sve_switch enough for defining arm_sme.h. */ +class sme_switcher : public sve_switcher +{ +public: + sme_switcher () : sve_switcher (AARCH64_FL_SME) {} +}; + extern const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1]; extern const mode_suffix_info mode_suffixes[MODE_none + 1]; -extern tree scalar_types[NUM_VECTOR_TYPES]; +extern tree scalar_types[NUM_VECTOR_TYPES + 1]; extern tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; extern tree acle_svpattern; extern tree acle_svprfop; @@ -801,6 +825,16 @@ function_instance::displacement_vector_type () const return acle_vector_types[0][mode_suffix ().displacement_vector_type]; } +/* Return the number of ZA tiles associated with the _za suffix + (which is always the first type suffix). */ +inline unsigned int +function_instance::num_za_tiles () const +{ + auto &suffix = type_suffix (0); + gcc_checking_assert (suffix.za_p); + return suffix.element_bytes; +} + /* If the function takes a vector or scalar displacement, return the units in which the displacement is measured, otherwise return UNITS_none. */ inline units_index diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index d29cfefee6b..966d13abe4c 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -5643,15 +5643,26 @@ aarch64_output_sve_scalar_inc_dec (rtx offset) } /* Return true if a single RDVL instruction can multiply FACTOR by the - number of 128-bit quadwords in an SVE vector. */ + number of 128-bit quadwords in an SVE vector. This is also the + range of ADDVL. 
*/ static bool -aarch64_sve_rdvl_factor_p (HOST_WIDE_INT factor) +aarch64_sve_rdvl_addvl_factor_p (HOST_WIDE_INT factor) { return (multiple_p (factor, 16) && IN_RANGE (factor, -32 * 16, 31 * 16)); } +/* Return true if ADDPL can be used to add FACTOR multiplied by the number + of quadwords in an SVE vector. */ + +static bool +aarch64_sve_addpl_factor_p (HOST_WIDE_INT factor) +{ + return (multiple_p (factor, 2) + && IN_RANGE (factor, -32 * 2, 31 * 2)); +} + /* Return true if we can move VALUE into a register using a single RDVL instruction. */ @@ -5659,7 +5670,7 @@ static bool aarch64_sve_rdvl_immediate_p (poly_int64 value) { HOST_WIDE_INT factor = value.coeffs[0]; - return value.coeffs[1] == factor && aarch64_sve_rdvl_factor_p (factor); + return value.coeffs[1] == factor && aarch64_sve_rdvl_addvl_factor_p (factor); } /* Likewise for rtx X. */ @@ -5695,10 +5706,8 @@ aarch64_sve_addvl_addpl_immediate_p (poly_int64 value) HOST_WIDE_INT factor = value.coeffs[0]; if (factor == 0 || value.coeffs[1] != factor) return false; - /* FACTOR counts VG / 2, so a value of 2 is one predicate width - and a value of 16 is one vector width. */ - return (((factor & 15) == 0 && IN_RANGE (factor, -32 * 16, 31 * 16)) - || ((factor & 1) == 0 && IN_RANGE (factor, -32 * 2, 31 * 2))); + return (aarch64_sve_rdvl_addvl_factor_p (factor) + || aarch64_sve_addpl_factor_p (factor)); } /* Likewise for rtx X. */ @@ -5798,11 +5807,11 @@ aarch64_output_sve_vector_inc_dec (const char *operands, rtx x) number of 128-bit quadwords in an SME vector. ISA_MODE is the ISA mode in which the calculation is being performed. */ -static rtx +rtx aarch64_sme_vq_immediate (machine_mode mode, HOST_WIDE_INT factor, aarch64_feature_flags isa_mode) { - gcc_assert (aarch64_sve_rdvl_factor_p (factor)); + gcc_assert (aarch64_sve_rdvl_addvl_factor_p (factor)); if (isa_mode & AARCH64_FL_SM_ON) /* We're in streaming mode, so we can use normal poly-int values. */ return gen_int_mode ({ factor, factor }, mode); @@ -5845,7 +5854,7 @@ aarch64_rdsvl_immediate_p (const_rtx x) { HOST_WIDE_INT factor; return (aarch64_sme_vq_unspec_p (x, &factor) - && aarch64_sve_rdvl_factor_p (factor)); + && aarch64_sve_rdvl_addvl_factor_p (factor)); } /* Return the asm string for an RDSVL instruction that calculates X, @@ -5862,6 +5871,38 @@ aarch64_output_rdsvl (const_rtx x) return buffer; } +/* Return true if X is a constant that can be added using ADDSVL or ADDSPL. */ + +bool +aarch64_addsvl_addspl_immediate_p (const_rtx x) +{ + HOST_WIDE_INT factor; + return (aarch64_sme_vq_unspec_p (x, &factor) + && (aarch64_sve_rdvl_addvl_factor_p (factor) + || aarch64_sve_addpl_factor_p (factor))); +} + +/* X is a constant that satisfies aarch64_addsvl_addspl_immediate_p. + Return the asm string for the associated instruction. */ + +char * +aarch64_output_addsvl_addspl (rtx x) +{ + static char buffer[sizeof ("addspl\t%x0, %x1, #-") + 3 * sizeof (int)]; + HOST_WIDE_INT factor; + if (!aarch64_sme_vq_unspec_p (x, &factor)) + gcc_unreachable (); + if (aarch64_sve_rdvl_addvl_factor_p (factor)) + snprintf (buffer, sizeof (buffer), "addsvl\t%%x0, %%x1, #%d", + (int) factor / 16); + else if (aarch64_sve_addpl_factor_p (factor)) + snprintf (buffer, sizeof (buffer), "addspl\t%%x0, %%x1, #%d", + (int) factor / 2); + else + gcc_unreachable (); + return buffer; +} + /* Multipliers for repeating bitmasks of width 32, 16, 8, 4, and 2. 
*/ static const unsigned HOST_WIDE_INT bitmask_imm_mul[] = @@ -6471,7 +6512,7 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, shift = 0; } /* Try to use an unshifted RDVL. */ - else if (aarch64_sve_rdvl_factor_p (factor)) + else if (aarch64_sve_rdvl_addvl_factor_p (factor)) { val = gen_int_mode (poly_int64 (factor, factor), mode); shift = 0; @@ -11354,6 +11395,9 @@ aarch64_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x) if (GET_CODE (x) == HIGH) return true; + if (aarch64_rdsvl_immediate_p (x)) + return true; + /* There's no way to calculate VL-based values using relocations. */ subrtx_iterator::array_type array; FOR_EACH_SUBRTX (iter, array, x, ALL) @@ -11569,7 +11613,7 @@ aarch64_classify_index (struct aarch64_address_info *info, rtx x, && contains_reg_of_mode[GENERAL_REGS][GET_MODE (SUBREG_REG (index))]) index = SUBREG_REG (index); - if (aarch64_sve_data_mode_p (mode)) + if (aarch64_sve_data_mode_p (mode) || mode == VNx1TImode) { if (type != ADDRESS_REG_REG || (1 << shift) != GET_MODE_UNIT_SIZE (mode)) @@ -11672,7 +11716,8 @@ aarch64_classify_address (struct aarch64_address_info *info, && ((vec_flags == 0 && known_lt (GET_MODE_SIZE (mode), 16)) || vec_flags == VEC_ADVSIMD - || vec_flags & VEC_SVE_DATA)); + || vec_flags & VEC_SVE_DATA + || mode == VNx1TImode)); /* For SVE, only accept [Rn], [Rn, #offset, MUL VL] and [Rn, Rm, LSL #shift]. The latter is not valid for SVE predicates, and that's rejected through @@ -11791,7 +11836,7 @@ aarch64_classify_address (struct aarch64_address_info *info, /* Make "m" use the LD1 offset range for SVE data modes, so that pre-RTL optimizers like ivopts will work to that instead of the wider LDR/STR range. */ - if (vec_flags == VEC_SVE_DATA) + if (vec_flags == VEC_SVE_DATA || mode == VNx1TImode) return (type == ADDR_QUERY_M ? offset_4bit_signed_scaled_p (mode, offset) : offset_9bit_signed_scaled_p (mode, offset)); @@ -14090,6 +14135,51 @@ aarch64_output_casesi (rtx *operands) return ""; } +/* Return the asm string for an SME ZERO instruction whose 8-bit mask + operand is MASK, */ +const char * +aarch64_output_sme_zero (rtx mask) +{ + auto mask_val = UINTVAL (mask); + if (mask_val == 0) + return "zero\t{}"; + + if (mask_val == 0xff) + return "zero\t{ za }"; + + static constexpr std::pair tiles[] = { + { 0xff, 'b' }, + { 0x55, 'h' }, + { 0x11, 's' }, + { 0x01, 'd' } + }; + /* The last entry in the list has the form "za7.d }", but that's the + same length as "za7.d, ". */ + static char buffer[sizeof("zero\t{ ") + sizeof ("za7.d, ") * 8 + 1]; + unsigned int i = 0; + i += snprintf (buffer + i, sizeof (buffer) - i, "zero\t"); + const char *prefix = "{ "; + for (auto &tile : tiles) + { + auto tile_mask = tile.first; + unsigned int tile_index = 0; + while (tile_mask < 0x100) + { + if ((mask_val & tile_mask) == tile_mask) + { + i += snprintf (buffer + i, sizeof (buffer) - i, "%sza%d.%c", + prefix, tile_index, tile.second); + prefix = ", "; + mask_val &= ~tile_mask; + } + tile_mask <<= 1; + tile_index += 1; + } + } + gcc_assert (mask_val == 0 && i + 3 <= sizeof (buffer)); + snprintf (buffer + i, sizeof (buffer) - i, " }"); + return buffer; +} /* Return size in bits of an arithmetic operand which is shifted/scaled and masked such that it is suitable for a UXTB, UXTH, or UXTW extend @@ -23015,6 +23105,31 @@ aarch64_sve_struct_memory_operand_p (rtx op) && offset_4bit_signed_scaled_p (SVE_BYTE_MODE, last)); } +/* Return true if OFFSET is a constant integer and if VNUM is + OFFSET * the number of bytes in an SVE vector. 
This is the requirement + that exists in SME LDR and STR instructions, where the VL offset must + equal the ZA slice offset. */ +bool +aarch64_sme_ldr_vnum_offset_p (rtx offset, rtx vnum) +{ + if (!CONST_INT_P (offset) || !IN_RANGE (INTVAL (offset), 0, 15)) + return false; + + if (TARGET_STREAMING) + { + poly_int64 const_vnum; + return (poly_int_rtx_p (vnum, &const_vnum) + && known_eq (const_vnum, + INTVAL (offset) * BYTES_PER_SVE_VECTOR)); + } + else + { + HOST_WIDE_INT factor; + return (aarch64_sme_vq_unspec_p (vnum, &factor) + && factor == INTVAL (offset) * 16); + } +} + /* Emit a register copy from operand to operand, taking care not to early-clobber source registers in the process. diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index bfa28726221..bc86d7220f1 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -207,6 +207,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* Macros to test ISA flags. */ #define AARCH64_ISA_SM_OFF (aarch64_isa_flags & AARCH64_FL_SM_OFF) +#define AARCH64_ISA_SM_ON (aarch64_isa_flags & AARCH64_FL_SM_ON) #define AARCH64_ISA_ZA_ON (aarch64_isa_flags & AARCH64_FL_ZA_ON) #define AARCH64_ISA_MODE (aarch64_isa_flags & AARCH64_FL_ISA_MODES) #define AARCH64_ISA_CRC (aarch64_isa_flags & AARCH64_FL_CRC) @@ -224,6 +225,8 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; #define AARCH64_ISA_SVE2_SHA3 (aarch64_isa_flags & AARCH64_FL_SVE2_SHA3) #define AARCH64_ISA_SVE2_SM4 (aarch64_isa_flags & AARCH64_FL_SVE2_SM4) #define AARCH64_ISA_SME (aarch64_isa_flags & AARCH64_FL_SME) +#define AARCH64_ISA_SME_I16I64 (aarch64_isa_flags & AARCH64_FL_SME_I16I64) +#define AARCH64_ISA_SME_F64F64 (aarch64_isa_flags & AARCH64_FL_SME_F64F64) #define AARCH64_ISA_V8_3A (aarch64_isa_flags & AARCH64_FL_V8_3A) #define AARCH64_ISA_DOTPROD (aarch64_isa_flags & AARCH64_FL_DOTPROD) #define AARCH64_ISA_AES (aarch64_isa_flags & AARCH64_FL_AES) @@ -256,6 +259,9 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; /* The current function is a normal non-streaming function. */ #define TARGET_NON_STREAMING (AARCH64_ISA_SM_OFF) +/* The current function has a streaming body. */ +#define TARGET_STREAMING (AARCH64_ISA_SM_ON) + /* The current function has a streaming-compatible body. */ #define TARGET_STREAMING_COMPATIBLE \ ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0) @@ -316,6 +322,15 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF; imply anything about the state of PSTATE.SM. */ #define TARGET_SME (AARCH64_ISA_SME) +/* Streaming-mode SME instructions. */ +#define TARGET_STREAMING_SME (TARGET_STREAMING && TARGET_SME) + +/* The FEAT_SME_I16I64 extension to SME, enabled through +sme-i16i64. */ +#define TARGET_SME_I16I64 (AARCH64_ISA_SME_I16I64) + +/* The FEAT_SME_F64F64 extension to SME, enabled through +sme-f64f64. */ +#define TARGET_SME_F64F64 (AARCH64_ISA_SME_F64F64) + /* ARMv8.3-A features. 
*/ #define TARGET_ARMV8_3 (AARCH64_ISA_V8_3A) diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 3ebe8690c31..de6bf5e6c4d 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -2097,10 +2097,10 @@ (define_expand "add3" (define_insn "*add3_aarch64" [(set - (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk") + (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk,rk") (plus:GPI - (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk") - (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav")))] + (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk,rk") + (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav,UaV")))] "" "@ add\\t%0, %1, %2 @@ -2109,10 +2109,12 @@ (define_insn "*add3_aarch64" sub\\t%0, %1, #%n2 # * return aarch64_output_sve_scalar_inc_dec (operands[2]); - * return aarch64_output_sve_addvl_addpl (operands[2]);" + * return aarch64_output_sve_addvl_addpl (operands[2]); + * return aarch64_output_addsvl_addspl (operands[2]);" ;; The "alu_imm" types for INC/DEC and ADDVL/ADDPL are just placeholders. - [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm,multiple,alu_imm,alu_imm") - (set_attr "arch" "*,*,simd,*,*,sve,sve")] + [(set_attr "type" "alu_imm,alu_sreg,neon_add,alu_imm,multiple,alu_imm, + alu_imm,alu_imm") + (set_attr "arch" "*,*,simd,*,*,sve,sve,sme")] ) ;; zero_extend version of above diff --git a/gcc/config/aarch64/arm_sme.h b/gcc/config/aarch64/arm_sme.h new file mode 100644 index 00000000000..ab6ec3341c3 --- /dev/null +++ b/gcc/config/aarch64/arm_sme.h @@ -0,0 +1,46 @@ +/* AArch64 SVE intrinsics include file. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + . 
*/ + +#ifndef _ARM_SME_H_ +#define _ARM_SME_H_ + +#include +#pragma GCC aarch64 "arm_sme.h" + +__attribute__((arm_streaming_compatible)) +void __arm_za_disable(void); + +__attribute__((arm_streaming_compatible, arm_preserves_za)) +void *__arm_sc_memcpy(void *, const void *, __SIZE_TYPE__); + +__attribute__((arm_streaming_compatible, arm_preserves_za)) +void *__arm_sc_memmove(void *, const void *, __SIZE_TYPE__); + +__attribute__((arm_streaming_compatible, arm_preserves_za)) +void *__arm_sc_memset(void *, int, __SIZE_TYPE__); + +__attribute__((arm_streaming_compatible, arm_preserves_za)) +void *__arm_sc_memchr(void *, int, __SIZE_TYPE__); + +#endif diff --git a/gcc/config/aarch64/constraints.md b/gcc/config/aarch64/constraints.md index 8d4393f30a1..7e35374975e 100644 --- a/gcc/config/aarch64/constraints.md +++ b/gcc/config/aarch64/constraints.md @@ -21,6 +21,9 @@ (define_register_constraint "k" "STACK_REG" "@internal The stack register.") +(define_register_constraint "Uci" "ZA_INDEX_REGS" + "@internal r12-r15, which can be used to index ZA.") + (define_register_constraint "Ucs" "TAILCALL_ADDR_REGS" "@internal Registers suitable for an indirect tail call") @@ -74,6 +77,12 @@ (define_constraint "Uav" a single ADDVL or ADDPL." (match_operand 0 "aarch64_sve_addvl_addpl_immediate")) +(define_constraint "UaV" + "@internal + A constraint that matches a VG-based constant that can be added by + a single ADDVL or ADDPL." + (match_operand 0 "aarch64_addsvl_addspl_immediate")) + (define_constraint "Uat" "@internal A constraint that matches a VG-based constant that can be added by diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 8d65fadbdf6..5a71049751d 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -450,6 +450,7 @@ (define_mode_iterator VNx4SI_ONLY [VNx4SI]) (define_mode_iterator VNx4SF_ONLY [VNx4SF]) (define_mode_iterator VNx2DI_ONLY [VNx2DI]) (define_mode_iterator VNx2DF_ONLY [VNx2DF]) +(define_mode_iterator VNx1TI_ONLY [VNx1TI]) ;; All SVE vector structure modes. (define_mode_iterator SVE_STRUCT [VNx32QI VNx16HI VNx8SI VNx4DI @@ -598,6 +599,15 @@ (define_mode_iterator PRED_HSD [VNx8BI VNx4BI VNx2BI]) ;; Bfloat16 modes to which V4SF can be converted (define_mode_iterator V4SF_TO_BF [V4BF V8BF]) +;; The modes used to represent different ZA access sizes. +(define_mode_iterator SME_ZA_I [VNx16QI VNx8HI VNx4SI VNx2DI VNx1TI]) +(define_mode_iterator SME_ZA_SDI [VNx4SI (VNx2DI "TARGET_SME_I16I64")]) + +;; The modes for which outer product instructions are supported. +(define_mode_iterator SME_MOP_BHI [VNx16QI (VNx8HI "TARGET_SME_I16I64")]) +(define_mode_iterator SME_MOP_HSDF [VNx8BF VNx8HF VNx4SF + (VNx2DF "TARGET_SME_F64F64")]) + ;; ------------------------------------------------------------------ ;; Unspec enumerations for Advance SIMD. These could well go into ;; aarch64.md but for their use in int_iterators here. @@ -976,6 +986,28 @@ (define_c_enum "unspec" UNSPEC_BFCVTN2 ; Used in aarch64-simd.md. UNSPEC_BFCVT ; Used in aarch64-simd.md. UNSPEC_FCVTXN ; Used in aarch64-simd.md. 
+ + ;; All used in aarch64-sme.md + UNSPEC_SME_ADDHA + UNSPEC_SME_ADDVA + UNSPEC_SME_FMOPA + UNSPEC_SME_FMOPS + UNSPEC_SME_LD1_HOR + UNSPEC_SME_LD1_VER + UNSPEC_SME_READ_HOR + UNSPEC_SME_READ_VER + UNSPEC_SME_SMOPA + UNSPEC_SME_SMOPS + UNSPEC_SME_ST1_HOR + UNSPEC_SME_ST1_VER + UNSPEC_SME_SUMOPA + UNSPEC_SME_SUMOPS + UNSPEC_SME_UMOPA + UNSPEC_SME_UMOPS + UNSPEC_SME_USMOPA + UNSPEC_SME_USMOPS + UNSPEC_SME_WRITE_HOR + UNSPEC_SME_WRITE_VER ]) ;; ------------------------------------------------------------------ @@ -1232,6 +1264,7 @@ (define_mode_attr Vetype [(V8QI "b") (V16QI "b") (VNx4SF "s") (VNx2SF "s") (VNx2DI "d") (VNx2DF "d") + (VNx1TI "q") (BF "h") (V4BF "h") (V8BF "h") (HF "h") (SF "s") (DF "d") @@ -1250,6 +1283,7 @@ (define_mode_attr Vesize [(VNx16QI "b") (VNx8QI "b") (VNx4QI "b") (VNx2QI "b") (VNx4SF "w") (VNx2SF "w") (VNx2DI "d") (VNx2DF "d") + (VNx1TI "q") (VNx32QI "b") (VNx48QI "b") (VNx64QI "b") (VNx16HI "h") (VNx24HI "h") (VNx32HI "h") (VNx16HF "h") (VNx24HF "h") (VNx32HF "h") @@ -1574,6 +1608,15 @@ (define_mode_attr Vmwtype [(V8QI ".8h") (V4HI ".4s") (V4HF ".4s") (V2SF ".2d") (SI "") (HI "")]) +;; Vector modes whose elements are four times wider. +(define_mode_attr V4xWIDE [(VNx16QI "VNx4SI") (VNx8HI "VNx2DI")]) + +;; Predicate modes for V4xWIDE. +(define_mode_attr V4xWIDE_PRED [(VNx16QI "VNx4BI") (VNx8HI "VNx2BI")]) + +;; Element suffix for V4xWIDE. +(define_mode_attr V4xwetype [(VNx16QI "s") (VNx8HI "d")]) + ;; Lower part register suffixes for VQW/VQ_HSF. (define_mode_attr Vhalftype [(V16QI "8b") (V8HI "4h") (V4SI "2s") (V8HF "4h") @@ -2046,6 +2089,7 @@ (define_mode_attr VPRED [(VNx16QI "VNx16BI") (VNx8QI "VNx8BI") (VNx4SF "VNx4BI") (VNx2SF "VNx2BI") (VNx2DI "VNx2BI") (VNx2DF "VNx2BI") + (VNx1TI "VNx2BI") (VNx32QI "VNx16BI") (VNx16HI "VNx8BI") (VNx16HF "VNx8BI") (VNx16BF "VNx8BI") @@ -2126,6 +2170,17 @@ (define_mode_attr sve_lane_con [(VNx8HI "y") (VNx4SI "y") (VNx2DI "x") ;; The constraint to use for an SVE FCMLA lane index. (define_mode_attr sve_lane_pair_con [(VNx8HF "y") (VNx4SF "x")]) +(define_mode_attr SME_FMOP_WIDE [(VNx8BF "VNx4SF") (VNx8HF "VNx4SF") + (VNx4SF "VNx4SF") (VNx2DF "VNx2DF")]) + +(define_mode_attr SME_FMOP_WIDE_PRED [(VNx8BF "VNx4BI") (VNx8HF "VNx4BI") + (VNx4SF "VNx4BI") (VNx2DF "VNx2BI")]) + +(define_mode_attr sme_fmop_wide_etype [(VNx8BF "s") (VNx8HF "s") + (VNx4SF "s") (VNx2DF "d")]) + +(define_mode_attr b [(VNx8BF "b") (VNx8HF "") (VNx4SF "") (VNx2DF "")]) + ;; ------------------------------------------------------------------- ;; Code Iterators ;; ------------------------------------------------------------------- @@ -3160,6 +3215,20 @@ (define_int_iterator FCMLA_OP [UNSPEC_FCMLA (define_int_iterator FCMUL_OP [UNSPEC_FCMUL UNSPEC_FCMUL_CONJ]) +(define_int_iterator SME_LD1 [UNSPEC_SME_LD1_HOR UNSPEC_SME_LD1_VER]) +(define_int_iterator SME_READ [UNSPEC_SME_READ_HOR UNSPEC_SME_READ_VER]) +(define_int_iterator SME_ST1 [UNSPEC_SME_ST1_HOR UNSPEC_SME_ST1_VER]) +(define_int_iterator SME_WRITE [UNSPEC_SME_WRITE_HOR UNSPEC_SME_WRITE_VER]) + +(define_int_iterator SME_UNARY_SDI [UNSPEC_SME_ADDHA UNSPEC_SME_ADDVA]) + +(define_int_iterator SME_INT_MOP [UNSPEC_SME_SMOPA UNSPEC_SME_SMOPS + UNSPEC_SME_SUMOPA UNSPEC_SME_SUMOPS + UNSPEC_SME_UMOPA UNSPEC_SME_UMOPS + UNSPEC_SME_USMOPA UNSPEC_SME_USMOPS]) + +(define_int_iterator SME_FP_MOP [UNSPEC_SME_FMOPA UNSPEC_SME_FMOPS]) + ;; Iterators for atomic operations. 
(define_int_iterator ATOMIC_LDOP @@ -3232,6 +3301,26 @@ (define_int_attr optab [(UNSPEC_ANDF "and") (UNSPEC_PMULLT "pmullt") (UNSPEC_PMULLT_PAIR "pmullt_pair") (UNSPEC_SMATMUL "smatmul") + (UNSPEC_SME_ADDHA "addha") + (UNSPEC_SME_ADDVA "addva") + (UNSPEC_SME_FMOPA "fmopa") + (UNSPEC_SME_FMOPS "fmops") + (UNSPEC_SME_LD1_HOR "ld1_hor") + (UNSPEC_SME_LD1_VER "ld1_ver") + (UNSPEC_SME_READ_HOR "read_hor") + (UNSPEC_SME_READ_VER "read_ver") + (UNSPEC_SME_SMOPA "smopa") + (UNSPEC_SME_SMOPS "smops") + (UNSPEC_SME_ST1_HOR "st1_hor") + (UNSPEC_SME_ST1_VER "st1_ver") + (UNSPEC_SME_SUMOPA "sumopa") + (UNSPEC_SME_SUMOPS "sumops") + (UNSPEC_SME_UMOPA "umopa") + (UNSPEC_SME_UMOPS "umops") + (UNSPEC_SME_USMOPA "usmopa") + (UNSPEC_SME_USMOPS "usmops") + (UNSPEC_SME_WRITE_HOR "write_hor") + (UNSPEC_SME_WRITE_VER "write_ver") (UNSPEC_SQCADD90 "sqcadd90") (UNSPEC_SQCADD270 "sqcadd270") (UNSPEC_SQRDCMLAH "sqrdcmlah") @@ -4001,6 +4090,15 @@ (define_int_attr min_elem_bits [(UNSPEC_RBIT "8") (define_int_attr unspec [(UNSPEC_WHILERW "UNSPEC_WHILERW") (UNSPEC_WHILEWR "UNSPEC_WHILEWR")]) +(define_int_attr hv [(UNSPEC_SME_LD1_HOR "h") + (UNSPEC_SME_LD1_VER "v") + (UNSPEC_SME_READ_HOR "h") + (UNSPEC_SME_READ_VER "v") + (UNSPEC_SME_ST1_HOR "h") + (UNSPEC_SME_ST1_VER "v") + (UNSPEC_SME_WRITE_HOR "h") + (UNSPEC_SME_WRITE_VER "v")]) + ;; Iterators and attributes for fpcr fpsr getter setters (define_int_iterator GET_FPSCR diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md index c308015ac2c..9e4a70ad9e9 100644 --- a/gcc/config/aarch64/predicates.md +++ b/gcc/config/aarch64/predicates.md @@ -168,11 +168,17 @@ (define_predicate "aarch64_split_add_offset_immediate" (and (match_code "const_poly_int") (match_test "aarch64_add_offset_temporaries (op) == 1"))) +(define_predicate "aarch64_addsvl_addspl_immediate" + (and (match_code "const") + (match_test "aarch64_addsvl_addspl_immediate_p (op)"))) + (define_predicate "aarch64_pluslong_operand" (ior (match_operand 0 "register_operand") (match_operand 0 "aarch64_pluslong_immediate") (and (match_test "TARGET_SVE") - (match_operand 0 "aarch64_sve_plus_immediate")))) + (match_operand 0 "aarch64_sve_plus_immediate")) + (and (match_test "TARGET_SME") + (match_operand 0 "aarch64_addsvl_addspl_immediate")))) (define_predicate "aarch64_pluslong_or_poly_operand" (ior (match_operand 0 "aarch64_pluslong_operand") diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64 index c1c8f5c7dae..2438b78a87f 100644 --- a/gcc/config/aarch64/t-aarch64 +++ b/gcc/config/aarch64/t-aarch64 @@ -63,6 +63,7 @@ aarch64-sve-builtins.o: $(srcdir)/config/aarch64/aarch64-sve-builtins.cc \ $(srcdir)/config/aarch64/aarch64-sve-builtins.def \ $(srcdir)/config/aarch64/aarch64-sve-builtins-base.def \ $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.def \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.def \ $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \ $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) $(DIAGNOSTIC_H) \ $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \ @@ -72,7 +73,8 @@ aarch64-sve-builtins.o: $(srcdir)/config/aarch64/aarch64-sve-builtins.cc \ $(srcdir)/config/aarch64/aarch64-sve-builtins.h \ $(srcdir)/config/aarch64/aarch64-sve-builtins-shapes.h \ $(srcdir)/config/aarch64/aarch64-sve-builtins-base.h \ - $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.h + $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.h $(COMPILER) -c $(ALL_COMPILERFLAGS) 
$(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/aarch64/aarch64-sve-builtins.cc @@ -113,6 +115,19 @@ aarch64-sve-builtins-sve2.o: \ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/aarch64/aarch64-sve-builtins-sve2.cc +aarch64-sve-builtins-sme.o: \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.cc \ + $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \ + $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \ + gimple-iterator.h gimplify.h explow.h $(EMIT_RTL_H) \ + $(srcdir)/config/aarch64/aarch64-sve-builtins.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-shapes.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.h \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-functions.h + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(srcdir)/config/aarch64/aarch64-sve-builtins-sme.cc + aarch64-builtin-iterators.h: $(srcdir)/config/aarch64/geniterators.sh \ $(srcdir)/config/aarch64/iterators.md $(SHELL) $(srcdir)/config/aarch64/geniterators.sh \ diff --git a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst index f6d82f4435b..03da1a867bb 100644 --- a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst +++ b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst @@ -547,6 +547,12 @@ the following and their inverses no :samp:`{feature}` : :samp:`sme` Enable the Scalable Matrix Extension. +:samp:`sme-i16i64` + Enable the FEAT_SME_I16I64 extension to SME. + +:samp:`sme-f64f64` + Enable the FEAT_SME_F64F64 extension to SME. + Feature ``crypto`` implies ``aes``, ``sha2``, and ``simd``, which implies ``fp``. Conversely, ``nofp`` implies ``nosimd``, which implies diff --git a/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme-acle-asm.exp b/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme-acle-asm.exp new file mode 100644 index 00000000000..f05b9de76c5 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sme/aarch64-sme-acle-asm.exp @@ -0,0 +1,86 @@ +# Assembly-based regression-test driver for the SME ACLE +# Copyright (C) 2009-2022 Free Software Foundation, Inc. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } { + return +} + +# Load support procs. +load_lib g++-dg.exp + +# Initialize `dg'. +dg-init + +# Force SME if we're not testing it already. +if { [check_effective_target_aarch64_sme] } { + set sme_flags "" +} else { + set sme_flags "-march=armv8.2-a+sme" +} + +# Turn off any codegen tweaks by default that may affect expected assembly. +# Tests relying on those should turn them on explicitly. 
+set sme_flags "$sme_flags -mtune=generic -moverride=tune=none" + +global gcc_runtest_parallelize_limit_minor +if { [info exists gcc_runtest_parallelize_limit_minor] } { + set old_limit_minor $gcc_runtest_parallelize_limit_minor + set gcc_runtest_parallelize_limit_minor 1 +} + +torture-init +set-torture-options { + "-std=c++98 -O0 -g" + "-std=c++98 -O1 -g" + "-std=c++11 -O2 -g" + "-std=c++14 -O3 -g" + "-std=c++17 -Og -g" + "-std=c++2a -Os -g" + "-std=gnu++98 -O2 -fno-schedule-insns -fno-schedule-insns2 -DCHECK_ASM --save-temps" + "-std=gnu++11 -Ofast -g" + "-std=gnu++17 -O3 -g" + "-std=gnu++2a -O0 -g" +} { + "-DTEST_FULL" + "-DTEST_OVERLOADS" +} + +# Main loop. +set gcc_subdir [string replace $subdir 0 2 gcc] +set files [glob -nocomplain $srcdir/$gcc_subdir/acle-asm/*.c] +set save-dg-do-what-default ${dg-do-what-default} +if { [check_effective_target_aarch64_asm_sme-i16i64_ok] } { + set dg-do-what-default assemble +} else { + set dg-do-what-default compile +} +gcc-dg-runtest [lsort $files] "" "$sme_flags -fno-ipa-icf" +set dg-do-what-default ${save-dg-do-what-default} + +torture-finish + +if { [info exists gcc_runtest_parallelize_limit_minor] } { + set gcc_runtest_parallelize_limit_minor $old_limit_minor +} + +# All done. +dg-finish diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_4.c b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_4.c index 9591e3d01d6..8ad86a3c024 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_4.c +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_4.c @@ -4,6 +4,6 @@ to be diagnosed. Any attempt to call the function before including arm_sve.h will lead to a link failure. (Same for taking its address, etc.) */ -extern __SVUint8_t svadd_u8_x (__SVBool_t, __SVUint8_t, __SVUint8_t); +extern __attribute__((arm_preserves_za)) __SVUint8_t svadd_u8_x (__SVBool_t, __SVUint8_t, __SVUint8_t); #pragma GCC aarch64 "arm_sve.h" diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_5.c b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_5.c index f87201984b8..7c2f4c440cb 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_5.c +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_5.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ -__SVUint8_t +__SVUint8_t __attribute__((arm_preserves_za)) svadd_u8_x (__SVBool_t pg, __SVUint8_t x, __SVUint8_t y) { return x; diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_7.c b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_7.c index 1f2e4bf66a3..31b8d7ddfab 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_7.c +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/func_redef_7.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ -__SVUint8_t +__SVUint8_t __attribute__((arm_preserves_za)) svadd_x (__SVBool_t pg, __SVUint8_t x, __SVUint8_t y) { return x; diff --git a/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp new file mode 100644 index 00000000000..ad998583b70 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/aarch64-sme-acle-asm.exp @@ -0,0 +1,82 @@ +# Assembly-based regression-test driver for the SME ACLE +# Copyright (C) 2009-2022 Free Software Foundation, Inc. +# +# This file is part of GCC. 
+# +# GCC is free software; you can redistribute it and/or modify it +# under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, but +# WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . */ + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't an AArch64 target. +if {![istarget aarch64*-*-*] } { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +# Initialize `dg'. +dg-init + +# Force SME if we're not testing it already. +if { [check_effective_target_aarch64_sme] } { + set sme_flags "" +} else { + set sme_flags "-march=armv8.2-a+sme" +} + +# Turn off any codegen tweaks by default that may affect expected assembly. +# Tests relying on those should turn them on explicitly. +set sme_flags "$sme_flags -mtune=generic -moverride=tune=none" + +global gcc_runtest_parallelize_limit_minor +if { [info exists gcc_runtest_parallelize_limit_minor] } { + set old_limit_minor $gcc_runtest_parallelize_limit_minor + set gcc_runtest_parallelize_limit_minor 1 +} + +torture-init +set-torture-options { + "-std=c90 -O0 -g" + "-std=c90 -O1 -g" + "-std=c99 -O2 -g" + "-std=c11 -O3 -g" + "-std=gnu90 -O2 -fno-schedule-insns -fno-schedule-insns2 -DCHECK_ASM --save-temps" + "-std=gnu99 -Ofast -g" + "-std=gnu11 -Os -g" +} { + "-DTEST_FULL" + "-DTEST_OVERLOADS" +} + +# Main loop. +set files [glob -nocomplain $srcdir/$subdir/acle-asm/*.c] +set save-dg-do-what-default ${dg-do-what-default} +if { [check_effective_target_aarch64_asm_sme-i16i64_ok] } { + set dg-do-what-default assemble +} else { + set dg-do-what-default compile +} +gcc-dg-runtest [lsort $files] "" "$sme_flags -fno-ipa-icf" +set dg-do-what-default ${save-dg-do-what-default} + +torture-finish + +if { [info exists gcc_runtest_parallelize_limit_minor] } { + set gcc_runtest_parallelize_limit_minor $old_limit_minor +} + +# All done. 
+dg-finish diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za32.c new file mode 100644 index 00000000000..8dee401458c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za32.c @@ -0,0 +1,48 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** addha_za32_s32_0_p0_p1_z0: +** addha za0\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addha_za32_s32_0_p0_p1_z0, svint32_t, + svaddha_za32_s32_m (0, p0, p1, z0), + svaddha_za32_m (0, p0, p1, z0)) + +/* +** addha_za32_s32_0_p1_p0_z1: +** addha za0\.s, p1/m, p0/m, z1\.s +** ret +*/ +TEST_UNIFORM_ZA (addha_za32_s32_0_p1_p0_z1, svint32_t, + svaddha_za32_s32_m (0, p1, p0, z1), + svaddha_za32_m (0, p1, p0, z1)) + +/* +** addha_za32_s32_1_p0_p1_z0: +** addha za1\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addha_za32_s32_1_p0_p1_z0, svint32_t, + svaddha_za32_s32_m (1, p0, p1, z0), + svaddha_za32_m (1, p0, p1, z0)) + +/* +** addha_za32_s32_3_p0_p1_z0: +** addha za3\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addha_za32_s32_3_p0_p1_z0, svint32_t, + svaddha_za32_s32_m (3, p0, p1, z0), + svaddha_za32_m (3, p0, p1, z0)) + +/* +** addha_za32_u32_0_p0_p1_z0: +** addha za0\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addha_za32_u32_0_p0_p1_z0, svuint32_t, + svaddha_za32_u32_m (0, p0, p1, z0), + svaddha_za32_m (0, p0, p1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za64.c new file mode 100644 index 00000000000..363ff1aab21 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addha_za64.c @@ -0,0 +1,50 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** addha_za64_s64_0_p0_p1_z0: +** addha za0\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addha_za64_s64_0_p0_p1_z0, svint64_t, + svaddha_za64_s64_m (0, p0, p1, z0), + svaddha_za64_m (0, p0, p1, z0)) + +/* +** addha_za64_s64_0_p1_p0_z1: +** addha za0\.d, p1/m, p0/m, z1\.d +** ret +*/ +TEST_UNIFORM_ZA (addha_za64_s64_0_p1_p0_z1, svint64_t, + svaddha_za64_s64_m (0, p1, p0, z1), + svaddha_za64_m (0, p1, p0, z1)) + +/* +** addha_za64_s64_1_p0_p1_z0: +** addha za1\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addha_za64_s64_1_p0_p1_z0, svint64_t, + svaddha_za64_s64_m (1, p0, p1, z0), + svaddha_za64_m (1, p0, p1, z0)) + +/* +** addha_za64_s64_7_p0_p1_z0: +** addha za7\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addha_za64_s64_7_p0_p1_z0, svint64_t, + svaddha_za64_s64_m (7, p0, p1, z0), + svaddha_za64_m (7, p0, p1, z0)) + +/* +** addha_za64_u64_0_p0_p1_z0: +** addha za0\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addha_za64_u64_0_p0_p1_z0, svuint64_t, + svaddha_za64_u64_m (0, p0, p1, z0), + svaddha_za64_m (0, p0, p1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za32.c new file mode 100644 index 00000000000..0de019ac86a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za32.c @@ -0,0 +1,48 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** addva_za32_s32_0_p0_p1_z0: +** addva za0\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addva_za32_s32_0_p0_p1_z0, svint32_t, + svaddva_za32_s32_m (0, p0, p1, z0), + svaddva_za32_m (0, p0, p1, z0)) + +/* +** 
addva_za32_s32_0_p1_p0_z1: +** addva za0\.s, p1/m, p0/m, z1\.s +** ret +*/ +TEST_UNIFORM_ZA (addva_za32_s32_0_p1_p0_z1, svint32_t, + svaddva_za32_s32_m (0, p1, p0, z1), + svaddva_za32_m (0, p1, p0, z1)) + +/* +** addva_za32_s32_1_p0_p1_z0: +** addva za1\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addva_za32_s32_1_p0_p1_z0, svint32_t, + svaddva_za32_s32_m (1, p0, p1, z0), + svaddva_za32_m (1, p0, p1, z0)) + +/* +** addva_za32_s32_3_p0_p1_z0: +** addva za3\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addva_za32_s32_3_p0_p1_z0, svint32_t, + svaddva_za32_s32_m (3, p0, p1, z0), + svaddva_za32_m (3, p0, p1, z0)) + +/* +** addva_za32_u32_0_p0_p1_z0: +** addva za0\.s, p0/m, p1/m, z0\.s +** ret +*/ +TEST_UNIFORM_ZA (addva_za32_u32_0_p0_p1_z0, svuint32_t, + svaddva_za32_u32_m (0, p0, p1, z0), + svaddva_za32_m (0, p0, p1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za64.c new file mode 100644 index 00000000000..d83d4e03c6a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/addva_za64.c @@ -0,0 +1,50 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** addva_za64_s64_0_p0_p1_z0: +** addva za0\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addva_za64_s64_0_p0_p1_z0, svint64_t, + svaddva_za64_s64_m (0, p0, p1, z0), + svaddva_za64_m (0, p0, p1, z0)) + +/* +** addva_za64_s64_0_p1_p0_z1: +** addva za0\.d, p1/m, p0/m, z1\.d +** ret +*/ +TEST_UNIFORM_ZA (addva_za64_s64_0_p1_p0_z1, svint64_t, + svaddva_za64_s64_m (0, p1, p0, z1), + svaddva_za64_m (0, p1, p0, z1)) + +/* +** addva_za64_s64_1_p0_p1_z0: +** addva za1\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addva_za64_s64_1_p0_p1_z0, svint64_t, + svaddva_za64_s64_m (1, p0, p1, z0), + svaddva_za64_m (1, p0, p1, z0)) + +/* +** addva_za64_s64_7_p0_p1_z0: +** addva za7\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addva_za64_s64_7_p0_p1_z0, svint64_t, + svaddva_za64_s64_m (7, p0, p1, z0), + svaddva_za64_m (7, p0, p1, z0)) + +/* +** addva_za64_u64_0_p0_p1_z0: +** addva za0\.d, p0/m, p1/m, z0\.d +** ret +*/ +TEST_UNIFORM_ZA (addva_za64_u64_0_p0_p1_z0, svuint64_t, + svaddva_za64_u64_m (0, p0, p1, z0), + svaddva_za64_m (0, p0, p1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_has_sme_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_has_sme_sc.c new file mode 100644 index 00000000000..e37793f9e75 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_has_sme_sc.c @@ -0,0 +1,25 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +#pragma GCC target "+nosme" + +/* +** test_nosme: +** ... +** bl __arm_sme_state +** lsr x0, x0, #?63 +** ... 
+*/ +PROTO (test_nosme, int, ()) { return __arm_has_sme (); } + +#pragma GCC target "+sme" + +/* +** test_sme: +** mov w0, #?1 +** ret +*/ +PROTO (test_sme, int, ()) { return __arm_has_sme (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_ns.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_ns.c new file mode 100644 index 00000000000..ba475d67bb2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_ns.c @@ -0,0 +1,11 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define NON_STREAMING +#include "test_sme_acle.h" + +/* +** test_sme: +** mov w0, #?0 +** ret +*/ +PROTO (test_sme, int, ()) { return __arm_in_streaming_mode (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_s.c new file mode 100644 index 00000000000..b88d47921bb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_s.c @@ -0,0 +1,11 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** test_sme: +** mov w0, #?1 +** ret +*/ +PROTO (test_sme, int, ()) { return __arm_in_streaming_mode (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_sc.c new file mode 100644 index 00000000000..fb3588a642e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/arm_in_streaming_mode_sc.c @@ -0,0 +1,26 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +#pragma GCC target "+nosme" + +/* +** test_nosme: +** ... +** bl __arm_sme_state +** and w0, w0, #?1 +** ... 
+*/ +PROTO (test_nosme, int, ()) { return __arm_in_streaming_mode (); } + +#pragma GCC target "+sme" + +/* +** test_sme: +** mrs x([0-9]+), svcr +** and w0, w\1, #?1 +** ret +*/ +PROTO (test_sme, int, ()) { return __arm_in_streaming_mode (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_s.c new file mode 100644 index 00000000000..0a8de45be4d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_s.c @@ -0,0 +1,310 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntb_1: +** cntb x0 +** ret +*/ +PROTO (cntb_1, uint64_t, ()) { return svcntsb (); } + +/* +** cntb_2: +** cntb x0, all, mul #2 +** ret +*/ +PROTO (cntb_2, uint64_t, ()) { return svcntsb () * 2; } + +/* +** cntb_3: +** cntb x0, all, mul #3 +** ret +*/ +PROTO (cntb_3, uint64_t, ()) { return svcntsb () * 3; } + +/* +** cntb_4: +** cntb x0, all, mul #4 +** ret +*/ +PROTO (cntb_4, uint64_t, ()) { return svcntsb () * 4; } + +/* +** cntb_8: +** cntb x0, all, mul #8 +** ret +*/ +PROTO (cntb_8, uint64_t, ()) { return svcntsb () * 8; } + +/* +** cntb_15: +** cntb x0, all, mul #15 +** ret +*/ +PROTO (cntb_15, uint64_t, ()) { return svcntsb () * 15; } + +/* +** cntb_16: +** cntb x0, all, mul #16 +** ret +*/ +PROTO (cntb_16, uint64_t, ()) { return svcntsb () * 16; } + +/* +** cntb_17: +** rdvl x0, #17 +** ret +*/ +PROTO (cntb_17, uint64_t, ()) { return svcntsb () * 17; } + +/* +** cntb_31: +** rdvl x0, #31 +** ret +*/ +PROTO (cntb_31, uint64_t, ()) { return svcntsb () * 31; } + +/* +** cntb_32: +** cntb (x[0-9]+) +** lsl x0, \1, 5 +** ret +*/ +PROTO (cntb_32, uint64_t, ()) { return svcntsb () * 32; } + +/* Other sequences would be OK. */ +/* +** cntb_33: +** cntb (x[0-9]+) +** lsl x0, \1, 5 +** incb x0 +** ret +*/ +PROTO (cntb_33, uint64_t, ()) { return svcntsb () * 33; } + +/* +** cntb_64: +** cntb (x[0-9]+) +** lsl x0, \1, 6 +** ret +*/ +PROTO (cntb_64, uint64_t, ()) { return svcntsb () * 64; } + +/* +** cntb_128: +** cntb (x[0-9]+) +** lsl x0, \1, 7 +** ret +*/ +PROTO (cntb_128, uint64_t, ()) { return svcntsb () * 128; } + +/* Other sequences would be OK. 
*/ +/* +** cntb_129: +** cntb (x[0-9]+) +** lsl x0, \1, 7 +** incb x0 +** ret +*/ +PROTO (cntb_129, uint64_t, ()) { return svcntsb () * 129; } + +/* +** cntb_m1: +** rdvl x0, #-1 +** ret +*/ +PROTO (cntb_m1, uint64_t, ()) { return -svcntsb (); } + +/* +** cntb_m13: +** rdvl x0, #-13 +** ret +*/ +PROTO (cntb_m13, uint64_t, ()) { return -svcntsb () * 13; } + +/* +** cntb_m15: +** rdvl x0, #-15 +** ret +*/ +PROTO (cntb_m15, uint64_t, ()) { return -svcntsb () * 15; } + +/* +** cntb_m16: +** rdvl x0, #-16 +** ret +*/ +PROTO (cntb_m16, uint64_t, ()) { return -svcntsb () * 16; } + +/* +** cntb_m17: +** rdvl x0, #-17 +** ret +*/ +PROTO (cntb_m17, uint64_t, ()) { return -svcntsb () * 17; } + +/* +** cntb_m32: +** rdvl x0, #-32 +** ret +*/ +PROTO (cntb_m32, uint64_t, ()) { return -svcntsb () * 32; } + +/* +** cntb_m33: +** rdvl x0, #-32 +** decb x0 +** ret +*/ +PROTO (cntb_m33, uint64_t, ()) { return -svcntsb () * 33; } + +/* +** cntb_m34: +** rdvl (x[0-9]+), #-17 +** lsl x0, \1, #?1 +** ret +*/ +PROTO (cntb_m34, uint64_t, ()) { return -svcntsb () * 34; } + +/* +** cntb_m64: +** rdvl (x[0-9]+), #-1 +** lsl x0, \1, #?6 +** ret +*/ +PROTO (cntb_m64, uint64_t, ()) { return -svcntsb () * 64; } + +/* +** incb_1: +** incb x0 +** ret +*/ +PROTO (incb_1, uint64_t, (uint64_t x0)) { return x0 + svcntsb (); } + +/* +** incb_2: +** incb x0, all, mul #2 +** ret +*/ +PROTO (incb_2, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 2; } + +/* +** incb_3: +** incb x0, all, mul #3 +** ret +*/ +PROTO (incb_3, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 3; } + +/* +** incb_4: +** incb x0, all, mul #4 +** ret +*/ +PROTO (incb_4, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 4; } + +/* +** incb_8: +** incb x0, all, mul #8 +** ret +*/ +PROTO (incb_8, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 8; } + +/* +** incb_15: +** incb x0, all, mul #15 +** ret +*/ +PROTO (incb_15, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 15; } + +/* +** incb_16: +** incb x0, all, mul #16 +** ret +*/ +PROTO (incb_16, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 16; } + +/* +** incb_17: +** addvl x0, x0, #17 +** ret +*/ +PROTO (incb_17, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 17; } + +/* +** incb_31: +** addvl x0, x0, #31 +** ret +*/ +PROTO (incb_31, uint64_t, (uint64_t x0)) { return x0 + svcntsb () * 31; } + +/* +** decb_1: +** decb x0 +** ret +*/ +PROTO (decb_1, uint64_t, (uint64_t x0)) { return x0 - svcntsb (); } + +/* +** decb_2: +** decb x0, all, mul #2 +** ret +*/ +PROTO (decb_2, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 2; } + +/* +** decb_3: +** decb x0, all, mul #3 +** ret +*/ +PROTO (decb_3, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 3; } + +/* +** decb_4: +** decb x0, all, mul #4 +** ret +*/ +PROTO (decb_4, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 4; } + +/* +** decb_8: +** decb x0, all, mul #8 +** ret +*/ +PROTO (decb_8, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 8; } + +/* +** decb_15: +** decb x0, all, mul #15 +** ret +*/ +PROTO (decb_15, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 15; } + +/* +** decb_16: +** decb x0, all, mul #16 +** ret +*/ +PROTO (decb_16, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 16; } + +/* +** decb_17: +** addvl x0, x0, #-17 +** ret +*/ +PROTO (decb_17, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 17; } + +/* +** decb_31: +** addvl x0, x0, #-31 +** ret +*/ +PROTO (decb_31, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 31; } + +/* +** decb_32: +** addvl x0, x0, #-32 +** ret +*/ +PROTO 
(decb_32, uint64_t, (uint64_t x0)) { return x0 - svcntsb () * 32; } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_sc.c new file mode 100644 index 00000000000..9ee4c8afc36 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsb_sc.c @@ -0,0 +1,12 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntsb: +** rdsvl x0, #1 +** ret +*/ +PROTO (cntsb, uint64_t, ()) { return svcntsb (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_s.c new file mode 100644 index 00000000000..3bf9498e925 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_s.c @@ -0,0 +1,277 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntd_1: +** cntd x0 +** ret +*/ +PROTO (cntd_1, uint64_t, ()) { return svcntsd (); } + +/* +** cntd_2: +** cntw x0 +** ret +*/ +PROTO (cntd_2, uint64_t, ()) { return svcntsd () * 2; } + +/* +** cntd_3: +** cntd x0, all, mul #3 +** ret +*/ +PROTO (cntd_3, uint64_t, ()) { return svcntsd () * 3; } + +/* +** cntd_4: +** cnth x0 +** ret +*/ +PROTO (cntd_4, uint64_t, ()) { return svcntsd () * 4; } + +/* +** cntd_8: +** cntb x0 +** ret +*/ +PROTO (cntd_8, uint64_t, ()) { return svcntsd () * 8; } + +/* +** cntd_15: +** cntd x0, all, mul #15 +** ret +*/ +PROTO (cntd_15, uint64_t, ()) { return svcntsd () * 15; } + +/* +** cntd_16: +** cntb x0, all, mul #2 +** ret +*/ +PROTO (cntd_16, uint64_t, ()) { return svcntsd () * 16; } + +/* Other sequences would be OK. */ +/* +** cntd_17: +** rdvl (x[0-9]+), #17 +** asr x0, \1, 3 +** ret +*/ +PROTO (cntd_17, uint64_t, ()) { return svcntsd () * 17; } + +/* +** cntd_32: +** cntb x0, all, mul #4 +** ret +*/ +PROTO (cntd_32, uint64_t, ()) { return svcntsd () * 32; } + +/* +** cntd_64: +** cntb x0, all, mul #8 +** ret +*/ +PROTO (cntd_64, uint64_t, ()) { return svcntsd () * 64; } + +/* +** cntd_128: +** cntb x0, all, mul #16 +** ret +*/ +PROTO (cntd_128, uint64_t, ()) { return svcntsd () * 128; } + +/* +** cntd_m1: +** cntd (x[0-9]+) +** neg x0, \1 +** ret +*/ +PROTO (cntd_m1, uint64_t, ()) { return -svcntsd (); } + +/* +** cntd_m13: +** cntd (x[0-9]+), all, mul #13 +** neg x0, \1 +** ret +*/ +PROTO (cntd_m13, uint64_t, ()) { return -svcntsd () * 13; } + +/* +** cntd_m15: +** cntd (x[0-9]+), all, mul #15 +** neg x0, \1 +** ret +*/ +PROTO (cntd_m15, uint64_t, ()) { return -svcntsd () * 15; } + +/* +** cntd_m16: +** rdvl x0, #-2 +** ret +*/ +PROTO (cntd_m16, uint64_t, ()) { return -svcntsd () * 16; } + +/* Other sequences would be OK. 
*/ +/* +** cntd_m17: +** rdvl (x[0-9]+), #-17 +** asr x0, \1, 3 +** ret +*/ +PROTO (cntd_m17, uint64_t, ()) { return -svcntsd () * 17; } + +/* +** incd_1: +** incd x0 +** ret +*/ +PROTO (incd_1, uint64_t, (uint64_t x0)) { return x0 + svcntsd (); } + +/* +** incd_2: +** incw x0 +** ret +*/ +PROTO (incd_2, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 2; } + +/* +** incd_3: +** incd x0, all, mul #3 +** ret +*/ +PROTO (incd_3, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 3; } + +/* +** incd_4: +** inch x0 +** ret +*/ +PROTO (incd_4, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 4; } + +/* +** incd_7: +** incd x0, all, mul #7 +** ret +*/ +PROTO (incd_7, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 7; } + +/* +** incd_8: +** incb x0 +** ret +*/ +PROTO (incd_8, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 8; } + +/* +** incd_9: +** incd x0, all, mul #9 +** ret +*/ +PROTO (incd_9, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 9; } + +/* +** incd_15: +** incd x0, all, mul #15 +** ret +*/ +PROTO (incd_15, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 15; } + +/* +** incd_16: +** incb x0, all, mul #2 +** ret +*/ +PROTO (incd_16, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 16; } + +/* +** incd_18: +** incw x0, all, mul #9 +** ret +*/ +PROTO (incd_18, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 18; } + +/* +** incd_30: +** incw x0, all, mul #15 +** ret +*/ +PROTO (incd_30, uint64_t, (uint64_t x0)) { return x0 + svcntsd () * 30; } + +/* +** decd_1: +** decd x0 +** ret +*/ +PROTO (decd_1, uint64_t, (uint64_t x0)) { return x0 - svcntsd (); } + +/* +** decd_2: +** decw x0 +** ret +*/ +PROTO (decd_2, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 2; } + +/* +** decd_3: +** decd x0, all, mul #3 +** ret +*/ +PROTO (decd_3, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 3; } + +/* +** decd_4: +** dech x0 +** ret +*/ +PROTO (decd_4, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 4; } + +/* +** decd_7: +** decd x0, all, mul #7 +** ret +*/ +PROTO (decd_7, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 7; } + +/* +** decd_8: +** decb x0 +** ret +*/ +PROTO (decd_8, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 8; } + +/* +** decd_9: +** decd x0, all, mul #9 +** ret +*/ +PROTO (decd_9, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 9; } + +/* +** decd_15: +** decd x0, all, mul #15 +** ret +*/ +PROTO (decd_15, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 15; } + +/* +** decd_16: +** decb x0, all, mul #2 +** ret +*/ +PROTO (decd_16, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 16; } + +/* +** decd_18: +** decw x0, all, mul #9 +** ret +*/ +PROTO (decd_18, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 18; } + +/* +** decd_30: +** decw x0, all, mul #15 +** ret +*/ +PROTO (decd_30, uint64_t, (uint64_t x0)) { return x0 - svcntsd () * 30; } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_sc.c new file mode 100644 index 00000000000..90fb374bac9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsd_sc.c @@ -0,0 +1,13 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntsd: +** rdsvl (x[0-9])+, #1 +** lsr x0, \1, #?3 +** ret +*/ +PROTO (cntsd, uint64_t, ()) { return svcntsd (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_s.c new file 
mode 100644 index 00000000000..021c39a1467 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_s.c @@ -0,0 +1,279 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cnth_1: +** cnth x0 +** ret +*/ +PROTO (cnth_1, uint64_t, ()) { return svcntsh (); } + +/* +** cnth_2: +** cntb x0 +** ret +*/ +PROTO (cnth_2, uint64_t, ()) { return svcntsh () * 2; } + +/* +** cnth_3: +** cnth x0, all, mul #3 +** ret +*/ +PROTO (cnth_3, uint64_t, ()) { return svcntsh () * 3; } + +/* +** cnth_4: +** cntb x0, all, mul #2 +** ret +*/ +PROTO (cnth_4, uint64_t, ()) { return svcntsh () * 4; } + +/* +** cnth_8: +** cntb x0, all, mul #4 +** ret +*/ +PROTO (cnth_8, uint64_t, ()) { return svcntsh () * 8; } + +/* +** cnth_15: +** cnth x0, all, mul #15 +** ret +*/ +PROTO (cnth_15, uint64_t, ()) { return svcntsh () * 15; } + +/* +** cnth_16: +** cntb x0, all, mul #8 +** ret +*/ +PROTO (cnth_16, uint64_t, ()) { return svcntsh () * 16; } + +/* Other sequences would be OK. */ +/* +** cnth_17: +** rdvl (x[0-9]+), #17 +** asr x0, \1, 1 +** ret +*/ +PROTO (cnth_17, uint64_t, ()) { return svcntsh () * 17; } + +/* +** cnth_32: +** cntb x0, all, mul #16 +** ret +*/ +PROTO (cnth_32, uint64_t, ()) { return svcntsh () * 32; } + +/* +** cnth_64: +** cntb (x[0-9]+) +** lsl x0, \1, 5 +** ret +*/ +PROTO (cnth_64, uint64_t, ()) { return svcntsh () * 64; } + +/* +** cnth_128: +** cntb (x[0-9]+) +** lsl x0, \1, 6 +** ret +*/ +PROTO (cnth_128, uint64_t, ()) { return svcntsh () * 128; } + +/* +** cnth_m1: +** cnth (x[0-9]+) +** neg x0, \1 +** ret +*/ +PROTO (cnth_m1, uint64_t, ()) { return -svcntsh (); } + +/* +** cnth_m13: +** cnth (x[0-9]+), all, mul #13 +** neg x0, \1 +** ret +*/ +PROTO (cnth_m13, uint64_t, ()) { return -svcntsh () * 13; } + +/* +** cnth_m15: +** cnth (x[0-9]+), all, mul #15 +** neg x0, \1 +** ret +*/ +PROTO (cnth_m15, uint64_t, ()) { return -svcntsh () * 15; } + +/* +** cnth_m16: +** rdvl x0, #-8 +** ret +*/ +PROTO (cnth_m16, uint64_t, ()) { return -svcntsh () * 16; } + +/* Other sequences would be OK. 
*/ +/* +** cnth_m17: +** rdvl (x[0-9]+), #-17 +** asr x0, \1, 1 +** ret +*/ +PROTO (cnth_m17, uint64_t, ()) { return -svcntsh () * 17; } + +/* +** inch_1: +** inch x0 +** ret +*/ +PROTO (inch_1, uint64_t, (uint64_t x0)) { return x0 + svcntsh (); } + +/* +** inch_2: +** incb x0 +** ret +*/ +PROTO (inch_2, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 2; } + +/* +** inch_3: +** inch x0, all, mul #3 +** ret +*/ +PROTO (inch_3, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 3; } + +/* +** inch_4: +** incb x0, all, mul #2 +** ret +*/ +PROTO (inch_4, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 4; } + +/* +** inch_7: +** inch x0, all, mul #7 +** ret +*/ +PROTO (inch_7, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 7; } + +/* +** inch_8: +** incb x0, all, mul #4 +** ret +*/ +PROTO (inch_8, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 8; } + +/* +** inch_9: +** inch x0, all, mul #9 +** ret +*/ +PROTO (inch_9, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 9; } + +/* +** inch_15: +** inch x0, all, mul #15 +** ret +*/ +PROTO (inch_15, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 15; } + +/* +** inch_16: +** incb x0, all, mul #8 +** ret +*/ +PROTO (inch_16, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 16; } + +/* +** inch_18: +** incb x0, all, mul #9 +** ret +*/ +PROTO (inch_18, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 18; } + +/* +** inch_30: +** incb x0, all, mul #15 +** ret +*/ +PROTO (inch_30, uint64_t, (uint64_t x0)) { return x0 + svcntsh () * 30; } + +/* +** dech_1: +** dech x0 +** ret +*/ +PROTO (dech_1, uint64_t, (uint64_t x0)) { return x0 - svcntsh (); } + +/* +** dech_2: +** decb x0 +** ret +*/ +PROTO (dech_2, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 2; } + +/* +** dech_3: +** dech x0, all, mul #3 +** ret +*/ +PROTO (dech_3, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 3; } + +/* +** dech_4: +** decb x0, all, mul #2 +** ret +*/ +PROTO (dech_4, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 4; } + +/* +** dech_7: +** dech x0, all, mul #7 +** ret +*/ +PROTO (dech_7, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 7; } + +/* +** dech_8: +** decb x0, all, mul #4 +** ret +*/ +PROTO (dech_8, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 8; } + +/* +** dech_9: +** dech x0, all, mul #9 +** ret +*/ +PROTO (dech_9, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 9; } + +/* +** dech_15: +** dech x0, all, mul #15 +** ret +*/ +PROTO (dech_15, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 15; } + +/* +** dech_16: +** decb x0, all, mul #8 +** ret +*/ +PROTO (dech_16, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 16; } + +/* +** dech_18: +** decb x0, all, mul #9 +** ret +*/ +PROTO (dech_18, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 18; } + +/* +** dech_30: +** decb x0, all, mul #15 +** ret +*/ +PROTO (dech_30, uint64_t, (uint64_t x0)) { return x0 - svcntsh () * 30; } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_sc.c new file mode 100644 index 00000000000..9f6c85208a6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsh_sc.c @@ -0,0 +1,13 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntsh: +** rdsvl (x[0-9])+, #1 +** lsr x0, \1, #?1 +** ret +*/ +PROTO (cntsh, uint64_t, ()) { return svcntsh (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_s.c 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_s.c new file mode 100644 index 00000000000..c421e1b8e1f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_s.c @@ -0,0 +1,278 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntw_1: +** cntw x0 +** ret +*/ +PROTO (cntw_1, uint64_t, ()) { return svcntsw (); } + +/* +** cntw_2: +** cnth x0 +** ret +*/ +PROTO (cntw_2, uint64_t, ()) { return svcntsw () * 2; } + +/* +** cntw_3: +** cntw x0, all, mul #3 +** ret +*/ +PROTO (cntw_3, uint64_t, ()) { return svcntsw () * 3; } + +/* +** cntw_4: +** cntb x0 +** ret +*/ +PROTO (cntw_4, uint64_t, ()) { return svcntsw () * 4; } + +/* +** cntw_8: +** cntb x0, all, mul #2 +** ret +*/ +PROTO (cntw_8, uint64_t, ()) { return svcntsw () * 8; } + +/* +** cntw_15: +** cntw x0, all, mul #15 +** ret +*/ +PROTO (cntw_15, uint64_t, ()) { return svcntsw () * 15; } + +/* +** cntw_16: +** cntb x0, all, mul #4 +** ret +*/ +PROTO (cntw_16, uint64_t, ()) { return svcntsw () * 16; } + +/* Other sequences would be OK. */ +/* +** cntw_17: +** rdvl (x[0-9]+), #17 +** asr x0, \1, 2 +** ret +*/ +PROTO (cntw_17, uint64_t, ()) { return svcntsw () * 17; } + +/* +** cntw_32: +** cntb x0, all, mul #8 +** ret +*/ +PROTO (cntw_32, uint64_t, ()) { return svcntsw () * 32; } + +/* +** cntw_64: +** cntb x0, all, mul #16 +** ret +*/ +PROTO (cntw_64, uint64_t, ()) { return svcntsw () * 64; } + +/* +** cntw_128: +** cntb (x[0-9]+) +** lsl x0, \1, 5 +** ret +*/ +PROTO (cntw_128, uint64_t, ()) { return svcntsw () * 128; } + +/* +** cntw_m1: +** cntw (x[0-9]+) +** neg x0, \1 +** ret +*/ +PROTO (cntw_m1, uint64_t, ()) { return -svcntsw (); } + +/* +** cntw_m13: +** cntw (x[0-9]+), all, mul #13 +** neg x0, \1 +** ret +*/ +PROTO (cntw_m13, uint64_t, ()) { return -svcntsw () * 13; } + +/* +** cntw_m15: +** cntw (x[0-9]+), all, mul #15 +** neg x0, \1 +** ret +*/ +PROTO (cntw_m15, uint64_t, ()) { return -svcntsw () * 15; } + +/* +** cntw_m16: +** rdvl (x[0-9]+), #-4 +** ret +*/ +PROTO (cntw_m16, uint64_t, ()) { return -svcntsw () * 16; } + +/* Other sequences would be OK. 
*/ +/* +** cntw_m17: +** rdvl (x[0-9]+), #-17 +** asr x0, \1, 2 +** ret +*/ +PROTO (cntw_m17, uint64_t, ()) { return -svcntsw () * 17; } + +/* +** incw_1: +** incw x0 +** ret +*/ +PROTO (incw_1, uint64_t, (uint64_t x0)) { return x0 + svcntsw (); } + +/* +** incw_2: +** inch x0 +** ret +*/ +PROTO (incw_2, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 2; } + +/* +** incw_3: +** incw x0, all, mul #3 +** ret +*/ +PROTO (incw_3, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 3; } + +/* +** incw_4: +** incb x0 +** ret +*/ +PROTO (incw_4, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 4; } + +/* +** incw_7: +** incw x0, all, mul #7 +** ret +*/ +PROTO (incw_7, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 7; } + +/* +** incw_8: +** incb x0, all, mul #2 +** ret +*/ +PROTO (incw_8, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 8; } + +/* +** incw_9: +** incw x0, all, mul #9 +** ret +*/ +PROTO (incw_9, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 9; } + +/* +** incw_15: +** incw x0, all, mul #15 +** ret +*/ +PROTO (incw_15, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 15; } + +/* +** incw_16: +** incb x0, all, mul #4 +** ret +*/ +PROTO (incw_16, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 16; } + +/* +** incw_18: +** inch x0, all, mul #9 +** ret +*/ +PROTO (incw_18, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 18; } + +/* +** incw_30: +** inch x0, all, mul #15 +** ret +*/ +PROTO (incw_30, uint64_t, (uint64_t x0)) { return x0 + svcntsw () * 30; } + +/* +** decw_1: +** decw x0 +** ret +*/ +PROTO (decw_1, uint64_t, (uint64_t x0)) { return x0 - svcntsw (); } + +/* +** decw_2: +** dech x0 +** ret +*/ +PROTO (decw_2, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 2; } + +/* +** decw_3: +** decw x0, all, mul #3 +** ret +*/ +PROTO (decw_3, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 3; } + +/* +** decw_4: +** decb x0 +** ret +*/ +PROTO (decw_4, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 4; } + +/* +** decw_7: +** decw x0, all, mul #7 +** ret +*/ +PROTO (decw_7, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 7; } + +/* +** decw_8: +** decb x0, all, mul #2 +** ret +*/ +PROTO (decw_8, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 8; } + +/* +** decw_9: +** decw x0, all, mul #9 +** ret +*/ +PROTO (decw_9, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 9; } + +/* +** decw_15: +** decw x0, all, mul #15 +** ret +*/ +PROTO (decw_15, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 15; } + +/* +** decw_16: +** decb x0, all, mul #4 +** ret +*/ +PROTO (decw_16, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 16; } + +/* +** decw_18: +** dech x0, all, mul #9 +** ret +*/ +PROTO (decw_18, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 18; } + +/* +** decw_30: +** dech x0, all, mul #15 +** ret +*/ +PROTO (decw_30, uint64_t, (uint64_t x0)) { return x0 - svcntsw () * 30; } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_sc.c new file mode 100644 index 00000000000..75ca937c48f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/cntsw_sc.c @@ -0,0 +1,13 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#define NO_SHARED_ZA +#include "test_sme_acle.h" + +/* +** cntsw: +** rdsvl (x[0-9])+, #1 +** lsr x0, \1, #?2 +** ret +*/ +PROTO (cntsw, uint64_t, ()) { return svcntsw (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za128.c 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za128.c new file mode 100644 index 00000000000..897b5522d8d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za128.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_vnum_za128_0_0: +** mov (w1[2-5]), w0 +** ld1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za128_0_0, + svld1_hor_vnum_za128 (0, w0, p0, x1, 0), + svld1_hor_vnum_za128 (0, w0, p0, x1, 0)) + +/* +** ld1_vnum_za128_5_0: +** incb x1, all, mul #13 +** mov (w1[2-5]), w0 +** ld1q { za5h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za128_5_0, + svld1_hor_vnum_za128 (5, w0, p0, x1, 13), + svld1_hor_vnum_za128 (5, w0, p0, x1, 13)) + +/* +** ld1_vnum_za128_11_0: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** ld1q { za11h\.q\[\3, 0\] }, p0/z, \[\2\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za128_11_0, + svld1_hor_vnum_za128 (11, w0, p0, x1, x2), + svld1_hor_vnum_za128 (11, w0, p0, x1, x2)) + +/* +** ld1_vnum_za128_0_1: +** add (w1[2-5]), w0, #?1 +** ld1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za128_0_1, + svld1_hor_vnum_za128 (0, w0 + 1, p0, x1, 0), + svld1_hor_vnum_za128 (0, w0 + 1, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za16.c new file mode 100644 index 00000000000..4cf4417b9b0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za16.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_vnum_za16_0_0: +** mov (w1[2-5]), w0 +** ld1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za16_0_0, + svld1_hor_vnum_za16 (0, w0, p0, x1, 0), + svld1_hor_vnum_za16 (0, w0, p0, x1, 0)) + +/* +** ld1_vnum_za16_0_1: +** incb x1, all, mul #9 +** mov (w1[2-5]), w0 +** ld1h { za0h\.h\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za16_0_1, + svld1_hor_vnum_za16 (0, w0 + 1, p0, x1, 9), + svld1_hor_vnum_za16 (0, w0 + 1, p0, x1, 9)) + +/* +** ld1_vnum_za16_1_7: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** ld1h { za1h\.h\[\3, 7\] }, p0/z, \[\2\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za16_1_7, + svld1_hor_vnum_za16 (1, w0 + 7, p0, x1, x2), + svld1_hor_vnum_za16 (1, w0 + 7, p0, x1, x2)) + +/* +** ld1_vnum_za16_0_8: +** add (w1[2-5]), w0, #?8 +** ld1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za16_0_8, + svld1_hor_vnum_za16 (0, w0 + 8, p0, x1, 0), + svld1_hor_vnum_za16 (0, w0 + 8, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za32.c new file mode 100644 index 00000000000..9dc0d0b0309 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za32.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_vnum_za32_0_0: +** mov (w1[2-5]), w0 +** ld1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za32_0_0, + svld1_hor_vnum_za32 (0, w0, p0, x1, 0), + svld1_hor_vnum_za32 (0, w0, p0, x1, 0)) + +/* +** ld1_vnum_za32_0_1: +** incb x1, all, mul #5 +** mov (w1[2-5]), w0 +** ld1w { za0h\.s\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA 
(ld1_vnum_za32_0_1, + svld1_hor_vnum_za32 (0, w0 + 1, p0, x1, 5), + svld1_hor_vnum_za32 (0, w0 + 1, p0, x1, 5)) + +/* +** ld1_vnum_za32_2_3: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** ld1w { za2h\.s\[\3, 3\] }, p0/z, \[\2\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za32_2_3, + svld1_hor_vnum_za32 (2, w0 + 3, p0, x1, x2), + svld1_hor_vnum_za32 (2, w0 + 3, p0, x1, x2)) + +/* +** ld1_vnum_za32_0_4: +** add (w1[2-5]), w0, #?4 +** ld1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za32_0_4, + svld1_hor_vnum_za32 (0, w0 + 4, p0, x1, 0), + svld1_hor_vnum_za32 (0, w0 + 4, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za64.c new file mode 100644 index 00000000000..ad3258718a6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za64.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_vnum_za64_0_0: +** mov (w1[2-5]), w0 +** ld1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za64_0_0, + svld1_hor_vnum_za64 (0, w0, p0, x1, 0), + svld1_hor_vnum_za64 (0, w0, p0, x1, 0)) + +/* +** ld1_vnum_za64_0_1: +** incb x1, all, mul #13 +** mov (w1[2-5]), w0 +** ld1d { za0h\.d\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za64_0_1, + svld1_hor_vnum_za64 (0, w0 + 1, p0, x1, 13), + svld1_hor_vnum_za64 (0, w0 + 1, p0, x1, 13)) + +/* +** ld1_vnum_za64_5_1: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** ld1d { za5h\.d\[\3, 1\] }, p0/z, \[\2\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za64_5_1, + svld1_hor_vnum_za64 (5, w0 + 1, p0, x1, x2), + svld1_hor_vnum_za64 (5, w0 + 1, p0, x1, x2)) + +/* +** ld1_vnum_za64_0_2: +** add (w1[2-5]), w0, #?2 +** ld1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za64_0_2, + svld1_hor_vnum_za64 (0, w0 + 2, p0, x1, 0), + svld1_hor_vnum_za64 (0, w0 + 2, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za8.c new file mode 100644 index 00000000000..68b43dc32a7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_vnum_za8.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_vnum_za8_0_0: +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za8_0_0, + svld1_hor_vnum_za8 (0, w0, p0, x1, 0), + svld1_hor_vnum_za8 (0, w0, p0, x1, 0)) + +/* +** ld1_vnum_za8_0_1: +** incb x1, all, mul #11 +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za8_0_1, + svld1_hor_vnum_za8 (0, w0 + 1, p0, x1, 11), + svld1_hor_vnum_za8 (0, w0 + 1, p0, x1, 11)) + +/* +** ld1_vnum_za8_0_15: +** cntb (x[0-9]+) +** mul (x[0-9]+), (?:\1, x2|x2, \1) +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\3, 15\] }, p0/z, \[x1, \2\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za8_0_15, + svld1_hor_vnum_za8 (0, w0 + 15, p0, x1, x2), + svld1_hor_vnum_za8 (0, w0 + 15, p0, x1, x2)) + +/* +** ld1_vnum_za8_0_16: +** add (w1[2-5]), w0, #?16 +** ld1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_vnum_za8_0_16, + svld1_hor_vnum_za8 (0, w0 + 16, p0, x1, 0), + svld1_hor_vnum_za8 (0, w0 + 16, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za128.c 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za128.c new file mode 100644 index 00000000000..554028a7e3e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za128.c @@ -0,0 +1,63 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_za128_0_0: +** mov (w1[2-5]), w0 +** ld1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za128_0_0, + svld1_hor_za128 (0, w0, p0, x1), + svld1_hor_za128 (0, w0, p0, x1)) + +/* +** ld1_za128_0_1: +** add (w1[2-5]), w0, #?1 +** ld1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za128_0_1, + svld1_hor_za128 (0, w0 + 1, p0, x1), + svld1_hor_za128 (0, w0 + 1, p0, x1)) + +/* +** ld1_za128_7_0: +** mov (w1[2-5]), w0 +** ld1q { za7h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za128_7_0, + svld1_hor_za128 (7, w0, p0, x1), + svld1_hor_za128 (7, w0, p0, x1)) + +/* +** ld1_za128_13_0: +** mov (w1[2-5]), w0 +** ld1q { za13h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za128_13_0, + svld1_hor_za128 (13, w0, p0, x1), + svld1_hor_za128 (13, w0, p0, x1)) + +/* +** ld1_za128_15_0: +** mov (w1[2-5]), w0 +** ld1q { za15h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za128_15_0, + svld1_hor_za128 (15, w0, p0, x1), + svld1_hor_za128 (15, w0, p0, x1)) + +/* +** ld1_za128_9_0_index: +** mov (w1[2-5]), w0 +** ld1q { za9h\.q\[\1, 0\] }, p0/z, \[x1, x2, lsl #?4\] +** ret +*/ +TEST_LOAD_ZA (ld1_za128_9_0_index, + svld1_hor_za128 (9, w0, p0, x1 + x2 * 16), + svld1_hor_za128 (9, w0, p0, x1 + x2 * 16)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za16.c new file mode 100644 index 00000000000..4f807e6aa7a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za16.c @@ -0,0 +1,94 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_za16_0_0: +** mov (w1[2-5]), w0 +** ld1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_0_0, + svld1_hor_za16 (0, w0, p0, x1), + svld1_hor_za16 (0, w0, p0, x1)) + +/* +** ld1_za16_0_1: +** mov (w1[2-5]), w0 +** ld1h { za0h\.h\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_0_1, + svld1_hor_za16 (0, w0 + 1, p0, x1), + svld1_hor_za16 (0, w0 + 1, p0, x1)) + +/* +** ld1_za16_0_7: +** mov (w1[2-5]), w0 +** ld1h { za0h\.h\[\1, 7\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_0_7, + svld1_hor_za16 (0, w0 + 7, p0, x1), + svld1_hor_za16 (0, w0 + 7, p0, x1)) + +/* +** ld1_za16_1_0: +** mov (w1[2-5]), w0 +** ld1h { za1h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_1_0, + svld1_hor_za16 (1, w0, p0, x1), + svld1_hor_za16 (1, w0, p0, x1)) + + +/* +** ld1_za16_1_1: +** mov (w1[2-5]), w0 +** ld1h { za1h\.h\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_1_1, + svld1_hor_za16 (1, w0 + 1, p0, x1), + svld1_hor_za16 (1, w0 + 1, p0, x1)) + +/* +** ld1_za16_1_7: +** mov (w1[2-5]), w0 +** ld1h { za1h\.h\[\1, 7\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_1_7, + svld1_hor_za16 (1, w0 + 7, p0, x1), + svld1_hor_za16 (1, w0 + 7, p0, x1)) + +/* +** ld1_za16_1_5_index: +** mov (w1[2-5]), w0 +** ld1h { za1h\.h\[\1, 5\] }, p0/z, \[x1, x2, lsl #?1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_1_5_index, + svld1_hor_za16 (1, w0 + 5, p0, x1 + x2 * 2), + svld1_hor_za16 (1, w0 + 5, p0, x1 + x2 * 2)) + +/* +** ld1_za16_0_8: +** add (w1[2-5]), w0, #?8 +** ld1h { za0h\.h\[\1, 0\] }, p0/z, 
\[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_0_8, + svld1_hor_za16 (0, w0 + 8, p0, x1), + svld1_hor_za16 (0, w0 + 8, p0, x1)) + +/* +** ld1_za16_0_m1: +** sub (w1[2-5]), w0, #?1 +** ld1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za16_0_m1, + svld1_hor_za16 (0, w0 - 1, p0, x1), + svld1_hor_za16 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za32.c new file mode 100644 index 00000000000..253783047a0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za32.c @@ -0,0 +1,93 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_za32_0_0: +** mov (w1[2-5]), w0 +** ld1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_0_0, + svld1_hor_za32 (0, w0, p0, x1), + svld1_hor_za32 (0, w0, p0, x1)) + +/* +** ld1_za32_0_1: +** mov (w1[2-5]), w0 +** ld1w { za0h\.s\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_0_1, + svld1_hor_za32 (0, w0 + 1, p0, x1), + svld1_hor_za32 (0, w0 + 1, p0, x1)) + +/* +** ld1_za32_0_3: +** mov (w1[2-5]), w0 +** ld1w { za0h\.s\[\1, 3\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_0_3, + svld1_hor_za32 (0, w0 + 3, p0, x1), + svld1_hor_za32 (0, w0 + 3, p0, x1)) + +/* +** ld1_za32_3_0: +** mov (w1[2-5]), w0 +** ld1w { za3h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_3_0, + svld1_hor_za32 (3, w0, p0, x1), + svld1_hor_za32 (3, w0, p0, x1)) + +/* +** ld1_za32_3_1: +** mov (w1[2-5]), w0 +** ld1w { za3h\.s\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_3_1, + svld1_hor_za32 (3, w0 + 1, p0, x1), + svld1_hor_za32 (3, w0 + 1, p0, x1)) + +/* +** ld1_za32_3_3: +** mov (w1[2-5]), w0 +** ld1w { za3h\.s\[\1, 3\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_3_3, + svld1_hor_za32 (3, w0 + 3, p0, x1), + svld1_hor_za32 (3, w0 + 3, p0, x1)) + +/* +** ld1_za32_1_2_index: +** mov (w1[2-5]), w0 +** ld1w { za1h\.s\[\1, 2\] }, p0/z, \[x1, x2, lsl #?2\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_1_2_index, + svld1_hor_za32 (1, w0 + 2, p0, x1 + x2 * 4), + svld1_hor_za32 (1, w0 + 2, p0, x1 + x2 * 4)) + +/* +** ld1_za32_0_4: +** add (w1[2-5]), w0, #?4 +** ld1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_0_4, + svld1_hor_za32 (0, w0 + 4, p0, x1), + svld1_hor_za32 (0, w0 + 4, p0, x1)) + +/* +** ld1_za32_0_m1: +** sub (w1[2-5]), w0, #?1 +** ld1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za32_0_m1, + svld1_hor_za32 (0, w0 - 1, p0, x1), + svld1_hor_za32 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za64.c new file mode 100644 index 00000000000..b90b49dd054 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za64.c @@ -0,0 +1,73 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_za64_0_0: +** mov (w1[2-5]), w0 +** ld1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_0_0, + svld1_hor_za64 (0, w0, p0, x1), + svld1_hor_za64 (0, w0, p0, x1)) + +/* +** ld1_za64_0_1: +** mov (w1[2-5]), w0 +** ld1d { za0h\.d\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_0_1, + svld1_hor_za64 (0, w0 + 1, p0, x1), + svld1_hor_za64 (0, w0 + 1, p0, x1)) + +/* +** ld1_za64_7_0: +** mov (w1[2-5]), w0 +** ld1d { za7h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_7_0, + 
svld1_hor_za64 (7, w0, p0, x1), + svld1_hor_za64 (7, w0, p0, x1)) + +/* +** ld1_za64_7_1: +** mov (w1[2-5]), w0 +** ld1d { za7h\.d\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_7_1, + svld1_hor_za64 (7, w0 + 1, p0, x1), + svld1_hor_za64 (7, w0 + 1, p0, x1)) + +/* +** ld1_za64_5_1_index: +** mov (w1[2-5]), w0 +** ld1d { za5h\.d\[\1, 1\] }, p0/z, \[x1, x2, lsl #?3\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_5_1_index, + svld1_hor_za64 (5, w0 + 1, p0, x1 + x2 * 8), + svld1_hor_za64 (5, w0 + 1, p0, x1 + x2 * 8)) + +/* +** ld1_za64_0_2: +** add (w1[2-5]), w0, #?2 +** ld1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_0_2, + svld1_hor_za64 (0, w0 + 2, p0, x1), + svld1_hor_za64 (0, w0 + 2, p0, x1)) + +/* +** ld1_za64_0_m1: +** sub (w1[2-5]), w0, #?1 +** ld1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za64_0_m1, + svld1_hor_za64 (0, w0 - 1, p0, x1), + svld1_hor_za64 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za8.c new file mode 100644 index 00000000000..937e6376cf1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_hor_za8.c @@ -0,0 +1,63 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ld1_za8_0_0: +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za8_0_0, + svld1_hor_za8 (0, w0, p0, x1), + svld1_hor_za8 (0, w0, p0, x1)) + +/* +** ld1_za8_0_1: +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za8_0_1, + svld1_hor_za8 (0, w0 + 1, p0, x1), + svld1_hor_za8 (0, w0 + 1, p0, x1)) + +/* +** ld1_za8_0_15: +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\1, 15\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za8_0_15, + svld1_hor_za8 (0, w0 + 15, p0, x1), + svld1_hor_za8 (0, w0 + 15, p0, x1)) + +/* +** ld1_za8_0_13_index: +** mov (w1[2-5]), w0 +** ld1b { za0h\.b\[\1, 15\] }, p0/z, \[x1, x2\] +** ret +*/ +TEST_LOAD_ZA (ld1_za8_0_13_index, + svld1_hor_za8 (0, w0 + 15, p0, x1 + x2), + svld1_hor_za8 (0, w0 + 15, p0, x1 + x2)) + +/* +** ld1_za8_0_16: +** add (w1[2-5]), w0, #?16 +** ld1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za8_0_16, + svld1_hor_za8 (0, w0 + 16, p0, x1), + svld1_hor_za8 (0, w0 + 16, p0, x1)) + +/* +** ld1_za8_0_m1: +** sub (w1[2-5]), w0, #?1 +** ld1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_LOAD_ZA (ld1_za8_0_m1, + svld1_hor_za8 (0, w0 - 1, p0, x1), + svld1_hor_za8 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za128.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za16.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za32.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za64.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_vnum_za8.c new file mode 
100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za128.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za16.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za32.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za64.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ld1_ver_za8.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_s.c new file mode 100644 index 00000000000..592cfc3c145 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_s.c @@ -0,0 +1,121 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ldr_vnum_za_0: +** mov (w1[2-5]), w0 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_0, + svldr_vnum_za (w0, x1, 0), + svldr_vnum_za (w0, x1, 0)) + +/* +** ldr_vnum_za_1: +** mov (w1[2-5]), w0 +** ldr za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_1, + svldr_vnum_za (w0 + 1, x1, 1), + svldr_vnum_za (w0 + 1, x1, 1)) + +/* +** ldr_vnum_za_13: +** mov (w1[2-5]), w0 +** ldr za\[\1, 13\], \[x1, #13, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_13, + svldr_vnum_za (w0 + 13, x1, 13), + svldr_vnum_za (w0 + 13, x1, 13)) + +/* +** ldr_vnum_za_15: +** mov (w1[2-5]), w0 +** ldr za\[\1, 15\], \[x1, #15, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_15, + svldr_vnum_za (w0 + 15, x1, 15), + svldr_vnum_za (w0 + 15, x1, 15)) + +/* +** ldr_vnum_za_16: +** ( +** add (w1[2-5]), w0, #?16 +** incb x1, all, mul #16 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1, all, mul #16 +** add (w1[2-5]), w0, #?16 +** ldr za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_16, + svldr_vnum_za (w0 + 16, x1, 16), + svldr_vnum_za (w0 + 16, x1, 16)) + +/* +** ldr_vnum_za_m1: +** ( +** sub (w1[2-5]), w0, #?1 +** decb x1 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** decb x1 +** sub (w1[2-5]), w0, #?1 +** ldr za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_m1, + svldr_vnum_za (w0 - 1, x1, -1), + svldr_vnum_za (w0 - 1, x1, -1)) + +/* +** ldr_vnum_za_mixed_1: +** add (w1[2-5]), w0, #?1 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_1, + svldr_vnum_za (w0 + 1, x1, 0), + svldr_vnum_za (w0 + 1, x1, 0)) + +/* +** ldr_vnum_za_mixed_2: +** ( +** mov (w1[2-5]), w0 +** incb x1 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1 +** mov (w1[2-5]), w0 +** ldr za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_2, + svldr_vnum_za (w0, x1, 1), + svldr_vnum_za (w0, x1, 1)) + +/* +** ldr_vnum_za_mixed_3: +** ( +** add (w1[2-5]), w0, #?2 +** incb x1 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1 +** add (w1[2-5]), w0, #?2 +** ldr za\[\2, 
0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_3, + svldr_vnum_za (w0 + 2, x1, 1), + svldr_vnum_za (w0 + 2, x1, 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_sc.c new file mode 100644 index 00000000000..303cf5b03eb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_vnum_za_sc.c @@ -0,0 +1,166 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** ldr_vnum_za_0: +** mov (w1[2-5]), w0 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_0, + svldr_vnum_za (w0, x1, 0), + svldr_vnum_za (w0, x1, 0)) + +/* +** ldr_vnum_za_1: +** mov (w1[2-5]), w0 +** ldr za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_1, + svldr_vnum_za (w0 + 1, x1, 1), + svldr_vnum_za (w0 + 1, x1, 1)) + +/* +** ldr_vnum_za_13: +** mov (w1[2-5]), w0 +** ldr za\[\1, 13\], \[x1, #13, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_13, + svldr_vnum_za (w0 + 13, x1, 13), + svldr_vnum_za (w0 + 13, x1, 13)) + +/* +** ldr_vnum_za_15: +** mov (w1[2-5]), w0 +** ldr za\[\1, 15\], \[x1, #15, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_15, + svldr_vnum_za (w0 + 15, x1, 15), + svldr_vnum_za (w0 + 15, x1, 15)) + +/* +** ldr_vnum_za_16: +** ( +** add (w1[2-5]), w0, #?16 +** addsvl (x[0-9]+), x1, #16 +** ldr za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #16 +** add (w1[2-5]), w0, #?16 +** ldr za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_16, + svldr_vnum_za (w0 + 16, x1, 16), + svldr_vnum_za (w0 + 16, x1, 16)) + +/* +** ldr_vnum_za_m1: +** ( +** sub (w1[2-5]), w0, #?1 +** addsvl (x[0-9]+), x1, #-1 +** ldr za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #-1 +** sub (w1[2-5]), w0, #?1 +** ldr za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_m1, + svldr_vnum_za (w0 - 1, x1, -1), + svldr_vnum_za (w0 - 1, x1, -1)) + +/* +** ldr_vnum_za_mixed_1: +** add (w1[2-5]), w0, #?1 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_1, + svldr_vnum_za (w0 + 1, x1, 0), + svldr_vnum_za (w0 + 1, x1, 0)) + +/* +** ldr_vnum_za_mixed_2: +** ( +** mov (w1[2-5]), w0 +** addsvl (x[0-9]+), x1, #1 +** ldr za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #1 +** mov (w1[2-5]), w0 +** ldr za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_2, + svldr_vnum_za (w0, x1, 1), + svldr_vnum_za (w0, x1, 1)) + +/* +** ldr_vnum_za_mixed_3: +** ( +** add (w1[2-5]), w0, #?2 +** addsvl (x[0-9]+), x1, #1 +** ldr za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #1 +** add (w1[2-5]), w0, #?2 +** ldr za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_3, + svldr_vnum_za (w0 + 2, x1, 1), + svldr_vnum_za (w0 + 2, x1, 1)) + +/* +** ldr_vnum_za_mixed_4: +** ... +** addsvl x[0-9]+, x1, #-32 +** ... +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_4, + svldr_vnum_za (w0 + 3, x1, -32), + svldr_vnum_za (w0 + 3, x1, -32)) + +/* +** ldr_vnum_za_mixed_5: +** ... +** rdsvl x[0-9]+, #1 +** ... +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_5, + svldr_vnum_za (w0 + 3, x1, -33), + svldr_vnum_za (w0 + 3, x1, -33)) + +/* +** ldr_vnum_za_mixed_6: +** ... +** addsvl x[0-9]+, x1, #31 +** ... 
+** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_6, + svldr_vnum_za (w0 + 4, x1, 31), + svldr_vnum_za (w0 + 4, x1, 31)) + +/* +** ldr_vnum_za_mixed_7: +** ... +** rdsvl x[0-9]+, #1 +** ... +** ret +*/ +TEST_LOAD_ZA (ldr_vnum_za_mixed_7, + svldr_vnum_za (w0 + 3, x1, 32), + svldr_vnum_za (w0 + 3, x1, 32)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_s.c new file mode 100644 index 00000000000..72c335c4d83 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_s.c @@ -0,0 +1,104 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** ldr_za_0: +** mov (w1[2-5]), w0 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_0, + svldr_za (w0, x1), + svldr_za (w0, x1)) + +/* +** ldr_za_1_vnum: +** mov (w1[2-5]), w0 +** ldr za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_1_vnum, + svldr_za (w0 + 1, x1 + svcntsb ()), + svldr_za (w0 + 1, x1 + svcntsb ())) + +/* +** ldr_za_13_vnum: +** mov (w1[2-5]), w0 +** ldr za\[\1, 13\], \[x1, #13, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_13_vnum, + svldr_za (w0 + 13, x1 + svcntsb () * 13), + svldr_za (w0 + 13, x1 + svcntsb () * 13)) + +/* +** ldr_za_15_vnum: +** mov (w1[2-5]), w0 +** ldr za\[\1, 15\], \[x1, #15, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_15_vnum, + svldr_za (w0 + 15, x1 + svcntsb () * 15), + svldr_za (w0 + 15, x1 + svcntsb () * 15)) + +/* +** ldr_za_16_vnum: +** ( +** add (w1[2-5]), w0, #?16 +** incb x1, all, mul #16 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1, all, mul #16 +** add (w1[2-5]), w0, #?16 +** ldr za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_za_16_vnum, + svldr_za (w0 + 16, x1 + svcntsb () * 16), + svldr_za (w0 + 16, x1 + svcntsb () * 16)) + +/* +** ldr_za_m1_vnum: +** ( +** sub (w1[2-5]), w0, #?1 +** decb x1 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** decb x1 +** sub (w1[2-5]), w0, #?1 +** ldr za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_za_m1_vnum, + svldr_za (w0 - 1, x1 - svcntsb ()), + svldr_za (w0 - 1, x1 - svcntsb ())) + +/* +** ldr_za_2: +** add (w1[2-5]), w0, #?2 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_2, + svldr_za (w0 + 2, x1), + svldr_za (w0 + 2, x1)) + +/* +** ldr_za_offset: +** ( +** mov (w1[2-5]), w0 +** add (x[0-9]+), x1, #?1 +** ldr za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** add (x[0-9]+), x1, #?1 +** mov (w1[2-5]), w0 +** ldr za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_za_offset, + svldr_za (w0, x1 + 1), + svldr_za (w0, x1 + 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_sc.c new file mode 100644 index 00000000000..3f9593fa5fb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/ldr_za_sc.c @@ -0,0 +1,51 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** ldr_za_0: +** mov (w1[2-5]), w0 +** ldr za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_0, + svldr_za (w0, x1), + svldr_za (w0, x1)) + +/* +** ldr_za_1_vnum: +** mov (w1[2-5]), w0 +** ldr za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_1_vnum, + svldr_za (w0 + 1, x1 + svcntsb ()), + svldr_za (w0 + 1, x1 + svcntsb ())) + +/* +** ldr_za_2: +** add (w1[2-5]), w0, #?2 +** ldr za\[\1, 0\], \[x1(?:, #0, 
mul vl)?\] +** ret +*/ +TEST_LOAD_ZA (ldr_za_2, + svldr_za (w0 + 2, x1), + svldr_za (w0 + 2, x1)) + +/* +** ldr_za_offset: +** ( +** mov (w1[2-5]), w0 +** add (x[0-9]+), x1, #?1 +** ldr za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** add (x[0-9]+), x1, #?1 +** mov (w1[2-5]), w0 +** ldr za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_LOAD_ZA (ldr_za_offset, + svldr_za (w0, x1 + 1), + svldr_za (w0, x1 + 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za32.c new file mode 100644 index 00000000000..480de2c7faf --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za32.c @@ -0,0 +1,102 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** mopa_za32_s8_0_p0_p1_z0_z1: +** smopa za0\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_s8_0_p0_p1_z0_z1, svint8_t, + svmopa_za32_s8_m (0, p0, p1, z0, z1), + svmopa_za32_m (0, p0, p1, z0, z1)) + +/* +** mopa_za32_s8_0_p1_p0_z1_z0: +** smopa za0\.s, p1/m, p0/m, z1\.b, z0\.b +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_s8_0_p1_p0_z1_z0, svint8_t, + svmopa_za32_s8_m (0, p1, p0, z1, z0), + svmopa_za32_m (0, p1, p0, z1, z0)) + +/* +** mopa_za32_s8_3_p0_p1_z0_z1: +** smopa za3\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_s8_3_p0_p1_z0_z1, svint8_t, + svmopa_za32_s8_m (3, p0, p1, z0, z1), + svmopa_za32_m (3, p0, p1, z0, z1)) + +/* +** mopa_za32_u8_0_p0_p1_z0_z1: +** umopa za0\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_u8_0_p0_p1_z0_z1, svuint8_t, + svmopa_za32_u8_m (0, p0, p1, z0, z1), + svmopa_za32_m (0, p0, p1, z0, z1)) + +/* +** mopa_za32_u8_3_p0_p1_z0_z1: +** umopa za3\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_u8_3_p0_p1_z0_z1, svuint8_t, + svmopa_za32_u8_m (3, p0, p1, z0, z1), + svmopa_za32_m (3, p0, p1, z0, z1)) + +/* +** mopa_za32_bf16_0_p0_p1_z0_z1: +** bfmopa za0\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_bf16_0_p0_p1_z0_z1, svbfloat16_t, + svmopa_za32_bf16_m (0, p0, p1, z0, z1), + svmopa_za32_m (0, p0, p1, z0, z1)) + +/* +** mopa_za32_bf16_3_p0_p1_z0_z1: +** bfmopa za3\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_bf16_3_p0_p1_z0_z1, svbfloat16_t, + svmopa_za32_bf16_m (3, p0, p1, z0, z1), + svmopa_za32_m (3, p0, p1, z0, z1)) + +/* +** mopa_za32_f16_0_p0_p1_z0_z1: +** fmopa za0\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_f16_0_p0_p1_z0_z1, svfloat16_t, + svmopa_za32_f16_m (0, p0, p1, z0, z1), + svmopa_za32_m (0, p0, p1, z0, z1)) + +/* +** mopa_za32_f16_3_p0_p1_z0_z1: +** fmopa za3\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_f16_3_p0_p1_z0_z1, svfloat16_t, + svmopa_za32_f16_m (3, p0, p1, z0, z1), + svmopa_za32_m (3, p0, p1, z0, z1)) + +/* +** mopa_za32_f32_0_p0_p1_z0_z1: +** fmopa za0\.s, p0/m, p1/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_f32_0_p0_p1_z0_z1, svfloat32_t, + svmopa_za32_f32_m (0, p0, p1, z0, z1), + svmopa_za32_m (0, p0, p1, z0, z1)) + +/* +** mopa_za32_f32_3_p0_p1_z0_z1: +** fmopa za3\.s, p0/m, p1/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZA (mopa_za32_f32_3_p0_p1_z0_z1, svfloat32_t, + svmopa_za32_f32_m (3, p0, p1, z0, z1), + svmopa_za32_m (3, p0, p1, z0, z1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za64.c new file mode 100644 index 00000000000..f523b960538 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mopa_za64.c @@ -0,0 +1,70 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** mopa_za64_s16_0_p0_p1_z0_z1: +** smopa za0\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_s16_0_p0_p1_z0_z1, svint16_t, + svmopa_za64_s16_m (0, p0, p1, z0, z1), + svmopa_za64_m (0, p0, p1, z0, z1)) + +/* +** mopa_za64_s16_0_p1_p0_z1_z0: +** smopa za0\.d, p1/m, p0/m, z1\.h, z0\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_s16_0_p1_p0_z1_z0, svint16_t, + svmopa_za64_s16_m (0, p1, p0, z1, z0), + svmopa_za64_m (0, p1, p0, z1, z0)) + +/* +** mopa_za64_s16_7_p0_p1_z0_z1: +** smopa za7\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_s16_7_p0_p1_z0_z1, svint16_t, + svmopa_za64_s16_m (7, p0, p1, z0, z1), + svmopa_za64_m (7, p0, p1, z0, z1)) + +/* +** mopa_za64_u16_0_p0_p1_z0_z1: +** umopa za0\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_u16_0_p0_p1_z0_z1, svuint16_t, + svmopa_za64_u16_m (0, p0, p1, z0, z1), + svmopa_za64_m (0, p0, p1, z0, z1)) + +/* +** mopa_za64_u16_7_p0_p1_z0_z1: +** umopa za7\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_u16_7_p0_p1_z0_z1, svuint16_t, + svmopa_za64_u16_m (7, p0, p1, z0, z1), + svmopa_za64_m (7, p0, p1, z0, z1)) + +#pragma GCC target "+nosme-i16i64+sme-f64f64" + +/* +** mopa_za64_f64_0_p0_p1_z0_z1: +** fmopa za0\.d, p0/m, p1/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_f64_0_p0_p1_z0_z1, svfloat64_t, + svmopa_za64_f64_m (0, p0, p1, z0, z1), + svmopa_za64_m (0, p0, p1, z0, z1)) + +/* +** mopa_za64_f64_7_p0_p1_z0_z1: +** fmopa za7\.d, p0/m, p1/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZA (mopa_za64_f64_7_p0_p1_z0_z1, svfloat64_t, + svmopa_za64_f64_m (7, p0, p1, z0, z1), + svmopa_za64_m (7, p0, p1, z0, z1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za32.c new file mode 100644 index 00000000000..63c2b80fd5b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za32.c @@ -0,0 +1,102 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** mops_za32_s8_0_p0_p1_z0_z1: +** smops za0\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_s8_0_p0_p1_z0_z1, svint8_t, + svmops_za32_s8_m (0, p0, p1, z0, z1), + svmops_za32_m (0, p0, p1, z0, z1)) + +/* +** mops_za32_s8_0_p1_p0_z1_z0: +** smops za0\.s, p1/m, p0/m, z1\.b, z0\.b +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_s8_0_p1_p0_z1_z0, svint8_t, + svmops_za32_s8_m (0, p1, p0, z1, z0), + svmops_za32_m (0, p1, p0, z1, z0)) + +/* +** mops_za32_s8_3_p0_p1_z0_z1: +** smops za3\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_s8_3_p0_p1_z0_z1, svint8_t, + svmops_za32_s8_m (3, p0, p1, z0, z1), + svmops_za32_m (3, p0, p1, z0, z1)) + +/* +** mops_za32_u8_0_p0_p1_z0_z1: +** umops za0\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_u8_0_p0_p1_z0_z1, svuint8_t, + svmops_za32_u8_m (0, p0, p1, z0, z1), + svmops_za32_m (0, p0, p1, z0, z1)) + +/* +** mops_za32_u8_3_p0_p1_z0_z1: +** umops za3\.s, p0/m, p1/m, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_u8_3_p0_p1_z0_z1, svuint8_t, + svmops_za32_u8_m (3, p0, p1, z0, z1), + svmops_za32_m (3, p0, p1, z0, z1)) + +/* +** mops_za32_bf16_0_p0_p1_z0_z1: +** bfmops za0\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_bf16_0_p0_p1_z0_z1, svbfloat16_t, + 
svmops_za32_bf16_m (0, p0, p1, z0, z1), + svmops_za32_m (0, p0, p1, z0, z1)) + +/* +** mops_za32_bf16_3_p0_p1_z0_z1: +** bfmops za3\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_bf16_3_p0_p1_z0_z1, svbfloat16_t, + svmops_za32_bf16_m (3, p0, p1, z0, z1), + svmops_za32_m (3, p0, p1, z0, z1)) + +/* +** mops_za32_f16_0_p0_p1_z0_z1: +** fmops za0\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_f16_0_p0_p1_z0_z1, svfloat16_t, + svmops_za32_f16_m (0, p0, p1, z0, z1), + svmops_za32_m (0, p0, p1, z0, z1)) + +/* +** mops_za32_f16_3_p0_p1_z0_z1: +** fmops za3\.s, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_f16_3_p0_p1_z0_z1, svfloat16_t, + svmops_za32_f16_m (3, p0, p1, z0, z1), + svmops_za32_m (3, p0, p1, z0, z1)) + +/* +** mops_za32_f32_0_p0_p1_z0_z1: +** fmops za0\.s, p0/m, p1/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_f32_0_p0_p1_z0_z1, svfloat32_t, + svmops_za32_f32_m (0, p0, p1, z0, z1), + svmops_za32_m (0, p0, p1, z0, z1)) + +/* +** mops_za32_f32_3_p0_p1_z0_z1: +** fmops za3\.s, p0/m, p1/m, z0\.s, z1\.s +** ret +*/ +TEST_UNIFORM_ZA (mops_za32_f32_3_p0_p1_z0_z1, svfloat32_t, + svmops_za32_f32_m (3, p0, p1, z0, z1), + svmops_za32_m (3, p0, p1, z0, z1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za64.c new file mode 100644 index 00000000000..bc04c3cf7fa --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/mops_za64.c @@ -0,0 +1,70 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** mops_za64_s16_0_p0_p1_z0_z1: +** smops za0\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_s16_0_p0_p1_z0_z1, svint16_t, + svmops_za64_s16_m (0, p0, p1, z0, z1), + svmops_za64_m (0, p0, p1, z0, z1)) + +/* +** mops_za64_s16_0_p1_p0_z1_z0: +** smops za0\.d, p1/m, p0/m, z1\.h, z0\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_s16_0_p1_p0_z1_z0, svint16_t, + svmops_za64_s16_m (0, p1, p0, z1, z0), + svmops_za64_m (0, p1, p0, z1, z0)) + +/* +** mops_za64_s16_7_p0_p1_z0_z1: +** smops za7\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_s16_7_p0_p1_z0_z1, svint16_t, + svmops_za64_s16_m (7, p0, p1, z0, z1), + svmops_za64_m (7, p0, p1, z0, z1)) + +/* +** mops_za64_u16_0_p0_p1_z0_z1: +** umops za0\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_u16_0_p0_p1_z0_z1, svuint16_t, + svmops_za64_u16_m (0, p0, p1, z0, z1), + svmops_za64_m (0, p0, p1, z0, z1)) + +/* +** mops_za64_u16_7_p0_p1_z0_z1: +** umops za7\.d, p0/m, p1/m, z0\.h, z1\.h +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_u16_7_p0_p1_z0_z1, svuint16_t, + svmops_za64_u16_m (7, p0, p1, z0, z1), + svmops_za64_m (7, p0, p1, z0, z1)) + +#pragma GCC target "+nosme-i16i64+sme-f64f64" + +/* +** mops_za64_f64_0_p0_p1_z0_z1: +** fmops za0\.d, p0/m, p1/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_f64_0_p0_p1_z0_z1, svfloat64_t, + svmops_za64_f64_m (0, p0, p1, z0, z1), + svmops_za64_m (0, p0, p1, z0, z1)) + +/* +** mops_za64_f64_7_p0_p1_z0_z1: +** fmops za7\.d, p0/m, p1/m, z0\.d, z1\.d +** ret +*/ +TEST_UNIFORM_ZA (mops_za64_f64_7_p0_p1_z0_z1, svfloat64_t, + svmops_za64_f64_m (7, p0, p1, z0, z1), + svmops_za64_m (7, p0, p1, z0, z1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za128.c new file mode 100644 index 00000000000..0dd503143e5 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za128.c @@ -0,0 +1,367 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za128_s8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_0_tied, svint8_t, + z0 = svread_hor_za128_s8_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s8_0_1_tied: +** add (w1[2-5]), w0, #?1 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_1_tied, svint8_t, + z0 = svread_hor_za128_s8_m (z0, p0, 0, w0 + 1), + z0 = svread_hor_za128_m (z0, p0, 0, w0 + 1)) + +/* +** read_za128_s8_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_m1_tied, svint8_t, + z0 = svread_hor_za128_s8_m (z0, p0, 0, w0 - 1), + z0 = svread_hor_za128_m (z0, p0, 0, w0 - 1)) + +/* +** read_za128_s8_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za1h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_1_0_tied, svint8_t, + z0 = svread_hor_za128_s8_m (z0, p0, 1, w0), + z0 = svread_hor_za128_m (z0, p0, 1, w0)) + +/* +** read_za128_s8_15_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za15h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_15_0_tied, svint8_t, + z0 = svread_hor_za128_s8_m (z0, p0, 15, w0), + z0 = svread_hor_za128_m (z0, p0, 15, w0)) + +/* +** read_za128_s8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_0_untied, svint8_t, + z0 = svread_hor_za128_s8_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u8_0_0_tied, svuint8_t, + z0 = svread_hor_za128_u8_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u8_0_0_untied, svuint8_t, + z0 = svread_hor_za128_u8_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_s16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s16_0_0_tied, svint16_t, + z0 = svread_hor_za128_s16_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s16_0_0_untied, svint16_t, + z0 = svread_hor_za128_s16_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u16_0_0_tied, svuint16_t, + z0 = svread_hor_za128_u16_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA 
(read_za128_u16_0_0_untied, svuint16_t, + z0 = svread_hor_za128_u16_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_f16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_f16_0_0_tied, svfloat16_t, + z0 = svread_hor_za128_f16_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_f16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_f16_0_0_untied, svfloat16_t, + z0 = svread_hor_za128_f16_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_bf16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_bf16_0_0_tied, svbfloat16_t, + z0 = svread_hor_za128_bf16_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_bf16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_bf16_0_0_untied, svbfloat16_t, + z0 = svread_hor_za128_bf16_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_s32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s32_0_0_tied, svint32_t, + z0 = svread_hor_za128_s32_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s32_0_0_untied, svint32_t, + z0 = svread_hor_za128_s32_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u32_0_0_tied, svuint32_t, + z0 = svread_hor_za128_u32_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u32_0_0_untied, svuint32_t, + z0 = svread_hor_za128_u32_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_f32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_f32_0_0_tied, svfloat32_t, + z0 = svread_hor_za128_f32_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_f32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_f32_0_0_untied, svfloat32_t, + z0 = svread_hor_za128_f32_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_s64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s64_0_0_tied, svint64_t, + z0 = svread_hor_za128_s64_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s64_0_0_untied: 
+** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s64_0_0_untied, svint64_t, + z0 = svread_hor_za128_s64_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u64_0_0_tied, svuint64_t, + z0 = svread_hor_za128_u64_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u64_0_0_untied, svuint64_t, + z0 = svread_hor_za128_u64_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_f64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_f64_0_0_tied, svfloat64_t, + z0 = svread_hor_za128_f64_m (z0, p0, 0, w0), + z0 = svread_hor_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_f64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0h\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0h\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_f64_0_0_untied, svfloat64_t, + z0 = svread_hor_za128_f64_m (z1, p0, 0, w0), + z0 = svread_hor_za128_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za16.c new file mode 100644 index 00000000000..c52d94a584b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za16.c @@ -0,0 +1,171 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za16_s16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_0_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 0, w0), + z0 = svread_hor_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_s16_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_1_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 0, w0 + 1), + z0 = svread_hor_za16_m (z0, p0, 0, w0 + 1)) + +/* +** read_za16_s16_0_7_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\1, 7\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_7_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 0, w0 + 7), + z0 = svread_hor_za16_m (z0, p0, 0, w0 + 7)) + +/* +** read_za16_s16_0_8_tied: +** add (w1[2-5]), w0, #?8 +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_8_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 0, w0 + 8), + z0 = svread_hor_za16_m (z0, p0, 0, w0 + 8)) + +/* +** read_za16_s16_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_m1_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 0, w0 - 1), + z0 = svread_hor_za16_m (z0, p0, 0, w0 - 1)) + +/* +** read_za16_s16_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za1h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_1_0_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 1, w0), + z0 = svread_hor_za16_m (z0, p0, 1, w0)) + +/* +** read_za16_s16_1_7_tied: +** mov (w1[2-5]), w0 +** 
mova z0\.h, p0/m, za1h\.h\[\1, 7\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_1_7_tied, svint16_t, + z0 = svread_hor_za16_s16_m (z0, p0, 1, w0 + 7), + z0 = svread_hor_za16_m (z0, p0, 1, w0 + 7)) + +/* +** read_za16_s16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_0_untied, svint16_t, + z0 = svread_hor_za16_s16_m (z1, p0, 0, w0), + z0 = svread_hor_za16_m (z1, p0, 0, w0)) + +/* +** read_za16_u16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_u16_0_0_tied, svuint16_t, + z0 = svread_hor_za16_u16_m (z0, p0, 0, w0), + z0 = svread_hor_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_u16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_u16_0_0_untied, svuint16_t, + z0 = svread_hor_za16_u16_m (z1, p0, 0, w0), + z0 = svread_hor_za16_m (z1, p0, 0, w0)) + +/* +** read_za16_f16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_f16_0_0_tied, svfloat16_t, + z0 = svread_hor_za16_f16_m (z0, p0, 0, w0), + z0 = svread_hor_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_f16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_f16_0_0_untied, svfloat16_t, + z0 = svread_hor_za16_f16_m (z1, p0, 0, w0), + z0 = svread_hor_za16_m (z1, p0, 0, w0)) + +/* +** read_za16_bf16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_bf16_0_0_tied, svbfloat16_t, + z0 = svread_hor_za16_bf16_m (z0, p0, 0, w0), + z0 = svread_hor_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_bf16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0h\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0h\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_bf16_0_0_untied, svbfloat16_t, + z0 = svread_hor_za16_bf16_m (z1, p0, 0, w0), + z0 = svread_hor_za16_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za32.c new file mode 100644 index 00000000000..a085dc7fea7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za32.c @@ -0,0 +1,164 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za32_s32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_0_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 0, w0), + z0 = svread_hor_za32_m (z0, p0, 0, w0)) + +/* +** read_za32_s32_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_1_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 0, w0 + 1), + z0 = svread_hor_za32_m (z0, p0, 0, w0 + 1)) + +/* +** read_za32_s32_0_3_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\1, 3\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_3_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 0, w0 + 3), + z0 = svread_hor_za32_m (z0, p0, 
0, w0 + 3)) + +/* +** read_za32_s32_0_4_tied: +** add (w1[2-5]), w0, #?4 +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_4_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 0, w0 + 4), + z0 = svread_hor_za32_m (z0, p0, 0, w0 + 4)) + +/* +** read_za32_s32_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_m1_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 0, w0 - 1), + z0 = svread_hor_za32_m (z0, p0, 0, w0 - 1)) + +/* +** read_za32_s32_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za1h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_1_0_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 1, w0), + z0 = svread_hor_za32_m (z0, p0, 1, w0)) + +/* +** read_za32_s32_1_3_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za1h\.s\[\1, 3\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_1_3_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 1, w0 + 3), + z0 = svread_hor_za32_m (z0, p0, 1, w0 + 3)) + +/* +** read_za32_s32_3_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za3h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_3_0_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 3, w0), + z0 = svread_hor_za32_m (z0, p0, 3, w0)) + +/* +** read_za32_s32_3_3_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za3h\.s\[\1, 3\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_3_3_tied, svint32_t, + z0 = svread_hor_za32_s32_m (z0, p0, 3, w0 + 3), + z0 = svread_hor_za32_m (z0, p0, 3, w0 + 3)) + +/* +** read_za32_s32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_0_untied, svint32_t, + z0 = svread_hor_za32_s32_m (z1, p0, 0, w0), + z0 = svread_hor_za32_m (z1, p0, 0, w0)) + +/* +** read_za32_u32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_u32_0_0_tied, svuint32_t, + z0 = svread_hor_za32_u32_m (z0, p0, 0, w0), + z0 = svread_hor_za32_m (z0, p0, 0, w0)) + +/* +** read_za32_u32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za32_u32_0_0_untied, svuint32_t, + z0 = svread_hor_za32_u32_m (z1, p0, 0, w0), + z0 = svread_hor_za32_m (z1, p0, 0, w0)) + +/* +** read_za32_f32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_f32_0_0_tied, svfloat32_t, + z0 = svread_hor_za32_f32_m (z0, p0, 0, w0), + z0 = svread_hor_za32_m (z0, p0, 0, w0)) + +/* +** read_za32_f32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.s, p0/m, za0h\.s\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0h\.s\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za32_f32_0_0_untied, svfloat32_t, + z0 = svread_hor_za32_f32_m (z1, p0, 0, w0), + z0 = svread_hor_za32_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za64.c new file mode 100644 index 00000000000..021e3460f0c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za64.c @@ -0,0 +1,154 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za64_s64_0_0_tied: 
+** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_0_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 0, w0), + z0 = svread_hor_za64_m (z0, p0, 0, w0)) + +/* +** read_za64_s64_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_1_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 0, w0 + 1), + z0 = svread_hor_za64_m (z0, p0, 0, w0 + 1)) + +/* +** read_za64_s64_0_2_tied: +** add (w1[2-5]), w0, #?2 +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_2_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 0, w0 + 2), + z0 = svread_hor_za64_m (z0, p0, 0, w0 + 2)) + +/* +** read_za64_s64_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_m1_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 0, w0 - 1), + z0 = svread_hor_za64_m (z0, p0, 0, w0 - 1)) + +/* +** read_za64_s64_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za1h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_1_0_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 1, w0), + z0 = svread_hor_za64_m (z0, p0, 1, w0)) + +/* +** read_za64_s64_1_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za1h\.d\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_1_1_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 1, w0 + 1), + z0 = svread_hor_za64_m (z0, p0, 1, w0 + 1)) + +/* +** read_za64_s64_7_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za7h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_7_0_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 7, w0), + z0 = svread_hor_za64_m (z0, p0, 7, w0)) + +/* +** read_za64_s64_7_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za7h\.d\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_7_1_tied, svint64_t, + z0 = svread_hor_za64_s64_m (z0, p0, 7, w0 + 1), + z0 = svread_hor_za64_m (z0, p0, 7, w0 + 1)) + +/* +** read_za64_s64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_0_untied, svint64_t, + z0 = svread_hor_za64_s64_m (z1, p0, 0, w0), + z0 = svread_hor_za64_m (z1, p0, 0, w0)) + +/* +** read_za64_u64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_u64_0_0_tied, svuint64_t, + z0 = svread_hor_za64_u64_m (z0, p0, 0, w0), + z0 = svread_hor_za64_m (z0, p0, 0, w0)) + +/* +** read_za64_u64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za64_u64_0_0_untied, svuint64_t, + z0 = svread_hor_za64_u64_m (z1, p0, 0, w0), + z0 = svread_hor_za64_m (z1, p0, 0, w0)) + +/* +** read_za64_f64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_f64_0_0_tied, svfloat64_t, + z0 = svread_hor_za64_f64_m (z0, p0, 0, w0), + z0 = svread_hor_za64_m (z0, p0, 0, w0)) + +/* +** read_za64_f64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.d, p0/m, za0h\.d\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0h\.d\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za64_f64_0_0_untied, svfloat64_t, + z0 = svread_hor_za64_f64_m (z1, p0, 0, w0), + z0 = 
svread_hor_za64_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za8.c new file mode 100644 index 00000000000..0558aa0e583 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_hor_za8.c @@ -0,0 +1,97 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za8_s8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0h\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_0_tied, svint8_t, + z0 = svread_hor_za8_s8_m (z0, p0, 0, w0), + z0 = svread_hor_za8_m (z0, p0, 0, w0)) + +/* +** read_za8_s8_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0h\.b\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_1_tied, svint8_t, + z0 = svread_hor_za8_s8_m (z0, p0, 0, w0 + 1), + z0 = svread_hor_za8_m (z0, p0, 0, w0 + 1)) + +/* +** read_za8_s8_0_15_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0h\.b\[\1, 15\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_15_tied, svint8_t, + z0 = svread_hor_za8_s8_m (z0, p0, 0, w0 + 15), + z0 = svread_hor_za8_m (z0, p0, 0, w0 + 15)) + +/* +** read_za8_s8_0_16_tied: +** add (w1[2-5]), w0, #?16 +** mova z0\.b, p0/m, za0h\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_16_tied, svint8_t, + z0 = svread_hor_za8_s8_m (z0, p0, 0, w0 + 16), + z0 = svread_hor_za8_m (z0, p0, 0, w0 + 16)) + +/* +** read_za8_s8_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.b, p0/m, za0h\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_m1_tied, svint8_t, + z0 = svread_hor_za8_s8_m (z0, p0, 0, w0 - 1), + z0 = svread_hor_za8_m (z0, p0, 0, w0 - 1)) + +/* +** read_za8_s8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.b, p0/m, za0h\.b\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0h\.b\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_0_untied, svint8_t, + z0 = svread_hor_za8_s8_m (z1, p0, 0, w0), + z0 = svread_hor_za8_m (z1, p0, 0, w0)) + +/* +** read_za8_u8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0h\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_u8_0_0_tied, svuint8_t, + z0 = svread_hor_za8_u8_m (z0, p0, 0, w0), + z0 = svread_hor_za8_m (z0, p0, 0, w0)) + +/* +** read_za8_u8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.b, p0/m, za0h\.b\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0h\.b\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za8_u8_0_0_untied, svuint8_t, + z0 = svread_hor_za8_u8_m (z1, p0, 0, w0), + z0 = svread_hor_za8_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za128.c new file mode 100644 index 00000000000..177fa7a8124 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za128.c @@ -0,0 +1,367 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za128_s8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_0_tied, svint8_t, + z0 = svread_ver_za128_s8_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s8_0_1_tied: +** add (w1[2-5]), w0, #?1 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_1_tied, svint8_t, + z0 = svread_ver_za128_s8_m (z0, p0, 0, w0 + 1), + z0 = svread_ver_za128_m (z0, p0, 0, w0 + 1)) + +/* +** read_za128_s8_0_m1_tied: 
+** sub (w1[2-5]), w0, #?1 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_m1_tied, svint8_t, + z0 = svread_ver_za128_s8_m (z0, p0, 0, w0 - 1), + z0 = svread_ver_za128_m (z0, p0, 0, w0 - 1)) + +/* +** read_za128_s8_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za1v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_1_0_tied, svint8_t, + z0 = svread_ver_za128_s8_m (z0, p0, 1, w0), + z0 = svread_ver_za128_m (z0, p0, 1, w0)) + +/* +** read_za128_s8_15_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za15v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s8_15_0_tied, svint8_t, + z0 = svread_ver_za128_s8_m (z0, p0, 15, w0), + z0 = svread_ver_za128_m (z0, p0, 15, w0)) + +/* +** read_za128_s8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s8_0_0_untied, svint8_t, + z0 = svread_ver_za128_s8_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u8_0_0_tied, svuint8_t, + z0 = svread_ver_za128_u8_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u8_0_0_untied, svuint8_t, + z0 = svread_ver_za128_u8_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_s16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s16_0_0_tied, svint16_t, + z0 = svread_ver_za128_s16_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s16_0_0_untied, svint16_t, + z0 = svread_ver_za128_s16_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u16_0_0_tied, svuint16_t, + z0 = svread_ver_za128_u16_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u16_0_0_untied, svuint16_t, + z0 = svread_ver_za128_u16_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_f16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_f16_0_0_tied, svfloat16_t, + z0 = svread_ver_za128_f16_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_f16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_f16_0_0_untied, svfloat16_t, + z0 = svread_ver_za128_f16_m (z1, p0, 0, w0), + z0 = 
svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_bf16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_bf16_0_0_tied, svbfloat16_t, + z0 = svread_ver_za128_bf16_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_bf16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_bf16_0_0_untied, svbfloat16_t, + z0 = svread_ver_za128_bf16_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_s32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s32_0_0_tied, svint32_t, + z0 = svread_ver_za128_s32_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s32_0_0_untied, svint32_t, + z0 = svread_ver_za128_s32_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u32_0_0_tied, svuint32_t, + z0 = svread_ver_za128_u32_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u32_0_0_untied, svuint32_t, + z0 = svread_ver_za128_u32_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_f32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_f32_0_0_tied, svfloat32_t, + z0 = svread_ver_za128_f32_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_f32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_f32_0_0_untied, svfloat32_t, + z0 = svread_ver_za128_f32_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_s64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_s64_0_0_tied, svint64_t, + z0 = svread_ver_za128_s64_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_s64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_s64_0_0_untied, svint64_t, + z0 = svread_ver_za128_s64_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_u64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_u64_0_0_tied, svuint64_t, + z0 = svread_ver_za128_u64_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_u64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** 
mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_u64_0_0_untied, svuint64_t, + z0 = svread_ver_za128_u64_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) + +/* +** read_za128_f64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za128_f64_0_0_tied, svfloat64_t, + z0 = svread_ver_za128_f64_m (z0, p0, 0, w0), + z0 = svread_ver_za128_m (z0, p0, 0, w0)) + +/* +** read_za128_f64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.q, p0/m, za0v\.q\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.q, p0/m, za0v\.q\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za128_f64_0_0_untied, svfloat64_t, + z0 = svread_ver_za128_f64_m (z1, p0, 0, w0), + z0 = svread_ver_za128_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za16.c new file mode 100644 index 00000000000..ea67289f0ed --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za16.c @@ -0,0 +1,171 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za16_s16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_0_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 0, w0), + z0 = svread_ver_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_s16_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_1_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 0, w0 + 1), + z0 = svread_ver_za16_m (z0, p0, 0, w0 + 1)) + +/* +** read_za16_s16_0_7_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\1, 7\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_7_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 0, w0 + 7), + z0 = svread_ver_za16_m (z0, p0, 0, w0 + 7)) + +/* +** read_za16_s16_0_8_tied: +** add (w1[2-5]), w0, #?8 +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_8_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 0, w0 + 8), + z0 = svread_ver_za16_m (z0, p0, 0, w0 + 8)) + +/* +** read_za16_s16_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_m1_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 0, w0 - 1), + z0 = svread_ver_za16_m (z0, p0, 0, w0 - 1)) + +/* +** read_za16_s16_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za1v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_1_0_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 1, w0), + z0 = svread_ver_za16_m (z0, p0, 1, w0)) + +/* +** read_za16_s16_1_7_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za1v\.h\[\1, 7\] +** ret +*/ +TEST_READ_ZA (read_za16_s16_1_7_tied, svint16_t, + z0 = svread_ver_za16_s16_m (z0, p0, 1, w0 + 7), + z0 = svread_ver_za16_m (z0, p0, 1, w0 + 7)) + +/* +** read_za16_s16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_s16_0_0_untied, svint16_t, + z0 = svread_ver_za16_s16_m (z1, p0, 0, w0), + z0 = svread_ver_za16_m (z1, p0, 0, w0)) + +/* +** read_za16_u16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_u16_0_0_tied, svuint16_t, + 
z0 = svread_ver_za16_u16_m (z0, p0, 0, w0), + z0 = svread_ver_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_u16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_u16_0_0_untied, svuint16_t, + z0 = svread_ver_za16_u16_m (z1, p0, 0, w0), + z0 = svread_ver_za16_m (z1, p0, 0, w0)) + +/* +** read_za16_f16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_f16_0_0_tied, svfloat16_t, + z0 = svread_ver_za16_f16_m (z0, p0, 0, w0), + z0 = svread_ver_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_f16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_f16_0_0_untied, svfloat16_t, + z0 = svread_ver_za16_f16_m (z1, p0, 0, w0), + z0 = svread_ver_za16_m (z1, p0, 0, w0)) + +/* +** read_za16_bf16_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za16_bf16_0_0_tied, svbfloat16_t, + z0 = svread_ver_za16_bf16_m (z0, p0, 0, w0), + z0 = svread_ver_za16_m (z0, p0, 0, w0)) + +/* +** read_za16_bf16_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.h, p0/m, za0v\.h\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.h, p0/m, za0v\.h\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za16_bf16_0_0_untied, svbfloat16_t, + z0 = svread_ver_za16_bf16_m (z1, p0, 0, w0), + z0 = svread_ver_za16_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za32.c new file mode 100644 index 00000000000..97d7a2d627b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za32.c @@ -0,0 +1,164 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za32_s32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_0_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 0, w0), + z0 = svread_ver_za32_m (z0, p0, 0, w0)) + +/* +** read_za32_s32_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_1_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 0, w0 + 1), + z0 = svread_ver_za32_m (z0, p0, 0, w0 + 1)) + +/* +** read_za32_s32_0_3_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\1, 3\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_3_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 0, w0 + 3), + z0 = svread_ver_za32_m (z0, p0, 0, w0 + 3)) + +/* +** read_za32_s32_0_4_tied: +** add (w1[2-5]), w0, #?4 +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_4_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 0, w0 + 4), + z0 = svread_ver_za32_m (z0, p0, 0, w0 + 4)) + +/* +** read_za32_s32_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_m1_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 0, w0 - 1), + z0 = svread_ver_za32_m (z0, p0, 0, w0 - 1)) + +/* +** read_za32_s32_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za1v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_1_0_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, 
p0, 1, w0), + z0 = svread_ver_za32_m (z0, p0, 1, w0)) + +/* +** read_za32_s32_1_3_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za1v\.s\[\1, 3\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_1_3_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 1, w0 + 3), + z0 = svread_ver_za32_m (z0, p0, 1, w0 + 3)) + +/* +** read_za32_s32_3_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za3v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_3_0_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 3, w0), + z0 = svread_ver_za32_m (z0, p0, 3, w0)) + +/* +** read_za32_s32_3_3_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za3v\.s\[\1, 3\] +** ret +*/ +TEST_READ_ZA (read_za32_s32_3_3_tied, svint32_t, + z0 = svread_ver_za32_s32_m (z0, p0, 3, w0 + 3), + z0 = svread_ver_za32_m (z0, p0, 3, w0 + 3)) + +/* +** read_za32_s32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za32_s32_0_0_untied, svint32_t, + z0 = svread_ver_za32_s32_m (z1, p0, 0, w0), + z0 = svread_ver_za32_m (z1, p0, 0, w0)) + +/* +** read_za32_u32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_u32_0_0_tied, svuint32_t, + z0 = svread_ver_za32_u32_m (z0, p0, 0, w0), + z0 = svread_ver_za32_m (z0, p0, 0, w0)) + +/* +** read_za32_u32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za32_u32_0_0_untied, svuint32_t, + z0 = svread_ver_za32_u32_m (z1, p0, 0, w0), + z0 = svread_ver_za32_m (z1, p0, 0, w0)) + +/* +** read_za32_f32_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za32_f32_0_0_tied, svfloat32_t, + z0 = svread_ver_za32_f32_m (z0, p0, 0, w0), + z0 = svread_ver_za32_m (z0, p0, 0, w0)) + +/* +** read_za32_f32_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.s, p0/m, za0v\.s\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.s, p0/m, za0v\.s\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za32_f32_0_0_untied, svfloat32_t, + z0 = svread_ver_za32_f32_m (z1, p0, 0, w0), + z0 = svread_ver_za32_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za64.c new file mode 100644 index 00000000000..ce1348f147b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za64.c @@ -0,0 +1,154 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za64_s64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_0_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 0, w0), + z0 = svread_ver_za64_m (z0, p0, 0, w0)) + +/* +** read_za64_s64_0_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_1_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 0, w0 + 1), + z0 = svread_ver_za64_m (z0, p0, 0, w0 + 1)) + +/* +** read_za64_s64_0_2_tied: +** add (w1[2-5]), w0, #?2 +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_2_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 0, w0 + 2), + z0 = svread_ver_za64_m (z0, p0, 0, w0 + 
2)) + +/* +** read_za64_s64_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_m1_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 0, w0 - 1), + z0 = svread_ver_za64_m (z0, p0, 0, w0 - 1)) + +/* +** read_za64_s64_1_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za1v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_1_0_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 1, w0), + z0 = svread_ver_za64_m (z0, p0, 1, w0)) + +/* +** read_za64_s64_1_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za1v\.d\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_1_1_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 1, w0 + 1), + z0 = svread_ver_za64_m (z0, p0, 1, w0 + 1)) + +/* +** read_za64_s64_7_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za7v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_7_0_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 7, w0), + z0 = svread_ver_za64_m (z0, p0, 7, w0)) + +/* +** read_za64_s64_7_1_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za7v\.d\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za64_s64_7_1_tied, svint64_t, + z0 = svread_ver_za64_s64_m (z0, p0, 7, w0 + 1), + z0 = svread_ver_za64_m (z0, p0, 7, w0 + 1)) + +/* +** read_za64_s64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za64_s64_0_0_untied, svint64_t, + z0 = svread_ver_za64_s64_m (z1, p0, 0, w0), + z0 = svread_ver_za64_m (z1, p0, 0, w0)) + +/* +** read_za64_u64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_u64_0_0_tied, svuint64_t, + z0 = svread_ver_za64_u64_m (z0, p0, 0, w0), + z0 = svread_ver_za64_m (z0, p0, 0, w0)) + +/* +** read_za64_u64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za64_u64_0_0_untied, svuint64_t, + z0 = svread_ver_za64_u64_m (z1, p0, 0, w0), + z0 = svread_ver_za64_m (z1, p0, 0, w0)) + +/* +** read_za64_f64_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za64_f64_0_0_tied, svfloat64_t, + z0 = svread_ver_za64_f64_m (z0, p0, 0, w0), + z0 = svread_ver_za64_m (z0, p0, 0, w0)) + +/* +** read_za64_f64_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.d, p0/m, za0v\.d\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.d, p0/m, za0v\.d\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za64_f64_0_0_untied, svfloat64_t, + z0 = svread_ver_za64_f64_m (z1, p0, 0, w0), + z0 = svread_ver_za64_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za8.c new file mode 100644 index 00000000000..916155f5261 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/read_ver_za8.c @@ -0,0 +1,97 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** read_za8_s8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0v\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_0_tied, svint8_t, + z0 = svread_ver_za8_s8_m (z0, p0, 0, w0), + z0 = svread_ver_za8_m (z0, p0, 0, w0)) + +/* +** read_za8_s8_0_1_tied: +** mov (w1[2-5]), w0 +** mova 
z0\.b, p0/m, za0v\.b\[\1, 1\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_1_tied, svint8_t, + z0 = svread_ver_za8_s8_m (z0, p0, 0, w0 + 1), + z0 = svread_ver_za8_m (z0, p0, 0, w0 + 1)) + +/* +** read_za8_s8_0_15_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0v\.b\[\1, 15\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_15_tied, svint8_t, + z0 = svread_ver_za8_s8_m (z0, p0, 0, w0 + 15), + z0 = svread_ver_za8_m (z0, p0, 0, w0 + 15)) + +/* +** read_za8_s8_0_16_tied: +** add (w1[2-5]), w0, #?16 +** mova z0\.b, p0/m, za0v\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_16_tied, svint8_t, + z0 = svread_ver_za8_s8_m (z0, p0, 0, w0 + 16), + z0 = svread_ver_za8_m (z0, p0, 0, w0 + 16)) + +/* +** read_za8_s8_0_m1_tied: +** sub (w1[2-5]), w0, #?1 +** mova z0\.b, p0/m, za0v\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_m1_tied, svint8_t, + z0 = svread_ver_za8_s8_m (z0, p0, 0, w0 - 1), + z0 = svread_ver_za8_m (z0, p0, 0, w0 - 1)) + +/* +** read_za8_s8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.b, p0/m, za0v\.b\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0v\.b\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za8_s8_0_0_untied, svint8_t, + z0 = svread_ver_za8_s8_m (z1, p0, 0, w0), + z0 = svread_ver_za8_m (z1, p0, 0, w0)) + +/* +** read_za8_u8_0_0_tied: +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0v\.b\[\1, 0\] +** ret +*/ +TEST_READ_ZA (read_za8_u8_0_0_tied, svuint8_t, + z0 = svread_ver_za8_u8_m (z0, p0, 0, w0), + z0 = svread_ver_za8_m (z0, p0, 0, w0)) + +/* +** read_za8_u8_0_0_untied: +** ( +** mov (w1[2-5]), w0 +** mov z0\.d, z1\.d +** mova z0\.b, p0/m, za0v\.b\[\1, 0\] +** | +** mov z0\.d, z1\.d +** mov (w1[2-5]), w0 +** mova z0\.b, p0/m, za0v\.b\[\2, 0\] +** ) +** ret +*/ +TEST_READ_ZA (read_za8_u8_0_0_untied, svuint8_t, + z0 = svread_ver_za8_u8_m (z1, p0, 0, w0), + z0 = svread_ver_za8_m (z1, p0, 0, w0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za128.c new file mode 100644 index 00000000000..23491484bf9 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za128.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_vnum_za128_0_0: +** mov (w1[2-5]), w0 +** st1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za128_0_0, + svst1_hor_vnum_za128 (0, w0, p0, x1, 0), + svst1_hor_vnum_za128 (0, w0, p0, x1, 0)) + +/* +** st1_vnum_za128_5_0: +** incb x1, all, mul #13 +** mov (w1[2-5]), w0 +** st1q { za5h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za128_5_0, + svst1_hor_vnum_za128 (5, w0, p0, x1, 13), + svst1_hor_vnum_za128 (5, w0, p0, x1, 13)) + +/* +** st1_vnum_za128_11_0: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** st1q { za11h\.q\[\3, 0\] }, p0/z, \[\2\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za128_11_0, + svst1_hor_vnum_za128 (11, w0, p0, x1, x2), + svst1_hor_vnum_za128 (11, w0, p0, x1, x2)) + +/* +** st1_vnum_za128_0_1: +** add (w1[2-5]), w0, #?1 +** st1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za128_0_1, + svst1_hor_vnum_za128 (0, w0 + 1, p0, x1, 0), + svst1_hor_vnum_za128 (0, w0 + 1, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za16.c new file mode 100644 index 00000000000..86c3e5de0e6 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za16.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_vnum_za16_0_0: +** mov (w1[2-5]), w0 +** st1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za16_0_0, + svst1_hor_vnum_za16 (0, w0, p0, x1, 0), + svst1_hor_vnum_za16 (0, w0, p0, x1, 0)) + +/* +** st1_vnum_za16_0_1: +** incb x1, all, mul #9 +** mov (w1[2-5]), w0 +** st1h { za0h\.h\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za16_0_1, + svst1_hor_vnum_za16 (0, w0 + 1, p0, x1, 9), + svst1_hor_vnum_za16 (0, w0 + 1, p0, x1, 9)) + +/* +** st1_vnum_za16_1_7: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** st1h { za1h\.h\[\3, 7\] }, p0/z, \[\2\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za16_1_7, + svst1_hor_vnum_za16 (1, w0 + 7, p0, x1, x2), + svst1_hor_vnum_za16 (1, w0 + 7, p0, x1, x2)) + +/* +** st1_vnum_za16_0_8: +** add (w1[2-5]), w0, #?8 +** st1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za16_0_8, + svst1_hor_vnum_za16 (0, w0 + 8, p0, x1, 0), + svst1_hor_vnum_za16 (0, w0 + 8, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za32.c new file mode 100644 index 00000000000..6a5902b1dee --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za32.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_vnum_za32_0_0: +** mov (w1[2-5]), w0 +** st1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za32_0_0, + svst1_hor_vnum_za32 (0, w0, p0, x1, 0), + svst1_hor_vnum_za32 (0, w0, p0, x1, 0)) + +/* +** st1_vnum_za32_0_1: +** incb x1, all, mul #5 +** mov (w1[2-5]), w0 +** st1w { za0h\.s\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za32_0_1, + svst1_hor_vnum_za32 (0, w0 + 1, p0, x1, 5), + svst1_hor_vnum_za32 (0, w0 + 1, p0, x1, 5)) + +/* +** st1_vnum_za32_2_3: +** cntb (x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** st1w { za2h\.s\[\3, 3\] }, p0/z, \[\2\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za32_2_3, + svst1_hor_vnum_za32 (2, w0 + 3, p0, x1, x2), + svst1_hor_vnum_za32 (2, w0 + 3, p0, x1, x2)) + +/* +** st1_vnum_za32_0_4: +** add (w1[2-5]), w0, #?4 +** st1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za32_0_4, + svst1_hor_vnum_za32 (0, w0 + 4, p0, x1, 0), + svst1_hor_vnum_za32 (0, w0 + 4, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za64.c new file mode 100644 index 00000000000..145bce0ac09 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za64.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_vnum_za64_0_0: +** mov (w1[2-5]), w0 +** st1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za64_0_0, + svst1_hor_vnum_za64 (0, w0, p0, x1, 0), + svst1_hor_vnum_za64 (0, w0, p0, x1, 0)) + +/* +** st1_vnum_za64_0_1: +** incb x1, all, mul #13 +** mov (w1[2-5]), w0 +** st1d { za0h\.d\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za64_0_1, + svst1_hor_vnum_za64 (0, w0 + 1, p0, x1, 13), + svst1_hor_vnum_za64 (0, w0 + 1, p0, x1, 13)) + +/* +** st1_vnum_za64_5_1: +** cntb 
(x[0-9]+) +** madd (x[0-9]+), (?:\1, x2|x2, \1), x1 +** mov (w1[2-5]), w0 +** st1d { za5h\.d\[\3, 1\] }, p0/z, \[\2\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za64_5_1, + svst1_hor_vnum_za64 (5, w0 + 1, p0, x1, x2), + svst1_hor_vnum_za64 (5, w0 + 1, p0, x1, x2)) + +/* +** st1_vnum_za64_0_2: +** add (w1[2-5]), w0, #?2 +** st1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za64_0_2, + svst1_hor_vnum_za64 (0, w0 + 2, p0, x1, 0), + svst1_hor_vnum_za64 (0, w0 + 2, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za8.c new file mode 100644 index 00000000000..3477180670a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_vnum_za8.c @@ -0,0 +1,46 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_vnum_za8_0_0: +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za8_0_0, + svst1_hor_vnum_za8 (0, w0, p0, x1, 0), + svst1_hor_vnum_za8 (0, w0, p0, x1, 0)) + +/* +** st1_vnum_za8_0_1: +** incb x1, all, mul #11 +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za8_0_1, + svst1_hor_vnum_za8 (0, w0 + 1, p0, x1, 11), + svst1_hor_vnum_za8 (0, w0 + 1, p0, x1, 11)) + +/* +** st1_vnum_za8_0_15: +** cntb (x[0-9]+) +** mul (x[0-9]+), (?:\1, x2|x2, \1) +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\3, 15\] }, p0/z, \[x1, \2\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za8_0_15, + svst1_hor_vnum_za8 (0, w0 + 15, p0, x1, x2), + svst1_hor_vnum_za8 (0, w0 + 15, p0, x1, x2)) + +/* +** st1_vnum_za8_0_16: +** add (w1[2-5]), w0, #?16 +** st1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_vnum_za8_0_16, + svst1_hor_vnum_za8 (0, w0 + 16, p0, x1, 0), + svst1_hor_vnum_za8 (0, w0 + 16, p0, x1, 0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za128.c new file mode 100644 index 00000000000..050205c8802 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za128.c @@ -0,0 +1,63 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_za128_0_0: +** mov (w1[2-5]), w0 +** st1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za128_0_0, + svst1_hor_za128 (0, w0, p0, x1), + svst1_hor_za128 (0, w0, p0, x1)) + +/* +** st1_za128_0_1: +** add (w1[2-5]), w0, #?1 +** st1q { za0h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za128_0_1, + svst1_hor_za128 (0, w0 + 1, p0, x1), + svst1_hor_za128 (0, w0 + 1, p0, x1)) + +/* +** st1_za128_7_0: +** mov (w1[2-5]), w0 +** st1q { za7h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za128_7_0, + svst1_hor_za128 (7, w0, p0, x1), + svst1_hor_za128 (7, w0, p0, x1)) + +/* +** st1_za128_13_0: +** mov (w1[2-5]), w0 +** st1q { za13h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za128_13_0, + svst1_hor_za128 (13, w0, p0, x1), + svst1_hor_za128 (13, w0, p0, x1)) + +/* +** st1_za128_15_0: +** mov (w1[2-5]), w0 +** st1q { za15h\.q\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za128_15_0, + svst1_hor_za128 (15, w0, p0, x1), + svst1_hor_za128 (15, w0, p0, x1)) + +/* +** st1_za128_9_0_index: +** mov (w1[2-5]), w0 +** st1q { za9h\.q\[\1, 0\] }, p0/z, \[x1, x2, lsl #?4\] +** ret +*/ +TEST_STORE_ZA (st1_za128_9_0_index, + svst1_hor_za128 (9, w0, p0, x1 + x2 * 16), + 
svst1_hor_za128 (9, w0, p0, x1 + x2 * 16)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za16.c new file mode 100644 index 00000000000..3f5141e1dcd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za16.c @@ -0,0 +1,94 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_za16_0_0: +** mov (w1[2-5]), w0 +** st1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_0_0, + svst1_hor_za16 (0, w0, p0, x1), + svst1_hor_za16 (0, w0, p0, x1)) + +/* +** st1_za16_0_1: +** mov (w1[2-5]), w0 +** st1h { za0h\.h\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_0_1, + svst1_hor_za16 (0, w0 + 1, p0, x1), + svst1_hor_za16 (0, w0 + 1, p0, x1)) + +/* +** st1_za16_0_7: +** mov (w1[2-5]), w0 +** st1h { za0h\.h\[\1, 7\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_0_7, + svst1_hor_za16 (0, w0 + 7, p0, x1), + svst1_hor_za16 (0, w0 + 7, p0, x1)) + +/* +** st1_za16_1_0: +** mov (w1[2-5]), w0 +** st1h { za1h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_1_0, + svst1_hor_za16 (1, w0, p0, x1), + svst1_hor_za16 (1, w0, p0, x1)) + + +/* +** st1_za16_1_1: +** mov (w1[2-5]), w0 +** st1h { za1h\.h\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_1_1, + svst1_hor_za16 (1, w0 + 1, p0, x1), + svst1_hor_za16 (1, w0 + 1, p0, x1)) + +/* +** st1_za16_1_7: +** mov (w1[2-5]), w0 +** st1h { za1h\.h\[\1, 7\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_1_7, + svst1_hor_za16 (1, w0 + 7, p0, x1), + svst1_hor_za16 (1, w0 + 7, p0, x1)) + +/* +** st1_za16_1_5_index: +** mov (w1[2-5]), w0 +** st1h { za1h\.h\[\1, 5\] }, p0/z, \[x1, x2, lsl #?1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_1_5_index, + svst1_hor_za16 (1, w0 + 5, p0, x1 + x2 * 2), + svst1_hor_za16 (1, w0 + 5, p0, x1 + x2 * 2)) + +/* +** st1_za16_0_8: +** add (w1[2-5]), w0, #?8 +** st1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_0_8, + svst1_hor_za16 (0, w0 + 8, p0, x1), + svst1_hor_za16 (0, w0 + 8, p0, x1)) + +/* +** st1_za16_0_m1: +** sub (w1[2-5]), w0, #?1 +** st1h { za0h\.h\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za16_0_m1, + svst1_hor_za16 (0, w0 - 1, p0, x1), + svst1_hor_za16 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za32.c new file mode 100644 index 00000000000..b7b04771dcb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za32.c @@ -0,0 +1,93 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_za32_0_0: +** mov (w1[2-5]), w0 +** st1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_0_0, + svst1_hor_za32 (0, w0, p0, x1), + svst1_hor_za32 (0, w0, p0, x1)) + +/* +** st1_za32_0_1: +** mov (w1[2-5]), w0 +** st1w { za0h\.s\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_0_1, + svst1_hor_za32 (0, w0 + 1, p0, x1), + svst1_hor_za32 (0, w0 + 1, p0, x1)) + +/* +** st1_za32_0_3: +** mov (w1[2-5]), w0 +** st1w { za0h\.s\[\1, 3\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_0_3, + svst1_hor_za32 (0, w0 + 3, p0, x1), + svst1_hor_za32 (0, w0 + 3, p0, x1)) + +/* +** st1_za32_3_0: +** mov (w1[2-5]), w0 +** st1w { za3h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_3_0, + svst1_hor_za32 (3, w0, p0, x1), + svst1_hor_za32 (3, w0, p0, x1)) + +/* +** 
st1_za32_3_1: +** mov (w1[2-5]), w0 +** st1w { za3h\.s\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_3_1, + svst1_hor_za32 (3, w0 + 1, p0, x1), + svst1_hor_za32 (3, w0 + 1, p0, x1)) + +/* +** st1_za32_3_3: +** mov (w1[2-5]), w0 +** st1w { za3h\.s\[\1, 3\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_3_3, + svst1_hor_za32 (3, w0 + 3, p0, x1), + svst1_hor_za32 (3, w0 + 3, p0, x1)) + +/* +** st1_za32_1_2_index: +** mov (w1[2-5]), w0 +** st1w { za1h\.s\[\1, 2\] }, p0/z, \[x1, x2, lsl #?2\] +** ret +*/ +TEST_STORE_ZA (st1_za32_1_2_index, + svst1_hor_za32 (1, w0 + 2, p0, x1 + x2 * 4), + svst1_hor_za32 (1, w0 + 2, p0, x1 + x2 * 4)) + +/* +** st1_za32_0_4: +** add (w1[2-5]), w0, #?4 +** st1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_0_4, + svst1_hor_za32 (0, w0 + 4, p0, x1), + svst1_hor_za32 (0, w0 + 4, p0, x1)) + +/* +** st1_za32_0_m1: +** sub (w1[2-5]), w0, #?1 +** st1w { za0h\.s\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za32_0_m1, + svst1_hor_za32 (0, w0 - 1, p0, x1), + svst1_hor_za32 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za64.c new file mode 100644 index 00000000000..7c50d5768ca --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za64.c @@ -0,0 +1,73 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_za64_0_0: +** mov (w1[2-5]), w0 +** st1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za64_0_0, + svst1_hor_za64 (0, w0, p0, x1), + svst1_hor_za64 (0, w0, p0, x1)) + +/* +** st1_za64_0_1: +** mov (w1[2-5]), w0 +** st1d { za0h\.d\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za64_0_1, + svst1_hor_za64 (0, w0 + 1, p0, x1), + svst1_hor_za64 (0, w0 + 1, p0, x1)) + +/* +** st1_za64_7_0: +** mov (w1[2-5]), w0 +** st1d { za7h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za64_7_0, + svst1_hor_za64 (7, w0, p0, x1), + svst1_hor_za64 (7, w0, p0, x1)) + +/* +** st1_za64_7_1: +** mov (w1[2-5]), w0 +** st1d { za7h\.d\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za64_7_1, + svst1_hor_za64 (7, w0 + 1, p0, x1), + svst1_hor_za64 (7, w0 + 1, p0, x1)) + +/* +** st1_za64_5_1_index: +** mov (w1[2-5]), w0 +** st1d { za5h\.d\[\1, 1\] }, p0/z, \[x1, x2, lsl #?3\] +** ret +*/ +TEST_STORE_ZA (st1_za64_5_1_index, + svst1_hor_za64 (5, w0 + 1, p0, x1 + x2 * 8), + svst1_hor_za64 (5, w0 + 1, p0, x1 + x2 * 8)) + +/* +** st1_za64_0_2: +** add (w1[2-5]), w0, #?2 +** st1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za64_0_2, + svst1_hor_za64 (0, w0 + 2, p0, x1), + svst1_hor_za64 (0, w0 + 2, p0, x1)) + +/* +** st1_za64_0_m1: +** sub (w1[2-5]), w0, #?1 +** st1d { za0h\.d\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za64_0_m1, + svst1_hor_za64 (0, w0 - 1, p0, x1), + svst1_hor_za64 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za8.c new file mode 100644 index 00000000000..1a777b9c80e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_hor_za8.c @@ -0,0 +1,63 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** st1_za8_0_0: +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za8_0_0, + svst1_hor_za8 (0, w0, p0, x1), + svst1_hor_za8 (0, w0, p0, x1)) + +/* +** 
st1_za8_0_1: +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\1, 1\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za8_0_1, + svst1_hor_za8 (0, w0 + 1, p0, x1), + svst1_hor_za8 (0, w0 + 1, p0, x1)) + +/* +** st1_za8_0_15: +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\1, 15\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za8_0_15, + svst1_hor_za8 (0, w0 + 15, p0, x1), + svst1_hor_za8 (0, w0 + 15, p0, x1)) + +/* +** st1_za8_0_13_index: +** mov (w1[2-5]), w0 +** st1b { za0h\.b\[\1, 15\] }, p0/z, \[x1, x2\] +** ret +*/ +TEST_STORE_ZA (st1_za8_0_13_index, + svst1_hor_za8 (0, w0 + 15, p0, x1 + x2), + svst1_hor_za8 (0, w0 + 15, p0, x1 + x2)) + +/* +** st1_za8_0_16: +** add (w1[2-5]), w0, #?16 +** st1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za8_0_16, + svst1_hor_za8 (0, w0 + 16, p0, x1), + svst1_hor_za8 (0, w0 + 16, p0, x1)) + +/* +** st1_za8_0_m1: +** sub (w1[2-5]), w0, #?1 +** st1b { za0h\.b\[\1, 0\] }, p0/z, \[x1\] +** ret +*/ +TEST_STORE_ZA (st1_za8_0_m1, + svst1_hor_za8 (0, w0 - 1, p0, x1), + svst1_hor_za8 (0, w0 - 1, p0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za128.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za16.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za32.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za64.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_vnum_za8.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za128.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za16.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za32.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za64.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/st1_ver_za8.c new file mode 100644 index 00000000000..e69de29bb2d diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_s.c new file mode 100644 index 00000000000..29fefd3bcce --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_s.c @@ -0,0 +1,121 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** str_vnum_za_0: +** mov (w1[2-5]), w0 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_0, + svstr_vnum_za (w0, x1, 
0), + svstr_vnum_za (w0, x1, 0)) + +/* +** str_vnum_za_1: +** mov (w1[2-5]), w0 +** str za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_1, + svstr_vnum_za (w0 + 1, x1, 1), + svstr_vnum_za (w0 + 1, x1, 1)) + +/* +** str_vnum_za_13: +** mov (w1[2-5]), w0 +** str za\[\1, 13\], \[x1, #13, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_13, + svstr_vnum_za (w0 + 13, x1, 13), + svstr_vnum_za (w0 + 13, x1, 13)) + +/* +** str_vnum_za_15: +** mov (w1[2-5]), w0 +** str za\[\1, 15\], \[x1, #15, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_15, + svstr_vnum_za (w0 + 15, x1, 15), + svstr_vnum_za (w0 + 15, x1, 15)) + +/* +** str_vnum_za_16: +** ( +** add (w1[2-5]), w0, #?16 +** incb x1, all, mul #16 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1, all, mul #16 +** add (w1[2-5]), w0, #?16 +** str za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_16, + svstr_vnum_za (w0 + 16, x1, 16), + svstr_vnum_za (w0 + 16, x1, 16)) + +/* +** str_vnum_za_m1: +** ( +** sub (w1[2-5]), w0, #?1 +** decb x1 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** decb x1 +** sub (w1[2-5]), w0, #?1 +** str za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_m1, + svstr_vnum_za (w0 - 1, x1, -1), + svstr_vnum_za (w0 - 1, x1, -1)) + +/* +** str_vnum_za_mixed_1: +** add (w1[2-5]), w0, #?1 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_1, + svstr_vnum_za (w0 + 1, x1, 0), + svstr_vnum_za (w0 + 1, x1, 0)) + +/* +** str_vnum_za_mixed_2: +** ( +** mov (w1[2-5]), w0 +** incb x1 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1 +** mov (w1[2-5]), w0 +** str za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_2, + svstr_vnum_za (w0, x1, 1), + svstr_vnum_za (w0, x1, 1)) + +/* +** str_vnum_za_mixed_3: +** ( +** add (w1[2-5]), w0, #?2 +** incb x1 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1 +** add (w1[2-5]), w0, #?2 +** str za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_3, + svstr_vnum_za (w0 + 2, x1, 1), + svstr_vnum_za (w0 + 2, x1, 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_sc.c new file mode 100644 index 00000000000..5f3ffc8d336 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_vnum_za_sc.c @@ -0,0 +1,166 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** str_vnum_za_0: +** mov (w1[2-5]), w0 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_0, + svstr_vnum_za (w0, x1, 0), + svstr_vnum_za (w0, x1, 0)) + +/* +** str_vnum_za_1: +** mov (w1[2-5]), w0 +** str za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_1, + svstr_vnum_za (w0 + 1, x1, 1), + svstr_vnum_za (w0 + 1, x1, 1)) + +/* +** str_vnum_za_13: +** mov (w1[2-5]), w0 +** str za\[\1, 13\], \[x1, #13, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_13, + svstr_vnum_za (w0 + 13, x1, 13), + svstr_vnum_za (w0 + 13, x1, 13)) + +/* +** str_vnum_za_15: +** mov (w1[2-5]), w0 +** str za\[\1, 15\], \[x1, #15, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_15, + svstr_vnum_za (w0 + 15, x1, 15), + svstr_vnum_za (w0 + 15, x1, 15)) + +/* +** str_vnum_za_16: +** ( +** add (w1[2-5]), w0, #?16 +** addsvl (x[0-9]+), x1, #16 +** str za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl 
(x[0-9]+), x1, #16 +** add (w1[2-5]), w0, #?16 +** str za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_16, + svstr_vnum_za (w0 + 16, x1, 16), + svstr_vnum_za (w0 + 16, x1, 16)) + +/* +** str_vnum_za_m1: +** ( +** sub (w1[2-5]), w0, #?1 +** addsvl (x[0-9]+), x1, #-1 +** str za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #-1 +** sub (w1[2-5]), w0, #?1 +** str za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_m1, + svstr_vnum_za (w0 - 1, x1, -1), + svstr_vnum_za (w0 - 1, x1, -1)) + +/* +** str_vnum_za_mixed_1: +** add (w1[2-5]), w0, #?1 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_1, + svstr_vnum_za (w0 + 1, x1, 0), + svstr_vnum_za (w0 + 1, x1, 0)) + +/* +** str_vnum_za_mixed_2: +** ( +** mov (w1[2-5]), w0 +** addsvl (x[0-9]+), x1, #1 +** str za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #1 +** mov (w1[2-5]), w0 +** str za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_2, + svstr_vnum_za (w0, x1, 1), + svstr_vnum_za (w0, x1, 1)) + +/* +** str_vnum_za_mixed_3: +** ( +** add (w1[2-5]), w0, #?2 +** addsvl (x[0-9]+), x1, #1 +** str za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** addsvl (x[0-9]+), x1, #1 +** add (w1[2-5]), w0, #?2 +** str za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_3, + svstr_vnum_za (w0 + 2, x1, 1), + svstr_vnum_za (w0 + 2, x1, 1)) + +/* +** str_vnum_za_mixed_4: +** ... +** addsvl x[0-9]+, x1, #-32 +** ... +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_4, + svstr_vnum_za (w0 + 3, x1, -32), + svstr_vnum_za (w0 + 3, x1, -32)) + +/* +** str_vnum_za_mixed_5: +** ... +** rdsvl x[0-9]+, #1 +** ... +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_5, + svstr_vnum_za (w0 + 3, x1, -33), + svstr_vnum_za (w0 + 3, x1, -33)) + +/* +** str_vnum_za_mixed_6: +** ... +** addsvl x[0-9]+, x1, #31 +** ... +** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_6, + svstr_vnum_za (w0 + 4, x1, 31), + svstr_vnum_za (w0 + 4, x1, 31)) + +/* +** str_vnum_za_mixed_7: +** ... +** rdsvl x[0-9]+, #1 +** ... 
+** ret +*/ +TEST_STORE_ZA (str_vnum_za_mixed_7, + svstr_vnum_za (w0 + 3, x1, 32), + svstr_vnum_za (w0 + 3, x1, 32)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_s.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_s.c new file mode 100644 index 00000000000..c5dafd6019a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_s.c @@ -0,0 +1,104 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** str_za_0: +** mov (w1[2-5]), w0 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_za_0, + svstr_za (w0, x1), + svstr_za (w0, x1)) + +/* +** str_za_1_vnum: +** mov (w1[2-5]), w0 +** str za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_za_1_vnum, + svstr_za (w0 + 1, x1 + svcntsb ()), + svstr_za (w0 + 1, x1 + svcntsb ())) + +/* +** str_za_13_vnum: +** mov (w1[2-5]), w0 +** str za\[\1, 13\], \[x1, #13, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_za_13_vnum, + svstr_za (w0 + 13, x1 + svcntsb () * 13), + svstr_za (w0 + 13, x1 + svcntsb () * 13)) + +/* +** str_za_15_vnum: +** mov (w1[2-5]), w0 +** str za\[\1, 15\], \[x1, #15, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_za_15_vnum, + svstr_za (w0 + 15, x1 + svcntsb () * 15), + svstr_za (w0 + 15, x1 + svcntsb () * 15)) + +/* +** str_za_16_vnum: +** ( +** add (w1[2-5]), w0, #?16 +** incb x1, all, mul #16 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** incb x1, all, mul #16 +** add (w1[2-5]), w0, #?16 +** str za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_za_16_vnum, + svstr_za (w0 + 16, x1 + svcntsb () * 16), + svstr_za (w0 + 16, x1 + svcntsb () * 16)) + +/* +** str_za_m1_vnum: +** ( +** sub (w1[2-5]), w0, #?1 +** decb x1 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** | +** decb x1 +** sub (w1[2-5]), w0, #?1 +** str za\[\2, 0\], \[x1(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_za_m1_vnum, + svstr_za (w0 - 1, x1 - svcntsb ()), + svstr_za (w0 - 1, x1 - svcntsb ())) + +/* +** str_za_2: +** add (w1[2-5]), w0, #?2 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_za_2, + svstr_za (w0 + 2, x1), + svstr_za (w0 + 2, x1)) + +/* +** str_za_offset: +** ( +** mov (w1[2-5]), w0 +** add (x[0-9]+), x1, #?1 +** str za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** add (x[0-9]+), x1, #?1 +** mov (w1[2-5]), w0 +** str za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_za_offset, + svstr_za (w0, x1 + 1), + svstr_za (w0, x1 + 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_sc.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_sc.c new file mode 100644 index 00000000000..770a76648da --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/str_za_sc.c @@ -0,0 +1,51 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** str_za_0: +** mov (w1[2-5]), w0 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_za_0, + svstr_za (w0, x1), + svstr_za (w0, x1)) + +/* +** str_za_1_vnum: +** mov (w1[2-5]), w0 +** str za\[\1, 1\], \[x1, #1, mul vl\] +** ret +*/ +TEST_STORE_ZA (str_za_1_vnum, + svstr_za (w0 + 1, x1 + svcntsb ()), + svstr_za (w0 + 1, x1 + svcntsb ())) + +/* +** str_za_2: +** add (w1[2-5]), w0, #?2 +** str za\[\1, 0\], \[x1(?:, #0, mul vl)?\] +** ret +*/ +TEST_STORE_ZA (str_za_2, + svstr_za (w0 + 2, x1), + svstr_za (w0 + 2, x1)) + +/* +** str_za_offset: +** ( +** mov (w1[2-5]), w0 +** add (x[0-9]+), x1, 
#?1 +** str za\[\1, 0\], \[\2(?:, #0, mul vl)?\] +** | +** add (x[0-9]+), x1, #?1 +** mov (w1[2-5]), w0 +** str za\[\4, 0\], \[\3(?:, #0, mul vl)?\] +** ) +** ret +*/ +TEST_STORE_ZA (str_za_offset, + svstr_za (w0, x1 + 1), + svstr_za (w0, x1 + 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za32.c new file mode 100644 index 00000000000..9dd66f722f4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za32.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** sumopa_za32_s8_0_p0_p1_z0_z4: +** sumopa za0\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (sumopa_za32_s8_0_p0_p1_z0_z4, svint8_t, svuint8_t, + svsumopa_za32_s8_m (0, p0, p1, z0, z4), + svsumopa_za32_m (0, p0, p1, z0, z4)) + +/* +** sumopa_za32_s8_0_p1_p0_z4_z0: +** sumopa za0\.s, p1/m, p0/m, z4\.b, z0\.b +** ret +*/ +TEST_DUAL_ZA (sumopa_za32_s8_0_p1_p0_z4_z0, svuint8_t, svint8_t, + svsumopa_za32_s8_m (0, p1, p0, z4, z0), + svsumopa_za32_m (0, p1, p0, z4, z0)) + +/* +** sumopa_za32_s8_3_p0_p1_z0_z4: +** sumopa za3\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (sumopa_za32_s8_3_p0_p1_z0_z4, svint8_t, svuint8_t, + svsumopa_za32_s8_m (3, p0, p1, z0, z4), + svsumopa_za32_m (3, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za64.c new file mode 100644 index 00000000000..2a78ab85d17 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumopa_za64.c @@ -0,0 +1,32 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** sumopa_za64_s16_0_p0_p1_z0_z4: +** sumopa za0\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (sumopa_za64_s16_0_p0_p1_z0_z4, svint16_t, svuint16_t, + svsumopa_za64_s16_m (0, p0, p1, z0, z4), + svsumopa_za64_m (0, p0, p1, z0, z4)) + +/* +** sumopa_za64_s16_0_p1_p0_z4_z0: +** sumopa za0\.d, p1/m, p0/m, z4\.h, z0\.h +** ret +*/ +TEST_DUAL_ZA (sumopa_za64_s16_0_p1_p0_z4_z0, svuint16_t, svint16_t, + svsumopa_za64_s16_m (0, p1, p0, z4, z0), + svsumopa_za64_m (0, p1, p0, z4, z0)) + +/* +** sumopa_za64_s16_7_p0_p1_z0_z4: +** sumopa za7\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (sumopa_za64_s16_7_p0_p1_z0_z4, svint16_t, svuint16_t, + svsumopa_za64_s16_m (7, p0, p1, z0, z4), + svsumopa_za64_m (7, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za32.c new file mode 100644 index 00000000000..55cb92d1b4c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za32.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** sumops_za32_s8_0_p0_p1_z0_z4: +** sumops za0\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (sumops_za32_s8_0_p0_p1_z0_z4, svint8_t, svuint8_t, + svsumops_za32_s8_m (0, p0, p1, z0, z4), + svsumops_za32_m (0, p0, p1, z0, z4)) + +/* +** sumops_za32_s8_0_p1_p0_z4_z0: +** sumops za0\.s, p1/m, p0/m, z4\.b, z0\.b +** ret +*/ +TEST_DUAL_ZA (sumops_za32_s8_0_p1_p0_z4_z0, svuint8_t, svint8_t, + svsumops_za32_s8_m (0, p1, p0, z4, z0), + svsumops_za32_m (0, p1, p0, z4, z0)) + +/* +** sumops_za32_s8_3_p0_p1_z0_z4: +** sumops za3\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (sumops_za32_s8_3_p0_p1_z0_z4, svint8_t, svuint8_t, + 
svsumops_za32_s8_m (3, p0, p1, z0, z4), + svsumops_za32_m (3, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za64.c new file mode 100644 index 00000000000..910a45b2978 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/sumops_za64.c @@ -0,0 +1,32 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** sumops_za64_s16_0_p0_p1_z0_z4: +** sumops za0\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (sumops_za64_s16_0_p0_p1_z0_z4, svint16_t, svuint16_t, + svsumops_za64_s16_m (0, p0, p1, z0, z4), + svsumops_za64_m (0, p0, p1, z0, z4)) + +/* +** sumops_za64_s16_0_p1_p0_z4_z0: +** sumops za0\.d, p1/m, p0/m, z4\.h, z0\.h +** ret +*/ +TEST_DUAL_ZA (sumops_za64_s16_0_p1_p0_z4_z0, svuint16_t, svint16_t, + svsumops_za64_s16_m (0, p1, p0, z4, z0), + svsumops_za64_m (0, p1, p0, z4, z0)) + +/* +** sumops_za64_s16_7_p0_p1_z0_z4: +** sumops za7\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (sumops_za64_s16_7_p0_p1_z0_z4, svint16_t, svuint16_t, + svsumops_za64_s16_m (7, p0, p1, z0, z4), + svsumops_za64_m (7, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/test_sme_acle.h b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/test_sme_acle.h new file mode 100644 index 00000000000..0b01284b7cc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/test_sme_acle.h @@ -0,0 +1,62 @@ +#ifndef TEST_SME_ACLE_H +#define TEST_SME_ACLE_H 1 + +#if (!defined(STREAMING_COMPATIBLE) \ + && !defined(NON_STREAMING) \ + && !defined(STREAMING)) +#define STREAMING +#endif + +#if !defined(NO_SHARED_ZA) +#define SHARED_ZA +#endif + +#include "../../sve/acle/asm/test_sve_acle.h" + +#include <arm_sme.h> + +#define TEST_LOAD_ZA(NAME, CODE1, CODE2) \ + PROTO (NAME, void, (svbool_t p0, int32_t w0, const char *x1, \ + uint64_t x2)) \ + { \ + INVOKE (CODE1, CODE2); \ + } + +#define TEST_STORE_ZA(NAME, CODE1, CODE2) \ + PROTO (NAME, void, (svbool_t p0, int32_t w0, char *x1, \ + uint64_t x2)) \ + { \ + INVOKE (CODE1, CODE2); \ + } + +#define TEST_READ_ZA(NAME, TYPE, CODE1, CODE2) \ + PROTO (NAME, TYPE, (TYPE z0, TYPE z1, svbool_t p0, \ + int32_t w0)) \ + { \ + INVOKE (CODE1, CODE2); \ + return z0; \ + } + +#define TEST_WRITE_ZA(NAME, TYPE, CODE1, CODE2) \ + PROTO (NAME, void, (TYPE z0, TYPE z1, svbool_t p0, \ + int32_t w0)) \ + { \ + INVOKE (CODE1, CODE2); \ + } + +#define TEST_UNIFORM_ZA(NAME, TYPE, CODE1, CODE2) \ + PROTO (NAME, void, (TYPE z0, TYPE z1, svbool_t p0, \ + svbool_t p1)) \ + { \ + INVOKE (CODE1, CODE2); \ + } + +#define TEST_DUAL_ZA(NAME, TYPE1, TYPE2, CODE1, CODE2) \ + PROTO (NAME, void, (TYPE1 z0, TYPE1 z1, TYPE1 z2, TYPE1 z3, \ + TYPE2 z4, TYPE2 z5, TYPE2 z6, TYPE2 z7, \ + svbool_t p0, svbool_t p1)) \ + { \ + INVOKE (CODE1, CODE2); \ + } + +#endif diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/undef_za.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/undef_za.c new file mode 100644 index 00000000000..9cdca3d907e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/undef_za.c @@ -0,0 +1,33 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** undef_za_1: +** ret +*/ +PROTO (undef_za_1, void, ()) { svundef_za (); } + +/* +** undef_za_2: +** ret +*/ +PROTO (undef_za_2, void, ()) +{ + svzero_za (); + svundef_za (); +} + +/* +** undef_za_3: +** mov (w1[2-5]), #?0 +** str za\[\1,
0\], \[x0\] +** ret +*/ +PROTO (undef_za_3, void, (void *ptr)) +{ + svzero_za (); + svundef_za (); + svstr_za (0, ptr); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za32.c new file mode 100644 index 00000000000..bbc0b6c11d1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za32.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** usmopa_za32_u8_0_p0_p1_z0_z4: +** usmopa za0\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (usmopa_za32_u8_0_p0_p1_z0_z4, svuint8_t, svint8_t, + svusmopa_za32_u8_m (0, p0, p1, z0, z4), + svusmopa_za32_m (0, p0, p1, z0, z4)) + +/* +** usmopa_za32_u8_0_p1_p0_z4_z0: +** usmopa za0\.s, p1/m, p0/m, z4\.b, z0\.b +** ret +*/ +TEST_DUAL_ZA (usmopa_za32_u8_0_p1_p0_z4_z0, svint8_t, svuint8_t, + svusmopa_za32_u8_m (0, p1, p0, z4, z0), + svusmopa_za32_m (0, p1, p0, z4, z0)) + +/* +** usmopa_za32_u8_3_p0_p1_z0_z4: +** usmopa za3\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (usmopa_za32_u8_3_p0_p1_z0_z4, svuint8_t, svint8_t, + svusmopa_za32_u8_m (3, p0, p1, z0, z4), + svusmopa_za32_m (3, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za64.c new file mode 100644 index 00000000000..64ee25bc737 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmopa_za64.c @@ -0,0 +1,32 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** usmopa_za64_u16_0_p0_p1_z0_z4: +** usmopa za0\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (usmopa_za64_u16_0_p0_p1_z0_z4, svuint16_t, svint16_t, + svusmopa_za64_u16_m (0, p0, p1, z0, z4), + svusmopa_za64_m (0, p0, p1, z0, z4)) + +/* +** usmopa_za64_u16_0_p1_p0_z4_z0: +** usmopa za0\.d, p1/m, p0/m, z4\.h, z0\.h +** ret +*/ +TEST_DUAL_ZA (usmopa_za64_u16_0_p1_p0_z4_z0, svint16_t, svuint16_t, + svusmopa_za64_u16_m (0, p1, p0, z4, z0), + svusmopa_za64_m (0, p1, p0, z4, z0)) + +/* +** usmopa_za64_u16_7_p0_p1_z0_z4: +** usmopa za7\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (usmopa_za64_u16_7_p0_p1_z0_z4, svuint16_t, svint16_t, + svusmopa_za64_u16_m (7, p0, p1, z0, z4), + svusmopa_za64_m (7, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za32.c new file mode 100644 index 00000000000..98fd3315735 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za32.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** usmops_za32_u8_0_p0_p1_z0_z4: +** usmops za0\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (usmops_za32_u8_0_p0_p1_z0_z4, svuint8_t, svint8_t, + svusmops_za32_u8_m (0, p0, p1, z0, z4), + svusmops_za32_m (0, p0, p1, z0, z4)) + +/* +** usmops_za32_u8_0_p1_p0_z4_z0: +** usmops za0\.s, p1/m, p0/m, z4\.b, z0\.b +** ret +*/ +TEST_DUAL_ZA (usmops_za32_u8_0_p1_p0_z4_z0, svint8_t, svuint8_t, + svusmops_za32_u8_m (0, p1, p0, z4, z0), + svusmops_za32_m (0, p1, p0, z4, z0)) + +/* +** usmops_za32_u8_3_p0_p1_z0_z4: +** usmops za3\.s, p0/m, p1/m, z0\.b, z4\.b +** ret +*/ +TEST_DUAL_ZA (usmops_za32_u8_3_p0_p1_z0_z4, svuint8_t, svint8_t, + svusmops_za32_u8_m (3, p0, p1, z0, z4), + svusmops_za32_m (3, p0, p1, z0, z4)) diff --git 
a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za64.c new file mode 100644 index 00000000000..e20cdab41c3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/usmops_za64.c @@ -0,0 +1,32 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +#pragma GCC target "+sme-i16i64" + +/* +** usmops_za64_u16_0_p0_p1_z0_z4: +** usmops za0\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (usmops_za64_u16_0_p0_p1_z0_z4, svuint16_t, svint16_t, + svusmops_za64_u16_m (0, p0, p1, z0, z4), + svusmops_za64_m (0, p0, p1, z0, z4)) + +/* +** usmops_za64_u16_0_p1_p0_z4_z0: +** usmops za0\.d, p1/m, p0/m, z4\.h, z0\.h +** ret +*/ +TEST_DUAL_ZA (usmops_za64_u16_0_p1_p0_z4_z0, svint16_t, svuint16_t, + svusmops_za64_u16_m (0, p1, p0, z4, z0), + svusmops_za64_m (0, p1, p0, z4, z0)) + +/* +** usmops_za64_u16_7_p0_p1_z0_z4: +** usmops za7\.d, p0/m, p1/m, z0\.h, z4\.h +** ret +*/ +TEST_DUAL_ZA (usmops_za64_u16_7_p0_p1_z0_z4, svuint16_t, svint16_t, + svusmops_za64_u16_m (7, p0, p1, z0, z4), + svusmops_za64_m (7, p0, p1, z0, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za128.c new file mode 100644 index 00000000000..d2d2897eec7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za128.c @@ -0,0 +1,173 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za128_s8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_0_z0, svint8_t, + svwrite_hor_za128_s8_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s8_0_1_z0: +** add (w1[2-5]), w0, #?1 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_1_z0, svint8_t, + svwrite_hor_za128_s8_m (0, w0 + 1, p0, z0), + svwrite_hor_za128_m (0, w0 + 1, p0, z0)) + +/* +** write_za128_s8_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_m1_z0, svint8_t, + svwrite_hor_za128_s8_m (0, w0 - 1, p0, z0), + svwrite_hor_za128_m (0, w0 - 1, p0, z0)) + +/* +** write_za128_s8_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_1_0_z0, svint8_t, + svwrite_hor_za128_s8_m (1, w0, p0, z0), + svwrite_hor_za128_m (1, w0, p0, z0)) + +/* +** write_za128_s8_15_0_z0: +** mov (w1[2-5]), w0 +** mova za15h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_15_0_z0, svint8_t, + svwrite_hor_za128_s8_m (15, w0, p0, z0), + svwrite_hor_za128_m (15, w0, p0, z0)) + +/* +** write_za128_s8_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z1\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_0_z1, svint8_t, + svwrite_hor_za128_s8_m (0, w0, p0, z1), + svwrite_hor_za128_m (0, w0, p0, z1)) + +/* +** write_za128_u8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u8_0_0_z0, svuint8_t, + svwrite_hor_za128_u8_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s16_0_0_z0, svint16_t, + svwrite_hor_za128_s16_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_u16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], 
p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u16_0_0_z0, svuint16_t, + svwrite_hor_za128_u16_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_f16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_f16_0_0_z0, svfloat16_t, + svwrite_hor_za128_f16_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_bf16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_bf16_0_0_z0, svbfloat16_t, + svwrite_hor_za128_bf16_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s32_0_0_z0, svint32_t, + svwrite_hor_za128_s32_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_u32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u32_0_0_z0, svuint32_t, + svwrite_hor_za128_u32_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_f32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_f32_0_0_z0, svfloat32_t, + svwrite_hor_za128_f32_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s64_0_0_z0, svint64_t, + svwrite_hor_za128_s64_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_u64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u64_0_0_z0, svuint64_t, + svwrite_hor_za128_u64_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) + +/* +** write_za128_f64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_f64_0_0_z0, svfloat64_t, + svwrite_hor_za128_f64_m (0, w0, p0, z0), + svwrite_hor_za128_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za16.c new file mode 100644 index 00000000000..eb5bec4a642 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za16.c @@ -0,0 +1,113 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za16_s16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_0_z0, svint16_t, + svwrite_hor_za16_s16_m (0, w0, p0, z0), + svwrite_hor_za16_m (0, w0, p0, z0)) + +/* +** write_za16_s16_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 1\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_1_z0, svint16_t, + svwrite_hor_za16_s16_m (0, w0 + 1, p0, z0), + svwrite_hor_za16_m (0, w0 + 1, p0, z0)) + +/* +** write_za16_s16_0_7_z0: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 7\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_7_z0, svint16_t, + svwrite_hor_za16_s16_m (0, w0 + 7, p0, z0), + svwrite_hor_za16_m (0, w0 + 7, p0, z0)) + +/* +** write_za16_s16_0_8_z0: +** add (w1[2-5]), w0, #?8 +** mova za0h\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_8_z0, svint16_t, + svwrite_hor_za16_s16_m (0, w0 + 8, p0, z0), + svwrite_hor_za16_m (0, w0 + 8, p0, z0)) + +/* +** write_za16_s16_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0h\.h\[\1, 0\], p0/m, 
z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_m1_z0, svint16_t, + svwrite_hor_za16_s16_m (0, w0 - 1, p0, z0), + svwrite_hor_za16_m (0, w0 - 1, p0, z0)) + +/* +** write_za16_s16_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1h\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_1_0_z0, svint16_t, + svwrite_hor_za16_s16_m (1, w0, p0, z0), + svwrite_hor_za16_m (1, w0, p0, z0)) + +/* +** write_za16_s16_1_7_z0: +** mov (w1[2-5]), w0 +** mova za1h\.h\[\1, 7\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_1_7_z0, svint16_t, + svwrite_hor_za16_s16_m (1, w0 + 7, p0, z0), + svwrite_hor_za16_m (1, w0 + 7, p0, z0)) + +/* +** write_za16_s16_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 0\], p0/m, z1\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_0_z1, svint16_t, + svwrite_hor_za16_s16_m (0, w0, p0, z1), + svwrite_hor_za16_m (0, w0, p0, z1)) + +/* +** write_za16_u16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_u16_0_0_z0, svuint16_t, + svwrite_hor_za16_u16_m (0, w0, p0, z0), + svwrite_hor_za16_m (0, w0, p0, z0)) + +/* +** write_za16_f16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_f16_0_0_z0, svfloat16_t, + svwrite_hor_za16_f16_m (0, w0, p0, z0), + svwrite_hor_za16_m (0, w0, p0, z0)) + +/* +** write_za16_bf16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_bf16_0_0_z0, svbfloat16_t, + svwrite_hor_za16_bf16_m (0, w0, p0, z0), + svwrite_hor_za16_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za32.c new file mode 100644 index 00000000000..24f900dce97 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za32.c @@ -0,0 +1,123 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za32_s32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_0_z0, svint32_t, + svwrite_hor_za32_s32_m (0, w0, p0, z0), + svwrite_hor_za32_m (0, w0, p0, z0)) + +/* +** write_za32_s32_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0h\.s\[\1, 1\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_1_z0, svint32_t, + svwrite_hor_za32_s32_m (0, w0 + 1, p0, z0), + svwrite_hor_za32_m (0, w0 + 1, p0, z0)) + +/* +** write_za32_s32_0_3_z0: +** mov (w1[2-5]), w0 +** mova za0h\.s\[\1, 3\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_3_z0, svint32_t, + svwrite_hor_za32_s32_m (0, w0 + 3, p0, z0), + svwrite_hor_za32_m (0, w0 + 3, p0, z0)) + +/* +** write_za32_s32_0_4_z0: +** add (w1[2-5]), w0, #?4 +** mova za0h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_4_z0, svint32_t, + svwrite_hor_za32_s32_m (0, w0 + 4, p0, z0), + svwrite_hor_za32_m (0, w0 + 4, p0, z0)) + +/* +** write_za32_s32_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_m1_z0, svint32_t, + svwrite_hor_za32_s32_m (0, w0 - 1, p0, z0), + svwrite_hor_za32_m (0, w0 - 1, p0, z0)) + +/* +** write_za32_s32_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_1_0_z0, svint32_t, + svwrite_hor_za32_s32_m (1, w0, p0, z0), + svwrite_hor_za32_m (1, w0, p0, z0)) + +/* +** write_za32_s32_1_3_z0: +** mov (w1[2-5]), w0 +** mova za1h\.s\[\1, 3\], p0/m, z0\.s +** ret +*/ 
+TEST_WRITE_ZA (write_za32_s32_1_3_z0, svint32_t, + svwrite_hor_za32_s32_m (1, w0 + 3, p0, z0), + svwrite_hor_za32_m (1, w0 + 3, p0, z0)) + +/* +** write_za32_s32_3_0_z0: +** mov (w1[2-5]), w0 +** mova za3h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_3_0_z0, svint32_t, + svwrite_hor_za32_s32_m (3, w0, p0, z0), + svwrite_hor_za32_m (3, w0, p0, z0)) + +/* +** write_za32_s32_3_3_z0: +** mov (w1[2-5]), w0 +** mova za3h\.s\[\1, 3\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_3_3_z0, svint32_t, + svwrite_hor_za32_s32_m (3, w0 + 3, p0, z0), + svwrite_hor_za32_m (3, w0 + 3, p0, z0)) + +/* +** write_za32_s32_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0h\.s\[\1, 0\], p0/m, z1\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_0_z1, svint32_t, + svwrite_hor_za32_s32_m (0, w0, p0, z1), + svwrite_hor_za32_m (0, w0, p0, z1)) + +/* +** write_za32_u32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_u32_0_0_z0, svuint32_t, + svwrite_hor_za32_u32_m (0, w0, p0, z0), + svwrite_hor_za32_m (0, w0, p0, z0)) + +/* +** write_za32_f32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_f32_0_0_z0, svfloat32_t, + svwrite_hor_za32_f32_m (0, w0, p0, z0), + svwrite_hor_za32_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za64.c new file mode 100644 index 00000000000..ee77acdde35 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za64.c @@ -0,0 +1,113 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za64_s64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_0_z0, svint64_t, + svwrite_hor_za64_s64_m (0, w0, p0, z0), + svwrite_hor_za64_m (0, w0, p0, z0)) + +/* +** write_za64_s64_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0h\.d\[\1, 1\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_1_z0, svint64_t, + svwrite_hor_za64_s64_m (0, w0 + 1, p0, z0), + svwrite_hor_za64_m (0, w0 + 1, p0, z0)) + +/* +** write_za64_s64_0_2_z0: +** add (w1[2-5]), w0, #?2 +** mova za0h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_2_z0, svint64_t, + svwrite_hor_za64_s64_m (0, w0 + 2, p0, z0), + svwrite_hor_za64_m (0, w0 + 2, p0, z0)) + +/* +** write_za64_s64_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_m1_z0, svint64_t, + svwrite_hor_za64_s64_m (0, w0 - 1, p0, z0), + svwrite_hor_za64_m (0, w0 - 1, p0, z0)) + +/* +** write_za64_s64_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_1_0_z0, svint64_t, + svwrite_hor_za64_s64_m (1, w0, p0, z0), + svwrite_hor_za64_m (1, w0, p0, z0)) + +/* +** write_za64_s64_1_1_z0: +** mov (w1[2-5]), w0 +** mova za1h\.d\[\1, 1\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_1_1_z0, svint64_t, + svwrite_hor_za64_s64_m (1, w0 + 1, p0, z0), + svwrite_hor_za64_m (1, w0 + 1, p0, z0)) + +/* +** write_za64_s64_7_0_z0: +** mov (w1[2-5]), w0 +** mova za7h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_7_0_z0, svint64_t, + svwrite_hor_za64_s64_m (7, w0, p0, z0), + svwrite_hor_za64_m (7, w0, p0, z0)) + +/* +** write_za64_s64_7_1_z0: +** mov (w1[2-5]), w0 +** mova za7h\.d\[\1, 1\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA 
(write_za64_s64_7_1_z0, svint64_t, + svwrite_hor_za64_s64_m (7, w0 + 1, p0, z0), + svwrite_hor_za64_m (7, w0 + 1, p0, z0)) + +/* +** write_za64_s64_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0h\.d\[\1, 0\], p0/m, z1\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_0_z1, svint64_t, + svwrite_hor_za64_s64_m (0, w0, p0, z1), + svwrite_hor_za64_m (0, w0, p0, z1)) + +/* +** write_za64_u64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_u64_0_0_z0, svuint64_t, + svwrite_hor_za64_u64_m (0, w0, p0, z0), + svwrite_hor_za64_m (0, w0, p0, z0)) + +/* +** write_za64_f64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_f64_0_0_z0, svfloat64_t, + svwrite_hor_za64_f64_m (0, w0, p0, z0), + svwrite_hor_za64_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za8.c new file mode 100644 index 00000000000..7b66406b51a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_hor_za8.c @@ -0,0 +1,73 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za8_s8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_0_z0, svint8_t, + svwrite_hor_za8_s8_m (0, w0, p0, z0), + svwrite_hor_za8_m (0, w0, p0, z0)) + +/* +** write_za8_s8_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0h\.b\[\1, 1\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_1_z0, svint8_t, + svwrite_hor_za8_s8_m (0, w0 + 1, p0, z0), + svwrite_hor_za8_m (0, w0 + 1, p0, z0)) + +/* +** write_za8_s8_0_15_z0: +** mov (w1[2-5]), w0 +** mova za0h\.b\[\1, 15\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_15_z0, svint8_t, + svwrite_hor_za8_s8_m (0, w0 + 15, p0, z0), + svwrite_hor_za8_m (0, w0 + 15, p0, z0)) + +/* +** write_za8_s8_0_16_z0: +** add (w1[2-5]), w0, #?16 +** mova za0h\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_16_z0, svint8_t, + svwrite_hor_za8_s8_m (0, w0 + 16, p0, z0), + svwrite_hor_za8_m (0, w0 + 16, p0, z0)) + +/* +** write_za8_s8_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0h\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_m1_z0, svint8_t, + svwrite_hor_za8_s8_m (0, w0 - 1, p0, z0), + svwrite_hor_za8_m (0, w0 - 1, p0, z0)) + +/* +** write_za8_s8_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0h\.b\[\1, 0\], p0/m, z1\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_0_z1, svint8_t, + svwrite_hor_za8_s8_m (0, w0, p0, z1), + svwrite_hor_za8_m (0, w0, p0, z1)) + +/* +** write_za8_u8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0h\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_u8_0_0_z0, svuint8_t, + svwrite_hor_za8_u8_m (0, w0, p0, z0), + svwrite_hor_za8_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za128.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za128.c new file mode 100644 index 00000000000..2af4ced5ce7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za128.c @@ -0,0 +1,173 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za128_s8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_0_z0, svint8_t, + svwrite_ver_za128_s8_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s8_0_1_z0: +** add 
(w1[2-5]), w0, #?1 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_1_z0, svint8_t, + svwrite_ver_za128_s8_m (0, w0 + 1, p0, z0), + svwrite_ver_za128_m (0, w0 + 1, p0, z0)) + +/* +** write_za128_s8_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_m1_z0, svint8_t, + svwrite_ver_za128_s8_m (0, w0 - 1, p0, z0), + svwrite_ver_za128_m (0, w0 - 1, p0, z0)) + +/* +** write_za128_s8_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_1_0_z0, svint8_t, + svwrite_ver_za128_s8_m (1, w0, p0, z0), + svwrite_ver_za128_m (1, w0, p0, z0)) + +/* +** write_za128_s8_15_0_z0: +** mov (w1[2-5]), w0 +** mova za15v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_15_0_z0, svint8_t, + svwrite_ver_za128_s8_m (15, w0, p0, z0), + svwrite_ver_za128_m (15, w0, p0, z0)) + +/* +** write_za128_s8_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z1\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s8_0_0_z1, svint8_t, + svwrite_ver_za128_s8_m (0, w0, p0, z1), + svwrite_ver_za128_m (0, w0, p0, z1)) + +/* +** write_za128_u8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u8_0_0_z0, svuint8_t, + svwrite_ver_za128_u8_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s16_0_0_z0, svint16_t, + svwrite_ver_za128_s16_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_u16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u16_0_0_z0, svuint16_t, + svwrite_ver_za128_u16_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_f16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_f16_0_0_z0, svfloat16_t, + svwrite_ver_za128_f16_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_bf16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_bf16_0_0_z0, svbfloat16_t, + svwrite_ver_za128_bf16_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s32_0_0_z0, svint32_t, + svwrite_ver_za128_s32_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_u32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u32_0_0_z0, svuint32_t, + svwrite_ver_za128_u32_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_f32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_f32_0_0_z0, svfloat32_t, + svwrite_ver_za128_f32_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_s64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_s64_0_0_z0, svint64_t, + svwrite_ver_za128_s64_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_u64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_u64_0_0_z0, svuint64_t, + svwrite_ver_za128_u64_m (0, w0, p0, z0), + 
svwrite_ver_za128_m (0, w0, p0, z0)) + +/* +** write_za128_f64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.q\[\1, 0\], p0/m, z0\.q +** ret +*/ +TEST_WRITE_ZA (write_za128_f64_0_0_z0, svfloat64_t, + svwrite_ver_za128_f64_m (0, w0, p0, z0), + svwrite_ver_za128_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za16.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za16.c new file mode 100644 index 00000000000..e59c595f594 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za16.c @@ -0,0 +1,113 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za16_s16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_0_z0, svint16_t, + svwrite_ver_za16_s16_m (0, w0, p0, z0), + svwrite_ver_za16_m (0, w0, p0, z0)) + +/* +** write_za16_s16_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 1\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_1_z0, svint16_t, + svwrite_ver_za16_s16_m (0, w0 + 1, p0, z0), + svwrite_ver_za16_m (0, w0 + 1, p0, z0)) + +/* +** write_za16_s16_0_7_z0: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 7\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_7_z0, svint16_t, + svwrite_ver_za16_s16_m (0, w0 + 7, p0, z0), + svwrite_ver_za16_m (0, w0 + 7, p0, z0)) + +/* +** write_za16_s16_0_8_z0: +** add (w1[2-5]), w0, #?8 +** mova za0v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_8_z0, svint16_t, + svwrite_ver_za16_s16_m (0, w0 + 8, p0, z0), + svwrite_ver_za16_m (0, w0 + 8, p0, z0)) + +/* +** write_za16_s16_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_m1_z0, svint16_t, + svwrite_ver_za16_s16_m (0, w0 - 1, p0, z0), + svwrite_ver_za16_m (0, w0 - 1, p0, z0)) + +/* +** write_za16_s16_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_1_0_z0, svint16_t, + svwrite_ver_za16_s16_m (1, w0, p0, z0), + svwrite_ver_za16_m (1, w0, p0, z0)) + +/* +** write_za16_s16_1_7_z0: +** mov (w1[2-5]), w0 +** mova za1v\.h\[\1, 7\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_1_7_z0, svint16_t, + svwrite_ver_za16_s16_m (1, w0 + 7, p0, z0), + svwrite_ver_za16_m (1, w0 + 7, p0, z0)) + +/* +** write_za16_s16_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 0\], p0/m, z1\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_s16_0_0_z1, svint16_t, + svwrite_ver_za16_s16_m (0, w0, p0, z1), + svwrite_ver_za16_m (0, w0, p0, z1)) + +/* +** write_za16_u16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_u16_0_0_z0, svuint16_t, + svwrite_ver_za16_u16_m (0, w0, p0, z0), + svwrite_ver_za16_m (0, w0, p0, z0)) + +/* +** write_za16_f16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_f16_0_0_z0, svfloat16_t, + svwrite_ver_za16_f16_m (0, w0, p0, z0), + svwrite_ver_za16_m (0, w0, p0, z0)) + +/* +** write_za16_bf16_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.h\[\1, 0\], p0/m, z0\.h +** ret +*/ +TEST_WRITE_ZA (write_za16_bf16_0_0_z0, svbfloat16_t, + svwrite_ver_za16_bf16_m (0, w0, p0, z0), + svwrite_ver_za16_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za32.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za32.c new file mode 100644 index 00000000000..47fc6272038 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za32.c @@ -0,0 +1,123 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za32_s32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_0_z0, svint32_t, + svwrite_ver_za32_s32_m (0, w0, p0, z0), + svwrite_ver_za32_m (0, w0, p0, z0)) + +/* +** write_za32_s32_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0v\.s\[\1, 1\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_1_z0, svint32_t, + svwrite_ver_za32_s32_m (0, w0 + 1, p0, z0), + svwrite_ver_za32_m (0, w0 + 1, p0, z0)) + +/* +** write_za32_s32_0_3_z0: +** mov (w1[2-5]), w0 +** mova za0v\.s\[\1, 3\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_3_z0, svint32_t, + svwrite_ver_za32_s32_m (0, w0 + 3, p0, z0), + svwrite_ver_za32_m (0, w0 + 3, p0, z0)) + +/* +** write_za32_s32_0_4_z0: +** add (w1[2-5]), w0, #?4 +** mova za0v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_4_z0, svint32_t, + svwrite_ver_za32_s32_m (0, w0 + 4, p0, z0), + svwrite_ver_za32_m (0, w0 + 4, p0, z0)) + +/* +** write_za32_s32_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_m1_z0, svint32_t, + svwrite_ver_za32_s32_m (0, w0 - 1, p0, z0), + svwrite_ver_za32_m (0, w0 - 1, p0, z0)) + +/* +** write_za32_s32_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_1_0_z0, svint32_t, + svwrite_ver_za32_s32_m (1, w0, p0, z0), + svwrite_ver_za32_m (1, w0, p0, z0)) + +/* +** write_za32_s32_1_3_z0: +** mov (w1[2-5]), w0 +** mova za1v\.s\[\1, 3\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_1_3_z0, svint32_t, + svwrite_ver_za32_s32_m (1, w0 + 3, p0, z0), + svwrite_ver_za32_m (1, w0 + 3, p0, z0)) + +/* +** write_za32_s32_3_0_z0: +** mov (w1[2-5]), w0 +** mova za3v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_3_0_z0, svint32_t, + svwrite_ver_za32_s32_m (3, w0, p0, z0), + svwrite_ver_za32_m (3, w0, p0, z0)) + +/* +** write_za32_s32_3_3_z0: +** mov (w1[2-5]), w0 +** mova za3v\.s\[\1, 3\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_3_3_z0, svint32_t, + svwrite_ver_za32_s32_m (3, w0 + 3, p0, z0), + svwrite_ver_za32_m (3, w0 + 3, p0, z0)) + +/* +** write_za32_s32_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0v\.s\[\1, 0\], p0/m, z1\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_s32_0_0_z1, svint32_t, + svwrite_ver_za32_s32_m (0, w0, p0, z1), + svwrite_ver_za32_m (0, w0, p0, z1)) + +/* +** write_za32_u32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_u32_0_0_z0, svuint32_t, + svwrite_ver_za32_u32_m (0, w0, p0, z0), + svwrite_ver_za32_m (0, w0, p0, z0)) + +/* +** write_za32_f32_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.s\[\1, 0\], p0/m, z0\.s +** ret +*/ +TEST_WRITE_ZA (write_za32_f32_0_0_z0, svfloat32_t, + svwrite_ver_za32_f32_m (0, w0, p0, z0), + svwrite_ver_za32_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za64.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za64.c new file mode 100644 index 00000000000..8eca5849bc8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za64.c @@ -0,0 +1,113 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za64_s64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.d\[\1, 0\], 
p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_0_z0, svint64_t, + svwrite_ver_za64_s64_m (0, w0, p0, z0), + svwrite_ver_za64_m (0, w0, p0, z0)) + +/* +** write_za64_s64_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0v\.d\[\1, 1\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_1_z0, svint64_t, + svwrite_ver_za64_s64_m (0, w0 + 1, p0, z0), + svwrite_ver_za64_m (0, w0 + 1, p0, z0)) + +/* +** write_za64_s64_0_2_z0: +** add (w1[2-5]), w0, #?2 +** mova za0v\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_2_z0, svint64_t, + svwrite_ver_za64_s64_m (0, w0 + 2, p0, z0), + svwrite_ver_za64_m (0, w0 + 2, p0, z0)) + +/* +** write_za64_s64_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0v\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_m1_z0, svint64_t, + svwrite_ver_za64_s64_m (0, w0 - 1, p0, z0), + svwrite_ver_za64_m (0, w0 - 1, p0, z0)) + +/* +** write_za64_s64_1_0_z0: +** mov (w1[2-5]), w0 +** mova za1v\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_1_0_z0, svint64_t, + svwrite_ver_za64_s64_m (1, w0, p0, z0), + svwrite_ver_za64_m (1, w0, p0, z0)) + +/* +** write_za64_s64_1_1_z0: +** mov (w1[2-5]), w0 +** mova za1v\.d\[\1, 1\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_1_1_z0, svint64_t, + svwrite_ver_za64_s64_m (1, w0 + 1, p0, z0), + svwrite_ver_za64_m (1, w0 + 1, p0, z0)) + +/* +** write_za64_s64_7_0_z0: +** mov (w1[2-5]), w0 +** mova za7v\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_7_0_z0, svint64_t, + svwrite_ver_za64_s64_m (7, w0, p0, z0), + svwrite_ver_za64_m (7, w0, p0, z0)) + +/* +** write_za64_s64_7_1_z0: +** mov (w1[2-5]), w0 +** mova za7v\.d\[\1, 1\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_7_1_z0, svint64_t, + svwrite_ver_za64_s64_m (7, w0 + 1, p0, z0), + svwrite_ver_za64_m (7, w0 + 1, p0, z0)) + +/* +** write_za64_s64_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0v\.d\[\1, 0\], p0/m, z1\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_s64_0_0_z1, svint64_t, + svwrite_ver_za64_s64_m (0, w0, p0, z1), + svwrite_ver_za64_m (0, w0, p0, z1)) + +/* +** write_za64_u64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_u64_0_0_z0, svuint64_t, + svwrite_ver_za64_u64_m (0, w0, p0, z0), + svwrite_ver_za64_m (0, w0, p0, z0)) + +/* +** write_za64_f64_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.d\[\1, 0\], p0/m, z0\.d +** ret +*/ +TEST_WRITE_ZA (write_za64_f64_0_0_z0, svfloat64_t, + svwrite_ver_za64_f64_m (0, w0, p0, z0), + svwrite_ver_za64_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za8.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za8.c new file mode 100644 index 00000000000..fc48ba70970 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/write_ver_za8.c @@ -0,0 +1,73 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sme_acle.h" + +/* +** write_za8_s8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_0_z0, svint8_t, + svwrite_ver_za8_s8_m (0, w0, p0, z0), + svwrite_ver_za8_m (0, w0, p0, z0)) + +/* +** write_za8_s8_0_1_z0: +** mov (w1[2-5]), w0 +** mova za0v\.b\[\1, 1\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_1_z0, svint8_t, + svwrite_ver_za8_s8_m (0, w0 + 1, p0, z0), + svwrite_ver_za8_m (0, w0 + 1, p0, z0)) + +/* +** write_za8_s8_0_15_z0: +** mov (w1[2-5]), w0 +** mova za0v\.b\[\1, 15\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA 
(write_za8_s8_0_15_z0, svint8_t, + svwrite_ver_za8_s8_m (0, w0 + 15, p0, z0), + svwrite_ver_za8_m (0, w0 + 15, p0, z0)) + +/* +** write_za8_s8_0_16_z0: +** add (w1[2-5]), w0, #?16 +** mova za0v\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_16_z0, svint8_t, + svwrite_ver_za8_s8_m (0, w0 + 16, p0, z0), + svwrite_ver_za8_m (0, w0 + 16, p0, z0)) + +/* +** write_za8_s8_0_m1_z0: +** sub (w1[2-5]), w0, #?1 +** mova za0v\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_m1_z0, svint8_t, + svwrite_ver_za8_s8_m (0, w0 - 1, p0, z0), + svwrite_ver_za8_m (0, w0 - 1, p0, z0)) + +/* +** write_za8_s8_0_0_z1: +** mov (w1[2-5]), w0 +** mova za0v\.b\[\1, 0\], p0/m, z1\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_s8_0_0_z1, svint8_t, + svwrite_ver_za8_s8_m (0, w0, p0, z1), + svwrite_ver_za8_m (0, w0, p0, z1)) + +/* +** write_za8_u8_0_0_z0: +** mov (w1[2-5]), w0 +** mova za0v\.b\[\1, 0\], p0/m, z0\.b +** ret +*/ +TEST_WRITE_ZA (write_za8_u8_0_0_z0, svuint8_t, + svwrite_ver_za8_u8_m (0, w0, p0, z0), + svwrite_ver_za8_m (0, w0, p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c new file mode 100644 index 00000000000..9ce7331ebdd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_mask_za.c @@ -0,0 +1,130 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** zero_mask_za_0: +** zero { *} +** ret +*/ +PROTO (zero_mask_za_0, void, ()) { svzero_mask_za (0); } + +/* +** zero_mask_za_01: +** zero { za0\.d } +** ret +*/ +PROTO (zero_mask_za_01, void, ()) { svzero_mask_za (0x01); } + +/* +** zero_mask_za_80: +** zero { za7\.d } +** ret +*/ +PROTO (zero_mask_za_80, void, ()) { svzero_mask_za (0x80); } + +/* +** zero_mask_za_03: +** zero { za0\.d, za1\.d } +** ret +*/ +PROTO (zero_mask_za_03, void, ()) { svzero_mask_za (0x03); } + +/* +** zero_mask_za_09: +** zero { za0\.d, za3\.d } +** ret +*/ +PROTO (zero_mask_za_09, void, ()) { svzero_mask_za (0x09); } + +/* +** zero_mask_za_0d: +** zero { za0\.d, za2\.d, za3\.d } +** ret +*/ +PROTO (zero_mask_za_0d, void, ()) { svzero_mask_za (0x0d); } + +/* +** zero_mask_za_3c: +** zero { za2\.d, za3\.d, za4\.d, za5\.d } +** ret +*/ +PROTO (zero_mask_za_3c, void, ()) { svzero_mask_za (0x3c); } + +/* +** zero_mask_za_5a: +** zero { za1\.d, za3\.d, za4\.d, za6\.d } +** ret +*/ +PROTO (zero_mask_za_5a, void, ()) { svzero_mask_za (0x5a); } + +/* +** zero_mask_za_11: +** zero { za0\.s } +** ret +*/ +PROTO (zero_mask_za_11, void, ()) { svzero_mask_za (0x11); } + +/* +** zero_mask_za_88: +** zero { za3\.s } +** ret +*/ +PROTO (zero_mask_za_88, void, ()) { svzero_mask_za (0x88); } + +/* +** zero_mask_za_33: +** zero { za0\.s, za1\.s } +** ret +*/ +PROTO (zero_mask_za_33, void, ()) { svzero_mask_za (0x33); } + +/* +** zero_mask_za_cc: +** zero { za2\.s, za3\.s } +** ret +*/ +PROTO (zero_mask_za_cc, void, ()) { svzero_mask_za (0xcc); } + +/* +** zero_mask_za_55: +** zero { za0\.h } +** ret +*/ +PROTO (zero_mask_za_55, void, ()) { svzero_mask_za (0x55); } + +/* +** zero_mask_za_aa: +** zero { za1\.h } +** ret +*/ +PROTO (zero_mask_za_aa, void, ()) { svzero_mask_za (0xaa); } + +/* +** zero_mask_za_ab: +** zero { za1\.h, za0\.d } +** ret +*/ +PROTO (zero_mask_za_ab, void, ()) { svzero_mask_za (0xab); } + +/* +** zero_mask_za_d7: +** zero { za0\.h, za1\.d, za7\.d } +** ret +*/ +PROTO (zero_mask_za_d7, void, ()) { svzero_mask_za (0xd7); } + +/* +** zero_mask_za_bf: 
+** zero { za1\.h, za0\.s, za2\.d } +** ret +*/ +PROTO (zero_mask_za_bf, void, ()) { svzero_mask_za (0xbf); } + +/* +** zero_mask_za_ff: +** zero { za } +** ret +*/ +PROTO (zero_mask_za_ff, void, ()) { svzero_mask_za (0xff); } diff --git a/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_za.c b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_za.c new file mode 100644 index 00000000000..4688d0950ce --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/acle-asm/zero_za.c @@ -0,0 +1,11 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#define STREAMING_COMPATIBLE +#include "test_sme_acle.h" + +/* +** zero_za: +** zero { za } +** ret +*/ +PROTO (zero_za, void, ()) { svzero_za (); } diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h index 5ee272e270c..a0b400ccd21 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h @@ -11,12 +11,22 @@ #error "Please define -DTEST_OVERLOADS or -DTEST_FULL" #endif -#ifdef STREAMING_COMPATIBLE -#define ATTR __attribute__ ((arm_streaming_compatible)) +#if defined(STREAMING_COMPATIBLE) +#define SM_ATTR __attribute__ ((arm_streaming_compatible)) +#elif defined(STREAMING) +#define SM_ATTR __attribute__ ((arm_streaming)) #else -#define ATTR +#define SM_ATTR #endif +#ifdef SHARED_ZA +#define ZA_ATTR __attribute__ ((arm_shared_za)) +#else +#define ZA_ATTR +#endif + +#define ATTR SM_ATTR ZA_ATTR + #ifdef __cplusplus #define PROTO(NAME, RET, ARGS) \ extern "C" RET ATTR NAME ARGS; RET ATTR NAME ARGS diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c new file mode 100644 index 00000000000..98f1cfcbefd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_int_m_1.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svint8_t s8, svuint8_t u8, + svint16_t s16, svuint16_t u16, svfloat16_t f16, uint32_t tile) +{ + svusmopa_za32_m (0, pg, pg, u8); /* { dg-error {too few arguments to function 'svusmopa_za32_m'} } */ + svusmopa_za32_m (0, pg, pg, u8, s8, 0); /* { dg-error {too many arguments to function 'svusmopa_za32_m'} } */ + svusmopa_za32_m (tile, pg, pg, u8, s8); /* { dg-error {argument 1 of 'svusmopa_za32_m' must be an integer constant expression} } */ + svusmopa_za32_m (-1, pg, pg, u8, s8); /* { dg-error {passing -1 to argument 1 of 'svusmopa_za32_m', which expects a value in the range \[0, 3\]} } */ + svusmopa_za32_m (4, pg, pg, u8, s8); /* { dg-error {passing 4 to argument 1 of 'svusmopa_za32_m', which expects a value in the range \[0, 3\]} } */ + svusmopa_za32_m (0, u8, pg, u8, s8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svusmopa_za32_m', which expects 'svbool_t'} } */ + svusmopa_za32_m (0, pg, u8, u8, s8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svusmopa_za32_m', which expects 'svbool_t'} } */ + svusmopa_za32_m (0, pg, pg, tile, s8); /* { dg-error {passing 'uint32_t'.* to argument 4 of 'svusmopa_za32_m', which expects an SVE vector type} } */ + svusmopa_za32_m (0, pg, pg, s8, s8); /* { dg-error {'svusmopa_za32_m' has no form that takes 'svint8_t' arguments} } */ + svusmopa_za32_m (0, pg, pg, pg, s8); /* { dg-error {'svusmopa_za32_m' has no form that takes 'svbool_t' arguments} } */ + 
svusmopa_za32_m (0, pg, pg, f16, s8); /* { dg-error {'svusmopa_za32_m' has no form that takes 'svfloat16_t' arguments} } */ + svusmopa_za32_m (0, pg, pg, u8, u8); /* { dg-error {passing 'svuint8_t' to argument 5 of 'svusmopa_za32_m', which expects a vector of signed integers} } */ + svusmopa_za32_m (0, pg, pg, u8, s16); /* { dg-error {arguments 4 and 5 of 'svusmopa_za32_m' must have the same element size, but the values passed here have type 'svuint8_t' and 'svint16_t' respectively} } */ + svusmopa_za32_m (0, pg, pg, u16, s16); /* { dg-error {'svusmopa_za32_m' has no form that takes 'svuint16_t' arguments} } */ + + svusmopa_za64_m (0, pg, pg, u16, s16); /* { dg-error {ACLE function 'svusmopa_za64_u16_m' requires ISA extension 'sme-i16i64'} } */ +} + +void __attribute__((arm_streaming)) +f2 (svbool_t pg, svint8_t s8, svuint8_t u8) +{ + svusmopa_za32_m (0, pg, pg, u8, s8); /* { dg-error {ACLE function 'svusmopa_za32_u8_m' can only be called from a function that has ZA state} } */ +} + +void __attribute__((arm_shared_za)) +f3 (svbool_t pg, svint8_t s8, svuint8_t u8) +{ + svusmopa_za32_m (0, pg, pg, u8, s8); /* { dg-error {ACLE function 'svusmopa_za32_u8_m' can only be called when SME streaming mode is enabled} } */ +} + +#pragma GCC target ("arch=armv8.2-a+sme-i16i64") + +void __attribute__((arm_streaming, arm_shared_za)) +f4 (svbool_t pg, svint16_t s16, svuint16_t u16) +{ + svusmopa_za64_m (-1, pg, pg, u16, s16); /* { dg-error {passing -1 to argument 1 of 'svusmopa_za64_m', which expects a value in the range \[0, 7\]} } */ + svusmopa_za64_m (8, pg, pg, u16, s16); /* { dg-error {passing 8 to argument 1 of 'svusmopa_za64_m', which expects a value in the range \[0, 7\]} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c new file mode 100644 index 00000000000..954da6df0a3 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_1.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svint8_t s8, svuint8_t u8, svint16_t s16, svint32_t s32, + svfloat16_t f16, svfloat32_t f32, svfloat64_t f64, uint32_t tile) +{ + svmopa_za32_m (0, pg, pg, s8); /* { dg-error {too few arguments to function 'svmopa_za32_m'} } */ + svmopa_za32_m (0, pg, pg, s8, s8, 0); /* { dg-error {too many arguments to function 'svmopa_za32_m'} } */ + svmopa_za32_m (tile, pg, pg, s8, s8); /* { dg-error {argument 1 of 'svmopa_za32_m' must be an integer constant expression} } */ + svmopa_za32_m (-1, pg, pg, s8, s8); /* { dg-error {passing -1 to argument 1 of 'svmopa_za32_m', which expects a value in the range \[0, 3\]} } */ + svmopa_za32_m (4, pg, pg, s8, s8); /* { dg-error {passing 4 to argument 1 of 'svmopa_za32_m', which expects a value in the range \[0, 3\]} } */ + svmopa_za32_m (0, u8, pg, s8, s8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svmopa_za32_m', which expects 'svbool_t'} } */ + svmopa_za32_m (0, pg, u8, s8, s8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svmopa_za32_m', which expects 'svbool_t'} } */ + svmopa_za32_m (0, pg, pg, tile, s8); /* { dg-error {passing 'uint32_t'.* to argument 4 of 'svmopa_za32_m', which expects an SVE vector type} } */ + svmopa_za32_m (0, pg, pg, u8, s8); /* { dg-error {passing 'svint8_t'.* to argument 5 of 'svmopa_za32_m', but previous arguments had type 'svuint8_t'} } */ + svmopa_za32_m (0, pg, pg, s8, f16); /* { dg-error 
{passing 'svfloat16_t'.* to argument 5 of 'svmopa_za32_m', but previous arguments had type 'svint8_t'} } */ + svmopa_za32_m (0, pg, pg, pg, pg); /* { dg-error {'svmopa_za32_m' has no form that takes 'svbool_t' arguments} } */ + svmopa_za32_m (0, pg, pg, s16, s16); /* { dg-error {'svmopa_za32_m' has no form that takes 'svint16_t' arguments} } */ + svmopa_za32_m (0, pg, pg, s32, s32); /* { dg-error {'svmopa_za32_m' has no form that takes 'svint32_t' arguments} } */ + svmopa_za32_m (0, pg, pg, f64, f64); /* { dg-error {'svmopa_za32_m' has no form that takes 'svfloat64_t' arguments} } */ + + svmopa_za64_m (0, pg, pg, s16, s16); /* { dg-error {ACLE function 'svmopa_za64_s16_m' requires ISA extension 'sme-i16i64'} } */ +} + +void __attribute__((arm_streaming)) +f2 (svbool_t pg, svint8_t s8) +{ + svmopa_za32_m (0, pg, pg, s8, s8); /* { dg-error {ACLE function 'svmopa_za32_s8_m' can only be called from a function that has ZA state} } */ +} + +void __attribute__((arm_shared_za)) +f3 (svbool_t pg, svint8_t s8) +{ + svmopa_za32_m (0, pg, pg, s8, s8); /* { dg-error {ACLE function 'svmopa_za32_s8_m' can only be called when SME streaming mode is enabled} } */ +} + +#pragma GCC target ("arch=armv8.2-a+sme-i16i64") + +void __attribute__((arm_streaming, arm_shared_za)) +f4 (svbool_t pg, svint16_t s16) +{ + svmopa_za64_m (-1, pg, pg, s16, s16); /* { dg-error {passing -1 to argument 1 of 'svmopa_za64_m', which expects a value in the range \[0, 7\]} } */ + svmopa_za64_m (8, pg, pg, s16, s16); /* { dg-error {passing 8 to argument 1 of 'svmopa_za64_m', which expects a value in the range \[0, 7\]} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c new file mode 100644 index 00000000000..70d68b999db --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_m_2.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svfloat64_t f64) +{ + svmopa_za64_m (0, pg, pg, f64, f64); /* { dg-error {ACLE function 'svmopa_za64_f64_m' requires ISA extension 'sme-f64f64'} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c new file mode 100644 index 00000000000..21f392aed25 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_uint_m_1.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svint8_t s8, svuint8_t u8, + svint16_t s16, svuint16_t u16, svfloat16_t f16, uint32_t tile) +{ + svsumopa_za32_m (0, pg, pg, s8); /* { dg-error {too few arguments to function 'svsumopa_za32_m'} } */ + svsumopa_za32_m (0, pg, pg, s8, u8, 0); /* { dg-error {too many arguments to function 'svsumopa_za32_m'} } */ + svsumopa_za32_m (tile, pg, pg, s8, u8); /* { dg-error {argument 1 of 'svsumopa_za32_m' must be an integer constant expression} } */ + svsumopa_za32_m (-1, pg, pg, s8, u8); /* { dg-error {passing -1 to argument 1 of 'svsumopa_za32_m', which expects a value in the range \[0, 3\]} } */ + svsumopa_za32_m (4, pg, pg, s8, u8); /* { dg-error {passing 4 to argument 1 of 'svsumopa_za32_m', which expects a value in the range \[0, 3\]} } */ + svsumopa_za32_m (0, u8, pg, s8, u8); /* { dg-error {passing 'svuint8_t' to argument 2 of 
'svsumopa_za32_m', which expects 'svbool_t'} } */ + svsumopa_za32_m (0, pg, u8, s8, u8); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svsumopa_za32_m', which expects 'svbool_t'} } */ + svsumopa_za32_m (0, pg, pg, tile, s8); /* { dg-error {passing 'uint32_t'.* to argument 4 of 'svsumopa_za32_m', which expects an SVE vector type} } */ + svsumopa_za32_m (0, pg, pg, u8, u8); /* { dg-error {'svsumopa_za32_m' has no form that takes 'svuint8_t' arguments} } */ + svsumopa_za32_m (0, pg, pg, pg, u8); /* { dg-error {'svsumopa_za32_m' has no form that takes 'svbool_t' arguments} } */ + svsumopa_za32_m (0, pg, pg, f16, u8); /* { dg-error {'svsumopa_za32_m' has no form that takes 'svfloat16_t' arguments} } */ + svsumopa_za32_m (0, pg, pg, s8, s8); /* { dg-error {passing 'svint8_t' to argument 5 of 'svsumopa_za32_m', which expects a vector of unsigned integers} } */ + svsumopa_za32_m (0, pg, pg, s8, u16); /* { dg-error {arguments 4 and 5 of 'svsumopa_za32_m' must have the same element size, but the values passed here have type 'svint8_t' and 'svuint16_t' respectively} } */ + svsumopa_za32_m (0, pg, pg, s16, u16); /* { dg-error {'svsumopa_za32_m' has no form that takes 'svint16_t' arguments} } */ + + svsumopa_za64_m (0, pg, pg, s16, u16); /* { dg-error {ACLE function 'svsumopa_za64_s16_m' requires ISA extension 'sme-i16i64'} } */ +} + +void __attribute__((arm_streaming)) +f2 (svbool_t pg, svint8_t s8, svuint8_t u8) +{ + svsumopa_za32_m (0, pg, pg, s8, u8); /* { dg-error {ACLE function 'svsumopa_za32_s8_m' can only be called from a function that has ZA state} } */ +} + +void __attribute__((arm_shared_za)) +f3 (svbool_t pg, svint8_t s8, svuint8_t u8) +{ + svsumopa_za32_m (0, pg, pg, s8, u8); /* { dg-error {ACLE function 'svsumopa_za32_s8_m' can only be called when SME streaming mode is enabled} } */ +} + +#pragma GCC target ("arch=armv8.2-a+sme-i16i64") + +void __attribute__((arm_streaming, arm_shared_za)) +f4 (svbool_t pg, svint16_t s16, svuint16_t u16) +{ + svsumopa_za64_m (-1, pg, pg, s16, u16); /* { dg-error {passing -1 to argument 1 of 'svsumopa_za64_m', which expects a value in the range \[0, 7\]} } */ + svsumopa_za64_m (8, pg, pg, s16, u16); /* { dg-error {passing 8 to argument 1 of 'svsumopa_za64_m', which expects a value in the range \[0, 7\]} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_4.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_4.c index 9591e3d01d6..8ad86a3c024 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_4.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_4.c @@ -4,6 +4,6 @@ to be diagnosed. Any attempt to call the function before including arm_sve.h will lead to a link failure. (Same for taking its address, etc.) */ -extern __SVUint8_t svadd_u8_x (__SVBool_t, __SVUint8_t, __SVUint8_t); +extern __attribute__((arm_preserves_za)) __SVUint8_t svadd_u8_x (__SVBool_t, __SVUint8_t, __SVUint8_t); #pragma GCC aarch64 "arm_sve.h" diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_5.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_5.c index 85923611d8e..466d57a9f1d 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/func_redef_5.c @@ -6,7 +6,7 @@ At the moment this works like other built-ins in the sense that the explicit definition "wins". This isn't supported behavior though. 
*/ -__SVUint8_t +__SVUint8_t __attribute__((arm_preserves_za)) svadd_u8_x (__SVBool_t pg, __SVUint8_t x, __SVUint8_t y) { return x; diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c new file mode 100644 index 00000000000..4581bd7f2fc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/read_za_m_1.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svint8_t s8, svint64_t s64, svuint8_t u8, svuint16_t u16, + svfloat32_t f32, uint32_t tile) +{ + svread_hor_za8_m (s8, pg, 0); /* { dg-error {too few arguments to function 'svread_hor_za8_m'} } */ + svread_hor_za8_m (s8, pg, 0, 0, 0); /* { dg-error {too many arguments to function 'svread_hor_za8_m'} } */ + svread_hor_za8_m (tile, pg, 0, 0); /* { dg-error {passing 'uint32_t'.* to argument 1 of 'svread_hor_za8_m', which expects an SVE vector type} } */ + svread_hor_za8_m (pg, pg, 0, 0); /* { dg-error {'svread_hor_za8_m' has no form that takes 'svbool_t' arguments} } */ + svread_hor_za8_m (u16, pg, 0, 0); /* { dg-error {'svread_hor_za8_m' has no form that takes 'svuint16_t' arguments} } */ + svread_hor_za8_m (s8, s8, 0, 0); /* { dg-error {passing 'svint8_t' to argument 2 of 'svread_hor_za8_m', which expects 'svbool_t'} } */ + svread_hor_za8_m (s8, pg, tile, 0); /* { dg-error {argument 3 of 'svread_hor_za8_m' must be an integer constant expression} } */ + svread_hor_za8_m (s8, pg, -1, 0); /* { dg-error {passing -1 to argument 3 of 'svread_hor_za8_m', which expects the value 0} } */ + svread_hor_za8_m (s8, pg, 1, 0); /* { dg-error {passing 1 to argument 3 of 'svread_hor_za8_m', which expects the value 0} } */ + svread_hor_za8_m (s8, pg, 0, u8); /* { dg-error {passing 'svuint8_t' to argument 4 of 'svread_hor_za8_m', which expects 'uint32_t'} } */ + + svread_hor_za16_m (u16, pg, -1, 0); /* { dg-error {passing -1 to argument 3 of 'svread_hor_za16_m', which expects a value in the range \[0, 1\]} } */ + svread_hor_za16_m (u16, pg, 2, 0); /* { dg-error {passing 2 to argument 3 of 'svread_hor_za16_m', which expects a value in the range \[0, 1\]} } */ + + svread_hor_za32_m (f32, pg, -1, 0); /* { dg-error {passing -1 to argument 3 of 'svread_hor_za32_m', which expects a value in the range \[0, 3\]} } */ + svread_hor_za32_m (f32, pg, 4, 0); /* { dg-error {passing 4 to argument 3 of 'svread_hor_za32_m', which expects a value in the range \[0, 3\]} } */ + + svread_hor_za64_m (s64, pg, -1, 0); /* { dg-error {passing -1 to argument 3 of 'svread_hor_za64_m', which expects a value in the range \[0, 7\]} } */ + svread_hor_za64_m (s64, pg, 8, 0); /* { dg-error {passing 8 to argument 3 of 'svread_hor_za64_m', which expects a value in the range \[0, 7\]} } */ + + svread_hor_za128_m (s8, pg, -1, 0); /* { dg-error {passing -1 to argument 3 of 'svread_hor_za128_m', which expects a value in the range \[0, 15\]} } */ + svread_hor_za128_m (s8, pg, 16, 0); /* { dg-error {passing 16 to argument 3 of 'svread_hor_za128_m', which expects a value in the range \[0, 15\]} } */ + svread_hor_za128_m (f32, pg, -1, 0); /* { dg-error {passing -1 to argument 3 of 'svread_hor_za128_m', which expects a value in the range \[0, 15\]} } */ + svread_hor_za128_m (f32, pg, 16, 0); /* { dg-error {passing 16 to argument 3 of 'svread_hor_za128_m', which expects a value in the range \[0, 15\]} } */ +} + +void __attribute__((arm_streaming)) +f2 (svbool_t pg, svint8_t 
s8) +{ + svread_hor_za8_m (s8, pg, 0, 0); /* { dg-error {ACLE function 'svread_hor_za8_s8_m' can only be called from a function that has ZA state} } */ +} + +void __attribute__((arm_shared_za)) +f3 (svbool_t pg, svint8_t s8) +{ + svread_hor_za8_m (s8, pg, 0, 0); /* { dg-error {ACLE function 'svread_hor_za8_s8_m' can only be called when SME streaming mode is enabled} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c new file mode 100644 index 00000000000..4749601fc18 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_za_m_1.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svuint8_t u8, svint16_t s16, svint32_t s32, svint64_t s64, + svfloat32_t f32, uint32_t tile) +{ + svaddha_za32_m (0, pg, pg); /* { dg-error {too few arguments to function 'svaddha_za32_m'} } */ + svaddha_za32_m (0, pg, pg, s32, s32); /* { dg-error {too many arguments to function 'svaddha_za32_m'} } */ + svaddha_za32_m (tile, pg, pg, s32); /* { dg-error {argument 1 of 'svaddha_za32_m' must be an integer constant expression} } */ + svaddha_za32_m (-1, pg, pg, s32); /* { dg-error {passing -1 to argument 1 of 'svaddha_za32_m', which expects a value in the range \[0, 3\]} } */ + svaddha_za32_m (4, pg, pg, s32); /* { dg-error {passing 4 to argument 1 of 'svaddha_za32_m', which expects a value in the range \[0, 3\]} } */ + svaddha_za32_m (0, u8, pg, s32); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svaddha_za32_m', which expects 'svbool_t'} } */ + svaddha_za32_m (0, pg, u8, s32); /* { dg-error {passing 'svuint8_t' to argument 3 of 'svaddha_za32_m', which expects 'svbool_t'} } */ + svaddha_za32_m (0, pg, pg, tile); /* { dg-error {passing 'uint32_t'.* to argument 4 of 'svaddha_za32_m', which expects an SVE vector type} } */ + svaddha_za32_m (0, pg, pg, pg); /* { dg-error {'svaddha_za32_m' has no form that takes 'svbool_t' arguments} } */ + svaddha_za32_m (0, pg, pg, u8); /* { dg-error {'svaddha_za32_m' has no form that takes 'svuint8_t' arguments} } */ + svaddha_za32_m (0, pg, pg, s16); /* { dg-error {'svaddha_za32_m' has no form that takes 'svint16_t' arguments} } */ + svaddha_za32_m (0, pg, pg, f32); /* { dg-error {'svaddha_za32_m' has no form that takes 'svfloat32_t' arguments} } */ + svaddha_za32_m (0, pg, pg, s64); /* { dg-error {'svaddha_za32_m' has no form that takes 'svint64_t' arguments} } */ + + svaddha_za64_m (0, pg, pg, s64); /* { dg-error {ACLE function 'svaddha_za64_s64_m' requires ISA extension 'sme-i16i64'} } */ +} + +void __attribute__((arm_streaming)) +f2 (svbool_t pg, svint32_t s32) +{ + svaddha_za32_m (0, pg, pg, s32); /* { dg-error {ACLE function 'svaddha_za32_s32_m' can only be called from a function that has ZA state} } */ +} + +void __attribute__((arm_shared_za)) +f3 (svbool_t pg, svint32_t s32) +{ + svaddha_za32_m (0, pg, pg, s32); /* { dg-error {ACLE function 'svaddha_za32_s32_m' can only be called when SME streaming mode is enabled} } */ +} + +#pragma GCC target ("arch=armv8.2-a+sme-i16i64") + +void __attribute__((arm_streaming, arm_shared_za)) +f4 (svbool_t pg, svint64_t s64) +{ + svaddha_za64_m (-1, pg, pg, s64); /* { dg-error {passing -1 to argument 1 of 'svaddha_za64_m', which expects a value in the range \[0, 7\]} } */ + svaddha_za64_m (8, pg, pg, s64); /* { dg-error {passing 8 to argument 1 of 'svaddha_za64_m', which expects a value in 
the range \[0, 7\]} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c new file mode 100644 index 00000000000..f84ca40629b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/write_za_m_1.c @@ -0,0 +1,47 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sme") + +void __attribute__((arm_streaming, arm_shared_za)) +f1 (svbool_t pg, svint8_t s8, svint64_t s64, svuint8_t u8, svuint16_t u16, + svfloat32_t f32, uint32_t tile) +{ + svwrite_ver_za8_m (0, 0, pg); /* { dg-error {too few arguments to function 'svwrite_ver_za8_m'} } */ + svwrite_ver_za8_m (0, 0, pg, s8, 0); /* { dg-error {too many arguments to function 'svwrite_ver_za8_m'} } */ + svwrite_ver_za8_m (tile, 0, pg, s8); /* { dg-error {argument 1 of 'svwrite_ver_za8_m' must be an integer constant expression} } */ + svwrite_ver_za8_m (-1, 0, pg, s8); /* { dg-error {passing -1 to argument 1 of 'svwrite_ver_za8_m', which expects the value 0} } */ + svwrite_ver_za8_m (1, 0, pg, s8); /* { dg-error {passing 1 to argument 1 of 'svwrite_ver_za8_m', which expects the value 0} } */ + svwrite_ver_za8_m (0, u8, pg, s8); /* { dg-error {passing 'svuint8_t' to argument 2 of 'svwrite_ver_za8_m', which expects 'uint32_t'} } */ + svwrite_ver_za8_m (0, 0, s8, s8); /* { dg-error {passing 'svint8_t' to argument 3 of 'svwrite_ver_za8_m', which expects 'svbool_t'} } */ + svwrite_ver_za8_m (0, 0, pg, tile); /* { dg-error {passing 'uint32_t'.* to argument 4 of 'svwrite_ver_za8_m', which expects an SVE vector type} } */ + svwrite_ver_za8_m (0, 0, pg, pg); /* { dg-error {'svwrite_ver_za8_m' has no form that takes 'svbool_t' arguments} } */ + svwrite_ver_za8_m (0, 0, pg, u16); /* { dg-error {'svwrite_ver_za8_m' has no form that takes 'svuint16_t' arguments} } */ + + svwrite_ver_za16_m (-1, 0, pg, u16); /* { dg-error {passing -1 to argument 1 of 'svwrite_ver_za16_m', which expects a value in the range \[0, 1\]} } */ + svwrite_ver_za16_m (2, 0, pg, u16); /* { dg-error {passing 2 to argument 1 of 'svwrite_ver_za16_m', which expects a value in the range \[0, 1\]} } */ + + svwrite_ver_za32_m (-1, 0, pg, f32); /* { dg-error {passing -1 to argument 1 of 'svwrite_ver_za32_m', which expects a value in the range \[0, 3\]} } */ + svwrite_ver_za32_m (4, 0, pg, f32); /* { dg-error {passing 4 to argument 1 of 'svwrite_ver_za32_m', which expects a value in the range \[0, 3\]} } */ + + svwrite_ver_za64_m (-1, 0, pg, s64); /* { dg-error {passing -1 to argument 1 of 'svwrite_ver_za64_m', which expects a value in the range \[0, 7\]} } */ + svwrite_ver_za64_m (8, 0, pg, s64); /* { dg-error {passing 8 to argument 1 of 'svwrite_ver_za64_m', which expects a value in the range \[0, 7\]} } */ + + svwrite_ver_za128_m (-1, 0, pg, s8); /* { dg-error {passing -1 to argument 1 of 'svwrite_ver_za128_m', which expects a value in the range \[0, 15\]} } */ + svwrite_ver_za128_m (16, 0, pg, s8); /* { dg-error {passing 16 to argument 1 of 'svwrite_ver_za128_m', which expects a value in the range \[0, 15\]} } */ + svwrite_ver_za128_m (-1, 0, pg, f32); /* { dg-error {passing -1 to argument 1 of 'svwrite_ver_za128_m', which expects a value in the range \[0, 15\]} } */ + svwrite_ver_za128_m (16, 0, pg, f32); /* { dg-error {passing 16 to argument 1 of 'svwrite_ver_za128_m', which expects a value in the range \[0, 15\]} } */ +} + +void __attribute__((arm_streaming)) +f2 (svbool_t pg, svint8_t s8) +{ + svwrite_ver_za8_m (0, 0, pg, s8); /* { dg-error {ACLE 
function 'svwrite_ver_za8_s8_m' can only be called from a function that has ZA state} } */ +} + +void __attribute__((arm_shared_za)) +f3 (svbool_t pg, svint8_t s8) +{ + svwrite_ver_za8_m (0, 0, pg, s8); /* { dg-error {ACLE function 'svwrite_ver_za8_s8_m' can only be called when SME streaming mode is enabled} } */ +} diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index f6cb16521b3..9fb5d587378 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -10781,7 +10781,8 @@ proc check_effective_target_aarch64_tiny { } { # various architecture extensions via the .arch_extension pseudo-op. foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve" - "i8mm" "f32mm" "f64mm" "bf16" "sb" "sve2" } { + "i8mm" "f32mm" "f64mm" "bf16" "sb" "sve2" "sme" + "sme-i16i64" } { eval [string map [list FUNC $aarch64_ext] { proc check_effective_target_aarch64_asm_FUNC_ok { } { if { [istarget aarch64*-*-*] } { From patchwork Sun Nov 13 10:03:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 60514 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B24013949F2D for ; Sun, 13 Nov 2022 10:03:58 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B24013949F2D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1668333838; bh=2kDqBpRuxK2IBM7n9a20Ua/m4X9PZR6VW/hpH6d0jj8=; h=To:Subject:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=jTI/ZX0+T7SGshW4lKQRh9fLeu1BvbJTJqNlV5DRZ1T1qJXCI/h2RiM8ny2fyEyRT M6B4z1eR9H8iDgdIQ2mPYOHFtev+cirj+6s12NGwBLIkUny7xCtzRlw4dI/P7MR+iP 6ebmrwGW38qvvk4gYsjUpZbNV8Pwd7+Y8U3UnOXw= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 5118C393BA68 for ; Sun, 13 Nov 2022 10:03:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5118C393BA68 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2E47323A for ; Sun, 13 Nov 2022 02:03:23 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.62]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 59AE33F73D for ; Sun, 13 Nov 2022 02:03:16 -0800 (PST) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH 14/16] aarch64: Add support for arm_locally_streaming References: Date: Sun, 13 Nov 2022 10:03:15 +0000 In-Reply-To: (Richard Sandiford's message of "Sun, 13 Nov 2022 09:59:23 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-41.1 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This patch adds support for the arm_locally_streaming attribute, which allows a function to use SME internally without changing the function's ABI. The attribute is valid but redundant for arm_streaming functions. gcc/ * config/aarch64/aarch64.cc (aarch64_attribute_table): Add arm_locally_streaming. (aarch64_fndecl_is_locally_streaming): New function. (aarch64_fndecl_sm_state): Handle arm_locally_streaming functions. (aarch64_cfun_enables_pstate_sm): New function. (aarch64_add_offset): Add an argument that specifies whether the streaming vector length should be used instead of the prevailing one. (aarch64_split_add_offset, aarch64_add_sp, aarch64_sub_sp): Likewise. (aarch64_allocate_and_probe_stack_space): Likewise. (aarch64_expand_mov_immediate): Update calls accordingly. (aarch64_need_old_pstate_sm): Return true for locally-streaming streaming-compatible functions. (aarch64_layout_frame): Force all call-preserved Z and P registers to be saved and restored if the function switches PSTATE.SM in the prologue. (aarch64_get_separate_components): Disable shrink-wrapping of such Z and P saves and restores. (aarch64_use_late_prologue_epilogue): New function. (aarch64_expand_prologue): Measure SVE lengths in the streaming vector length for locally-streaming functions, then emit code to enable streaming mode. Combine separate SMSTART ZA and SMSTART SM instructions into a single SMSTART where possible. (aarch64_expand_epilogue): Likewise in reverse. (TARGET_USE_LATE_PROLOGUE_EPILOGUE): Define. * config/aarch64/aarch64-sme.md (UNSPEC_SMSTART): New unspec. (UNSPEC_SMSTOP): Likewise. (aarch64_smstart, aarch64_smstop): New patterns. gcc/testsuite/ * gcc.target/aarch64/sme/locally_streaming_1.c: New test. * gcc.target/aarch64/sme/locally_streaming_2.c: Likewise. * gcc.target/aarch64/sme/locally_streaming_3.c: Likewise. --- gcc/config/aarch64/aarch64-sme.md | 82 ++++ gcc/config/aarch64/aarch64.cc | 237 ++++++++-- .../aarch64/sme/locally_streaming_1.c | 433 ++++++++++++++++++ .../aarch64/sme/locally_streaming_2.c | 177 +++++++ .../aarch64/sme/locally_streaming_3.c | 273 +++++++++++ 5 files changed, 1164 insertions(+), 38 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c diff --git a/gcc/config/aarch64/aarch64-sme.md b/gcc/config/aarch64/aarch64-sme.md index 7b3ccea2e11..70be7adba28 100644 --- a/gcc/config/aarch64/aarch64-sme.md +++ b/gcc/config/aarch64/aarch64-sme.md @@ -281,6 +281,88 @@ (define_insn_and_split "aarch64_restore_za" DONE; } ) + +;; ------------------------------------------------------------------------- +;; ---- Combined PSTATE.SM and PSTATE.ZA management +;; ------------------------------------------------------------------------- +;; Includes +;; - SMSTART +;; - SMSTOP +;; ------------------------------------------------------------------------- + +(define_c_enum "unspec" [ + UNSPEC_SMSTART + UNSPEC_SMSTOP +]) + +;; Enable SM and ZA, starting with fresh ZA contents. This is only valid when +;; SME is present, but the pattern does not depend on TARGET_SME since it can +;; be used conditionally. 
+(define_insn "aarch64_smstart" + [(unspec_volatile [(const_int 0)] UNSPEC_SMSTART) + (clobber (reg:V4x16QI V0_REGNUM)) + (clobber (reg:V4x16QI V4_REGNUM)) + (clobber (reg:V4x16QI V8_REGNUM)) + (clobber (reg:V4x16QI V12_REGNUM)) + (clobber (reg:V4x16QI V16_REGNUM)) + (clobber (reg:V4x16QI V20_REGNUM)) + (clobber (reg:V4x16QI V24_REGNUM)) + (clobber (reg:V4x16QI V28_REGNUM)) + (clobber (reg:VNx16BI P0_REGNUM)) + (clobber (reg:VNx16BI P1_REGNUM)) + (clobber (reg:VNx16BI P2_REGNUM)) + (clobber (reg:VNx16BI P3_REGNUM)) + (clobber (reg:VNx16BI P4_REGNUM)) + (clobber (reg:VNx16BI P5_REGNUM)) + (clobber (reg:VNx16BI P6_REGNUM)) + (clobber (reg:VNx16BI P7_REGNUM)) + (clobber (reg:VNx16BI P8_REGNUM)) + (clobber (reg:VNx16BI P9_REGNUM)) + (clobber (reg:VNx16BI P10_REGNUM)) + (clobber (reg:VNx16BI P11_REGNUM)) + (clobber (reg:VNx16BI P12_REGNUM)) + (clobber (reg:VNx16BI P13_REGNUM)) + (clobber (reg:VNx16BI P14_REGNUM)) + (clobber (reg:VNx16BI P15_REGNUM)) + (clobber (reg:VNx16QI ZA_REGNUM))] + "" + "smstart" +) + +;; Disable SM and ZA, and discard its current contents. This is only valid +;; when SME is present, but the pattern does not depend on TARGET_SME since +;; it can be used conditionally. +(define_insn "aarch64_smstop" + [(unspec_volatile [(reg:VNx16QI OLD_ZA_REGNUM)] UNSPEC_SMSTOP) + (clobber (reg:V4x16QI V0_REGNUM)) + (clobber (reg:V4x16QI V4_REGNUM)) + (clobber (reg:V4x16QI V8_REGNUM)) + (clobber (reg:V4x16QI V12_REGNUM)) + (clobber (reg:V4x16QI V16_REGNUM)) + (clobber (reg:V4x16QI V20_REGNUM)) + (clobber (reg:V4x16QI V24_REGNUM)) + (clobber (reg:V4x16QI V28_REGNUM)) + (clobber (reg:VNx16BI P0_REGNUM)) + (clobber (reg:VNx16BI P1_REGNUM)) + (clobber (reg:VNx16BI P2_REGNUM)) + (clobber (reg:VNx16BI P3_REGNUM)) + (clobber (reg:VNx16BI P4_REGNUM)) + (clobber (reg:VNx16BI P5_REGNUM)) + (clobber (reg:VNx16BI P6_REGNUM)) + (clobber (reg:VNx16BI P7_REGNUM)) + (clobber (reg:VNx16BI P8_REGNUM)) + (clobber (reg:VNx16BI P9_REGNUM)) + (clobber (reg:VNx16BI P10_REGNUM)) + (clobber (reg:VNx16BI P11_REGNUM)) + (clobber (reg:VNx16BI P12_REGNUM)) + (clobber (reg:VNx16BI P13_REGNUM)) + (clobber (reg:VNx16BI P14_REGNUM)) + (clobber (reg:VNx16BI P15_REGNUM)) + (clobber (reg:VNx16QI ZA_REGNUM))] + "" + "smstop" +) + ;; ========================================================================= ;; == Loads, stores and moves ;; ========================================================================= diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 966d13abe4c..48bf2de4b3d 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -2790,6 +2790,7 @@ static const struct attribute_spec aarch64_attribute_table[] = NULL, attr_streaming_exclusions }, { "arm_streaming_compatible", 0, 0, false, true, true, true, NULL, attr_streaming_exclusions }, + { "arm_locally_streaming", 0, 0, true, false, false, false, NULL, NULL }, { "arm_new_za", 0, 0, true, false, false, false, handle_arm_new_za_attribute, attr_arm_new_za_exclusions }, @@ -4162,6 +4163,15 @@ aarch64_fndecl_has_new_za_state (const_tree fndecl) return lookup_attribute ("arm_new_za", DECL_ATTRIBUTES (fndecl)); } +/* Return true if FNDECL uses streaming mode internally, as an + implementation choice. */ + +static bool +aarch64_fndecl_is_locally_streaming (const_tree fndecl) +{ + return lookup_attribute ("arm_locally_streaming", DECL_ATTRIBUTES (fndecl)); +} + /* Return the state of PSTATE.SM when compiling the body of function FNDECL. This might be different from the state of PSTATE.SM on entry. 
*/ @@ -4169,6 +4179,9 @@ aarch64_fndecl_has_new_za_state (const_tree fndecl) static aarch64_feature_flags aarch64_fndecl_sm_state (const_tree fndecl) { + if (aarch64_fndecl_is_locally_streaming (fndecl)) + return AARCH64_FL_SM_ON; + return aarch64_fntype_sm_state (TREE_TYPE (fndecl)); } @@ -4222,6 +4235,16 @@ aarch64_cfun_incoming_za_state () return aarch64_fntype_za_state (TREE_TYPE (cfun->decl)); } +/* Return true if PSTATE.SM is 1 in the body of the current function, + but is not guaranteed to be 1 on entry. */ + +static bool +aarch64_cfun_enables_pstate_sm () +{ + return (aarch64_fndecl_is_locally_streaming (cfun->decl) + && aarch64_cfun_incoming_sm_state () != AARCH64_FL_SM_ON); +} + /* Return true if the current function creates new ZA state (as opposed to sharing ZA with its callers or ignoring ZA altogether). */ @@ -6432,6 +6455,10 @@ aarch64_add_offset_temporaries (rtx x) TEMP2, if nonnull, is a second temporary register that doesn't overlap either DEST or REG. + FORCE_ISA_MODE is AARCH64_FL_SM_ON if any variable component of OFFSET + is measured relative to the SME vector length instead of the current + prevailing vector length. It is 0 otherwise. + Since this function may be used to adjust the stack pointer, we must ensure that it cannot cause transient stack deallocation (for example by first incrementing SP and then decrementing when adjusting by a @@ -6440,6 +6467,7 @@ aarch64_add_offset_temporaries (rtx x) static void aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, poly_int64 offset, rtx temp1, rtx temp2, + aarch64_feature_flags force_isa_mode, bool frame_related_p, bool emit_move_imm = true) { gcc_assert (emit_move_imm || temp1 != NULL_RTX); @@ -6452,9 +6480,17 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, /* Try using ADDVL or ADDPL to add the whole value. */ if (src != const0_rtx && aarch64_sve_addvl_addpl_immediate_p (offset)) { - rtx offset_rtx = gen_int_mode (offset, mode); + rtx offset_rtx; + if (force_isa_mode == 0) + offset_rtx = gen_int_mode (offset, mode); + else + offset_rtx = aarch64_sme_vq_immediate (mode, offset.coeffs[0], 0); rtx_insn *insn = emit_insn (gen_add3_insn (dest, src, offset_rtx)); RTX_FRAME_RELATED_P (insn) = frame_related_p; + if (frame_related_p && (force_isa_mode & AARCH64_FL_SM_ON)) + add_reg_note (insn, REG_CFA_ADJUST_CFA, + gen_rtx_SET (dest, plus_constant (Pmode, src, + offset))); return; } @@ -6470,11 +6506,19 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, if (src != const0_rtx && aarch64_sve_addvl_addpl_immediate_p (poly_offset)) { - rtx offset_rtx = gen_int_mode (poly_offset, mode); + rtx offset_rtx; + if (force_isa_mode == 0) + offset_rtx = gen_int_mode (poly_offset, mode); + else + offset_rtx = aarch64_sme_vq_immediate (mode, factor, 0); if (frame_related_p) { rtx_insn *insn = emit_insn (gen_add3_insn (dest, src, offset_rtx)); RTX_FRAME_RELATED_P (insn) = true; + if (force_isa_mode & AARCH64_FL_SM_ON) + add_reg_note (insn, REG_CFA_ADJUST_CFA, + gen_rtx_SET (dest, plus_constant (Pmode, src, + poly_offset))); src = dest; } else @@ -6505,8 +6549,18 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, rtx val; if (IN_RANGE (rel_factor, -32, 31)) { + if (force_isa_mode & AARCH64_FL_SM_ON) + { + /* Try to use an unshifted RDSVL, otherwise fall back on + a shifted RDSVL #1. */ + if (aarch64_sve_rdvl_addvl_factor_p (factor)) + shift = 0; + else + factor = rel_factor * 16; + val = aarch64_sme_vq_immediate (mode, factor, 0); + } /* Try to use an unshifted CNT[BHWD]. 
*/ - if (aarch64_sve_cnt_factor_p (factor)) + else if (aarch64_sve_cnt_factor_p (factor)) { val = gen_int_mode (poly_int64 (factor, factor), mode); shift = 0; @@ -6542,12 +6596,19 @@ aarch64_add_offset (scalar_int_mode mode, rtx dest, rtx src, a shift and add sequence for the multiplication. If CNTB << SHIFT is out of range, stick with the current shift factor. */ - if (IN_RANGE (low_bit, 2, 16 * 16)) + if (force_isa_mode == 0 + && IN_RANGE (low_bit, 2, 16 * 16)) { val = gen_int_mode (poly_int64 (low_bit, low_bit), mode); shift = 0; } - else + else if ((force_isa_mode & AARCH64_FL_SM_ON) + && aarch64_sve_rdvl_addvl_factor_p (low_bit)) + { + val = aarch64_sme_vq_immediate (mode, low_bit, 0); + shift = 0; + } + else val = gen_int_mode (BYTES_PER_SVE_VECTOR, mode); val = aarch64_force_temporary (mode, temp1, val); @@ -6634,30 +6695,34 @@ aarch64_split_add_offset (scalar_int_mode mode, rtx dest, rtx src, rtx offset_rtx, rtx temp1, rtx temp2) { aarch64_add_offset (mode, dest, src, rtx_to_poly_int64 (offset_rtx), - temp1, temp2, false); + temp1, temp2, 0, false); } /* Add DELTA to the stack pointer, marking the instructions frame-related. - TEMP1 is available as a temporary if nonnull. EMIT_MOVE_IMM is false - if TEMP1 already contains abs (DELTA). */ + TEMP1 is available as a temporary if nonnull. FORCE_ISA_MODE is as + for aarch64_add_offset. EMIT_MOVE_IMM is false if TEMP1 already + contains abs (DELTA). */ static inline void -aarch64_add_sp (rtx temp1, rtx temp2, poly_int64 delta, bool emit_move_imm) +aarch64_add_sp (rtx temp1, rtx temp2, poly_int64 delta, + aarch64_feature_flags force_isa_mode, bool emit_move_imm) { aarch64_add_offset (Pmode, stack_pointer_rtx, stack_pointer_rtx, delta, - temp1, temp2, true, emit_move_imm); + temp1, temp2, force_isa_mode, true, emit_move_imm); } /* Subtract DELTA from the stack pointer, marking the instructions - frame-related if FRAME_RELATED_P. TEMP1 is available as a temporary - if nonnull. */ + frame-related if FRAME_RELATED_P. FORCE_ISA_MODE is as for + aarch64_add_offset. TEMP1 is available as a temporary if nonnull. 
*/ static inline void -aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, bool frame_related_p, - bool emit_move_imm = true) +aarch64_sub_sp (rtx temp1, rtx temp2, poly_int64 delta, + aarch64_feature_flags force_isa_mode, + bool frame_related_p, bool emit_move_imm = true) { aarch64_add_offset (Pmode, stack_pointer_rtx, stack_pointer_rtx, -delta, - temp1, temp2, frame_related_p, emit_move_imm); + temp1, temp2, force_isa_mode, frame_related_p, + emit_move_imm); } /* A streaming-compatible function needs to switch temporarily to the known @@ -7673,11 +7738,11 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) { base = aarch64_force_temporary (int_mode, dest, base); aarch64_add_offset (int_mode, dest, base, offset, - NULL_RTX, NULL_RTX, false); + NULL_RTX, NULL_RTX, 0, false); } else aarch64_add_offset (int_mode, dest, base, offset, - dest, NULL_RTX, false); + dest, NULL_RTX, 0, false); } return; } @@ -7704,7 +7769,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) gcc_assert (can_create_pseudo_p ()); base = aarch64_force_temporary (int_mode, dest, base); aarch64_add_offset (int_mode, dest, base, const_offset, - NULL_RTX, NULL_RTX, false); + NULL_RTX, NULL_RTX, 0, false); return; } @@ -7744,7 +7809,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm) gcc_assert(can_create_pseudo_p ()); base = aarch64_force_temporary (int_mode, dest, base); aarch64_add_offset (int_mode, dest, base, const_offset, - NULL_RTX, NULL_RTX, false); + NULL_RTX, NULL_RTX, 0, false); return; } /* FALLTHRU */ @@ -9212,6 +9277,9 @@ aarch64_need_old_pstate_sm () if (aarch64_cfun_incoming_sm_state () != 0) return false; + if (aarch64_cfun_enables_pstate_sm ()) + return true; + if (cfun->machine->call_switches_sm_state) for (auto insn = get_insns (); insn; insn = NEXT_INSN (insn)) if (auto *call = dyn_cast (insn)) @@ -9238,6 +9306,7 @@ aarch64_layout_frame (void) poly_int64 vector_save_size = GET_MODE_SIZE (vector_save_mode); bool frame_related_fp_reg_p = false; aarch64_frame &frame = cfun->machine->frame; + bool enables_pstate_sm = aarch64_cfun_enables_pstate_sm (); frame.emit_frame_chain = aarch64_needs_frame_chain (); @@ -9277,7 +9346,7 @@ aarch64_layout_frame (void) frame.reg_offset[regno] = SLOT_REQUIRED; for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++) - if (df_regs_ever_live_p (regno) + if ((enables_pstate_sm || df_regs_ever_live_p (regno)) && !fixed_regs[regno] && !crtl->abi->clobbers_full_reg_p (regno)) { @@ -9306,7 +9375,7 @@ aarch64_layout_frame (void) } for (regno = P0_REGNUM; regno <= P15_REGNUM; regno++) - if (df_regs_ever_live_p (regno) + if ((enables_pstate_sm || df_regs_ever_live_p (regno)) && !fixed_regs[regno] && !crtl->abi->clobbers_full_reg_p (regno)) frame.reg_offset[regno] = SLOT_REQUIRED; @@ -10121,9 +10190,14 @@ aarch64_get_separate_components (void) bitmap_clear (components); /* The registers we need saved to the frame. */ + bool enables_pstate_sm = aarch64_cfun_enables_pstate_sm (); for (unsigned regno = 0; regno <= LAST_SAVED_REGNUM; regno++) if (aarch64_register_saved_on_entry (regno)) { + if (enables_pstate_sm + && (FP_REGNUM_P (regno) || PR_REGNUM_P (regno))) + continue; + /* Punt on saves and restores that use ST1D and LD1D. 
We could try to be smarter, but it would involve making sure that the spare predicate register itself is safe to use at the save @@ -10438,6 +10512,7 @@ aarch64_stack_clash_protection_alloca_probe_range (void) static void aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, poly_int64 poly_size, + aarch64_feature_flags force_isa_mode, bool frame_related_p, bool final_adjustment_p) { @@ -10498,7 +10573,8 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, if (known_lt (poly_size, min_probe_threshold) || !flag_stack_clash_protection) { - aarch64_sub_sp (temp1, temp2, poly_size, frame_related_p); + aarch64_sub_sp (temp1, temp2, poly_size, force_isa_mode, + frame_related_p); return; } @@ -10515,7 +10591,8 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, /* First calculate the amount of bytes we're actually spilling. */ aarch64_add_offset (Pmode, temp1, CONST0_RTX (Pmode), - poly_size, temp1, temp2, false, true); + poly_size, temp1, temp2, force_isa_mode, + false, true); rtx_insn *insn = get_last_insn (); @@ -10573,7 +10650,7 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, { for (HOST_WIDE_INT i = 0; i < rounded_size; i += guard_size) { - aarch64_sub_sp (NULL, temp2, guard_size, true); + aarch64_sub_sp (NULL, temp2, guard_size, force_isa_mode, true); emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx, guard_used_by_caller)); emit_insn (gen_blockage ()); @@ -10584,7 +10661,7 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, { /* Compute the ending address. */ aarch64_add_offset (Pmode, temp1, stack_pointer_rtx, -rounded_size, - temp1, NULL, false, true); + temp1, NULL, force_isa_mode, false, true); rtx_insn *insn = get_last_insn (); /* For the initial allocation, we don't have a frame pointer @@ -10654,7 +10731,7 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, else if (final_adjustment_p && rounded_size == 0) residual_probe_offset = 0; - aarch64_sub_sp (temp1, temp2, residual, frame_related_p); + aarch64_sub_sp (temp1, temp2, residual, force_isa_mode, frame_related_p); if (residual >= min_probe_threshold) { if (dump_file) @@ -10670,6 +10747,14 @@ aarch64_allocate_and_probe_stack_space (rtx temp1, rtx temp2, } } +/* Implement TARGET_USE_LATE_PROLOGUE_EPILOGUE. */ + +static bool +aarch64_use_late_prologue_epilogue () +{ + return aarch64_cfun_enables_pstate_sm (); +} + /* Return 1 if the register is used by the epilogue. We need to say the return register is used, but only after epilogue generation is complete. Note that in the case of sibcalls, the values "used by the epilogue" are @@ -10826,6 +10911,9 @@ aarch64_expand_prologue (void) unsigned reg2 = cfun->machine->frame.wb_push_candidate2; bool emit_frame_chain = cfun->machine->frame.emit_frame_chain; rtx_insn *insn; + aarch64_feature_flags force_isa_mode = 0; + if (aarch64_cfun_enables_pstate_sm ()) + force_isa_mode = AARCH64_FL_SM_ON; if (flag_stack_clash_protection && known_eq (callee_adjust, 0)) { @@ -10887,7 +10975,7 @@ aarch64_expand_prologue (void) less the amount of the guard reserved for use by the caller's outgoing args. 
*/ aarch64_allocate_and_probe_stack_space (tmp0_rtx, tmp1_rtx, initial_adjust, - true, false); + force_isa_mode, true, false); if (callee_adjust != 0) aarch64_push_regs (reg1, reg2, callee_adjust); @@ -10913,7 +11001,8 @@ aarch64_expand_prologue (void) gcc_assert (known_eq (chain_offset, 0)); aarch64_add_offset (Pmode, hard_frame_pointer_rtx, stack_pointer_rtx, chain_offset, - tmp1_rtx, tmp0_rtx, frame_pointer_needed); + tmp1_rtx, tmp0_rtx, force_isa_mode, + frame_pointer_needed); if (frame_pointer_needed && !frame_size.is_constant ()) { /* Variable-sized frames need to describe the save slot @@ -10956,6 +11045,7 @@ aarch64_expand_prologue (void) || known_eq (initial_adjust, 0)); aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, sve_callee_adjust, + force_isa_mode, !frame_pointer_needed, false); saved_regs_offset += sve_callee_adjust; } @@ -10968,10 +11058,13 @@ aarch64_expand_prologue (void) /* We may need to probe the final adjustment if it is larger than the guard that is assumed by the called. */ aarch64_allocate_and_probe_stack_space (tmp1_rtx, tmp0_rtx, final_adjust, + force_isa_mode, !frame_pointer_needed, true); - /* Save the incoming value of PSTATE.SM, if required. */ - if (known_ge (cfun->machine->frame.old_svcr_offset, 0)) + /* Save the incoming value of PSTATE.SM, if required. Code further + down does this for locally-streaming functions. */ + if (known_ge (cfun->machine->frame.old_svcr_offset, 0) + && !aarch64_cfun_enables_pstate_sm ()) { rtx mem = aarch64_old_svcr_mem (); MEM_VOLATILE_P (mem) = 1; @@ -11022,7 +11115,40 @@ aarch64_expand_prologue (void) emit_insn (gen_aarch64_tpidr2_save ()); emit_insn (gen_aarch64_clear_tpidr2 ()); emit_label (label); - emit_insn (gen_aarch64_smstart_za ()); + if (!aarch64_cfun_enables_pstate_sm () + || known_ge (cfun->machine->frame.old_svcr_offset, 0)) + emit_insn (gen_aarch64_smstart_za ()); + } + + /* Enable PSTATE.SM, if required. */ + if (aarch64_cfun_enables_pstate_sm ()) + { + rtx_insn *guard_label = nullptr; + if (known_ge (cfun->machine->frame.old_svcr_offset, 0)) + { + /* The current function is streaming-compatible. Save the + original state of PSTATE.SM. */ + rtx svcr = gen_rtx_REG (DImode, IP0_REGNUM); + emit_insn (gen_aarch64_read_svcr (svcr)); + emit_move_insn (aarch64_old_svcr_mem (), svcr); + guard_label = aarch64_guard_switch_pstate_sm (svcr, + aarch64_isa_flags); + } + aarch64_sme_mode_switch_regs args_switch; + auto &args = crtl->args.info; + for (unsigned int i = 0; i < args.num_sme_mode_switch_args; ++i) + { + rtx x = args.sme_mode_switch_args[i]; + args_switch.add_reg (GET_MODE (x), REGNO (x)); + } + args_switch.emit_prologue (); + if (cfun->machine->frame.has_new_za_state && !guard_label) + emit_insn (gen_aarch64_smstart ()); + else + emit_insn (gen_aarch64_smstart_sm ()); + args_switch.emit_epilogue (); + if (guard_label) + emit_label (guard_label); } } @@ -11073,6 +11199,9 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) HOST_WIDE_INT guard_size = 1 << param_stack_clash_protection_guard_size; HOST_WIDE_INT guard_used_by_caller = STACK_CLASH_CALLER_GUARD; + aarch64_feature_flags force_isa_mode = 0; + if (aarch64_cfun_enables_pstate_sm ()) + force_isa_mode = AARCH64_FL_SM_ON; /* We can re-use the registers when: @@ -11097,7 +11226,33 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) = maybe_ne (get_frame_size () + cfun->machine->frame.saved_varargs_size, 0); - if (cfun->machine->frame.has_new_za_state) + /* Reset PSTATE.SM, if required. 
Fold an unconditional SMSTOP SM + and SMSTOP ZA into a single SMSTOP. */ + bool pending_smstop_za = cfun->machine->frame.has_new_za_state; + if (aarch64_cfun_enables_pstate_sm ()) + { + rtx_insn *guard_label = nullptr; + if (known_ge (cfun->machine->frame.old_svcr_offset, 0)) + guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, + aarch64_isa_flags); + aarch64_sme_mode_switch_regs args_switch; + if (crtl->return_rtx && REG_P (crtl->return_rtx)) + args_switch.add_reg (GET_MODE (crtl->return_rtx), + REGNO (crtl->return_rtx)); + args_switch.emit_prologue (); + if (pending_smstop_za && !guard_label) + { + emit_insn (gen_aarch64_smstop ()); + pending_smstop_za = false; + } + else + emit_insn (gen_aarch64_smstop_sm ()); + args_switch.emit_epilogue (); + if (guard_label) + emit_label (guard_label); + } + + if (pending_smstop_za) /* Turn ZA off before returning. TPIDR2_EL0 is already null at this point. */ emit_insn (gen_aarch64_smstop_za ()); @@ -11122,12 +11277,13 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) aarch64_add_offset (Pmode, stack_pointer_rtx, hard_frame_pointer_rtx, -callee_offset - below_hard_fp_saved_regs_size, - tmp1_rtx, tmp0_rtx, callee_adjust == 0); + tmp1_rtx, tmp0_rtx, force_isa_mode, + callee_adjust == 0); else /* The case where we need to re-use the register here is very rare, so avoid the complicated condition and just always emit a move if the immediate doesn't fit. */ - aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, true); + aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, force_isa_mode, true); /* Restore the vector registers before the predicate registers, so that we can use P4 as a temporary for big-endian SVE frames. */ @@ -11136,7 +11292,8 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) aarch64_restore_callee_saves (callee_offset, P0_REGNUM, P15_REGNUM, false, &cfi_ops); if (maybe_ne (sve_callee_adjust, 0)) - aarch64_add_sp (NULL_RTX, NULL_RTX, sve_callee_adjust, true); + aarch64_add_sp (NULL_RTX, NULL_RTX, sve_callee_adjust, + force_isa_mode, true); /* When shadow call stack is enabled, the scs_pop in the epilogue will restore x30, we don't need to restore x30 again in the traditional @@ -11167,7 +11324,7 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) /* Liveness of EP0_REGNUM can not be trusted across function calls either, so add restriction on emit_move optimization to leaf functions. 
*/ - aarch64_add_sp (tmp0_rtx, tmp1_rtx, initial_adjust, + aarch64_add_sp (tmp0_rtx, tmp1_rtx, initial_adjust, force_isa_mode, (!can_inherit_p || !crtl->is_leaf || df_regs_ever_live_p (EP0_REGNUM))); @@ -11300,7 +11457,8 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, temp1 = gen_rtx_REG (Pmode, EP1_REGNUM); if (vcall_offset == 0) - aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, temp1, temp0, false); + aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, temp1, temp0, + 0, false); else { gcc_assert ((vcall_offset & (POINTER_BYTES - 1)) == 0); @@ -11313,7 +11471,7 @@ aarch64_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, plus_constant (Pmode, this_rtx, delta)); else aarch64_add_offset (Pmode, this_rtx, this_rtx, delta, - temp1, temp0, false); + temp1, temp0, 0, false); } if (Pmode == ptr_mode) @@ -29469,6 +29627,9 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_HAVE_SHADOW_CALL_STACK #define TARGET_HAVE_SHADOW_CALL_STACK true +#undef TARGET_USE_LATE_PROLOGUE_EPILOGUE +#define TARGET_USE_LATE_PROLOGUE_EPILOGUE aarch64_use_late_prologue_epilogue + #undef TARGET_EMIT_EPILOGUE_FOR_SIBCALL #define TARGET_EMIT_EPILOGUE_FOR_SIBCALL aarch64_expand_epilogue diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c new file mode 100644 index 00000000000..ab9c8cd6bac --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_1.c @@ -0,0 +1,433 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +__attribute__((arm_streaming, arm_shared_za)) void consume_za (); + +/* +** n_ls: +** stp d8, d9, \[sp, #?-64\]! +** stp d10, d11, \[sp, #?16\] +** stp d12, d13, \[sp, #?32\] +** stp d14, d15, \[sp, #?48\] +** smstart sm +** smstop sm +** ldp d10, d11, \[sp, #?16\] +** ldp d12, d13, \[sp, #?32\] +** ldp d14, d15, \[sp, #?48\] +** ldp d8, d9, \[sp\], #?64 +** ret +*/ +void __attribute__((arm_locally_streaming)) +n_ls () +{ + asm (""); +} + +/* +** s_ls: +** ret +*/ +void __attribute__((arm_streaming, arm_locally_streaming)) +s_ls () +{ + asm (""); +} + +/* +** sc_ls: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void __attribute__((arm_streaming_compatible, arm_locally_streaming)) +sc_ls () +{ + asm (""); +} + +/* +** n_ls_new_za: +** str x30, \[sp, #?-80\]! +** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart +** bl consume_za +** smstop +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** ldr x30, \[sp\], #?80 +** ret +*/ +void __attribute__((arm_locally_streaming, arm_new_za)) +n_ls_new_za () +{ + consume_za (); +} + +/* +** s_ls_new_za: +** str x30, \[sp, #?-16\]! 
+** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** bl consume_za +** smstop za +** ldr x30, \[sp\], #?16 +** ret +*/ +void __attribute__((arm_locally_streaming, arm_streaming, arm_new_za)) +s_ls_new_za () +{ + consume_za (); +} + +/* +** sc_ls_new_za: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x11, tpidr2_el0 +** cbz x11, .* +** bl __arm_tpidr2_save +** msr tpidr2_el0, xzr +** smstart za +** mrs x16, svcr +** str x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl consume_za +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** smstop za +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void __attribute__((arm_streaming_compatible, arm_locally_streaming, arm_new_za)) +sc_ls_new_za () +{ + consume_za (); +} + +/* +** n_ls_shared_za: +** str x30, \[sp, #?-80\]! +** stp d8, d9, \[sp, #?16\] +** stp d10, d11, \[sp, #?32\] +** stp d12, d13, \[sp, #?48\] +** stp d14, d15, \[sp, #?64\] +** smstart sm +** bl consume_za +** smstop sm +** ldp d8, d9, \[sp, #?16\] +** ldp d10, d11, \[sp, #?32\] +** ldp d12, d13, \[sp, #?48\] +** ldp d14, d15, \[sp, #?64\] +** ldr x30, \[sp\], #?80 +** ret +*/ +void __attribute__((arm_locally_streaming, arm_shared_za)) +n_ls_shared_za () +{ + consume_za (); +} + +/* +** s_ls_shared_za: +** str x30, \[sp, #?-16\]! +** bl consume_za +** ldr x30, \[sp\], #?16 +** ret +*/ +void __attribute__((arm_streaming, arm_locally_streaming, arm_shared_za)) +s_ls_shared_za () +{ + consume_za (); +} + +/* +** sc_ls_shared_za: +** stp x29, x30, \[sp, #?-96\]! +** mov x29, sp +** stp d8, d9, \[sp, #?32\] +** stp d10, d11, \[sp, #?48\] +** stp d12, d13, \[sp, #?64\] +** stp d14, d15, \[sp, #?80\] +** mrs x16, svcr +** str x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstart sm +** bl consume_za +** ldr x16, \[x29, #?16\] +** tbnz x16, 0, .* +** smstop sm +** ldp d8, d9, \[sp, #?32\] +** ldp d10, d11, \[sp, #?48\] +** ldp d12, d13, \[sp, #?64\] +** ldp d14, d15, \[sp, #?80\] +** ldp x29, x30, \[sp\], #?96 +** ret +*/ +void __attribute__((arm_streaming_compatible, arm_locally_streaming, arm_shared_za)) +sc_ls_shared_za () +{ + consume_za (); +} + +/* +** n_ls_vector_pcs: +** stp q8, q9, \[sp, #?-256\]! 
+** stp q10, q11, \[sp, #?32\] +** stp q12, q13, \[sp, #?64\] +** stp q14, q15, \[sp, #?96\] +** stp q16, q17, \[sp, #?128\] +** stp q18, q19, \[sp, #?160\] +** stp q20, q21, \[sp, #?192\] +** stp q22, q23, \[sp, #?224\] +** smstart sm +** smstop sm +** ldp q10, q11, \[sp, #?32\] +** ldp q12, q13, \[sp, #?64\] +** ldp q14, q15, \[sp, #?96\] +** ldp q16, q17, \[sp, #?128\] +** ldp q18, q19, \[sp, #?160\] +** ldp q20, q21, \[sp, #?192\] +** ldp q22, q23, \[sp, #?224\] +** ldp q8, q9, \[sp\], #?256 +** ret +*/ +void __attribute__((arm_locally_streaming, aarch64_vector_pcs)) +n_ls_vector_pcs () +{ + asm (""); +} + +/* +** n_ls_sve_pcs: +** addsvl sp, sp, #-18 +** str p4, \[sp\] +** str p5, \[sp, #1, mul vl\] +** str p6, \[sp, #2, mul vl\] +** str p7, \[sp, #3, mul vl\] +** str p8, \[sp, #4, mul vl\] +** str p9, \[sp, #5, mul vl\] +** str p10, \[sp, #6, mul vl\] +** str p11, \[sp, #7, mul vl\] +** str p12, \[sp, #8, mul vl\] +** str p13, \[sp, #9, mul vl\] +** str p14, \[sp, #10, mul vl\] +** str p15, \[sp, #11, mul vl\] +** str z8, \[sp, #2, mul vl\] +** str z9, \[sp, #3, mul vl\] +** str z10, \[sp, #4, mul vl\] +** str z11, \[sp, #5, mul vl\] +** str z12, \[sp, #6, mul vl\] +** str z13, \[sp, #7, mul vl\] +** str z14, \[sp, #8, mul vl\] +** str z15, \[sp, #9, mul vl\] +** str z16, \[sp, #10, mul vl\] +** str z17, \[sp, #11, mul vl\] +** str z18, \[sp, #12, mul vl\] +** str z19, \[sp, #13, mul vl\] +** str z20, \[sp, #14, mul vl\] +** str z21, \[sp, #15, mul vl\] +** str z22, \[sp, #16, mul vl\] +** str z23, \[sp, #17, mul vl\] +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** smstop sm +** ldr z8, \[sp, #2, mul vl\] +** ldr z9, \[sp, #3, mul vl\] +** ldr z10, \[sp, #4, mul vl\] +** ldr z11, \[sp, #5, mul vl\] +** ldr z12, \[sp, #6, mul vl\] +** ldr z13, \[sp, #7, mul vl\] +** ldr z14, \[sp, #8, mul vl\] +** ldr z15, \[sp, #9, mul vl\] +** ldr z16, \[sp, #10, mul vl\] +** ldr z17, \[sp, #11, mul vl\] +** ldr z18, \[sp, #12, mul vl\] +** ldr z19, \[sp, #13, mul vl\] +** ldr z20, \[sp, #14, mul vl\] +** ldr z21, \[sp, #15, mul vl\] +** ldr z22, \[sp, #16, mul vl\] +** ldr z23, \[sp, #17, mul vl\] +** ldr p4, \[sp\] +** ldr p5, \[sp, #1, mul vl\] +** ldr p6, \[sp, #2, mul vl\] +** ldr p7, \[sp, #3, mul vl\] +** ldr p8, \[sp, #4, mul vl\] +** ldr p9, \[sp, #5, mul vl\] +** ldr p10, \[sp, #6, mul vl\] +** ldr p11, \[sp, #7, mul vl\] +** ldr p12, \[sp, #8, mul vl\] +** ldr p13, \[sp, #9, mul vl\] +** ldr p14, \[sp, #10, mul vl\] +** ldr p15, \[sp, #11, mul vl\] +** addsvl sp, sp, #18 +** ret +*/ +void __attribute__((arm_locally_streaming)) +n_ls_sve_pcs (__SVBool_t x) +{ + asm (""); +} + +/* +** n_ls_v0: +** addsvl sp, sp, #-1 +** ... +** smstart sm +** add x[0-9]+, .* +** smstop sm +** ... +** addsvl sp, sp, #1 +** ... +*/ +#define TEST(VN) __SVInt32_t VN; asm ("" :: "r" (&VN)); +void __attribute__((arm_locally_streaming)) +n_ls_v0 () +{ + TEST (v0); +} + +/* +** n_ls_v32: +** addsvl sp, sp, #-32 +** ... +** smstart sm +** ... +** smstop sm +** ... +** rdsvl (x[0-9]+), #1 +** lsl (x[0-9]+), \1, #?5 +** add sp, sp, \2 +** ... 
+*/ +void __attribute__((arm_locally_streaming)) +n_ls_v32 () +{ + TEST (v0); + TEST (v1); + TEST (v2); + TEST (v3); + TEST (v4); + TEST (v5); + TEST (v6); + TEST (v7); + TEST (v8); + TEST (v9); + TEST (v10); + TEST (v11); + TEST (v12); + TEST (v13); + TEST (v14); + TEST (v15); + TEST (v16); + TEST (v17); + TEST (v18); + TEST (v19); + TEST (v20); + TEST (v21); + TEST (v22); + TEST (v23); + TEST (v24); + TEST (v25); + TEST (v26); + TEST (v27); + TEST (v28); + TEST (v29); + TEST (v30); + TEST (v31); +} + +/* +** n_ls_v33: +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), #?33 +** mul (x[0-9]+), (?:\1, \2|\2, \1) +** sub sp, sp, \3 +** ... +** smstart sm +** ... +** smstop sm +** ... +** rdsvl (x[0-9]+), #1 +** mov (x[0-9]+), #?33 +** mul (x[0-9]+), (?:\4, \5|\5, \4) +** add sp, sp, \6 +** ... +*/ +void __attribute__((arm_locally_streaming)) +n_ls_v33 () +{ + TEST (v0); + TEST (v1); + TEST (v2); + TEST (v3); + TEST (v4); + TEST (v5); + TEST (v6); + TEST (v7); + TEST (v8); + TEST (v9); + TEST (v10); + TEST (v11); + TEST (v12); + TEST (v13); + TEST (v14); + TEST (v15); + TEST (v16); + TEST (v17); + TEST (v18); + TEST (v19); + TEST (v20); + TEST (v21); + TEST (v22); + TEST (v23); + TEST (v24); + TEST (v25); + TEST (v26); + TEST (v27); + TEST (v28); + TEST (v29); + TEST (v30); + TEST (v31); + TEST (v32); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c new file mode 100644 index 00000000000..4c9caf5d078 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_2.c @@ -0,0 +1,177 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +#include +#include + +/* +** test_d0: +** ... +** smstart sm +** .* +** fmov x10, d0 +** smstop sm +** fmov d0, x10 +** ... +*/ +double __attribute__((arm_locally_streaming)) +test_d0 () +{ + asm (""); + return 1.0f; +} + +/* +** test_d0_vec: +** ... +** smstart sm +** .* +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstop sm +** fmov d0, x10 +** ... +*/ +int8x8_t __attribute__((arm_locally_streaming)) +test_d0_vec () +{ + asm volatile (""); + return (int8x8_t) {}; +} + +/* +** test_q0: +** ... +** smstart sm +** .* +** str q0, \[sp, #?-16\]! +** smstop sm +** ldr q0, \[sp\], #?16 +** ... +*/ +int8x16_t __attribute__((arm_locally_streaming)) +test_q0 () +{ + asm volatile (""); + return (int8x16_t) {}; +} + +/* +** test_q1: +** ... +** smstart sm +** ... +** stp q0, q1, \[sp, #?-32\]! +** smstop sm +** ldp q0, q1, \[sp\], #?32 +** ... +*/ +int8x16x2_t __attribute__((arm_locally_streaming)) +test_q1 () +{ + asm volatile (""); + return (int8x16x2_t) {}; +} + +/* +** test_q2: +** ... +** smstart sm +** ... +** stp q0, q1, \[sp, #?-48\]! +** str q2, \[sp, #?32\] +** smstop sm +** ldr q2, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?48 +** ... +*/ +int8x16x3_t __attribute__((arm_locally_streaming)) +test_q2 () +{ + asm volatile (""); + return (int8x16x3_t) {}; +} + +/* +** test_q3: +** ... +** smstart sm +** ... +** stp q0, q1, \[sp, #?-64\]! +** stp q2, q3, \[sp, #?32\] +** smstop sm +** ldp q2, q3, \[sp, #?32\] +** ldp q0, q1, \[sp\], #?64 +** ... +*/ +int8x16x4_t __attribute__((arm_locally_streaming)) +test_q3 () +{ + asm volatile (""); + return (int8x16x4_t) {}; +} + +/* +** test_z0: +** ... +** smstart sm +** mov z0\.b, #0 +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstop sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** ... 
+*/ +svint8_t __attribute__((arm_locally_streaming)) +test_z0 () +{ + asm volatile (""); + return (svint8_t) {}; +} + +/* +** test_z3: +** ... +** smstart sm +** ... +** addvl sp, sp, #-4 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** smstop sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ... +*/ +svint8x4_t __attribute__((arm_locally_streaming)) +test_z3 () +{ + asm volatile (""); + return (svint8x4_t) {}; +} + +/* +** test_p0: +** ... +** smstart sm +** pfalse p0\.b +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstop sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** ... +*/ +svbool_t __attribute__((arm_locally_streaming)) +test_p0 () +{ + asm volatile (""); + return (svbool_t) {}; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c new file mode 100644 index 00000000000..e6cbd9d176d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_3.c @@ -0,0 +1,273 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +// { dg-final { check-function-bodies "**" "" } } + +#include +#include + +/* +** test_d0: +** ... +** fmov x10, d0 +** smstart sm +** fmov d0, x10 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_d0 (double d0) +{ + asm (""); +} + +/* +** test_d7: +** ... +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** smstart sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_d7 (double d0, double d1, double d2, double d3, + double d4, double d5, double d6, double d7) +{ + asm volatile (""); +} + +/* +** test_d0_vec: +** ... +** ( +** fmov x10, d0 +** | +** umov x10, v0.d\[0\] +** ) +** smstart sm +** fmov d0, x10 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_d0_vec (int8x8_t d0) +{ + asm volatile (""); +} + +/* +** test_d7_vec: +** ... +** ( +** fmov x10, d0 +** fmov x11, d1 +** fmov x12, d2 +** fmov x13, d3 +** fmov x14, d4 +** fmov x15, d5 +** fmov x16, d6 +** fmov x17, d7 +** | +** umov x10, v0.d\[0\] +** umov x11, v1.d\[0\] +** umov x12, v2.d\[0\] +** umov x13, v3.d\[0\] +** umov x14, v4.d\[0\] +** umov x15, v5.d\[0\] +** umov x16, v6.d\[0\] +** umov x17, v7.d\[0\] +** ) +** smstart sm +** fmov d0, x10 +** fmov d1, x11 +** fmov d2, x12 +** fmov d3, x13 +** fmov d4, x14 +** fmov d5, x15 +** fmov d6, x16 +** fmov d7, x17 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_d7_vec (int8x8_t d0, int8x8_t d1, int8x8_t d2, int8x8_t d3, + int8x8_t d4, int8x8_t d5, int8x8_t d6, int8x8_t d7) +{ + asm volatile (""); +} + +/* +** test_q0: +** ... +** str q0, \[sp, #?-16\]! +** smstart sm +** ldr q0, \[sp\], #?16 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_q0 (int8x16_t q0) +{ + asm volatile (""); +} + +/* +** test_q7: +** ... +** stp q0, q1, \[sp, #?-128\]! +** stp q2, q3, \[sp, #?32\] +** stp q4, q5, \[sp, #?64\] +** stp q6, q7, \[sp, #?96\] +** smstart sm +** ldp q2, q3, \[sp, #?32\] +** ldp q4, q5, \[sp, #?64\] +** ldp q6, q7, \[sp, #?96\] +** ldp q0, q1, \[sp\], #?128 +** smstop sm +** ... 
+*/ +void __attribute__((arm_locally_streaming)) +test_q7 (int8x16x4_t q0, int8x16x4_t q4) +{ + asm volatile (""); +} + +/* +** test_z0: +** ... +** addvl sp, sp, #-1 +** str z0, \[sp\] +** smstart sm +** ldr z0, \[sp\] +** addvl sp, sp, #1 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_z0 (svint8_t z0) +{ + asm volatile (""); +} + +/* +** test_z7: +** ... +** addvl sp, sp, #-8 +** str z0, \[sp\] +** str z1, \[sp, #1, mul vl\] +** str z2, \[sp, #2, mul vl\] +** str z3, \[sp, #3, mul vl\] +** str z4, \[sp, #4, mul vl\] +** str z5, \[sp, #5, mul vl\] +** str z6, \[sp, #6, mul vl\] +** str z7, \[sp, #7, mul vl\] +** smstart sm +** ldr z0, \[sp\] +** ldr z1, \[sp, #1, mul vl\] +** ldr z2, \[sp, #2, mul vl\] +** ldr z3, \[sp, #3, mul vl\] +** ldr z4, \[sp, #4, mul vl\] +** ldr z5, \[sp, #5, mul vl\] +** ldr z6, \[sp, #6, mul vl\] +** ldr z7, \[sp, #7, mul vl\] +** addvl sp, sp, #8 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_z7 (svint8x4_t z0, svint8x4_t z4) +{ + asm volatile (""); +} + +/* +** test_p0: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** smstart sm +** ldr p0, \[sp\] +** addvl sp, sp, #1 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_p0 (svbool_t p0) +{ + asm volatile (""); +} + +/* +** test_p3: +** ... +** addvl sp, sp, #-1 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** smstart sm +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** addvl sp, sp, #1 +** smstop sm +** ... +*/ +void __attribute__((arm_locally_streaming)) +test_p3 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + asm volatile (""); +} + +/* +** test_mixed: +** ... +** addvl sp, sp, #-3 +** str p0, \[sp\] +** str p1, \[sp, #1, mul vl\] +** str p2, \[sp, #2, mul vl\] +** str p3, \[sp, #3, mul vl\] +** str z3, \[sp, #1, mul vl\] +** str z7, \[sp, #2, mul vl\] +** stp q2, q6, \[sp, #?-32\]! +** fmov w10, s0 +** fmov x11, d1 +** fmov w12, s4 +** fmov x13, d5 +** smstart sm +** fmov s0, w10 +** fmov d1, x11 +** fmov s4, w12 +** fmov d5, x13 +** ldp q2, q6, \[sp\], #?32 +** ldr p0, \[sp\] +** ldr p1, \[sp, #1, mul vl\] +** ldr p2, \[sp, #2, mul vl\] +** ldr p3, \[sp, #3, mul vl\] +** ldr z3, \[sp, #1, mul vl\] +** ldr z7, \[sp, #2, mul vl\] +** addvl sp, sp, #3 +** smstop sm +** ... 
+*/
+void __attribute__((arm_locally_streaming))
+test_mixed (float s0, double d1, float32x4_t q2, svfloat32_t z3,
+	     float s4, double d5, float64x2_t q6, svfloat64_t z7,
+	     svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  asm volatile ("");
+}

From patchwork Sun Nov 13 10:03:28 2022
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 15/16] aarch64: Enforce inlining restrictions for SME
Date: Sun, 13 Nov 2022 10:03:28 +0000
From: Richard Sandiford

A function that has local ZA state cannot be inlined into its caller,
since we only support managing ZA switches at function scope.

A function whose body requires a particular PSTATE.SM setting can only
be inlined into a function body that guarantees that PSTATE.SM setting.
(The callee's function type doesn't matter here: one locally-streaming
function can be inlined into another.)

gcc/
	* config/aarch64/aarch64.cc (aarch64_function_attribute_inlinable_p):
	New function.
(aarch64_can_inline_p): Use aarch64_fndecl_isa_mode to populate the ISA mode bits when comparing the ISA flags of the two functions. (TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P): Define. gcc/testsuite/ * gcc.target/aarch64/sme/inlining_1.c: New test. * gcc.target/aarch64/sme/inlining_2.c: Likewise. * gcc.target/aarch64/sme/inlining_3.c: Likewise. * gcc.target/aarch64/sme/inlining_4.c: Likewise. * gcc.target/aarch64/sme/inlining_5.c: Likewise. * gcc.target/aarch64/sme/inlining_6.c: Likewise. * gcc.target/aarch64/sme/inlining_7.c: Likewise. * gcc.target/aarch64/sme/inlining_8.c: Likewise. --- gcc/config/aarch64/aarch64.cc | 33 ++++++++++++++++--- .../gcc.target/aarch64/sme/inlining_1.c | 26 +++++++++++++++ .../gcc.target/aarch64/sme/inlining_2.c | 26 +++++++++++++++ .../gcc.target/aarch64/sme/inlining_3.c | 26 +++++++++++++++ .../gcc.target/aarch64/sme/inlining_4.c | 26 +++++++++++++++ .../gcc.target/aarch64/sme/inlining_5.c | 26 +++++++++++++++ .../gcc.target/aarch64/sme/inlining_6.c | 18 ++++++++++ .../gcc.target/aarch64/sme/inlining_7.c | 18 ++++++++++ .../gcc.target/aarch64/sme/inlining_8.c | 18 ++++++++++ 9 files changed, 212 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 48bf2de4b3d..9a4a469a078 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -20554,6 +20554,17 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) return ret; } +/* Implement TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P. Use an opt-out + rather than an opt-in list. */ + +static bool +aarch64_function_attribute_inlinable_p (const_tree fndecl) +{ + /* A function that has local ZA state cannot be inlined into its caller, + since we only support managing ZA switches at function scope. */ + return !aarch64_fndecl_has_new_za_state (fndecl); +} + /* Helper for aarch64_can_inline_p. In the case where CALLER and CALLEE are tri-bool options (yes, no, don't care) and the default value is DEF, determine whether to reject inlining. */ @@ -20597,12 +20608,20 @@ aarch64_can_inline_p (tree caller, tree callee) : target_option_default_node); /* Callee's ISA flags should be a subset of the caller's. 
*/ - if ((caller_opts->x_aarch64_asm_isa_flags - & callee_opts->x_aarch64_asm_isa_flags) - != callee_opts->x_aarch64_asm_isa_flags) + auto caller_asm_isa = (caller_opts->x_aarch64_isa_flags + & ~AARCH64_FL_ISA_MODES); + auto callee_asm_isa = (callee_opts->x_aarch64_isa_flags + & ~AARCH64_FL_ISA_MODES); + if (callee_asm_isa & ~caller_asm_isa) return false; - if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags) - != callee_opts->x_aarch64_isa_flags) + + auto caller_isa = ((caller_opts->x_aarch64_isa_flags + & ~AARCH64_FL_ISA_MODES) + | aarch64_fndecl_isa_mode (caller)); + auto callee_isa = ((callee_opts->x_aarch64_isa_flags + & ~AARCH64_FL_ISA_MODES) + | aarch64_fndecl_isa_mode (callee)); + if (callee_isa & ~caller_isa) return false; /* Allow non-strict aligned functions inlining into strict @@ -29150,6 +29169,10 @@ aarch64_run_selftests (void) #undef TARGET_CAN_ELIMINATE #define TARGET_CAN_ELIMINATE aarch64_can_eliminate +#undef TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P +#define TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P \ + aarch64_function_attribute_inlinable_p + #undef TARGET_CAN_INLINE_P #define TARGET_CAN_INLINE_P aarch64_can_inline_p diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c new file mode 100644 index 00000000000..63d23cb8b41 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_1.c @@ -0,0 +1,26 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_streaming_compatible)) +sc_callee () {} + +inline void __attribute__((always_inline, arm_streaming)) +s_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +n_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline, arm_streaming_compatible, arm_locally_streaming)) +sc_ls_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline, arm_locally_streaming)) +n_ls_callee () {} // { dg-error "inlining failed" } + +void __attribute__((arm_streaming_compatible)) +sc_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c new file mode 100644 index 00000000000..277a5b691a2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_2.c @@ -0,0 +1,26 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_streaming_compatible)) +sc_callee () {} + +inline void __attribute__((always_inline, arm_streaming)) +s_callee () {} + +inline void __attribute__((always_inline)) +n_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline, arm_streaming_compatible, arm_locally_streaming)) +sc_ls_callee () {} + +inline void __attribute__((always_inline, arm_locally_streaming)) +n_ls_callee () {} + +void __attribute__((arm_streaming)) +s_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c new file mode 100644 index 00000000000..d9913350d05 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_3.c @@ -0,0 +1,26 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_streaming_compatible)) +sc_callee () {} + +inline void __attribute__((always_inline, arm_streaming)) +s_callee () {} // { dg-error "inlining failed" } + +inline void 
__attribute__((always_inline)) +n_callee () {} + +inline void __attribute__((always_inline, arm_streaming_compatible, arm_locally_streaming)) +sc_ls_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline, arm_locally_streaming)) +n_ls_callee () {} // { dg-error "inlining failed" } + +void +n_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c new file mode 100644 index 00000000000..db7f2cecc22 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_4.c @@ -0,0 +1,26 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_streaming_compatible)) +sc_callee () {} + +inline void __attribute__((always_inline, arm_streaming)) +s_callee () {} + +inline void __attribute__((always_inline)) +n_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline, arm_streaming_compatible, arm_locally_streaming)) +sc_ls_callee () {} + +inline void __attribute__((always_inline, arm_locally_streaming)) +n_ls_callee () {} + +void __attribute__((arm_streaming_compatible, arm_locally_streaming)) +sc_ls_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c new file mode 100644 index 00000000000..32db426682b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_5.c @@ -0,0 +1,26 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_streaming_compatible)) +sc_callee () {} + +inline void __attribute__((always_inline, arm_streaming)) +s_callee () {} + +inline void __attribute__((always_inline)) +n_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline, arm_streaming_compatible, arm_locally_streaming)) +sc_ls_callee () {} + +inline void __attribute__((always_inline, arm_locally_streaming)) +n_ls_callee () {} + +void __attribute__((arm_locally_streaming)) +n_ls_caller () +{ + sc_callee (); + s_callee (); + n_callee (); + sc_ls_callee (); + n_ls_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c new file mode 100644 index 00000000000..cf09c61f9d4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_6.c @@ -0,0 +1,18 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_shared_za)) +shared_callee () {} + +inline void __attribute__((always_inline, arm_new_za)) +new_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_callee () {} + +void __attribute__((arm_shared_za)) +shared_caller () +{ + shared_callee (); + new_callee (); + normal_callee (); +} diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c new file mode 100644 index 00000000000..8a5d261a8a2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_7.c @@ -0,0 +1,18 @@ +/* { dg-options "" } */ + +inline void __attribute__((always_inline, arm_shared_za)) +shared_callee () {} + +inline void __attribute__((always_inline, arm_new_za)) +new_callee () {} // { dg-error "inlining failed" } + +inline void __attribute__((always_inline)) +normal_callee () {} + +void __attribute__((arm_new_za)) +new_caller () +{ + shared_callee (); + new_callee (); + normal_callee (); 
+}

diff --git a/gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c b/gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c
new file mode 100644
index 00000000000..0706f5a5089
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/inlining_8.c
@@ -0,0 +1,18 @@
+/* { dg-options "" } */
+
+inline void __attribute__((always_inline, arm_shared_za))
+shared_callee () {} // { dg-error "inlining failed" }
+
+inline void __attribute__((always_inline, arm_new_za))
+new_callee () {} // { dg-error "inlining failed" }
+
+inline void __attribute__((always_inline))
+normal_callee () {}
+
+void
+normal_caller ()
+{
+  shared_callee ();
+  new_callee ();
+  normal_callee ();
+}

From patchwork Sun Nov 13 10:03:45 2022
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 16/16] aarch64: Update sibcall handling for SME
Date: Sun, 13 Nov 2022 10:03:45 +0000
From: Richard Sandiford

We only support tail calls between functions with the same PSTATE.ZA
setting ("private-ZA" to "private-ZA" and "shared-ZA" to "shared-ZA").
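As a rough illustration of the PSTATE.ZA rule (not part of the patch;
the function names below are invented, though the attributes are the
ones added earlier in the series):

  void __attribute__((arm_shared_za)) shared_za_callee ();
  void private_za_callee ();

  /* Caller and callee agree on PSTATE.ZA, so this call may become a
     tail call (subject to the usual sibcall checks).  */
  void __attribute__((arm_shared_za))
  shared_to_shared ()
  {
    shared_za_callee ();
  }

  /* The callee is private-ZA while the caller is shared-ZA, so this
     call must remain a normal call.  */
  void __attribute__((arm_shared_za))
  shared_to_private ()
  {
    private_za_callee ();
  }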
Only a normal non-streaming function can tail-call another non-streaming function, and only a streaming function can tail-call another streaming function. Any function can tail-call a streaming-compatible function. gcc/ * config/aarch64/aarch64.cc (aarch64_function_ok_for_sibcall): Enforce PSTATE.SM and PSTATE.ZA restrictions. (aarch64_expand_epilogue): Save and restore the arguments to a sibcall around any change to PSTATE.SM. gcc/testsuite/ * gcc.target/aarch64/sme/locally_streaming_4.c: New test. * gcc.target/aarch64/sme/sibcall_1.c: Likewise. * gcc.target/aarch64/sme/sibcall_2.c: Likewise. * gcc.target/aarch64/sme/sibcall_3.c: Likewise. * gcc.target/aarch64/sme/sibcall_4.c: Likewise. * gcc.target/aarch64/sme/sibcall_5.c: Likewise. * gcc.target/aarch64/sme/sibcall_6.c: Likewise. * gcc.target/aarch64/sme/sibcall_7.c: Likewise. * gcc.target/aarch64/sme/sibcall_8.c: Likewise. --- gcc/config/aarch64/aarch64.cc | 9 +- .../aarch64/sme/locally_streaming_4.c | 129 ++++++++++++++++++ .../gcc.target/aarch64/sme/sibcall_1.c | 45 ++++++ .../gcc.target/aarch64/sme/sibcall_2.c | 45 ++++++ .../gcc.target/aarch64/sme/sibcall_3.c | 45 ++++++ .../gcc.target/aarch64/sme/sibcall_4.c | 45 ++++++ .../gcc.target/aarch64/sme/sibcall_5.c | 45 ++++++ .../gcc.target/aarch64/sme/sibcall_6.c | 26 ++++ .../gcc.target/aarch64/sme/sibcall_7.c | 26 ++++ .../gcc.target/aarch64/sme/sibcall_8.c | 19 +++ 10 files changed, 433 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 9a4a469a078..0d4c20f5c6a 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -8110,6 +8110,11 @@ aarch64_function_ok_for_sibcall (tree, tree exp) if (crtl->abi->id () != expr_callee_abi (exp).id ()) return false; + tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp))); + if (aarch64_fntype_sm_state (fntype) & ~aarch64_cfun_incoming_sm_state ()) + return false; + if (aarch64_fntype_za_state (fntype) != aarch64_cfun_incoming_za_state ()) + return false; return true; } @@ -11236,7 +11241,9 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall) guard_label = aarch64_guard_switch_pstate_sm (IP0_REGNUM, aarch64_isa_flags); aarch64_sme_mode_switch_regs args_switch; - if (crtl->return_rtx && REG_P (crtl->return_rtx)) + if (sibcall) + args_switch.add_call_args (sibcall); + else if (crtl->return_rtx && REG_P (crtl->return_rtx)) args_switch.add_reg (GET_MODE (crtl->return_rtx), REGNO (crtl->return_rtx)); args_switch.emit_prologue (); diff --git a/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c new file mode 100644 index 00000000000..b0e4759ed11 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sme/locally_streaming_4.c @@ -0,0 +1,129 @@ +// { dg-options "-O -fomit-frame-pointer -fno-optimize-sibling-calls" } +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include + +/* 
+** test_d0:
+** ...
+** fmov x10, d0
+** smstop sm
+** fmov d0, x10
+** ...
+*/
+void consume_d0 (double d0);
+
+void __attribute__((arm_locally_streaming))
+test_d0 ()
+{
+  consume_d0 (1.0);
+}
+
+/*
+** test_d7:
+** ...
+** fmov x10, d0
+** fmov x11, d1
+** fmov x12, d2
+** fmov x13, d3
+** fmov x14, d4
+** fmov x15, d5
+** fmov x16, d6
+** fmov x17, d7
+** smstop sm
+** fmov d0, x10
+** fmov d1, x11
+** fmov d2, x12
+** fmov d3, x13
+** fmov d4, x14
+** fmov d5, x15
+** fmov d6, x16
+** fmov d7, x17
+** ...
+*/
+void consume_d7 (double d0, double d1, double d2, double d3,
+                 double d4, double d5, double d6, double d7);
+void __attribute__((arm_locally_streaming))
+test_d7 ()
+{
+  consume_d7 (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0);
+}
+
+/*
+** test_q7:
+** ...
+** stp q0, q1, \[sp, #?-128\]!
+** stp q2, q3, \[sp, #?32\]
+** stp q4, q5, \[sp, #?64\]
+** stp q6, q7, \[sp, #?96\]
+** smstop sm
+** ldp q2, q3, \[sp, #?32\]
+** ldp q4, q5, \[sp, #?64\]
+** ldp q6, q7, \[sp, #?96\]
+** ldp q0, q1, \[sp\], #?128
+** ...
+*/
+void consume_q7 (int8x16x4_t q0, int8x16x4_t q4);
+
+void __attribute__((arm_locally_streaming))
+test_q7 (int8x16x4_t *ptr)
+{
+  consume_q7 (ptr[0], ptr[1]);
+}
+
+/*
+** test_z7:
+** ...
+** addvl sp, sp, #-8
+** str z0, \[sp\]
+** str z1, \[sp, #1, mul vl\]
+** str z2, \[sp, #2, mul vl\]
+** str z3, \[sp, #3, mul vl\]
+** str z4, \[sp, #4, mul vl\]
+** str z5, \[sp, #5, mul vl\]
+** str z6, \[sp, #6, mul vl\]
+** str z7, \[sp, #7, mul vl\]
+** smstop sm
+** ldr z0, \[sp\]
+** ldr z1, \[sp, #1, mul vl\]
+** ldr z2, \[sp, #2, mul vl\]
+** ldr z3, \[sp, #3, mul vl\]
+** ldr z4, \[sp, #4, mul vl\]
+** ldr z5, \[sp, #5, mul vl\]
+** ldr z6, \[sp, #6, mul vl\]
+** ldr z7, \[sp, #7, mul vl\]
+** addvl sp, sp, #8
+** ...
+*/
+void consume_z7 (svint8x4_t z0, svint8x4_t z4);
+
+void __attribute__((arm_locally_streaming))
+test_z7 (svint8x4_t *ptr1, svint8x4_t *ptr2)
+{
+  consume_z7 (*ptr1, *ptr2);
+}
+
+/*
+** test_p3:
+** ...
+** addvl sp, sp, #-1
+** str p0, \[sp\]
+** str p1, \[sp, #1, mul vl\]
+** str p2, \[sp, #2, mul vl\]
+** str p3, \[sp, #3, mul vl\]
+** smstop sm
+** ldr p0, \[sp\]
+** ldr p1, \[sp, #1, mul vl\]
+** ldr p2, \[sp, #2, mul vl\]
+** ldr p3, \[sp, #3, mul vl\]
+** addvl sp, sp, #1
+** ...
+*/
+void consume_p3 (svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3);
+
+void __attribute__((arm_locally_streaming))
+test_p3 (svbool_t *ptr1, svbool_t *ptr2, svbool_t *ptr3, svbool_t *ptr4)
+{
+  consume_p3 (*ptr1, *ptr2, *ptr3, *ptr4);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c
new file mode 100644
index 00000000000..0b0f4191a60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_1.c
@@ -0,0 +1,45 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_streaming_compatible)) sc_callee ();
+void __attribute__((arm_streaming)) s_callee ();
+void n_callee ();
+
+void __attribute__((noipa, arm_streaming_compatible, arm_locally_streaming))
+sc_ls_callee () {}
+void __attribute__((noipa, arm_locally_streaming))
+n_ls_callee () {}
+
+void __attribute__((arm_streaming_compatible))
+sc_to_sc ()
+{
+  sc_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_callee} } } */
+
+void __attribute__((arm_streaming_compatible))
+sc_to_s ()
+{
+  s_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\ts_callee} } } */
+
+void __attribute__((arm_streaming_compatible))
+sc_to_n ()
+{
+  n_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tn_callee} } } */
+
+void __attribute__((arm_streaming_compatible))
+sc_to_sc_ls ()
+{
+  sc_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */
+
+void __attribute__((arm_streaming_compatible))
+sc_to_n_ls ()
+{
+  n_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tn_ls_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c
new file mode 100644
index 00000000000..95af22dd29d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_2.c
@@ -0,0 +1,45 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_streaming_compatible)) sc_callee ();
+void __attribute__((arm_streaming)) s_callee ();
+void n_callee ();
+
+void __attribute__((noipa, arm_streaming_compatible, arm_locally_streaming))
+sc_ls_callee () {}
+void __attribute__((noipa, arm_locally_streaming))
+n_ls_callee () {}
+
+void __attribute__((arm_streaming))
+s_to_sc ()
+{
+  sc_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_callee} } } */
+
+void __attribute__((arm_streaming))
+s_to_s ()
+{
+  s_callee ();
+}
+/* { dg-final { scan-assembler {\tb\ts_callee} } } */
+
+void __attribute__((arm_streaming))
+s_to_n ()
+{
+  n_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tn_callee} } } */
+
+void __attribute__((arm_streaming))
+s_to_sc_ls ()
+{
+  sc_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */
+
+void __attribute__((arm_streaming))
+s_to_n_ls ()
+{
+  n_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tn_ls_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c
new file mode 100644
index 00000000000..5221f925567
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_3.c
@@ -0,0 +1,45 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_streaming_compatible)) sc_callee ();
+void __attribute__((arm_streaming)) s_callee ();
+void n_callee ();
+
+void __attribute__((noipa, arm_streaming_compatible, arm_locally_streaming))
+sc_ls_callee () {}
+void __attribute__((noipa, arm_locally_streaming))
+n_ls_callee () {}
+
+void
+n_to_sc ()
+{
+  sc_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_callee} } } */
+
+void
+n_to_s ()
+{
+  s_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\ts_callee} } } */
+
+void
+n_to_n ()
+{
+  n_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tn_callee} } } */
+
+void
+n_to_sc_ls ()
+{
+  sc_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */
+
+void
+n_to_n_ls ()
+{
+  n_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tn_ls_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c
new file mode 100644
index 00000000000..21b6a66a1b2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_4.c
@@ -0,0 +1,45 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_streaming_compatible)) sc_callee ();
+void __attribute__((arm_streaming)) s_callee ();
+void n_callee ();
+
+void __attribute__((noipa, arm_streaming_compatible, arm_locally_streaming))
+sc_ls_callee () {}
+void __attribute__((noipa, arm_locally_streaming))
+n_ls_callee () {}
+
+void __attribute__((arm_streaming_compatible, arm_locally_streaming))
+sc_to_sc ()
+{
+  sc_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_callee} } } */
+
+void __attribute__((arm_streaming_compatible, arm_locally_streaming))
+sc_to_s ()
+{
+  s_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\ts_callee} } } */
+
+void __attribute__((arm_streaming_compatible, arm_locally_streaming))
+sc_to_n ()
+{
+  n_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tn_callee} } } */
+
+void __attribute__((arm_streaming_compatible, arm_locally_streaming))
+sc_to_sc_ls ()
+{
+  sc_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */
+
+void __attribute__((arm_streaming_compatible, arm_locally_streaming))
+sc_to_n_ls ()
+{
+  n_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tn_ls_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c
new file mode 100644
index 00000000000..736797a476c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_5.c
@@ -0,0 +1,45 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_streaming_compatible)) sc_callee ();
+void __attribute__((arm_streaming)) s_callee ();
+void n_callee ();
+
+void __attribute__((noipa, arm_streaming_compatible, arm_locally_streaming))
+sc_ls_callee () {}
+void __attribute__((noipa, arm_locally_streaming))
+n_ls_callee () {}
+
+void __attribute__((arm_locally_streaming))
+n_to_sc ()
+{
+  sc_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_callee} } } */
+
+void __attribute__((arm_locally_streaming))
+n_to_s ()
+{
+  s_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\ts_callee} } } */
+
+void __attribute__((arm_locally_streaming))
+n_to_n ()
+{
+  n_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tn_callee} } } */
+
+void __attribute__((arm_locally_streaming))
+n_to_sc_ls ()
+{
+  sc_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tsc_ls_callee} } } */
+
+void __attribute__((arm_locally_streaming))
+n_to_n_ls ()
+{
+  n_ls_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tn_ls_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c
new file mode 100644
index 00000000000..b2f321b7c8f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_6.c
@@ -0,0 +1,26 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_shared_za)) shared_callee ();
+void __attribute__((noipa, arm_new_za)) new_callee () {}
+void normal_callee ();
+
+void __attribute__((arm_shared_za))
+shared_to_shared ()
+{
+  shared_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tshared_callee} } } */
+
+void __attribute__((arm_shared_za))
+shared_to_new ()
+{
+  new_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tnew_callee} } } */
+
+void __attribute__((arm_shared_za))
+shared_to_normal ()
+{
+  normal_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tnormal_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c
new file mode 100644
index 00000000000..a096cf591b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_7.c
@@ -0,0 +1,26 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_shared_za)) shared_callee ();
+void __attribute__((noipa, arm_new_za)) new_callee () {}
+void normal_callee ();
+
+void __attribute__((arm_new_za))
+new_to_shared ()
+{
+  shared_callee ();
+}
+/* { dg-final { scan-assembler {\tbl\tshared_callee} } } */
+
+void __attribute__((arm_new_za))
+new_to_new ()
+{
+  new_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tnew_callee} } } */
+
+void __attribute__((arm_new_za))
+new_to_normal ()
+{
+  normal_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tnormal_callee} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c
new file mode 100644
index 00000000000..2553c10045a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sme/sibcall_8.c
@@ -0,0 +1,19 @@
+/* { dg-options "-O2" } */
+
+void __attribute__((arm_shared_za)) shared_callee ();
+void __attribute__((noipa, arm_new_za)) new_callee () {}
+void normal_callee ();
+
+void
+normal_to_new ()
+{
+  new_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tnew_callee} } } */
+
+void
+normal_to_normal ()
+{
+  normal_callee ();
+}
+/* { dg-final { scan-assembler {\tb\tnormal_callee} } } */
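
As a small illustration of the new args_switch.add_call_args path (not part
of the testsuite above; consume_d0 is reused from locally_streaming_4.c and
tail_call_d0 is a made-up caller, so the exact code sequence is only
indicative): when sibcalls are enabled, a locally-streaming function that
tail-calls a non-streaming function has to keep the call's argument
registers live across the "smstop sm" that the epilogue emits before the
branch, much like the fmov pairs in the test above.

  /* Illustrative only, not part of the patch.  Built with -O2 so that the
     call below becomes a sibcall; the epilogue is then expected to preserve
     d0 around the mode switch, roughly:
	 fmov	x10, d0
	 smstop	sm
	 fmov	d0, x10
	 b	consume_d0  */
  void consume_d0 (double d0);

  void __attribute__((arm_locally_streaming))
  tail_call_d0 (void)
  {
    consume_d0 (1.0);
  }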