From patchwork Tue May 31 08:49:52 2022
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 54544
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai
Subject: [PATCH 01/21] Add RVV modes and support scalable vector
Date: Tue, 31 May 2022 16:49:52 +0800
Message-Id: <20220531085012.269719-2-juzhe.zhong@rivai.ai>
In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai>
References: <20220531085012.269719-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.1
From: zhongjuzhe gcc/ChangeLog: * config.gcc: Add riscv-vector.o extra_objs for RVV support. * config/riscv/constraints.md: New constraints. * config/riscv/predicates.md: New predicates. * config/riscv/riscv-modes.def: New machine mode. * config/riscv/riscv-opts.h: New enum. * config/riscv/riscv-protos.h: New functions declare. * config/riscv/riscv-sr.cc (riscv_remove_unneeded_save_restore_calls): Adjust for poly_int. * config/riscv/riscv.cc (struct riscv_frame_info): Change HOST_WIDE_INT to poly_int64. (poly_uint16 riscv_vector_chunks): New declare. (riscv_legitimate_constant_p): Adjust for poly_int. (riscv_cannot_force_const_mem): Adjust for poly_int. (riscv_valid_offset_p): Adjust for poly_int. (riscv_valid_lo_sum_p): Adjust for poly_int. (riscv_classify_address): Disallow PLUS, LO_SUM and CONST_INT memory address for RVV.
(riscv_address_insns): Adjust for poly_int. (riscv_const_insns): Adjust for poly_int. (riscv_load_store_insns): Adjust for poly_int. (riscv_legitimize_move): Adjust for poly_int. (riscv_binary_cost): Adjust for poly_int. (riscv_rtx_costs): Adjust for poly_int. (riscv_output_move): Adjust for poly_int. (riscv_extend_comparands): Adjust for poly_int. (riscv_flatten_aggregate_field): Adjust for poly_int. (riscv_get_arg_info): Adjust for poly_int. (riscv_pass_by_reference): Adjust for poly_int. (riscv_elf_select_rtx_section): Adjust for poly_int. (riscv_stack_align): Adjust for poly_int. (riscv_compute_frame_info): Adjust for poly_int. (riscv_initial_elimination_offset): Change HOST_WIDE_INT to poly_int64. (riscv_set_return_address): Adjust for poly_int. (riscv_for_each_saved_reg): Adjust for poly_int. (riscv_first_stack_step): Adjust for poly_int. (riscv_expand_prologue): Adjust for poly_int. (riscv_expand_epilogue): Adjust for poly_int. (riscv_can_use_return_insn): Adjust for poly_int. (riscv_secondary_memory_needed): Disable secondary memory for RVV. (riscv_hard_regno_nregs): Add RVV register allocation. (riscv_hard_regno_mode_ok): Add RVV register allocation. (riscv_convert_riscv_vector_bits): New function. (riscv_option_override): Add RVV vector bits parser. (riscv_promote_function_mode): Adjust for RVV modes. * config/riscv/riscv.h: New macro define. * config/riscv/riscv.md: Adjust for poly_int. * config/riscv/riscv.opt: New option. * config/riscv/t-riscv: New object. * config/riscv/riscv-vector.cc: New file. * config/riscv/riscv-vector.h: New file. --- gcc/config.gcc | 2 +- gcc/config/riscv/constraints.md | 17 ++ gcc/config/riscv/predicates.md | 5 +- gcc/config/riscv/riscv-modes.def | 177 ++++++++++++++++++ gcc/config/riscv/riscv-opts.h | 27 +++ gcc/config/riscv/riscv-protos.h | 9 +- gcc/config/riscv/riscv-sr.cc | 2 +- gcc/config/riscv/riscv-vector.cc | 229 +++++++++++++++++++++++ gcc/config/riscv/riscv-vector.h | 28 +++ gcc/config/riscv/riscv.cc | 302 +++++++++++++++++++++++-------- gcc/config/riscv/riscv.h | 84 +++++++-- gcc/config/riscv/riscv.md | 36 ++-- gcc/config/riscv/riscv.opt | 32 ++++ gcc/config/riscv/t-riscv | 4 + 14 files changed, 849 insertions(+), 105 deletions(-) create mode 100644 gcc/config/riscv/riscv-vector.cc create mode 100644 gcc/config/riscv/riscv-vector.h diff --git a/gcc/config.gcc b/gcc/config.gcc index cdbefb5b4f5..50154c2eb3a 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -517,7 +517,7 @@ pru-*-*) ;; riscv*) cpu_type=riscv - extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o" + extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-vector.o" d_target_objs="riscv-d.o" ;; rs6000*-*-*) diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md index bafa4188ccb..7fd61a04216 100644 --- a/gcc/config/riscv/constraints.md +++ b/gcc/config/riscv/constraints.md @@ -80,3 +80,20 @@ A constant @code{move_operand}." (and (match_operand 0 "move_operand") (match_test "CONSTANT_P (op)"))) + +;; Vector constraints. + +(define_register_constraint "vr" "TARGET_VECTOR ? V_REGS : NO_REGS" + "A vector register (if available).") + +(define_register_constraint "vd" "TARGET_VECTOR ? VD_REGS : NO_REGS" + "A vector register except mask register (if available).") + +(define_register_constraint "vm" "TARGET_VECTOR ? 
VM_REGS : NO_REGS" + "A vector mask register (if available).") + +(define_constraint "vp" + "POLY_INT" + (and (match_code "const_poly_int") + (match_test "CONST_POLY_INT_COEFFS (op)[0] == UNITS_PER_V_REG.coeffs[0] + && CONST_POLY_INT_COEFFS (op)[1] == UNITS_PER_V_REG.coeffs[1]"))) diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index c37caa2502b..6328cfff367 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -71,7 +71,7 @@ { /* Don't handle multi-word moves this way; we don't want to introduce the individual word-mode moves until after reload. */ - if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) + if (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD) return false; /* Check whether the constant can be loaded in a single @@ -145,6 +145,9 @@ { case CONST_INT: return !splittable_const_int_operand (op, mode); + + case CONST_POLY_INT: + return rvv_legitimate_poly_int_p (op); case CONST: case SYMBOL_REF: diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def index 653228409a4..88cf9551727 100644 --- a/gcc/config/riscv/riscv-modes.def +++ b/gcc/config/riscv/riscv-modes.def @@ -20,3 +20,180 @@ along with GCC; see the file COPYING3. If not see . */ FLOAT_MODE (TF, 16, ieee_quad_format); + +/* Vector modes. */ + +/* Encode the ratio of SEW/LMUL into the mask types. There are the following mask types. */ + +/* | Type | Mode | SEW/LMUL | + | vbool64_t | VNx2BI | 64 | + | vbool32_t | VNx4BI | 32 | + | vbool16_t | VNx8BI | 16 | + | vbool8_t | VNx16BI | 8 | + | vbool4_t | VNx32BI | 4 | + | vbool2_t | VNx64BI | 2 | + | vbool1_t | VNx128BI | 1 | */ + +VECTOR_BOOL_MODE (VNx2BI, 2, BI, 16); +VECTOR_BOOL_MODE (VNx4BI, 4, BI, 16); +VECTOR_BOOL_MODE (VNx8BI, 8, BI, 16); +VECTOR_BOOL_MODE (VNx16BI, 16, BI, 16); +VECTOR_BOOL_MODE (VNx32BI, 32, BI, 16); +VECTOR_BOOL_MODE (VNx64BI, 64, BI, 16); +VECTOR_BOOL_MODE (VNx128BI, 128, BI, 16); + +ADJUST_NUNITS (VNx2BI, riscv_vector_chunks * 1); +ADJUST_NUNITS (VNx4BI, riscv_vector_chunks * 2); +ADJUST_NUNITS (VNx8BI, riscv_vector_chunks * 4); +ADJUST_NUNITS (VNx16BI, riscv_vector_chunks * 8); +ADJUST_NUNITS (VNx32BI, riscv_vector_chunks * 16); +ADJUST_NUNITS (VNx64BI, riscv_vector_chunks * 32); +ADJUST_NUNITS (VNx128BI, riscv_vector_chunks * 64); + +ADJUST_ALIGNMENT (VNx2BI, 1); +ADJUST_ALIGNMENT (VNx4BI, 1); +ADJUST_ALIGNMENT (VNx8BI, 1); +ADJUST_ALIGNMENT (VNx16BI, 1); +ADJUST_ALIGNMENT (VNx32BI, 1); +ADJUST_ALIGNMENT (VNx64BI, 1); +ADJUST_ALIGNMENT (VNx128BI, 1); + +ADJUST_BYTESIZE (VNx2BI, riscv_vector_chunks * 8); +ADJUST_BYTESIZE (VNx4BI, riscv_vector_chunks * 8); +ADJUST_BYTESIZE (VNx8BI, riscv_vector_chunks * 8); +ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * 8); +ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * 8); +ADJUST_BYTESIZE (VNx64BI, riscv_vector_chunks * 8); +ADJUST_BYTESIZE (VNx128BI, riscv_vector_chunks * 8); + +/* Define RVV modes for NVECS vectors. VB, VH, VS and VD are the prefixes + for 8-bit, 16-bit, 32-bit and 64-bit elements respectively. It isn't + strictly necessary to set the alignment here, since the default would + be clamped to BIGGEST_ALIGNMENT anyhow, but it seems clearer. */ + +/* TODO:Because there is no 'Zfh' on the upstream GCC, we will support + vector mode with 16-bit half-precision floating-point in the next patch + after 'Zfh' is supported in the GCC upstream. 
*/ + +/* | Type | Mode | SEW/LMUL | + | vint8m1_t/vuint8m1_t | VNx16QI | 8 | + | vint8m2_t/vuint8m2_t | VNx32QI | 4 | + | vint8m4_t/vuint8m4_t | VNx64QI | 2 | + | vint8m8_t/vuint8m8_t | VNx128QI | 1 | + | vint16m1_t/vint16m1_t | VNx8HI | 16 | + | vint16m2_t/vint16m2_t | VNx16HI | 8 | + | vint16m4_t/vint16m4_t | VNx32HI | 4 | + | vint16m8_t/vint16m8_t | VNx64HI | 2 | + | vint32m1_t/vint32m1_t | VNx4SI | 32 | + | vint32m2_t/vint32m2_t | VNx8SI | 16 | + | vint32m4_t/vint32m4_t | VNx16SI | 8 | + | vint32m8_t/vint32m8_t | VNx32SI | 4 | + | vint64m1_t/vint64m1_t | VNx2DI | 64 | + | vint64m2_t/vint64m2_t | VNx4DI | 32 | + | vint64m4_t/vint64m4_t | VNx8DI | 16 | + | vint64m8_t/vint64m8_t | VNx16DI | 8 | + | vfloat32m1_t | VNx4SF | 32 | + | vfloat32m2_t | VNx8SF | 16 | + | vfloat32m4_t | VNx16SF | 8 | + | vfloat32m8_t | VNx32SF | 4 | + | vfloat64m1_t | VNx2DF | 64 | + | vfloat64m2_t | VNx4DF | 32 | + | vfloat64m4_t | VNx8DF | 16 | + | vfloat64m8_t | VNx16DF | 8 | */ + +#define RVV_MODES(NVECS, VB, VH, VS, VD) \ + VECTOR_MODES_WITH_PREFIX (VNx, INT, 16 * NVECS, 0); \ + VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 16 * NVECS, 0); \ + \ + ADJUST_NUNITS (VB##QI, riscv_vector_chunks * NVECS * 8); \ + ADJUST_NUNITS (VH##HI, riscv_vector_chunks * NVECS * 4); \ + ADJUST_NUNITS (VS##SI, riscv_vector_chunks * NVECS * 2); \ + ADJUST_NUNITS (VD##DI, riscv_vector_chunks * NVECS); \ + ADJUST_NUNITS (VS##SF, riscv_vector_chunks * NVECS * 2); \ + ADJUST_NUNITS (VD##DF, riscv_vector_chunks * NVECS); \ + \ + ADJUST_ALIGNMENT (VB##QI, 1); \ + ADJUST_ALIGNMENT (VH##HI, 2); \ + ADJUST_ALIGNMENT (VS##SI, 4); \ + ADJUST_ALIGNMENT (VD##DI, 8); \ + ADJUST_ALIGNMENT (VS##SF, 4); \ + ADJUST_ALIGNMENT (VD##DF, 8); + +/* Give vectors the names normally used for 128-bit vectors. + The actual number depends on command-line flags. */ +RVV_MODES (1, VNx16, VNx8, VNx4, VNx2) +RVV_MODES (2, VNx32, VNx16, VNx8, VNx4) +RVV_MODES (4, VNx64, VNx32, VNx16, VNx8) +RVV_MODES (8, VNx128, VNx64, VNx32, VNx16) + +/* Partial RVV vectors: + + VNx8QI VNx4HI VNx2SI VNx2SF + VNx4QI VNx2HI + VNx2QI + + In memory they occupy contiguous locations, in the same way as fixed-length + vectors. E.g. VNx8QImode is half the size of VNx16QImode. + + Passing 1 as the final argument ensures that the modes come after all + other modes in the GET_MODE_WIDER chain, so that we never pick them + in preference to a full vector mode. */ + +/* TODO:Because there is no 'Zfh' on the upstream GCC, we will support + vector mode with 16-bit half-precision floating-point in the another patch + after 'Zfh' is supported in the GCC upstream. 
*/ + +/* | Type | Mode | SEW/LMUL | + | vint8mf2_t/vuint8mf2_t | VNx8QI | 16 | + | vint8mf4_t/vuint8mf4_t | VNx4QI | 32 | + | vint8mf8_t/vuint8mf8_t | VNx2QI | 64 | + | vint16mf2_t/vuint16mf2_t | VNx4HI | 32 | + | vint16mf4_t/vuint16mf4_t | VNx2HI | 64 | + | vint32mf2_t/vuint32mf2_t | VNx2SI | 64 | + | vfloat32mf2_t | VNx2SF | 64 | */ + +VECTOR_MODES_WITH_PREFIX (VNx, INT, 2, 1); +VECTOR_MODES_WITH_PREFIX (VNx, INT, 4, 1); +VECTOR_MODES_WITH_PREFIX (VNx, INT, 8, 1); +VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 4, 1); +VECTOR_MODES_WITH_PREFIX (VNx, FLOAT, 8, 1); + +ADJUST_NUNITS (VNx2QI, riscv_vector_chunks); +ADJUST_NUNITS (VNx2HI, riscv_vector_chunks); +ADJUST_NUNITS (VNx2SI, riscv_vector_chunks); +ADJUST_NUNITS (VNx2SF, riscv_vector_chunks); + +ADJUST_NUNITS (VNx4QI, riscv_vector_chunks * 2); +ADJUST_NUNITS (VNx4HI, riscv_vector_chunks * 2); + +ADJUST_NUNITS (VNx8QI, riscv_vector_chunks * 4); + +ADJUST_ALIGNMENT (VNx2QI, 1); +ADJUST_ALIGNMENT (VNx4QI, 1); +ADJUST_ALIGNMENT (VNx8QI, 1); + +ADJUST_ALIGNMENT (VNx2HI, 2); +ADJUST_ALIGNMENT (VNx4HI, 2); + +ADJUST_ALIGNMENT (VNx2SI, 4); +ADJUST_ALIGNMENT (VNx2SF, 4); + +/* TODO:To support segment instructions, we need a new mode definition for tuple + mode in gcc/gcc/genmodes.cc which is not the codes in RISC-V port. + We support it in the another patch. */ + +/* A 8-tuple of RVV vectors with the maximum -mriscv-vector-bits= setting. + Note that this is a limit only on the compile-time sizes of modes; + it is not a limit on the runtime sizes, since VL-agnostic code + must work with arbitary vector lengths. */ +/* TODO:According to RISC-V 'V' ISA spec, the maximun vector length can + be 65536 for a single vector register which means the vector mode in + GCC can be maximum = 65536 * 8 bits (nf=8). However, 'GET_MODE_SIZE' + is using poly_uint16/unsigned short which will overflow if we specify + vector-length = 65536. To support this feature, we need to change the + codes outside the RISC-V port. We will support it in another patch. */ +#define MAX_BITSIZE_MODE_ANY_MODE (4096 * 8) + +/* Coefficient 1 is multiplied by the number of 64-bit chunks in a vector + minus one. */ +#define NUM_POLY_INT_COEFFS 2 \ No newline at end of file diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index 1e153b3a6e7..d99b8dcbaf1 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -67,6 +67,33 @@ enum stack_protector_guard { SSP_GLOBAL /* global canary */ }; +/* RVV vector register sizes. 
*/ +enum riscv_vector_bits_enum +{ + RVV_SCALABLE, + RVV_NOT_IMPLEMENTED = RVV_SCALABLE, + RVV_64 = 64, + RVV_128 = 128, + RVV_256 = 256, + RVV_512 = 512, + RVV_1024 = 1024, + RVV_2048 = 2048, + RVV_4096 = 4096 +}; + +enum vlmul_field_enum +{ + VLMUL_FIELD_000, /* LMUL = 1 */ + VLMUL_FIELD_001, /* LMUL = 2 */ + VLMUL_FIELD_010, /* LMUL = 4 */ + VLMUL_FIELD_011, /* LMUL = 8 */ + VLMUL_FIELD_100, /* RESERVED */ + VLMUL_FIELD_101, /* LMUL = 1/8 */ + VLMUL_FIELD_110, /* LMUL = 1/4 */ + VLMUL_FIELD_111, /* LMUL = 1/2 */ + MAX_VLMUL_FIELD +}; + #define MASK_ZICSR (1 << 0) #define MASK_ZIFENCEI (1 << 1) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 20c2381c21a..19c50f0e702 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -64,7 +64,7 @@ extern rtx riscv_legitimize_call_address (rtx); extern void riscv_set_return_address (rtx, rtx); extern bool riscv_expand_block_move (rtx, rtx, rtx); extern rtx riscv_return_addr (int, rtx); -extern HOST_WIDE_INT riscv_initial_elimination_offset (int, int); +extern poly_int64 riscv_initial_elimination_offset (int, int); extern void riscv_expand_prologue (void); extern void riscv_expand_epilogue (int); extern bool riscv_epilogue_uses (unsigned int); @@ -109,4 +109,11 @@ struct riscv_cpu_info { extern const riscv_cpu_info *riscv_find_cpu (const char *); +/* Routines implemented in riscv-vector.cc. */ +extern bool rvv_mode_p (machine_mode); +extern bool rvv_legitimate_poly_int_p (rtx); +extern unsigned int rvv_offset_temporaries (bool, poly_int64); +extern enum vlmul_field_enum riscv_classify_vlmul_field (machine_mode); +extern int rvv_regsize (machine_mode); + #endif /* ! GCC_RISCV_PROTOS_H */ diff --git a/gcc/config/riscv/riscv-sr.cc b/gcc/config/riscv/riscv-sr.cc index 694f90c1583..7248f04d68f 100644 --- a/gcc/config/riscv/riscv-sr.cc +++ b/gcc/config/riscv/riscv-sr.cc @@ -247,7 +247,7 @@ riscv_remove_unneeded_save_restore_calls (void) /* We'll adjust stack size after this optimization, that require update every sp use site, which could be unsafe, so we decide to turn off this optimization if there are any arguments put on stack. */ - if (crtl->args.size != 0) + if (known_ne (crtl->args.size, 0)) return; /* Will point to the first instruction of the function body, after the diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc new file mode 100644 index 00000000000..e315b5d2cac --- /dev/null +++ b/gcc/config/riscv/riscv-vector.cc @@ -0,0 +1,229 @@ +/* Subroutines used for code generation for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#define IN_TARGET_CODE 1 +#define INCLUDE_STRING +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "backend.h" +#include "rtl.h" +#include "regs.h" +#include "insn-config.h" +#include "insn-attr.h" +#include "recog.h" +#include "output.h" +#include "alias.h" +#include "tree.h" +#include "stringpool.h" +#include "attribs.h" +#include "varasm.h" +#include "stor-layout.h" +#include "calls.h" +#include "function.h" +#include "explow.h" +#include "memmodel.h" +#include "emit-rtl.h" +#include "reload.h" +#include "tm_p.h" +#include "target.h" +#include "basic-block.h" +#include "expr.h" +#include "optabs.h" +#include "bitmap.h" +#include "df.h" +#include "diagnostic.h" +#include "builtins.h" +#include "predict.h" +#include "tree-pass.h" +#include "opts.h" +#include "langhooks.h" +#include "rtl-iter.h" +#include "gimple.h" +#include "cfghooks.h" +#include "cfgloop.h" +#include "fold-const.h" +#include "gimple-iterator.h" +#include "tree-vectorizer.h" +#include "tree-ssa-loop-niter.h" +#include "rtx-vector-builder.h" +#include "riscv-vector.h" +/* This file should be included last. */ +#include "target-def.h" + +#include +/* Helper functions for RVV */ + +/* Return true if it is a RVV mask mode. */ +bool +rvv_mask_mode_p (machine_mode mode) +{ + if (VECTOR_MODE_P (mode) + && GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL + && strncmp (GET_MODE_NAME (mode), "VNx", 3) == 0) + return true; + + return false; +} + +/* Return true if it is a RVV vector mode. */ +bool +rvv_vector_mode_p (machine_mode mode) +{ + if (VECTOR_MODE_P (mode) + && (GET_MODE_CLASS (mode) == MODE_VECTOR_INT + || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT) + && strncmp (GET_MODE_NAME (mode), "VNx", 3) == 0 + /* So far we only support SEW <= 64 RVV mode. */ + && GET_MODE_BITSIZE (GET_MODE_INNER (mode)) <= 64) + return true; + + return false; +} + +/* Return true if it is a RVV mode. */ +bool +rvv_mode_p (machine_mode mode) +{ + return rvv_mask_mode_p (mode) || rvv_vector_mode_p (mode); +} + +/* Return true if it is a const poly int whose size is equal to a LMUL = 1 RVV vector. */ +bool +rvv_legitimate_poly_int_p (rtx x) +{ + poly_int64 value = rtx_to_poly_int64 (x); + + HOST_WIDE_INT factor = value.coeffs[0]; + return (value.coeffs[1] == factor && factor == UNITS_PER_V_REG.coeffs[0]); +} + +/* Return the number of temporary registers that riscv_add_offset_1 + would need to add OFFSET to a register. */ + +static unsigned int +rvv_add_offset_1_temporaries (HOST_WIDE_INT offset) +{ + return SMALL_OPERAND (offset) ? 0 : 1; +} + +/* Return the number of temporary registers that riscv_add_offset + would need to move OFFSET into a register or add OFFSET to a register; + ADD_P is true if we want the latter rather than the former. */ + +unsigned int +rvv_offset_temporaries (bool add_p, poly_int64 offset) +{ + /* This follows the same structure as riscv_add_offset. */ + if (add_p && rvv_legitimate_poly_int_p (gen_int_mode (offset, Pmode))) + return 0; + + unsigned int count = 0; + HOST_WIDE_INT factor = offset.coeffs[1]; + HOST_WIDE_INT constant = offset.coeffs[0] - factor; + poly_int64 poly_offset (factor, factor); + if (add_p && rvv_legitimate_poly_int_p (gen_int_mode (offset, Pmode))) + /* Need one register for the csrr vlenb result. */ + count += 1; + else if (factor != 0) + { + factor = abs (factor); + if (!rvv_legitimate_poly_int_p (gen_int_mode (poly_offset, Pmode))) + /* Need one register for the CNT result and one for the multiplication + factor. 
If necessary, the second temporary can be reused for the + constant part of the offset. */ + return 2; + /* Need one register for the CNT result (which might then + be shifted). */ + count += 1; + } + return count + rvv_add_offset_1_temporaries (constant); +} + +/* Return the vlmul field for a specific machine mode. */ + +enum vlmul_field_enum +rvv_classify_vlmul_field (machine_mode mode) +{ + /* Case 1: LMUL = 1. */ + if (known_eq (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)) + return VLMUL_FIELD_000; + + /* Case 2: Fractional LMUL. */ + if (known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)) + { + unsigned int factor = exact_div (GET_MODE_SIZE (mode), + BYTES_PER_RISCV_VECTOR).to_constant (); + switch (factor) + { + case 2: + return VLMUL_FIELD_001; + case 4: + return VLMUL_FIELD_010; + case 8: + return VLMUL_FIELD_011; + default: + gcc_unreachable (); + } + } + + /* Case 3: Fractional LMUL. */ + if (known_lt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)) + { + unsigned int factor = exact_div (BYTES_PER_RISCV_VECTOR, + GET_MODE_SIZE (mode)).to_constant (); + switch (factor) + { + case 2: + return VLMUL_FIELD_111; + case 4: + return VLMUL_FIELD_110; + case 8: + return VLMUL_FIELD_101; + default: + gcc_unreachable (); + } + } + gcc_unreachable (); +} + +/* Return vlmul register size for a machine mode. */ + +int +rvv_regsize (machine_mode mode) +{ + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) + return 1; + + switch (rvv_classify_vlmul_field (mode)) + { + case VLMUL_FIELD_001: + return 2; + case VLMUL_FIELD_010: + return 4; + case VLMUL_FIELD_011: + return 8; + default: + break; + } + + return 1; +} \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h new file mode 100644 index 00000000000..b8d77ddb195 --- /dev/null +++ b/gcc/config/riscv/riscv-vector.h @@ -0,0 +1,28 @@ +/* Definition of RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#ifndef GCC_RISCV_VECTOR_H +#define GCC_RISCV_VECTOR_H +bool riscv_vector_mode_p (machine_mode); +bool rvv_legitimate_poly_int_p (rtx); +unsigned int rvv_offset_temporaries (bool, poly_int64); +vlmul_field_enum rvv_classify_vlmul_field (machine_mode); +int rvv_regsize (machine_mode); +#endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index f83dc796d88..37d8f1271d4 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -57,6 +57,16 @@ along with GCC; see the file COPYING3. 
If not see #include "predict.h" #include "tree-pass.h" #include "opts.h" +#include "langhooks.h" +#include "rtl-iter.h" +#include "gimple.h" +#include "cfghooks.h" +#include "cfgloop.h" +#include "fold-const.h" +#include "gimple-iterator.h" +#include "tree-vectorizer.h" +#include "tree-ssa-loop-niter.h" +#include "rtx-vector-builder.h" /* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF. */ #define UNSPEC_ADDRESS_P(X) \ @@ -100,7 +110,7 @@ enum riscv_address_type { /* Information about a function's frame layout. */ struct GTY(()) riscv_frame_info { /* The size of the frame in bytes. */ - HOST_WIDE_INT total_size; + poly_int64 total_size; /* Bit X is set if the function saves or restores GPR X. */ unsigned int mask; @@ -112,17 +122,20 @@ struct GTY(()) riscv_frame_info { unsigned save_libcall_adjustment; /* Offsets of fixed-point and floating-point save areas from frame bottom */ - HOST_WIDE_INT gp_sp_offset; - HOST_WIDE_INT fp_sp_offset; + poly_int64 gp_sp_offset; + poly_int64 fp_sp_offset; + + /* constant offset of scalable frame. */ + HOST_WIDE_INT constant_offset; /* Offset of virtual frame pointer from stack pointer/frame bottom */ - HOST_WIDE_INT frame_pointer_offset; + poly_int64 frame_pointer_offset; /* Offset of hard frame pointer from stack pointer/frame bottom */ - HOST_WIDE_INT hard_frame_pointer_offset; + poly_int64 hard_frame_pointer_offset; /* The offset of arg_pointer_rtx from the bottom of the frame. */ - HOST_WIDE_INT arg_pointer_offset; + poly_int64 arg_pointer_offset; }; enum riscv_privilege_levels { @@ -255,6 +268,9 @@ static const struct riscv_tune_param *tune_param; /* Which automaton to use for tuning. */ enum riscv_microarchitecture_type riscv_microarchitecture; +/* The number of 64-bit elements in an RVV vector. */ +poly_uint16 riscv_vector_chunks; + /* Index R is the smallest register class that contains register R. */ const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = { GR_REGS, GR_REGS, GR_REGS, GR_REGS, @@ -273,7 +289,22 @@ const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = { FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, FP_REGS, - FRAME_REGS, FRAME_REGS, + FRAME_REGS, FRAME_REGS, VL_REGS, VTYPE_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + NO_REGS, NO_REGS, NO_REGS, NO_REGS, + VM_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, + VD_REGS, VD_REGS, VD_REGS, VD_REGS, }; /* Costs to use when optimizing for rocket. */ @@ -713,6 +744,16 @@ static int riscv_symbol_insns (enum riscv_symbol_type type) static bool riscv_legitimate_constant_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x) { + /* If an offset is being added to something else, we need to allow the + base to be moved into the destination register, meaning that there + are no free temporaries for the offset. 
*/ + poly_int64 offset; + if (CONST_POLY_INT_P (x) + && poly_int_rtx_p (x, &offset) + && !offset.is_constant () + && rvv_offset_temporaries (true, offset) > 0) + return false; + return riscv_const_insns (x) > 0; } @@ -723,7 +764,13 @@ riscv_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x) { enum riscv_symbol_type type; rtx base, offset; - + + /* There's no way to calculate VL-based values using relocations. */ + subrtx_iterator::array_type array; + FOR_EACH_SUBRTX (iter, array, x, ALL) + if (GET_CODE (*iter) == CONST_POLY_INT) + return true; + /* There is no assembler syntax for expressing an address-sized high part. */ if (GET_CODE (x) == HIGH) @@ -798,8 +845,8 @@ riscv_valid_offset_p (rtx x, machine_mode mode) /* We may need to split multiword moves, so make sure that every word is accessible. */ - if (GET_MODE_SIZE (mode) > UNITS_PER_WORD - && !SMALL_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode) - UNITS_PER_WORD)) + if (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD + && !SMALL_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode).to_constant () - UNITS_PER_WORD)) return false; return true; @@ -863,7 +910,7 @@ riscv_valid_lo_sum_p (enum riscv_symbol_type sym_type, machine_mode mode, else { align = GET_MODE_ALIGNMENT (mode); - size = GET_MODE_BITSIZE (mode); + size = GET_MODE_BITSIZE (mode).to_constant (); } /* We may need to split multiword moves, so make sure that each word @@ -893,6 +940,9 @@ riscv_classify_address (struct riscv_address_info *info, rtx x, return riscv_valid_base_register_p (info->reg, mode, strict_p); case PLUS: + /* RVV load/store disallow any offset. */ + if (rvv_mode_p (mode)) + return false; info->type = ADDRESS_REG; info->reg = XEXP (x, 0); info->offset = XEXP (x, 1); @@ -900,6 +950,9 @@ riscv_classify_address (struct riscv_address_info *info, rtx x, && riscv_valid_offset_p (info->offset, mode)); case LO_SUM: + /* RVV load/store disallow LO_SUM. */ + if (rvv_mode_p (mode)) + return false; info->type = ADDRESS_LO_SUM; info->reg = XEXP (x, 0); info->offset = XEXP (x, 1); @@ -918,6 +971,9 @@ riscv_classify_address (struct riscv_address_info *info, rtx x, && riscv_valid_lo_sum_p (info->symbol_type, mode, info->offset)); case CONST_INT: + /* RVV load/store disallow CONST_INT. */ + if (rvv_mode_p (mode)) + return false; /* Small-integer addresses don't occur very often, but they are legitimate if x0 is a valid base register. */ info->type = ADDRESS_CONST_INT; @@ -1003,8 +1059,8 @@ riscv_address_insns (rtx x, machine_mode mode, bool might_split_p) /* BLKmode is used for single unaligned loads and stores and should not count as a multiword mode. */ - if (mode != BLKmode && might_split_p) - n += (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD; + if (mode != BLKmode && might_split_p && !rvv_mode_p (mode)) + n += (GET_MODE_SIZE (mode).to_constant () + UNITS_PER_WORD - 1) / UNITS_PER_WORD; if (addr.type == ADDRESS_LO_SUM) n += riscv_symbol_insns (addr.symbol_type) - 1; @@ -1061,7 +1117,13 @@ riscv_const_insns (rtx x) case SYMBOL_REF: case LABEL_REF: return riscv_symbol_insns (riscv_classify_symbol (x)); - + + /* TDO: In RVV, we get CONST_POLY_INT by using csrr vlenb + instruction and several scalar shift or mult instructions, + it is so far unknown. We set it to 4 temporarily. */ + case CONST_POLY_INT: + return 4; + default: return 0; } @@ -1097,9 +1159,9 @@ riscv_load_store_insns (rtx mem, rtx_insn *insn) /* Try to prove that INSN does not need to be split. 
*/ might_split_p = true; - if (GET_MODE_BITSIZE (mode) <= 32) + if (GET_MODE_BITSIZE (mode).to_constant () <= 32) might_split_p = false; - else if (GET_MODE_BITSIZE (mode) == 64) + else if (GET_MODE_BITSIZE (mode).to_constant () == 64) { set = single_set (insn); if (set && !riscv_split_64bit_move_p (SET_DEST (set), SET_SRC (set))) @@ -1616,7 +1678,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src) (set (reg:QI target) (subreg:QI (reg:DI temp) 0)) with auto-sign/zero extend. */ if (GET_MODE_CLASS (mode) == MODE_INT - && GET_MODE_SIZE (mode) < UNITS_PER_WORD + && GET_MODE_SIZE (mode).to_constant () < UNITS_PER_WORD && can_create_pseudo_p () && MEM_P (src)) { @@ -1641,7 +1703,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src) improve cse. */ machine_mode promoted_mode = mode; if (GET_MODE_CLASS (mode) == MODE_INT - && GET_MODE_SIZE (mode) < UNITS_PER_WORD) + && GET_MODE_SIZE (mode).to_constant () < UNITS_PER_WORD) promoted_mode = word_mode; if (splittable_const_int_operand (src, mode)) @@ -1739,7 +1801,8 @@ riscv_immediate_operand_p (int code, HOST_WIDE_INT x) static int riscv_binary_cost (rtx x, int single_insns, int double_insns) { - if (GET_MODE_SIZE (GET_MODE (x)) == UNITS_PER_WORD * 2) + if (!rvv_mode_p (GET_MODE (x)) + && GET_MODE_SIZE (GET_MODE (x)).to_constant () == UNITS_PER_WORD * 2) return COSTS_N_INSNS (double_insns); return COSTS_N_INSNS (single_insns); } @@ -1786,6 +1849,14 @@ static bool riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UNUSED, int *total, bool speed) { + /* TODO:We set RVV instruction cost as 1 by default. + Cost Model need to be well analyzed and supported in the future. */ + if (rvv_mode_p (mode)) + { + *total = COSTS_N_INSNS (1); + return true; + } + bool float_mode_p = FLOAT_MODE_P (mode); int cost; @@ -1845,7 +1916,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN return false; case NOT: - *total = COSTS_N_INSNS (GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 2 : 1); + *total = COSTS_N_INSNS (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD ? 2 : 1); return false; case AND: @@ -2092,7 +2163,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN if (float_mode_p) *total = tune_param->fp_add[mode == DFmode]; else - *total = COSTS_N_INSNS (GET_MODE_SIZE (mode) > UNITS_PER_WORD ? 4 : 1); + *total = COSTS_N_INSNS (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD ? 4 : 1); return false; case MULT: @@ -2101,7 +2172,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN else if (!TARGET_MUL) /* Estimate the cost of a library call. */ *total = COSTS_N_INSNS (speed ? 32 : 6); - else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) + else if (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD) *total = 3 * tune_param->int_mul[0] + COSTS_N_INSNS (2); else if (!speed) *total = COSTS_N_INSNS (1); @@ -2264,7 +2335,7 @@ riscv_output_move (rtx dest, rtx src) dest_code = GET_CODE (dest); src_code = GET_CODE (src); mode = GET_MODE (dest); - dbl_p = (GET_MODE_SIZE (mode) == 8); + dbl_p = (GET_MODE_SIZE (mode).to_constant () == 8); if (dbl_p && riscv_split_64bit_move_p (dest, src)) return "#"; @@ -2275,7 +2346,7 @@ riscv_output_move (rtx dest, rtx src) return dbl_p ? 
"fmv.x.d\t%0,%1" : "fmv.x.w\t%0,%1"; if (src_code == MEM) - switch (GET_MODE_SIZE (mode)) + switch (GET_MODE_SIZE (mode).to_constant ()) { case 1: return "lbu\t%0,%1"; case 2: return "lhu\t%0,%1"; @@ -2328,7 +2399,7 @@ riscv_output_move (rtx dest, rtx src) } } if (dest_code == MEM) - switch (GET_MODE_SIZE (mode)) + switch (GET_MODE_SIZE (mode).to_constant ()) { case 1: return "sb\t%z1,%0"; case 2: return "sh\t%z1,%0"; @@ -2349,6 +2420,17 @@ riscv_output_move (rtx dest, rtx src) if (src_code == MEM) return dbl_p ? "fld\t%0,%1" : "flw\t%0,%1"; } + if (dest_code == REG + && GP_REG_P (REGNO (dest)) + && src_code == CONST_POLY_INT) + { + /* we only want a single LMUL = 1 RVV vector register vlenb + read after reload. */ + poly_int64 value = rtx_to_poly_int64 (src); + gcc_assert (value.coeffs[0] == UNITS_PER_V_REG.coeffs[0] + && value.coeffs[1] == UNITS_PER_V_REG.coeffs[1]); + return "csrr\t%0,vlenb"; + } gcc_unreachable (); } @@ -2495,7 +2577,7 @@ static void riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1) { /* Comparisons consider all XLEN bits, so extend sub-XLEN values. */ - if (GET_MODE_SIZE (word_mode) > GET_MODE_SIZE (GET_MODE (*op0))) + if (GET_MODE_SIZE (word_mode) > GET_MODE_SIZE (GET_MODE (*op0)).to_constant ()) { /* It is more profitable to zero-extend QImode values. But not if the first operand has already been sign-extended, and the second one is @@ -2854,7 +2936,7 @@ riscv_flatten_aggregate_field (const_tree type, if (n != 0) return -1; - HOST_WIDE_INT elt_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (type))); + HOST_WIDE_INT elt_size = GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (type))).to_constant (); if (elt_size <= UNITS_PER_FP_ARG) { @@ -2872,9 +2954,9 @@ riscv_flatten_aggregate_field (const_tree type, default: if (n < 2 && ((SCALAR_FLOAT_TYPE_P (type) - && GET_MODE_SIZE (TYPE_MODE (type)) <= UNITS_PER_FP_ARG) + && GET_MODE_SIZE (TYPE_MODE (type)).to_constant () <= UNITS_PER_FP_ARG) || (INTEGRAL_TYPE_P (type) - && GET_MODE_SIZE (TYPE_MODE (type)) <= UNITS_PER_WORD))) + && GET_MODE_SIZE (TYPE_MODE (type)).to_constant () <= UNITS_PER_WORD))) { fields[n].type = type; fields[n].offset = offset; @@ -3110,7 +3192,7 @@ riscv_get_arg_info (struct riscv_arg_info *info, const CUMULATIVE_ARGS *cum, } /* Work out the size of the argument. */ - num_bytes = type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode); + num_bytes = type ? int_size_in_bytes (type) : GET_MODE_SIZE (mode).to_constant (); num_words = (num_bytes + UNITS_PER_WORD - 1) / UNITS_PER_WORD; /* Doubleword-aligned varargs start on an even register boundary. */ @@ -3204,7 +3286,7 @@ riscv_function_value (const_tree type, const_tree func, machine_mode mode) static bool riscv_pass_by_reference (cumulative_args_t cum_v, const function_arg_info &arg) { - HOST_WIDE_INT size = arg.type_size_in_bytes (); + HOST_WIDE_INT size = arg.type_size_in_bytes ().to_constant (); struct riscv_arg_info info; CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v); @@ -3833,7 +3915,7 @@ riscv_elf_select_rtx_section (machine_mode mode, rtx x, { section *s = default_elf_select_rtx_section (mode, x, align); - if (riscv_size_ok_for_small_data_p (GET_MODE_SIZE (mode))) + if (riscv_size_ok_for_small_data_p (GET_MODE_SIZE (mode).to_constant ())) { if (startswith (s->named.name, ".rodata.cst")) { @@ -3941,6 +4023,19 @@ riscv_save_libcall_count (unsigned mask) abort (); } +/* Handle stack align for poly_int. 
*/ +static poly_int64 +riscv_stack_align (poly_int64 value) +{ + return aligned_upper_bound (value, PREFERRED_STACK_BOUNDARY / 8); +} + +static HOST_WIDE_INT +riscv_stack_align (HOST_WIDE_INT value) +{ + return RISCV_STACK_ALIGN (value); +} + /* Populate the current function's riscv_frame_info structure. RISC-V stack frames grown downward. High addresses are at the top. @@ -3989,7 +4084,7 @@ static void riscv_compute_frame_info (void) { struct riscv_frame_info *frame; - HOST_WIDE_INT offset; + poly_int64 offset; bool interrupt_save_prologue_temp = false; unsigned int regno, i, num_x_saved = 0, num_f_saved = 0; @@ -4000,7 +4095,7 @@ riscv_compute_frame_info (void) if (cfun->machine->interrupt_handler_p) { HOST_WIDE_INT step1 = riscv_first_stack_step (frame); - if (! SMALL_OPERAND (frame->total_size - step1)) + if (! POLY_SMALL_OPERAND_P ((frame->total_size - step1))) interrupt_save_prologue_temp = true; } @@ -4030,23 +4125,23 @@ riscv_compute_frame_info (void) } /* At the bottom of the frame are any outgoing stack arguments. */ - offset = RISCV_STACK_ALIGN (crtl->outgoing_args_size); + offset = riscv_stack_align (crtl->outgoing_args_size); /* Next are local stack variables. */ - offset += RISCV_STACK_ALIGN (get_frame_size ()); + offset += riscv_stack_align (get_frame_size ()); /* The virtual frame pointer points above the local variables. */ frame->frame_pointer_offset = offset; /* Next are the callee-saved FPRs. */ if (frame->fmask) - offset += RISCV_STACK_ALIGN (num_f_saved * UNITS_PER_FP_REG); + offset += riscv_stack_align (num_f_saved * UNITS_PER_FP_REG); frame->fp_sp_offset = offset - UNITS_PER_FP_REG; /* Next are the callee-saved GPRs. */ if (frame->mask) { - unsigned x_save_size = RISCV_STACK_ALIGN (num_x_saved * UNITS_PER_WORD); + unsigned x_save_size = riscv_stack_align (num_x_saved * UNITS_PER_WORD); unsigned num_save_restore = 1 + riscv_save_libcall_count (frame->mask); /* Only use save/restore routines if they don't alter the stack size. */ - if (RISCV_STACK_ALIGN (num_save_restore * UNITS_PER_WORD) == x_save_size) + if (riscv_stack_align (num_save_restore * UNITS_PER_WORD) == x_save_size) { /* Libcall saves/restores 3 registers at once, so we need to allocate 12 bytes for callee-saved register. */ @@ -4062,17 +4157,18 @@ riscv_compute_frame_info (void) /* The hard frame pointer points above the callee-saved GPRs. */ frame->hard_frame_pointer_offset = offset; /* Above the hard frame pointer is the callee-allocated varags save area. */ - offset += RISCV_STACK_ALIGN (cfun->machine->varargs_size); + offset += riscv_stack_align (cfun->machine->varargs_size); /* Next is the callee-allocated area for pretend stack arguments. */ - offset += RISCV_STACK_ALIGN (crtl->args.pretend_args_size); + offset += riscv_stack_align (crtl->args.pretend_args_size); /* Arg pointer must be below pretend args, but must be above alignment padding. */ frame->arg_pointer_offset = offset - crtl->args.pretend_args_size; frame->total_size = offset; + /* Next points the incoming stack pointer and any incoming arguments. */ /* Only use save/restore routines when the GPRs are atop the frame. */ - if (frame->hard_frame_pointer_offset != frame->total_size) + if (known_ne (frame->hard_frame_pointer_offset, frame->total_size)) frame->save_libcall_adjustment = 0; } @@ -4089,10 +4185,10 @@ riscv_can_eliminate (const int from ATTRIBUTE_UNUSED, const int to) or argument pointer. TO is either the stack pointer or hard frame pointer. 
*/ -HOST_WIDE_INT +poly_int64 riscv_initial_elimination_offset (int from, int to) { - HOST_WIDE_INT src, dest; + poly_int64 src, dest; riscv_compute_frame_info (); @@ -4136,7 +4232,7 @@ riscv_set_return_address (rtx address, rtx scratch) gcc_assert (BITSET_P (cfun->machine->frame.mask, RETURN_ADDR_REGNUM)); slot_address = riscv_add_offset (scratch, stack_pointer_rtx, - cfun->machine->frame.gp_sp_offset); + cfun->machine->frame.gp_sp_offset.to_constant()); riscv_emit_move (gen_frame_mem (GET_MODE (address), slot_address), address); } @@ -4163,13 +4259,13 @@ riscv_save_restore_reg (machine_mode mode, int regno, of the frame. */ static void -riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, riscv_save_restore_fn fn, +riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn, bool epilogue, bool maybe_eh_return) { HOST_WIDE_INT offset; /* Save the link register and s-registers. */ - offset = cfun->machine->frame.gp_sp_offset - sp_offset; + offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant (); for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++) if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST)) { @@ -4200,14 +4296,14 @@ riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, riscv_save_restore_fn fn, /* This loop must iterate over the same space as its companion in riscv_compute_frame_info. */ - offset = cfun->machine->frame.fp_sp_offset - sp_offset; + offset = (cfun->machine->frame.fp_sp_offset - sp_offset).to_constant (); for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++) if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST)) { machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode; riscv_save_restore_reg (mode, regno, offset, fn); - offset -= GET_MODE_SIZE (mode); + offset -= GET_MODE_SIZE (mode).to_constant (); } } @@ -4249,21 +4345,21 @@ riscv_restore_reg (rtx reg, rtx mem) static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame) { - if (SMALL_OPERAND (frame->total_size)) - return frame->total_size; + if (SMALL_OPERAND (frame->total_size.to_constant())) + return frame->total_size.to_constant (); HOST_WIDE_INT min_first_step = - RISCV_STACK_ALIGN (frame->total_size - frame->fp_sp_offset); + RISCV_STACK_ALIGN ((frame->total_size - frame->fp_sp_offset).to_constant()); HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8; - HOST_WIDE_INT min_second_step = frame->total_size - max_first_step; + HOST_WIDE_INT min_second_step = frame->total_size.to_constant() - max_first_step; gcc_assert (min_first_step <= max_first_step); /* As an optimization, use the least-significant bits of the total frame size, so that the second adjustment step is just LUI + ADD. 
*/ if (!SMALL_OPERAND (min_second_step) - && frame->total_size % IMM_REACH < IMM_REACH / 2 - && frame->total_size % IMM_REACH >= min_first_step) - return frame->total_size % IMM_REACH; + && frame->total_size.to_constant() % IMM_REACH < IMM_REACH / 2 + && frame->total_size.to_constant() % IMM_REACH >= min_first_step) + return frame->total_size.to_constant() % IMM_REACH; if (TARGET_RVC) { @@ -4336,7 +4432,7 @@ void riscv_expand_prologue (void) { struct riscv_frame_info *frame = &cfun->machine->frame; - HOST_WIDE_INT size = frame->total_size; + HOST_WIDE_INT size = frame->total_size.to_constant (); unsigned mask = frame->mask; rtx insn; @@ -4379,7 +4475,7 @@ riscv_expand_prologue (void) if (frame_pointer_needed) { insn = gen_add3_insn (hard_frame_pointer_rtx, stack_pointer_rtx, - GEN_INT (frame->hard_frame_pointer_offset - size)); + GEN_INT ((frame->hard_frame_pointer_offset - size).to_constant ())); RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; riscv_emit_stack_tie (); @@ -4445,7 +4541,7 @@ riscv_expand_epilogue (int style) Start off by assuming that no registers need to be restored. */ struct riscv_frame_info *frame = &cfun->machine->frame; unsigned mask = frame->mask; - HOST_WIDE_INT step1 = frame->total_size; + HOST_WIDE_INT step1 = frame->total_size.to_constant (); HOST_WIDE_INT step2 = 0; bool use_restore_libcall = ((style == NORMAL_RETURN) && riscv_use_save_libcall (frame)); @@ -4453,8 +4549,8 @@ riscv_expand_epilogue (int style) rtx insn; /* We need to add memory barrier to prevent read from deallocated stack. */ - bool need_barrier_p = (get_frame_size () - + cfun->machine->frame.arg_pointer_offset) != 0; + bool need_barrier_p = known_ne (get_frame_size (), + cfun->machine->frame.arg_pointer_offset); if (cfun->machine->naked_p) { @@ -4481,7 +4577,7 @@ riscv_expand_epilogue (int style) riscv_emit_stack_tie (); need_barrier_p = false; - rtx adjust = GEN_INT (-frame->hard_frame_pointer_offset); + rtx adjust = GEN_INT (-frame->hard_frame_pointer_offset.to_constant ()); if (!SMALL_OPERAND (INTVAL (adjust))) { riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); @@ -4495,7 +4591,7 @@ riscv_expand_epilogue (int style) rtx dwarf = NULL_RTX; rtx cfa_adjust_value = gen_rtx_PLUS ( Pmode, hard_frame_pointer_rtx, - GEN_INT (-frame->hard_frame_pointer_offset)); + GEN_INT (-frame->hard_frame_pointer_offset.to_constant ())); rtx cfa_adjust_rtx = gen_rtx_SET (stack_pointer_rtx, cfa_adjust_value); dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, cfa_adjust_rtx, dwarf); RTX_FRAME_RELATED_P (insn) = 1; @@ -4512,7 +4608,7 @@ riscv_expand_epilogue (int style) } /* Set TARGET to BASE + STEP1. */ - if (step1 > 0) + if (known_gt (step1, 0)) { /* Emit a barrier to prevent loads from a deallocated stack. */ riscv_emit_stack_tie (); @@ -4638,7 +4734,7 @@ riscv_epilogue_uses (unsigned int regno) bool riscv_can_use_return_insn (void) { - return (reload_completed && cfun->machine->frame.total_size == 0 + return (reload_completed && known_eq (cfun->machine->frame.total_size, 0) && ! 
cfun->machine->interrupt_handler_p); } @@ -4738,7 +4834,8 @@ static bool riscv_secondary_memory_needed (machine_mode mode, reg_class_t class1, reg_class_t class2) { - return (GET_MODE_SIZE (mode) > UNITS_PER_WORD + return !rvv_mode_p (mode) + && (GET_MODE_SIZE (mode).to_constant () > UNITS_PER_WORD && (class1 == FP_REGS) != (class2 == FP_REGS)); } @@ -4760,11 +4857,33 @@ riscv_register_move_cost (machine_mode mode, static unsigned int riscv_hard_regno_nregs (unsigned int regno, machine_mode mode) { + if (rvv_mode_p (mode)) + { + /* TODO: Tuple mode register manipulation will be supported + for segment instructions in the future. */ + + /* Handle fractional LMUL, it only occupy part of vector register but still + need one vector register to hold. */ + if (maybe_lt (GET_MODE_SIZE (mode), UNITS_PER_V_REG)) + return 1; + + return exact_div (GET_MODE_SIZE (mode), UNITS_PER_V_REG).to_constant (); + } + + /* mode for VL or VTYPE are just a marker, not holding value, + so it always consume one register. */ + if (regno == VTYPE_REGNUM || regno == VL_REGNUM) + return 1; + + /* Assume every valid non-vector mode fits in one vector register. */ + if (V_REG_P (regno)) + return 1; + if (FP_REG_P (regno)) - return (GET_MODE_SIZE (mode) + UNITS_PER_FP_REG - 1) / UNITS_PER_FP_REG; + return (GET_MODE_SIZE (mode).to_constant () + UNITS_PER_FP_REG - 1) / UNITS_PER_FP_REG; /* All other registers are word-sized. */ - return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD; + return (GET_MODE_SIZE (mode).to_constant () + UNITS_PER_WORD - 1) / UNITS_PER_WORD; } /* Implement TARGET_HARD_REGNO_MODE_OK. */ @@ -4776,11 +4895,17 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode) if (GP_REG_P (regno)) { + if (rvv_mode_p (mode)) + return false; + if (!GP_REG_P (regno + nregs - 1)) return false; } else if (FP_REG_P (regno)) { + if (rvv_mode_p (mode)) + return false; + if (!FP_REG_P (regno + nregs - 1)) return false; @@ -4795,6 +4920,18 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode) && GET_MODE_UNIT_SIZE (mode) > UNITS_PER_FP_ARG)) return false; } + else if (V_REG_P (regno)) + { + if (!rvv_mode_p (mode)) + return false; + + /* 3.3.2. LMUL = 2,4,8, register numbers should be multiple of 2,4,8. + but for mask vector register, register numbers can be any number. */ + int regsize = rvv_regsize (mode); + + if (regsize != 1) + return ((regno % regsize) == 0); + } else return false; @@ -4971,6 +5108,26 @@ riscv_init_machine_status (void) return ggc_cleared_alloc (); } +/* Return the Vlen value associated with -mriscv-vector-bits= value VALUE. */ + +static poly_uint16 +riscv_convert_riscv_vector_bits (riscv_vector_bits_enum value) +{ + /* 64-bit RVV modes use different register layouts + on small-endian targets, so we would need to forbid subregs that convert + from one to the other. By default a reinterpret sequence would then + involve a store to memory in one mode and a load back in the other. + Even if we optimize that sequence using reverse instructions, + it would still be a significant potential overhead. + + For now, it seems better to generate length-agnostic code for that + case instead. */ + if (value == RVV_SCALABLE) + return poly_uint16 (2, 2); + else + return (int) value / 64; +} + /* Implement TARGET_OPTION_OVERRIDE. */ static void @@ -5116,7 +5273,9 @@ riscv_option_override (void) riscv_stack_protector_guard_offset = offs; } - + + /* Convert -mriscv-vector-bits to a chunks count. 
*/ + riscv_vector_chunks = riscv_convert_riscv_vector_bits (riscv_vector_bits); } /* Implement TARGET_CONDITIONAL_REGISTER_USAGE. */ @@ -5474,9 +5633,10 @@ riscv_promote_function_mode (const_tree type ATTRIBUTE_UNUSED, return promote_mode (type, mode, punsignedp); unsignedp = *punsignedp; - PROMOTE_MODE (mode, unsignedp, type); + scalar_mode smode = as_a (mode); + PROMOTE_MODE (smode, unsignedp, type); *punsignedp = unsignedp; - return mode; + return smode; } /* Implement TARGET_MACHINE_DEPENDENT_REORG. */ diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index 5083a1c24b0..8f56a5a4746 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -123,7 +123,7 @@ ASM_MISA_SPEC /* The mapping from gcc register number to DWARF 2 CFA column number. */ #define DWARF_FRAME_REGNUM(REGNO) \ - (GP_REG_P (REGNO) || FP_REG_P (REGNO) ? REGNO : INVALID_REGNUM) + (GP_REG_P (REGNO) || FP_REG_P (REGNO) || (TARGET_VECTOR && V_REG_P (REGNO)) ? REGNO : INVALID_REGNUM) /* The DWARF 2 CFA column which tracks the return address. */ #define DWARF_FRAME_RETURN_COLUMN RETURN_ADDR_REGNUM @@ -155,6 +155,7 @@ ASM_MISA_SPEC /* The `Q' extension is not yet supported. */ #define UNITS_PER_FP_REG (TARGET_DOUBLE_FLOAT ? 8 : 4) +#define UNITS_PER_V_REG (GET_MODE_SIZE (VNx2DImode)) /* The largest type that can be passed in floating-point registers. */ #define UNITS_PER_FP_ARG \ @@ -289,9 +290,13 @@ ASM_MISA_SPEC - 32 floating point registers - 2 fake registers: - ARG_POINTER_REGNUM - - FRAME_POINTER_REGNUM */ + - FRAME_POINTER_REGNUM + - 1 vl register + - 1 vtype register + - 30 unused registers for future expansion + - 32 vector registers */ -#define FIRST_PSEUDO_REGISTER 66 +#define FIRST_PSEUDO_REGISTER 128 /* x0, sp, gp, and tp are fixed. */ @@ -303,7 +308,11 @@ ASM_MISA_SPEC 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ /* Others. */ \ - 1, 1 \ + 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ + /* Vector registers. */ \ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 \ } /* a0-a7, t0-t6, fa0-fa7, and ft0-ft11 are volatile across calls. @@ -317,7 +326,11 @@ ASM_MISA_SPEC 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, \ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, \ /* Others. */ \ - 1, 1 \ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ + /* Vector registers. */ \ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 \ } /* Select a register mode required for caller save of hard regno REGNO. @@ -337,6 +350,10 @@ ASM_MISA_SPEC #define FP_REG_LAST 63 #define FP_REG_NUM (FP_REG_LAST - FP_REG_FIRST + 1) +#define V_REG_FIRST 96 +#define V_REG_LAST 127 +#define V_REG_NUM (V_REG_LAST - V_REG_FIRST + 1) + /* The DWARF 2 CFA column which tracks the return address from a signal handler context. This means that to maintain backwards compatibility, no hard register can be assigned this column if it @@ -347,6 +364,8 @@ ASM_MISA_SPEC ((unsigned int) ((int) (REGNO) - GP_REG_FIRST) < GP_REG_NUM) #define FP_REG_P(REGNO) \ ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM) +#define V_REG_P(REGNO) \ + ((unsigned int) ((int) (REGNO) - V_REG_FIRST) < V_REG_NUM) /* True when REGNO is in SIBCALL_REGS set. 
*/ #define SIBCALL_REG_P(REGNO) \ @@ -430,6 +449,11 @@ enum reg_class GR_REGS, /* integer registers */ FP_REGS, /* floating-point registers */ FRAME_REGS, /* arg pointer and frame pointer */ + VL_REGS, /* vl register */ + VTYPE_REGS, /* vype register */ + VM_REGS, /* v0.t registers */ + VD_REGS, /* vector registers except v0.t */ + V_REGS, /* vector registers */ ALL_REGS, /* all registers */ LIM_REG_CLASSES /* max value + 1 */ }; @@ -450,6 +474,11 @@ enum reg_class "GR_REGS", \ "FP_REGS", \ "FRAME_REGS", \ + "VL_REGS", \ + "VTYPE_REGS", \ + "VM_REGS", \ + "VD_REGS", \ + "V_REGS", \ "ALL_REGS" \ } @@ -466,13 +495,18 @@ enum reg_class #define REG_CLASS_CONTENTS \ { \ - { 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \ - { 0xf003fcc0, 0x00000000, 0x00000000 }, /* SIBCALL_REGS */ \ - { 0xffffffc0, 0x00000000, 0x00000000 }, /* JALR_REGS */ \ - { 0xffffffff, 0x00000000, 0x00000000 }, /* GR_REGS */ \ - { 0x00000000, 0xffffffff, 0x00000000 }, /* FP_REGS */ \ - { 0x00000000, 0x00000000, 0x00000003 }, /* FRAME_REGS */ \ - { 0xffffffff, 0xffffffff, 0x00000003 } /* ALL_REGS */ \ + { 0x00000000, 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \ + { 0xf003fcc0, 0x00000000, 0x00000000, 0x00000000 }, /* SIBCALL_REGS */ \ + { 0xffffffc0, 0x00000000, 0x00000000, 0x00000000 }, /* JALR_REGS */ \ + { 0xffffffff, 0x00000000, 0x00000000, 0x00000000 }, /* GR_REGS */ \ + { 0x00000000, 0xffffffff, 0x00000000, 0x00000000 }, /* FP_REGS */ \ + { 0x00000000, 0x00000000, 0x00000003, 0x00000000 }, /* FRAME_REGS */ \ + { 0x00000000, 0x00000000, 0x00000004, 0x00000000 }, /* VL_REGS */ \ + { 0x00000000, 0x00000000, 0x00000008, 0x00000000 }, /* VTYPE_REGS */ \ + { 0x00000000, 0x00000000, 0x00000000, 0x00000001 }, /* V0_REGS */ \ + { 0x00000000, 0x00000000, 0x00000000, 0xfffffffe }, /* VNoV0_REGS */ \ + { 0x00000000, 0x00000000, 0x00000000, 0xffffffff }, /* V_REGS */ \ + { 0xffffffff, 0xffffffff, 0x0000000f, 0xffffffff } /* ALL_REGS */ \ } /* A C expression whose value is a register class containing hard @@ -512,9 +546,16 @@ enum reg_class 60, 61, 62, 63, \ /* Call-saved FPRs. */ \ 40, 41, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, \ + /* V24 ~ V31. */ \ + 120, 121, 122, 123, 124, 125, 126, 127, \ + /* V8 ~ V23. */ \ + 104, 105, 106, 107, 108, 109, 110, 111, \ + 112, 113, 114, 115, 116, 117, 118, 119, \ + /* V0 ~ V7. */ \ + 96, 97, 98, 99, 100, 101, 102, 103, \ /* None of the remaining classes have defined call-saved \ registers. */ \ - 64, 65 \ + 64, 65, 66, 67 \ } /* True if VALUE is a signed 12-bit number. */ @@ -522,6 +563,10 @@ enum reg_class #define SMALL_OPERAND(VALUE) \ ((unsigned HOST_WIDE_INT) (VALUE) + IMM_REACH/2 < IMM_REACH) +#define POLY_SMALL_OPERAND_P(POLY_VALUE) \ + (POLY_VALUE.is_constant () ? \ + SMALL_OPERAND (POLY_VALUE.to_constant ()) : false) + /* True if VALUE can be loaded into a register using LUI. 
*/ #define LUI_OPERAND(VALUE) \ @@ -780,7 +825,14 @@ typedef struct { "fs0", "fs1", "fa0", "fa1", "fa2", "fa3", "fa4", "fa5", \ "fa6", "fa7", "fs2", "fs3", "fs4", "fs5", "fs6", "fs7", \ "fs8", "fs9", "fs10","fs11","ft8", "ft9", "ft10","ft11", \ - "arg", "frame", } + "arg", "frame", "vl", "vtype", "N/A", "N/A", "N/A", "N/A", \ + "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", \ + "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", \ + "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", "N/A", \ + "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", \ + "v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15", \ + "v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", \ + "v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31", } #define ADDITIONAL_REGISTER_NAMES \ { \ @@ -955,6 +1007,10 @@ while (0) extern const enum reg_class riscv_regno_to_class[]; extern bool riscv_slow_unaligned_access_p; extern unsigned riscv_stack_boundary; +extern poly_uint16 riscv_vector_chunks; +/* The number of bits and bytes in a RVV vector. */ +#define BITS_PER_RISCV_VECTOR (poly_uint16 (riscv_vector_chunks * 64)) +#define BYTES_PER_RISCV_VECTOR (poly_uint16 (riscv_vector_chunks * 8)) #endif #define ASM_PREFERRED_EH_DATA_FORMAT(CODE,GLOBAL) \ diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index b8ab0cf169a..8e880ba8599 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -102,6 +102,10 @@ (NORMAL_RETURN 0) (SIBCALL_RETURN 1) (EXCEPTION_RETURN 2) + + ;; Constant helper for RVV + (VL_REGNUM 66) + (VTYPE_REGNUM 67) ]) (include "predicates.md") @@ -1619,23 +1623,23 @@ }) (define_insn "*movdi_32bit" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,m, *f,*f,*r,*f,*m") - (match_operand:DI 1 "move_operand" " r,i,m,r,*J*r,*m,*f,*f,*f"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,r,m, *f,*f,*r,*f,*m") + (match_operand:DI 1 "move_operand" " vp,r,i,m,r,*J*r,*m,*f,*f,*f"))] "!TARGET_64BIT && (register_operand (operands[0], DImode) || reg_or_0_operand (operands[1], DImode))" { return riscv_output_move (operands[0], operands[1]); } - [(set_attr "move_type" "move,const,load,store,mtc,fpload,mfc,fmove,fpstore") + [(set_attr "move_type" "const,move,const,load,store,mtc,fpload,mfc,fmove,fpstore") (set_attr "mode" "DI")]) (define_insn "*movdi_64bit" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r, m, *f,*f,*r,*f,*m") - (match_operand:DI 1 "move_operand" " r,T,m,rJ,*r*J,*m,*f,*f,*f"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,r,r,r, m, *f,*f,*r,*f,*m") + (match_operand:DI 1 "move_operand" " vp,r,T,m,rJ,*r*J,*m,*f,*f,*f"))] "TARGET_64BIT && (register_operand (operands[0], DImode) || reg_or_0_operand (operands[1], DImode))" { return riscv_output_move (operands[0], operands[1]); } - [(set_attr "move_type" "move,const,load,store,mtc,fpload,mfc,fmove,fpstore") + [(set_attr "move_type" "const,move,const,load,store,mtc,fpload,mfc,fmove,fpstore") (set_attr "mode" "DI")]) ;; 32-bit Integer moves @@ -1650,12 +1654,12 @@ }) (define_insn "*movsi_internal" - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r, m, *f,*f,*r,*m") - (match_operand:SI 1 "move_operand" " r,T,m,rJ,*r*J,*m,*f,*f"))] + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,r, m, *f,*f,*r,*m") + (match_operand:SI 1 "move_operand" " vp,r,T,m,rJ,*r*J,*m,*f,*f"))] "(register_operand (operands[0], SImode) || reg_or_0_operand (operands[1], SImode))" { return riscv_output_move (operands[0], operands[1]); } - [(set_attr "move_type" "move,const,load,store,mtc,fpload,mfc,fpstore") + 
[(set_attr "move_type" "const,move,const,load,store,mtc,fpload,mfc,fpstore") (set_attr "mode" "SI")]) ;; 16-bit Integer moves @@ -1675,12 +1679,12 @@ }) (define_insn "*movhi_internal" - [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r, m, *f,*r") - (match_operand:HI 1 "move_operand" " r,T,m,rJ,*r*J,*f"))] + [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,r, m, *f,*r") + (match_operand:HI 1 "move_operand" " vp,r,T,m,rJ,*r*J,*f"))] "(register_operand (operands[0], HImode) || reg_or_0_operand (operands[1], HImode))" { return riscv_output_move (operands[0], operands[1]); } - [(set_attr "move_type" "move,const,load,store,mtc,mfc") + [(set_attr "move_type" "const,move,const,load,store,mtc,mfc") (set_attr "mode" "HI")]) ;; HImode constant generation; see riscv_move_integer for details. @@ -1717,12 +1721,12 @@ }) (define_insn "*movqi_internal" - [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r, m, *f,*r") - (match_operand:QI 1 "move_operand" " r,I,m,rJ,*r*J,*f"))] + [(set (match_operand:QI 0 "nonimmediate_operand" "=r,r,r,r, m, *f,*r") + (match_operand:QI 1 "move_operand" " vp,r,I,m,rJ,*r*J,*f"))] "(register_operand (operands[0], QImode) || reg_or_0_operand (operands[1], QImode))" { return riscv_output_move (operands[0], operands[1]); } - [(set_attr "move_type" "move,const,load,store,mtc,mfc") + [(set_attr "move_type" "const,move,const,load,store,mtc,mfc") (set_attr "mode" "QI")]) ;; 32-bit floating point moves @@ -2095,7 +2099,7 @@ (lshiftrt:GPR (match_dup 3) (match_dup 2)))] { /* Op2 is a VOIDmode constant, so get the mode size from op1. */ - operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1])) + operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1])).to_constant () - exact_log2 (INTVAL (operands[2]) + 1)); }) diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index 9e9fe6d8ccd..42bdca569cd 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -70,6 +70,38 @@ Enum(abi_type) String(lp64f) Value(ABI_LP64F) EnumValue Enum(abi_type) String(lp64d) Value(ABI_LP64D) +Enum +Name(riscv_vector_bits) Type(enum riscv_vector_bits_enum) +The possible RVV vector lengths: + +EnumValue +Enum(riscv_vector_bits) String(scalable) Value(RVV_SCALABLE) + +EnumValue +Enum(riscv_vector_bits) String(64) Value(RVV_64) + +EnumValue +Enum(riscv_vector_bits) String(128) Value(RVV_128) + +EnumValue +Enum(riscv_vector_bits) String(256) Value(RVV_256) + +EnumValue +Enum(riscv_vector_bits) String(512) Value(RVV_512) + +EnumValue +Enum(riscv_vector_bits) String(1024) Value(RVV_1024) + +EnumValue +Enum(riscv_vector_bits) String(2048) Value(RVV_2048) + +EnumValue +Enum(riscv_vector_bits) String(4096) Value(RVV_4096) + +mriscv-vector-bits= +Target RejectNegative Joined Enum(riscv_vector_bits) Var(riscv_vector_bits) Init(RVV_SCALABLE) +-mriscv-vector-bits= Set the number of bits in an RVV vector register. + mfdiv Target Mask(FDIV) Use hardware floating-point divide and square root instructions. 
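To make the new option easier to follow: each -mriscv-vector-bits= value above is translated by riscv_convert_riscv_vector_bits into riscv_vector_chunks, the poly_uint16 from which BITS_PER_RISCV_VECTOR (riscv_vector_chunks * 64) and BYTES_PER_RISCV_VECTOR are derived. The following is only a minimal sketch of that mapping, not the body of the patch's function; the fixed-length chunk counts follow from the 64-bit chunk size, while the (1, 1) coefficients used for the scalable case are an illustrative assumption.

static poly_uint16
riscv_convert_riscv_vector_bits (enum riscv_vector_bits_enum bits)
{
  switch (bits)
    {
    case RVV_64:   return 1;    /* Fixed 64-bit vectors: one 64-bit chunk.  */
    case RVV_128:  return 2;
    case RVV_256:  return 4;
    case RVV_512:  return 8;
    case RVV_1024: return 16;
    case RVV_2048: return 32;
    case RVV_4096: return 64;
    case RVV_SCALABLE:
    default:
      /* Vector length known only at run time: an indeterminate
         poly_uint16 (coefficients shown here are placeholders).  */
      return poly_uint16 (1, 1);
    }
}

With any of the fixed settings the chunk count stays a compile-time constant, so helpers such as POLY_SMALL_OPERAND_P fall back to the existing scalar paths; only the scalable setting leaves an indeterminate coefficient for the frame and register-allocation code to track.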
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv index 19736b3a38f..b5abf9c45d0 100644 --- a/gcc/config/riscv/t-riscv +++ b/gcc/config/riscv/t-riscv @@ -23,6 +23,10 @@ riscv-shorten-memrefs.o: $(srcdir)/config/riscv/riscv-shorten-memrefs.cc $(COMPILE) $< $(POSTCOMPILE) +riscv-vector.o: $(srcdir)/config/riscv/riscv-vector.cc + $(COMPILE) $< + $(POSTCOMPILE) + PASSES_EXTRA += $(srcdir)/config/riscv/riscv-passes.def $(common_out_file): $(srcdir)/config/riscv/riscv-cores.def \ From patchwork Tue May 31 08:49:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54545 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 989283856246 for ; Tue, 31 May 2022 08:52:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpproxy21.qq.com (smtpbg702.qq.com [203.205.195.102]) by sourceware.org (Postfix) with ESMTPS id 952DC3857426 for ; Tue, 31 May 2022 08:50:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 952DC3857426 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987022tyjbujku Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:21 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: dLvv507TVpTNoJEit2s5xsxrqGsmTFTCIQywhFQhac7ZCisYNhkQjANsa1oj/ WQayGj9gNw5f2CbYUV5QtB6qbdzpzn+GDMyD3Zk1zr4RxOcXxIe8tQEBxyXv+IIQEsWzzTe C+5Qqy12rVoOMDAnarnFqOn3BXQA+EiJSUPZoCt2OTX9SEcc+dQ/c/7MaNAw7yZwaRi9Bji QnaUf09ve2yaBrvlGmcjjYPW1X2L4CbBWKH79YzhjuyFua3eDvgkHTku7NIXguUQiRNDTRp JvHkqCul0pddabGKyv0wxT9A9dV1qJJQnAMCa1xA71OUJfUCsYapzgchR++A4YD10iTTMTQ xjCR24uCen1fDEmYPPMZ13IdQYO8sOM8Ul9uTnvyekUy7wLMOk= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 02/21] Add RVV intrinsic framework Date: Tue, 31 May 2022 16:49:53 +0800 Message-Id: <20220531085012.269719-3-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign9 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-7.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS, TXREP, T_FILL_THIS_FORM_SHORT, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config.gcc: Add riscv-vector-builtins-functions.o and riscv-vector-builtins.o extra_objs for RVV support. * config/riscv/riscv-builtins.cc (riscv_init_builtins): Add RVV support. (riscv_builtin_decl): Add RVV support. (riscv_expand_builtin): Add RVV support. (riscv_gimple_fold_builtin): New function. 
* config/riscv/riscv-protos.h (riscv_gimple_fold_builtin): New function. (enum riscv_builtin_class): New macro define. * config/riscv/riscv-vector.cc (rvv_get_mask_mode): New function. * config/riscv/riscv-vector.h (rvv_get_mask_mode): New function. * config/riscv/riscv.cc (riscv_class_max_nregs): Add RVV register. (riscv_conditional_register_usage): Add RVV register. (TARGET_GIMPLE_FOLD_BUILTIN): New targethook. * config/riscv/t-riscv: New object. * config/riscv/md-parser: New file. * config/riscv/riscv-vector-builtins-functions.cc: New file. * config/riscv/riscv-vector-builtins-functions.def: New file. * config/riscv/riscv-vector-builtins-functions.h: New file. * config/riscv/riscv-vector-builtins-iterators.def: New file. * config/riscv/riscv-vector-builtins.cc: New file. * config/riscv/riscv-vector-builtins.def: New file. * config/riscv/riscv-vector-builtins.h: New file. * config/riscv/vector-iterators.md: New file. --- gcc/config.gcc | 5 +- gcc/config/riscv/md-parser | 205 ++++ gcc/config/riscv/riscv-builtins.cc | 88 +- gcc/config/riscv/riscv-protos.h | 19 + .../riscv/riscv-vector-builtins-functions.cc | 1012 +++++++++++++++++ .../riscv/riscv-vector-builtins-functions.def | 34 + .../riscv/riscv-vector-builtins-functions.h | 491 ++++++++ .../riscv/riscv-vector-builtins-iterators.def | 12 + gcc/config/riscv/riscv-vector-builtins.cc | 266 +++++ gcc/config/riscv/riscv-vector-builtins.def | 37 + gcc/config/riscv/riscv-vector-builtins.h | 59 + gcc/config/riscv/riscv-vector.cc | 17 + gcc/config/riscv/riscv-vector.h | 1 + gcc/config/riscv/riscv.cc | 21 + gcc/config/riscv/t-riscv | 36 + gcc/config/riscv/vector-iterators.md | 19 + 16 files changed, 2307 insertions(+), 15 deletions(-) create mode 100644 gcc/config/riscv/md-parser create mode 100644 gcc/config/riscv/riscv-vector-builtins-functions.cc create mode 100644 gcc/config/riscv/riscv-vector-builtins-functions.def create mode 100644 gcc/config/riscv/riscv-vector-builtins-functions.h create mode 100644 gcc/config/riscv/riscv-vector-builtins-iterators.def create mode 100644 gcc/config/riscv/riscv-vector-builtins.cc create mode 100644 gcc/config/riscv/riscv-vector-builtins.def create mode 100644 gcc/config/riscv/riscv-vector-builtins.h create mode 100644 gcc/config/riscv/vector-iterators.md diff --git a/gcc/config.gcc b/gcc/config.gcc index 50154c2eb3a..bdda82ae576 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -517,8 +517,11 @@ pru-*-*) ;; riscv*) cpu_type=riscv - extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-vector.o" + extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-vector.o riscv-vector-builtins-functions.o riscv-vector-builtins.o" d_target_objs="riscv-d.o" + target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-builtins.cc \$(srcdir)/config/riscv/riscv-vector-builtins.cc" + target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins-functions.cc" + target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins-functions.h" ;; rs6000*-*-*) extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt" diff --git a/gcc/config/riscv/md-parser b/gcc/config/riscv/md-parser new file mode 100644 index 00000000000..311b8709c0a --- /dev/null +++ b/gcc/config/riscv/md-parser @@ -0,0 +1,205 @@ +# Mode iterators and attributes parser for RISC-V 'V' Extension for GNU compiler. +# Copyright (C) 2022-2022 Free Software Foundation, Inc. +# Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. 
+# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3, or (at your option) +# any later version. +# +# GCC is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +import os +import sys + + +def print_err(msg): + print("\033[31m\033[1mRISCV md parser Error: %s\033[0m\033[0m" % msg) + + +def write_to_file(file_path, one_line, opt='a+'): + dir_name = os.path.dirname(file_path) + if dir_name is not None and len(dir_name) > 0: + os.makedirs(dir_name, exist_ok=True) + with open(file_path, opt, encoding="utf-8") as f: + f.write(one_line) + + +def get_prologue(): + return """/* Do not modify this file, it auto gen by md-parser script */ +#ifndef DEF_RISCV_ARG_MODE_ATTR_VARIABLE +#define DEF_RISCV_ARG_MODE_ATTR_VARIABLE(A, B) +#endif + +#ifndef DEF_RISCV_ARG_MODE_ATTR +#define DEF_RISCV_ARG_MODE_ATTR(A, B, C, D, E) +#endif + +""" + + +def get_epilogue(): + return """ +#undef DEF_RISCV_ARG_MODE_ATTR_VARIABLE +#undef DEF_RISCV_ARG_MODE_ATTR +""" + + +def gen_mode_attr(def_seg): + result_str = "" + element_cnt = 0 + for value_seg in def_seg.value_segment_list: + splitted_value_seg_string = value_seg.string.split() + if len(splitted_value_seg_string) == 2: + # we only want the def_seg for intrinsics function code which has the "mode" as right hand side value + attr = splitted_value_seg_string[1] + if not attr.isdigit() and (attr.upper().replace('X', 'x') == attr) and not attr.startswith ("0x"): + result_str += "DEF_RISCV_ARG_MODE_ATTR(%s, %d, %s, %s, %s)\n" % ( + def_seg.name, element_cnt, splitted_value_seg_string[0], attr, "TARGET_FP16" if attr.count("HF") > 0 else "TARGET_HARD_FLOAT" if attr.count("SF") > 0 else "TARGET_DOUBLE_FLOAT" if attr.count("DF") > 0 else "TARGET_ANY") + element_cnt += 1 + else: + # if the mode_attr is not a totally attribute with the "mode" value, ignore the whole pack + return "" + else: + print_err("Value Segment(%s) in \"%s\" maybe not right for \"%s\"" % ( + value_seg.string, def_seg.name, def_seg.def_type)) + if len(result_str) > 0: + result_str = "DEF_RISCV_ARG_MODE_ATTR_VARIABLE(%s, %d)\n%s" % (def_seg.name, element_cnt, result_str) + return result_str + + +def gen_mode_iterator(def_seg): + result_str = "" + element_cnt = 0 + for value_seg in def_seg.value_segment_list: + splitted_value_seg_string = value_seg.string.split() + if len(splitted_value_seg_string) == 1: + result_str += "DEF_RISCV_ARG_MODE_ATTR(%s, %d, %s, %s, TARGET_ANY)\n" % ( + def_seg.name, element_cnt, value_seg.string, value_seg.string) + element_cnt += 1 + elif len(splitted_value_seg_string) == 2: + result_str += "DEF_RISCV_ARG_MODE_ATTR(%s, %d, %s, %s, %s)\n" % ( + def_seg.name, element_cnt, splitted_value_seg_string[0], splitted_value_seg_string[0], + splitted_value_seg_string[1]) + element_cnt += 1 + else: + print_err("Value Segment(%s) in \"%s\" maybe not right for \"%s\"" % ( + value_seg.string, def_seg.name, def_seg.def_type)) + if len(result_str) > 0: + result_str = "DEF_RISCV_ARG_MODE_ATTR_VARIABLE(%s, %d)\n%s" % (def_seg.name, element_cnt, result_str) + return result_str + + +class Segment: + def __init__(self): + self.string = 
"" + self.parentheses = 0 + + def append(self, char_value): + self.string += char_value + self.parentheses += 1 if char_value == '(' else -1 if char_value == ')' else 0 + + def is_intact(self): + return self.parentheses == 0 + + def strip_self(self): + self.string = self.string.replace('(', '').replace(')', '').replace('"', ' ') + self.string = self.string.strip() + return self + + def __str__(self): + return self.string + + +class DefSegment(Segment): + def __init__(self): + super().__init__() + self.def_type = "" + self.name = "" + self.value_str = "" + self.value_segment_list = [] + + def make_value_segments(self): + # split value string into multiple value segments which will contain a single or a double value array + i = 0 + current_value_segment = None + while i < len(self.value_str): + if self.value_str[i] != ' ' and current_value_segment is None: + # new segment start + current_value_segment = Segment() + current_value_segment.append(self.value_str[i]) + elif current_value_segment is not None: + current_value_segment.append(self.value_str[i]) + if current_value_segment.is_intact() and ( + current_value_segment.string.endswith(' ') or current_value_segment.string.endswith(')')): + self.value_segment_list.append(current_value_segment.strip_self()) + current_value_segment = None + i += 1 + # append the last segment which is missed by the while loop + if current_value_segment is not None: + self.value_segment_list.append(current_value_segment.strip_self()) + + def parse(self): + if len(self.string) == 0: + return False + if self.string.count('[') != 1 or self.string.count(']') != 1: + print_err("\"%s\" is not a valid [...] format string" % self.string) + return False + # ignore the '(' and ')' + self.string = self.string[1:len(self.string) - 1] + self.def_type = self.string.split()[0] + self.name = self.string.split()[1] + # get the value string inside '[' and ']' + self.value_str = self.string[self.string.index('[') + 1:self.string.index(']')].strip() + self.make_value_segments() + return True + + +if __name__ == '__main__': + local_dir = os.path.dirname(os.path.realpath(__file__)) + output_file = local_dir + '/' + sys.argv[1] + target_md_files = [] + for arg in sys.argv[2:]: + target_md_files.append(local_dir + '/' + arg) + segment_list = [] + for md_file in target_md_files: + with open(md_file) as file: + current_segment = None + for line in file.readlines(): + stripped_line = line.strip() + if stripped_line.startswith(";;"): + continue + if stripped_line.count("(") > 0 or current_segment is not None: + char_index = 0 + # add a space at the end of the stripped_line for further split + stripped_line += " " + while char_index < len(stripped_line): + if stripped_line[char_index] == '(' and current_segment is None: + # new segment start + current_segment = DefSegment() + current_segment.append('(') + elif current_segment is not None: + current_segment.append(stripped_line[char_index]) + if current_segment.is_intact(): + segment_list.append(current_segment) + current_segment = None + char_index += 1 + output_str = get_prologue() + for segment in segment_list: + if not segment.parse(): + continue + if segment.def_type == "define_mode_iterator": + output_str += gen_mode_iterator(segment) + elif segment.def_type == "define_mode_attr": + output_str += gen_mode_attr(segment) + output_str += get_epilogue() + write_to_file(output_file, output_str, opt='w') \ No newline at end of file diff --git a/gcc/config/riscv/riscv-builtins.cc b/gcc/config/riscv/riscv-builtins.cc index 
795132a0c16..9bd2f1227a8 100644 --- a/gcc/config/riscv/riscv-builtins.cc +++ b/gcc/config/riscv/riscv-builtins.cc @@ -36,6 +36,13 @@ along with GCC; see the file COPYING3. If not see #include "stor-layout.h" #include "expr.h" #include "langhooks.h" +#include "backend.h" +#include "gimple.h" +#include "stringpool.h" +#include "explow.h" +#include "gimple-iterator.h" +#include "riscv-protos.h" +#include "riscv-vector-builtins.h" /* Macros to create an enumeration identifier for a function prototype. */ #define RISCV_FTYPE_NAME0(A) RISCV_##A##_FTYPE @@ -189,6 +196,7 @@ riscv_build_function_type (enum riscv_function_type type) void riscv_init_builtins (void) { + riscv_vector::init_builtins (); for (size_t i = 0; i < ARRAY_SIZE (riscv_builtins); i++) { const struct riscv_builtin_description *d = &riscv_builtins[i]; @@ -196,7 +204,8 @@ riscv_init_builtins (void) { tree type = riscv_build_function_type (d->prototype); riscv_builtin_decls[i] - = add_builtin_function (d->name, type, i, BUILT_IN_MD, NULL, NULL); + = add_builtin_function (d->name, type, (i << RISCV_BUILTIN_SHIFT) + RISCV_BUILTIN_GENERAL, + BUILT_IN_MD, NULL, NULL); riscv_builtin_decl_index[d->icode] = i; } } @@ -205,11 +214,20 @@ riscv_init_builtins (void) /* Implement TARGET_BUILTIN_DECL. */ tree -riscv_builtin_decl (unsigned int code, bool initialize_p ATTRIBUTE_UNUSED) +riscv_builtin_decl (unsigned int code, bool initialize_p) { - if (code >= ARRAY_SIZE (riscv_builtins)) - return error_mark_node; - return riscv_builtin_decls[code]; + unsigned int subcode = code >> RISCV_BUILTIN_SHIFT; + switch (code & RISCV_BUILTIN_CLASS) + { + case RISCV_BUILTIN_GENERAL: + if (subcode >= ARRAY_SIZE (riscv_builtins)) + return error_mark_node; + return riscv_builtin_decls[subcode]; + + case RISCV_BUILTIN_VECTOR: + return riscv_vector::builtin_decl (subcode, initialize_p); + } + return error_mark_node; } /* Take argument ARGNO from EXP's argument list and convert it into @@ -271,20 +289,29 @@ riscv_expand_builtin_direct (enum insn_code icode, rtx target, tree exp, rtx riscv_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED, - machine_mode mode ATTRIBUTE_UNUSED, - int ignore ATTRIBUTE_UNUSED) + machine_mode mode ATTRIBUTE_UNUSED, + int ignore ATTRIBUTE_UNUSED) { tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); unsigned int fcode = DECL_MD_FUNCTION_CODE (fndecl); - const struct riscv_builtin_description *d = &riscv_builtins[fcode]; - - switch (d->builtin_type) + unsigned int subcode = fcode >> RISCV_BUILTIN_SHIFT; + switch (fcode & RISCV_BUILTIN_CLASS) { - case RISCV_BUILTIN_DIRECT: - return riscv_expand_builtin_direct (d->icode, target, exp, true); + case RISCV_BUILTIN_VECTOR: + return riscv_vector::expand_builtin (subcode, exp, target); + case RISCV_BUILTIN_GENERAL: + { + const struct riscv_builtin_description *d = &riscv_builtins[subcode]; - case RISCV_BUILTIN_DIRECT_NO_TARGET: - return riscv_expand_builtin_direct (d->icode, target, exp, false); + switch (d->builtin_type) + { + case RISCV_BUILTIN_DIRECT: + return riscv_expand_builtin_direct (d->icode, target, exp, true); + + case RISCV_BUILTIN_DIRECT_NO_TARGET: + return riscv_expand_builtin_direct (d->icode, target, exp, false); + } + } } gcc_unreachable (); @@ -307,3 +334,36 @@ riscv_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update) *clear = build_call_expr (fsflags, 1, old_flags); *update = NULL_TREE; } + +/* Implement TARGET_GIMPLE_FOLD_BUILTIN. 
*/ + +bool +riscv_gimple_fold_builtin (gimple_stmt_iterator *gsi) +{ + + gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi)); + tree fndecl = gimple_call_fndecl (stmt); + unsigned int code = DECL_MD_FUNCTION_CODE (fndecl); + unsigned int subcode = code >> RISCV_BUILTIN_SHIFT; + gimple *new_stmt = NULL; + switch (code & RISCV_BUILTIN_CLASS) + { + /* General builtins can fold gimple if necessary; + we may wrap this into a function in the future. */ + case RISCV_BUILTIN_GENERAL: + return false; + + case RISCV_BUILTIN_VECTOR: + new_stmt = riscv_vector::gimple_fold_builtin (subcode, gsi, stmt); + break; + } + + if (!new_stmt) + return false; + + gsi_replace (gsi, new_stmt, true); + + return true; +} + +#include "gt-riscv-builtins.h" \ No newline at end of file diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 19c50f0e702..1cb3586d1f1 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -87,6 +87,7 @@ extern void riscv_atomic_assign_expand_fenv (tree *, tree *, tree *); extern rtx riscv_expand_builtin (tree, rtx, rtx, machine_mode, int); extern tree riscv_builtin_decl (unsigned int, bool); extern void riscv_init_builtins (void); +extern bool riscv_gimple_fold_builtin (gimple_stmt_iterator *); /* Routines implemented in riscv-common.cc. */ extern std::string riscv_arch_str (bool version_p = true); @@ -116,4 +117,22 @@ extern unsigned int rvv_offset_temporaries (bool, poly_int64); extern enum vlmul_field_enum riscv_classify_vlmul_field (machine_mode); extern int rvv_regsize (machine_mode); +/* We classify built-in functions into two classes: + 1. General builtin class, which uses the + original builtin infrastructure already + present in the RISC-V back end. + 2. Vector builtin class, a separate builtin + infrastructure that implements the RVV + intrinsics registered through a pragma. */ +enum riscv_builtin_class +{ + RISCV_BUILTIN_GENERAL, + RISCV_BUILTIN_VECTOR +}; + +const unsigned int RISCV_BUILTIN_SHIFT = 1; + +/* Mask that selects the vector_builtin_class part of a function code. */ +const unsigned int RISCV_BUILTIN_CLASS = (1 << RISCV_BUILTIN_SHIFT) - 1; + #endif /* ! GCC_RISCV_PROTOS_H */ diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.cc b/gcc/config/riscv/riscv-vector-builtins-functions.cc new file mode 100644 index 00000000000..19bcb66a83f --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins-functions.cc @@ -0,0 +1,1012 @@ +/* Intrinsic implementation for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>.
*/ + +#define IN_TARGET_CODE 1 + +#include "riscv-vector-builtins-functions.h" +namespace riscv_vector +{ + +extern hash_table *function_table; +extern GTY (()) tree + vector_types[MAX_TUPLE_NUM][NUM_VECTOR_TYPES + 1][MAX_VLMUL_FIELD]; +extern GTY (()) tree + vector_pointer_types[NUM_VECTOR_TYPES + 1][MAX_VLMUL_FIELD]; +extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; +extern GTY(()) tree scalar_pointer_types[NUM_VECTOR_TYPES]; +extern GTY(()) tree const_scalar_pointer_types[NUM_VECTOR_TYPES]; + +extern GTY (()) vec *registered_functions; + +/* Flags that describe what a function might do, in addition to reading + its arguments and returning a result. */ +static const unsigned int CP_READ_FPCR = 1U << 0; +static const unsigned int CP_RAISE_FP_EXCEPTIONS = 1U << 1; +static const unsigned int CP_RAISE_LD_EXCEPTIONS = 1U << 2; +static const unsigned int CP_READ_MEMORY = 1U << 3; +static const unsigned int CP_WRITE_MEMORY = 1U << 4; +static const unsigned int CP_READ_CSR = 1U << 5; +static const unsigned int CP_WRITE_CSR = 1U << 6; + +/* True if we've already complained about attempts to use functions + when the required extension is disabled. */ +static bool reported_missing_extension_p; + +static tree +mode2mask_t (machine_mode mode) +{ + int factor = exact_div (GET_MODE_NUNITS (mode), GET_MODE_NUNITS (VNx2BImode)) + .to_constant (); + factor = exact_log2 (factor); + return vector_types[0][VECTOR_TYPE_bool][factor]; +} + +static enum vector_type_index +get_type_idx (machine_mode mode, bool u) +{ + #define RVV_INT_TYPE(MODE, TYPE, SEW) \ + case MODE: \ + return u ? VECTOR_TYPE_u##TYPE##SEW : VECTOR_TYPE_##TYPE##SEW; \ + break; + #define RVV_FLOAT_TYPE(MODE, TYPE, SEW) \ + case MODE: \ + return VECTOR_TYPE_##TYPE##SEW; \ + break; + + switch (mode) + { + case BImode: + return VECTOR_TYPE_bool; + RVV_INT_TYPE (QImode, int, 8) + RVV_INT_TYPE (HImode, int, 16) + RVV_INT_TYPE (SImode, int, 32) + RVV_INT_TYPE (DImode, int, 64) + RVV_FLOAT_TYPE (SFmode, float, 32) + RVV_FLOAT_TYPE (DFmode, float, 64) + default: + gcc_unreachable (); + } + + return (enum vector_type_index) 0; +} + +static int +get_lmul_idx (machine_mode mode) +{ + machine_mode innermode = GET_MODE_INNER (mode); + bool is_bool_p = GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL ? true : false; + machine_mode base_mode = is_bool_p ? VNx2QImode : VNx2BImode; + int offset = exact_div (GET_MODE_NUNITS (mode), GET_MODE_NUNITS (base_mode)) + .to_constant (); + if (is_bool_p) + return exact_log2 (offset); + else + { + int nf = 1; + offset = exact_log2 (offset / nf); + int factor = exact_log2 (GET_MODE_BITSIZE (innermode).to_constant () / + GET_MODE_BITSIZE (QImode)); + return offset + factor; + } +} + +static tree +get_dt_t (machine_mode mode, bool u, bool ptr = false, bool c = false) +{ + if (mode == VOIDmode) + return void_type_node; + + machine_mode innermode = GET_MODE_INNER (mode); + enum vector_type_index base = get_type_idx (innermode, u); + tree type = NULL_TREE; + if (VECTOR_MODE_P (mode)) + { + int offset = get_lmul_idx (mode); + if (ptr) + type = vector_pointer_types[base][offset]; + else + type = vector_types[0][base][offset]; + + gcc_assert (type); + return type; + } + + if (ptr) + { + if (c) + type = const_scalar_pointer_types[base]; + else + type = scalar_pointer_types[base]; + } + else + type = scalar_types[base]; + + gcc_assert (type); + return type; +} + +/* Return true if the function has no return value. 
*/ +static bool +function_returns_void_p (tree fndecl) +{ + return TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node; +} + +/* Take argument ARGNO from EXP's argument list and convert it into + an expand operand. Store the operand in *OP. */ + +static void +add_input_operand (struct expand_operand *op, tree exp, unsigned argno) +{ + tree arg = CALL_EXPR_ARG (exp, argno); + rtx x = expand_normal (arg); + create_input_operand (op, x, TYPE_MODE (TREE_TYPE (arg))); +} + +/* Expand instruction ICODE as part of a built-in function sequence. + Use the first NOPS elements of OPS as the instruction's operands. + HAS_TARGET_P is true if operand 0 is a target; it is false if the + instruction has no target. + + Return the target rtx if HAS_TARGET_P, otherwise return const0_rtx. */ + +static rtx +generate_builtin_insn (enum insn_code icode, unsigned int n_ops, + struct expand_operand *ops, bool has_target_p) +{ + if (!maybe_expand_insn (icode, n_ops, ops)) + { + error ("invalid argument to built-in function"); + return has_target_p ? gen_reg_rtx (ops[0].mode) : const0_rtx; + } + + return has_target_p ? ops[0].value : const0_rtx; +} + +/* Return a hash code for the string INPUT_STRING. */ +static hashval_t +get_string_hash (const char * input_string) +{ + if (!input_string || strlen (input_string) == 0) + return 0; + + inchash::hash h; + + for (unsigned int i = 0; i < strlen (input_string); i += 4) + { + unsigned int four_chars = 0; + + for (unsigned int j = i; j < strlen (input_string) && j < i + 4; j++) + { + four_chars |= input_string[j] << (8 * (j - i)); + } + + h.add_int (four_chars); + } + + return h.end (); +} + +/* Report an error against LOCATION that the user has tried to use + function FNDECL when extension EXTENSION is disabled. */ +static void +report_missing_extension (location_t location, tree fndecl, + const char *extension) +{ + /* Avoid reporting a slew of messages for a single oversight. */ + if (reported_missing_extension_p) + return; + + error_at (location, "rvv function %qD requires ISA extension %qs", fndecl, + extension); + inform (location, + "you can enable %qs using the command-line" + " option %<-march%>", + extension); + reported_missing_extension_p = true; +} + +/* Add attribute NAME to ATTRS. */ +static tree +add_attribute (const char *name, tree attrs) +{ + return tree_cons (get_identifier (name), NULL_TREE, attrs); +} + +inline hashval_t +registered_function_hasher::hash (value_type value) +{ + return value->instance.hash (); +} + +inline bool +registered_function_hasher::equal (value_type value, const compare_type &key) +{ + return value->instance == key; +} + +/* function_instance class implementation */ + +function_instance::function_instance (function_builder *__builder, + const char *__base_name, + vector_arg_modes &__arg_pattern, + enum predication_index __pred, + enum operation_index __operation) + : m_builder (__builder), m_base_name (__base_name), + m_target_arg_pattern (__arg_pattern), m_target_pred (__pred), + m_target_operation (__operation) +{ + function_name[0] = '\0'; +} + +function_instance::function_instance (const char *__name) +{ + memcpy (function_name, __name, NAME_MAXLEN); +} + +function_instance::~function_instance () +{ +} + +inline bool +function_instance::operator== (const function_instance &other) const +{ + return (strcmp (function_name, other.function_name) == 0); +} + +inline bool +function_instance::operator!= (const function_instance &other) const +{ + return !operator== (other); +} + +/* Return a hash code for a function_instance.
*/ +hashval_t +function_instance::hash () const +{ + return get_string_hash (function_name); +} + +bool +function_instance::check (location_t, tree, tree, unsigned int, tree *) const +{ + return true; +} + +/* Return a set of CP_* flags that describe what the function could do, + taking the command-line flags into account. */ +unsigned int +function_instance::call_properties () const +{ + unsigned int flags = m_builder->call_properties (); + + /* -fno-trapping-math means that we can assume any FP exceptions + are not user-visible. */ + if (!flag_trapping_math) + flags &= ~CP_RAISE_FP_EXCEPTIONS; + + return flags; +} + +/* Return true if calls to the function could read some form of + global state. */ +bool +function_instance::reads_global_state_p () const +{ + unsigned int flags = call_properties (); + + /* Preserve any dependence on rounding mode, flush to zero mode, etc. + There is currently no way of turning this off; in particular, + -fno-rounding-math (which is the default) means that we should make + the usual assumptions about rounding mode, which for intrinsics means + acting as the instructions do. */ + if (flags & CP_READ_FPCR) + return true; + + /* Handle direct reads of global state. */ + return flags & (CP_READ_MEMORY | CP_READ_CSR); +} + +/* Return true if calls to the function could modify some form of + global state. */ +bool +function_instance::modifies_global_state_p () const +{ + unsigned int flags = call_properties (); + + /* Preserve any exception state written back to the FPCR, + unless -fno-trapping-math says this is unnecessary. */ + if (flags & (CP_RAISE_FP_EXCEPTIONS | CP_RAISE_LD_EXCEPTIONS)) + return true; + + /* Handle direct modifications of global state. */ + return flags & (CP_WRITE_MEMORY | CP_WRITE_CSR); +} + +/* Return true if calls to the function could raise a signal. */ +bool +function_instance::could_trap_p () const +{ + unsigned int flags = call_properties (); + + /* Handle functions that could raise SIGFPE. */ + if (flags & (CP_RAISE_FP_EXCEPTIONS | CP_RAISE_LD_EXCEPTIONS)) + return true; + + /* Handle functions that could raise SIGBUS or SIGSEGV. 
*/ + if (flags & (CP_READ_MEMORY | CP_WRITE_MEMORY)) + return true; + + return false; +} + +const char * +function_instance::get_base_name () const +{ + return m_base_name; +} + +vector_arg_modes +function_instance::get_arg_pattern () const +{ + return m_target_arg_pattern; +} + +enum predication_index +function_instance::get_pred () const +{ + return m_target_pred; +} + +unsigned int +function_instance::get_vma_vta () const +{ + return any_policy; +} + +enum operation_index +function_instance::get_operation () const +{ + return m_target_operation; +} + +enum data_type_index * +function_instance::get_data_type_list () const +{ + return m_builder->get_data_type_list (); +} + +function_builder * +function_instance::builder () const +{ + return m_builder; +} + +/* function_builder class implemention */ + +function_builder::function_builder (const char *__base_name, + vector_arg_all_modes &__arg_patterns, + uint64_t __pattern, + uint64_t __preds, + uint64_t __op_types, + const unsigned int __extensions) + : m_base_name (__base_name), m_target_arg_patterns (__arg_patterns), + m_target_pattern (__pattern), m_target_preds (__preds), + m_target_op_types (__op_types), m_required_extensions (__extensions) +{ + m_iter_idx_cnt = 0; + m_iter_arg_cnt = 0; + m_iter_arg_idx_list = (unsigned int *)xmalloc (m_target_arg_patterns.arg_len * + sizeof (unsigned int)); + + for (unsigned int i = 0; i < m_target_arg_patterns.arg_len; i++) + { + // use mode iterator as mode iter + if (m_target_arg_patterns.target_op_list[i] < 0) + { + m_iter_idx_cnt = + (m_iter_idx_cnt == 0) + ? m_target_arg_patterns.arg_list[i]->attr_len + : m_iter_idx_cnt * + m_target_arg_patterns.arg_list[i]->attr_len; + m_iter_arg_idx_list[m_iter_arg_cnt++] = i; + } + } + + m_direct_overloads = lang_GNU_CXX (); + + gcc_assert (m_iter_arg_cnt > 0); + gcc_assert (m_iter_idx_cnt > 0); + + gcc_obstack_init (&m_string_obstack); +} + +function_builder::~function_builder () +{ + free (m_iter_arg_idx_list); + obstack_free (&m_string_obstack, NULL); +} + +/* Add NAME to the end of the function name being built. */ +void +function_builder::append_name (const char *name) +{ + obstack_grow (&m_string_obstack, name, strlen (name)); +} + +/* Zero-terminate and complete the function name being built. */ +char * +function_builder::finish_name () +{ + obstack_1grow (&m_string_obstack, 0); + return (char *) obstack_finish (&m_string_obstack); +} + +rtx +function_builder::expand_builtin_insn (enum insn_code icode, tree exp, + rtx target, + const function_instance &instance) const +{ + gcc_assert (call_expr_nargs (exp) > 0); + struct expand_operand ops[MAX_RECOG_OPERANDS]; + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); + enum predication_index pred = instance.get_pred (); + + /* Map any target to operand 0. 
*/ + int opno = 0; + int offset = 0; + + if (!function_returns_void_p (fndecl)) + create_output_operand (&ops[opno++], target, mode); + + if (need_mask_operand_p ()) + { + if (has_mask_arg_p (pred)) + add_input_operand (&ops[opno++], exp, offset++); + else + create_input_operand (&ops[opno++], const0_rtx, Pmode); + } + + if (need_dest_operand_p ()) + { + if (has_dest_arg_p (pred)) + add_input_operand (&ops[opno++], exp, offset++); + else + create_input_operand (&ops[opno++], const0_rtx, Pmode); + } + + for (int argno = offset; argno < call_expr_nargs (exp); argno++) + add_input_operand (&ops[opno++], exp, argno); + + if (pred != PRED_none) + create_input_operand (&ops[opno++], GEN_INT (get_policy (pred)), Pmode); + + /* Map the arguments to the other operands. */ + gcc_assert (opno == insn_data[icode].n_generator_args); + return generate_builtin_insn (icode, opno, ops, + !function_returns_void_p (fndecl)); +} + +gimple * +function_builder::fold (const function_instance &, gimple_stmt_iterator *, + gcall *) const +{ + return NULL; +} + +tree +function_builder::get_return_type (const function_instance &) const +{ + return void_type_node; +} + +size_t +function_builder::get_dest_arguments_length () const +{ + return 1; +} + +tree +function_builder::get_mask_type (tree return_type, + const function_instance &instance, + const vec &) const +{ + tree type = return_type; + if (!VECTOR_MODE_P (TYPE_MODE (type))) + { + /* Fetch the vector mode start from arg[0]. */ + for (unsigned int i = 0; i < instance.get_arg_pattern ().arg_len; i++) + { + if (VECTOR_MODE_P (instance.get_arg_pattern ().arg_list[i])) + { + type = + get_dt_t (instance.get_arg_pattern ().arg_list[i], + instance.get_data_type_list ()[i] == DT_unsigned); + break; + } + } + } + gcc_assert (VECTOR_MODE_P (TYPE_MODE (type))); + if (GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL) + return type; + else + { + machine_mode mask_mode; + gcc_assert ( + rvv_get_mask_mode (TYPE_MODE (type)).exists (&mask_mode)); + return mode2mask_t (mask_mode); + } +} + +void +function_builder::get_argument_types (const function_instance &, + vec &) const +{ +} + +char * +function_builder::assemble_name (function_instance &) +{ + return nullptr; +} + +uint64_t +function_builder::get_pattern () const +{ + return m_target_pattern; +} + +bool +function_builder::need_mask_operand_p () const +{ + uint64_t pat = get_pattern (); + + return (pat & PAT_mask) || (pat & PAT_merge); +} + +bool +function_builder::need_dest_operand_p () const +{ + uint64_t pat = get_pattern (); + + return (pat & PAT_tail) || ((pat & PAT_mask) && !(pat & PAT_ignore_policy)); +} + +bool +function_builder::has_mask_arg_p (enum predication_index pred) const +{ + uint64_t pat = get_pattern (); + + return pred == PRED_m || pred == PRED_tam || pred == PRED_tum || + pred == PRED_tama || pred == PRED_tamu || pred == PRED_tuma || + pred == PRED_tumu || pred == PRED_ma || pred == PRED_mu || + (pat & PAT_merge); +} + +bool +function_builder::has_dest_arg_p (enum predication_index pred) const +{ + uint64_t pat = get_pattern (); + + switch (pred) + { + case PRED_void: + return (pat & PAT_void_dest) || (pat & PAT_dest); + case PRED_ta: + case PRED_tama: + return (pat & PAT_dest); + case PRED_m: + return !(pat & PAT_ignore_policy); + case PRED_tu: + case PRED_tum: + case PRED_mu: + case PRED_tamu: + case PRED_tuma: + case PRED_tumu: + return true; + default: + return false; + } +} + +bool +function_builder::can_be_overloaded_p (const function_instance &) const +{ + return false; +} + +unsigned int 
+function_builder::get_policy (enum predication_index pred) const +{ + uint64_t pat = get_pattern (); + + switch (pred) + { + case PRED_void: + return (pat & PAT_void_dest || pat & PAT_dest) ? tu_policy : ta_policy; + case PRED_m: + if (pat & PAT_ignore_policy) + return any_policy; + else if (pat & PAT_ignore_mask_policy) + return tu_policy; + else if (pat & PAT_ignore_tail_policy) + return mu_policy; + else + return tumu_policy; + case PRED_ta: + case PRED_tam: + return ta_policy; + case PRED_tu: + case PRED_tum: + return tu_policy; + case PRED_ma: + return ma_policy; + case PRED_mu: + return mu_policy; + case PRED_tama: + return tama_policy; + case PRED_tamu: + return tamu_policy; + case PRED_tuma: + return tuma_policy; + case PRED_tumu: + return tumu_policy; + default: + return any_policy; + } +} + +size_t +function_builder::get_position_of_mask_arg (enum predication_index) const +{ + return 0; +} + +size_t +function_builder::get_position_of_dest_arg (enum predication_index pred) const +{ + uint64_t pat = get_pattern (); + if (pred == PRED_tu || + (pred == PRED_void && (pat & PAT_void_dest || pat & PAT_dest)) || + (pred == PRED_ta && pat & PAT_dest)) + return 0; + else + return 1; +} + +unsigned int +function_builder::call_properties () const +{ + return 0; +} + +enum data_type_index * +function_builder::get_data_type_list () const +{ + return m_target_arg_patterns.dt_list; +} + +/* Return the appropriate function attributes for INSTANCE. */ +tree +function_builder::get_attributes (const function_instance &instance) const +{ + tree attrs = NULL_TREE; + + if (!instance.modifies_global_state_p ()) + { + if (instance.reads_global_state_p ()) + attrs = add_attribute ("pure", attrs); + else + attrs = add_attribute ("const", attrs); + } + + if (!flag_non_call_exceptions || !instance.could_trap_p ()) + attrs = add_attribute ("nothrow", attrs); + + return add_attribute ("leaf", attrs); +} + +/* Add a function called NAME with type FNTYPE and attributes ATTRS. + INSTANCE describes what the function does and OVERLOADED_P indicates + whether it is overloaded. REQUIRED_EXTENSIONS are the set of + architecture extensions that the function requires. */ +registered_function & +function_builder::add_function (const function_instance &instance, + const char *name, tree fntype, tree attrs, + bool overloaded_p, + bool placeholder_p) const +{ + unsigned int code = vec_safe_length (registered_functions); + code = (code << RISCV_BUILTIN_SHIFT) + RISCV_BUILTIN_VECTOR; + + /* We need to be able to generate placeholders to enusre that we have a + consistent numbering scheme for function codes between the C and C++ + frontends, so that everything ties up in LTO. + + Currently, tree-streamer-in.c:unpack_ts_function_decl_value_fields + validates that tree nodes returned by TARGET_BUILTIN_DECL are non-NULL and + some node other than error_mark_node. This is a holdover from when builtin + decls were streamed by code rather than by value. + + Ultimately, we should be able to remove this validation of BUILT_IN_MD + nodes and remove the target hook. For now, however, we need to appease the + validation and return a non-NULL, non-error_mark_node node, so we + arbitrarily choose integer_zero_node. */ + tree decl = placeholder_p + ? 
integer_zero_node + : simulate_builtin_function_decl (input_location, name, + fntype, code, NULL, attrs); + + registered_function &rfn = *ggc_alloc (); + rfn.instance = instance; + rfn.decl = decl; + rfn.overloaded_p = overloaded_p; + rfn.required_extensions = m_required_extensions; + vec_safe_push (registered_functions, &rfn); + + return rfn; +} + +/* Add a built-in function for INSTANCE, with the argument types given + by ARGUMENT_TYPES and the return type given by RETURN_TYPE. + REQUIRED_EXTENSIONS are the set of architecture extensions that the + function requires. FORCE_DIRECT_OVERLOADS is true if there is a + one-to-one mapping between "short" and "full" names, and if standard + overload resolution therefore isn't necessary. */ +void +function_builder::add_unique_function (function_instance &instance, + tree return_type, + vec &argument_types) +{ + /* Add the function under its full (unique) name. */ + char *overloaded_name = this->assemble_name (instance); + if (instance.function_name[0] == '\0') + return; + + tree fntype = build_function_type_array ( + return_type, argument_types.length (), argument_types.address ()); + tree attrs = get_attributes (instance); + registered_function &rfn = + add_function (instance, instance.function_name, fntype, attrs, false, false); + + /* Enter the function into the hash table. */ + hashval_t hashval = instance.hash (); + registered_function **rfn_slot = + function_table->find_slot_with_hash (instance, hashval, INSERT); + + if (*rfn_slot) + { + error ("duplicate function name: %s", instance.function_name); + gcc_unreachable (); + } + + *rfn_slot = &rfn; + + if (overloaded_name) + { + /* Attribute lists shouldn't be shared. */ + tree attrs = get_attributes (instance); + bool placeholder_p = !m_direct_overloads; + add_function (instance, overloaded_name, fntype, attrs, + false, placeholder_p); + obstack_free (&m_string_obstack, overloaded_name); + } +} + +/* Check whether all the RISCV_* values in REQUIRED_EXTENSIONS are + enabled, given that those extensions are required for function FNDECL. + Report an error against LOCATION if not. */ +bool +function_builder::check_required_extensions (location_t location, tree fndecl, + uint64_t required_extensions) const +{ + /* For the instructions doesn't need any extension, we return true. */ + if (required_extensions == 0) + return true; + + /* check vector extension enable or not. */ + if ((required_extensions & 0x1) && !TARGET_VECTOR) + report_missing_extension (location, fndecl, "V"); + + /* check f extension enable or not. */ + if (((required_extensions >> 4) & 0x1) && !TARGET_HARD_FLOAT) + report_missing_extension (location, fndecl, "F"); + + /* check d extension enable or not. */ + if (((required_extensions >> 5) & 0x1) && !TARGET_DOUBLE_FLOAT) + report_missing_extension (location, fndecl, "D"); + + return true; +} + +/* If INSTANCE has a governing predicate, add it to the list of argument + types in ARGUMENT_TYPES. RETURN_TYPE is the type returned by the + function. */ +void +function_builder::apply_predication (const function_instance &instance, + tree return_type, + vec &argument_types) const +{ + /* check if mask parameter need. */ + if (has_mask_arg_p (instance.get_pred ())) + { + argument_types.quick_insert ( + get_position_of_mask_arg (instance.get_pred ()), + get_mask_type (return_type, instance, argument_types)); + } + + /* check if dest parameter need. 
*/ + if (has_dest_arg_p (instance.get_pred ())) + { + size_t size = get_dest_arguments_length (); + if (argument_types.is_empty ()) + for (size_t i = 0; i < size; i += 1) + argument_types.quick_push (return_type); + else + for (size_t i = 0; i < size; i += 1) + argument_types.quick_insert (get_position_of_dest_arg (instance.get_pred ()) + i, + return_type); + } + + /* check if vl parameter need */ + if (instance.get_pred () != PRED_none) + argument_types.quick_push (size_type_node); +} + +void +function_builder::build_one (function_instance &instance) +{ + /* Byte forms of vlxusegei take 21 arguments. */ + auto_vec argument_types; + tree return_type = get_return_type (instance); + get_argument_types (instance, argument_types); + apply_predication (instance, return_type, argument_types); + add_unique_function (instance, return_type, argument_types); +} + +vector_arg_modes & +function_builder::get_arg_modes_by_iter_idx (unsigned int iter_idx) const +{ + if (iter_idx >= m_iter_idx_cnt) + { + gcc_unreachable (); + } + + unsigned int coefficient = 1; + machine_mode *arg_modes = (machine_mode *)xmalloc ( + sizeof (machine_mode) * m_target_arg_patterns.arg_len); + vector_arg_modes &arg_modes_info = + *(vector_arg_modes *)xmalloc (sizeof (vector_arg_modes)); + arg_modes_info.arg_len = m_target_arg_patterns.arg_len; + arg_modes_info.arg_list = arg_modes; + arg_modes_info.arg_extensions = m_required_extensions; + + // set mode for iter args first + for (unsigned int i = 0; i < m_iter_arg_cnt; i++) + { + // get the iter arg's attribute length + unsigned int arg_attr_len = + m_target_arg_patterns.arg_list[m_iter_arg_idx_list[i]]->attr_len; + vector_mode_attr target_mode_attr = + m_target_arg_patterns.arg_list[m_iter_arg_idx_list[i]] + ->attr_list[(iter_idx / coefficient) % arg_attr_len]; + arg_modes[m_iter_arg_idx_list[i]] = target_mode_attr.mode; + arg_modes_info.arg_extensions |= target_mode_attr.extension; + coefficient *= arg_attr_len; + } + + // set mode for attr args + for (unsigned int i = 0; i < m_target_arg_patterns.arg_len; i++) + { + // skip iter args + if (m_target_arg_patterns.target_op_list[i] < 0) + continue; + + machine_mode iter_mode = + arg_modes[m_target_arg_patterns.target_op_list[i]]; + vector_mode_attr_list *arg_attr_list = m_target_arg_patterns.arg_list[i]; + bool attr_mode_hit = false; + + for (unsigned int j = 0; j < arg_attr_list->attr_len; j++) + { + if (iter_mode == arg_attr_list->attr_list[j].mode) + { + attr_mode_hit = true; + arg_modes[i] = arg_attr_list->attr_list[j].attr; + arg_modes_info.arg_extensions |= + arg_attr_list->attr_list[j].extension; + } + } + + // one argment mode NOT hit, that means maybe NO mode is valid for + // iter_idx + if (!attr_mode_hit) + { + // set arg_len to 0 to skip this arg pattern + arg_modes_info.arg_len = 0; + break; + } + } + + return arg_modes_info; +} + +void +function_builder::register_function () +{ + for (unsigned iter_idx = 0; iter_idx < m_iter_idx_cnt; iter_idx++) + { + vector_arg_modes &arg_modes = get_arg_modes_by_iter_idx (iter_idx); + + bool skip_p = false; + if (arg_modes.arg_len == 0) + skip_p = true; + + if (FLOAT_MODE_P (arg_modes.arg_list[0]) && + get_data_type_list ()[0] == DT_unsigned) + skip_p = true; + + if (skip_p) + { + free (arg_modes.arg_list); + free (&arg_modes); + continue; + } + for (unsigned pred = 1; pred < NUM_PREDS; pred <<= 1) + { + if ((m_target_preds & pred) == 0) + continue; + + for (unsigned op_type = 1; op_type < NUM_OP; op_type <<= 1) + { + if ((m_target_op_types & op_type) == 0) + continue; + + 
function_instance instance ( + this, m_base_name, arg_modes, + (enum predication_index) (m_target_preds & pred), + (enum operation_index) (m_target_op_types & op_type)); + build_one (instance); + } + } + } +} + +} // end namespace riscv_vector + +using namespace riscv_vector; + +inline void +gt_ggc_mx (function_instance *) +{ +} + +inline void +gt_pch_nx (function_instance *) +{ +} + +inline void +gt_pch_nx (function_instance *, gt_pointer_operator, void *) +{ +} + +#include "gt-riscv-vector-builtins-functions.h" \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def new file mode 100644 index 00000000000..f6161012813 --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -0,0 +1,34 @@ +/* Intrinsic macros for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + Based on MIPS target for GNU compiler. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#ifndef DEF_RVV_FUNCTION +#define DEF_RVV_FUNCTION(A, B, C, D, E, F, G) +#endif +#ifndef VITER +#define VITER(A) +#endif +#ifndef VATTR +#define VATTR(A, B) +#endif + +#undef DEF_RVV_FUNCTION +#undef VITER +#undef VATTR \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.h b/gcc/config/riscv/riscv-vector-builtins-functions.h new file mode 100644 index 00000000000..1b769743857 --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins-functions.h @@ -0,0 +1,491 @@ +/* Intrinsic definitions for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#ifndef GCC_RISCV_VECTOR_BUILTINS_FUNCTIONS_H +#define GCC_RISCV_VECTOR_BUILTINS_FUNCTIONS_H + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "tm_p.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "recog.h" +#include "cgraph.h" +#include "diagnostic.h" +#include "expr.h" +#include "basic-block.h" +#include "function.h" +#include "fold-const.h" +#include "varasm.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimplify.h" +#include "explow.h" +#include "emit-rtl.h" +#include "tree-vector-builder.h" +#include "stor-layout.h" +#include "regs.h" +#include "alias.h" +#include "gimple-fold.h" +#include "langhooks.h" +#include "stringpool.h" +#include "attribs.h" +#include "tree-pass.h" +#include "tree-vrp.h" +#include "tree-ssanames.h" +#include "tree-ssa-operands.h" +#include "tree-phinodes.h" +#include "targhooks.h" +#include "langhooks-def.h" +#include "riscv-vector.h" +#include + +namespace riscv_vector +{ + +/* The macro defines the maximum length of name string. */ +static const unsigned int NAME_MAXLEN = 32; +/* The macro defines the maxmum number of tuple. For + RISC-V 'V' Extension, the maxmum tuple number is 8. + TODO: We will support tuple type for segment instructions later. */ +static const unsigned int MAX_TUPLE_NUM = 1; + +/* Describes the various of data_type. + Used by riscv-vector-builtins-iterators.def. */ +enum data_type_index +{ + /* signed data. */ + DT_signed, + /* unsigned data. */ + DT_unsigned, + /* signed data pointer. */ + DT_ptr, + /* unsigned data pointer. */ + DT_uptr, + /* const signed data pointer. */ + DT_c_ptr, + /* const unsigned data pointer. */ + DT_c_uptr, +}; + +/* Describes the various uses of a governing predicate. + Used by riscv-vector-builtins-iterators.def. */ +enum predication_index +{ + /* No governing predicate is present. */ + PRED_none = 1 << 0, + /* tail agnostic, ignore mask policy */ + PRED_ta = 1 << 3, + /* tail undisturbed, ignore mask policy */ + PRED_tu = 1 << 4, + /* mask agnostic, ignore tail policy */ + PRED_ma = 1 << 5, + /* mask undisturbed, ignore tail policy */ + PRED_mu = 1 << 6, + /* mask and tail both agnostic */ + PRED_tama = 1 << 7, + /* mask undisturbed and tail agnostic */ + PRED_tamu = 1 << 8, + /* mask agnostic and tail undisturbed */ + PRED_tuma = 1 << 9, + /* mask and tail both undisturbed */ + PRED_tumu = 1 << 10, + /* No governing predicate is present. */ + PRED_void = 1 << 11, + /* mask and tail both undisturbed */ + PRED_m = 1 << 12, + /* tail agnostic, ignore mask policy */ + PRED_tam = 1 << 13, + /* tail undisturbed, ignore mask policy */ + PRED_tum = 1 << 14, + + NUM_PREDS +}; + +/* Describes the various intrinsic types. */ +enum intrinsic_pattern +{ + /* other intrinsic */ + PAT_none = 1 << 0, + PAT_mask = 1 << 1, + PAT_tail = 1 << 2, + PAT_dest = 1 << 3, + PAT_void_dest = 1 << 4, + PAT_ignore_mask_policy = 1 << 5, + PAT_ignore_tail_policy = 1 << 6, + PAT_ignore_policy = 1 << 7, + PAT_merge = 1 << 8, +}; + +/* Describes the various operation types. 
*/ +enum operation_index +{ + OP_none = 1 << 0, + OP_vv = 1 << 1, + OP_vx = 1 << 2, + OP_v = 1 << 3, + OP_wv = 1 << 4, + OP_wx = 1 << 5, + OP_x_x_v = 1 << 6, + OP_vf2 = 1 << 7, + OP_vf4 = 1 << 8, + OP_vf8 = 1 << 9, + OP_vvm = 1 << 10, + OP_vxm = 1 << 11, + OP_x_x_w = 1 << 12, + OP_v_v = 1 << 13, + OP_v_x = 1 << 14, + OP_vs = 1 << 15, + OP_mm = 1 << 16, + OP_m = 1 << 17, + OP_vf = 1 << 18, + OP_vm = 1 << 19, + OP_wf = 1 << 20, + OP_vfm = 1 << 21, + OP_v_f = 1 << 22, + NUM_OP +}; + +/* Describe indexed-ordered or indexed-unordered load store. */ +enum indexed_mode +{ + INDEXED_u, + INDEXED_o, +}; + +/* Describe vector policy. */ +enum vector_policy +{ + undisturbed = 0, + agnostic = 1, + any = 2, +}; + +/* Enumerates the VECTOR (data) vector types, together called + "vector types" for brevity. */ +enum vector_type_index +{ +#define DEF_RVV_TYPE(ELEM_TYPE, NODE) VECTOR_TYPE_##ELEM_TYPE, +#include "riscv-vector-builtins.def" + NUM_VECTOR_TYPES +#undef DEF_RVV_TYPE +}; + +struct vector_vlmul_info +{ + enum vlmul_field_enum vlmul; + const char *suffix; + const char *boolnum; +}; + +struct vector_type_info +{ + enum vector_type_index type; + const char *elem_name; +}; + +// for function arg mode infomation, include return type +struct vector_mode_attr +{ + machine_mode mode; + machine_mode attr; + // the extension like TARGET_VECTOR + uint64_t extension; +}; + +// the total variable pack for function arg mode infomation, include return +// type +struct vector_mode_attr_list +{ + unsigned int attr_len; + vector_mode_attr *attr_list; +}; + +// for VATTR(OP, MODE_ATTR) +struct vector_arg_attr_info +{ + int target_op; + enum data_type_index dt; + vector_mode_attr_list *mode_attr_list; +}; + +struct vector_arg_all_modes +{ + unsigned int arg_len; + data_type_index *dt_list; + int *target_op_list; + // arg_list[0] is always return type + vector_mode_attr_list **arg_list; +}; + +struct vector_arg_modes +{ + uint64_t arg_extensions; + unsigned int arg_len; + // arg_list[0] is always return type + machine_mode *arg_list; +}; + +constexpr unsigned int +get_vma_vta (vector_policy vma, vector_policy vta) +{ + return (vma << 2) | vta; +} + +constexpr vector_policy +get_vma (unsigned int vma_vta) +{ + return (vector_policy)((vma_vta >> 2) & 0b11); +} + +constexpr vector_policy +get_vta (unsigned int vma_vta) +{ + return (vector_policy)(vma_vta & 0b11); +} + +const unsigned int tama_policy = get_vma_vta(vector_policy::agnostic, vector_policy::agnostic); +const unsigned int tamu_policy = get_vma_vta(vector_policy::undisturbed, vector_policy::agnostic); +const unsigned int tuma_policy = get_vma_vta(vector_policy::agnostic, vector_policy::undisturbed); +const unsigned int tumu_policy = get_vma_vta(vector_policy::undisturbed, vector_policy::undisturbed); +const unsigned int ta_policy = get_vma_vta(vector_policy::any, vector_policy::agnostic); +const unsigned int tu_policy = get_vma_vta(vector_policy::any, vector_policy::undisturbed); +const unsigned int ma_policy = get_vma_vta(vector_policy::agnostic, vector_policy::any); +const unsigned int mu_policy = get_vma_vta(vector_policy::undisturbed, vector_policy::any); +const unsigned int any_policy = get_vma_vta(vector_policy::any, vector_policy::any); + +inline rtx +gen_tama_policy () +{ + return gen_rtx_CONST_INT (Pmode, tama_policy); +} + +inline rtx +gen_tamu_policy () +{ + return gen_rtx_CONST_INT (Pmode, tamu_policy); +} + +inline rtx +gen_tuma_policy () +{ + return gen_rtx_CONST_INT (Pmode, tuma_policy); +} + +inline rtx +gen_tumu_policy () +{ + return 
gen_rtx_CONST_INT (Pmode, tumu_policy); +} + +inline rtx +gen_ta_policy () +{ + return gen_rtx_CONST_INT (Pmode, ta_policy); +} + +inline rtx +gen_tu_policy () +{ + return gen_rtx_CONST_INT (Pmode, tu_policy); +} + +inline rtx +gen_ma_policy () +{ + return gen_rtx_CONST_INT (Pmode, ma_policy); +} + +inline rtx +gen_mu_policy () +{ + return gen_rtx_CONST_INT (Pmode, mu_policy); +} + +inline rtx +gen_any_policy () +{ + return gen_rtx_CONST_INT (Pmode, any_policy); +} + +class function_builder; + +class GTY ((user)) function_instance +{ +public: + function_instance (function_builder *, const char *, vector_arg_modes &, + enum predication_index, enum operation_index); + function_instance (const char *__name); + ~function_instance (); + + bool operator== (const function_instance &) const; + bool operator!= (const function_instance &) const; + + hashval_t hash () const; + bool check (location_t, tree, tree, unsigned int, tree *) const; + unsigned int call_properties () const; + bool reads_global_state_p () const; + bool modifies_global_state_p () const; + bool could_trap_p () const; + + const char *get_base_name () const; + vector_arg_modes get_arg_pattern () const; + enum predication_index get_pred () const; + unsigned int get_vma_vta () const; + enum operation_index get_operation () const; + enum data_type_index *get_data_type_list () const; + function_builder *builder () const; + + char function_name[NAME_MAXLEN]; + +private: + function_builder *m_builder; + const char *m_base_name; + vector_arg_modes m_target_arg_pattern; + enum predication_index m_target_pred; + enum operation_index m_target_operation; +}; + +/* Describes a function decl. */ +class GTY (()) registered_function +{ +public: + function_instance GTY ((skip)) instance; + + /* The decl itself. */ + tree GTY ((skip)) decl; + + /* The architecture extensions that the function requires, as a set of + RISCV_TARGET_* flags. */ + uint64_t required_extensions; + + /* True if the decl represents an overloaded function that needs to be + resolved. */ + bool overloaded_p; +}; + +/* A class for building and registering function decls. */ +class function_builder +{ +public: + function_builder (const char *, vector_arg_all_modes &, uint64_t, + uint64_t, uint64_t, const unsigned int); + + virtual ~function_builder (); + + void register_function (); + + /* Try to fold the given gimple call. Return the new gimple statement + on success, otherwise return null. */ + virtual gimple * fold (const function_instance &, gimple_stmt_iterator *, gcall *) const; + + /* Expand the given call into rtl. Return the result of the function, + or an arbitrary value if the function doesn't return a result. 
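+     Concrete function classes override this; for instance the vsetvl and
+     vsetvlmax classes added later in this series implement it for the
+     configuration-setting intrinsics.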
*/ + virtual rtx expand (const function_instance &, tree, rtx) const = 0; + + rtx + expand_builtin_insn (enum insn_code, tree, rtx, + const function_instance &) const; + + virtual tree get_return_type (const function_instance &) const; + + virtual tree get_mask_type (tree, const function_instance &, const vec &) const; + + virtual void get_argument_types (const function_instance &, + vec &) const; + + virtual size_t get_dest_arguments_length () const; + + uint64_t get_pattern () const; + /* check if need add mask operand for corresponding rtl */ + bool need_mask_operand_p () const; + /* check if need add dest operand for corresponding rtl */ + bool need_dest_operand_p () const; + /* check if has mask arg for corresponding function decl */ + bool has_mask_arg_p (enum predication_index) const; + /* check if has dest arg for corresponding function decl */ + virtual bool has_dest_arg_p (enum predication_index) const; + unsigned int get_policy (enum predication_index) const; + /* get parameter position of mask arg */ + virtual size_t get_position_of_mask_arg (enum predication_index) const; + /* get parameter position of dest arg */ + virtual size_t get_position_of_dest_arg (enum predication_index) const; + + void apply_predication (const function_instance &, tree, vec &) const; + + virtual unsigned int call_properties () const; + + vector_arg_modes &get_arg_modes_by_iter_idx (unsigned int) const; + + enum data_type_index *get_data_type_list () const; + + bool check_required_extensions (location_t, tree, uint64_t) const; + virtual char * assemble_name (function_instance &); + void append_name (const char *); + char *finish_name (); + + /* Return true if the function can be overloaded. */ + virtual bool can_be_overloaded_p (const function_instance &) const; + +private: + void add_unique_function (function_instance &, tree, vec &); + void build_one (function_instance &); + tree get_attributes (const function_instance &) const; + + registered_function &add_function (const function_instance &, const char *, + tree, tree, bool, bool) const; + + /* The base name, as a string. */ + const char *m_base_name; + vector_arg_all_modes m_target_arg_patterns; + uint64_t m_target_pattern; + uint64_t m_target_preds; + uint64_t m_target_op_types; + uint64_t m_required_extensions; + + unsigned int m_iter_idx_cnt; + unsigned int m_iter_arg_cnt; + unsigned int *m_iter_arg_idx_list; + + /* True if we should create a separate decl for each instance of an + overloaded function, instead of using function_builder. */ + bool m_direct_overloads; + + /* Used for building up function names. */ + obstack m_string_obstack; +}; + +/* Hash traits for registered_function. 
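+   The table is keyed on the function_instance that each decl implements,
+   so an overloaded intrinsic call can be resolved back to a specific decl.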
*/ +struct registered_function_hasher : nofree_ptr_hash +{ + typedef function_instance compare_type; + + static hashval_t hash (value_type); + static bool equal (value_type, const compare_type &); +}; + +} // namespace riscv_vector + +#endif // end GCC_RISCV_VECTOR_BUILTINS_FUNCTIONS_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins-iterators.def b/gcc/config/riscv/riscv-vector-builtins-iterators.def new file mode 100644 index 00000000000..ac0c37e12d4 --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins-iterators.def @@ -0,0 +1,12 @@ +/* Do not modify this file, it auto gen by md-parser script */ +#ifndef DEF_RISCV_ARG_MODE_ATTR_VARIABLE +#define DEF_RISCV_ARG_MODE_ATTR_VARIABLE(A, B) +#endif + +#ifndef DEF_RISCV_ARG_MODE_ATTR +#define DEF_RISCV_ARG_MODE_ATTR(A, B, C, D, E) +#endif + + +#undef DEF_RISCV_ARG_MODE_ATTR_VARIABLE +#undef DEF_RISCV_ARG_MODE_ATTR diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc new file mode 100644 index 00000000000..7ea07a24b5b --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins.cc @@ -0,0 +1,266 @@ +/* Builtins implementation for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#define IN_TARGET_CODE 1 + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "tm_p.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "recog.h" +#include "cgraph.h" +#include "diagnostic.h" +#include "expr.h" +#include "basic-block.h" +#include "function.h" +#include "fold-const.h" +#include "varasm.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimplify.h" +#include "explow.h" +#include "emit-rtl.h" +#include "tree-vector-builder.h" +#include "stor-layout.h" +#include "regs.h" +#include "alias.h" +#include "gimple-fold.h" +#include "langhooks.h" +#include "stringpool.h" +#include "attribs.h" +#include "tree-pass.h" +#include "tree-vrp.h" +#include "tree-ssanames.h" +#include "tree-ssa-operands.h" +#include "tree-phinodes.h" +#include "targhooks.h" +#include "langhooks-def.h" +#include "riscv-vector-builtins.h" +#include "riscv-vector-builtins-functions.h" +#include "riscv-vector.h" +namespace riscv_vector +{ + +/* The same vlmul doesn't mean use the same mask, + this is used as save codes. + for example: i32m8 use vbool4_t i8m8 use vbool1_t. 
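+   In general the mask type is vbool<N>_t with N = SEW/LMUL of the data
+   type being masked: int32m8 -> 32/8 -> vbool4_t, int8m8 -> 8/8 ->
+   vbool1_t.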
*/ +static CONSTEXPR const vector_vlmul_info vector_vlmuls[] = +{ + { VLMUL_FIELD_101, "mf8", "64" }, { VLMUL_FIELD_110, "mf4", "32" }, + { VLMUL_FIELD_111, "mf2", "16" }, { VLMUL_FIELD_000, "m1", "8" }, + { VLMUL_FIELD_001, "m2", "4" }, { VLMUL_FIELD_010, "m4", "2" }, + { VLMUL_FIELD_011, "m8", "1" }, +}; + +static CONSTEXPR const vector_type_info vector_type_infos[] = +{ +#define DEF_RVV_TYPE(ELEM_TYPE, NODE) { VECTOR_TYPE_##ELEM_TYPE, #ELEM_TYPE }, +#include "riscv-vector-builtins.def" +#undef DEF_RVV_TYPE +}; + +static GTY(()) tree internal_vector_types[NUM_VECTOR_TYPES + 1][MAX_VLMUL_FIELD]; + +/* Same, but with the riscv_vector.h "v..._t" name. */ +GTY(()) tree vector_types[MAX_TUPLE_NUM][NUM_VECTOR_TYPES + 1][MAX_VLMUL_FIELD]; +/* Same, but with the riscv_vector.h "v..._t *" name. */ +GTY(()) tree vector_pointer_types[NUM_VECTOR_TYPES + 1][MAX_VLMUL_FIELD]; +/* The scalar type associated with each vector type. */ +GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; +/* The scalar pointer type associated with each vector type. */ +GTY(()) tree scalar_pointer_types[NUM_VECTOR_TYPES]; +/* The const scalar pointer type associated with each vector type. */ +GTY(()) tree const_scalar_pointer_types[NUM_VECTOR_TYPES]; + +/* All registered function decls, hashed on the function_instance + that they implement. This is used for looking up implementations of + overloaded functions. */ +hash_table *function_table; + +static riscv_vector::function_builder** all_vector_functions; + +/* The list of all registered function decls, indexed by code. */ +vec *registered_functions; + +static unsigned int NUM_INSN_FUNC; + +static void init_def_variables (); + +/* Initialize all compiler built-ins related to RVV that should be + defined at start-up. */ +void +init_builtins () +{ + if (!TARGET_VECTOR) + return; +} + +/* Return the function decl with RVV function subcode CODE, or error_mark_node + if no such function exists. */ +tree +builtin_decl (unsigned int code, bool) +{ + if (code >= vec_safe_length (registered_functions)) + return error_mark_node; + + return (*registered_functions)[code]->decl; +} + +/* Perform any semantic checks needed for a call to the RVV function + with subcode CODE, such as testing for integer constant expressions. + The call occurs at location LOCATION and has NARGS arguments, + given by ARGS. FNDECL is the original function decl, before + overload resolution. + + Return true if the call is valid, otherwise report a suitable error. */ +bool +check_builtin_call (location_t location, vec, unsigned int code, + tree fndecl, unsigned int nargs, tree *args) +{ + const registered_function &rfn = *(*registered_functions)[code]; + function_builder *builder = rfn.instance.builder (); + + if (!builder->check_required_extensions ( + location, rfn.decl, rfn.instance.get_arg_pattern ().arg_extensions)) + return false; + + return rfn.instance.check (location, fndecl, TREE_TYPE (rfn.decl), nargs, + args); +} + +/* Attempt to fold STMT, given that it's a call to the RVV function + with subcode CODE. Return the new statement on success and null + on failure. Insert any other new statements at GSI. */ +gimple * +gimple_fold_builtin (unsigned int code, gimple_stmt_iterator *gsi, gcall *stmt) +{ + registered_function &rfn = *(*registered_functions)[code]; + function_builder *builder = rfn.instance.builder (); + return builder->fold (rfn.instance, gsi, stmt); +} + +/* Expand a call to the RVV function with subcode CODE. EXP is the call + expression and TARGET is the preferred location for the result. 
+ Return the value of the lhs. */ +rtx +expand_builtin (unsigned int code, tree exp, rtx target) +{ + registered_function &rfn = *(*registered_functions)[code]; + function_builder *builder = rfn.instance.builder (); + + if (!builder->check_required_extensions ( + EXPR_LOCATION (exp), rfn.decl, + rfn.instance.get_arg_pattern ().arg_extensions)) + return target; + + return builder->expand (rfn.instance, exp, target); +} + +riscv_vector::vector_arg_all_modes & +get_vector_arg_all_patterns (unsigned int len, + riscv_vector::vector_arg_attr_info attr, ...) +{ + riscv_vector::vector_arg_all_modes &patterns = + *ggc_alloc (); + patterns.arg_len = len; + patterns.arg_list = (riscv_vector::vector_mode_attr_list **)xmalloc ( + len * sizeof (riscv_vector::vector_mode_attr_list *)); + patterns.target_op_list = (int *)xmalloc (len * sizeof (int)); + patterns.dt_list = + (enum data_type_index *)xmalloc (len * sizeof (enum data_type_index)); + + unsigned int arg_idx = 0; + va_list arg_ptr; + riscv_vector::vector_arg_attr_info next_attr = attr; + + va_start (arg_ptr, attr); + + while (arg_idx < len) + { + patterns.dt_list[arg_idx] = next_attr.dt; + patterns.arg_list[arg_idx] = next_attr.mode_attr_list; + patterns.target_op_list[arg_idx] = next_attr.target_op; + next_attr = va_arg (arg_ptr, riscv_vector::vector_arg_attr_info); + arg_idx++; + } + + va_end (arg_ptr); + + return patterns; +} + +static vector_mode_attr_list vector_mode_attr_list_list[vector_arg_mode_category_num]; + +static void +init_def_variables () +{ + +/* define vector arg mode category */ +#define VVAR(NAME) vector_mode_attr_list_list[vector_mode_attr_##NAME] +#define VITER(NAME, SIGN) riscv_vector::vector_arg_attr_info{-1, DT_##SIGN, &VVAR(NAME)} +#define VATTR(OP, NAME, SIGN) riscv_vector::vector_arg_attr_info{OP, DT_##SIGN, &VVAR(NAME)} +#define DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VARIABLE_NAME, ELEM_CNT) \ + VVAR(VARIABLE_NAME) = {ELEM_CNT, \ + (riscv_vector::vector_mode_attr*)xmalloc(ELEM_CNT * sizeof(vector_mode_attr))}; +#include "riscv-vector-builtins-iterators.def" +#undef DEF_RISCV_ARG_MODE_ATTR_VARIABLE + +/* define every vector arg mode in category */ +#define DEF_RISCV_ARG_MODE_ATTR(VARIABLE_NAME, INDEX, MODE, ATTR_MODE, \ + CONDITION) \ + VVAR (VARIABLE_NAME).attr_list[INDEX] \ + = { MODE##mode, ATTR_MODE##mode, RISCV_##CONDITION }; +#include "riscv-vector-builtins-iterators.def" +#undef DEF_RISCV_ARG_MODE_ATTR + + /* count the number of intrinsic functions */ + NUM_INSN_FUNC = 0; +#define DEF_RVV_FUNCTION(BASE_NAME, CLASS_NAME, ARG_PATTERN, INTRNSIC_PATTER, PREDS, OP_TYPES) \ + NUM_INSN_FUNC++; +#include "riscv-vector-builtins-functions.def" +#undef DEF_RVV_FUNCTION + + all_vector_functions = (riscv_vector::function_builder **)xmalloc ( + sizeof (riscv_vector::function_builder *) * NUM_INSN_FUNC); + + unsigned int func_idx = 0; +#define VITER(NAME, SIGN) \ + riscv_vector::vector_arg_attr_info { -1, DT_##SIGN, &VVAR (NAME) } +#define VATTR(OP, NAME, SIGN) \ + riscv_vector::vector_arg_attr_info { OP, DT_##SIGN, &VVAR (NAME) } +#define DEF_RVV_FUNCTION(BASE_NAME, CLASS_NAME, ARG_PATTERN, INTRNSIC_PATTER, PREDS, OP_TYPES) \ + all_vector_functions[func_idx++] = new riscv_vector::CLASS_NAME ( \ + #BASE_NAME, get_vector_arg_all_patterns ARG_PATTERN, INTRNSIC_PATTER, PREDS, OP_TYPES, \ + (REQUIRED_EXTENSIONS)); +#include "riscv-vector-builtins-functions.def" +#undef DEF_RVV_FUNCTION +} + +} //end namespace riscv_vector + +using namespace riscv_vector; + +#include "gt-riscv-vector-builtins.h" \ No newline at end of file diff --git 
a/gcc/config/riscv/riscv-vector-builtins.def b/gcc/config/riscv/riscv-vector-builtins.def new file mode 100644 index 00000000000..805f9d725dc --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins.def @@ -0,0 +1,37 @@ +/* Builtins macros for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#ifndef DEF_RVV_TYPE +#define DEF_RVV_TYPE(A, B) +#endif + +DEF_RVV_TYPE (bool, boolean) +DEF_RVV_TYPE (float32, float) +DEF_RVV_TYPE (float64, double) +DEF_RVV_TYPE (int8, int8) +DEF_RVV_TYPE (int16, int16) +DEF_RVV_TYPE (int32, int32) +DEF_RVV_TYPE (int64, int64) +DEF_RVV_TYPE (uint8, unsigned_int8) +DEF_RVV_TYPE (uint16, unsigned_int16) +DEF_RVV_TYPE (uint32, unsigned_int32) +DEF_RVV_TYPE (uint64, unsigned_int64) + +#undef DEF_RVV_TYPE \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h new file mode 100644 index 00000000000..6bba4c90c3a --- /dev/null +++ b/gcc/config/riscv/riscv-vector-builtins.h @@ -0,0 +1,59 @@ +/* Builtins definitions for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . 
*/ + +#ifndef GCC_RISCV_VECTOR_BUILTINS_H +#define GCC_RISCV_VECTOR_BUILTINS_H + +#include +#include "riscv-vector-builtins-functions.h" + +namespace riscv_vector +{ + +/* global share variables */ + +static const unsigned int RISCV_TARGET_ANY = 0; +static const unsigned int RISCV_TARGET_VECTOR = 1; +static const unsigned int RISCV_TARGET_HARD_FLOAT = 1 << 2; +static const unsigned int RISCV_TARGET_DOUBLE_FLOAT = 1 << 3; + +enum vector_arg_mode_category { +#define VVAR(NAME) vector_mode_attr_##NAME +#define DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VARIABLE_NAME, ELEM_CNT) \ + VVAR(VARIABLE_NAME), +#include "riscv-vector-builtins-iterators.def" +#undef DEF_RISCV_ARG_MODE_ATTR_VARIABLE +#undef VVAR + /* the number of arg mode category */ + vector_arg_mode_category_num +}; + +void init_builtins (); +void handle_pragma_vector (); +tree builtin_decl (unsigned, bool); +gimple *gimple_fold_builtin (unsigned int, gimple_stmt_iterator *, gcall *); +rtx expand_builtin (unsigned int, tree, rtx); +bool check_builtin_call (location_t, vec, unsigned int, + tree, unsigned int, tree *); +machine_mode vector_builtin_mode (scalar_mode, enum vlmul_field_enum); + +} // end namespace riscv_vector + +#endif \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index e315b5d2cac..87dc6739f4f 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -226,4 +226,21 @@ rvv_regsize (machine_mode mode) } return 1; +} + +/* Get related mask mode for a RVV vector mode. */ + +opt_machine_mode +rvv_get_mask_mode (machine_mode mode) +{ + machine_mode mask_mode; + int nf = 1; + + FOR_EACH_MODE_IN_CLASS (mask_mode, MODE_VECTOR_BOOL) + if (GET_MODE_INNER (mask_mode) == BImode + && known_eq (GET_MODE_NUNITS (mask_mode) * nf, + GET_MODE_NUNITS (mode)) + && rvv_mask_mode_p (mask_mode)) + return mask_mode; + return default_get_mask_mode (mode); } \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h index b8d77ddb195..62df507e7df 100644 --- a/gcc/config/riscv/riscv-vector.h +++ b/gcc/config/riscv/riscv-vector.h @@ -25,4 +25,5 @@ bool rvv_legitimate_poly_int_p (rtx); unsigned int rvv_offset_temporaries (bool, poly_int64); vlmul_field_enum rvv_classify_vlmul_field (machine_mode); int rvv_regsize (machine_mode); +opt_machine_mode rvv_get_mask_mode (machine_mode); #endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 37d8f1271d4..b82a38da7c9 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -4967,7 +4967,16 @@ riscv_class_max_nregs (reg_class_t rclass, machine_mode mode) if (reg_class_subset_p (rclass, GR_REGS)) return riscv_hard_regno_nregs (GP_REG_FIRST, mode); + + if (reg_class_subset_p (V_REGS, rclass)) + return riscv_hard_regno_nregs (V_REG_FIRST, mode); + + if (reg_class_subset_p (VL_REGS, rclass)) + return 1; + if (reg_class_subset_p (VTYPE_REGS, rclass)) + return 1; + return 0; } @@ -5308,6 +5317,15 @@ riscv_conditional_register_usage (void) for (int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++) call_used_regs[regno] = 1; } + + if (!TARGET_VECTOR) + { + for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++) + call_used_regs[regno] = 1; + + fixed_regs[VTYPE_REGNUM] = call_used_regs[VTYPE_REGNUM] = 1; + fixed_regs[VL_REGNUM] = call_used_regs[VL_REGNUM] = 1; + } } /* Return a register priority for hard reg REGNO. 
*/ @@ -5903,6 +5921,9 @@ riscv_asan_shadow_offset (void) #undef TARGET_BUILTIN_DECL #define TARGET_BUILTIN_DECL riscv_builtin_decl +#undef TARGET_GIMPLE_FOLD_BUILTIN +#define TARGET_GIMPLE_FOLD_BUILTIN riscv_gimple_fold_builtin + #undef TARGET_EXPAND_BUILTIN #define TARGET_EXPAND_BUILTIN riscv_expand_builtin diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv index b5abf9c45d0..9b0da73f3b5 100644 --- a/gcc/config/riscv/t-riscv +++ b/gcc/config/riscv/t-riscv @@ -27,6 +27,42 @@ riscv-vector.o: $(srcdir)/config/riscv/riscv-vector.cc $(COMPILE) $< $(POSTCOMPILE) +riscv-vector-builtins-functions.o: \ + $(srcdir)/config/riscv/riscv-vector-builtins-functions.cc \ + $(srcdir)/config/riscv/riscv-vector-builtins.def \ + $(srcdir)/config/riscv/riscv-vector-builtins-functions.def \ + $(srcdir)/config/riscv/vector-iterators.md \ + $(srcdir)/config/riscv/md-parser \ + $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \ + $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) $(DIAGNOSTIC_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \ + gimple-iterator.h gimplify.h explow.h $(EMIT_RTL_H) tree-vector-builder.h \ + stor-layout.h $(REG_H) alias.h gimple-fold.h langhooks.h \ + stringpool.h \ + $(srcdir)/config/riscv/riscv-vector-builtins-functions.h + python3 $(srcdir)/config/riscv/md-parser \ + riscv-vector-builtins-iterators.def vector-iterators.md && \ + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(srcdir)/config/riscv/riscv-vector-builtins-functions.cc + +riscv-vector-builtins.o: \ + $(srcdir)/config/riscv/riscv-vector-builtins.cc \ + $(srcdir)/config/riscv/riscv-vector-builtins.def \ + $(srcdir)/config/riscv/riscv-vector-builtins-functions.def \ + $(srcdir)/config/riscv/vector-iterators.md \ + $(srcdir)/config/riscv/md-parser \ + $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \ + $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) $(DIAGNOSTIC_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \ + gimple-iterator.h gimplify.h explow.h $(EMIT_RTL_H) tree-vector-builder.h \ + stor-layout.h $(REG_H) alias.h gimple-fold.h langhooks.h \ + stringpool.h \ + $(srcdir)/config/riscv/riscv-vector-builtins.h \ + $(srcdir)/config/riscv/riscv-vector-builtins-functions.h \ + riscv-vector-builtins-functions.o + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(srcdir)/config/riscv/riscv-vector-builtins.cc + PASSES_EXTRA += $(srcdir)/config/riscv/riscv-passes.def $(common_out_file): $(srcdir)/config/riscv/riscv-cores.def \ diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md new file mode 100644 index 00000000000..450e20c44ce --- /dev/null +++ b/gcc/config/riscv/vector-iterators.md @@ -0,0 +1,19 @@ +;; Machine description for RISCV architecture. +;; Copyright (C) 2022-2022 Free Software Foundation, Inc. +;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. 
+;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; . \ No newline at end of file From patchwork Tue May 31 08:49:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54546 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A755F3954421 for ; Tue, 31 May 2022 08:53:31 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgsg2.qq.com (smtpbgsg2.qq.com [54.254.200.128]) by sourceware.org (Postfix) with ESMTPS id 697D4386F463 for ; Tue, 31 May 2022 08:50:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 697D4386F463 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987025tmm7icwv Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:24 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: 0VgNaGdhy9iV3OGcTTZT+ByT6OlgolLye3sgRjec1mugZo/B2iylQPhW+KZry 5YUjDueopGqNGAhToWIMLmQ8Xsvwcalb9rLJC9qIGpDPHY1vFjuyEWnp2egMxTYLnlQ5hcp 7DdIFslbmVMhrsdhsUiYHQeOZ47DbBmYFLR/v+IJ8FDd2YeNB4yRKhGZhSn+MOLIs3y41Vs myenhY/Pb+t+r3LZ9CK6aGSP97ynsxtqR9vM0ojiHGfSfB5JfndgNapjec5sElxGsGAbOeQ 2Z/b5QzIELzDGfgMgLcie+hKvGItzg7gI7vw9443grBchLSV+OA478RggYLlHW0uzZ0JU4H dAkXWI6Jg6iN6ag4/BFJEwDzpyPh+2oJlJ9TPiB X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 03/21] Add RVV datatypes Date: Tue, 31 May 2022 16:49:54 +0800 Message-Id: <20220531085012.269719-4-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign10 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (make_type_sizeless): New function. (sizeless_type_p): New function. (vector_builtin_mode): New function. (vector_legal_vlmul): New function. (add_vector_type_attribute): New function. (register_general_builtin_types): New function. (DEFINE_SCALAR_PTR_TYPE_NODE): New function. (register_builtin_types): New function. (register_vector_type): New function. (handle_pragma_vector): New function. (lookup_rvv_type_attribute): New function. (builtin_type_p): New function. (verify_type_context): New function. (mangle_builtin_type): New function. * config/riscv/riscv-vector-builtins.h (builtin_type_p): New function. (verify_type_context): New function. 
(mangle_builtin_type): New function. * config/riscv/riscv.cc (riscv_vector_mode_supported_p): New function. (riscv_vector_alignment): New function. (riscv_vectorize_preferred_vector_alignment): New function. (riscv_simd_vector_alignment_reachable): New function. (riscv_builtin_support_vector_misalignment): New function. (riscv_compatible_vector_types_p): New function. (riscv_verify_type_context): New function. (riscv_mangle_type): New function. (TARGET_VECTOR_MODE_SUPPORTED_P): New targethook. (TARGET_VECTOR_ALIGNMENT): New targethook. (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New targethook. (TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE): New targethook. (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): New targethook. (TARGET_COMPATIBLE_VECTOR_TYPES_P): New targethook. (TARGET_VERIFY_TYPE_CONTEXT): New targethook. (TARGET_MANGLE_TYPE): New targethook. --- gcc/config/riscv/riscv-vector-builtins.cc | 466 ++++++++++++++++++++++ gcc/config/riscv/riscv-vector-builtins.h | 3 + gcc/config/riscv/riscv.cc | 144 +++++++ 3 files changed, 613 insertions(+) diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc index 7ea07a24b5b..ef734572add 100644 --- a/gcc/config/riscv/riscv-vector-builtins.cc +++ b/gcc/config/riscv/riscv-vector-builtins.cc @@ -109,6 +109,286 @@ static unsigned int NUM_INSN_FUNC; static void init_def_variables (); +/* Force TYPE to be a sizeless type. */ +static void +make_type_sizeless (tree type) +{ + TYPE_ATTRIBUTES (type) = tree_cons (get_identifier ("RVV sizeless type"), + NULL_TREE, TYPE_ATTRIBUTES (type)); +} + +/* Return true if TYPE is a sizeless type. */ +static bool +sizeless_type_p (const_tree type) +{ + if (type == error_mark_node) + return NULL_TREE; + return lookup_attribute ("RVV sizeless type", TYPE_ATTRIBUTES (type)); +} + +machine_mode +vector_builtin_mode (scalar_mode inner_mode, enum vlmul_field_enum vlmul) +{ + switch (inner_mode) + { + case E_BImode: + return vlmul == VLMUL_FIELD_000 ? VNx16BImode + : vlmul == VLMUL_FIELD_001 ? VNx32BImode + : vlmul == VLMUL_FIELD_010 ? VNx64BImode + : vlmul == VLMUL_FIELD_011 ? VNx128BImode + : vlmul == VLMUL_FIELD_111 ? VNx8BImode + : vlmul == VLMUL_FIELD_110 ? VNx4BImode + : VNx2BImode; + + case E_QImode: + return vlmul == VLMUL_FIELD_000 ? VNx16QImode + : vlmul == VLMUL_FIELD_001 ? VNx32QImode + : vlmul == VLMUL_FIELD_010 ? VNx64QImode + : vlmul == VLMUL_FIELD_011 ? VNx128QImode + : vlmul == VLMUL_FIELD_111 ? VNx8QImode + : vlmul == VLMUL_FIELD_110 ? VNx4QImode + : VNx2QImode; + + case E_HImode: + if (vlmul == VLMUL_FIELD_101) + gcc_unreachable (); + + return vlmul == VLMUL_FIELD_000 ? VNx8HImode + : vlmul == VLMUL_FIELD_001 ? VNx16HImode + : vlmul == VLMUL_FIELD_010 ? VNx32HImode + : vlmul == VLMUL_FIELD_011 ? VNx64HImode + : vlmul == VLMUL_FIELD_111 ? VNx4HImode + : VNx2HImode; + + case E_SImode: + if (vlmul == VLMUL_FIELD_101 || vlmul == VLMUL_FIELD_110) + gcc_unreachable (); + + return vlmul == VLMUL_FIELD_000 ? VNx4SImode + : vlmul == VLMUL_FIELD_001 ? VNx8SImode + : vlmul == VLMUL_FIELD_010 ? VNx16SImode + : vlmul == VLMUL_FIELD_011 ? VNx32SImode + : VNx2SImode; + + case E_DImode: + if (vlmul == VLMUL_FIELD_101 || vlmul == VLMUL_FIELD_110 || + vlmul == VLMUL_FIELD_111) + gcc_unreachable (); + + return vlmul == VLMUL_FIELD_000 ? VNx2DImode + : vlmul == VLMUL_FIELD_001 ? VNx4DImode + : vlmul == VLMUL_FIELD_010 ? 
VNx8DImode + : VNx16DImode; + + case E_SFmode: + if (vlmul == VLMUL_FIELD_101 || vlmul == VLMUL_FIELD_110) + gcc_unreachable (); + + return vlmul == VLMUL_FIELD_000 ? VNx4SFmode + : vlmul == VLMUL_FIELD_001 ? VNx8SFmode + : vlmul == VLMUL_FIELD_010 ? VNx16SFmode + : vlmul == VLMUL_FIELD_011 ? VNx32SFmode + : VNx2SFmode; + + case E_DFmode: + if (vlmul == VLMUL_FIELD_101 || vlmul == VLMUL_FIELD_110 || + vlmul == VLMUL_FIELD_111) + gcc_unreachable (); + + return vlmul == VLMUL_FIELD_000 ? VNx2DFmode + : vlmul == VLMUL_FIELD_001 ? VNx4DFmode + : vlmul == VLMUL_FIELD_010 ? VNx8DFmode + : VNx16DFmode; + + default: + gcc_unreachable (); + } + + gcc_unreachable (); +} + +static bool +vector_legal_vlmul (scalar_mode inner_mode, enum vlmul_field_enum vlmul) +{ + if (vlmul == VLMUL_FIELD_100) + return false; + + switch (inner_mode) + { + case E_HImode: + return vlmul != VLMUL_FIELD_101; + + case E_SImode: + case E_SFmode: + return vlmul != VLMUL_FIELD_101 && vlmul != VLMUL_FIELD_110; + + case E_DImode: + case E_DFmode: + return vlmul <= VLMUL_FIELD_011; + + default: + break; + } + + return true; +} + +/* Record that TYPE is an ABI-defined VECTOR type that contains SEW and LMUL + information for RVV vector. MANGLED_NAME, if nonnull, is the ABI-defined + mangling of the type. + */ +static void +add_vector_type_attribute (tree type, unsigned int nf, unsigned int sew, unsigned int vlmul, + unsigned int is_bool, const char *mangled_name) +{ + tree mangled_name_tree + = (mangled_name ? get_identifier (mangled_name) : NULL_TREE); + + tree value = tree_cons (NULL_TREE, mangled_name_tree, NULL_TREE); + value = tree_cons (NULL_TREE, size_int (nf), value); + value = tree_cons (NULL_TREE, size_int (sew), value); + value = tree_cons (NULL_TREE, size_int (vlmul), value); + value = tree_cons (NULL_TREE, size_int (is_bool), value); + TYPE_ATTRIBUTES (type) = tree_cons (get_identifier ("RVV type"), value, + TYPE_ATTRIBUTES (type)); +} + +/* These codes copied from ARM. */ +static void +register_general_builtin_types (void) +{ + scalar_types[VECTOR_TYPE_bool] = boolean_type_node; + scalar_types[VECTOR_TYPE_int8] = intQI_type_node; + scalar_types[VECTOR_TYPE_uint8] = unsigned_intQI_type_node; + scalar_types[VECTOR_TYPE_int16] = intHI_type_node; + scalar_types[VECTOR_TYPE_uint16] = unsigned_intHI_type_node; + + if (TARGET_64BIT) + { + scalar_types[VECTOR_TYPE_int32] = intSI_type_node; + scalar_types[VECTOR_TYPE_uint32] = unsigned_intSI_type_node; + } + else + { + /* int32_t/uint32_t defined as `long`/`unsigned long` in RV32, + but intSI_type_node/unsigned_intSI_type_node is + `int` and `unsigned int`, so use long_integer_type_node and + long_unsigned_type_node here for type consistent. 
*/ + scalar_types[VECTOR_TYPE_int32] = long_integer_type_node; + scalar_types[VECTOR_TYPE_uint32] = long_unsigned_type_node; + } + + scalar_types[VECTOR_TYPE_int64] = intDI_type_node; + scalar_types[VECTOR_TYPE_uint64] = unsigned_intDI_type_node; + scalar_types[VECTOR_TYPE_float32] = float_type_node; + scalar_types[VECTOR_TYPE_float64] = double_type_node; + + /* Pointer type */ +#define DEFINE_SCALAR_PTR_TYPE_NODE(NBITS) \ + scalar_pointer_types[VECTOR_TYPE_int##NBITS] = \ + build_pointer_type (scalar_types[VECTOR_TYPE_int##NBITS]); \ + scalar_pointer_types[VECTOR_TYPE_uint##NBITS] = \ + build_pointer_type (scalar_types[VECTOR_TYPE_uint##NBITS]); \ + const_scalar_pointer_types[VECTOR_TYPE_int##NBITS] = build_pointer_type ( \ + build_type_variant (scalar_types[VECTOR_TYPE_int##NBITS], 1, 0)); \ + const_scalar_pointer_types[VECTOR_TYPE_uint##NBITS] = build_pointer_type ( \ + build_type_variant (scalar_types[VECTOR_TYPE_uint##NBITS], 1, 0)); + + DEFINE_SCALAR_PTR_TYPE_NODE (8) + DEFINE_SCALAR_PTR_TYPE_NODE (16) + DEFINE_SCALAR_PTR_TYPE_NODE (32) + DEFINE_SCALAR_PTR_TYPE_NODE (64) + + scalar_pointer_types[VECTOR_TYPE_float32] = float_ptr_type_node; + scalar_pointer_types[VECTOR_TYPE_float64] = double_ptr_type_node; + const_scalar_pointer_types[VECTOR_TYPE_float32] = build_pointer_type ( + build_type_variant (scalar_types[VECTOR_TYPE_float32], 1, 0)); + const_scalar_pointer_types[VECTOR_TYPE_float64] = build_pointer_type ( + build_type_variant (scalar_types[VECTOR_TYPE_float64], 1, 0)); +} + +/* Register the built-in VECTOR ABI types, such as __rvv_int8mf8_t. */ +static void +register_builtin_types () +{ + for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i) + { + tree eltype = scalar_types[i]; + scalar_mode elmode = + (eltype == boolean_type_node) ? BImode : SCALAR_TYPE_MODE (eltype); + + for (unsigned int j = 0; j < ARRAY_SIZE (vector_vlmuls); ++j) + { + if (!vector_legal_vlmul (elmode, vector_vlmuls[j].vlmul)) + continue; + + char abi_name[NAME_MAXLEN] = {0}; + char mangled_name[NAME_MAXLEN] = {0}; + bool is_bool; + tree vectype; + unsigned int sew = GET_MODE_BITSIZE (elmode); + machine_mode mode = + vector_builtin_mode (elmode, vector_vlmuls[j].vlmul); + + /* mask type in RVV. */ + vectype = build_vector_type_for_mode (eltype, mode); + + /* NOTE: Reference to 'omp_clause_aligned_alignment' function in + omp-low.c. We don't know why we need this protection, it seems + to make the buildup of GCC more reliable. */ + if (TYPE_MODE (vectype) != mode) + continue; + + if (eltype == boolean_type_node) + { + gcc_assert (VECTOR_MODE_P (TYPE_MODE (vectype)) && + TYPE_MODE (vectype) == mode && + TYPE_MODE (vectype) == TYPE_MODE_RAW (vectype) && + TYPE_ALIGN (vectype) == 8 && + known_eq (tree_to_poly_uint64 (TYPE_SIZE (vectype)), + BITS_PER_RISCV_VECTOR)); + is_bool = true; + } + else + { + gcc_assert (VECTOR_MODE_P (TYPE_MODE (vectype)) && + TYPE_MODE (vectype) == mode && + TYPE_MODE_RAW (vectype) == mode && + TYPE_ALIGN (vectype) <= 128 && + known_eq (tree_to_poly_uint64 (TYPE_SIZE (vectype)), + GET_MODE_BITSIZE (mode))); + is_bool = false; + } + /* These codes copied from ARM. */ + /* abi_name and api_name follows vector type implementation in LLVM. + Take sew = 8, vlmul = 1/8 for example, + abi_name = __rvv_int8mf8_t, + api_name = vint8mf8_t. + The mangle name follows the rule of aarch64 + that is "u" + length of (abi_name) + abi_name. + So that mangle_name = u15__rvv_int8mf8_t. */ + snprintf (abi_name, NAME_MAXLEN, "__rvv_%s%s_t", + vector_type_infos[i].elem_name, + is_bool ? 
vector_vlmuls[j].boolnum + : vector_vlmuls[j].suffix); + snprintf (mangled_name, NAME_MAXLEN, "u%d__rvv_%s%s_t", + (int)strlen (abi_name), vector_type_infos[i].elem_name, + is_bool ? vector_vlmuls[j].boolnum + : vector_vlmuls[j].suffix); + vectype = build_distinct_type_copy (vectype); + gcc_assert (vectype == TYPE_MAIN_VARIANT (vectype)); + SET_TYPE_STRUCTURAL_EQUALITY (vectype); + TYPE_ARTIFICIAL (vectype) = 1; + TYPE_INDIVISIBLE_P (vectype) = 1; + add_vector_type_attribute (vectype, 1, sew, vector_vlmuls[j].vlmul, is_bool, + mangled_name); + make_type_sizeless (vectype); + internal_vector_types[i][j] = vectype; + lang_hooks.types.register_builtin_type (vectype, abi_name); + } + } +} + /* Initialize all compiler built-ins related to RVV that should be defined at start-up. */ void @@ -116,6 +396,192 @@ init_builtins () { if (!TARGET_VECTOR) return; + + register_general_builtin_types (); + register_builtin_types (); + + if (in_lto_p) + handle_pragma_vector (); +} + +/* These codes copied from ARM. */ +/* Register vector type TYPE under its risv_vector.h name. */ +static void +register_vector_type (unsigned int type, unsigned int lmul) +{ + tree vectype = internal_vector_types[type][lmul]; + char rvv_name[NAME_MAXLEN] = {0}; + snprintf (rvv_name, NAME_MAXLEN, "v%s%s_t", vector_type_infos[type].elem_name, + strcmp (vector_type_infos[type].elem_name, "bool") == 0 + ? vector_vlmuls[lmul].boolnum + : vector_vlmuls[lmul].suffix); + tree id = get_identifier (rvv_name); + tree decl = build_decl (input_location, TYPE_DECL, id, vectype); + decl = lang_hooks.decls.pushdecl (decl); + + /* Record the new RVV type if pushdecl succeeded without error. Use + the ABI type otherwise, so that the type we record at least has the + right form, even if it doesn't have the right name. This should give + better error recovery behavior than installing error_mark_node or + installing an incorrect type. */ + if (decl && TREE_CODE (decl) == TYPE_DECL && + TREE_TYPE (decl) != error_mark_node && + TYPE_MAIN_VARIANT (TREE_TYPE (decl)) == vectype) + vectype = TREE_TYPE (decl); + + vector_types[0][type][lmul] = vectype; + vector_pointer_types[type][lmul] = build_pointer_type (vectype); +} + +/* Implement #pragma riscv intrinsic vector. */ +void +handle_pragma_vector () +{ + if (function_table) + { + error ("duplicate definition of %qs", "vector"); + return; + } + + /* Define the vector and tuple types. */ + for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i) + { + tree eltype = scalar_types[i]; + scalar_mode elmode = + (eltype == boolean_type_node) ? BImode : SCALAR_TYPE_MODE (eltype); + + for (unsigned int j = 0; j < ARRAY_SIZE (vector_vlmuls); ++j) + { + if (!vector_legal_vlmul (elmode, vector_vlmuls[j].vlmul)) + continue; + + register_vector_type (i, j); + } + } + + init_def_variables (); + + /* Define the functions. */ + function_table = new hash_table (1023); + + for (unsigned int i = 0; i < NUM_INSN_FUNC; ++i) + all_vector_functions[i]->register_function (); +} + +/* If TYPE is an ABI-defined RVV type, return its attribute descriptor, + otherwise return null. */ +static tree +lookup_rvv_type_attribute (const_tree type) +{ + if (type == error_mark_node) + return NULL_TREE; + return lookup_attribute ("RVV type", TYPE_ATTRIBUTES (type)); +} + +/* Return true if TYPE is a built-in RVV type defined by the ABI or RVV. */ +bool +builtin_type_p (const_tree type) +{ + return lookup_rvv_type_attribute (type); +} + +/* Implement TARGET_VERIFY_TYPE_CONTEXT for RVV types. 
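+   Sizeless RVV types are rejected in any context that needs a fixed size
+   or a complete object type, e.g. (illustrative):
+
+     size_t n = sizeof (vint8m1_t);    // "does not have a fixed size"
+     struct s { vint8m1_t field; };    // "fields cannot have RVV type"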
*/ +bool +verify_type_context (location_t loc, type_context_kind context, + const_tree type, bool silent_p) +{ + if (!sizeless_type_p (type)) + return true; + + switch (context) + { + case TCTX_SIZEOF: + case TCTX_STATIC_STORAGE: + if (!silent_p) + error_at (loc, "RVV type %qT does not have a fixed size", type); + + return false; + + case TCTX_ALIGNOF: + if (!silent_p) + error_at (loc, "RVV type %qT does not have a defined alignment", type); + + return false; + + case TCTX_THREAD_STORAGE: + if (!silent_p) + error_at (loc, + "variables of type %qT cannot have thread-local" + " storage duration", + type); + + return false; + + case TCTX_POINTER_ARITH: + if (!silent_p) + error_at (loc, "arithmetic on pointer to RVV type %qT", type); + + return false; + + case TCTX_FIELD: + if (silent_p) + ; + else if (lang_GNU_CXX ()) + error_at (loc, "member variables cannot have RVV type %qT", type); + else + error_at (loc, "fields cannot have RVV type %qT", type); + + return false; + + case TCTX_ARRAY_ELEMENT: + if (!silent_p) + error_at (loc, "array elements cannot have RVV type %qT", type); + + return false; + + case TCTX_ALLOCATION: + if (!silent_p) + error_at (loc, "cannot allocate objects with RVV type %qT", type); + + return false; + + case TCTX_DEALLOCATION: + if (!silent_p) + error_at (loc, "cannot delete objects with RVV type %qT", type); + + return false; + + case TCTX_EXCEPTIONS: + if (!silent_p) + error_at (loc, "cannot throw or catch RVV type %qT", type); + + return false; + + case TCTX_CAPTURE_BY_COPY: + if (!silent_p) + error_at (loc, "capture by copy of RVV type %qT", type); + + return false; + } + + gcc_unreachable (); +} + +/* If TYPE is a built-in type defined by the RVV ABI, return the mangled name, + otherwise return NULL. */ +const char * +mangle_builtin_type (const_tree type) +{ + /* ??? The C++ frontend normally strips qualifiers and attributes before + calling this hook, adding separate mangling for attributes that affect + type identity. Fortunately the type copy will have the same TYPE_NAME + as the original, so we can get the attributes from there. */ + if (TYPE_NAME (type) && TREE_CODE (TYPE_NAME (type)) == TYPE_DECL) + type = TREE_TYPE (TYPE_NAME (type)); + if (tree attr = lookup_rvv_type_attribute (type)) + if (tree id = TREE_VALUE (chain_index (4, TREE_VALUE (attr)))) + return IDENTIFIER_POINTER (id); + return NULL; } /* Return the function decl with RVV function subcode CODE, or error_mark_node diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h index 6bba4c90c3a..ef91248035c 100644 --- a/gcc/config/riscv/riscv-vector-builtins.h +++ b/gcc/config/riscv/riscv-vector-builtins.h @@ -53,6 +53,9 @@ rtx expand_builtin (unsigned int, tree, rtx); bool check_builtin_call (location_t, vec, unsigned int, tree, unsigned int, tree *); machine_mode vector_builtin_mode (scalar_mode, enum vlmul_field_enum); +bool builtin_type_p (const_tree); +bool verify_type_context (location_t, type_context_kind, const_tree, bool); +const char * mangle_builtin_type (const_tree); } // end namespace riscv_vector diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index b82a38da7c9..8c78e726a19 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -67,6 +67,8 @@ along with GCC; see the file COPYING3. 
If not see #include "tree-vectorizer.h" #include "tree-ssa-loop-niter.h" #include "rtx-vector-builder.h" +#include "riscv-vector-builtins.h" +#include "riscv-vector.h" /* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF. */ #define UNSPEC_ADDRESS_P(X) \ @@ -5780,6 +5782,124 @@ riscv_asan_shadow_offset (void) return TARGET_64BIT ? (HOST_WIDE_INT_1 << 29) : 0; } +/* Implement TARGET_VECTOR_MODE_SUPPORTED_P. */ + +static bool +riscv_vector_mode_supported_p (machine_mode mode) +{ + if (TARGET_VECTOR && rvv_mode_p (mode)) + return true; + + return false; +} + +/* Implement TARGET_VECTOR_ALIGNMENT. */ + +static HOST_WIDE_INT +riscv_vector_alignment (const_tree type) +{ + /* ??? Checking the mode isn't ideal, but VECTOR_BOOLEAN_TYPE_P can + be set for non-predicate vectors of booleans. Modes are the most + direct way we have of identifying real RVV predicate types. */ + /* FIXME: RVV didn't mention the alignment of bool, we uses + one byte align. */ + if (GET_MODE_CLASS (TYPE_MODE (type)) == MODE_VECTOR_BOOL) + return 8; + + widest_int min_size = + constant_lower_bound (wi::to_poly_widest (TYPE_SIZE (type))); + return wi::umin (min_size, 128).to_uhwi (); +} + +/* Implement target hook TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT. */ + +static poly_uint64 +riscv_vectorize_preferred_vector_alignment (const_tree type) +{ + if (rvv_mode_p (TYPE_MODE (type))) + { + /* If the length of the vector is a fixed power of 2, try to align + to that length, otherwise don't try to align at all. */ + HOST_WIDE_INT result; + + if (!GET_MODE_BITSIZE (TYPE_MODE (type)).is_constant (&result) || + !pow2p_hwi (result)) + result = TYPE_ALIGN (TREE_TYPE (type)); + + return result; + } + + return default_preferred_vector_alignment (type); +} + +/* Implement target hook TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE. */ + +static bool +riscv_simd_vector_alignment_reachable (const_tree type, bool is_packed) +{ + if (is_packed) + return false; + + /* For fixed-length vectors, check that the vectorizer will aim for + full-vector alignment. This isn't true for generic GCC vectors + that are wider than the ABI maximum of 128 bits. */ + poly_uint64 preferred_alignment = + riscv_vectorize_preferred_vector_alignment (type); + if (TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST && + maybe_ne (wi::to_widest (TYPE_SIZE (type)), preferred_alignment)) + return false; + + /* Vectors whose size is <= BIGGEST_ALIGNMENT are naturally aligned. */ + return true; +} + +/* Return true if the vector misalignment factor is supported by the + target. */ + +static bool +riscv_builtin_support_vector_misalignment (machine_mode mode, const_tree type, + int misalignment, bool is_packed) +{ + return default_builtin_support_vector_misalignment (mode, type, misalignment, + is_packed); +} + +/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P. */ + +static bool +riscv_compatible_vector_types_p (const_tree type1, const_tree type2) +{ + return (riscv_vector::builtin_type_p (type1) == + riscv_vector::builtin_type_p (type2)); +} + +/* Implement TARGET_VERIFY_TYPE_CONTEXT. */ + +static bool +riscv_verify_type_context (location_t loc, type_context_kind context, + const_tree type, bool silent_p) +{ + return riscv_vector::verify_type_context (loc, context, type, silent_p); +} + +/* Implement TARGET_MANGLE_TYPE. */ + +static const char * +riscv_mangle_type (const_tree type) +{ + /* Mangle all vector type for vector extension. */ + /* The mangle name follows the rule of aarch64 + that is "u" + length of (abi_name) + abi_name. 
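+     For example the ABI type __rvv_int8mf8_t (15 characters) is mangled
+     as u15__rvv_int8mf8_t.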
*/ + if (TYPE_NAME (type) != NULL) + { + const char *res = riscv_vector::mangle_builtin_type (type); + if (res) + return res; + } + /* Use the default mangling. */ + return NULL; +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" @@ -5974,6 +6094,30 @@ riscv_asan_shadow_offset (void) #define TARGET_DEFAULT_TARGET_FLAGS (MASK_BIG_ENDIAN) #endif +#undef TARGET_VECTOR_MODE_SUPPORTED_P +#define TARGET_VECTOR_MODE_SUPPORTED_P riscv_vector_mode_supported_p + +#undef TARGET_VECTOR_ALIGNMENT +#define TARGET_VECTOR_ALIGNMENT riscv_vector_alignment + +#undef TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT +#define TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT riscv_vectorize_preferred_vector_alignment + +#undef TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE +#define TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE riscv_simd_vector_alignment_reachable + +#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT +#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT riscv_builtin_support_vector_misalignment + +#undef TARGET_COMPATIBLE_VECTOR_TYPES_P +#define TARGET_COMPATIBLE_VECTOR_TYPES_P riscv_compatible_vector_types_p + +#undef TARGET_VERIFY_TYPE_CONTEXT +#define TARGET_VERIFY_TYPE_CONTEXT riscv_verify_type_context + +#undef TARGET_MANGLE_TYPE +#define TARGET_MANGLE_TYPE riscv_mangle_type + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-riscv.h" From patchwork Tue May 31 08:49:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54547 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D82273955C9D for ; Tue, 31 May 2022 08:54:01 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgau1.qq.com (smtpbgau1.qq.com [54.206.16.166]) by sourceware.org (Postfix) with ESMTPS id C20B63857426 for ; Tue, 31 May 2022 08:50:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C20B63857426 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987028trpkf1jx Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:27 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: SIq+DGERJtxYukrb5BUeTPSO9/mb/l0Bh1yj2Nh2l/HM1kiyfB6xy91BK7xmw g8eFriwdTKz9YekGvV9J2m9ki5DGsZA+gxDWXSLMCiabvl3fpDytAPZUeBOiPuC459orAyG QCYmeKBlvOT0Tli4Ea++gwEOTaGNpNW7MRc1m7IwVomP/bpeOK1UXfy91rfxwFobpctrqip 6i9Y3Gqz0X6b7pxwTycO6BHZUUHN/Z4J32ZPEs8kpv09RvLvf93VT+EvblAH3C4rutw7wmf xhg5wk/EAG+3dzFIKEYVykcz4BDXSjMdmbPxjyLFAEDjSvQXckxOCgKDIuCWNvdGjXEeR2/ /gc2r5jnjpJO7GxKX4L9JOcvK6wirJ3Ag98RE9aJKw0b0ZlRxw= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 04/21] Add RVV intrinsic enable #pragma riscv intrinsic "vector" and introduce RVV header "riscv_vector.h" Date: Tue, 31 May 2022 16:49:55 +0800 Message-Id: <20220531085012.269719-5-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign9 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-9.9 
required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config.gcc: New header. * config/riscv/riscv-c.cc (riscv_pragma_intrinsic): New function. (riscv_check_builtin_call): New function. (riscv_register_pragmas): New function. * config/riscv/riscv-protos.h (riscv_register_pragmas): New function. * config/riscv/riscv.h (REGISTER_TARGET_PRAGMAS): New targethook. * config/riscv/riscv_vector.h: New file. --- gcc/config.gcc | 1 + gcc/config/riscv/riscv-c.cc | 65 +++++++++++++++++++++++++++++++++ gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv.h | 2 + gcc/config/riscv/riscv_vector.h | 41 +++++++++++++++++++++ 5 files changed, 110 insertions(+) create mode 100644 gcc/config/riscv/riscv_vector.h diff --git a/gcc/config.gcc b/gcc/config.gcc index bdda82ae576..042a7a17737 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -517,6 +517,7 @@ pru-*-*) ;; riscv*) cpu_type=riscv + extra_headers="riscv_vector.h" extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-vector.o riscv-vector-builtins-functions.o riscv-vector-builtins.o" d_target_objs="riscv-d.o" target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-builtins.cc \$(srcdir)/config/riscv/riscv-vector-builtins.cc" diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc index eb7ef09297e..5839e849092 100644 --- a/gcc/config/riscv/riscv-c.cc +++ b/gcc/config/riscv/riscv-c.cc @@ -25,9 +25,17 @@ along with GCC; see the file COPYING3. If not see #include "system.h" #include "coretypes.h" #include "tm.h" +#include "input.h" +#include "memmodel.h" +#include "tm_p.h" +#include "flags.h" #include "c-family/c-common.h" #include "cpplib.h" +#include "c-family/c-pragma.h" +#include "langhooks.h" +#include "target.h" #include "riscv-subset.h" +#include "riscv-vector-builtins.h" #define builtin_define(TXT) cpp_define (pfile, TXT) @@ -155,3 +163,60 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile) builtin_define_with_int_value (buf, version_value); } } + +/* Implement "#pragma riscv intrinsic". */ +static void +riscv_pragma_intrinsic (cpp_reader *) +{ + tree x; + + if (pragma_lex (&x) != CPP_STRING) + { + error ("%<#pragma riscv intrinsic%> requires a string parameter"); + return; + } + + const char *name = TREE_STRING_POINTER (x); + + if (strcmp (name, "vector") == 0) + { + if (!TARGET_VECTOR) + error ("%<#pragma riscv intrinsic%> option %qs needs 'V' extension enabled", name); + + riscv_vector::handle_pragma_vector (); + } + else + error ("unknown %<#pragma riscv intrinsic%> option %qs", name); +} + +/* Implement TARGET_CHECK_BUILTIN_CALL. 
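+   The builtin class is taken from DECL_MD_FUNCTION_CODE: general builtins
+   need no extra checking, while vector builtins are forwarded to
+   riscv_vector::check_builtin_call with their subcode.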
*/ + +static bool +riscv_check_builtin_call (location_t loc, vec<location_t> arg_loc, + tree fndecl, tree orig_fndecl, + unsigned int nargs, tree *args) +{ + unsigned int code = DECL_MD_FUNCTION_CODE (fndecl); + unsigned int subcode = code >> RISCV_BUILTIN_SHIFT; + + switch (code & RISCV_BUILTIN_CLASS) + { + case RISCV_BUILTIN_GENERAL: + return true; + + case RISCV_BUILTIN_VECTOR: + return riscv_vector::check_builtin_call (loc, arg_loc, subcode, + orig_fndecl, nargs, args); + } + + gcc_unreachable (); +} + +/* Implement REGISTER_TARGET_PRAGMAS. */ + +void +riscv_register_pragmas (void) +{ + targetm.check_builtin_call = riscv_check_builtin_call; + c_register_pragma ("riscv", "intrinsic", riscv_pragma_intrinsic); +} \ No newline at end of file
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 1cb3586d1f1..4a4ac645f55 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -116,6 +116,7 @@ extern bool rvv_legitimate_poly_int_p (rtx); extern unsigned int rvv_offset_temporaries (bool, poly_int64); extern enum vlmul_field_enum riscv_classify_vlmul_field (machine_mode); extern int rvv_regsize (machine_mode); +extern void riscv_register_pragmas (void); /* We classify builtin types into two classes: 1. General builtin class which is using the
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index 8f56a5a4746..cb4cfc0f73e 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -1067,4 +1067,6 @@ extern void riscv_remove_unneeded_save_restore_calls (void); #define TARGET_SUPPORTS_WIDE_INT 1 +#define REGISTER_TARGET_PRAGMAS() riscv_register_pragmas () + #endif /* ! GCC_RISCV_H */
diff --git a/gcc/config/riscv/riscv_vector.h b/gcc/config/riscv/riscv_vector.h new file mode 100644 index 00000000000..ef1820a07cb --- /dev/null +++ b/gcc/config/riscv/riscv_vector.h @@ -0,0 +1,41 @@ +/* Header of intrinsics for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2021-2021 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai). + Based on MIPS target for GNU compiler. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#ifndef __RISCV_VECTOR_H +#define __RISCV_VECTOR_H + +#include +#include + +typedef float float32_t; +typedef double float64_t; + +typedef float __float32_t; +typedef double __float64_t; + +/* NOTE: This implementation of riscv_vector.h is intentionally short. It does + not define the RVV types and intrinsic functions directly in C and C++ + code, but instead uses the following pragma to tell GCC to insert the + necessary type and function definitions itself. The net effect is the + same, and the file is a complete implementation of riscv_vector.h.
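+
+   As a small usage sketch (assuming the 'V' extension is enabled and using
+   the vsetvl intrinsics added later in this series), a consumer of this
+   header is expected to be able to write, e.g.:
+
+     size_t vl = vsetvl_e8m1 (n);
+
+   and have the declaration provided by the pragma below rather than by
+   explicit C prototypes in this file.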
*/ +#pragma riscv intrinsic "vector" + +#endif // __RISCV_VECTOR_H \ No newline at end of file From patchwork Tue May 31 08:49:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54548 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C534839540AC for ; Tue, 31 May 2022 08:54:40 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg152.qq.com (smtpbg152.qq.com [13.245.186.79]) by sourceware.org (Postfix) with ESMTPS id CC388383D830 for ; Tue, 31 May 2022 08:50:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org CC388383D830 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987031taccdi10 Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:30 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: ZHWZeLXy+8ceeCnIz/jZsDX08nUxDyM39UmC8ae8WnJ4gcMbFBQMg3qBpDEfG hxw71Lxv+UgR/CkJNGapEBArz4sRDfCQpRmdnMSOLKq+Hezf8lbBhJ3bpJaLoAtuZGUF1dI 6mfvlTNvU3pO7oppojxoU7A+gaoZkEEmQpNXrLJhFtMmiu5CNo76Fz1un2s1TY4GVUSNvG6 DA7L+h3Q7zLyEQptTfhuKN51c8/S/ybtVCAiL9Qb1oSKVxVQi9/7eE4SMcYJYEhKnSTypa9 WNNm2p9z8V9HsQW8iaES+UzX2zVruvcEkdYYNbZKAKL1jLSwESoWKK8yVkp4N+LMdmOT+YJ 2ybsxWKXQ+aRNFQtWUH8cEZKpQ+zMkl91bQCrl0P2e2w/gcu+k= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 05/21] Add RVV configuration intrinsic Date: Tue, 31 May 2022 16:49:56 +0800 Message-Id: <20220531085012.269719-6-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign9 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-8.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_10_SHORT_WORD_LINES, SCC_20_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, WEIRD_QUOTING autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_register_pragmas): New function. (riscv_classify_vlmul_field): New enum. (rvv_classify_vlmul_field): New enum. (rvv_parse_vsew_field): New enum. (rvv_parse_vlmul_field): New enum. (rvv_parse_vta_field): New enum. (rvv_parse_vma_field): New enum. * config/riscv/riscv-vector-builtins-functions.cc (get_vtype_for_mode): New function. (mode2data_type_str): New function. (config::call_properties): New function. (config::assemble_name): New function. (config::get_return_type): New function. (vsetvl::get_argument_types): New function. (vsetvl::expand): New function. (vsetvlmax::expand): New function. 
* config/riscv/riscv-vector-builtins-functions.def: (vsetvl): New macro define. (vsetvlmax): New macro define. * config/riscv/riscv-vector-builtins-functions.h (class config): New class. (class vsetvl): New class. (class vsetvlmax): New class. * config/riscv/riscv-vector-builtins-iterators.def (VI): New iterator. * config/riscv/riscv-vector.cc (rvv_parse_vsew_field): New function. (rvv_parse_vlmul_field): New function. (rvv_parse_vta_field): New function. (rvv_parse_vma_field): New function. * config/riscv/riscv-vector.h (rvv_parse_vsew_field): New function. (rvv_parse_vlmul_field): New function. (rvv_parse_vta_field): New function. (rvv_parse_vma_field): New function. * config/riscv/riscv.md: Add X0_REGNUM constant. * config/riscv/vector-iterators.md (unspec): New unspec. * config/riscv/vector.md: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/intrinsic/rvv-intrinsic.exp: New test. * gcc.target/riscv/rvv/intrinsic/vsetvl.c: New test. --- gcc/config/riscv/riscv-protos.h | 8 +- .../riscv/riscv-vector-builtins-functions.cc | 232 ++++++ .../riscv/riscv-vector-builtins-functions.def | 6 + .../riscv/riscv-vector-builtins-functions.h | 36 + .../riscv/riscv-vector-builtins-iterators.def | 23 + gcc/config/riscv/riscv-vector.cc | 28 + gcc/config/riscv/riscv-vector.h | 4 + gcc/config/riscv/riscv.md | 6 +- gcc/config/riscv/vector-iterators.md | 14 +- gcc/config/riscv/vector.md | 140 ++++ .../riscv/rvv/intrinsic/rvv-intrinsic.exp | 47 ++ .../gcc.target/riscv/rvv/intrinsic/vsetvl.c | 733 ++++++++++++++++++ 12 files changed, 1273 insertions(+), 4 deletions(-) create mode 100644 gcc/config/riscv/vector.md create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/intrinsic/rvv-intrinsic.exp create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/intrinsic/vsetvl.c diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 4a4ac645f55..cae2974b54f 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -77,6 +77,7 @@ extern bool riscv_gpr_save_operation_p (rtx); /* Routines implemented in riscv-c.cc. */ void riscv_cpu_cpp_builtins (cpp_reader *); +void riscv_register_pragmas (void); /* Routines implemented in riscv-d.cc */ extern void riscv_d_target_versions (void); @@ -114,9 +115,12 @@ extern const riscv_cpu_info *riscv_find_cpu (const char *); extern bool rvv_mode_p (machine_mode); extern bool rvv_legitimate_poly_int_p (rtx); extern unsigned int rvv_offset_temporaries (bool, poly_int64); -extern enum vlmul_field_enum riscv_classify_vlmul_field (machine_mode); +extern enum vlmul_field_enum rvv_classify_vlmul_field (machine_mode); +extern unsigned int rvv_parse_vsew_field (unsigned int); +extern unsigned int rvv_parse_vlmul_field (unsigned int); +extern bool rvv_parse_vta_field (unsigned int); +extern bool rvv_parse_vma_field (unsigned int); extern int rvv_regsize (machine_mode); -extern void riscv_register_pragmas (void); /* We classify builtin types into two classes: 1. General builtin class which is using the diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.cc b/gcc/config/riscv/riscv-vector-builtins-functions.cc index 19bcb66a83f..0acda8f671e 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.cc +++ b/gcc/config/riscv/riscv-vector-builtins-functions.cc @@ -49,6 +49,164 @@ static const unsigned int CP_WRITE_CSR = 1U << 6; when the required extension is disabled. */ static bool reported_missing_extension_p; +/* Generate vtype bitmap for a specific machine mode. 
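+   The returned value follows the vtype layout decoded by the
+   rvv_parse_*_field helpers in riscv-vector.cc: vlmul in bits [2:0], vsew
+   in bits [5:3], vta in bit 6 and vma in bit 7.  For example, VNx2QImode
+   maps to 0x45, i.e. SEW=8 with LMUL=1/8, tail agnostic, mask undisturbed,
+   which matches the "e8,mf8,ta,mu" operands checked by the vsetvl tests.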
*/ +static unsigned int +get_vtype_for_mode (machine_mode mode) +{ + switch (mode) + { + case VNx2QImode: + case VNx2BImode: + return 0x45; + + case VNx4QImode: + case VNx4BImode: + return 0x46; + + case VNx8QImode: + case VNx8BImode: + return 0x47; + + case VNx16QImode: + case VNx16BImode: + return 0x40; + + case VNx32QImode: + case VNx32BImode: + return 0x41; + + case VNx64QImode: + case VNx64BImode: + return 0x42; + + case VNx128QImode: + case VNx128BImode: + return 0x43; + + case VNx2HImode: + return 0x4e; + + case VNx4HImode: + return 0x4f; + + case VNx8HImode: + return 0x48; + + case VNx16HImode: + return 0x49; + + case VNx32HImode: + return 0x4a; + + case VNx64HImode: + return 0x4b; + + case VNx2SImode: + case VNx2SFmode: + return 0x57; + + case VNx4SImode: + case VNx4SFmode: + return 0x50; + + case VNx8SImode: + case VNx8SFmode: + return 0x51; + + case VNx16SImode: + case VNx16SFmode: + return 0x52; + + case VNx32SImode: + case VNx32SFmode: + return 0x53; + + case VNx2DImode: + case VNx2DFmode: + return 0x58; + + case VNx4DImode: + case VNx4DFmode: + return 0x59; + + case VNx8DImode: + case VNx8DFmode: + return 0x5a; + + case VNx16DImode: + return 0x5b; + + default: + break; + } + + gcc_unreachable (); +} + +static const char * +mode2data_type_str (machine_mode mode, bool u, bool ie) +{ + switch (mode) + { + #define MODE2DATA_TYPE_STR(MODE, NBITS, LMUL) \ + case VNx##MODE##mode: \ + return ie ? "_e"#NBITS""#LMUL"" : u ? "_u"#NBITS""#LMUL"" : "_i"#NBITS""#LMUL""; + MODE2DATA_TYPE_STR (2QI, 8, mf8) + MODE2DATA_TYPE_STR (4QI, 8, mf4) + MODE2DATA_TYPE_STR (8QI, 8, mf2) + MODE2DATA_TYPE_STR (16QI, 8, m1) + MODE2DATA_TYPE_STR (32QI, 8, m2) + MODE2DATA_TYPE_STR (64QI, 8, m4) + MODE2DATA_TYPE_STR (128QI, 8, m8) + MODE2DATA_TYPE_STR (2HI, 16, mf4) + MODE2DATA_TYPE_STR (4HI, 16, mf2) + MODE2DATA_TYPE_STR (8HI, 16, m1) + MODE2DATA_TYPE_STR (16HI, 16, m2) + MODE2DATA_TYPE_STR (32HI, 16, m4) + MODE2DATA_TYPE_STR (64HI, 16, m8) + MODE2DATA_TYPE_STR (2SI, 32, mf2) + MODE2DATA_TYPE_STR (4SI, 32, m1) + MODE2DATA_TYPE_STR (8SI, 32, m2) + MODE2DATA_TYPE_STR (16SI, 32, m4) + MODE2DATA_TYPE_STR (32SI, 32, m8) + MODE2DATA_TYPE_STR (2DI, 64, m1) + MODE2DATA_TYPE_STR (4DI, 64, m2) + MODE2DATA_TYPE_STR (8DI, 64, m4) + MODE2DATA_TYPE_STR (16DI, 64, m8) + #undef MODE2DATA_TYPE_STR + #define MODE2DATA_TYPE_STR(MODE, NBITS, LMUL) \ + case VNx##MODE##mode: \ + return "_f"#NBITS""#LMUL""; + MODE2DATA_TYPE_STR (2SF, 32, mf2) + MODE2DATA_TYPE_STR (4SF, 32, m1) + MODE2DATA_TYPE_STR (8SF, 32, m2) + MODE2DATA_TYPE_STR (16SF, 32, m4) + MODE2DATA_TYPE_STR (32SF, 32, m8) + MODE2DATA_TYPE_STR (2DF, 64, m1) + MODE2DATA_TYPE_STR (4DF, 64, m2) + MODE2DATA_TYPE_STR (8DF, 64, m4) + MODE2DATA_TYPE_STR (16DF, 64, m8) + #undef MODE2DATA_TYPE_STR + case VNx2BImode: return "_b64"; + case VNx4BImode: return "_b32"; + case VNx8BImode: return "_b16"; + case VNx16BImode: return "_b8"; + case VNx32BImode: return "_b4"; + case VNx64BImode: return "_b2"; + case VNx128BImode: return "_b1"; + case QImode: return u ? "_u8" : "_i8"; + case HImode: return u ? "_u16" : "_i16"; + case SImode: return u ? "_u32" : "_i32"; + case DImode: return u ? 
"_u64" : "_i64"; + case SFmode: return "_f32"; + case DFmode: return "_f64"; + default: + break; + } + + gcc_unreachable (); +} + static tree mode2mask_t (machine_mode mode) { @@ -990,6 +1148,80 @@ function_builder::register_function () } } +/* A function implementation for config builder */ +unsigned int +config::call_properties () const +{ + return CP_READ_CSR | CP_WRITE_CSR; +} + +char * +config::assemble_name (function_instance &instance) +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + const char *name = instance.get_base_name (); + const char *dt = mode2data_type_str (mode, false, true); + snprintf (instance.function_name, NAME_MAXLEN, "%s%s", name, dt); + return nullptr; +} + +tree +config::get_return_type (const function_instance &) const +{ + return size_type_node; +} + +/* A function implementation for vsetvl builder */ +void +vsetvl::get_argument_types (const function_instance &, + vec &argument_types) const +{ + argument_types.quick_push (size_type_node); +} + +rtx +vsetvl::expand (const function_instance &instance, tree exp, rtx target) const +{ + struct expand_operand ops[MAX_RECOG_OPERANDS]; + tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); + + /* Map any target to operand 0. */ + int opno = 0; + create_output_operand (&ops[opno++], target, Pmode); + unsigned int vtype = + get_vtype_for_mode (instance.get_arg_pattern ().arg_list[0]); + enum insn_code icode = code_for_vsetvl (Pmode); + add_input_operand (&ops[opno++], exp, 0); + /* create vtype input operand. */ + create_input_operand (&ops[opno++], GEN_INT (vtype), Pmode); + /* Map the arguments to the other operands. */ + gcc_assert (opno == insn_data[icode].n_generator_args); + return generate_builtin_insn (icode, opno, ops, + !function_returns_void_p (fndecl)); +} + +/* A function implementation for vsetvlmax builder */ +rtx +vsetvlmax::expand (const function_instance &instance, tree exp, rtx target) const +{ + struct expand_operand ops[MAX_RECOG_OPERANDS]; + tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0); + + /* Map any target to operand 0. */ + int opno = 0; + create_output_operand (&ops[opno++], target, Pmode); + unsigned int vtype = + get_vtype_for_mode (instance.get_arg_pattern ().arg_list[0]); + enum insn_code icode = code_for_vsetvl (Pmode); + create_input_operand (&ops[opno++], gen_rtx_REG (Pmode, X0_REGNUM), Pmode); + /* create vtype input operand. */ + create_input_operand (&ops[opno++], GEN_INT (vtype), Pmode); + /* Map the arguments to the other operands. */ + gcc_assert (opno == insn_data[icode].n_generator_args); + return generate_builtin_insn (icode, opno, ops, + !function_returns_void_p (fndecl)); +} + } // end namespace riscv_vector using namespace riscv_vector; diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def index f6161012813..666e8503d81 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -29,6 +29,12 @@ along with GCC; see the file COPYING3. If not see #define VATTR(A, B) #endif +/* base_name, class_name, arg_pattern, intrinsic_type, pred_type, operation_type */ +#define REQUIRED_EXTENSIONS (RISCV_TARGET_VECTOR) +/* 6. Configuration-Setting Instructions. 
*/ +DEF_RVV_FUNCTION(vsetvl, vsetvl, (1, VITER(VI, signed)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vsetvlmax, vsetvlmax, (1, VITER(VI, signed)), PAT_none, PRED_none, OP_none) +#undef REQUIRED_EXTENSIONS #undef DEF_RVV_FUNCTION #undef VITER #undef VATTR \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.h b/gcc/config/riscv/riscv-vector-builtins-functions.h index 1b769743857..9846ded1155 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.h +++ b/gcc/config/riscv/riscv-vector-builtins-functions.h @@ -486,6 +486,42 @@ struct registered_function_hasher : nofree_ptr_hash static bool equal (value_type, const compare_type &); }; +/* A function_base for config functions. */ +class config : public function_builder +{ +public: + // use the same construction function as the function_builder + using function_builder::function_builder; + + virtual unsigned int call_properties () const override; + + virtual char * assemble_name (function_instance &) override; + + virtual tree get_return_type (const function_instance &) const override; +}; + +/* A function_base for vsetvl functions. */ +class vsetvl : public config +{ +public: + // use the same construction function as the config + using config::config; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for config functions. */ +class vsetvlmax : public config +{ +public: + // use the same construction function as the config + using config::config; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + } // namespace riscv_vector #endif // end GCC_RISCV_VECTOR_BUILTINS_FUNCTIONS_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins-iterators.def b/gcc/config/riscv/riscv-vector-builtins-iterators.def index ac0c37e12d4..cc968f5534f 100644 --- a/gcc/config/riscv/riscv-vector-builtins-iterators.def +++ b/gcc/config/riscv/riscv-vector-builtins-iterators.def @@ -7,6 +7,29 @@ #define DEF_RISCV_ARG_MODE_ATTR(A, B, C, D, E) #endif +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VI, 22) +DEF_RISCV_ARG_MODE_ATTR(VI, 0, VNx2QI, VNx2QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 1, VNx4QI, VNx4QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 2, VNx8QI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 3, VNx16QI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 4, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 5, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 6, VNx128QI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 7, VNx2HI, VNx2HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 8, VNx4HI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 9, VNx8HI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 10, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 11, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 12, VNx64HI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 13, VNx2SI, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 14, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 15, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 16, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 17, VNx32SI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 18, VNx2DI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 19, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 20, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VI, 21, VNx16DI, VNx16DI, TARGET_ANY) #undef 
DEF_RISCV_ARG_MODE_ATTR_VARIABLE #undef DEF_RISCV_ARG_MODE_ATTR diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index 87dc6739f4f..a9c8b290104 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -205,6 +205,34 @@ rvv_classify_vlmul_field (machine_mode mode) gcc_unreachable (); } +/* Return the vsew field for a vtype bitmap. */ +unsigned int +rvv_parse_vsew_field (unsigned int vtype) +{ + return (vtype >> 3) & 0x7; +} + +/* Return the vlmul field for a vtype bitmap. */ +unsigned int +rvv_parse_vlmul_field (unsigned int vtype) +{ + return vtype & 0x7; +} + +/* Return the vta field for a vtype bitmap. */ +bool +rvv_parse_vta_field (unsigned int vtype) +{ + return (vtype & 0x40) != 0; +} + +/* Return the vma field for a vtype bitmap. */ +bool +rvv_parse_vma_field (unsigned int vtype) +{ + return (vtype & 0x80) != 0; +} + /* Return vlmul register size for a machine mode. */ int diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h index 62df507e7df..2c242959077 100644 --- a/gcc/config/riscv/riscv-vector.h +++ b/gcc/config/riscv/riscv-vector.h @@ -24,6 +24,10 @@ bool riscv_vector_mode_p (machine_mode); bool rvv_legitimate_poly_int_p (rtx); unsigned int rvv_offset_temporaries (bool, poly_int64); vlmul_field_enum rvv_classify_vlmul_field (machine_mode); +extern unsigned int rvv_parse_vsew_field (unsigned int); +extern unsigned int rvv_parse_vlmul_field (unsigned int); +extern bool rvv_parse_vta_field (unsigned int); +extern bool rvv_parse_vma_field (unsigned int); int rvv_regsize (machine_mode); opt_machine_mode rvv_get_mask_mode (machine_mode); #endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 8e880ba8599..2b0b76458a7 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -106,6 +106,7 @@ ;; Constant helper for RVV (VL_REGNUM 66) (VTYPE_REGNUM 67) + (X0_REGNUM 0) ]) (include "predicates.md") @@ -182,10 +183,12 @@ ;; nop no operation ;; ghost an instruction that produces no real code ;; bitmanip bit manipulation instructions +;; vsetvl vector configuration setting (define_attr "type" "unknown,branch,jump,call,load,fpload,store,fpstore, mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul, - fmadd,fdiv,fcmp,fcvt,fsqrt,multi,auipc,sfb_alu,nop,ghost,bitmanip,rotate" + fmadd,fdiv,fcmp,fcvt,fsqrt,multi,auipc,sfb_alu,nop,ghost,bitmanip,rotate, + vsetvl" (cond [(eq_attr "got" "load") (const_string "load") ;; If a doubleword move uses these expensive instructions, @@ -2952,3 +2955,4 @@ (include "pic.md") (include "generic.md") (include "sifive-7.md") +(include "vector.md") \ No newline at end of file diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 450e20c44ce..3e0699de86c 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -16,4 +16,16 @@ ;; ;; You should have received a copy of the GNU General Public License ;; along with GCC; see the file COPYING3. If not see -;; . \ No newline at end of file +;; . + +(define_c_enum "unspec" [ + ;; vsetvli. + UNSPEC_VSETVLI +]) + +;; All integer vector modes supported for RVV. 
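+;; These VNx modes are scalable: the number of elements scales with the
+;; runtime vector length.  Following mode2data_type_str in
+;; riscv-vector-builtins-functions.cc, VNx2QI through VNx128QI cover
+;; LMUL=1/8 up to LMUL=8 for 8-bit elements, and likewise for the wider
+;; element widths.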
+(define_mode_iterator VI [ + VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI VNx64QI VNx128QI + VNx2HI VNx4HI VNx8HI VNx16HI VNx32HI VNx64HI + VNx2SI VNx4SI VNx8SI VNx16SI VNx32SI + VNx2DI VNx4DI VNx8DI VNx16DI]) \ No newline at end of file
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md new file mode 100644 index 00000000000..31fdec981b9 --- /dev/null +++ b/gcc/config/riscv/vector.md @@ -0,0 +1,140 @@ +;; Machine description for RISC-V vector operations. +;; Copyright (C) 2022-2022 Free Software Foundation, Inc. +;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. + +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +;; This file describes the RISC-V instructions from the standard 'V' Vector +;; extension, version 1.0. +;; +;; This file includes: +;; +;; - RVV intrinsic implementation (Document:https://github.com/riscv/rvv-intrinsic-doc)
+(include "vector-iterators.md") +;; =============================================================================== +;; == Intrinsics +;; =============================================================================== + +;; ------------------------------------------------------------------------------- +;; ---- 6. Configuration-Setting Instructions +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 6.1 vsetvli/vsetivli/vsetvl instructions +;; ------------------------------------------------------------------------------- + +;; We don't define vsetvli as unspec_volatile, +;; because we want this instruction to remain schedulable +;; in the future. +;; This means these instructions will be deleted when +;; there are no subsequent instructions using vl or vtype. +;; rd | rs1 | AVL value | Effect on vl +;; - | !x0 | x[rs1] | Normal stripmining +;; !x0 | x0 | ~0 | Set vl to VLMAX
+(define_insn "@vsetvl_<mode>" + [(parallel + [(set (match_operand:X 0 "register_operand" "=r") + (unspec:X + [(match_operand:X 1 "csr_operand" "rK")] UNSPEC_VSETVLI)) + (set (reg:SI VL_REGNUM) + (unspec:SI + [(match_dup 1)] UNSPEC_VSETVLI)) + (set (reg:SI VTYPE_REGNUM) + (unspec:SI + [(match_operand 2 "const_int_operand")] UNSPEC_VSETVLI))])] + "TARGET_VECTOR" + { + char buf[64]; + gcc_assert (CONST_INT_P (operands[2])); + const char *insn = satisfies_constraint_K (operands[1]) ? "vsetivli\t%0,%1" + : "vsetvli\t%0,%1"; + unsigned int vsew = rvv_parse_vsew_field (INTVAL (operands[2])); + unsigned int vlmul = rvv_parse_vlmul_field (INTVAL (operands[2])); + unsigned int vta = rvv_parse_vta_field (INTVAL (operands[2])); + unsigned int vma = rvv_parse_vma_field (INTVAL (operands[2])); + const char *sew = vsew == 0 ? "e8" : vsew == 1 ? "e16" + : vsew == 2 ? "e32" : "e64"; + const char *lmul = vlmul == 0 ? "m1" : vlmul == 1 ? "m2" + : vlmul == 2 ? "m4" : vlmul == 3 ? "m8" + : vlmul == 5 ? "mf8" : vlmul == 6 ? "mf4" : "mf2"; + const char *ta = vta == 0 ? "tu" : "ta"; + const char *ma = vma == 0 ? 
"mu" : "ma"; + snprintf (buf, sizeof (buf), "%s,%s,%s,%s,%s", insn, sew, lmul, ta, ma); + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "vsetvl") + (set_attr "mode" "none")]) + +(define_insn "vsetvl_zero_zero" + [(set (reg:SI VTYPE_REGNUM) + (unspec:SI + [(match_operand 0 "const_int_operand")] UNSPEC_VSETVLI))] + "TARGET_VECTOR" + { + char buf[64]; + gcc_assert (CONST_INT_P (operands[0])); + const char *insn = "vsetvli\tzero,zero"; + unsigned int vsew = rvv_parse_vsew_field (INTVAL (operands[0])); + unsigned int vlmul = rvv_parse_vlmul_field (INTVAL (operands[0])); + unsigned int vta = rvv_parse_vta_field (INTVAL (operands[0])); + unsigned int vma = rvv_parse_vma_field (INTVAL (operands[0])); + const char *sew = vsew == 0 ? "e8" : vsew == 1 ? "e16" + : vsew == 2 ? "e32" : "e64"; + const char *lmul = vlmul == 0 ? "m1" : vlmul == 1 ? "m2" + : vlmul == 2 ? "m4" : vlmul == 3 ? "m8" + : vlmul == 5 ? "mf8" : vlmul == 6 ? "mf4" : "mf2"; + const char *ta = vta == 0 ? "tu" : "ta"; + const char *ma = vma == 0 ? "mu" : "ma"; + snprintf (buf, sizeof (buf), "%s,%s,%s,%s,%s", insn, sew, lmul, ta, ma); + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "vsetvl") + (set_attr "mode" "none")]) + +(define_insn "@vsetvl_zero_" + [(parallel + [(set (reg:SI VL_REGNUM) + (unspec:SI + [(match_operand:X 0 "csr_operand" "rK")] UNSPEC_VSETVLI)) + (set (reg:SI VTYPE_REGNUM) + (unspec:SI + [(match_operand 1 "const_int_operand")] UNSPEC_VSETVLI))])] + "TARGET_VECTOR" + { + char buf[64]; + gcc_assert (CONST_INT_P (operands[1])); + const char *insn = satisfies_constraint_K (operands[0]) ? "vsetivli\tzero,%0" + : "vsetvli\tzero,%0"; + unsigned int vsew = rvv_parse_vsew_field (INTVAL (operands[1])); + unsigned int vlmul = rvv_parse_vlmul_field (INTVAL (operands[1])); + unsigned int vta = rvv_parse_vta_field (INTVAL (operands[1])); + unsigned int vma = rvv_parse_vma_field (INTVAL (operands[1])); + const char *sew = vsew == 0 ? "e8" : vsew == 1 ? "e16" + : vsew == 2 ? "e32" : "e64"; + const char *lmul = vlmul == 0 ? "m1" : vlmul == 1 ? "m2" + : vlmul == 2 ? "m4" : vlmul == 3 ? "m8" + : vlmul == 5 ? "mf8" : vlmul == 6 ? "mf4" : "mf2"; + const char *ta = vta == 0 ? "tu" : "ta"; + const char *ma = vma == 0 ? "mu" : "ma"; + snprintf (buf, sizeof (buf), "%s,%s,%s,%s,%s", insn, sew, lmul, ta, ma); + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "vsetvl") + (set_attr "mode" "none")]) \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/rvv-intrinsic.exp b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/rvv-intrinsic.exp new file mode 100644 index 00000000000..b6c66e88759 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/rvv-intrinsic.exp @@ -0,0 +1,47 @@ +# Copyright (C) 2022-2022 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +# GCC testsuite that uses the `dg.exp' driver. 
+ +# Exit immediately if this isn't a RISC-V target. +if ![istarget riscv*-*-*] then { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +# If a testcase doesn't have special options, use these. +global DEFAULT_CFLAGS +if ![info exists DEFAULT_CFLAGS] then { + set DEFAULT_CFLAGS " -ansi -pedantic-errors" +} + +set gcc_march "rv64gcv_zfh" +if [istarget riscv32-*-*] then { + set gcc_march "rv32gcv_zfh" +} + +# Initialize `dg'. +dg-init + +# Main loop. +set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -O3" +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \ + "" $CFLAGS + +# All done. +dg-finish diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/vsetvl.c b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/vsetvl.c new file mode 100644 index 00000000000..264537de402 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/vsetvl.c @@ -0,0 +1,733 @@ + +/* { dg-do compile } */ +/* { dg-skip-if "test vector intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */ + +#include +#include + + +size_t test_vsetvl_e8mf8_imm0() +{ + size_t vl = vsetvl_e8mf8(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*mf8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf8_imm31() +{ + size_t vl = vsetvl_e8mf8(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*mf8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf8_imm32() +{ + size_t vl = vsetvl_e8mf8(32); + return vl; +} +size_t test_vsetvl_e8mf8(size_t avl) +{ + size_t vl = vsetvl_e8mf8(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*mf8,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8mf8() +{ + size_t vl = vsetvlmax_e8mf8(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*mf8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf4_imm0() +{ + size_t vl = vsetvl_e8mf4(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*mf4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf4_imm31() +{ + size_t vl = vsetvl_e8mf4(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*mf4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf4_imm32() +{ + size_t vl = vsetvl_e8mf4(32); + return vl; +} +size_t test_vsetvl_e8mf4(size_t avl) +{ + size_t vl = vsetvl_e8mf4(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*mf4,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8mf4() +{ + size_t vl = vsetvlmax_e8mf4(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*mf4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf2_imm0() +{ + size_t vl = vsetvl_e8mf2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf2_imm31() +{ + size_t vl = vsetvl_e8mf2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8mf2_imm32() +{ + size_t vl = 
vsetvl_e8mf2(32); + return vl; +} +size_t test_vsetvl_e8mf2(size_t avl) +{ + size_t vl = vsetvl_e8mf2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*mf2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8mf2() +{ + size_t vl = vsetvlmax_e8mf2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m1_imm0() +{ + size_t vl = vsetvl_e8m1(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m1_imm31() +{ + size_t vl = vsetvl_e8m1(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m1_imm32() +{ + size_t vl = vsetvl_e8m1(32); + return vl; +} +size_t test_vsetvl_e8m1(size_t avl) +{ + size_t vl = vsetvl_e8m1(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*m1,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8m1() +{ + size_t vl = vsetvlmax_e8m1(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m2_imm0() +{ + size_t vl = vsetvl_e8m2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m2_imm31() +{ + size_t vl = vsetvl_e8m2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m2_imm32() +{ + size_t vl = vsetvl_e8m2(32); + return vl; +} +size_t test_vsetvl_e8m2(size_t avl) +{ + size_t vl = vsetvl_e8m2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*m2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8m2() +{ + size_t vl = vsetvlmax_e8m2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m4_imm0() +{ + size_t vl = vsetvl_e8m4(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m4_imm31() +{ + size_t vl = vsetvl_e8m4(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m4_imm32() +{ + size_t vl = vsetvl_e8m4(32); + return vl; +} +size_t test_vsetvl_e8m4(size_t avl) +{ + size_t vl = vsetvl_e8m4(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*m4,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8m4() +{ + size_t vl = vsetvlmax_e8m4(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m8_imm0() 
+{ + size_t vl = vsetvl_e8m8(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e8,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m8_imm31() +{ + size_t vl = vsetvl_e8m8(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e8,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e8m8_imm32() +{ + size_t vl = vsetvl_e8m8(32); + return vl; +} +size_t test_vsetvl_e8m8(size_t avl) +{ + size_t vl = vsetvl_e8m8(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e8,\s*m8,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e8m8() +{ + size_t vl = vsetvlmax_e8m8(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e8,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16mf4_imm0() +{ + size_t vl = vsetvl_e16mf4(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e16,\s*mf4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16mf4_imm31() +{ + size_t vl = vsetvl_e16mf4(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e16,\s*mf4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16mf4_imm32() +{ + size_t vl = vsetvl_e16mf4(32); + return vl; +} +size_t test_vsetvl_e16mf4(size_t avl) +{ + size_t vl = vsetvl_e16mf4(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e16,\s*mf4,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e16mf4() +{ + size_t vl = vsetvlmax_e16mf4(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*mf4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16mf2_imm0() +{ + size_t vl = vsetvl_e16mf2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e16,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16mf2_imm31() +{ + size_t vl = vsetvl_e16mf2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e16,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16mf2_imm32() +{ + size_t vl = vsetvl_e16mf2(32); + return vl; +} +size_t test_vsetvl_e16mf2(size_t avl) +{ + size_t vl = vsetvl_e16mf2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e16,\s*mf2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e16mf2() +{ + size_t vl = vsetvlmax_e16mf2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m1_imm0() +{ + size_t vl = vsetvl_e16m1(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e16,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m1_imm31() +{ + size_t vl = vsetvl_e16m1(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e16,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m1_imm32() +{ + size_t vl = vsetvl_e16m1(32); + return vl; +} 
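+/* Note: an AVL of 32 no longer fits the 5-bit immediate of vsetivli, so the
+   _imm32 tests are expected to use the register form of vsetvli and are
+   counted together with the register-AVL tests by the scans below.  */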
+size_t test_vsetvl_e16m1(size_t avl) +{ + size_t vl = vsetvl_e16m1(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e16,\s*m1,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e16m1() +{ + size_t vl = vsetvlmax_e16m1(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m2_imm0() +{ + size_t vl = vsetvl_e16m2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e16,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m2_imm31() +{ + size_t vl = vsetvl_e16m2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e16,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m2_imm32() +{ + size_t vl = vsetvl_e16m2(32); + return vl; +} +size_t test_vsetvl_e16m2(size_t avl) +{ + size_t vl = vsetvl_e16m2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e16,\s*m2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e16m2() +{ + size_t vl = vsetvlmax_e16m2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m4_imm0() +{ + size_t vl = vsetvl_e16m4(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e16,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m4_imm31() +{ + size_t vl = vsetvl_e16m4(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e16,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m4_imm32() +{ + size_t vl = vsetvl_e16m4(32); + return vl; +} +size_t test_vsetvl_e16m4(size_t avl) +{ + size_t vl = vsetvl_e16m4(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e16,\s*m4,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e16m4() +{ + size_t vl = vsetvlmax_e16m4(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m8_imm0() +{ + size_t vl = vsetvl_e16m8(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e16,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m8_imm31() +{ + size_t vl = vsetvl_e16m8(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e16,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e16m8_imm32() +{ + size_t vl = vsetvl_e16m8(32); + return vl; +} +size_t test_vsetvl_e16m8(size_t avl) +{ + size_t vl = vsetvl_e16m8(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e16,\s*m8,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e16m8() +{ + size_t vl = vsetvlmax_e16m8(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e16,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t 
test_vsetvl_e32mf2_imm0() +{ + size_t vl = vsetvl_e32mf2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e32,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32mf2_imm31() +{ + size_t vl = vsetvl_e32mf2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e32,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32mf2_imm32() +{ + size_t vl = vsetvl_e32mf2(32); + return vl; +} +size_t test_vsetvl_e32mf2(size_t avl) +{ + size_t vl = vsetvl_e32mf2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e32,\s*mf2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e32mf2() +{ + size_t vl = vsetvlmax_e32mf2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e32,\s*mf2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m1_imm0() +{ + size_t vl = vsetvl_e32m1(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e32,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m1_imm31() +{ + size_t vl = vsetvl_e32m1(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e32,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m1_imm32() +{ + size_t vl = vsetvl_e32m1(32); + return vl; +} +size_t test_vsetvl_e32m1(size_t avl) +{ + size_t vl = vsetvl_e32m1(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e32,\s*m1,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e32m1() +{ + size_t vl = vsetvlmax_e32m1(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e32,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m2_imm0() +{ + size_t vl = vsetvl_e32m2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e32,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m2_imm31() +{ + size_t vl = vsetvl_e32m2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e32,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m2_imm32() +{ + size_t vl = vsetvl_e32m2(32); + return vl; +} +size_t test_vsetvl_e32m2(size_t avl) +{ + size_t vl = vsetvl_e32m2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e32,\s*m2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e32m2() +{ + size_t vl = vsetvlmax_e32m2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e32,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m4_imm0() +{ + size_t vl = vsetvl_e32m4(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e32,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m4_imm31() +{ + size_t vl = vsetvl_e32m4(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e32,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m4_imm32() +{ + size_t vl = 
vsetvl_e32m4(32); + return vl; +} +size_t test_vsetvl_e32m4(size_t avl) +{ + size_t vl = vsetvl_e32m4(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e32,\s*m4,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e32m4() +{ + size_t vl = vsetvlmax_e32m4(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e32,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m8_imm0() +{ + size_t vl = vsetvl_e32m8(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e32,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m8_imm31() +{ + size_t vl = vsetvl_e32m8(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e32,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e32m8_imm32() +{ + size_t vl = vsetvl_e32m8(32); + return vl; +} +size_t test_vsetvl_e32m8(size_t avl) +{ + size_t vl = vsetvl_e32m8(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e32,\s*m8,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e32m8() +{ + size_t vl = vsetvlmax_e32m8(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e32,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m1_imm0() +{ + size_t vl = vsetvl_e64m1(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e64,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m1_imm31() +{ + size_t vl = vsetvl_e64m1(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e64,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m1_imm32() +{ + size_t vl = vsetvl_e64m1(32); + return vl; +} +size_t test_vsetvl_e64m1(size_t avl) +{ + size_t vl = vsetvl_e64m1(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e64,\s*m1,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e64m1() +{ + size_t vl = vsetvlmax_e64m1(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e64,\s*m1,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m2_imm0() +{ + size_t vl = vsetvl_e64m2(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e64,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m2_imm31() +{ + size_t vl = vsetvl_e64m2(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e64,\s*m2,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m2_imm32() +{ + size_t vl = vsetvl_e64m2(32); + return vl; +} +size_t test_vsetvl_e64m2(size_t avl) +{ + size_t vl = vsetvl_e64m2(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e64,\s*m2,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e64m2() +{ + size_t vl = vsetvlmax_e64m2(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e64,\s*m2,\s*ta,\s*mu} 1 
} } */ + +size_t test_vsetvl_e64m4_imm0() +{ + size_t vl = vsetvl_e64m4(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e64,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m4_imm31() +{ + size_t vl = vsetvl_e64m4(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e64,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m4_imm32() +{ + size_t vl = vsetvl_e64m4(32); + return vl; +} +size_t test_vsetvl_e64m4(size_t avl) +{ + size_t vl = vsetvl_e64m4(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e64,\s*m4,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e64m4() +{ + size_t vl = vsetvlmax_e64m4(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e64,\s*m4,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m8_imm0() +{ + size_t vl = vsetvl_e64m8(0); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*0,\s*e64,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m8_imm31() +{ + size_t vl = vsetvl_e64m8(31); + return vl; +} +/* { dg-final { scan-assembler-times {vsetivli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*31,\s*e64,\s*m8,\s*ta,\s*mu} 1 } } */ + +size_t test_vsetvl_e64m8_imm32() +{ + size_t vl = vsetvl_e64m8(32); + return vl; +} +size_t test_vsetvl_e64m8(size_t avl) +{ + size_t vl = vsetvl_e64m8(avl); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*e64,\s*m8,\s*ta,\s*mu} 2 } } */ + +size_t test_vsetvlmax_e64m8() +{ + size_t vl = vsetvlmax_e64m8(); + return vl; +} +/* { dg-final { scan-assembler-times {vsetvli\s+(?:ra|[sgtf]p|t[0-6]|s[0-9]|s10|s11|a[0-7]),\s*zero,\s*e64,\s*m8,\s*ta,\s*mu} 1 } } */ From patchwork Tue May 31 08:49:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54549 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 98F323857BB2 for ; Tue, 31 May 2022 08:55:30 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbguseast2.qq.com (smtpbguseast2.qq.com [54.204.34.130]) by sourceware.org (Postfix) with ESMTPS id 6F1A0385624D for ; Tue, 31 May 2022 08:50:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6F1A0385624D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987034to474f6u Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:33 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: UldK9Jsj7VMAN6S5YxiWMIiEtnRQUru1uG2izgeeC4oWQ19YaC3oBlSuhbAv4 +6d3Y2+hdmVGzfdFORvCTjq78hCkEq+S4cFP4QDZZ7EcU2w/oLmCmkVOkW9HuYBUTmP+yKG vcyrcPYFbANIm2wqVKg/FUwlDfEtiPYzA1EnbrmHuHaJ8O/5cM9sg023sq4LlI23ILg3wgM N8dmTNCeiOp3Rxd6PuVW5buTBb/H2W7XpTMBBrQFTwISFC+Rbe7M397V860S2ePj4X16T9S mWEoLaNJP7a+9m6zWSKbSRn9SPZjlpPi4HG32Ikv5UHfnAuUn5KcvM5waEdW+W/+bH3721k 
1uX/wbDp0NT2Vzfob5FlLUxXqUhXyzd3qjBy22n2+4MkcfOlBc= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 06/21] Add insert-vsetvl pass Date: Tue, 31 May 2022 16:49:57 +0800 Message-Id: <20220531085012.269719-7-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign4 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config.gcc: Add riscv-insert-vsetvl.o extra_objs for RVV support. * config/riscv/constraints.md (Ws5): New constraint. * config/riscv/predicates.md (p_reg_or_const_csr_operand): New predicate. (vector_reg_or_const0_operand): New predicate. (vector_move_operand): New predicate. (reg_or_mem_operand): New predicate. (reg_or_simm5_operand): New predicate. (reg_or_const_int_operand): New predicate. * config/riscv/riscv-opts.h (enum vsew_field_enum): New enum. * config/riscv/riscv-passes.def (INSERT_PASS_AFTER): Run insert vsetvl pass after pass_split_all_insns. (INSERT_PASS_BEFORE): Run insert vsetvl pass before pass_sched2. * config/riscv/riscv-protos.h (make_pass_insert_vsetvl): New function. (make_pass_insert_vsetvl2): New function. (rvv_mask_mode_p): New function. (rvv_classify_vsew_field): New function. (rvv_gen_policy): New function. (rvv_get_mask_mode): New function. (rvv_translate_attr_mode): New function. * config/riscv/riscv-vector-builtins-iterators.def (V): New iterator. (VF): New iterator. (VB): New iterator. (VFULL): New iterator. (VPARTIAL): New iterator. (V64BITI): New iterator. (VM): New iterator. (VSUB): New iterator. (VDI_TO_VSI): New iterator. (VDI_TO_VSI_VM): New iterator. * config/riscv/riscv-vector.cc (enum vsew_field_enum): New enum. (rvv_classify_vsew_field): New function. (rvv_gen_policy): New function. (rvv_translate_attr_mode): New function. (TRANSLATE_VECTOR_MODE): New macro define. (classify_vtype_field): New function. (get_lmulx8): New function. (force_reg_for_over_uimm): New function. (gen_vlx2): New function. (emit_int64_to_vector_32bit): New function. (imm32_p): New function. (imm_p): New function. (gen_3): New function. (gen_4): New function. (gen_5): New function. (gen_6): New function. (gen_7): New function. (enum GEN_CLASS): New enum. (modify_operands): New function. (emit_op5_vmv_v_x): New function. (emit_op5): New function. * config/riscv/riscv-vector.h (riscv_vector_mode_p): New function. (rvv_legitimate_poly_int_p): New function. (rvv_offset_temporaries): New function. (rvv_classify_vlmul_field): New function. (rvv_parse_vsew_field): New function. (rvv_parse_vlmul_field): New function. (rvv_parse_vta_field): New function. (rvv_parse_vma_field): New function. (rvv_regsize): New function. (rvv_get_mask_mode): New function. * config/riscv/riscv.md: Add RVV modes. 
* config/riscv/t-riscv: New object. * config/riscv/vector-iterators.md: New iterators and attributes. * config/riscv/vector.md (@vec_duplicate): New pattern. (@vle): New pattern. (@vse): New pattern. (@vlm): New pattern. (@vsm): New pattern. (@v_v_x): New pattern. (@vmv_v_x_internal): New pattern. (@vmv_v_x_32bit): New pattern. (@vfmv_v_f): New pattern. (@vmerge_vxm_internal): New pattern. * config/riscv/riscv-insert-vsetvl.cc: New file. --- gcc/config.gcc | 2 +- gcc/config/riscv/constraints.md | 5 + gcc/config/riscv/predicates.md | 31 + gcc/config/riscv/riscv-insert-vsetvl.cc | 2312 +++++++++++++++++ gcc/config/riscv/riscv-opts.h | 12 + gcc/config/riscv/riscv-passes.def | 2 + gcc/config/riscv/riscv-protos.h | 19 + .../riscv/riscv-vector-builtins-iterators.def | 236 ++ gcc/config/riscv/riscv-vector.cc | 368 +++ gcc/config/riscv/riscv-vector.h | 10 - gcc/config/riscv/riscv.md | 67 +- gcc/config/riscv/t-riscv | 4 + gcc/config/riscv/vector-iterators.md | 129 +- gcc/config/riscv/vector.md | 235 +- 14 files changed, 3417 insertions(+), 15 deletions(-) create mode 100644 gcc/config/riscv/riscv-insert-vsetvl.cc diff --git a/gcc/config.gcc b/gcc/config.gcc index 042a7a17737..1592e344531 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -518,7 +518,7 @@ pru-*-*) riscv*) cpu_type=riscv extra_headers="riscv_vector.h" - extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-vector.o riscv-vector-builtins-functions.o riscv-vector-builtins.o" + extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-vector.o riscv-vector-builtins-functions.o riscv-vector-builtins.o riscv-insert-vsetvl.o" d_target_objs="riscv-d.o" target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-builtins.cc \$(srcdir)/config/riscv/riscv-vector-builtins.cc" target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins-functions.cc" diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md index 7fd61a04216..114878130bb 100644 --- a/gcc/config/riscv/constraints.md +++ b/gcc/config/riscv/constraints.md @@ -97,3 +97,8 @@ (and (match_code "const_poly_int") (match_test "CONST_POLY_INT_COEFFS (op)[0] == UNITS_PER_V_REG.coeffs[0] && CONST_POLY_INT_COEFFS (op)[1] == UNITS_PER_V_REG.coeffs[1]"))) + +(define_constraint "Ws5" + "Signed immediate 5-bit value" + (and (match_code "const_int") + (match_test "IN_RANGE (INTVAL (op), -16, 15)"))) diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index 6328cfff367..7a101676538 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -246,3 +246,34 @@ (define_predicate "imm5_operand" (and (match_code "const_int") (match_test "INTVAL (op) < 5"))) + +;; Vector Predicates. 
+ +(define_special_predicate "p_reg_or_const_csr_operand" + (match_code "reg, subreg, const_int") +{ + if (CONST_INT_P (op)) + return satisfies_constraint_K (op); + return GET_MODE (op) == Pmode; +}) + +(define_predicate "vector_reg_or_const0_operand" + (ior (match_operand 0 "register_operand") + (match_test "op == const0_rtx && !VECTOR_MODE_P (GET_MODE (op))"))) + +(define_predicate "vector_move_operand" + (ior (match_operand 0 "nonimmediate_operand") + (match_code "const_vector"))) + +(define_predicate "reg_or_mem_operand" + (ior (match_operand 0 "register_operand") + (match_operand 0 "memory_operand"))) + +(define_predicate "reg_or_simm5_operand" + (ior (match_operand 0 "register_operand") + (and (match_operand 0 "const_int_operand") + (match_test "!FLOAT_MODE_P (GET_MODE (op)) && IN_RANGE (INTVAL (op), -16, 15)")))) + +(define_predicate "reg_or_const_int_operand" + (ior (match_operand 0 "register_operand") + (match_code "const_wide_int, const_int"))) \ No newline at end of file diff --git a/gcc/config/riscv/riscv-insert-vsetvl.cc b/gcc/config/riscv/riscv-insert-vsetvl.cc new file mode 100644 index 00000000000..939927c5775 --- /dev/null +++ b/gcc/config/riscv/riscv-insert-vsetvl.cc @@ -0,0 +1,2312 @@ +/* Insert-vsetvli pass for RISC-V 'V' Extension for GNU compiler. + Copyright(C) 2022-2022 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or(at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#define IN_TARGET_CODE 1 +#define INCLUDE_ALGORITHM 1 +#define INCLUDE_FUNCTIONAL 1 + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "rtl.h" +#include "backend.h" +#include "regs.h" +#include "target.h" +#include "memmodel.h" +#include "emit-rtl.h" +#include "df.h" +#include "rtl-ssa.h" +#include "predict.h" +#include "insn-config.h" +#include "insn-attr.h" +#include "recog.h" +#include "cfgrtl.h" +#include "tree.h" +#include "gimple.h" +#include "tree-pass.h" +#include "ssa.h" +#include "gimple-iterator.h" +#include "gimple-walk.h" +#include "langhooks.h" +#include "tree-iterator.h" +#include "gimplify.h" +#include "explow.h" +#include "cfgcleanup.h" + +#include +#include +#include +#include +#include + +#include "riscv-protos.h" +#include "riscv-vector-builtins-functions.h" +#include "riscv-vector-builtins.h" + +using namespace riscv_vector; +using namespace rtl_ssa; + +/* This pass is to insert vsetvli instructions for RVV instructions that depend on vtype or vl. + Because Clang+LLVM compiler has the mature pass to insert vsetvli instructions and works well, + algorithm follows the Clang+LLVM compiler Pass. + + This pass consists of 3 phases: + + Phase 1 collects how each basic block affects VL/VTYPE. + + Phase 2 uses the information from phase 1 to do a data flow analysis to + propagate the VL/VTYPE changes through the function. This gives us the + VL/VTYPE at the start of each basic block. 
+ + Phase 3 inserts vsetvli instructions in each basic block. Information from + phase 2 is used to prevent inserting a vsetvli before the first vector + instruction in the block if possible. */ + +enum state_enum +{ + STATE_UNINITIALIZED, + STATE_KNOWN, + STATE_UNKNOWN +}; + +enum replace_enum +{ + REPLACE_VL, + REPLACE_VTYPE +}; + +enum clobber_pat_enum +{ + MOV_CLOBBER_MEM_REG, + MOV_CLOBBER_REG_MEM, + MOV_CLOBBER_REG_REG, + MOV_CLOBBER_REG_CONST, + OTHERS +}; + +/* Helper functions. */ + +static unsigned int +get_policy_offset (rtx_insn *insn) +{ + unsigned int offset = 1; + if (GET_CODE (PATTERN (insn)) == PARALLEL) + { + if (get_attr_type (insn) == TYPE_VCMP) + offset = 2; + } + return offset; +} + +static unsigned int +get_vl_offset (rtx_insn *insn) +{ + unsigned int offset = 2; + if (GET_CODE (PATTERN (insn)) == PARALLEL) + { + if (get_attr_type (insn) == TYPE_VCMP) + offset = 3; + } + return offset; +} + +static enum clobber_pat_enum +recog_clobber_vl_vtype (rtx_insn *insn) +{ + /* + [(set (match_operand 0 "reg_or_mem_operand" "=vr,m,vr") + (match_operand 1 "reg_or_mem_operand" "m,vr,vr")) + (clobber (match_scratch:SI 2 "=&r,&r,X")) + (clobber (reg:SI VL_REGNUM)) + (clobber (reg:SI VTYPE_REGNUM))] + */ + rtx pat = PATTERN (insn); + if (GET_CODE (pat) != PARALLEL) + return OTHERS; + + unsigned int len = XVECLEN (pat, 0); + if (len < 3) + return OTHERS; + + if (!rtx_equal_p ( + XVECEXP (pat, 0, len - 1), + gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (SImode, VTYPE_REGNUM)))) + return OTHERS; + + if (!rtx_equal_p (XVECEXP (pat, 0, len - 2), + gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (SImode, VL_REGNUM)))) + return OTHERS; + + extract_insn_cached (insn); + rtx mov_pat = gen_rtx_SET (recog_data.operand[0], recog_data.operand[1]); + if (!rtx_equal_p (XVECEXP (pat, 0, 0), mov_pat)) + return OTHERS; + + if (MEM_P (recog_data.operand[0])) + return MOV_CLOBBER_MEM_REG; + + if (MEM_P (recog_data.operand[1])) + return MOV_CLOBBER_REG_MEM; + + if (REG_P (recog_data.operand[1])) + return MOV_CLOBBER_REG_REG; + + if (CONST_VECTOR_P (recog_data.operand[1])) + return MOV_CLOBBER_REG_CONST; + + return OTHERS; +} + +static bool +is_vector_config_instr (rtx_insn *insn) +{ + return insn && INSN_P (insn) && recog_memoized (insn) >= 0 && + get_attr_type (insn) == TYPE_VSETVL; +} + +/// Return true if this is 'vsetvli x0, x0, vtype' which preserves +/// VL and only sets VTYPE. 
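/* Illustrative aside, not part of the patch: the 'vsetvli x0, x0, vtype'
   form writes VTYPE only and keeps the current VL; per the ISA it is only
   valid when the new SEW/LMUL ratio (and hence VLMAX) is unchanged, e.g.

     vsetvli a5, a0, e8, m1, ta, mu    # vl = min(a0, VLMAX), vtype = e8/m1
     ...
     vsetvli x0, x0, e32, m4, ta, mu   # vtype becomes e32/m4, vl is kept
                                       # (SEW/LMUL ratio 8/1 == 32/4)

   In the RTL patterns of this patch such an instruction carries only the
   vtype immediate, which is what the n_operands == 1 check below relies on.  */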
+static bool +is_vl_preserving_config (rtx_insn *insn) +{ + if (is_vector_config_instr (insn)) + { + extract_insn_cached (insn); + return recog_data.n_operands == 1; + } + return false; +} + +static bool +rvv_insn_p (rtx_insn *insn, rtx *src) +{ + *src = NULL_RTX; + if (!insn) + return false; + + if (!INSN_P (insn)) + return false; + + if (recog_memoized (insn) < 0) + return false; + + if (!rvv_mode_p (rvv_translate_attr_mode (insn))) + return false; + + if (recog_clobber_vl_vtype (insn) != OTHERS) + { + if (reload_completed) + { + *src = SET_SRC (XVECEXP (PATTERN (insn), 0, 0)); + return true; + } + else + return false; + } + + if (GET_CODE (PATTERN (insn)) == PARALLEL) + *src = SET_SRC (XVECEXP (PATTERN (insn), 0, 0)); + + if (GET_CODE (PATTERN (insn)) == SET) + *src = SET_SRC (PATTERN (insn)); + + if (!*src) + return false; + + if (GET_CODE (*src) != UNSPEC) + return false; + + if (XINT (*src, 1) != UNSPEC_RVV) + return false; + + return true; +} + +static bool +use_vl_p (rtx_insn *insn) +{ + rtx src = NULL_RTX; + if (!rvv_insn_p (insn, &src)) + return false; + + if (recog_clobber_vl_vtype (insn) != OTHERS) + return true; + + if (rtx_equal_p (XVECEXP (src, 0, XVECLEN (src, 0) - 1), + gen_rtx_REG (SImode, VL_REGNUM))) + return true; + + if (XVECLEN (src, 0) > 1 && + rtx_equal_p (XVECEXP (src, 0, XVECLEN (src, 0) - 2), + gen_rtx_REG (SImode, VL_REGNUM))) + return true; + + return false; +} + +static bool +use_vtype_p (rtx_insn *insn) +{ + rtx src = NULL_RTX; + if (!rvv_insn_p (insn, &src)) + return false; + + if (recog_clobber_vl_vtype (insn) != OTHERS) + return true; + + if (rtx_equal_p (XVECEXP (src, 0, XVECLEN (src, 0) - 1), + gen_rtx_REG (SImode, VTYPE_REGNUM))) + return true; + + return false; +} + +static bool +use_vlmax_p (rtx_insn *insn) +{ + rtx src = NULL_RTX; + unsigned int length = 0; + + if (recog_clobber_vl_vtype (insn) != OTHERS) + return true; + + if (rvv_insn_p (insn, &src)) + length = XVECLEN (src, 0); + + if (length < 2) + return false; + + if (rtx_equal_p (XVECEXP (src, 0, length - 1), + gen_rtx_REG (SImode, VL_REGNUM))) + return rtx_equal_p (XVECEXP (src, 0, length - 2), + gen_rtx_REG (Pmode, X0_REGNUM)); + + if (length < 3) + return false; + + return rtx_equal_p (XVECEXP (src, 0, length - 3), + gen_rtx_REG (Pmode, X0_REGNUM)); +} + +static bool +need_vsetvli_p (rtx_insn *insn) +{ + rtx src = NULL_RTX; + if (!rvv_insn_p (insn, &src)) + return false; + return true; +} + +static void +replace_op (rtx_insn *insn, rtx x, unsigned int replace) +{ + extract_insn_cached (insn); + if (replace == REPLACE_VTYPE) + validate_change (insn, recog_data.operand_loc[recog_data.n_operands - 1], x, false); + + if (replace == REPLACE_VL && !use_vlmax_p (insn)) + { + unsigned int offset = get_vl_offset (insn); + validate_change (insn, + recog_data.operand_loc[recog_data.n_operands - offset], + x, false); + } +} + +static bool +update_vl_vtype_p (rtx_insn *insn) +{ + if (insn && NONDEBUG_INSN_P (insn)) + { + if (recog_memoized (insn) >= 0 && + (get_attr_type (insn) == TYPE_VLEFF)) + { + extract_insn_cached (insn); + if (INTVAL (recog_data.operand[recog_data.n_operands - 1]) == + DO_NOT_UPDATE_VL_VTYPE) + return false; + return true; + } + if (CALL_P (insn)) + return true; + if (PATTERN (insn) && (GET_CODE (PATTERN (insn)) == ASM_INPUT || + GET_CODE (PATTERN (insn)) == ASM_OPERANDS || + asm_noperands (PATTERN (insn)) >= 0)) + return true; + } + return false; +} + +static rtx +get_avl_source (rtx avl, rtx_insn *rtl) +{ + if (!rtl || !avl) + return NULL_RTX; + + if (optimize < 2) + return 
NULL_RTX; + + insn_info *next; + rtx avl_source = NULL_RTX; + + if (!REG_P (avl)) + return NULL_RTX; + + for (insn_info *insn = crtl->ssa->first_insn (); insn; insn = next) + { + next = insn->next_any_insn (); + if (insn->rtl () == rtl) + { + resource_info resource{GET_MODE (avl), REGNO (avl)}; + def_lookup dl = crtl->ssa->find_def (resource, insn); + def_info *def = dl.prev_def (insn); + + if (!def) + return NULL_RTX; + + if (!is_a (def)) + return NULL_RTX; + + insn_info *def_insn = def->insn (); + + if (!def_insn) + return NULL_RTX; + rtx_insn *def_rtl = def_insn->rtl (); + + if (!def_rtl) + return NULL_RTX; + + if (INSN_P (def_rtl) && single_set (def_rtl)) + { + avl_source = SET_SRC (single_set (def_rtl)); + break; + } + } + } + + return avl_source; +} + +static machine_mode +vsew_to_int_mode (unsigned vsew) +{ + return vsew == 0 ? QImode : vsew == 1 ? HImode : vsew == 2 ? SImode : DImode; +} + +class vinfo +{ +private: + state_enum state; + // Fields from VTYPE. + uint8_t vma : 1; + uint8_t vta : 1; + uint8_t vsew : 3; + uint8_t vlmul : 3; + uint8_t all_maskop_p : 1; + uint8_t store_p : 1; + uint8_t sew_lmul_ratio_only_p : 1; + uint8_t scalar_move_p : 1; + rtx avl; + rtx avl_source; + +public: + vinfo () + : state (STATE_UNINITIALIZED), vma (false), vta (false), vsew (0), + vlmul (0), all_maskop_p (false), store_p (false), sew_lmul_ratio_only_p (false), + scalar_move_p (false), avl (NULL_RTX), avl_source (NULL_RTX) + { + } + + ~vinfo () {} + + static vinfo + get_unknown () + { + vinfo info; + info.set_unknown (); + return info; + } + + bool + valid_p () const + { + return state != STATE_UNINITIALIZED; + } + void + set_unknown () + { + state = STATE_UNKNOWN; + } + bool + unknown_p () const + { + return state == STATE_UNKNOWN; + } + + bool + known_p () const + { + return state == STATE_KNOWN; + } + + void + set_avl (rtx op) + { + avl = op; + state = STATE_KNOWN; + } + + void + set_avl_source (rtx op) + { + avl_source = op; + } + + bool + avl_const_p () const + { + return get_avl () && CONST_SCALAR_INT_P (get_avl ()); + } + + bool + avl_reg_p () const + { + return get_avl () && REG_P (get_avl ()); + } + + rtx + get_avl () const + { + gcc_assert (known_p ()); + return avl; + } + + bool + has_zero_avl () const + { + if (!known_p ()) + return false; + if (get_avl () == NULL_RTX) + return false; + if (avl_const_p ()) + return INTVAL (get_avl ()) == 0; + return false; + } + + bool + has_nonzero_avl () const + { + if (!known_p ()) + return false; + if (get_avl () == NULL_RTX) + return false; + if (avl_const_p ()) + return INTVAL (get_avl ()) > 0; + if (avl_reg_p ()) + return rtx_equal_p (get_avl (), gen_rtx_REG (Pmode, X0_REGNUM)); + return false; + } + + rtx + get_avl_source () const + { + gcc_assert (known_p ()); + return avl_source; + } + + unsigned int + get_vsew () const + { + return vsew; + } + + enum vlmul_field_enum + get_vlmul () const + { + return (enum vlmul_field_enum) vlmul; + } + + unsigned int + get_vta () const + { + return vta; + } + + unsigned int + get_vma () const + { + return vma; + } + + uint8_t + get_store_p () const + { + return store_p; + } + + bool + compare_vl (const vinfo &info) const + { + /* Optimize the code as follows: + if RVV is a fixed vector-length = 128bit. + vsetvli a5, 16, e8, m1...... + ......... 
+ vsetvli a5, zero, e8, m1.....(no need) + */ + if (!get_avl () || !info.get_avl ()) + return false; + + if (REG_P (get_avl ()) && REGNO (get_avl ()) == X0_REGNUM) + { + unsigned int vsew = info.get_vsew (); + machine_mode inner = vsew_to_int_mode (vsew); + machine_mode mode = riscv_vector::vector_builtin_mode ( + as_a (inner), info.get_vlmul ()); + if (CONST_SCALAR_INT_P (info.get_avl ())) + { + if (GET_MODE_NUNITS (mode).is_constant () && + INTVAL (info.get_avl ()) == + GET_MODE_NUNITS (mode).to_constant ()) + return true; + } + + if (REG_P (info.get_avl ())) + { + if (info.get_avl_source ()) + { + if (CONST_SCALAR_INT_P (info.get_avl_source ()) && + GET_MODE_NUNITS (mode).is_constant () && + INTVAL (info.get_avl_source ()) == + GET_MODE_NUNITS (mode).to_constant ()) + return true; + if (CONST_POLY_INT_P (info.get_avl_source ()) && + !GET_MODE_NUNITS (mode).is_constant () && + known_eq (rtx_to_poly_int64 (info.get_avl_source ()), + GET_MODE_NUNITS (mode))) + return true; + } + } + } + + return false; + } + + bool + avl_equal_p (const vinfo &other) const + { + gcc_assert (valid_p () && other.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!unknown_p () && !other.unknown_p () && + "Can't compare AVL in unknown state."); + + if (compare_vl (other)) + return true; + + if (other.compare_vl (*this)) + return true; + + if (rtx_equal_p (get_avl (), other.get_avl ())) + return true; + + if (!get_avl_source () && !other.get_avl_source ()) + return false; + + if (get_avl_source () && rtx_equal_p (get_avl_source (), other.get_avl ())) + return true; + + if (other.get_avl_source () && + rtx_equal_p (other.get_avl_source (), get_avl ())) + return true; + + return rtx_equal_p (get_avl_source (), other.get_avl_source ()); + } + + void + set_vma (unsigned int vma) + { + gcc_assert (valid_p () && !unknown_p () && + "Can't set VTYPE for uninitialized or unknown."); + vma = vma; + } + + void + set_vta (unsigned int vta) + { + gcc_assert (valid_p () && !unknown_p () && + "Can't set VTYPE for uninitialized or unknown."); + vta = vta; + } + + void + set_vtype (unsigned int vtype) + { + gcc_assert (valid_p () && !unknown_p () && + "Can't set VTYPE for uninitialized or unknown."); + vma = rvv_parse_vma_field (vtype); + vta = rvv_parse_vta_field (vtype); + vsew = rvv_parse_vsew_field (vtype); + vlmul = rvv_parse_vlmul_field (vtype); + } + + void + set_vtype (unsigned vl, unsigned vs, bool vt, bool vm, bool m_p, + bool st_p, bool is_scalar_move_op) + { + gcc_assert (valid_p () && !unknown_p () && + "Can't set VTYPE for uninitialized or unknown."); + vma = vm; + vta = vt; + vsew = vs; + vlmul = vl; + all_maskop_p = m_p; + store_p = st_p; + scalar_move_p = is_scalar_move_op; + } + + // Encode VTYPE into the binary format used by the the VSETVLI instruction + // which is used by our MC layer representation. 
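/* Worked example (illustrative, not part of the patch), using the bit
   layout documented just below: e32 (vsew = 2), m1 (vlmul = 0), tail
   agnostic, mask undisturbed encodes as

     vtype = (vma << 7) | (vta << 6) | (vsew << 3) | vlmul
           = (0 << 7)   | (1 << 6)   | (2 << 3)    | 0
           = 0x50

   i.e. the immediate that a 'vsetvli rd, rs1, e32, m1, ta, mu' carries.  */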
+ // + // Bits | Name | Description + // -----+------------+------------------------------------------------ + // 7 | vma | Vector mask agnostic + // 6 | vta | Vector tail agnostic + // 5:3 | vsew[2:0] | Standard element width(SEW) setting + // 2:0 | vlmul[2:0] | Vector register group multiplier(LMUL) setting + unsigned + encode_vtype () const + { + gcc_assert (valid_p () && !unknown_p () && !sew_lmul_ratio_only_p && + "Can't set VTYPE for uninitialized or unknown."); + gcc_assert (vsew >= 0 && vsew <= 7 && "Invalid SEW."); + unsigned vtype = (vsew << 3) | (vlmul & 0x7); + if (vta) + vtype |= 0x40; + if (vma) + vtype |= 0x80; + + return vtype; + } + + bool + get_sew_lmul_ratio_only_p () const + { + return sew_lmul_ratio_only_p; + } + + bool + sew_equal_p (const vinfo &other) const + { + gcc_assert (valid_p () && other.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!unknown_p () && !other.unknown_p () && + "Can't compare VTYPE in unknown state."); + gcc_assert (!sew_lmul_ratio_only_p && !other.sew_lmul_ratio_only_p && + "Can't compare when only LMUL/SEW ratio is valid."); + return vsew == other.vsew; + } + + bool + vtype_equal_p (const vinfo &other) const + { + gcc_assert (valid_p () && other.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!unknown_p () && !other.unknown_p () && + "Can't compare VTYPE in unknown state."); + gcc_assert (!sew_lmul_ratio_only_p && !other.sew_lmul_ratio_only_p && + "Can't compare when only LMUL/SEW ratio is valid."); + return std::tie (vma, vta, vsew, vlmul) == + std::tie (other.vma, other.vta, other.vsew, other.vlmul); + } + + bool + policy_equal_p (const vinfo &other) const + { + gcc_assert (valid_p () && other.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!unknown_p () && !other.unknown_p () && + "Can't compare VTYPE in unknown state."); + + return vta == other.vta && vma == other.vma; + } + + unsigned + calc_sew_lmul_ratio (unsigned int vsew_arg, unsigned int vlmul_arg) const + { + gcc_assert (valid_p () && !unknown_p () && + "Can't use VTYPE for uninitialized or unknown."); + + unsigned lmul; + unsigned sew; + bool fractional; + + switch (vsew_arg) + { + default: + gcc_unreachable (); + case 0: + sew = 8; + break; + case 1: + sew = 16; + break; + case 2: + sew = 32; + break; + case 3: + sew = 64; + break; + case 4: + sew = 128; + break; + case 5: + sew = 256; + break; + case 6: + sew = 512; + break; + case 7: + sew = 1024; + break; + } + + switch (vlmul_arg) + { + default: + gcc_unreachable (); + case 0: + lmul = 1; + fractional = false; + break; + case 1: + lmul = 2; + fractional = false; + break; + case 2: + lmul = 4; + fractional = false; + break; + case 3: + lmul = 8; + fractional = false; + break; + case 5: + lmul = 8; + fractional = true; + break; + case 6: + lmul = 4; + fractional = true; + break; + case 7: + lmul = 2; + fractional = true; + break; + } + + gcc_assert (sew >= 8 && "Unexpected SEW value."); + unsigned int sew_mul_ratio = fractional ? sew * lmul : sew / lmul; + + return sew_mul_ratio; + } + + unsigned + calc_sew_lmul_ratio () const + { + return calc_sew_lmul_ratio (vsew, vlmul); + } + + // Check if the VTYPE for these two VSETVLI Infos produce the same VLMAX. 
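/* Background for the VLMAX comparison below (illustrative, not part of the
   patch): the spec defines VLMAX = VLEN * LMUL / SEW, so two settings yield
   the same VLMAX exactly when their SEW/LMUL ratios agree.  For example,
   with VLEN = 128 bits:

     e8,  m1 : VLMAX = 128 * 1 / 8  = 16    (SEW/LMUL ratio 8)
     e32, m4 : VLMAX = 128 * 4 / 32 = 16    (SEW/LMUL ratio 8)

   which is why comparing calc_sew_lmul_ratio () results is sufficient.  */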
+ bool + vlmax_equal_p (const vinfo &other) const + { + gcc_assert (valid_p () && other.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!unknown_p () && !other.unknown_p () && + "Can't compare AVL in unknown state."); + return calc_sew_lmul_ratio () == other.calc_sew_lmul_ratio (); + } + + bool + compatible_vtype_p (const vinfo &info) const + { + // Simple case, see if full VTYPE matches. + if (vtype_equal_p (info)) + return true; + + // If this is a mask reg operation, it only cares about VLMAX. + // FIXME: Mask reg operations are probably ok if "this" VLMAX is larger + // than "InstrInfo". + // FIXME: The policy bits can probably be ignored for mask reg operations. + if (info.all_maskop_p && vlmax_equal_p (info) && vta == info.vta && + vma == info.vma) + return true; + + return false; + } + + // Determine whether the vector instructions requirements represented by + // InstrInfo are compatible with the previous vsetvli instruction represented + // by this. + bool + compatible_p (const vinfo &require) const + { + gcc_assert (valid_p () && require.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!require.sew_lmul_ratio_only_p && + "Expected a valid VTYPE for instruction."); + + // Nothing is compatible with Unknown. + if (unknown_p () || require.unknown_p ()) + return false; + + // If only our VLMAX ratio is valid, then this isn't compatible. + if (sew_lmul_ratio_only_p) + return false; + + // If the instruction doesn't need an AVLReg and the SEW matches, consider + // it compatible. + if (require.known_p () && require.avl == NULL_RTX + && vsew == require.vsew) + return true; + + // For vmv.s.x and vfmv.s.f, there is only two behaviors, VL = 0 and VL > 0. + // So it's compatible when we could make sure that both VL be the same + // situation. + if (require.scalar_move_p && require.get_avl () && + CONST_SCALAR_INT_P (require.get_avl ()) && + ((has_nonzero_avl () && require.has_nonzero_avl ()) || + (has_zero_avl () && require.has_zero_avl ())) && + sew_equal_p (require) && policy_equal_p (require)) + return true; + + // The AVL must match. + if (!avl_equal_p (require)) + return false; + + if (compatible_vtype_p (require)) + return true; + + // Store instructions don't use the policy fields. + // TODO: Move into hasCompatibleVTYPE? + if (require.store_p && vlmul == require.vlmul && vsew == require.vsew) + return true; + + // Anything else is not compatible. + return false; + } + + bool + load_store_compatible_p (unsigned vsew_arg, const vinfo &info) const + { + gcc_assert (valid_p () && info.valid_p () && + "Can't compare invalid VSETVLI Infos."); + gcc_assert (!info.sew_lmul_ratio_only_p && + "Expected a valid VTYPE for instruction."); + gcc_assert (vsew_arg == info.vsew && "Mismatched EEW/SEW for store."); + + if (unknown_p () || get_sew_lmul_ratio_only_p ()) + return false; + + if (!avl_equal_p (info)) + return false; + + // Stores can ignore the tail and mask policies. + if (!info.store_p && (vta != info.vta || vma != info.vma)) + return false; + + return calc_sew_lmul_ratio () == calc_sew_lmul_ratio (vsew_arg, info.vlmul); + } + + bool + operator== (const vinfo &other) const + { + // Uninitialized is only equal to another Uninitialized. + if (!valid_p ()) + return !other.valid_p (); + + if (!other.valid_p ()) + return !valid_p (); + + // Unknown is only equal to another Unknown. 
+ if (unknown_p ()) + return other.unknown_p (); + + if (other.unknown_p ()) + return unknown_p (); + + if (!avl_equal_p (other)) + return false; + + // If only the VLMAX is valid, check that it is the same. + if (sew_lmul_ratio_only_p && other.sew_lmul_ratio_only_p) + return vlmax_equal_p (other); + + // If the full VTYPE is valid, check that it is the same. + if (!sew_lmul_ratio_only_p && !other.sew_lmul_ratio_only_p) + return vtype_equal_p (other); + + // If the sew_lmul_ratio_only bits are different, then they aren't equal. + return false; + } + + bool + operator!= (const vinfo &Other) const + { + return !(*this == Other); + } + + vinfo & + operator= (const vinfo &other) + { + state = other.state; + vma = other.vma; + vta = other.vta; + vsew = other.vsew; + vlmul = other.vlmul; + all_maskop_p = other.all_maskop_p; + sew_lmul_ratio_only_p = other.sew_lmul_ratio_only_p; + avl = other.avl; + avl_source = other.avl_source; + return *this; + } + + // Calculate the vinfo visible to a block assuming this and other are + // both predecessors. + vinfo + intersect (const vinfo &other) const + { + // If the new value isn't valid, ignore it. + if (!other.valid_p ()) + return *this; + + // If this value isn't valid, this must be the first predecessor, use it. + if (!valid_p ()) + return other; + + // If either is unknown, the result is unknown. + if (unknown_p () || other.unknown_p ()) + return vinfo::get_unknown (); + + // If we have an exact, match return this. + if (*this == other) + return *this; + + // Not an exact match, but maybe the AVL and VLMAX are the same. If so, + // return an SEW/LMUL ratio only value. + if (avl_equal_p (other) && vlmax_equal_p (other)) + { + vinfo merge_info = *this; + merge_info.sew_lmul_ratio_only_p = true; + return merge_info; + } + + // otherwise the result is unknown. + return vinfo::get_unknown (); + } + + // Print debug info into rtl dump file. */ + void + print () const + { + fprintf (dump_file, "{\n"); + if (known_p ()) + fprintf (dump_file, " Known\n"); + else if (unknown_p ()) + fprintf (dump_file, " Unknown\n"); + else + fprintf (dump_file, " Uninitialized\n"); + + if (known_p () && get_avl ()) + { + fprintf (dump_file, " Avl="); + print_rtl_single (dump_file, get_avl ()); + if (get_avl_source ()) + { + fprintf (dump_file, " Avl Source="); + print_rtl_single (dump_file, get_avl_source ()); + } + else + fprintf (dump_file, " Avl Source=(nil)\n"); + } + else + fprintf (dump_file, " Avl=(nil)\n Avl Source=(nil)\n"); + fprintf (dump_file, " Vsew=%d\n", (unsigned int)vsew); + fprintf (dump_file, " Vlmul=%d\n", (unsigned int)vlmul); + fprintf (dump_file, " TailAgnostic=%d\n", (unsigned int)vta); + fprintf (dump_file, " MaskAgnostic=%d\n", (unsigned int)vma); + fprintf (dump_file, " MaskOp=%d\n", (unsigned int)all_maskop_p); + fprintf (dump_file, " Store_p=%d\n", (unsigned int)store_p); + fprintf (dump_file, " Scalar_move_p=%d\n", (unsigned int)scalar_move_p); + fprintf (dump_file, " Sew_lmul_ratio_only_p=%d\n", (unsigned int)sew_lmul_ratio_only_p); + fprintf (dump_file, "}\n"); + } +}; + +struct bb_vinfo +{ + // The vinfo that represents the net changes to the VL/VTYPE registers + // made by this block. Calculated in Phase 1. + vinfo change; + + // The vinfo that represents the VL/VTYPE settings on exit from this + // block. Calculated in Phase 2. + vinfo exit; + + // The vinfo that represents the VL/VTYPE settings from all predecessor + // blocks. Calculated in Phase 2, and used by Phase 3. 
+ vinfo pred; + + // Keeps track of whether the block is already in the queue. + bool inqueue = false; + + bb_vinfo () {} +}; + +static std::map bb_vinfo_map; +static std::deque bb_queue; + +static rtx_insn * +fetch_def_insn (rtx_insn *rtl, const vinfo info) +{ + /* We need use rtl ssa def_info to optimize which needs + optimization to large than or equal to 2. */ + if (optimize < 2) + return NULL; + + // We didn't find a compatible value. If our AVL is a virtual register, + // it might be defined by a VSET(I)VLI. If it has the same VTYPE we need + // and the last VL/VTYPE we observed is the same, we don't need a + // VSETVLI here. + if (!info.known_p ()) + return NULL; + if (!info.get_avl ()) + return NULL; + + rtx avl = info.get_avl (); + + if (!REG_P (avl)) + return NULL; + + insn_info *next; + for (insn_info *insn = crtl->ssa->first_insn (); insn; insn = next) + { + next = insn->next_any_insn (); + if (insn->rtl () == rtl) + { + resource_info resource{GET_MODE (avl), REGNO (avl)}; + def_lookup dl = crtl->ssa->find_def (resource, insn); + def_info *def = dl.prev_def (insn); + + if (!def) + return NULL; + + if (!is_a (def)) + return NULL; + + insn_info *def_insn = def->insn (); + rtx_insn *def_rtl = def_insn->rtl (); + + if (!def_rtl) + return NULL; + if (!INSN_P (def_rtl)) + return NULL; + + return def_rtl; + } + } + + return NULL; +} + +static void +emit_vsetvl_insn (rtx op0, rtx op1, rtx op2, rtx_insn *insn) +{ + if (dump_file) + { + fprintf (dump_file, "insert vsetvli for insn %d\n\n", INSN_UID (insn)); + print_rtl_single (dump_file, insn); + } + + if (rtx_equal_p (op0, gen_rtx_REG (Pmode, X0_REGNUM)) && + rtx_equal_p (op1, gen_rtx_REG (Pmode, X0_REGNUM))) + emit_insn_before (gen_vsetvl_zero_zero (op2), insn); + else if (rtx_equal_p (op0, gen_rtx_REG (Pmode, X0_REGNUM))) + emit_insn_before (gen_vsetvl_zero (Pmode, op1, op2), insn); + else + emit_insn_before (gen_vsetvl (Pmode, op0, op1, op2), insn); +} + +static vinfo +compute_info_for_instr (rtx_insn *, vinfo); + +// Return a vinfo representing the changes made by this VSETVLI or +// VSETIVLI instruction. +static vinfo +get_info_for_vsetvli (rtx_insn *insn, vinfo curr_info) +{ + vinfo new_info; + extract_insn_cached (insn); + + if (recog_data.n_operands == 1) + { + gcc_assert (CONST_INT_P (recog_data.operand[0]) && + "Invalid vtype in vsetvli instruction."); + if (curr_info.valid_p () && !curr_info.unknown_p ()) + { + new_info.set_avl (curr_info.get_avl ()); + new_info.set_avl_source (curr_info.get_avl_source ()); + new_info.set_vtype (INTVAL (recog_data.operand[0])); + /* if this X0, X0 vsetvli is redundant, + remove it. */ + if (curr_info.compatible_vtype_p (new_info)) + remove_insn (insn); + } + else + { + /* vsetvli X0, X0 means that the following instruction + use the same vl as before. 
*/ + basic_block bb = BLOCK_FOR_INSN (insn); + rtx_insn *next_insn; + bool find_vl_p = false; + for (next_insn = NEXT_INSN (insn); insn != NEXT_INSN (BB_END (bb)); + next_insn = NEXT_INSN (next_insn)) + { + if (use_vtype_p (next_insn)) + { + vinfo next_info = compute_info_for_instr (next_insn, curr_info); + new_info.set_avl (next_info.get_avl ()); + new_info.set_avl_source (next_info.get_avl_source ()); + extract_insn_cached (insn); + new_info.set_vtype (INTVAL (recog_data.operand[0])); + + if (recog_clobber_vl_vtype (next_insn) != MOV_CLOBBER_REG_REG && + recog_clobber_vl_vtype (next_insn) != OTHERS) + new_info = vinfo::get_unknown (); + + find_vl_p = true; + break; + } + } + gcc_assert (find_vl_p); + } + return new_info; + } + if (recog_data.n_operands == 2) + { + gcc_assert (CONST_INT_P (recog_data.operand[1]) && + "Invalid vtype in vsetvli instruction."); + new_info.set_avl (recog_data.operand[0]); + new_info.set_avl_source (get_avl_source (recog_data.operand[0], insn)); + new_info.set_vtype (INTVAL (recog_data.operand[1])); + return new_info; + } + + gcc_assert (recog_data.n_operands == 3); + rtx vl = recog_data.operand[1]; + rtx vtype = recog_data.operand[2]; + gcc_assert (CONST_INT_P (vtype) && "Invalid vtype in vsetvli instruction."); + new_info.set_avl (vl); + new_info.set_avl_source (get_avl_source (vl, insn)); + new_info.set_vtype (INTVAL (vtype)); + return new_info; +} + +static unsigned int +analyze_vma_vta (rtx_insn *insn, vinfo curr_info) +{ + if (!use_vl_p (insn)) + return 1; + + if (recog_clobber_vl_vtype (insn) != OTHERS) + return 1; + + if (use_vlmax_p (insn)) + return 1; + unsigned int offset = get_policy_offset (insn); + extract_insn_cached (insn); + vector_policy vma = + riscv_vector::get_vma (INTVAL (recog_data.operand[recog_data.n_operands - offset])); + vector_policy vta = + riscv_vector::get_vta (INTVAL (recog_data.operand[recog_data.n_operands - offset])); + unsigned int vma_p = 0; + unsigned int vta_p = 0; + if (vma == vector_policy::agnostic) + vma_p = 1; + else if (vma == vector_policy::undisturbed) + vma_p = 0; + else + { + /* For N/A vma we remain the last vma if it valid. */ + if (curr_info.valid_p () && !curr_info.unknown_p ()) + vma_p = curr_info.get_vma (); + else + vma_p = 0; + } + + if (vta == vector_policy::agnostic) + vta_p = 1; + else if (vta == vector_policy::undisturbed) + vta_p = 0; + else + { + /* For N/A vta we remain the last vta if it valid. 
*/ + if (curr_info.valid_p () && !curr_info.unknown_p ()) + vta_p = curr_info.get_vta (); + else + vta_p = 1; + } + return (vma_p << 1) | vta_p; +} + +static bool +scalar_move_insn_p (rtx_insn *insn) +{ + return insn && INSN_P (insn) && recog_memoized (insn) >= 0 && + (get_attr_type (insn) == TYPE_VMV_S_X || + get_attr_type (insn) == TYPE_VFMV_S_F); +} + +static bool +store_insn_p (rtx_insn *insn) +{ + return insn && INSN_P (insn) && recog_memoized (insn) >= 0 && + (get_attr_type (insn) == TYPE_VSE || + get_attr_type (insn) == TYPE_VSSE); +} + +static bool +can_skip_load_store_insn_p (rtx_insn *insn) +{ + return insn && INSN_P (insn) && recog_memoized (insn) >= 0 && + (get_attr_type (insn) == TYPE_VSE || + get_attr_type (insn) == TYPE_VSSE || + get_attr_type (insn) == TYPE_VLE || + get_attr_type (insn) == TYPE_VLSE); +} + +static vinfo +compute_info_for_instr (rtx_insn *insn, vinfo curr_info) +{ + vinfo info; + + extract_insn_cached (insn); + + if (use_vl_p (insn)) + { + if (recog_clobber_vl_vtype (insn) != OTHERS) + info.set_avl (gen_rtx_REG (Pmode, X0_REGNUM)); + else if (use_vlmax_p (insn)) + info.set_avl (gen_rtx_REG (Pmode, X0_REGNUM)); + else + { + unsigned int offset = get_vl_offset (insn); + info.set_avl_source (get_avl_source ( + recog_data.operand[recog_data.n_operands - offset], insn)); + info.set_avl (recog_data.operand[recog_data.n_operands - offset]); + } + } + else + info.set_avl (NULL_RTX); + + machine_mode mode = rvv_translate_attr_mode (insn); + bool st_p = store_insn_p (insn); + bool scalar_move_p = scalar_move_insn_p (insn); + + unsigned int vma_vta = analyze_vma_vta (insn, curr_info); + unsigned int vta = vma_vta & 0x1; + unsigned int vma = (vma_vta >> 1) & 0x1; + info.set_vtype (rvv_classify_vlmul_field (mode), + rvv_classify_vsew_field (mode), + /*TailAgnostic*/ vta, /*MaskAgnostic*/ vma, + rvv_mask_mode_p (mode), st_p, scalar_move_p); + + return info; +} + +static bool +can_skip_vsetvli_for_load_store_p (rtx_insn *insn, const vinfo &new_info, const vinfo &curr_info) +{ + gcc_assert (recog_memoized (insn) >= 0); + if (!can_skip_load_store_insn_p (insn)) + return false; + machine_mode mode = rvv_translate_attr_mode (insn); + unsigned vsew = rvv_classify_vsew_field (mode); + gcc_assert (store_insn_p (insn) == new_info.get_store_p ()); + return curr_info.load_store_compatible_p (vsew, new_info); +} + +static bool +need_vsetvli (rtx_insn *insn, const vinfo &require, const vinfo &curr_info) +{ + if (!need_vsetvli_p (insn)) + return false; + + if (curr_info.compatible_p (require)) + return false; + + // We didn't find a compatible value. If our AVL is a virtual register, + // it might be defined by a VSET(I)VLI. If it has the same VTYPE we need + // and the last VL/VTYPE we observed is the same, we don't need a + // VSETVLI here. 
+ if (!curr_info.unknown_p () && require.avl_reg_p () && + REGNO (require.get_avl ()) >= FIRST_PSEUDO_REGISTER && + !curr_info.get_sew_lmul_ratio_only_p () && + curr_info.compatible_vtype_p (require)) + { + rtx_insn *def_rtl = fetch_def_insn (insn, require); + if (def_rtl != NULL) + { + if (is_vector_config_instr (def_rtl)) + { + vinfo def_info = get_info_for_vsetvli (def_rtl, curr_info); + if (def_info.avl_equal_p (curr_info) && + def_info.vlmax_equal_p (curr_info)) + return false; + } + } + } + + return true; +} + +static bool +need_vsetvli_phi (const vinfo &new_info, rtx_insn *rtl) +{ + /* Optimize the case as follows: + void foo (int8_t *base, int8_t* out, size_t vl, unsigned int m) + { + vint8mf8_t v0; + size_t avl; + if (m > 1000) + avl = vsetvl_e8mf8 (vl); + else + avl = vsetvl_e8mf8 (vl << 2); + for (int i = 0; i < m; i++) + { + v0 = vle8_v_i8mf8 (base + i * 32,avl); + v0 = vadd_vv_i8mf8 (v0,v0,avl); + } + *(vint8mf8_t*)out = v0; + } */ + + /* We need use rtl ssa phi to optimize which needs + optimization to large than or equal to 2. */ + if (optimize < 2) + return true; + + if (!(!new_info.unknown_p () && new_info.get_avl () && + GET_CODE (new_info.get_avl ()) == REG)) + return true; + + rtx avl = new_info.get_avl (); + + insn_info *next; + /* fetch phi_node. */ + for (insn_info *insn = crtl->ssa->first_insn (); insn; insn = next) + { + next = insn->next_any_insn (); + if (insn->rtl () == rtl) + { + bb_info *bb = insn->bb (); + ebb_info *ebb = bb->ebb (); + resource_info resource{GET_MODE (avl), REGNO (avl)}; + insn_info *phi_insn = ebb->phi_insn (); + phi_info *phi; + def_lookup dl = crtl->ssa->find_def (resource, phi_insn); + def_info *set = dl.prev_def (phi_insn); + + if (!set) + return true; + + if (!is_a (set)) + return true; + + // There is an existing phi. + phi = as_a (set); + for (unsigned int i = 0; i < phi->num_inputs (); i++) + { + def_info *def = phi->input_value (i); + if (!def) + return true; + insn_info *def_insn = def->insn (); + rtx_insn *def_rtl = def_insn->rtl (); + + if (!def_rtl) + return true; + if (!INSN_P (def_rtl)) + return true; + extract_insn_cached (def_rtl); + if (recog_data.n_operands > 0 && + rtx_equal_p (recog_data.operand[0], avl)) + { + if (get_attr_type (def_rtl) && + get_attr_type (def_rtl) == TYPE_VSETVL) + { + basic_block def_bb = BLOCK_FOR_INSN (def_rtl); + bb_vinfo info = bb_vinfo_map.at(def_bb->index); + // If the exit from the predecessor has the VTYPE + // we are looking for we might be able to avoid a + // VSETVLI. + if (info.exit.unknown_p () || + !info.exit.vtype_equal_p (new_info)) + return true; + // We found a VSET(I)VLI make sure it matches the + // output of the predecessor block. + vinfo curr_info; + vinfo avl_def_info = + get_info_for_vsetvli (def_rtl, curr_info); + if (!avl_def_info.vtype_equal_p (info.exit) || + !avl_def_info.avl_equal_p (info.exit)) + return true; + } + else + return true; + } + } + } + } + + // If all the incoming values to the PHI checked out, we don't need + // to insert a VSETVLI. + return false; +} + +static bool +compute_vl_vtype_changes (basic_block bb) +{ + bool vector_p = false; + + bb_vinfo &info = bb_vinfo_map[bb->index]; + info.change = info.pred; + rtx_insn *insn = NULL; + vinfo curr_info; + + FOR_BB_INSNS (bb, insn) + { + // If this is an explicit VSETVLI or VSETIVLI, update our state. 
+ if (is_vector_config_instr (insn)) + { + vector_p = true; + info.change = get_info_for_vsetvli (insn, curr_info); + curr_info = info.change; + continue; + } + + /* According to vector.md, each instruction pattern parallel. + It should have at least 2 side effects. + The last 2 side effects are use vl && use vtype */ + if (use_vtype_p (insn)) + { + vector_p = true; + + vinfo new_info = compute_info_for_instr (insn, curr_info); + curr_info = new_info; + if (!info.change.valid_p ()) + info.change = new_info; + else + { + // If this instruction isn't compatible with the previous VL/VTYPE + // we need to insert a VSETVLI. + // If this is a unit-stride or strided load/store, we may be able + // to use the EMUL=(EEW/SEW)*LMUL relationship to avoid changing + // vtype. NOTE: We only do this if the vtype we're comparing + // against was created in this block. We need the first and third + // phase to treat the store the same way. + if (!can_skip_vsetvli_for_load_store_p (insn, new_info, info.change) && + need_vsetvli (insn, new_info, info.change)) + info.change = new_info; + } + } + // If this is something that updates VL/VTYPE that we don't know about, set + // the state to unknown. + if (update_vl_vtype_p (insn)) + { + curr_info = vinfo::get_unknown (); + info.change = vinfo::get_unknown (); + } + } + + return vector_p; +} + +static void +compute_incoming_vl_vtype (const basic_block bb) +{ + bb_vinfo &info = bb_vinfo_map[bb->index]; + info.inqueue = false; + + vinfo in_info; + if (EDGE_COUNT (bb->preds) == 0) + { + // There are no predecessors, so use the default starting status. + in_info.set_unknown (); + } + else + { + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->preds) + { + basic_block ancestor = e->src; + in_info = in_info.intersect (bb_vinfo_map.at(ancestor->index).exit); + } + } + + // If we don't have any valid predecessor value, wait until we do. + if (!in_info.valid_p ()) + return; + + // If no change, no need to rerun block + if (in_info == info.pred) + return; + + info.pred = in_info; + if (dump_file) + { + fprintf (dump_file, "Entry state of bb %d changed to\n", bb->index); + info.pred.print (); + } + + // Note: It's tempting to cache the state changes here, but due to the + // compatibility checks performed a blocks output state can change based on + // the input state. To cache, we'd have to add logic for finding + // never-compatible state changes. + compute_vl_vtype_changes (bb); + vinfo tmpstatus = info.change; + + // If the new exit value matches the old exit value, we don't need to revisit + // any blocks. + if (info.exit == tmpstatus) + return; + + info.exit = tmpstatus; + + if (dump_file) + { + fprintf (dump_file, "Exit state of bb %d changed to\n", bb->index); + info.exit.print (); + } + // Add the successors to the work list so we can propagate the changed exit + // status. 
+ edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->succs) + { + basic_block succ = e->dest; + if (!bb_vinfo_map[succ->index].inqueue) + bb_queue.push_back (succ); + } +} + +static void +insert_vsetvl (rtx_insn *insn, const vinfo &curr_info, const vinfo &prev_info) +{ + extract_insn_cached (insn); + rtx avl = curr_info.get_avl (); + rtx vtype = GEN_INT (curr_info.encode_vtype ()); + rtx zero = gen_rtx_REG (Pmode, X0_REGNUM); + + if (recog_clobber_vl_vtype (insn) == MOV_CLOBBER_REG_MEM + || recog_clobber_vl_vtype (insn) == MOV_CLOBBER_MEM_REG) + { + gcc_assert ( + reload_completed && + rtx_equal_p (curr_info.get_avl (), gen_rtx_REG (Pmode, X0_REGNUM))); + avl = recog_data.operand[2]; + PUT_MODE (avl, Pmode); + emit_vsetvl_insn (avl, gen_rtx_REG (Pmode, X0_REGNUM), vtype, insn); + return; + } + + // Use X0, X0 form if the AVL is the same and the SEW+LMUL gives the same + // VLMAX + if (prev_info.valid_p () && !prev_info.unknown_p () && + curr_info.avl_equal_p (prev_info) && curr_info.vlmax_equal_p (prev_info)) + { + emit_vsetvl_insn (zero, zero, vtype, insn); + return; + } + + if (curr_info.get_avl () == NULL_RTX) + { + if (prev_info.valid_p () && !prev_info.unknown_p () && + curr_info.vlmax_equal_p (prev_info)) + { + emit_vsetvl_insn (zero, zero, vtype, insn); + return; + } + // Otherwise use an AVL of 0 to avoid depending on previous vl. + emit_vsetvl_insn (zero, GEN_INT (0), vtype, insn); + return; + } + + if (rtx_equal_p (curr_info.get_avl (), gen_rtx_REG (Pmode, X0_REGNUM))) + { + if (reload_completed) + avl = gen_rtx_REG (Pmode, X0_REGNUM); + else + avl = gen_reg_rtx (Pmode); + emit_vsetvl_insn (avl, gen_rtx_REG (Pmode, X0_REGNUM), vtype, insn); + return; + } + + emit_vsetvl_insn (zero, avl, vtype, insn); +} + +static void +cleanup_insn_op (rtx_insn *insn) +{ + if (!reload_completed) + return; + + /* 1.Remove the vl operand for every rvv instruction. + 2.Replace every reload register spilling rvv instruction. */ + rtx pat; + extract_insn_cached (insn); + machine_mode mode = rvv_translate_attr_mode (insn); + if (recog_clobber_vl_vtype (insn) == MOV_CLOBBER_REG_MEM) + { + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) + pat = + gen_vlm (mode, recog_data.operand[0], + XEXP (recog_data.operand[1], 0), const0_rtx, const0_rtx); + else + pat = gen_vle (mode, recog_data.operand[0], const0_rtx, const0_rtx, + XEXP (recog_data.operand[1], 0), const0_rtx, const0_rtx); + + validate_change (insn, &PATTERN (insn), pat, false); + } + else if (recog_clobber_vl_vtype (insn) == MOV_CLOBBER_MEM_REG) + { + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) + pat = gen_vsm (mode, XEXP (recog_data.operand[0], 0), + recog_data.operand[1], const0_rtx, const0_rtx); + else + pat = gen_vse (mode, const0_rtx, XEXP (recog_data.operand[0], 0), + recog_data.operand[1], const0_rtx, const0_rtx); + + validate_change (insn, &PATTERN (insn), pat, false); + } + else + replace_op (insn, const0_rtx, REPLACE_VL); +} + +static void +emit_vsetvlis (const basic_block bb) +{ + vinfo curr_info; + rtx_insn *insn = NULL; + + FOR_BB_INSNS (bb, insn) + { + // If this is an explicit VSETVLI or VSETIVLI, update our state. + if (is_vector_config_instr (insn)) + { + curr_info = get_info_for_vsetvli (insn, curr_info); + continue; + } + + if (use_vtype_p (insn)) + { + vinfo new_info = compute_info_for_instr (insn, curr_info); + + if (!curr_info.valid_p ()) + { + // We haven't found any vector instructions or VL/VTYPE changes + // yet, use the predecessor information. 
+ curr_info = bb_vinfo_map[bb->index].pred; + gcc_assert (curr_info.valid_p () && + "Expected a valid predecessor state."); + if (need_vsetvli (insn, new_info, curr_info)) + { + // If this is the first implicit state change, and the state change + // requested can be proven to produce the same register contents, we + // can skip emitting the actual state change and continue as if we + // had since we know the GPR result of the implicit state change + // wouldn't be used and VL/VTYPE registers are correct. Note that + // we *do* need to model the state as if it changed as while the + // register contents are unchanged, the abstract model can change. + if (need_vsetvli_phi (new_info, insn)) + insert_vsetvl (insn, new_info, curr_info); + curr_info = new_info; + } + } + else + { + // If this instruction isn't compatible with the previous VL/VTYPE + // we need to insert a VSETVLI. + // If this is a unit-stride or strided load/store, we may be able + // to use the EMUL=(EEW/SEW)*LMUL relationship to avoid changing + // vtype. NOTE: We can't use predecessor information for the store. + // We must treat it the same as the first phase so that we produce + // the correct vl/vtype for succesor blocks. + if (!can_skip_vsetvli_for_load_store_p (insn, new_info, + curr_info) && + need_vsetvli (insn, new_info, curr_info)) + { + insert_vsetvl (insn, new_info, curr_info); + curr_info = new_info; + } + } + cleanup_insn_op (insn); + } + // If this is something updates VL/VTYPE that we don't know about, set + // the state to unknown. + if (update_vl_vtype_p (insn)) + curr_info = vinfo::get_unknown (); + + // If we reach the end of the block and our current info doesn't match the + // expected info, insert a vsetvli to correct. + if (insn == BB_END (bb)) + { + const vinfo exit_info = bb_vinfo_map.at(bb->index).exit; + if (curr_info.valid_p () && exit_info.valid_p () && + !exit_info.unknown_p () && curr_info != exit_info) + { + insert_vsetvl (insn, exit_info, curr_info); + curr_info = exit_info; + } + } + } +} + +static void +dolocalprepass (const basic_block bb) +{ + rtx_insn *insn = NULL; + vinfo curr_info = vinfo::get_unknown (); + FOR_BB_INSNS (bb, insn) + { + // If this is an explicit VSETVLI or VSETIVLI, update our state. + if (is_vector_config_instr (insn)) + { + curr_info = get_info_for_vsetvli (insn, curr_info); + continue; + } + + if (scalar_move_insn_p (insn)) + { + gcc_assert (use_vtype_p (insn) && use_vl_p (insn)); + const vinfo new_info = compute_info_for_instr (insn, curr_info); + + // For vmv.s.x and vfmv.s.f, there are only two behaviors, VL = 0 and + // VL > 0. We can discard the user requested AVL and just use the last + // one if we can prove it equally zero. This removes a vsetvli entirely + // if the types match or allows use of cheaper avl preserving variant + // if VLMAX doesn't change. If VLMAX might change, we couldn't use + // the 'vsetvli x0, x0, vtype" variant, so we avoid the transform to + // prevent extending live range of an avl register operand. + // TODO: We can probably relax this for immediates. 
+ if (((curr_info.has_nonzero_avl () && new_info.has_nonzero_avl ()) || + (curr_info.has_zero_avl () && new_info.has_zero_avl ())) && + new_info.vlmax_equal_p (curr_info)) + { + replace_op (insn, curr_info.get_avl (), REPLACE_VL); + curr_info = compute_info_for_instr (insn, curr_info); + continue; + } + } + + if (use_vtype_p (insn)) + { + if (use_vl_p (insn)) + { + const auto require = compute_info_for_instr (insn, curr_info); + // If the AVL is the result of a previous vsetvli which has the + // same AVL and VLMAX as our current state, we can reuse the AVL + // from the current state for the new one. This allows us to + // generate 'vsetvli x0, x0, vtype" or possible skip the transition + // entirely. + if (!curr_info.unknown_p () && require.get_avl () && + REG_P (require.get_avl ()) && + REGNO (require.get_avl ()) >= FIRST_PSEUDO_REGISTER) + { + rtx_insn *def_rtl = fetch_def_insn (insn, require); + + if (def_rtl != NULL) + { + if (is_vector_config_instr (def_rtl)) + { + vinfo def_info = get_info_for_vsetvli (def_rtl, curr_info); + if (def_info.avl_equal_p (curr_info) && + def_info.vlmax_equal_p (curr_info)) + { + replace_op (insn, curr_info.get_avl (), REPLACE_VL); + curr_info = compute_info_for_instr (insn, curr_info); + continue; + } + } + } + } + + // If AVL is defined by a vsetvli with the same vtype, we can + // replace the AVL operand with the AVL of the defining vsetvli. + // We avoid general register AVLs to avoid extending live ranges + // without being sure we can kill the original source reg entirely. + // TODO: We can ignore policy bits here, we only need VL to be the + // same. + if (!curr_info.unknown_p () && require.get_avl () && + REG_P (require.get_avl ()) && + REGNO (require.get_avl ()) >= FIRST_PSEUDO_REGISTER) + { + rtx_insn *def_rtl = fetch_def_insn (insn, require); + if (def_rtl != NULL) + { + if (is_vector_config_instr (def_rtl)) + { + vinfo def_info = get_info_for_vsetvli (def_rtl, curr_info); + if (def_info.vtype_equal_p (require) && + (def_info.avl_const_p () || + (def_info.avl_reg_p () && + rtx_equal_p (def_info.get_avl (), gen_rtx_REG (Pmode, X0_REGNUM))))) + { + replace_op (insn, def_info.get_avl (), REPLACE_VL); + curr_info = compute_info_for_instr (insn, curr_info); + continue; + } + } + } + } + } + curr_info = compute_info_for_instr (insn, curr_info); + continue; + } + + // If this is something that updates VL/VTYPE that we don't know about, + // set the state to unknown. + if (update_vl_vtype_p (insn)) + curr_info = vinfo::get_unknown (); + } +} + +static void +dolocalpostpass (const basic_block bb) +{ + rtx_insn *prev_insn = nullptr; + rtx_insn *insn = nullptr; + bool used_vl = false, used_vtype = false; + std::vector to_delete; + FOR_BB_INSNS (bb, insn) + { + // Note: Must be *before* vsetvli handling to account for config cases + // which only change some subfields. + if (update_vl_vtype_p (insn) || use_vl_p (insn)) + used_vl = true; + if (update_vl_vtype_p (insn) || use_vtype_p (insn)) + used_vtype = true; + + if (!is_vector_config_instr (insn)) + continue; + + extract_insn_cached (insn); + if (prev_insn) + { + if (!used_vl && !used_vtype) + { + to_delete.push_back (prev_insn); + // fallthrough + } + else if (!used_vtype && is_vl_preserving_config (insn)) + { + // Note: `vsetvli x0, x0, vtype' is the canonical instruction + // for this case. If you find yourself wanting to add other forms + // to this "unused VTYPE" case, we're probably missing a + // canonicalization earlier. 
+ // Note: We don't need to explicitly check vtype compatibility + // here because this form is only legal (per ISA) when not + // changing VL. + rtx new_vtype = recog_data.operand[recog_data.n_operands - 1]; + replace_op (prev_insn, new_vtype, REPLACE_VTYPE); + to_delete.push_back (insn); + // Leave prev_insn unchanged + continue; + } + } + prev_insn = insn; + used_vl = false; + used_vtype = false; + + rtx vdef = recog_data.operand[0]; + if (!rtx_equal_p (vdef, gen_rtx_REG (Pmode, X0_REGNUM)) && + !(REGNO (vdef) >= FIRST_PSEUDO_REGISTER && + (find_reg_note (insn, REG_UNUSED, vdef) || + find_reg_note (insn, REG_DEAD, vdef)))) + used_vl = true; + } + + for (auto *to_remove : to_delete) + remove_insn (to_remove); +} + +/// Return true if the VL value configured must be equal to the requested one. +static bool +has_fixed_result (const vinfo &info) +{ + if (!info.avl_const_p ()) + // VLMAX is always the same value. + // TODO: Could extend to other registers by looking at the associated + // vreg def placement. + return rtx_equal_p (info.get_avl (), gen_rtx_REG (Pmode, X0_REGNUM)); + + if (VLMUL_FIELD_000 != info.get_vlmul ()) + // TODO: Generalize the code below to account for LMUL + return false; + + if (!BYTES_PER_RISCV_VECTOR.is_constant ()) + return false; + + unsigned int avl = INTVAL (info.get_avl ()); + unsigned int vsew = info.get_vsew (); + machine_mode inner = vsew_to_int_mode (vsew); + unsigned int sew = GET_MODE_BITSIZE (as_a (inner)); + unsigned avl_in_bits = avl * sew; + machine_mode mode = riscv_vector::vector_builtin_mode ( + as_a (inner), info.get_vlmul ()); + return GET_MODE_BITSIZE (mode).to_constant () >= avl_in_bits; +} + +/// Perform simple partial redundancy elimination of the VSETVLI instructions +/// we're about to insert by looking for cases where we can PRE from the +/// beginning of one block to the end of one of its predecessors. Specifically, +/// this is geared to catch the common case of a fixed length vsetvl in a single +/// block loop when it could execute once in the preheader instead. +static void +dopre (const basic_block bb) +{ + if (!bb_vinfo_map[bb->index].pred.unknown_p ()) + return; + + basic_block unavailable_pred = nullptr; + vinfo available_info; + + edge e; + edge_iterator ei; + FOR_EACH_EDGE (e, ei, bb->preds) + { + basic_block predecessor = e->src; + const vinfo &pred_info = bb_vinfo_map[predecessor->index].exit; + if (pred_info.unknown_p ()) + { + if (unavailable_pred) + return; + unavailable_pred = predecessor; + } + else if (!available_info.valid_p ()) + available_info = pred_info; + else if (available_info != pred_info) + return; + } + + // unreachable, single pred, or full redundancy. Note that FRE + // is handled by phase 3. + if (!unavailable_pred || !available_info.valid_p ()) + return; + + // critical edge - TODO: consider splitting? + if (EDGE_COUNT (unavailable_pred->succs) != 1) + return; + + // If VL can be less than AVL, then we can't reduce the frequency of exec. + if (!has_fixed_result (available_info)) + return; + + // Does it actually let us remove an implicit transition in MBB? + bool found = false; + rtx_insn *insn; + vinfo curr_info; + FOR_BB_INSNS (bb, insn) + { + if (is_vector_config_instr (insn)) + return; + + if (use_vtype_p (insn)) + { + if (available_info != compute_info_for_instr (insn, curr_info)) + return; + found = true; + break; + } + } + + if (!found) + return; + + // Finally, update both data flow state and insert the actual vsetvli. 
+ // Doing both keeps the code in sync with the dataflow results, which + // is critical for correctness of phase 3. + auto old_info = bb_vinfo_map[unavailable_pred->index].exit; + if (dump_file) + { + fprintf (dump_file, "PRE VSETVLI from bb %d changed to bb %d\n", bb->index, unavailable_pred->index); + available_info.print (); + } + bb_vinfo_map[unavailable_pred->index].exit = available_info; + bb_vinfo_map[bb->index].pred = available_info; + + // Note there's an implicit assumption here that terminators never use + // or modify VL or VTYPE. Also, fallthrough will return end(). + auto insert_pt = BB_END (unavailable_pred); + insert_vsetvl (insert_pt, available_info, old_info); +} + +static unsigned int +rest_of_handle_insert_vsetvl (function *fn) +{ + basic_block bb; + + if (n_basic_blocks_for_fn (fn) <= 0) + return 0; + + gcc_assert (bb_vinfo_map.empty () && "Expect empty block infos."); + + if (optimize >= 2) + { + // Initialization. + calculate_dominance_info (CDI_DOMINATORS); + df_analyze (); + crtl->ssa = new rtl_ssa::function_info (cfun); + } + + if (dump_file) + fprintf (dump_file, "\nEntering InsertVSETVLI for %s\n\n", + current_function_name ()); + + /* Initialize Basic Block Map */ + FOR_ALL_BB_FN (bb, fn) + { + bb_vinfo bb_init; + bb_vinfo_map.insert (std::pair (bb->index, bb_init)); + } + + // Scan the block locally for cases where we can mutate the operands + // of the instructions to reduce state transitions. Critically, this + // must be done before we start propagating data flow states as these + // transforms are allowed to change the contents of VTYPE and VL so + // long as the semantics of the program stays the same. + FOR_ALL_BB_FN (bb, fn) + dolocalprepass (bb); + + bool vector_p = false; + + if (dump_file) + fprintf ( + dump_file, + "Phase 1 determine how VL/VTYPE are affected by the each block:\n"); + + // Phase 1 - determine how VL/VTYPE are affected by the each block. + FOR_ALL_BB_FN (bb, fn) + { + vector_p |= compute_vl_vtype_changes (bb); + bb_vinfo &info = bb_vinfo_map[bb->index]; + info.exit = info.change; + if (dump_file) + { + fprintf (dump_file, "Initial exit state of bb %d\n", bb->index); + info.exit.print (); + } + } + + if (!vector_p) + { + bb_vinfo_map.clear (); + bb_queue.clear (); + if (optimize >= 2) + { + // Finalization. + free_dominance_info (CDI_DOMINATORS); + if (crtl->ssa->perform_pending_updates ()) + cleanup_cfg (0); + + delete crtl->ssa; + crtl->ssa = nullptr; + } + return 0; + } + + if (dump_file) + fprintf (dump_file, + "Phase 2 determine the exit VL/VTYPE from each block:\n"); + // Phase 2 - determine the exit VL/VTYPE from each block. We add all + // blocks to the list here, but will also add any that need to be + // revisited during Phase 2 processing. + FOR_ALL_BB_FN (bb, fn) + { + bb_queue.push_back (bb); + bb_vinfo_map[bb->index].inqueue = true; + } + while (!bb_queue.empty ()) + { + bb = bb_queue.front (); + bb_queue.pop_front (); + compute_incoming_vl_vtype (bb); + } + + // Perform partial redundancy elimination of vsetvli transitions. + FOR_ALL_BB_FN (bb, fn) + dopre (bb); + + if (dump_file) + fprintf (dump_file, + "Phase 3 add any vsetvli instructions needed in the block:\n"); + // Phase 3 - add any vsetvli instructions needed in the block. Use the + // Phase 2 information to avoid adding vsetvlis before the first vector + // instruction in the block if the VL/VTYPE is satisfied by its + // predecessors. 
+ FOR_ALL_BB_FN (bb, fn) + emit_vsetvlis (bb); + + // Now that all vsetvlis are explicit, go through and do block local + // DSE and peephole based demanded fields based transforms. Note that + // this *must* be done outside the main dataflow so long as we allow + // any cross block analysis within the dataflow. We can't have both + // demanded fields based mutation and non-local analysis in the + // dataflow at the same time without introducing inconsistencies. + FOR_ALL_BB_FN (bb, fn) + dolocalpostpass(bb); + + // Once we're fully done rewriting all the instructions, do a final pass + // through to check for VSETVLIs which write to an unused destination. + // For the non X0, X0 variant, we can replace the destination register + // with X0 to reduce register pressure. This is really a generic + // optimization which can be applied to any dead def (TODO: generalize). + if (!reload_completed) + { + FOR_ALL_BB_FN (bb, fn) + { + rtx_insn *insn = NULL; + FOR_BB_INSNS (bb, insn) + { + if (is_vector_config_instr (insn)) + { + extract_insn_cached (insn); + if (recog_data.n_operands == 3 && + !rtx_equal_p (recog_data.operand[0], + gen_rtx_REG (Pmode, X0_REGNUM)) && + !rtx_equal_p (recog_data.operand[1], + gen_rtx_REG (Pmode, X0_REGNUM)) && + (find_reg_note (insn, REG_UNUSED, recog_data.operand[0]) || + find_reg_note (insn, REG_DEAD, recog_data.operand[0]))) + { + rtx pat = gen_vsetvl_zero (Pmode, recog_data.operand[1], + recog_data.operand[2]); + validate_change (insn, &PATTERN (insn), pat, false); + } + } + } + } + } + + bb_vinfo_map.clear (); + bb_queue.clear (); + + if (optimize >= 2) + { + // Finalization. + free_dominance_info (CDI_DOMINATORS); + if (crtl->ssa->perform_pending_updates ()) + cleanup_cfg (0); + + delete crtl->ssa; + crtl->ssa = nullptr; + } + + return 0; +} + +const pass_data pass_data_insert_vsetvl = { + RTL_PASS, /* type */ + "insert_vsetvl", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_NONE, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_insert_vsetvl : public rtl_opt_pass +{ +public: + pass_insert_vsetvl (gcc::context *ctxt) + : rtl_opt_pass (pass_data_insert_vsetvl, ctxt) + { + } + + /* opt_pass methods: */ + virtual bool + gate (function *) + { + return TARGET_VECTOR; + } + virtual unsigned int + execute (function *fn) + { + return rest_of_handle_insert_vsetvl (fn); + } + +}; // class pass_insert_vsetvl + +rtl_opt_pass * +make_pass_insert_vsetvl (gcc::context *ctxt) +{ + return new pass_insert_vsetvl (ctxt); +} + +const pass_data pass_data_insert_vsetvl2 = { + RTL_PASS, /* type */ + "insert_vsetvl2", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_NONE, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_insert_vsetvl2 : public rtl_opt_pass +{ +public: + pass_insert_vsetvl2 (gcc::context *ctxt) + : rtl_opt_pass (pass_data_insert_vsetvl2, ctxt) + { + } + + /* opt_pass methods: */ + virtual bool + gate (function *) + { + return TARGET_VECTOR; + } + virtual unsigned int + execute (function *fn) + { + return rest_of_handle_insert_vsetvl (fn); + } + +}; // class pass_insert_vsetvl2 + +rtl_opt_pass * +make_pass_insert_vsetvl2 (gcc::context *ctxt) +{ + return new pass_insert_vsetvl2 (ctxt); +} \ No newline at end of file diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h index 
d99b8dcbaf1..1c42f6297f9 100644 --- a/gcc/config/riscv/riscv-opts.h +++ b/gcc/config/riscv/riscv-opts.h @@ -81,6 +81,18 @@ enum riscv_vector_bits_enum RVV_4096 = 4096 }; +enum vsew_field_enum +{ + VSEW_FIELD_000, /* SEW = 8 */ + VSEW_FIELD_001, /* SEW = 16 */ + VSEW_FIELD_010, /* SEW = 32 */ + VSEW_FIELD_011, /* SEW = 64 */ + VSEW_FIELD_100, /* SEW = 128 */ + VSEW_FIELD_101, /* SEW = 256 */ + VSEW_FIELD_110, /* SEW = 512 */ + VSEW_FIELD_111 /* SEW = 1024 */ +}; + enum vlmul_field_enum { VLMUL_FIELD_000, /* LMUL = 1 */ diff --git a/gcc/config/riscv/riscv-passes.def b/gcc/config/riscv/riscv-passes.def index 23ef8ac6114..282a0402485 100644 --- a/gcc/config/riscv/riscv-passes.def +++ b/gcc/config/riscv/riscv-passes.def @@ -18,3 +18,5 @@ . */ INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs); +INSERT_PASS_AFTER (pass_split_all_insns, 1, pass_insert_vsetvl); +INSERT_PASS_BEFORE (pass_sched2, 1, pass_insert_vsetvl2); diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index cae2974b54f..9a7e120854a 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -96,6 +96,8 @@ extern std::string riscv_arch_str (bool version_p = true); extern bool riscv_hard_regno_rename_ok (unsigned, unsigned); rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); +rtl_opt_pass * make_pass_insert_vsetvl (gcc::context *ctxt); +rtl_opt_pass * make_pass_insert_vsetvl2 (gcc::context *ctxt); /* Information about one CPU we know about. */ struct riscv_cpu_info { @@ -112,15 +114,32 @@ struct riscv_cpu_info { extern const riscv_cpu_info *riscv_find_cpu (const char *); /* Routines implemented in riscv-vector.cc. */ +extern bool rvv_mask_mode_p (machine_mode); extern bool rvv_mode_p (machine_mode); extern bool rvv_legitimate_poly_int_p (rtx); extern unsigned int rvv_offset_temporaries (bool, poly_int64); +extern enum vsew_field_enum rvv_classify_vsew_field (machine_mode); extern enum vlmul_field_enum rvv_classify_vlmul_field (machine_mode); extern unsigned int rvv_parse_vsew_field (unsigned int); extern unsigned int rvv_parse_vlmul_field (unsigned int); extern bool rvv_parse_vta_field (unsigned int); extern bool rvv_parse_vma_field (unsigned int); extern int rvv_regsize (machine_mode); +extern rtx rvv_gen_policy (unsigned int rvv_policy = 0); +extern opt_machine_mode rvv_get_mask_mode (machine_mode); +extern machine_mode rvv_translate_attr_mode (rtx_insn *); +extern void +emit_op5 ( + unsigned int unspec, + machine_mode Vmode, machine_mode VSImode, machine_mode VMSImode, + machine_mode VSUBmode, + rtx *operands, + rtx (*gen_vx) (rtx, rtx, rtx, rtx, rtx), + rtx (*gen_vx_32bit) (rtx, rtx, rtx, rtx, rtx), + rtx (*gen_vv) (rtx, rtx, rtx, rtx, rtx), + bool (*imm_p) (rtx), + int i, bool reverse +); /* We classify builtin types into two classes: 1. 
General builtin class which is using the diff --git a/gcc/config/riscv/riscv-vector-builtins-iterators.def b/gcc/config/riscv/riscv-vector-builtins-iterators.def index cc968f5534f..77a391c7630 100644 --- a/gcc/config/riscv/riscv-vector-builtins-iterators.def +++ b/gcc/config/riscv/riscv-vector-builtins-iterators.def @@ -7,6 +7,38 @@ #define DEF_RISCV_ARG_MODE_ATTR(A, B, C, D, E) #endif +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(V, 31) +DEF_RISCV_ARG_MODE_ATTR(V, 0, VNx2QI, VNx2QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 1, VNx4QI, VNx4QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 2, VNx8QI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 3, VNx16QI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 4, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 5, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 6, VNx128QI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 7, VNx2HI, VNx2HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 8, VNx4HI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 9, VNx8HI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 10, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 11, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 12, VNx64HI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 13, VNx2SI, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 14, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 15, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 16, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 17, VNx32SI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 18, VNx2DI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 19, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 20, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 21, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V, 22, VNx2SF, VNx2SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 23, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 24, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 25, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 26, VNx32SF, VNx32SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 27, VNx2DF, VNx2DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 28, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 29, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(V, 30, VNx16DF, VNx16DF, TARGET_DOUBLE_FLOAT) DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VI, 22) DEF_RISCV_ARG_MODE_ATTR(VI, 0, VNx2QI, VNx2QI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(VI, 1, VNx4QI, VNx4QI, TARGET_ANY) @@ -30,6 +62,210 @@ DEF_RISCV_ARG_MODE_ATTR(VI, 18, VNx2DI, VNx2DI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(VI, 19, VNx4DI, VNx4DI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(VI, 20, VNx8DI, VNx8DI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(VI, 21, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VF, 9) +DEF_RISCV_ARG_MODE_ATTR(VF, 0, VNx2SF, VNx2SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 1, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 2, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 3, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 4, VNx32SF, VNx32SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 5, VNx2DF, VNx2DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 6, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 7, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VF, 8, VNx16DF, VNx16DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VB, 7) +DEF_RISCV_ARG_MODE_ATTR(VB, 0, VNx2BI, VNx2BI, 
TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VB, 1, VNx4BI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VB, 2, VNx8BI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VB, 3, VNx16BI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VB, 4, VNx32BI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VB, 5, VNx64BI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VB, 6, VNx128BI, VNx128BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VFULL, 24) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 0, VNx16QI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 1, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 2, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 3, VNx128QI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 4, VNx8HI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 5, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 6, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 7, VNx64HI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 8, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 9, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 10, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 11, VNx32SI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 12, VNx2DI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 13, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 14, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 15, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 16, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 17, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 18, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 19, VNx32SF, VNx32SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 20, VNx2DF, VNx2DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 21, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 22, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VFULL, 23, VNx16DF, VNx16DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VPARTIAL, 7) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 0, VNx2QI, VNx2QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 1, VNx4QI, VNx4QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 2, VNx8QI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 3, VNx2HI, VNx2HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 4, VNx4HI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 5, VNx2SI, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VPARTIAL, 6, VNx2SF, VNx2SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(V64BITI, 4) +DEF_RISCV_ARG_MODE_ATTR(V64BITI, 0, VNx2DI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V64BITI, 1, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V64BITI, 2, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(V64BITI, 3, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VM, 69) +DEF_RISCV_ARG_MODE_ATTR(VM, 0, VNx2BI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 1, VNx4BI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 2, VNx8BI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 3, VNx16BI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 4, VNx32BI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 5, VNx64BI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 6, VNx128BI, VNx128BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 7, VNx2QI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 8, VNx4QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 9, VNx8QI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 10, 
VNx16QI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 11, VNx32QI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 12, VNx64QI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 13, VNx128QI, VNx128BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 14, VNx2HI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 15, VNx4HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 16, VNx8HI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 17, VNx16HI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 18, VNx32HI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 19, VNx64HI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 20, VNx2SI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 21, VNx4SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 22, VNx8SI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 23, VNx16SI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 24, VNx32SI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 25, VNx2DI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 26, VNx4DI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 27, VNx8DI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 28, VNx16DI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 29, VNx2SF, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 30, VNx4SF, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 31, VNx8SF, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 32, VNx16SF, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 33, VNx32SF, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 34, VNx2DF, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 35, VNx4DF, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 36, VNx8DF, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 37, VNx16DF, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 38, VNx2QI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 39, VNx4QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 40, VNx8QI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 41, VNx16QI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 42, VNx32QI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 43, VNx64QI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 44, VNx128QI, VNx128BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 45, VNx2HI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 46, VNx4HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 47, VNx8HI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 48, VNx16HI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 49, VNx32HI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 50, VNx64HI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 51, VNx2SI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 52, VNx4SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 53, VNx8SI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 54, VNx16SI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 55, VNx32SI, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 56, VNx2DI, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 57, VNx4DI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 58, VNx8DI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 59, VNx16DI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 60, VNx2SF, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 61, VNx4SF, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 62, VNx8SF, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 63, VNx16SF, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 64, VNx32SF, VNx32BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 65, VNx2DF, VNx2BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 66, VNx4DF, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 67, VNx8DF, VNx8BI, 
TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VM, 68, VNx16DF, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VSUB, 31) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 0, VNx2QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 1, VNx4QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 2, VNx8QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 3, VNx16QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 4, VNx32QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 5, VNx64QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 6, VNx128QI, QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 7, VNx2HI, HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 8, VNx4HI, HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 9, VNx8HI, HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 10, VNx16HI, HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 11, VNx32HI, HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 12, VNx64HI, HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 13, VNx2SI, SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 14, VNx4SI, SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 15, VNx8SI, SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 16, VNx16SI, SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 17, VNx32SI, SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 18, VNx2DI, DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 19, VNx4DI, DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 20, VNx8DI, DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 21, VNx16DI, DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 22, VNx2SF, SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 23, VNx4SF, SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 24, VNx8SF, SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 25, VNx16SF, SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 26, VNx32SF, SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 27, VNx2DF, DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 28, VNx4DF, DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 29, VNx8DF, DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSUB, 30, VNx16DF, DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VDI_TO_VSI, 22) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 0, VNx2QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 1, VNx4QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 2, VNx8QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 3, VNx16QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 4, VNx32QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 5, VNx64QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 6, VNx128QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 7, VNx2HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 8, VNx4HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 9, VNx8HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 10, VNx16HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 11, VNx32HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 12, VNx64HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 13, VNx2SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 14, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 15, VNx8SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 16, VNx16SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 17, VNx32SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 18, VNx2DI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 19, VNx4DI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 20, VNx8DI, VNx16SI, TARGET_ANY) 
+DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI, 21, VNx16DI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VDI_TO_VSI_VM, 22) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 0, VNx2QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 1, VNx4QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 2, VNx8QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 3, VNx16QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 4, VNx32QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 5, VNx64QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 6, VNx128QI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 7, VNx2HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 8, VNx4HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 9, VNx8HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 10, VNx16HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 11, VNx32HI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 12, VNx64HI, VNx64BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 13, VNx2SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 14, VNx4SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 15, VNx8SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 16, VNx16SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 17, VNx32SI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 18, VNx2DI, VNx4BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 19, VNx4DI, VNx8BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 20, VNx8DI, VNx16BI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VDI_TO_VSI_VM, 21, VNx16DI, VNx32BI, TARGET_ANY) #undef DEF_RISCV_ARG_MODE_ATTR_VARIABLE #undef DEF_RISCV_ARG_MODE_ATTR diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index a9c8b290104..426490945dd 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -66,6 +66,7 @@ #include "tree-ssa-loop-niter.h" #include "rtx-vector-builder.h" #include "riscv-vector.h" +#include "riscv-vector-builtins.h" /* This file should be included last. */ #include "target-def.h" @@ -158,6 +159,38 @@ rvv_offset_temporaries (bool add_p, poly_int64 offset) return count + rvv_add_offset_1_temporaries (constant); } +/* Return the vsew field for a specific machine mode. */ + +enum vsew_field_enum +rvv_classify_vsew_field (machine_mode mode) +{ + switch (GET_MODE_INNER (mode)) + { + case E_QImode: + return VSEW_FIELD_000; + + case E_HImode: + return VSEW_FIELD_001; + + case E_SImode: + case E_SFmode: + return VSEW_FIELD_010; + + case E_DImode: + case E_DFmode: + return VSEW_FIELD_011; + + case E_TImode: + return VSEW_FIELD_100; + + default: + break; + } + + /* we don't care about VSEW for Mask */ + return VSEW_FIELD_000; +} + /* Return the vlmul field for a specific machine mode. */ enum vlmul_field_enum @@ -271,4 +304,339 @@ rvv_get_mask_mode (machine_mode mode) && rvv_mask_mode_p (mask_mode)) return mask_mode; return default_get_mask_mode (mode); +} + +/* Generate policy bitmap for a specific rvv_policy. */ +rtx +rvv_gen_policy (unsigned int) +{ + return riscv_vector::gen_any_policy (); +} + +/* Return machine mode for an insn type. 
*/ +machine_mode +rvv_translate_attr_mode (rtx_insn *insn) +{ + gcc_assert (recog_memoized (insn) >= 0); + + switch (get_attr_mode (insn)) + { +#define TRANSLATE_VECTOR_MODE(MODE) \ + case MODE_VNX##MODE: \ + return VNx##MODE##mode; + TRANSLATE_VECTOR_MODE (8QI) + TRANSLATE_VECTOR_MODE (4HI) + TRANSLATE_VECTOR_MODE (2SI) + TRANSLATE_VECTOR_MODE (2SF) + TRANSLATE_VECTOR_MODE (8BI) + TRANSLATE_VECTOR_MODE (4QI) + TRANSLATE_VECTOR_MODE (2HI) + TRANSLATE_VECTOR_MODE (4BI) + TRANSLATE_VECTOR_MODE (2QI) + TRANSLATE_VECTOR_MODE (2BI) + TRANSLATE_VECTOR_MODE (16QI) + TRANSLATE_VECTOR_MODE (8HI) + TRANSLATE_VECTOR_MODE (4SI) + TRANSLATE_VECTOR_MODE (2DI) + TRANSLATE_VECTOR_MODE (4SF) + TRANSLATE_VECTOR_MODE (2DF) + TRANSLATE_VECTOR_MODE (16BI) + TRANSLATE_VECTOR_MODE (32QI) + TRANSLATE_VECTOR_MODE (16HI) + TRANSLATE_VECTOR_MODE (8SI) + TRANSLATE_VECTOR_MODE (4DI) + TRANSLATE_VECTOR_MODE (8SF) + TRANSLATE_VECTOR_MODE (4DF) + TRANSLATE_VECTOR_MODE (32BI) + TRANSLATE_VECTOR_MODE (64QI) + TRANSLATE_VECTOR_MODE (32HI) + TRANSLATE_VECTOR_MODE (16SI) + TRANSLATE_VECTOR_MODE (8DI) + TRANSLATE_VECTOR_MODE (16SF) + TRANSLATE_VECTOR_MODE (8DF) + TRANSLATE_VECTOR_MODE (64BI) + TRANSLATE_VECTOR_MODE (128QI) + TRANSLATE_VECTOR_MODE (64HI) + TRANSLATE_VECTOR_MODE (32SI) + TRANSLATE_VECTOR_MODE (16DI) + TRANSLATE_VECTOR_MODE (32SF) + TRANSLATE_VECTOR_MODE (16DF) + TRANSLATE_VECTOR_MODE (128BI) + + default: + break; + } + + return VOIDmode; +} + +/* Return the vtype field for a specific machine mode. */ +static unsigned int +classify_vtype_field (machine_mode mode) +{ + unsigned int vlmul = rvv_classify_vlmul_field (mode); + unsigned int vsew = rvv_classify_vsew_field (mode); + unsigned int vtype = (vsew << 3) | (vlmul & 0x7) | 0x40; + return vtype; +} + +/* lmul = real_lmul * 8 + guarantee integer + e.g. + 1 => 1/8 + 2 => 1/4 + 4 => 1/2 + 8 => 1 + 16 => 2 + 32 => 4 + 64 => 8 + */ +static unsigned int +get_lmulx8 (machine_mode mode) +{ + unsigned int vlmul = rvv_classify_vlmul_field (mode); + switch (vlmul) + { + case VLMUL_FIELD_000: + return 8; + case VLMUL_FIELD_001: + return 16; + case VLMUL_FIELD_010: + return 32; + case VLMUL_FIELD_011: + return 64; + case VLMUL_FIELD_101: + return 1; + case VLMUL_FIELD_110: + return 2; + case VLMUL_FIELD_111: + return 4; + default: + gcc_unreachable (); + } +} + +/* Helper functions for handling sew=64 on RV32 system. */ +static rtx +force_reg_for_over_uimm (rtx vl) +{ + if (CONST_SCALAR_INT_P (vl) && INTVAL (vl) >= 32) + { + return force_reg (Pmode, vl); + } + + return vl; +} + +/* Helper functions for handling sew=64 on RV32 system. */ +static rtx +gen_vlx2 (rtx avl, machine_mode Vmode, machine_mode VSImode) +{ + if (rtx_equal_p (avl, gen_rtx_REG (Pmode, X0_REGNUM))) + { + return avl; + } + rtx i32vl = NULL_RTX; + if (CONST_SCALAR_INT_P (avl)) + { + unsigned int vlen_max; + unsigned int vlen_min; + if (riscv_vector_chunks.is_constant ()) + { + vlen_max = riscv_vector_chunks.to_constant () * 64; + vlen_min = vlen_max; + } + else + { + /* TODO: vlen_max will be supported as 65536 in the future. 
*/ + vlen_max = RVV_4096; + vlen_min = RVV_128; + } + unsigned int max_vlmax = (vlen_max / GET_MODE_UNIT_BITSIZE (Vmode) * get_lmulx8 (Vmode)) / 8; + unsigned int min_vlmax = (vlen_min / GET_MODE_UNIT_BITSIZE (Vmode) * get_lmulx8 (Vmode)) / 8; + + unsigned HOST_WIDE_INT avl_int = INTVAL (avl); + if (avl_int <= min_vlmax) + { + i32vl = gen_int_mode (2 * avl_int, SImode); + } + else if (avl_int >= 2 * max_vlmax) + { + // Just set i32vl to VLMAX in this situation + i32vl = gen_reg_rtx (Pmode); + unsigned int vtype = classify_vtype_field (VSImode); + emit_insn (gen_vsetvl (Pmode, i32vl, gen_rtx_REG (Pmode, X0_REGNUM), GEN_INT (vtype))); + } + else + { + // For AVL between (MinVLMAX, 2 * MaxVLMAX), the actual working vl + // is related to the hardware implementation. + // So let the following code handle + } + } + if (!i32vl) + { + // Using vsetvli instruction to get actually used length which related to + // the hardware implementation + rtx i64vl = gen_reg_rtx (Pmode); + unsigned int vtype = classify_vtype_field (Vmode); + emit_insn (gen_vsetvl (Pmode, i64vl, force_reg (Pmode, avl), GEN_INT (vtype))); + // scale 2 for 32-bit length + i32vl = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (i32vl, gen_rtx_ASHIFT (Pmode, i64vl, const1_rtx))); + } + + return force_reg_for_over_uimm (i32vl); +} + +/* Helper functions for handling sew=64 on RV32 system. */ +static void +emit_int64_to_vector_32bit (machine_mode Vmode, machine_mode VSImode, + machine_mode VMSImode, rtx vd, rtx s, rtx vl, + rtx tail) +{ + if (CONST_SCALAR_INT_P (s)) + { + s = force_reg (DImode, s); + } + + rtx hi = gen_highpart (SImode, s); + rtx lo = gen_lowpart (SImode, s); + + rtx zero = gen_rtx_REG (SImode, X0_REGNUM); + + /* make a "0101..." mask vector */ + rtx vm1 = gen_reg_rtx (VNx4SImode); + emit_insn (gen_vmv_v_x_internal (VNx4SImode, vm1, const0_rtx, + force_reg (SImode, GEN_INT (0x55555555)), + zero, rvv_gen_policy ())); + rtx vm2 = gen_reg_rtx (VMSImode); + emit_insn (gen_rtx_SET (vm2, gen_lowpart (VMSImode, vm1))); + + rtx vlx2 = gen_vlx2 (vl, Vmode, VSImode); + rtx v2 = gen_reg_rtx (VSImode); + emit_insn (gen_vmv_v_x_internal (VSImode, v2, const0_rtx, hi, vlx2, + rvv_gen_policy ())); + + rtx vd_si = gen_reg_rtx (VSImode); + emit_insn (gen_vmerge_vxm_internal (VSImode, vd_si, vm2, const0_rtx, v2, lo, + vlx2, tail)); + + emit_insn (gen_rtx_SET (vd, gen_lowpart (Vmode, vd_si))); +} + +/* Helper functions for handling sew=64 on RV32 system. */ +bool +imm32_p (rtx a) +{ + if (!CONST_SCALAR_INT_P (a)) + return false; + unsigned HOST_WIDE_INT val = UINTVAL (a); + return val <= 0x7FFFFFFFULL || val >= 0xFFFFFFFF80000000ULL; +} + +typedef bool imm_p (rtx); +typedef rtx gen_3 (rtx, rtx, rtx); +typedef rtx gen_4 (rtx, rtx, rtx, rtx); +typedef rtx gen_5 (rtx, rtx, rtx, rtx, rtx); +typedef rtx gen_6 (rtx, rtx, rtx, rtx, rtx, rtx); +typedef rtx gen_7 (rtx, rtx, rtx, rtx, rtx, rtx, rtx); +enum GEN_CLASS +{ + GEN_VX, + GEN_VX_32BIT, + GEN_VV +}; + +/* Helper functions for handling sew=64 on RV32 system. 
*/ +enum GEN_CLASS +modify_operands (machine_mode Vmode, machine_mode VSImode, + machine_mode VMSImode, machine_mode VSUBmode, rtx *operands, + bool (*imm5_p) (rtx), int i, bool reverse, unsigned int unspec) +{ + if (!TARGET_64BIT && VSUBmode == DImode) + { + if (imm32_p (operands[i])) + { + if (!imm5_p (operands[i])) + operands[i] = force_reg (SImode, operands[i]); + return GEN_VX_32BIT; + } + else + { + rtx result = gen_reg_rtx (Vmode); + rtx zero = gen_rtx_REG (SImode, X0_REGNUM); + rtx tail = rvv_gen_policy (); + + emit_int64_to_vector_32bit (Vmode, VSImode, VMSImode, result, + operands[i], zero, tail); + + operands[i] = result; + + if (reverse) + { + rtx b = operands[i - 1]; + operands[i - 1] = operands[i]; + operands[i] = b; + } + return GEN_VV; + } + } + else + { + if (!imm5_p (operands[i])) + operands[i] = force_reg (VSUBmode, operands[i]); + return GEN_VX; + } +} + +/* Helper functions for handling sew=64 on RV32 system. */ +bool +emit_op5_vmv_v_x (machine_mode Vmode, machine_mode VSImode, + machine_mode VMSImode, machine_mode VSUBmode, rtx *operands, + int i) +{ + if (!TARGET_64BIT && VSUBmode == DImode) + { + if (!imm32_p (operands[i])) + { + rtx vd = operands[1]; + if (rtx_equal_p (vd, const0_rtx)) + { + vd = operands[0]; + } + emit_int64_to_vector_32bit (Vmode, VSImode, VMSImode, vd, operands[i], + operands[3], operands[4]); + + emit_insn (gen_rtx_SET (operands[0], vd)); + return true; + } + } + return false; +} + +/* Helper functions for handling sew=64 on RV32 system. */ +void +emit_op5 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, + machine_mode VMSImode, machine_mode VSUBmode, rtx *operands, + gen_5 *gen_vx, gen_5 *gen_vx_32bit, gen_5 *gen_vv, imm_p *imm5_p, + int i, bool reverse) +{ + if (unspec == UNSPEC_VMV) + { + if (emit_op5_vmv_v_x (Vmode, VSImode, VMSImode, VSUBmode, operands, i)) + { + return; + } + } + + enum GEN_CLASS gen_class = modify_operands ( + Vmode, VSImode, VMSImode, VSUBmode, operands, imm5_p, i, reverse, unspec); + + gen_5 *gen = gen_class == GEN_VX ? gen_vx + : gen_class == GEN_VV ? 
gen_vv + : gen_vx_32bit; + + emit_insn ( + (*gen) (operands[0], operands[1], operands[2], operands[3], operands[4])); } \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h index 2c242959077..e93852e3e56 100644 --- a/gcc/config/riscv/riscv-vector.h +++ b/gcc/config/riscv/riscv-vector.h @@ -20,14 +20,4 @@ #ifndef GCC_RISCV_VECTOR_H #define GCC_RISCV_VECTOR_H -bool riscv_vector_mode_p (machine_mode); -bool rvv_legitimate_poly_int_p (rtx); -unsigned int rvv_offset_temporaries (bool, poly_int64); -vlmul_field_enum rvv_classify_vlmul_field (machine_mode); -extern unsigned int rvv_parse_vsew_field (unsigned int); -extern unsigned int rvv_parse_vlmul_field (unsigned int); -extern bool rvv_parse_vta_field (unsigned int); -extern bool rvv_parse_vma_field (unsigned int); -int rvv_regsize (machine_mode); -opt_machine_mode rvv_get_mask_mode (machine_mode); #endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 2b0b76458a7..238c972de09 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -107,6 +107,7 @@ (VL_REGNUM 66) (VTYPE_REGNUM 67) (X0_REGNUM 0) + (DO_NOT_UPDATE_VL_VTYPE 21) ]) (include "predicates.md") @@ -138,7 +139,13 @@ (const_string "unknown")) ;; Main data type used by the insn -(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF" +(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF, + VNx8QI,VNx4HI,VNx2SI,VNx4HF,VNx2SF,VNx4QI,VNx2HI,VNx2HF, + VNx2QI,VNx16QI,VNx8HI,VNx4SI,VNx2DI,VNx8HF,VNx4SF,VNx2DF, + VNx32QI,VNx16HI,VNx8SI,VNx4DI,VNx16HF,VNx8SF,VNx4DF, + VNx64QI,VNx32HI,VNx16SI,VNx8DI,VNx32HF,VNx16SF,VNx8DF, + VNx128QI,VNx64HI,VNx32SI,VNx16DI,VNx64HF,VNx32SF,VNx16DF, + VNx2BI,VNx4BI,VNx8BI,VNx16BI,VNx32BI,VNx64BI,VNx128BI" (const_string "unknown")) ;; True if the main data type is twice the size of a word. 
@@ -184,11 +191,67 @@ ;; ghost an instruction that produces no real code ;; bitmanip bit manipulation instructions ;; vsetvl vector configuration setting +;; vload vector whole register load +;; vstore vector whole register store +;; vcopy vector whole register copy +;; vle vector unit-stride load +;; vse vector unit-stride store +;; vlse vector strided load +;; vsse vector strided store +;; vluxei vector unordered indexed load +;; vloxei vector ordered indexed load +;; vsuxei vector unordered indexed store +;; vsoxei vector ordered indexed store +;; vleff vector unit-stride fault-only-first load +;; varith vector single-width integer and floating-point arithmetic instructions +;; vadc vector single-width add-with-carry instructions with non-mask dest +;; vmadc vector single-width add-with-carry instructions with mask dest +;; vwarith vector widening integer and floating-point arithmetic instructions +;; vlogical vector integer logical instructions +;; vshift vector integer shift instructions +;; vcmp vector integer and floating-point compare +;; vmul vector integer and floating-point multiply +;; vmulh vector integer highpart multiply +;; vdiv vector integer and floating-point divide +;; vwmul vector integer and floating-point widening multiply +;; vmadd vector single-width integer and floating-point multiply-add/sub +;; vwmadd vector widening integer and floating-point multiply-add/sub +;; vmerge vector element data selection +;; vmove vector register move +;; vsarith vector saturating single-width arithmetic instructions +;; vsmul vector saturating single-width multiply instructions +;; vscaleshift vector scaling single-width shift instructions +;; vclip vector saturating clip +;; vfsqrt vector floating point square root +;; vfsgnj vector floating-point sign-injection +;; vfclass vector floating-point classify instructions +;; vfcvt vector floating point convert +;; vfwcvt vector widening floating point convert +;; vfncvt vector narrowing floating point convert +;; vwcvt vector widening only integer convert +;; vncvt vector narrowing only integer convert +;; vreduc vector single-width reduction operations +;; vwreduc vector widening reduction operations +;; vmask vector mask operations +;; vcpop vector mask population count vpopc +;; vmsetbit vector mask bit manipulation +;; vid vector element index instruction +;; vmv_x_s vmv.x.s instruction +;; vmv_s_x vmv.s.x instruction +;; vfmv_f_s vfmv.f.s instruction +;; vfmv_s_f vfmv.s.f instruction +;; vslide vector slide instructions +;; vgather vector gather instructions +;; vcompress vector compress instructions (define_attr "type" "unknown,branch,jump,call,load,fpload,store,fpstore, mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul, fmadd,fdiv,fcmp,fcvt,fsqrt,multi,auipc,sfb_alu,nop,ghost,bitmanip,rotate, - vsetvl" + vsetvl,vload,vstore,vcopy,vle,vse,vlse,vsse,vluxei,vloxei,vsuxei,vsoxei,vleff, + varith,vadc,vmadc,vwarith,vlogical,vshift,vcmp,vmul,vmulh,vdiv,vwmul,vmadd,vwmadd, + vmerge,vmove,vsarith,vsmul,vscaleshift,vclip,vfsqrt,vfsgnj,vfclass,vfcvt,vfwcvt,vfncvt, + vwcvt,vncvt,vreduc,vwreduc,vmask,vcpop,vmsetbit,viota,vid,vmv_x_s,vmv_s_x,vfmv_f_s,vfmv_s_f, + vslide,vgather,vcompress" (cond [(eq_attr "got" "load") (const_string "load") ;; If a doubleword move uses these expensive instructions, diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv index 9b0da73f3b5..278f3a0ba82 100644 --- a/gcc/config/riscv/t-riscv +++ b/gcc/config/riscv/t-riscv @@ -63,6 +63,10 @@ riscv-vector-builtins.o: \ $(COMPILER) -c
$(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/riscv/riscv-vector-builtins.cc +riscv-insert-vsetvl.o: $(srcdir)/config/riscv/riscv-insert-vsetvl.cc + $(COMPILE) $< + $(POSTCOMPILE) + PASSES_EXTRA += $(srcdir)/config/riscv/riscv-passes.def $(common_out_file): $(srcdir)/config/riscv/riscv-cores.def \ diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 3e0699de86c..9832d2adaa3 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -21,11 +21,138 @@ (define_c_enum "unspec" [ ;; vsetvli. UNSPEC_VSETVLI + ;; RVV instructions. + UNSPEC_RVV + ;; vector select + UNSPEC_SELECT + + ;; vle/vse + UNSPEC_UNIT_STRIDE_LOAD + UNSPEC_UNIT_STRIDE_STORE + + ;; unspec merge + UNSPEC_MERGE + + UNSPEC_VMV ]) +;; All vector modes supported. +(define_mode_iterator V [ + VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI VNx64QI VNx128QI + VNx2HI VNx4HI VNx8HI VNx16HI VNx32HI VNx64HI + VNx2SI VNx4SI VNx8SI VNx16SI VNx32SI + VNx2DI VNx4DI VNx8DI VNx16DI + (VNx2SF "TARGET_HARD_FLOAT") (VNx4SF "TARGET_HARD_FLOAT") (VNx8SF "TARGET_HARD_FLOAT") + (VNx16SF "TARGET_HARD_FLOAT") (VNx32SF "TARGET_HARD_FLOAT") + (VNx2DF "TARGET_DOUBLE_FLOAT") (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT") + (VNx16DF "TARGET_DOUBLE_FLOAT")]) + ;; All integer vector modes supported for RVV. (define_mode_iterator VI [ VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI VNx64QI VNx128QI VNx2HI VNx4HI VNx8HI VNx16HI VNx32HI VNx64HI VNx2SI VNx4SI VNx8SI VNx16SI VNx32SI - VNx2DI VNx4DI VNx8DI VNx16DI]) \ No newline at end of file + VNx2DI VNx4DI VNx8DI VNx16DI]) + +;; All vector modes supported for float load/store/alu. +(define_mode_iterator VF [ + (VNx2SF "TARGET_HARD_FLOAT") (VNx4SF "TARGET_HARD_FLOAT") (VNx8SF "TARGET_HARD_FLOAT") + (VNx16SF "TARGET_HARD_FLOAT") (VNx32SF "TARGET_HARD_FLOAT") + (VNx2DF "TARGET_DOUBLE_FLOAT") (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT") + (VNx16DF "TARGET_DOUBLE_FLOAT")]) + +;; All vector masking modes. +(define_mode_iterator VB [ + VNx2BI VNx4BI VNx8BI VNx16BI + VNx32BI VNx64BI VNx128BI]) + +;; Full vector modes supported. +(define_mode_iterator VFULL [ + VNx16QI VNx32QI VNx64QI VNx128QI + VNx8HI VNx16HI VNx32HI VNx64HI + VNx4SI VNx8SI VNx16SI VNx32SI + VNx2DI VNx4DI VNx8DI VNx16DI + (VNx4SF "TARGET_HARD_FLOAT") (VNx8SF "TARGET_HARD_FLOAT") (VNx16SF "TARGET_HARD_FLOAT") (VNx32SF "TARGET_HARD_FLOAT") + (VNx2DF "TARGET_DOUBLE_FLOAT") (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT") (VNx16DF "TARGET_DOUBLE_FLOAT")]) + +;; Paritial vector modes supported. +(define_mode_iterator VPARTIAL [ + VNx2QI VNx4QI VNx8QI + VNx2HI VNx4HI + VNx2SI + (VNx2SF "TARGET_HARD_FLOAT")]) + +;; All vector modes supported for integer sew = 64. +(define_mode_iterator V64BITI [VNx2DI VNx4DI VNx8DI VNx16DI]) + +;; Map a vector int or float mode to a vector compare mode. 
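+;; For example, the table below maps VNx4SI (and VNx4SF) to VNx4BI, so a
+;; compare on a four-element vector produces a four-element mask mode.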
+(define_mode_attr VM [ + (VNx2BI "VNx2BI") (VNx4BI "VNx4BI") (VNx8BI "VNx8BI") (VNx16BI "VNx16BI") + (VNx32BI "VNx32BI") (VNx64BI "VNx64BI") (VNx128BI "VNx128BI") + (VNx2QI "VNx2BI") (VNx4QI "VNx4BI") (VNx8QI "VNx8BI") (VNx16QI "VNx16BI") + (VNx32QI "VNx32BI") (VNx64QI "VNx64BI") (VNx128QI "VNx128BI") (VNx2HI "VNx2BI") + (VNx4HI "VNx4BI") (VNx8HI "VNx8BI") (VNx16HI "VNx16BI") (VNx32HI "VNx32BI") + (VNx64HI "VNx64BI") (VNx2SI "VNx2BI") (VNx4SI "VNx4BI") (VNx8SI "VNx8BI") + (VNx16SI "VNx16BI") (VNx32SI "VNx32BI") (VNx2DI "VNx2BI") (VNx4DI "VNx4BI") + (VNx8DI "VNx8BI") (VNx16DI "VNx16BI") + (VNx2SF "VNx2BI") (VNx4SF "VNx4BI") (VNx8SF "VNx8BI") (VNx16SF "VNx16BI") + (VNx32SF "VNx32BI") (VNx2DF "VNx2BI") (VNx4DF "VNx4BI") (VNx8DF "VNx8BI") + (VNx16DF "VNx16BI") + (VNx2QI "VNx2BI") (VNx4QI "VNx4BI") (VNx8QI "VNx8BI") (VNx16QI "VNx16BI") + (VNx32QI "VNx32BI") (VNx64QI "VNx64BI") (VNx128QI "VNx128BI") (VNx2HI "VNx2BI") + (VNx4HI "VNx4BI") (VNx8HI "VNx8BI") (VNx16HI "VNx16BI") (VNx32HI "VNx32BI") + (VNx64HI "VNx64BI") (VNx2SI "VNx2BI") (VNx4SI "VNx4BI") (VNx8SI "VNx8BI") + (VNx16SI "VNx16BI") (VNx32SI "VNx32BI") (VNx2DI "VNx2BI") (VNx4DI "VNx4BI") + (VNx8DI "VNx8BI") (VNx16DI "VNx16BI") + (VNx2SF "VNx2BI") (VNx4SF "VNx4BI") (VNx8SF "VNx8BI") (VNx16SF "VNx16BI") + (VNx32SF "VNx32BI") (VNx2DF "VNx2BI") (VNx4DF "VNx4BI") (VNx8DF "VNx8BI") + (VNx16DF "VNx16BI")]) + +;; Map a vector mode to its element mode. +(define_mode_attr VSUB [ + (VNx2QI "QI") (VNx4QI "QI") (VNx8QI "QI") (VNx16QI "QI") + (VNx32QI "QI") (VNx64QI "QI") (VNx128QI "QI") (VNx2HI "HI") + (VNx4HI "HI") (VNx8HI "HI") (VNx16HI "HI") (VNx32HI "HI") + (VNx64HI "HI") (VNx2SI "SI") (VNx4SI "SI") (VNx8SI "SI") + (VNx16SI "SI") (VNx32SI "SI") (VNx2DI "DI") (VNx4DI "DI") + (VNx8DI "DI") (VNx16DI "DI") + (VNx2SF "SF") (VNx4SF "SF") (VNx8SF "SF") (VNx16SF "SF") + (VNx32SF "SF") (VNx2DF "DF") (VNx4DF "DF") (VNx8DF "DF") + (VNx16DF "DF")]) + +(define_mode_attr VDI_TO_VSI [ + (VNx2QI "VNx4SI") (VNx4QI "VNx4SI") (VNx8QI "VNx4SI") (VNx16QI "VNx4SI") (VNx32QI "VNx4SI") (VNx64QI "VNx4SI") (VNx128QI "VNx4SI") + (VNx2HI "VNx4SI") (VNx4HI "VNx4SI") (VNx8HI "VNx4SI") (VNx16HI "VNx4SI") (VNx32HI "VNx4SI") (VNx64HI "VNx4SI") + (VNx2SI "VNx4SI") (VNx4SI "VNx4SI") (VNx8SI "VNx4SI") (VNx16SI "VNx4SI") (VNx32SI "VNx4SI") + (VNx2DI "VNx4SI") (VNx4DI "VNx8SI") (VNx8DI "VNx16SI") (VNx16DI "VNx32SI")]) + +(define_mode_attr VDI_TO_VSI_VM [ + (VNx2QI "VNx4BI") (VNx4QI "VNx4BI") (VNx8QI "VNx4BI") (VNx16QI "VNx4BI") + (VNx32QI "VNx4BI") (VNx64QI "VNx4BI") (VNx128QI "VNx4BI") + (VNx2HI "VNx4BI") (VNx4HI "VNx4BI") (VNx8HI "VNx4BI") (VNx16HI "VNx4BI") (VNx32HI "VNx4BI") + (VNx64HI "VNx64BI") + (VNx2SI "VNx4BI") (VNx4SI "VNx4BI") (VNx8SI "VNx4BI") (VNx16SI "VNx4BI") (VNx32SI "VNx4BI") + (VNx2DI "VNx4BI") (VNx4DI "VNx8BI") (VNx8DI "VNx16BI") (VNx16DI "VNx32BI") +]) + +(define_mode_attr vi_to_v64biti [ + (VNx2QI "vnx2di") (VNx4QI "vnx2di") (VNx8QI "vnx2di") (VNx16QI "vnx2di") (VNx32QI "vnx2di") (VNx64QI "vnx2di") (VNx128QI "vnx2di") + (VNx2HI "vnx2di") (VNx4HI "vnx2di") (VNx8HI "vnx2di") (VNx16HI "vnx2di") (VNx32HI "vnx2di") (VNx64HI "vnx2di") + (VNx2SI "vnx2di") (VNx4SI "vnx2di") (VNx8SI "vnx2di") (VNx16SI "vnx2di") (VNx32SI "vnx2di") + (VNx2DI "vnx2di") (VNx4DI "vnx4di") (VNx8DI "vnx8di") (VNx16DI "vnx16di")]) + +(define_int_iterator VMVOP [ + UNSPEC_VMV +]) + +(define_int_attr vxoptab [ + (UNSPEC_VMV "mv") +]) + +(define_int_attr VXOPTAB [ + (UNSPEC_VMV "UNSPEC_VMV") +]) + +(define_int_attr immptab [ + (UNSPEC_VMV "Ws5") +]) diff --git a/gcc/config/riscv/vector.md 
b/gcc/config/riscv/vector.md index 31fdec981b9..4a9c6769812 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -26,6 +26,43 @@ ;; - RVV intrinsic implmentation (Document:https://github.com/riscv/rvv-intrinsic-doc) (include "vector-iterators.md") + +;; ========================================================================= +;; == Vector creation +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- [INT,FP] Vector Creation +;; ------------------------------------------------------------------------- +;; Includes: +;; - Duplicate element to a vector +;; - Initialize from individual elements +;; ------------------------------------------------------------------------- + +;; vector integer modes vec_duplicate. +(define_expand "@vec_duplicate" + [(match_operand:VI 0 "register_operand") + (match_operand: 1 "reg_or_simm5_operand")] + "TARGET_VECTOR" +{ + emit_insn (gen_v_v_x (UNSPEC_VMV, mode, + operands[0], const0_rtx, operands[1], + gen_rtx_REG (Pmode, X0_REGNUM), rvv_gen_policy ())); + DONE; +}) + +;; vector floating-point modes vec_duplicate. +(define_expand "@vec_duplicate" + [(match_operand:VF 0 "register_operand") + (match_operand: 1 "register_operand")] + "TARGET_VECTOR" +{ + emit_insn (gen_vfmv_v_f (mode, operands[0], const0_rtx, + operands[1], gen_rtx_REG (Pmode, X0_REGNUM), + rvv_gen_policy ())); + DONE; +}) + ;; =============================================================================== ;; == Intrinsics ;; =============================================================================== @@ -137,4 +174,200 @@ return ""; } [(set_attr "type" "vsetvl") - (set_attr "mode" "none")]) \ No newline at end of file + (set_attr "mode" "none")]) + +;; ------------------------------------------------------------------------------- +;; ---- 7. Vector Loads and Stores +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 7.4. Vector Unit-Stride Instructions +;; - 7.5. Vector Strided Instructions +;; - 7.6. Vector Indexed Instructions +;; - 7.7. Unit-stride Fault-Only-First Instructions +;; - 7.8. Vector Load/Store Segment Instructions +;; - 7.8.1. Vector Unit-Stride Segment Loads and Stores +;; - 7.8.2. Vector Strided Segment Loads and Stores +;; - 7.8.3. Vector Indexed Segment Loads and Stores +;; ------------------------------------------------------------------------------- + +;; Vector Unit-Stride Loads. +(define_insn "@vle" + [(set (match_operand:V 0 "register_operand" "=vd,vd, vr,vr") + (unspec:V + [(unspec:V + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm, J,J") + (unspec:V + [(match_operand 3 "pmode_register_operand" "r,r, r,r") + (mem:BLK (scratch))] UNSPEC_UNIT_STRIDE_LOAD) + (match_operand:V 2 "vector_reg_or_const0_operand" "0,J, 0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK, rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vle.v\t%0,(%3),%1.t + vle.v\t%0,(%3),%1.t + vle.v\t%0,(%3) + vle.v\t%0,(%3)" + [(set_attr "type" "vle") + (set_attr "mode" "")]) + +;; Vector Unit-Stride Stores. 
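+;; As an illustration only (operand and register names are assumed, not taken
+;; from this pattern): a masked unit-stride store of 32-bit elements is
+;; expected to assemble as "vse32.v v8,(a0),v0.t", and the unmasked
+;; alternative as "vse32.v v8,(a0)".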
+(define_insn "@vse" + [(set (mem:BLK (scratch)) + (unspec:BLK + [(unspec:V + [(match_operand: 0 "vector_reg_or_const0_operand" "vm,J") + (unspec:BLK + [(match_operand 1 "pmode_register_operand" "r,r") + (match_operand:V 2 "register_operand" "vr,vr") + (mem:BLK (scratch))] UNSPEC_UNIT_STRIDE_STORE) + (match_dup 1)] UNSPEC_SELECT) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vse.v\t%2,(%1),%0.t + vse.v\t%2,(%1)" + [(set_attr "type" "vse") + (set_attr "mode" "")]) + +;; Vector Unit-stride mask Loads. +(define_insn "@vlm" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(unspec:VB + [(match_operand 1 "pmode_register_operand" "r") + (mem:BLK (scratch))] UNSPEC_UNIT_STRIDE_LOAD) + (match_operand 2 "p_reg_or_const_csr_operand" "rK") + (match_operand 3 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vlm.v\t%0,(%1)" + [(set_attr "type" "vle") + (set_attr "mode" "")]) + +;; Vector Unit-stride mask Stores. +(define_insn "@vsm" + [(set (mem:BLK (scratch)) + (unspec:BLK + [(unspec:BLK + [(match_operand 0 "pmode_register_operand" "r") + (match_operand:VB 1 "register_operand" "vr") + (mem:BLK (scratch))] UNSPEC_UNIT_STRIDE_STORE) + (match_operand 2 "p_reg_or_const_csr_operand" "rK") + (match_operand 3 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vsm.v\t%1,(%0)" + [(set_attr "type" "vse") + (set_attr "mode" "")]) + +;; vmv.v.x +(define_expand "@v_v_x" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand:VI 1 "vector_reg_or_const0_operand") + (match_operand: 2 "reg_or_const_int_operand") + (match_operand 3 "p_reg_or_const_csr_operand") + (match_operand 4 "const_int_operand") + ] VMVOP)] + "TARGET_VECTOR" + { + emit_op5 ( + , + mode, mode, mode, + mode, + operands, + gen_v_v_x_internal, + gen_v_v_x_32bit, + NULL, + satisfies_constraint_, + 2, false + ); + DONE; + } +) + +;; Vector-Scalar Integer Move. +(define_insn "@vmv_v_x_internal" + [(set (match_operand:VI 0 "register_operand" "=vr,vr,vr,vr") + (unspec:VI + [(match_operand:VI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (vec_duplicate:VI + (match_operand: 2 "reg_or_simm5_operand" "r,Ws5,r,Ws5")) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmv.v.x\t%0,%2 + vmv.v.i\t%0,%2 + vmv.v.x\t%0,%2 + vmv.v.i\t%0,%2" + [(set_attr "type" "vmove") + (set_attr "mode" "")]) + +(define_insn "@vmv_v_x_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vr,vr,vr,vr") + (unspec:V64BITI + [(match_operand:V64BITI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 2 "reg_or_simm5_operand" "r,Ws5,r,Ws5"))) + (match_operand:SI 3 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmv.v.x\t%0,%2 + vmv.v.i\t%0,%2 + vmv.v.x\t%0,%2 + vmv.v.i\t%0,%2" + [(set_attr "type" "vmove") + (set_attr "mode" "")]) + +;; Vector-Scalar Floating-Point Move. 
+(define_insn "@vfmv_v_f" + [(set (match_operand:VF 0 "register_operand" "=vr,vr") + (unspec:VF + [(match_operand:VF 1 "vector_reg_or_const0_operand" "0,J") + (vec_duplicate:VF + (match_operand: 2 "register_operand" "f,f")) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vfmv.v.f\t%0,%2" + [(set_attr "type" "vmove") + (set_attr "mode" "")]) + +;; Vector-Scalar integer merge. +(define_insn "@vmerge_vxm_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd") + (unspec:VI + [(match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J") + (unspec:VI + [(match_operand: 1 "register_operand" "vm,vm,vm,vm") + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5"))] UNSPEC_MERGE) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmerge.vxm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%4,%1 + vmerge.vxm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%4,%1" + [(set_attr "type" "vmerge") + (set_attr "mode" "")]) \ No newline at end of file From patchwork Tue May 31 08:49:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54551 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 107503955605 for ; Tue, 31 May 2022 08:56:52 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg511.qq.com (smtpbg511.qq.com [203.205.250.109]) by sourceware.org (Postfix) with ESMTPS id 407C73836647 for ; Tue, 31 May 2022 08:50:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 407C73836647 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987037tq946wg0 Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:36 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: F3yR32iATbhoILd9IDRuIUdQVEa8JBrsysD+7QNd5ZwKOCsOZWCol2umRM8/c VSsQPOVDS8sG9V3/60rPo+OxNnz2CvPWIF+049dAeRYjCwIowXc1Tzzomg1tpP9Ls48xZIP 3+GL5cOT/pNuC4Sk5F3LfYQeo4hAzQEtZBMJl36c1vECZM49EiM10urAC+gZOJ2QOP7NpO2 MVuNrgY36MGWnCLOUWDEQcYuGheVXaKo1K5mgR7h0fEOG+MsXfixjj2P1dijbpkRuYrPLoW m7wr1HPbc86taHrmX8S4KOm1gdz5pu0uE3UdDUtuQIjtdg5ke3Q9MElzwx1/NXRvfUllsl2 nR7SwO0cVi2ikr6vCL2muZfmGfDAs4xCQ3kUH6lSximJpQFLh0= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 07/21] Add register spilling support Date: Tue, 31 May 2022 16:49:58 +0800 Message-Id: <20220531085012.269719-8-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign10 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-9.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, T_SPF_HELO_TEMPERROR, URIBL_BLACK autolearn=ham 
autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-protos.h (rvv_expand_const_vector): New function. (rvv_expand_const_mask): New function. (rvv_const_vec_all_same_in_range_p): New function. * config/riscv/riscv-vector.cc (classify_vtype_field): Move codes location. (get_lmulx8): New function. Move codes location. (force_reg_for_over_uimm): New function. Move codes location. (gen_vlx2): New function. Move codes location. (emit_int64_to_vector_32bit): Move codes location. (rvv_expand_const_vector): New function. (rvv_expand_const_mask): New function. (rvv_const_vec_all_same_in_range_p): New function. * config/riscv/riscv.cc (riscv_const_insns): Add const vector cost. * config/riscv/vector-iterators.md: New iterators and attributes. * config/riscv/vector.md (mov): New pattern. (*mov): New pattern. (*mov_reg): New pattern. (@vmclr_m): New pattern. (@vmset_m): New pattern. --- gcc/config/riscv/riscv-protos.h | 3 + gcc/config/riscv/riscv-vector.cc | 349 ++++++++++++++++----------- gcc/config/riscv/riscv.cc | 67 ++++- gcc/config/riscv/vector-iterators.md | 24 ++ gcc/config/riscv/vector.md | 201 +++++++++++++++ 5 files changed, 502 insertions(+), 142 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 9a7e120854a..618eb746eaa 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -128,6 +128,9 @@ extern int rvv_regsize (machine_mode); extern rtx rvv_gen_policy (unsigned int rvv_policy = 0); extern opt_machine_mode rvv_get_mask_mode (machine_mode); extern machine_mode rvv_translate_attr_mode (rtx_insn *); +extern bool rvv_expand_const_vector (rtx, rtx); +extern bool rvv_expand_const_mask (rtx, rtx); +extern bool rvv_const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT); extern void emit_op5 ( unsigned int unspec, diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index 426490945dd..4b2fe2a8d11 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -71,7 +71,165 @@ #include "target-def.h" #include -/* Helper functions for RVV */ + +/* Internal helper functions for RVV */ + +/* Return the vtype field for a specific machine mode. */ +static unsigned int +classify_vtype_field (machine_mode mode) +{ + unsigned int vlmul = rvv_classify_vlmul_field (mode); + unsigned int vsew = rvv_classify_vsew_field (mode); + unsigned int vtype = (vsew << 3) | (vlmul & 0x7) | 0x40; + return vtype; +} + +/* lmul = real_lmul * 8 + guarantee integer + e.g. + 1 => 1/8 + 2 => 1/4 + 4 => 1/2 + 8 => 1 + 16 => 2 + 32 => 4 + 64 => 8 + */ +static unsigned int +get_lmulx8 (machine_mode mode) +{ + unsigned int vlmul = rvv_classify_vlmul_field (mode); + switch (vlmul) + { + case VLMUL_FIELD_000: + return 8; + case VLMUL_FIELD_001: + return 16; + case VLMUL_FIELD_010: + return 32; + case VLMUL_FIELD_011: + return 64; + case VLMUL_FIELD_101: + return 1; + case VLMUL_FIELD_110: + return 2; + case VLMUL_FIELD_111: + return 4; + default: + gcc_unreachable (); + } +} + +/* Helper functions for handling sew=64 on RV32 system. 
*/ +static rtx +force_reg_for_over_uimm (rtx vl) +{ + if (CONST_SCALAR_INT_P (vl) && INTVAL (vl) >= 32) + { + return force_reg (Pmode, vl); + } + + return vl; +} + +/* Helper functions for handling sew=64 on RV32 system. */ +static rtx +gen_vlx2 (rtx avl, machine_mode Vmode, machine_mode VSImode) +{ + if (rtx_equal_p (avl, gen_rtx_REG (Pmode, X0_REGNUM))) + { + return avl; + } + rtx i32vl = NULL_RTX; + if (CONST_SCALAR_INT_P (avl)) + { + unsigned int vlen_max; + unsigned int vlen_min; + if (riscv_vector_chunks.is_constant ()) + { + vlen_max = riscv_vector_chunks.to_constant () * 64; + vlen_min = vlen_max; + } + else + { + /* TODO: vlen_max will be supported as 65536 in the future. */ + vlen_max = RVV_4096; + vlen_min = RVV_128; + } + unsigned int max_vlmax = (vlen_max / GET_MODE_UNIT_BITSIZE (Vmode) * get_lmulx8 (Vmode)) / 8; + unsigned int min_vlmax = (vlen_min / GET_MODE_UNIT_BITSIZE (Vmode) * get_lmulx8 (Vmode)) / 8; + + unsigned HOST_WIDE_INT avl_int = INTVAL (avl); + if (avl_int <= min_vlmax) + { + i32vl = gen_int_mode (2 * avl_int, SImode); + } + else if (avl_int >= 2 * max_vlmax) + { + // Just set i32vl to VLMAX in this situation + i32vl = gen_reg_rtx (Pmode); + unsigned int vtype = classify_vtype_field (VSImode); + emit_insn (gen_vsetvl (Pmode, i32vl, gen_rtx_REG (Pmode, X0_REGNUM), GEN_INT (vtype))); + } + else + { + // For AVL between (MinVLMAX, 2 * MaxVLMAX), the actual working vl + // is related to the hardware implementation. + // So let the following code handle + } + } + if (!i32vl) + { + // Using vsetvli instruction to get actually used length which related to + // the hardware implementation + rtx i64vl = gen_reg_rtx (Pmode); + unsigned int vtype = classify_vtype_field (Vmode); + emit_insn (gen_vsetvl (Pmode, i64vl, force_reg (Pmode, avl), GEN_INT (vtype))); + // scale 2 for 32-bit length + i32vl = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (i32vl, gen_rtx_ASHIFT (Pmode, i64vl, const1_rtx))); + } + + return force_reg_for_over_uimm (i32vl); +} + +/* Helper functions for handling sew=64 on RV32 system. */ +static void +emit_int64_to_vector_32bit (machine_mode Vmode, machine_mode VSImode, + machine_mode VMSImode, rtx vd, rtx s, rtx vl, + rtx tail) +{ + if (CONST_SCALAR_INT_P (s)) + { + s = force_reg (DImode, s); + } + + rtx hi = gen_highpart (SImode, s); + rtx lo = gen_lowpart (SImode, s); + + rtx zero = gen_rtx_REG (SImode, X0_REGNUM); + + /* make a "0101..." mask vector */ + rtx vm1 = gen_reg_rtx (VNx4SImode); + emit_insn (gen_vmv_v_x_internal (VNx4SImode, vm1, const0_rtx, + force_reg (SImode, GEN_INT (0x55555555)), + zero, rvv_gen_policy ())); + rtx vm2 = gen_reg_rtx (VMSImode); + emit_insn (gen_rtx_SET (vm2, gen_lowpart (VMSImode, vm1))); + + rtx vlx2 = gen_vlx2 (vl, Vmode, VSImode); + rtx v2 = gen_reg_rtx (VSImode); + emit_insn (gen_vmv_v_x_internal (VSImode, v2, const0_rtx, hi, vlx2, + rvv_gen_policy ())); + + rtx vd_si = gen_reg_rtx (VSImode); + emit_insn (gen_vmerge_vxm_internal (VSImode, vd_si, vm2, const0_rtx, v2, lo, + vlx2, tail)); + + emit_insn (gen_rtx_SET (vd, gen_lowpart (Vmode, vd_si))); +} + +/* Globaer RVV implementation. */ /* Return true if it is a RVV mask mode. */ bool @@ -370,159 +528,68 @@ rvv_translate_attr_mode (rtx_insn *insn) return VOIDmode; } -/* Return the vtype field for a specific machine mode. 
*/ -static unsigned int -classify_vtype_field (machine_mode mode) -{ - unsigned int vlmul = rvv_classify_vlmul_field (mode); - unsigned int vsew = rvv_classify_vsew_field (mode); - unsigned int vtype = (vsew << 3) | (vlmul & 0x7) | 0x40; - return vtype; -} - -/* lmul = real_lmul * 8 - guarantee integer - e.g. - 1 => 1/8 - 2 => 1/4 - 4 => 1/2 - 8 => 1 - 16 => 2 - 32 => 4 - 64 => 8 - */ -static unsigned int -get_lmulx8 (machine_mode mode) -{ - unsigned int vlmul = rvv_classify_vlmul_field (mode); - switch (vlmul) - { - case VLMUL_FIELD_000: - return 8; - case VLMUL_FIELD_001: - return 16; - case VLMUL_FIELD_010: - return 32; - case VLMUL_FIELD_011: - return 64; - case VLMUL_FIELD_101: - return 1; - case VLMUL_FIELD_110: - return 2; - case VLMUL_FIELD_111: - return 4; - default: - gcc_unreachable (); - } -} - -/* Helper functions for handling sew=64 on RV32 system. */ -static rtx -force_reg_for_over_uimm (rtx vl) +/* Expand const vector using RVV instructions. */ +bool +rvv_expand_const_vector (rtx target, rtx src) { - if (CONST_SCALAR_INT_P (vl) && INTVAL (vl) >= 32) + rtx x; + machine_mode mode = GET_MODE (target); + machine_mode inner_mode = GET_MODE_INNER (mode); + + /* Case 1: Handle const duplicate vector. */ + if (const_vec_duplicate_p (src, &x)) { - return force_reg (Pmode, vl); + if (FLOAT_MODE_P (mode)) + x = force_reg (inner_mode, x); + emit_insn (gen_vec_duplicate (mode, target, x)); + return true; } - - return vl; + /* TODO: In case of intrinsic support, we only need to deal with const duplicate vector. + More cases will be supported for auto-vectorization. */ + return false; } -/* Helper functions for handling sew=64 on RV32 system. */ -static rtx -gen_vlx2 (rtx avl, machine_mode Vmode, machine_mode VSImode) +/* Expand const mask using RVV instructions. */ +bool +rvv_expand_const_mask (rtx target, rtx src) { - if (rtx_equal_p (avl, gen_rtx_REG (Pmode, X0_REGNUM))) + rtx ele; + rtx zero = gen_rtx_REG (Pmode, X0_REGNUM); + machine_mode mode = GET_MODE (target); + if (const_vec_duplicate_p (src, &ele)) { - return avl; - } - rtx i32vl = NULL_RTX; - if (CONST_SCALAR_INT_P (avl)) - { - unsigned int vlen_max; - unsigned int vlen_min; - if (riscv_vector_chunks.is_constant ()) - { - vlen_max = riscv_vector_chunks.to_constant () * 64; - vlen_min = vlen_max; - } - else - { - /* TODO: vlen_max will be supported as 65536 in the future. */ - vlen_max = RVV_4096; - vlen_min = RVV_128; - } - unsigned int max_vlmax = (vlen_max / GET_MODE_UNIT_BITSIZE (Vmode) * get_lmulx8 (Vmode)) / 8; - unsigned int min_vlmax = (vlen_min / GET_MODE_UNIT_BITSIZE (Vmode) * get_lmulx8 (Vmode)) / 8; - - unsigned HOST_WIDE_INT avl_int = INTVAL (avl); - if (avl_int <= min_vlmax) - { - i32vl = gen_int_mode (2 * avl_int, SImode); - } - else if (avl_int >= 2 * max_vlmax) - { - // Just set i32vl to VLMAX in this situation - i32vl = gen_reg_rtx (Pmode); - unsigned int vtype = classify_vtype_field (VSImode); - emit_insn (gen_vsetvl (Pmode, i32vl, gen_rtx_REG (Pmode, X0_REGNUM), GEN_INT (vtype))); - } - else + gcc_assert (CONST_SCALAR_INT_P (ele)); + switch (INTVAL (ele)) { - // For AVL between (MinVLMAX, 2 * MaxVLMAX), the actual working vl - // is related to the hardware implementation. 
- // So let the following code handle + case 0: + emit_insn (gen_vmclr_m (mode, target, zero, + rvv_gen_policy ())); + break; + case 1: + emit_insn (gen_vmset_m (mode, target, zero, + rvv_gen_policy ())); + break; + default: + gcc_unreachable (); } + return true; } - if (!i32vl) - { - // Using vsetvli instruction to get actually used length which related to - // the hardware implementation - rtx i64vl = gen_reg_rtx (Pmode); - unsigned int vtype = classify_vtype_field (Vmode); - emit_insn (gen_vsetvl (Pmode, i64vl, force_reg (Pmode, avl), GEN_INT (vtype))); - // scale 2 for 32-bit length - i32vl = gen_reg_rtx (Pmode); - emit_insn (gen_rtx_SET (i32vl, gen_rtx_ASHIFT (Pmode, i64vl, const1_rtx))); - } - - return force_reg_for_over_uimm (i32vl); + + /* TODO: In case of intrinsic support, we only need to deal with const all zeros or const all ones mask. + More cases will be supported for auto-vectorization. */ + return false; } -/* Helper functions for handling sew=64 on RV32 system. */ -static void -emit_int64_to_vector_32bit (machine_mode Vmode, machine_mode VSImode, - machine_mode VMSImode, rtx vd, rtx s, rtx vl, - rtx tail) -{ - if (CONST_SCALAR_INT_P (s)) - { - s = force_reg (DImode, s); - } - - rtx hi = gen_highpart (SImode, s); - rtx lo = gen_lowpart (SImode, s); - - rtx zero = gen_rtx_REG (SImode, X0_REGNUM); - - /* make a "0101..." mask vector */ - rtx vm1 = gen_reg_rtx (VNx4SImode); - emit_insn (gen_vmv_v_x_internal (VNx4SImode, vm1, const0_rtx, - force_reg (SImode, GEN_INT (0x55555555)), - zero, rvv_gen_policy ())); - rtx vm2 = gen_reg_rtx (VMSImode); - emit_insn (gen_rtx_SET (vm2, gen_lowpart (VMSImode, vm1))); +/* Return true if X is a const_vector with all duplicate elements, which is in + the range between MINVAL and MAXVAL. */ - rtx vlx2 = gen_vlx2 (vl, Vmode, VSImode); - rtx v2 = gen_reg_rtx (VSImode); - emit_insn (gen_vmv_v_x_internal (VSImode, v2, const0_rtx, hi, vlx2, - rvv_gen_policy ())); - - rtx vd_si = gen_reg_rtx (VSImode); - emit_insn (gen_vmerge_vxm_internal (VSImode, vd_si, vm2, const0_rtx, v2, lo, - vlx2, tail)); - - emit_insn (gen_rtx_SET (vd, gen_lowpart (Vmode, vd_si))); +bool +rvv_const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval, + HOST_WIDE_INT maxval) +{ + rtx elt; + return (const_vec_duplicate_p (x, &elt) && CONST_INT_P (elt) && + IN_RANGE (INTVAL (elt), minval, maxval)); } /* Helper functions for handling sew=64 on RV32 system. */ diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 8c78e726a19..fc27dc957dc 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -1097,9 +1097,74 @@ riscv_const_insns (rtx x) } case CONST_DOUBLE: - case CONST_VECTOR: /* We can use x0 to load floating-point zero. */ return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; + case CONST_VECTOR: + { + machine_mode mode = GET_MODE (x); + /* For the mode which is not RVV mode, we use + default configuration. */ + if (!rvv_mode_p (mode)) + return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0; + unsigned int factor = 0; + if (GET_MODE_CLASS (GET_MODE (x)) == MODE_VECTOR_BOOL) + { + /* In RVV, we can use vmclr.m/vmset.m to generate + all 0s/1s bool vector. Otherwise we can only use + load instructions. */ + if (x == CONST0_RTX (GET_MODE (x)) + || x == CONSTM1_RTX (GET_MODE (x))) + return 1; + else + return 0; + } + else if (FLOAT_MODE_P (GET_MODE (x))) + { + /* In RVV, Floating-point should be first load + into floating-point register + then duplicate. 
*/ + factor = 3; + } + else + { + rtx elt; + if (!const_vec_duplicate_p (x, &elt)) + { + rtx base, step; + if (const_vec_series_p (x, &base, &step)) + { + /* For const vector: {0, 1, 2, ......}, + we can use a single instruction vid.v + to generate the vector. */ + if (INTVAL (step) == 1 + && INTVAL (base) == 0) + factor = 1; + /* We need a vid + li + vmul.vx instruction. */ + else if (INTVAL (base) == 0) + factor = 2 + riscv_integer_cost (INTVAL (step)); + /* We need a vid + (li + vadd.vx)/vadd.vi instruction. */ + else if (INTVAL (step) == 1) + factor = IN_RANGE (INTVAL (base), -16, 15) ? 2 + : 2 + riscv_integer_cost (INTVAL (base)); + /* We need a vid + (li + vadd.vx)/vadd.vi + li + vmul.vx instruction. */ + else + factor = IN_RANGE (INTVAL (base), -16, 15) ? 4 + : 4 + riscv_integer_cost (INTVAL (base)); + } + else + factor = 0; + } + else + { + /* Use vmv.v.i. */ + if (rvv_const_vec_all_same_in_range_p (x, -15, 16)) + factor = 1; + /* Use li + vmv.v.x. */ + else + factor = 1 + riscv_integer_cost (INTVAL (elt)); + } + } + } case CONST: /* See if we can refer to X directly. */ diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 9832d2adaa3..e01305ef3fc 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -140,6 +140,30 @@ (VNx2HI "vnx2di") (VNx4HI "vnx2di") (VNx8HI "vnx2di") (VNx16HI "vnx2di") (VNx32HI "vnx2di") (VNx64HI "vnx2di") (VNx2SI "vnx2di") (VNx4SI "vnx2di") (VNx8SI "vnx2di") (VNx16SI "vnx2di") (VNx32SI "vnx2di") (VNx2DI "vnx2di") (VNx4DI "vnx4di") (VNx8DI "vnx8di") (VNx16DI "vnx16di")]) + +;; Map a vector mode to SEW +(define_mode_attr sew [ + (VNx2QI "8") (VNx4QI "8") (VNx8QI "8") (VNx16QI "8") + (VNx32QI "8") (VNx64QI "8") (VNx128QI "8") (VNx2HI "16") + (VNx4HI "16") (VNx8HI "16") (VNx16HI "16") (VNx32HI "16") + (VNx64HI "16") (VNx2SI "32") (VNx4SI "32") (VNx8SI "32") + (VNx16SI "32") (VNx32SI "32") (VNx2DI "64") (VNx4DI "64") + (VNx8DI "64") (VNx16DI "64") + (VNx2SF "32") (VNx4SF "32") (VNx8SF "32") (VNx16SF "32") + (VNx32SF "32") (VNx2DF "64") (VNx4DF "64") (VNx8DF "64") + (VNx16DF "64")]) + +;; Map a vector mode to its LMUL. +(define_mode_attr lmul [ + (VNx2QI "1") (VNx4QI "1") (VNx8QI "1") (VNx16QI "1") + (VNx32QI "2") (VNx64QI "4") (VNx128QI "8") (VNx2HI "1") + (VNx4HI "1") (VNx8HI "1") (VNx16HI "2") (VNx32HI "4") + (VNx64HI "8") (VNx2SI "1") (VNx4SI "1") (VNx8SI "2") + (VNx16SI "4") (VNx32SI "8") (VNx2DI "1") (VNx4DI "2") + (VNx8DI "4") (VNx16DI "8") + (VNx2SF "1") (VNx4SF "1") (VNx8SF "2") (VNx16SF "4") + (VNx32SF "8") (VNx2DF "1") (VNx4DF "2") (VNx8DF "4") + (VNx16DF "8")]) (define_int_iterator VMVOP [ UNSPEC_VMV diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 4a9c6769812..1731d969372 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -62,6 +62,179 @@ rvv_gen_policy ())); DONE; }) + +;; ========================================================================= +;; == Vector spilling +;; ========================================================================= + +;; ------------------------------------------------------------------------- +;; ---- Moves Operations +;; ------------------------------------------------------------------------- +;; Includes: +;; - Full vector load/store/move +;; - Partial vector load/store/move +;; - All vector misalign move +;; ------------------------------------------------------------------------- + +;; Move Pattern for all vector modes. 
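+;;
+;; As a rough illustration (the register numbers and stack addressing here
+;; are only an example, not taken from the patch): a spill/reload of a
+;; whole-register ("full") mode maps onto the whole vector register
+;; load/store instructions and needs no vsetvli, e.g. for an LMUL=1,
+;; SEW=32 mode:
+;;   vs1r.v    v2,(sp)     # spill
+;;   vl1re32.v v2,(sp)     # reload
+;; The partial modes handled further below are instead split into the
+;; vle/vse patterns with x0 as the AVL (i.e. vl = VLMAX).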
+(define_expand "mov" + [(set (match_operand:VFULL 0 "reg_or_mem_operand") + (match_operand:VFULL 1 "vector_move_operand"))] + "TARGET_VECTOR" +{ + /* Need to force register if mem <- !reg. */ + if (MEM_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (mode, operands[1]); + + if (GET_CODE (operands[1]) == CONST_VECTOR && + rvv_expand_const_vector (operands[0], operands[1])) + DONE; +}) + +;; Full vector load/store/move. +(define_insn "*mov" + [(set (match_operand:VFULL 0 "reg_or_mem_operand" "=vr,m,vr") + (match_operand:VFULL 1 "reg_or_mem_operand" "m,vr,vr"))] + "TARGET_VECTOR" + "@ + vlre.v\t%0,%1 + vsr.v\t%1,%0 + vmvr.v\t%0,%1" + [(set_attr "type" "vload,vstore,vcopy") + (set_attr "mode" "")]) + +(define_expand "mov" + [(parallel [(set (match_operand:VPARTIAL 0 "reg_or_mem_operand") + (match_operand:VPARTIAL 1 "vector_move_operand")) + (clobber (scratch:SI)) + (clobber (reg:SI VL_REGNUM)) + (clobber (reg:SI VTYPE_REGNUM))])] + "TARGET_VECTOR" +{ + /* Need to force register if mem <- !reg. */ + if (MEM_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (mode, operands[1]); + + if (GET_CODE (operands[1]) == CONST_VECTOR && + rvv_expand_const_vector (operands[0], operands[1])) + DONE; +}) + +;; Partial vector load/store/move. +(define_insn_and_split "*mov" + [(set (match_operand:VPARTIAL 0 "reg_or_mem_operand" "=vr,m,vr") + (match_operand:VPARTIAL 1 "reg_or_mem_operand" "m,vr,vr")) + (clobber (match_scratch:SI 2 "=&r,&r,X")) + (clobber (reg:SI VL_REGNUM)) + (clobber (reg:SI VTYPE_REGNUM))] + "TARGET_VECTOR" + "@ + vle.v\t%0,%1 + vse.v\t%1,%0 + #" + "&& (!reload_completed || (REG_P (operands[0]) + && REG_P (operands[1])))" + [(const_int 0)] + { + /* Need to force register if mem <- !reg. */ + if (MEM_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (mode, operands[1]); + + if (MEM_P (operands[0])) + { + emit_insn (gen_vse (mode, const0_rtx, XEXP (operands[0], 0), + operands[1], gen_rtx_REG (Pmode, X0_REGNUM), + rvv_gen_policy ())); + DONE; + } + if (MEM_P (operands[1])) + { + emit_insn (gen_vle (mode, operands[0], const0_rtx, const0_rtx, + XEXP (operands[1], 0), gen_rtx_REG (Pmode, X0_REGNUM), + rvv_gen_policy ())); + DONE; + } + + emit_insn (gen_rtx_SET (operands[0], operands[1])); + DONE; + } + [(set_attr "type" "vle,vse,vcopy") + (set_attr "mode" "")]) + +(define_insn "*mov_reg" + [(set (match_operand:VPARTIAL 0 "register_operand" "=vr") + (match_operand:VPARTIAL 1 "register_operand" "vr"))] + "TARGET_VECTOR" + "vmv1r.v\t%0,%1" + [(set_attr "type" "vcopy") + (set_attr "mode" "")]) + +;; Move pattern for mask modes. +(define_expand "mov" + [(parallel [(set (match_operand:VB 0 "reg_or_mem_operand") + (match_operand:VB 1 "vector_move_operand")) + (clobber (scratch:SI)) + (clobber (reg:SI VL_REGNUM)) + (clobber (reg:SI VTYPE_REGNUM))])] + "TARGET_VECTOR" +{ + /* Need to force register if mem <- !reg. */ + if (MEM_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (mode, operands[1]); + + if (GET_CODE (operands[1]) == CONST_VECTOR + && rvv_expand_const_mask (operands[0], operands[1])) + DONE; +}) + +;; mask load/store/move. 
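+;;
+;; A small worked example (illustrative only; v0 is just an example
+;; register): for a constant all-zeros or all-ones mask the expander above
+;; does not go through memory at all; rvv_expand_const_mask emits the
+;; vmclr/vmset patterns defined at the end of this file, so an all-ones
+;; mask becomes a single
+;;   vmset.m  v0
+;; and an all-zeros mask a single
+;;   vmclr.m  v0
+;; under the current vl/vtype.  Non-constant masks fall through to the
+;; vlm.v/vsm.v/vmv1r.v alternatives below.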
+(define_insn_and_split "*mov" + [(set (match_operand:VB 0 "reg_or_mem_operand" "=vr,m,vr") + (match_operand:VB 1 "reg_or_mem_operand" "m,vr,vr")) + (clobber (match_scratch:SI 2 "=&r,&r,X")) + (clobber (reg:SI VL_REGNUM)) + (clobber (reg:SI VTYPE_REGNUM))] + "TARGET_VECTOR" + "@ + vlm.v\t%0,%1 + vsm.v\t%1,%0 + #" + "&& (!reload_completed || (REG_P (operands[0]) + && REG_P (operands[1])))" + [(const_int 0)] + { + /* Need to force register if mem <- !reg. */ + if (MEM_P (operands[0]) && !REG_P (operands[1])) + operands[1] = force_reg (mode, operands[1]); + + if (MEM_P (operands[0])) + { + emit_insn (gen_vsm (mode, XEXP (operands[0], 0), operands[1], + gen_rtx_REG (Pmode, X0_REGNUM), + rvv_gen_policy ())); + DONE; + } + if (MEM_P (operands[1])) + { + emit_insn (gen_vlm (mode, operands[0], XEXP (operands[1], 0), + gen_rtx_REG (Pmode, X0_REGNUM), + rvv_gen_policy ())); + DONE; + } + + emit_insn (gen_rtx_SET (operands[0], operands[1])); + DONE; + } + [(set_attr "type" "vle,vse,vcopy") + (set_attr "mode" "")]) + +(define_insn "*mov_reg" + [(set (match_operand:VB 0 "register_operand" "=vr") + (match_operand:VB 1 "register_operand" "vr"))] + "TARGET_VECTOR" + "vmv1r.v\t%0,%1" + [(set_attr "type" "vcopy") + (set_attr "mode" "")]) ;; =============================================================================== ;; == Intrinsics @@ -370,4 +543,32 @@ vmerge.vxm\t%0,%3,%4,%1 vmerge.vim\t%0,%3,%4,%1" [(set_attr "type" "vmerge") + (set_attr "mode" "")]) + +;; vmclr.m vd -> vmxor.mm vd,vd,vd # Clear mask register +(define_insn "@vmclr_m" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(vec_duplicate:VB (const_int 0)) + (match_operand 1 "p_reg_or_const_csr_operand" "rK") + (match_operand 2 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmclr.m\t%0" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +;; vmset.m vd -> vmxnor.mm vd,vd,vd # Set mask register +(define_insn "@vmset_m" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(vec_duplicate:VB (const_int 1)) + (match_operand 1 "p_reg_or_const_csr_operand" "rK") + (match_operand 2 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmset.m\t%0" + [(set_attr "type" "vmask") (set_attr "mode" "")]) \ No newline at end of file From patchwork Tue May 31 08:49:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54550 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5B4C63955605 for ; Tue, 31 May 2022 08:56:22 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg151.qq.com (smtpbg151.qq.com [18.169.211.239]) by sourceware.org (Postfix) with ESMTPS id E085D3836655 for ; Tue, 31 May 2022 08:50:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E085D3836655 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987039t8ppay7z Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:39 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: 
NFbnPP6/4uLanzkTxpVUzEhnpZZNtxTDNHcLGlUsCvgTu6QVZhF86QqY19OhF crDl+w0JFe6E2WSE0vRuBOurOZ0JMqFoBENcxBvoKN3utUWylRkKHlMLtZLCKkzDoF1REnT +T+dG5FGKuDaAq5LidneZjY99e3s7yiKBAdvFwdpPth+51GvjVIX/LIhE8Hg5kpYGehRSmB zMSiwbNxD24nhexYhIR1tbzm5jVzSc1eixba177I2paykAxPDH9w2C+yTFXF+O8N/LRgbAK Glf3F0msRXlAhXEH/xR/1uBIMyt3Qu6IhYPOJMEljYY2ictZEKCXDIMba+wljv5YBj2CS7w aoYOc8OxBu50vCdMeLY2zzuwodh3f9vQd5qrAgPgY4QwaQcLd4= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 08/21] Add poly manipulation Date: Tue, 31 May 2022 16:49:59 +0800 Message-Id: <20220531085012.269719-9-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign4 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-10.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_add_offset): Change riscv_add_offset as global function. * config/riscv/riscv-vector.cc (rvv_report_required): New function. (expand_quotient): New function. (rvv_expand_poly_move): New function. * config/riscv/riscv-vector.h (rvv_report_required): New function. (rvv_expand_poly_move): New function. * config/riscv/riscv.cc (riscv_const_insns): Fix no return value bug. (riscv_split_symbol): Add symbol_ref with poly_int support. (riscv_legitimize_const_move): Add const poly_int move support. (riscv_legitimize_move): Add const poly_int move support. (riscv_hard_regno_mode_ok): Add VL_REGNUM and VTYPE_REGNUM register allocation. (riscv_conditional_register_usage): Fix RVV registers. --- gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-vector.cc | 254 +++++++++++++++++++++++++++++++ gcc/config/riscv/riscv-vector.h | 2 + gcc/config/riscv/riscv.cc | 46 +++++- 4 files changed, 299 insertions(+), 4 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 618eb746eaa..2d63fe76930 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -74,6 +74,7 @@ extern bool riscv_expand_block_move (rtx, rtx, rtx); extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn *); extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *); extern bool riscv_gpr_save_operation_p (rtx); +extern rtx riscv_add_offset (rtx, rtx, HOST_WIDE_INT); /* Routines implemented in riscv-c.cc. */ void riscv_cpu_cpp_builtins (cpp_reader *); diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index 4b2fe2a8d11..d09fc1b8e49 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -592,6 +592,260 @@ rvv_const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval, IN_RANGE (INTVAL (elt), minval, maxval)); } +/* Report when we try to do something that requires vector when vector is disabled. 
+ This is an error of last resort and isn't very high-quality. It usually + involves attempts to measure the vector length in some way. */ +void +rvv_report_required (void) +{ + static bool reported_p = false; + + /* Avoid reporting a slew of messages for a single oversight. */ + if (reported_p) + return; + + error ("this operation requires the RVV ISA extension"); + inform (input_location, "you can enable RVV using the command-line" + " option %<-march%>, or by using the %" + " attribute or pragma"); + reported_p = true; +} + +/* Note: clobber register holds the vlenb or 1/2 vlenb or 1/4 vlenb or 1/8 vlenb value. */ +/* Expand move for quotient. */ +static void +expand_quotient (int quotient, machine_mode mode, rtx clobber_vlenb, rtx dest) +{ + if (quotient == 0) + { + riscv_emit_move(dest, GEN_INT(0)); + return; + } + + bool is_neg = quotient < 0; + quotient = abs(quotient); + int log2 = exact_log2 (quotient); + int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1]; + + if (GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + emit_insn (gen_rtx_SET (clobber_vlenb, gen_int_mode (poly_int64 (vlenb, vlenb), mode))); + else + { + riscv_emit_move (gen_highpart (Pmode, clobber_vlenb), GEN_INT (0)); + emit_insn (gen_rtx_SET (gen_lowpart (Pmode, clobber_vlenb), gen_int_mode (poly_int64 (vlenb, vlenb), Pmode))); + } + + if (log2 == 0) + { + if (is_neg) + { + if (GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + emit_insn (gen_rtx_SET (dest, gen_rtx_NEG (mode, clobber_vlenb))); + else + { + /* We should use SImode to simulate DImode negation. */ + /* prologue and epilogue can not go through this condition. */ + gcc_assert (can_create_pseudo_p ()); + rtx reg = gen_reg_rtx (Pmode); + riscv_emit_move(dest, clobber_vlenb); + emit_insn (gen_rtx_SET (reg, + gen_rtx_NE (Pmode, gen_lowpart (Pmode, dest), const0_rtx))); + emit_insn (gen_rtx_SET (gen_highpart (Pmode, dest), + gen_rtx_NEG (Pmode, gen_highpart (Pmode, dest)))); + emit_insn (gen_rtx_SET (gen_lowpart (Pmode, dest), + gen_rtx_NEG (Pmode, gen_lowpart (Pmode, dest)))); + emit_insn (gen_rtx_SET (gen_highpart (Pmode, dest), + gen_rtx_MINUS (Pmode, gen_highpart (Pmode, dest), reg))); + } + } + else + riscv_emit_move(dest, clobber_vlenb); + } + else if (log2 != -1 + && GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + { + gcc_assert (IN_RANGE (log2, 0, 31)); + + if (is_neg) + { + emit_insn (gen_rtx_SET (dest, gen_rtx_NEG (mode, clobber_vlenb))); + emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, GEN_INT (log2)))); + } + else + emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, clobber_vlenb, GEN_INT (log2)))); + } + else if (exact_log2 (quotient + 1) != -1 + && GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + { + gcc_assert (IN_RANGE (exact_log2 (quotient + 1), 0, 31)); + emit_insn (gen_rtx_SET ( + dest, + gen_rtx_ASHIFT (mode, clobber_vlenb, GEN_INT (exact_log2 (quotient + 1))))); + + if (is_neg) + emit_insn (gen_rtx_SET (dest, gen_rtx_MINUS (mode, clobber_vlenb, dest))); + else + emit_insn (gen_rtx_SET (dest, gen_rtx_MINUS (mode, dest, clobber_vlenb))); + } + else if (exact_log2 (quotient - 1) != -1 + && GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + { + gcc_assert (IN_RANGE (exact_log2 (quotient - 1), 0, 31)); + emit_insn (gen_rtx_SET ( + dest, gen_rtx_ASHIFT (mode, clobber_vlenb, + GEN_INT (exact_log2 (quotient - 1))))); + + if (is_neg) + { + emit_insn (gen_rtx_SET (dest, gen_rtx_NEG (mode, dest))); + emit_insn (gen_rtx_SET (dest, gen_rtx_MINUS (mode, dest, clobber_vlenb))); + 
} + else + emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, clobber_vlenb))); + } + else + { + gcc_assert (TARGET_MUL + && "M-extension must be enabled to calculate the poly_int " + "size/offset."); + + if (is_neg) + riscv_emit_move (dest, GEN_INT (-quotient)); + else + riscv_emit_move (dest, GEN_INT (quotient)); + + if (GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + emit_insn (gen_rtx_SET (dest, gen_rtx_MULT (mode, dest, clobber_vlenb))); + else + { + /* We should use SImode to simulate DImode multiplication. */ + /* prologue and epilogue can not go through this condition. */ + gcc_assert (can_create_pseudo_p ()); + rtx reg = gen_reg_rtx (Pmode); + emit_insn (gen_umulsi3_highpart (reg, gen_lowpart (Pmode, dest), + gen_lowpart (Pmode, clobber_vlenb))); + emit_insn (gen_rtx_SET (gen_highpart (Pmode, clobber_vlenb), + gen_rtx_MULT (Pmode, gen_highpart (Pmode, clobber_vlenb), + gen_lowpart (Pmode, dest)))); + emit_insn (gen_rtx_SET (gen_highpart (Pmode, dest), + gen_rtx_MULT (Pmode, gen_highpart (Pmode, dest), + gen_lowpart (Pmode, clobber_vlenb)))); + emit_insn (gen_rtx_SET (gen_lowpart (Pmode, dest), + gen_rtx_MULT (Pmode, gen_lowpart (Pmode, dest), + gen_lowpart (Pmode, clobber_vlenb)))); + emit_insn (gen_rtx_SET (gen_highpart (Pmode, dest), + gen_rtx_PLUS (Pmode, gen_highpart (Pmode, dest), + gen_highpart (Pmode, clobber_vlenb)))); + emit_insn (gen_rtx_SET (gen_highpart (Pmode, dest), + gen_rtx_PLUS (Pmode, gen_highpart (Pmode, dest), reg))); + } + } +} + +/* Analyze src and emit const_poly_int mov sequence. */ + +void +rvv_expand_poly_move (machine_mode mode, rtx dest, rtx clobber, rtx src) +{ + poly_int64 value = rtx_to_poly_int64 (src); + int offset = value.coeffs[0]; + int factor = value.coeffs[1]; + int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1]; + int div_factor = 0; + + if (BYTES_PER_RISCV_VECTOR.is_constant ()) + { + gcc_assert (value.is_constant ()); + riscv_emit_move (dest, GEN_INT (value.to_constant ())); + return; + } + else if ((factor % vlenb) == 0) + expand_quotient (factor / vlenb, mode, clobber, dest); + else if ((factor % (vlenb / 2)) == 0) + { + expand_quotient (factor / (vlenb / 2), mode, clobber, dest); + div_factor = 2; + } + else if ((factor % (vlenb / 4)) == 0) + { + expand_quotient (factor / (vlenb / 4), mode, clobber, dest); + div_factor = 4; + } + else if ((factor % (vlenb / 8)) == 0) + { + expand_quotient (factor / (vlenb / 8), mode, clobber, dest); + div_factor = 8; + } + else if ((factor % (vlenb / 16)) == 0) + { + expand_quotient (factor / (vlenb / 16), mode, clobber, dest); + div_factor = 16; + } + else + gcc_unreachable (); + + if (div_factor != 0) + { + if (GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + emit_insn (gen_rtx_SET ( + dest, + gen_rtx_ASHIFTRT (mode, dest, GEN_INT (exact_log2 (div_factor))))); + else + { + /* We should use SImode to simulate DImode shift. */ + /* prologue and epilogue can not go through this condition. 
*/ + gcc_assert (can_create_pseudo_p ()); + rtx reg = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET ( + reg, gen_rtx_ASHIFT (Pmode, gen_highpart (Pmode, dest), + GEN_INT (GET_MODE_BITSIZE (Pmode) - + exact_log2 (div_factor))))); + emit_insn (gen_rtx_SET ( + gen_lowpart (Pmode, dest), + gen_rtx_LSHIFTRT (Pmode, gen_lowpart (Pmode, dest), + GEN_INT (exact_log2 (div_factor))))); + emit_insn (gen_rtx_SET ( + gen_lowpart (Pmode, dest), + gen_rtx_IOR (Pmode, reg, gen_lowpart (Pmode, dest)))); + emit_insn (gen_rtx_SET ( + gen_highpart (Pmode, dest), + gen_rtx_ASHIFTRT (Pmode, gen_highpart (Pmode, dest), + GEN_INT (exact_log2 (div_factor))))); + } + } + + HOST_WIDE_INT constant = offset - factor; + + if (constant == 0) + return; + else if (SMALL_OPERAND (constant)) + { + if (GET_MODE_SIZE (mode).to_constant () <= GET_MODE_SIZE (Pmode)) + emit_insn ( + gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, GEN_INT (constant)))); + else + { + /* We should use SImode to simulate DImode addition. */ + /* prologue and epilogue can not go through this condition. */ + gcc_assert (can_create_pseudo_p ()); + rtx reg = gen_reg_rtx (Pmode); + emit_insn ( + gen_rtx_SET (reg, gen_rtx_PLUS (Pmode, gen_lowpart (Pmode, dest), + GEN_INT (constant)))); + emit_insn (gen_rtx_SET ( + gen_lowpart (Pmode, dest), + gen_rtx_LTU (Pmode, reg, gen_lowpart (Pmode, dest)))); + emit_insn ( + gen_rtx_SET (gen_highpart (Pmode, dest), + gen_rtx_PLUS (Pmode, gen_lowpart (Pmode, dest), + gen_highpart (Pmode, dest)))); + riscv_emit_move (gen_lowpart (Pmode, dest), reg); + } + } + else + emit_insn (gen_rtx_SET (dest, riscv_add_offset (clobber, dest, constant))); +} + /* Helper functions for handling sew=64 on RV32 system. */ bool imm32_p (rtx a) diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h index e93852e3e56..b70cf676e26 100644 --- a/gcc/config/riscv/riscv-vector.h +++ b/gcc/config/riscv/riscv-vector.h @@ -20,4 +20,6 @@ #ifndef GCC_RISCV_VECTOR_H #define GCC_RISCV_VECTOR_H +void rvv_report_required (void); +void rvv_expand_poly_move (machine_mode, rtx, rtx, rtx); #endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index fc27dc957dc..7a1f19b32ee 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -1164,6 +1164,7 @@ riscv_const_insns (rtx x) factor = 1 + riscv_integer_cost (INTVAL (elt)); } } + return factor; } case CONST: @@ -1467,7 +1468,7 @@ riscv_split_symbol (rtx temp, rtx addr, machine_mode mode, rtx *low_out, riscv_force_temporary; it is only needed when OFFSET is not a SMALL_OPERAND. 
*/ -static rtx +rtx riscv_add_offset (rtx temp, rtx reg, HOST_WIDE_INT offset) { if (!SMALL_OPERAND (offset)) @@ -1723,7 +1724,20 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src) riscv_emit_move (dest, riscv_add_offset (NULL, base, INTVAL (offset))); return; } - + + /* Handle (const:SI (plus:SI (symbol_ref:SI) + (const_poly_int:SI [16, 16]))) */ + if (GET_CODE (src) == CONST && GET_CODE (XEXP (src, 0)) == PLUS + && CONST_POLY_INT_P (XEXP (XEXP (src, 0), 1))) + { + rtx reg = gen_reg_rtx (mode); + rtx clobber = gen_reg_rtx (mode); + riscv_emit_move (dest, XEXP (XEXP (src, 0), 0)); + rvv_expand_poly_move (mode, reg, clobber, XEXP (XEXP (src, 0), 1)); + emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, dest, reg))); + return; + } + src = force_const_mem (mode, src); /* When using explicit relocs, constant pool references are sometimes @@ -1738,6 +1752,28 @@ riscv_legitimize_const_move (machine_mode mode, rtx dest, rtx src) bool riscv_legitimize_move (machine_mode mode, rtx dest, rtx src) { + scalar_int_mode int_mode; + if (CONST_POLY_INT_P (src) + && is_a (mode, &int_mode)) + { + poly_int64 value = rtx_to_poly_int64 (src); + if (!value.is_constant () && !TARGET_VECTOR) + { + rvv_report_required (); + return true; + } + if (GET_MODE_SIZE (mode).to_constant () < GET_MODE_SIZE (Pmode)) + { + rtx clobber = gen_reg_rtx (Pmode); + rvv_expand_poly_move (Pmode, gen_lowpart (Pmode, dest), clobber, src); + } + else + { + rtx clobber = gen_reg_rtx (mode); + rvv_expand_poly_move (mode, dest, clobber, src); + } + return true; + } /* Expand (set (reg:QI target) (mem:QI (address))) to @@ -4998,7 +5034,9 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode) if (regsize != 1) return ((regno % regsize) == 0); - } + } + else if (regno == VTYPE_REGNUM || regno == VL_REGNUM) + return (nregs == 1 && mode == SImode); else return false; @@ -5388,7 +5426,7 @@ riscv_conditional_register_usage (void) if (!TARGET_VECTOR) { for (int regno = V_REG_FIRST; regno <= V_REG_LAST; regno++) - call_used_regs[regno] = 1; + fixed_regs[regno] = call_used_regs[regno] = 1; fixed_regs[VTYPE_REGNUM] = call_used_regs[VTYPE_REGNUM] = 1; fixed_regs[VL_REGNUM] = call_used_regs[VL_REGNUM] = 1; From patchwork Tue May 31 08:50:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54553 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EF6573959CA7 for ; Tue, 31 May 2022 08:58:06 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg511.qq.com (smtpbg511.qq.com [203.205.250.109]) by sourceware.org (Postfix) with ESMTPS id BF53238356AE for ; Tue, 31 May 2022 08:50:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BF53238356AE Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987042t2lkgoo7 Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:41 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: eTtJes0duVuLOKChFYlGOeEbUhJDXZt9R6OLWHFR99K3nPLwHO3Bdh8NGSbDH c0QTO0By4KDArQ2X98l7u14ba6tQuMWkgOqQTZoQOp+h/orQxDt0X+eEWpcvtw2uGRaGbq9 
sI/yU8NVOK/2BA6SNuIaPWHZxPCNB/4rPRdi1ZO5bsark5SQQesIeTMVZfZDdzjTajNfRTV BRYJ9j8e3lsYD+On6lDnVZzb9wGbeQq78XQamjDMuQVqrKznD5fi2oN0nUuqo9/EY6/LKFA 85M/9lcLd2cLZ6vSpmvB6gypt5jHbj5+J5WkhjAfbFELqXoz+A2ULNoKpUDysbxz+t+WNc0 qiwZG7MGiQUkmgsxjKo0JKO5ESvKwnuFIW0oElxBEzQx7CZeOM= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 09/21] Add misc function intrinsic support Date: Tue, 31 May 2022 16:50:00 +0800 Message-Id: <20220531085012.269719-10-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign9 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, T_SPF_HELO_TEMPERROR, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/predicates.md (vector_any_register_operand): New predicate. * config/riscv/riscv-protos.h (riscv_regmode_natural_size): New function. * config/riscv/riscv-vector-builtins-functions.cc (get_pred_str): New function. (get_operation_str): New function. (is_dt_ptr): New function. (is_dt_unsigned): New function. (is_dt_const): New function. (intrinsic_rename): New function. (get_dt_t_with_index): New function. (misc::assemble_name): New function. (misc::get_return_type): New function. (misc::get_argument_types): New function. (vreinterpret::expand): New function. (vlmul_ext::assemble_name): New function. (vlmul_ext::expand): New function. (vlmul_trunc::assemble_name): New function. (vlmul_trunc::expand): New function. (vundefined::assemble_name): New function. (vundefined::get_argument_types): New function. (vundefined::expand): New function. * config/riscv/riscv-vector-builtins-functions.def (vreinterpret): New macro definition. (vlmul_ext): New macro definition. (vlmul_trunc): New macro definition. (vundefined): New macro definition. * config/riscv/riscv-vector-builtins-functions.h (class misc): New class. (class vreinterpret): New class. (class vlmul_ext): New class. (class vlmul_trunc): New class. (class vundefined): New class. * config/riscv/riscv-vector-builtins-iterators.def (VCONVERFI): New iterator. (VCONVERI): New iterator. (VCONVERI2): New iterator. (VCONVERI3): New iterator. (VCONVERF): New iterator. (VSETI): New iterator. (VSETF): New iterator. (VGETI): New iterator. (VGETF): New iterator. (VLMULEXT): New iterator. (VLMULTRUNC): New iterator. * config/riscv/riscv.cc (riscv_hard_regno_mode_ok): Fix register allocation. (riscv_class_max_nregs): Fix register allocation. (riscv_can_change_mode_class): Add RVV mode can change support. (riscv_regmode_natural_size): New function. * config/riscv/riscv.h (REGMODE_NATURAL_SIZE): New targethook. * config/riscv/vector-iterators.md: New iterators and attributes. * config/riscv/vector.md (@vreinterpret): New pattern. (@vlmul_ext): New pattern. (*vlmul_ext): New pattern. 
(@vlmul_trunc): New pattern. (@vset): New pattern. (@vget): New pattern. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/misc_func.C: New test. * g++.target/riscv/rvv/rvv-intrinsic.exp: New test. * gcc.target/riscv/rvv/intrinsic/misc_func.c: New test. --- gcc/config/riscv/predicates.md | 7 + gcc/config/riscv/riscv-protos.h | 1 + .../riscv/riscv-vector-builtins-functions.cc | 288 ++ .../riscv/riscv-vector-builtins-functions.def | 22 + .../riscv/riscv-vector-builtins-functions.h | 62 + .../riscv/riscv-vector-builtins-iterators.def | 171 + gcc/config/riscv/riscv.cc | 33 +- gcc/config/riscv/riscv.h | 2 + gcc/config/riscv/vector-iterators.md | 142 + gcc/config/riscv/vector.md | 133 + .../g++.target/riscv/rvv/misc_func.C | 2597 +++++++++++++++ .../g++.target/riscv/rvv/rvv-intrinsic.exp | 39 + .../riscv/rvv/intrinsic/misc_func.c | 2921 +++++++++++++++++ 13 files changed, 6414 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.target/riscv/rvv/misc_func.C create mode 100644 gcc/testsuite/g++.target/riscv/rvv/rvv-intrinsic.exp create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/intrinsic/misc_func.c diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index 7a101676538..e31c829bf5b 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -257,6 +257,13 @@ return GET_MODE (op) == Pmode; }) +;; A special predicate that doesn't match a particular mode. +(define_special_predicate "vector_any_register_operand" + (match_code "reg, subreg") +{ + return VECTOR_MODE_P (GET_MODE (op)); +}) + (define_predicate "vector_reg_or_const0_operand" (ior (match_operand 0 "register_operand") (match_test "op == const0_rtx && !VECTOR_MODE_P (GET_MODE (op))"))) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 2d63fe76930..fd8906e47de 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -75,6 +75,7 @@ extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn *); extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *); extern bool riscv_gpr_save_operation_p (rtx); extern rtx riscv_add_offset (rtx, rtx, HOST_WIDE_INT); +extern poly_uint64 riscv_regmode_natural_size (machine_mode); /* Routines implemented in riscv-c.cc. */ void riscv_cpu_cpp_builtins (cpp_reader *); diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.cc b/gcc/config/riscv/riscv-vector-builtins-functions.cc index 0acda8f671e..a25f167f40e 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.cc +++ b/gcc/config/riscv/riscv-vector-builtins-functions.cc @@ -143,6 +143,131 @@ get_vtype_for_mode (machine_mode mode) gcc_unreachable (); } +static const char * +get_pred_str (enum predication_index pred, bool overloaded_p = false) +{ + switch (pred) + { + case PRED_void: + case PRED_none: + return ""; + + case PRED_m: + return overloaded_p ? 
"" : "_m"; + + case PRED_tam: + return "_tam"; + + case PRED_tum: + return "_tum"; + + case PRED_tu: + return "_tu"; + + case PRED_ta: + return "_ta"; + + case PRED_ma: + return "_ma"; + + case PRED_mu: + return "_mu"; + + case PRED_tama: + return "_tama"; + + case PRED_tamu: + return "_tamu"; + + case PRED_tuma: + return "_tuma"; + + case PRED_tumu: + return "_tumu"; + + default: + gcc_unreachable (); + } +} + +static const char * +get_operation_str (enum operation_index op) +{ + switch (op) + { + case OP_vv: + return "_vv"; + + case OP_vx: + return "_vx"; + + case OP_v: + return "_v"; + + case OP_wv: + return "_wv"; + + case OP_wx: + return "_wx"; + + case OP_x_x_v: + return "_x_x_v"; + + case OP_vf2: + return "_vf2"; + + case OP_vf4: + return "_vf4"; + + case OP_vf8: + return "_vf8"; + + case OP_vvm: + return "_vvm"; + + case OP_vxm: + return "_vxm"; + + case OP_x_x_w: + return "_x_x_w"; + + case OP_v_v: + return "_v_v"; + + case OP_v_x: + return "_v_x"; + + case OP_v_f: + return "_v_f"; + + case OP_vs: + return "_vs"; + + case OP_vf: + return "_vf"; + + case OP_wf: + return "_wf"; + + case OP_vfm: + return "_vfm"; + + case OP_vm: + return "_vm"; + + case OP_mm: + return "_mm"; + + case OP_m: + return "_m"; + + default: + break; + } + + return ""; +} + static const char * mode2data_type_str (machine_mode mode, bool u, bool ie) { @@ -300,6 +425,53 @@ get_dt_t (machine_mode mode, bool u, bool ptr = false, bool c = false) return type; } +/* Helper functions to get datatype of arg. */ + +static bool +is_dt_ptr (enum data_type_index dt) +{ + return dt == DT_ptr || dt == DT_uptr || dt == DT_c_ptr || dt == DT_c_uptr; +} + +static bool +is_dt_unsigned (enum data_type_index dt) +{ + return dt == DT_unsigned || dt == DT_uptr || dt == DT_c_uptr; +} + +static bool +is_dt_const (enum data_type_index dt) +{ + return dt == DT_c_ptr || dt == DT_c_uptr; +} + +/* Helper functions for builder implementation. */ +static void +intrinsic_rename (function_instance &instance, int index1, int index2) +{ + machine_mode dst_mode = instance.get_arg_pattern ().arg_list[index1]; + machine_mode src_mode = instance.get_arg_pattern ().arg_list[index2]; + bool dst_unsigned_p = instance.get_data_type_list ()[index1] == DT_unsigned; + bool src_unsigned_p = instance.get_data_type_list ()[index2] == DT_unsigned; + const char *name = instance.get_base_name (); + const char *op = get_operation_str (instance.get_operation ()); + const char *src_suffix = mode2data_type_str (src_mode, src_unsigned_p, false); + const char *dst_suffix = mode2data_type_str (dst_mode, dst_unsigned_p, false); + const char *pred = get_pred_str (instance.get_pred ()); + snprintf (instance.function_name, NAME_MAXLEN, "%s%s%s%s%s", name, op, src_suffix, dst_suffix, pred); +} + +static tree +get_dt_t_with_index (const function_instance &instance, int index) +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[index]; + bool unsigned_p = is_dt_unsigned (instance.get_data_type_list ()[index]); + bool ptr_p = is_dt_ptr (instance.get_data_type_list ()[index]); + bool c_p = is_dt_const (instance.get_data_type_list ()[index]); + return get_dt_t (mode, unsigned_p, ptr_p, c_p); +} + + /* Return true if the function has no return value. */ static bool function_returns_void_p (tree fndecl) @@ -1222,6 +1394,122 @@ vsetvlmax::expand (const function_instance &instance, tree exp, rtx target) cons !function_returns_void_p (fndecl)); } +/* A function implementation for Miscellaneous functions. 
*/ +char * +misc::assemble_name (function_instance &instance) +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + bool unsigned_p = instance.get_data_type_list ()[0] == DT_unsigned; + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (mode2data_type_str (mode, unsigned_p, false)); + return finish_name (); +} + +tree +misc::get_return_type (const function_instance &instance) const +{ + return get_dt_t_with_index (instance, 0); +} + +void +misc::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + argument_types.quick_push (get_dt_t_with_index (instance, 1)); +} + +/* A function implementation for vreinterpret functions. */ +rtx +vreinterpret::expand (const function_instance &instance, tree exp, rtx target) const +{ + enum insn_code icode = code_for_vreinterpret (instance.get_arg_pattern ().arg_list[0]); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vlmul_ext functions. */ +char * +vlmul_ext::assemble_name (function_instance &instance) +{ + machine_mode tmode = instance.get_arg_pattern ().arg_list[0]; + machine_mode smode = instance.get_arg_pattern ().arg_list[1]; + if (GET_MODE_INNER (tmode) != GET_MODE_INNER (smode)) + return nullptr; + + if (tmode == smode) + return nullptr; + + if (known_lt (GET_MODE_SIZE (tmode), GET_MODE_SIZE (smode))) + return nullptr; + + bool unsigned_p = instance.get_data_type_list ()[0] == DT_unsigned; + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (mode2data_type_str (tmode, unsigned_p, false)); + return finish_name (); +} + +rtx +vlmul_ext::expand (const function_instance &instance, tree exp, rtx target) const +{ + enum insn_code icode = code_for_vlmul_ext (instance.get_arg_pattern ().arg_list[0]); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vlmul_trunc functions. */ +char * +vlmul_trunc::assemble_name (function_instance &instance) +{ + machine_mode tmode = instance.get_arg_pattern ().arg_list[0]; + machine_mode smode = instance.get_arg_pattern ().arg_list[1]; + if (GET_MODE_INNER (tmode) != GET_MODE_INNER (smode)) + return nullptr; + + if (tmode == smode) + return nullptr; + + if (known_gt (GET_MODE_SIZE (tmode), GET_MODE_SIZE (smode))) + return nullptr; + + bool unsigned_p = instance.get_data_type_list ()[0] == DT_unsigned; + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (mode2data_type_str (tmode, unsigned_p, false)); + return finish_name (); +} + +rtx +vlmul_trunc::expand (const function_instance &instance, tree exp, rtx target) const +{ + enum insn_code icode = code_for_vlmul_trunc (instance.get_arg_pattern ().arg_list[0]); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vundefined functions. 
*/ +char * +vundefined::assemble_name (function_instance &instance) +{ + const char *name = instance.get_base_name (); + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + bool unsigned_p = instance.get_data_type_list ()[0] == DT_unsigned; + const char *dt = mode2data_type_str (mode, unsigned_p, false); + snprintf (instance.function_name, NAME_MAXLEN, "%s%s", name, dt); + return nullptr; +} + +void +vundefined::get_argument_types (const function_instance &, + vec &) const +{ +} + +rtx +vundefined::expand (const function_instance &, tree, rtx target) const +{ + emit_clobber (copy_rtx (target)); + return target; +} + } // end namespace riscv_vector using namespace riscv_vector; diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def index 666e8503d81..86130e02381 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -34,6 +34,28 @@ along with GCC; see the file COPYING3. If not see /* 6. Configuration-Setting Instructions. */ DEF_RVV_FUNCTION(vsetvl, vsetvl, (1, VITER(VI, signed)), PAT_none, PRED_none, OP_none) DEF_RVV_FUNCTION(vsetvlmax, vsetvlmax, (1, VITER(VI, signed)), PAT_none, PRED_none, OP_none) +/* Helper misc intrinsics for software programmer. */ +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VI, unsigned), VATTR(0, VI, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VI, signed), VATTR(0, VI, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VATTR(1, VCONVERFI, signed), VITER(VCONVERF, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VATTR(1, VCONVERFI, unsigned), VITER(VCONVERF, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERF, signed), VATTR(0, VCONVERFI, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERF, signed), VATTR(0, VCONVERFI, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERI, signed), VATTR(0, VCONVERI, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERI, unsigned), VATTR(0, VCONVERI, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERI2, signed), VATTR(0, VCONVERI2, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERI2, unsigned), VATTR(0, VCONVERI2, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERI3, signed), VATTR(0, VCONVERI3, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vreinterpret, vreinterpret, (2, VITER(VCONVERI3, unsigned), VATTR(0, VCONVERI3, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vlmul_ext, vlmul_ext, (2, VITER(VLMULEXT, signed), VITER(VI, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vlmul_ext, vlmul_ext, (2, VITER(VLMULEXT, unsigned), VITER(VI, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vlmul_ext, vlmul_ext, (2, VITER(VLMULEXT, signed), VITER(VF, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vlmul_trunc, vlmul_trunc, (2, VITER(VLMULTRUNC, signed), VITER(VI, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vlmul_trunc, vlmul_trunc, (2, VITER(VLMULTRUNC, unsigned), VITER(VI, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vlmul_trunc, vlmul_trunc, (2, 
VITER(VLMULTRUNC, signed), VITER(VF, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vundefined, vundefined, (1, VITER(VI, signed)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vundefined, vundefined, (1, VITER(VI, unsigned)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vundefined, vundefined, (1, VITER(VF, signed)), PAT_none, PRED_none, OP_none) #undef REQUIRED_EXTENSIONS #undef DEF_RVV_FUNCTION #undef VITER diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.h b/gcc/config/riscv/riscv-vector-builtins-functions.h index 9846ded1155..85e4af5f079 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.h +++ b/gcc/config/riscv/riscv-vector-builtins-functions.h @@ -522,6 +522,68 @@ public: virtual rtx expand (const function_instance &, tree, rtx) const override; }; +/* A function_base for Miscellaneous functions. */ +class misc : public function_builder +{ +public: + // use the same construction function as the function_builder + using function_builder::function_builder; + + virtual char * assemble_name (function_instance &) override; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual tree get_return_type (const function_instance &) const override; +}; + +/* A function_base for vreinterpret functions. */ +class vreinterpret : public misc +{ +public: + // use the same construction function as the misc + using misc::misc; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vlmul_ext functions. */ +class vlmul_ext : public misc +{ +public: + // use the same construction function as the misc + using misc::misc; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vlmul_trunc functions. */ +class vlmul_trunc : public misc +{ +public: + // use the same construction function as the misc + using misc::misc; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vundefined functions. 
*/ +class vundefined : public misc +{ +public: + // use the same construction function as the misc + using misc::misc; + + virtual char * assemble_name (function_instance &) override; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + } // namespace riscv_vector #endif // end GCC_RISCV_VECTOR_BUILTINS_FUNCTIONS_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector-builtins-iterators.def b/gcc/config/riscv/riscv-vector-builtins-iterators.def index 77a391c7630..8f2ea912804 100644 --- a/gcc/config/riscv/riscv-vector-builtins-iterators.def +++ b/gcc/config/riscv/riscv-vector-builtins-iterators.def @@ -118,6 +118,177 @@ DEF_RISCV_ARG_MODE_ATTR(V64BITI, 0, VNx2DI, VNx2DI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(V64BITI, 1, VNx4DI, VNx4DI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(V64BITI, 2, VNx8DI, VNx8DI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(V64BITI, 3, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VCONVERFI, 9) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 0, VNx2SF, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 1, VNx4SF, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 2, VNx8SF, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 3, VNx16SF, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 4, VNx32SF, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 5, VNx2DF, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 6, VNx4DF, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 7, VNx8DF, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERFI, 8, VNx16DF, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VCONVERI, 21) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 0, VNx2HI, VNx4QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 1, VNx4HI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 2, VNx8HI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 3, VNx16HI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 4, VNx32HI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 5, VNx64HI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 6, VNx2SI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 7, VNx4SI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 8, VNx8SI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 9, VNx16SI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 10, VNx32SI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 11, VNx2DI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 12, VNx4DI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 13, VNx8DI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 14, VNx16DI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 15, VNx4QI, VNx2HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 16, VNx8QI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 17, VNx16QI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 18, VNx32QI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 19, VNx64QI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI, 20, VNx128QI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VCONVERI2, 19) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 0, VNx2SI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 1, VNx4SI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 2, VNx8SI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 3, VNx16SI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 4, VNx32SI, VNx64HI, TARGET_ANY) 
+DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 5, VNx2DI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 6, VNx4DI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 7, VNx8DI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 8, VNx16DI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 9, VNx8QI, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 10, VNx16QI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 11, VNx32QI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 12, VNx64QI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 13, VNx128QI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 14, VNx4HI, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 15, VNx8HI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 16, VNx16HI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 17, VNx32HI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI2, 18, VNx64HI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VCONVERI3, 16) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 0, VNx2DI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 1, VNx4DI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 2, VNx8DI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 3, VNx16DI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 4, VNx16QI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 5, VNx32QI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 6, VNx64QI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 7, VNx128QI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 8, VNx8HI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 9, VNx16HI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 10, VNx32HI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 11, VNx64HI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 12, VNx4SI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 13, VNx8SI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 14, VNx16SI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VCONVERI3, 15, VNx32SI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VCONVERF, 9) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 0, VNx2SF, VNx2SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 1, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 2, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 3, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 4, VNx32SF, VNx32SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 5, VNx2DF, VNx2DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 6, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 7, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VCONVERF, 8, VNx16DF, VNx16DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VSETI, 12) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 0, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 1, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 2, VNx128QI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 3, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 4, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 5, VNx64HI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 6, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 7, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 8, VNx32SI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 9, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 10, 
VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VSETI, 11, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VSETF, 6) +DEF_RISCV_ARG_MODE_ATTR(VSETF, 0, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSETF, 1, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSETF, 2, VNx32SF, VNx32SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSETF, 3, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSETF, 4, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VSETF, 5, VNx16DF, VNx16DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VGETI, 12) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 0, VNx16QI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 1, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 2, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 3, VNx8HI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 4, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 5, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 6, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 7, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 8, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 9, VNx2DI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 10, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VGETI, 11, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VGETF, 6) +DEF_RISCV_ARG_MODE_ATTR(VGETF, 0, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VGETF, 1, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VGETF, 2, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VGETF, 3, VNx2DF, VNx2DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VGETF, 4, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VGETF, 5, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VLMULEXT, 25) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 0, VNx4QI, VNx4QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 1, VNx8QI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 2, VNx16QI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 3, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 4, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 5, VNx128QI, VNx128QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 6, VNx4HI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 7, VNx8HI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 8, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 9, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 10, VNx64HI, VNx64HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 11, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 12, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 13, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 14, VNx32SI, VNx32SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 15, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 16, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 17, VNx16DI, VNx16DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 18, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 19, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 20, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 21, VNx32SF, VNx32SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 22, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 23, VNx8DF, VNx8DF, 
TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULEXT, 24, VNx16DF, VNx16DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VLMULTRUNC, 25) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 0, VNx2QI, VNx2QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 1, VNx4QI, VNx4QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 2, VNx8QI, VNx8QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 3, VNx16QI, VNx16QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 4, VNx32QI, VNx32QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 5, VNx64QI, VNx64QI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 6, VNx2HI, VNx2HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 7, VNx4HI, VNx4HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 8, VNx8HI, VNx8HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 9, VNx16HI, VNx16HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 10, VNx32HI, VNx32HI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 11, VNx2SI, VNx2SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 12, VNx4SI, VNx4SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 13, VNx8SI, VNx8SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 14, VNx16SI, VNx16SI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 15, VNx2DI, VNx2DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 16, VNx4DI, VNx4DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 17, VNx8DI, VNx8DI, TARGET_ANY) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 18, VNx2SF, VNx2SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 19, VNx4SF, VNx4SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 20, VNx8SF, VNx8SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 21, VNx16SF, VNx16SF, TARGET_HARD_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 22, VNx2DF, VNx2DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 23, VNx4DF, VNx4DF, TARGET_DOUBLE_FLOAT) +DEF_RISCV_ARG_MODE_ATTR(VLMULTRUNC, 24, VNx8DF, VNx8DF, TARGET_DOUBLE_FLOAT) DEF_RISCV_ARG_MODE_ATTR_VARIABLE(VM, 69) DEF_RISCV_ARG_MODE_ATTR(VM, 0, VNx2BI, VNx2BI, TARGET_ANY) DEF_RISCV_ARG_MODE_ATTR(VM, 1, VNx4BI, VNx4BI, TARGET_ANY) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 7a1f19b32ee..e88057e992a 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -5009,6 +5009,9 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode) if (rvv_mode_p (mode)) return false; + if (!FLOAT_MODE_P (mode)) + return false; + if (!FP_REG_P (regno + nregs - 1)) return false; @@ -5073,13 +5076,13 @@ riscv_class_max_nregs (reg_class_t rclass, machine_mode mode) if (reg_class_subset_p (rclass, GR_REGS)) return riscv_hard_regno_nregs (GP_REG_FIRST, mode); - if (reg_class_subset_p (V_REGS, rclass)) + if (reg_class_subset_p (rclass, V_REGS)) return riscv_hard_regno_nregs (V_REG_FIRST, mode); - if (reg_class_subset_p (VL_REGS, rclass)) + if (reg_class_subset_p (rclass, VL_REGS)) return 1; - if (reg_class_subset_p (VTYPE_REGS, rclass)) + if (reg_class_subset_p (rclass, VTYPE_REGS)) return 1; return 0; @@ -5718,8 +5721,11 @@ riscv_slow_unaligned_access (machine_mode, unsigned int) /* Implement TARGET_CAN_CHANGE_MODE_CLASS. */ static bool -riscv_can_change_mode_class (machine_mode, machine_mode, reg_class_t rclass) +riscv_can_change_mode_class (machine_mode from, machine_mode to, reg_class_t rclass) { + if (rvv_mode_p (from) && rvv_mode_p (to)) + return true; + return !reg_classes_intersect_p (FP_REGS, rclass); } @@ -6003,6 +6009,25 @@ riscv_mangle_type (const_tree type) return NULL; } +/* Implement REGMODE_NATURAL_SIZE. 
*/ + +poly_uint64 +riscv_regmode_natural_size (machine_mode mode) +{ + /* The natural size for RVV data modes is one RVV data vector, + and similarly for predicates. We can't independently modify + anything smaller than that. */ + /* ??? For now, only do this for variable-width RVV registers. + Doing it for constant-sized registers breaks lower-subreg.c. */ + /* For partial tuple vector, we should + use the mode of the vector subpart, + in case of segment loads and stores. */ + if (!riscv_vector_chunks.is_constant () && rvv_mode_p (mode)) + return BYTES_PER_RISCV_VECTOR; + + return UNITS_PER_WORD; +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index cb4cfc0f73e..c54c984e70b 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -1069,4 +1069,6 @@ extern void riscv_remove_unneeded_save_restore_calls (void); #define REGISTER_TARGET_PRAGMAS() riscv_register_pragmas () +#define REGMODE_NATURAL_SIZE(MODE) riscv_regmode_natural_size (MODE) + #endif /* ! GCC_RISCV_H */ diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index e01305ef3fc..df9011ee901 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -23,6 +23,18 @@ UNSPEC_VSETVLI ;; RVV instructions. UNSPEC_RVV + ;; reinterpret + UNSPEC_REINTERPRET + ;; lmul_ext + UNSPEC_LMUL_EXT + ;; lmul_trunc + UNSPEC_LMUL_TRUNC + ;; set a vector + UNSPEC_SET_VECTOR + ;; get a vector + UNSPEC_GET_VECTOR + ;; vec_duplicate + UNSPEC_VEC_DUPLICATE ;; vector select UNSPEC_SELECT @@ -85,6 +97,136 @@ ;; All vector modes supported for integer sew = 64. (define_mode_iterator V64BITI [VNx2DI VNx4DI VNx8DI VNx16DI]) +;; vector integer and float-point mode interconversion. +(define_mode_attr VCONVERFI [ + (VNx2SF "VNx2SI") + (VNx4SF "VNx4SI") + (VNx8SF "VNx8SI") + (VNx16SF "VNx16SI") + (VNx32SF "VNx32SI") + (VNx2DF "VNx2DI") + (VNx4DF "VNx4DI") + (VNx8DF "VNx8DI") + (VNx16DF "VNx16DI")]) + +;; vector integer same lmul but different sew interconversion. +(define_mode_attr VCONVERI [ + (VNx2HI "VNx4QI") + (VNx4HI "VNx8QI") + (VNx8HI "VNx16QI") + (VNx16HI "VNx32QI") + (VNx32HI "VNx64QI") + (VNx64HI "VNx128QI") + (VNx2SI "VNx8QI") + (VNx4SI "VNx16QI") + (VNx8SI "VNx32QI") + (VNx16SI "VNx64QI") + (VNx32SI "VNx128QI") + (VNx2DI "VNx16QI") + (VNx4DI "VNx32QI") + (VNx8DI "VNx64QI") + (VNx16DI "VNx128QI") + (VNx4QI "VNx2HI") + (VNx8QI "VNx4HI") + (VNx16QI "VNx8HI") + (VNx32QI "VNx16HI") + (VNx64QI "VNx32HI") + (VNx128QI "VNx64HI")]) + +(define_mode_attr VCONVERI2 [ + (VNx2SI "VNx4HI") + (VNx4SI "VNx8HI") + (VNx8SI "VNx16HI") + (VNx16SI "VNx32HI") + (VNx32SI "VNx64HI") + (VNx2DI "VNx8HI") + (VNx4DI "VNx16HI") + (VNx8DI "VNx32HI") + (VNx16DI "VNx64HI") + (VNx8QI "VNx2SI") + (VNx16QI "VNx4SI") + (VNx32QI "VNx8SI") + (VNx64QI "VNx16SI") + (VNx128QI "VNx32SI") + (VNx4HI "VNx2SI") + (VNx8HI "VNx4SI") + (VNx16HI "VNx8SI") + (VNx32HI "VNx16SI") + (VNx64HI "VNx32SI")]) + +(define_mode_attr VCONVERI3 [ + (VNx2DI "VNx4SI") + (VNx4DI "VNx8SI") + (VNx8DI "VNx16SI") + (VNx16DI "VNx32SI") + (VNx16QI "VNx2DI") + (VNx32QI "VNx4DI") + (VNx64QI "VNx8DI") + (VNx128QI "VNx16DI") + (VNx8HI "VNx2DI") + (VNx16HI "VNx4DI") + (VNx32HI "VNx8DI") + (VNx64HI "VNx16DI") + (VNx4SI "VNx2DI") + (VNx8SI "VNx4DI") + (VNx16SI "VNx8DI") + (VNx32SI "VNx16DI")]) + +;; vector iterator integer and float-point mode interconversion. 
+(define_mode_iterator VCONVERF [ + (VNx2SF "TARGET_HARD_FLOAT") + (VNx4SF "TARGET_HARD_FLOAT") + (VNx8SF "TARGET_HARD_FLOAT") + (VNx16SF "TARGET_HARD_FLOAT") + (VNx32SF "TARGET_HARD_FLOAT") + (VNx2DF "TARGET_DOUBLE_FLOAT") + (VNx4DF "TARGET_DOUBLE_FLOAT") + (VNx8DF "TARGET_DOUBLE_FLOAT") + (VNx16DF "TARGET_DOUBLE_FLOAT")]) + +;; vector modes can be set. +(define_mode_iterator VSETI [ + VNx32QI VNx64QI VNx128QI + VNx16HI VNx32HI VNx64HI + VNx8SI VNx16SI VNx32SI + VNx4DI VNx8DI VNx16DI]) + +(define_mode_iterator VSETF [ + (VNx8SF "TARGET_HARD_FLOAT") (VNx16SF "TARGET_HARD_FLOAT") (VNx32SF "TARGET_HARD_FLOAT") + (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT") (VNx16DF "TARGET_DOUBLE_FLOAT")]) + +;; vector modes can be get. +(define_mode_iterator VGETI [ + VNx16QI VNx32QI VNx64QI + VNx8HI VNx16HI VNx32HI + VNx4SI VNx8SI VNx16SI + VNx2DI VNx4DI VNx8DI]) + +(define_mode_iterator VGETF [ + (VNx4SF "TARGET_HARD_FLOAT") (VNx8SF "TARGET_HARD_FLOAT") (VNx16SF "TARGET_HARD_FLOAT") + (VNx2DF "TARGET_DOUBLE_FLOAT") (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT")]) + +;; All vector extend modes supported. +(define_mode_iterator VLMULEXT [ + VNx4QI VNx8QI VNx16QI VNx32QI VNx64QI VNx128QI + VNx4HI VNx8HI VNx16HI VNx32HI VNx64HI + VNx4SI VNx8SI VNx16SI VNx32SI + VNx4DI VNx8DI VNx16DI + (VNx4SF "TARGET_HARD_FLOAT") (VNx8SF "TARGET_HARD_FLOAT") + (VNx16SF "TARGET_HARD_FLOAT") (VNx32SF "TARGET_HARD_FLOAT") + (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT") + (VNx16DF "TARGET_DOUBLE_FLOAT")]) + +;; All vector truncate modes supported. +(define_mode_iterator VLMULTRUNC [ + VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI VNx64QI + VNx2HI VNx4HI VNx8HI VNx16HI VNx32HI + VNx2SI VNx4SI VNx8SI VNx16SI + VNx2DI VNx4DI VNx8DI + (VNx2SF "TARGET_HARD_FLOAT") (VNx4SF "TARGET_HARD_FLOAT") (VNx8SF "TARGET_HARD_FLOAT") + (VNx16SF "TARGET_HARD_FLOAT") + (VNx2DF "TARGET_DOUBLE_FLOAT") (VNx4DF "TARGET_DOUBLE_FLOAT") (VNx8DF "TARGET_DOUBLE_FLOAT")]) + ;; Map a vector int or float mode to a vector compare mode. (define_mode_attr VM [ (VNx2BI "VNx2BI") (VNx4BI "VNx4BI") (VNx8BI "VNx8BI") (VNx16BI "VNx16BI") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 1731d969372..54e68aa165b 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -349,6 +349,139 @@ [(set_attr "type" "vsetvl") (set_attr "mode" "none")]) +;; ------------------------------------------------------------------------- +;; ---- [INT,FP] Vector Misc Functions +;; ------------------------------------------------------------------------- +;; Includes: +;; - Reinterpret between different vector modes +;; - Vector LMUL extension +;; - Vector LMUL truncation +;; - insert a vector to a vector group +;; - get a vector from a vector group +;; ------------------------------------------------------------------------- + +;; Reinterpret operand 1 in operand 0's mode, without changing its contents. +;; This is equivalent to a subreg on little-endian targets but not for +;; big-endian; see the comment at the head of the file for details. 
+(define_expand "@vreinterpret" + [(set (match_operand:V 0 "register_operand") + (unspec:V + [(match_operand 1 "vector_any_register_operand")] UNSPEC_REINTERPRET))] + "TARGET_VECTOR" +{ + machine_mode src_mode = GET_MODE (operands[1]); + if (targetm.can_change_mode_class (mode, src_mode, FP_REGS)) + { + emit_move_insn (operands[0], gen_lowpart (mode, operands[1])); + DONE; + } +}) + +;; Vector LMUL extension +(define_expand "@vlmul_ext" + [(set (match_operand:VLMULEXT 0 "register_operand") + (unspec:VLMULEXT + [(match_operand 1 "vector_any_register_operand")] UNSPEC_LMUL_EXT))] + "TARGET_VECTOR" +{ +}) + +(define_insn_and_split "*vlmul_ext" + [(set (match_operand:VLMULEXT 0 "register_operand" "=vr, ?&vr") + (unspec:VLMULEXT + [(match_operand:V 1 "register_operand" "0, vr")] UNSPEC_LMUL_EXT))] + "TARGET_VECTOR" + "#" + "&& reload_completed" + [(const_int 0)] + { + rtx subreg = simplify_gen_subreg (mode, operands[0], mode, 0); + riscv_emit_move (subreg, operands[1]); + DONE; + }) + +;; Vector LMUL truncation +(define_expand "@vlmul_trunc" + [(set (match_operand:VLMULTRUNC 0 "register_operand") + (unspec:VLMULTRUNC + [(match_operand 1 "vector_any_register_operand")] UNSPEC_LMUL_TRUNC))] + "TARGET_VECTOR" +{ + rtx subreg = simplify_gen_subreg (mode, operands[1], GET_MODE (operands[1]), 0); + riscv_emit_move (operands[0], subreg); + DONE; +}) + +;; insert a vector to a vector group +(define_expand "@vset" + [(set (match_operand:VSETI 0 "register_operand") + (unspec:VSETI + [(match_operand:VSETI 1 "register_operand" "0") + (match_operand 2 "const_int_operand") + (match_operand 3 "vector_any_register_operand")] UNSPEC_SET_VECTOR))] + "TARGET_VECTOR" +{ + unsigned int nvecs = exact_div (GET_MODE_SIZE (GET_MODE (operands[0])), + GET_MODE_SIZE (GET_MODE (operands[3]))).to_constant (); + poly_int64 offset = (INTVAL (operands[2]) & (nvecs - 1)) + * GET_MODE_SIZE (GET_MODE (operands[3])); + rtx subreg = simplify_gen_subreg (GET_MODE (operands[3]), operands[1], GET_MODE (operands[1]), offset); + riscv_emit_move (subreg, operands[3]); + riscv_emit_move (operands[0], operands[1]); + DONE; +}) + +(define_expand "@vset" + [(set (match_operand:VSETF 0 "register_operand") + (unspec:VSETF + [(match_operand:VSETF 1 "register_operand" "0") + (match_operand 2 "const_int_operand") + (match_operand 3 "vector_any_register_operand")] UNSPEC_SET_VECTOR))] + "TARGET_VECTOR" +{ + unsigned int nvecs = exact_div (GET_MODE_SIZE (GET_MODE (operands[0])), + GET_MODE_SIZE (GET_MODE (operands[3]))).to_constant (); + poly_int64 offset = (INTVAL (operands[2]) & (nvecs - 1)) + * GET_MODE_SIZE (GET_MODE (operands[3])); + rtx subreg = simplify_gen_subreg (GET_MODE (operands[3]), operands[1], GET_MODE (operands[1]), offset); + riscv_emit_move (subreg, operands[3]); + riscv_emit_move (operands[0], operands[1]); + DONE; +}) + +;; get a vector from a vector group +(define_expand "@vget" + [(set (match_operand:VGETI 0 "register_operand") + (unspec:VGETI + [(match_operand 1 "vector_any_register_operand") + (match_operand 2 "const_int_operand")] UNSPEC_GET_VECTOR))] + "TARGET_VECTOR" +{ + unsigned int nvecs = exact_div (GET_MODE_SIZE (GET_MODE (operands[1])), + GET_MODE_SIZE (GET_MODE (operands[0]))).to_constant (); + poly_int64 offset = (INTVAL (operands[2]) & (nvecs - 1)) + * GET_MODE_SIZE (GET_MODE (operands[0])); + rtx subreg = simplify_gen_subreg (GET_MODE (operands[0]), operands[1], GET_MODE (operands[1]), offset); + riscv_emit_move (operands[0], subreg); + DONE; +}) + +(define_expand "@vget" + [(set (match_operand:VGETF 0 
"register_operand") + (unspec:VGETF + [(match_operand 1 "vector_any_register_operand") + (match_operand 2 "const_int_operand")] UNSPEC_GET_VECTOR))] + "TARGET_VECTOR" +{ + unsigned int nvecs = exact_div (GET_MODE_SIZE (GET_MODE (operands[1])), + GET_MODE_SIZE (GET_MODE (operands[0]))).to_constant (); + poly_int64 offset = (INTVAL (operands[2]) & (nvecs - 1)) + * GET_MODE_SIZE (GET_MODE (operands[0])); + rtx subreg = simplify_gen_subreg (GET_MODE (operands[0]), operands[1], GET_MODE (operands[1]), offset); + riscv_emit_move (operands[0], subreg); + DONE; +}) + ;; ------------------------------------------------------------------------------- ;; ---- 7. Vector Loads and Stores ;; ------------------------------------------------------------------------------- diff --git a/gcc/testsuite/g++.target/riscv/rvv/misc_func.C b/gcc/testsuite/g++.target/riscv/rvv/misc_func.C new file mode 100644 index 00000000000..b527bce1cab --- /dev/null +++ b/gcc/testsuite/g++.target/riscv/rvv/misc_func.C @@ -0,0 +1,2597 @@ +/* { dg-do compile } */ +/* { dg-skip-if "test vector intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */ + +#include +#include + +vuint8mf8_t test_vreinterpret_v_i8mf8_u8mf8(vint8mf8_t src) +{ + vuint8mf8_t a = vreinterpret_u8mf8(src); + return a; +} + +vuint8mf4_t test_vreinterpret_v_i8mf4_u8mf4(vint8mf4_t src) +{ + vuint8mf4_t a = vreinterpret_u8mf4(src); + return a; +} + +vuint8mf2_t test_vreinterpret_v_i8mf2_u8mf2(vint8mf2_t src) +{ + vuint8mf2_t a = vreinterpret_u8mf2(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_i8m1_u8m1(vint8m1_t src) +{ + vuint8m1_t a = vreinterpret_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_i8m2_u8m2(vint8m2_t src) +{ + vuint8m2_t a = vreinterpret_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_i8m4_u8m4(vint8m4_t src) +{ + vuint8m4_t a = vreinterpret_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_i8m8_u8m8(vint8m8_t src) +{ + vuint8m8_t a = vreinterpret_u8m8(src); + return a; +} + +vint8mf8_t test_vreinterpret_v_u8mf8_i8mf8(vuint8mf8_t src) +{ + vint8mf8_t a = vreinterpret_i8mf8(src); + return a; +} + +vint8mf4_t test_vreinterpret_v_u8mf4_i8mf4(vuint8mf4_t src) +{ + vint8mf4_t a = vreinterpret_i8mf4(src); + return a; +} + +vint8mf2_t test_vreinterpret_v_u8mf2_i8mf2(vuint8mf2_t src) +{ + vint8mf2_t a = vreinterpret_i8mf2(src); + return a; +} + +vint8m1_t test_vreinterpret_v_u8m1_i8m1(vuint8m1_t src) +{ + vint8m1_t a = vreinterpret_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_u8m2_i8m2(vuint8m2_t src) +{ + vint8m2_t a = vreinterpret_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_u8m4_i8m4(vuint8m4_t src) +{ + vint8m4_t a = vreinterpret_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_u8m8_i8m8(vuint8m8_t src) +{ + vint8m8_t a = vreinterpret_i8m8(src); + return a; +} + +vuint16mf4_t test_vreinterpret_v_i16mf4_u16mf4(vint16mf4_t src) +{ + vuint16mf4_t a = vreinterpret_u16mf4(src); + return a; +} + +vuint16mf2_t test_vreinterpret_v_i16mf2_u16mf2(vint16mf2_t src) +{ + vuint16mf2_t a = vreinterpret_u16mf2(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_i16m1_u16m1(vint16m1_t src) +{ + vuint16m1_t a = vreinterpret_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_i16m2_u16m2(vint16m2_t src) +{ + vuint16m2_t a = vreinterpret_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_i16m4_u16m4(vint16m4_t src) +{ + vuint16m4_t a = vreinterpret_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_i16m8_u16m8(vint16m8_t src) +{ + vuint16m8_t a = 
vreinterpret_u16m8(src); + return a; +} + +vint16mf4_t test_vreinterpret_v_u16mf4_i16mf4(vuint16mf4_t src) +{ + vint16mf4_t a = vreinterpret_i16mf4(src); + return a; +} + +vint16mf2_t test_vreinterpret_v_u16mf2_i16mf2(vuint16mf2_t src) +{ + vint16mf2_t a = vreinterpret_i16mf2(src); + return a; +} + +vint16m1_t test_vreinterpret_v_u16m1_i16m1(vuint16m1_t src) +{ + vint16m1_t a = vreinterpret_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_u16m2_i16m2(vuint16m2_t src) +{ + vint16m2_t a = vreinterpret_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_u16m4_i16m4(vuint16m4_t src) +{ + vint16m4_t a = vreinterpret_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_u16m8_i16m8(vuint16m8_t src) +{ + vint16m8_t a = vreinterpret_i16m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_i32mf2_u32mf2(vint32mf2_t src) +{ + vuint32mf2_t a = vreinterpret_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_i32m1_u32m1(vint32m1_t src) +{ + vuint32m1_t a = vreinterpret_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_i32m2_u32m2(vint32m2_t src) +{ + vuint32m2_t a = vreinterpret_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_i32m4_u32m4(vint32m4_t src) +{ + vuint32m4_t a = vreinterpret_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_i32m8_u32m8(vint32m8_t src) +{ + vuint32m8_t a = vreinterpret_u32m8(src); + return a; +} + +vfloat32mf2_t test_vreinterpret_v_i32mf2_f32mf2(vint32mf2_t src) +{ + vfloat32mf2_t a = vreinterpret_f32mf2(src); + return a; +} + +vfloat32m1_t test_vreinterpret_v_i32m1_f32m1(vint32m1_t src) +{ + vfloat32m1_t a = vreinterpret_f32m1(src); + return a; +} + +vfloat32m2_t test_vreinterpret_v_i32m2_f32m2(vint32m2_t src) +{ + vfloat32m2_t a = vreinterpret_f32m2(src); + return a; +} + +vfloat32m4_t test_vreinterpret_v_i32m4_f32m4(vint32m4_t src) +{ + vfloat32m4_t a = vreinterpret_f32m4(src); + return a; +} + +vfloat32m8_t test_vreinterpret_v_i32m8_f32m8(vint32m8_t src) +{ + vfloat32m8_t a = vreinterpret_f32m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_u32mf2_i32mf2(vuint32mf2_t src) +{ + vint32mf2_t a = vreinterpret_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_u32m1_i32m1(vuint32m1_t src) +{ + vint32m1_t a = vreinterpret_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_u32m2_i32m2(vuint32m2_t src) +{ + vint32m2_t a = vreinterpret_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_u32m4_i32m4(vuint32m4_t src) +{ + vint32m4_t a = vreinterpret_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_u32m8_i32m8(vuint32m8_t src) +{ + vint32m8_t a = vreinterpret_i32m8(src); + return a; +} + +vfloat32mf2_t test_vreinterpret_v_u32mf2_f32mf2(vuint32mf2_t src) +{ + vfloat32mf2_t a = vreinterpret_f32mf2(src); + return a; +} + +vfloat32m1_t test_vreinterpret_v_u32m1_f32m1(vuint32m1_t src) +{ + vfloat32m1_t a = vreinterpret_f32m1(src); + return a; +} + +vfloat32m2_t test_vreinterpret_v_u32m2_f32m2(vuint32m2_t src) +{ + vfloat32m2_t a = vreinterpret_f32m2(src); + return a; +} + +vfloat32m4_t test_vreinterpret_v_u32m4_f32m4(vuint32m4_t src) +{ + vfloat32m4_t a = vreinterpret_f32m4(src); + return a; +} + +vfloat32m8_t test_vreinterpret_v_u32m8_f32m8(vuint32m8_t src) +{ + vfloat32m8_t a = vreinterpret_f32m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_f32mf2_i32mf2(vfloat32mf2_t src) +{ + vint32mf2_t a = vreinterpret_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_f32m1_i32m1(vfloat32m1_t src) +{ + vint32m1_t a = vreinterpret_i32m1(src); + 
return a; +} + +vint32m2_t test_vreinterpret_v_f32m2_i32m2(vfloat32m2_t src) +{ + vint32m2_t a = vreinterpret_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_f32m4_i32m4(vfloat32m4_t src) +{ + vint32m4_t a = vreinterpret_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_f32m8_i32m8(vfloat32m8_t src) +{ + vint32m8_t a = vreinterpret_i32m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_f32mf2_u32mf2(vfloat32mf2_t src) +{ + vuint32mf2_t a = vreinterpret_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_f32m1_u32m1(vfloat32m1_t src) +{ + vuint32m1_t a = vreinterpret_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_f32m2_u32m2(vfloat32m2_t src) +{ + vuint32m2_t a = vreinterpret_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_f32m4_u32m4(vfloat32m4_t src) +{ + vuint32m4_t a = vreinterpret_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_f32m8_u32m8(vfloat32m8_t src) +{ + vuint32m8_t a = vreinterpret_u32m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_i64m1_u64m1(vint64m1_t src) +{ + vuint64m1_t a = vreinterpret_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_i64m2_u64m2(vint64m2_t src) +{ + vuint64m2_t a = vreinterpret_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_i64m4_u64m4(vint64m4_t src) +{ + vuint64m4_t a = vreinterpret_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_i64m8_u64m8(vint64m8_t src) +{ + vuint64m8_t a = vreinterpret_u64m8(src); + return a; +} + +vfloat64m1_t test_vreinterpret_v_i64m1_f64m1(vint64m1_t src) +{ + vfloat64m1_t a = vreinterpret_f64m1(src); + return a; +} + +vfloat64m2_t test_vreinterpret_v_i64m2_f64m2(vint64m2_t src) +{ + vfloat64m2_t a = vreinterpret_f64m2(src); + return a; +} + +vfloat64m4_t test_vreinterpret_v_i64m4_f64m4(vint64m4_t src) +{ + vfloat64m4_t a = vreinterpret_f64m4(src); + return a; +} + +vfloat64m8_t test_vreinterpret_v_i64m8_f64m8(vint64m8_t src) +{ + vfloat64m8_t a = vreinterpret_f64m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_u64m1_i64m1(vuint64m1_t src) +{ + vint64m1_t a = vreinterpret_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_u64m2_i64m2(vuint64m2_t src) +{ + vint64m2_t a = vreinterpret_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_u64m4_i64m4(vuint64m4_t src) +{ + vint64m4_t a = vreinterpret_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_u64m8_i64m8(vuint64m8_t src) +{ + vint64m8_t a = vreinterpret_i64m8(src); + return a; +} + +vfloat64m1_t test_vreinterpret_v_u64m1_f64m1(vuint64m1_t src) +{ + vfloat64m1_t a = vreinterpret_f64m1(src); + return a; +} + +vfloat64m2_t test_vreinterpret_v_u64m2_f64m2(vuint64m2_t src) +{ + vfloat64m2_t a = vreinterpret_f64m2(src); + return a; +} + +vfloat64m4_t test_vreinterpret_v_u64m4_f64m4(vuint64m4_t src) +{ + vfloat64m4_t a = vreinterpret_f64m4(src); + return a; +} + +vfloat64m8_t test_vreinterpret_v_u64m8_f64m8(vuint64m8_t src) +{ + vfloat64m8_t a = vreinterpret_f64m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_f64m1_i64m1(vfloat64m1_t src) +{ + vint64m1_t a = vreinterpret_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_f64m2_i64m2(vfloat64m2_t src) +{ + vint64m2_t a = vreinterpret_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_f64m4_i64m4(vfloat64m4_t src) +{ + vint64m4_t a = vreinterpret_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_f64m8_i64m8(vfloat64m8_t src) +{ + vint64m8_t a = vreinterpret_i64m8(src); + return a; +} + +vuint64m1_t 
test_vreinterpret_v_f64m1_u64m1(vfloat64m1_t src) +{ + vuint64m1_t a = vreinterpret_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_f64m2_u64m2(vfloat64m2_t src) +{ + vuint64m2_t a = vreinterpret_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_f64m4_u64m4(vfloat64m4_t src) +{ + vuint64m4_t a = vreinterpret_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_f64m8_u64m8(vfloat64m8_t src) +{ + vuint64m8_t a = vreinterpret_u64m8(src); + return a; +} + +vint16mf4_t test_vreinterpret_v_i8mf4_i16mf4(vint8mf4_t src) +{ + vint16mf4_t a = vreinterpret_i16mf4(src); + return a; +} + +vint16mf2_t test_vreinterpret_v_i8mf2_i16mf2(vint8mf2_t src) +{ + vint16mf2_t a = vreinterpret_i16mf2(src); + return a; +} + +vint16m1_t test_vreinterpret_v_i8m1_i16m1(vint8m1_t src) +{ + vint16m1_t a = vreinterpret_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_i8m2_i16m2(vint8m2_t src) +{ + vint16m2_t a = vreinterpret_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_i8m4_i16m4(vint8m4_t src) +{ + vint16m4_t a = vreinterpret_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_i8m8_i16m8(vint8m8_t src) +{ + vint16m8_t a = vreinterpret_i16m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_i8mf2_i32mf2(vint8mf2_t src) +{ + vint32mf2_t a = vreinterpret_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_i8m1_i32m1(vint8m1_t src) +{ + vint32m1_t a = vreinterpret_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_i8m2_i32m2(vint8m2_t src) +{ + vint32m2_t a = vreinterpret_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_i8m4_i32m4(vint8m4_t src) +{ + vint32m4_t a = vreinterpret_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_i8m8_i32m8(vint8m8_t src) +{ + vint32m8_t a = vreinterpret_i32m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_i8m1_i64m1(vint8m1_t src) +{ + vint64m1_t a = vreinterpret_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_i8m2_i64m2(vint8m2_t src) +{ + vint64m2_t a = vreinterpret_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_i8m4_i64m4(vint8m4_t src) +{ + vint64m4_t a = vreinterpret_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_i8m8_i64m8(vint8m8_t src) +{ + vint64m8_t a = vreinterpret_i64m8(src); + return a; +} + +vint8mf4_t test_vreinterpret_v_i16mf4_i8mf4(vint16mf4_t src) +{ + vint8mf4_t a = vreinterpret_i8mf4(src); + return a; +} + +vint8mf2_t test_vreinterpret_v_i16mf2_i8mf2(vint16mf2_t src) +{ + vint8mf2_t a = vreinterpret_i8mf2(src); + return a; +} + +vint8m1_t test_vreinterpret_v_i16m1_i8m1(vint16m1_t src) +{ + vint8m1_t a = vreinterpret_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_i16m2_i8m2(vint16m2_t src) +{ + vint8m2_t a = vreinterpret_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_i16m4_i8m4(vint16m4_t src) +{ + vint8m4_t a = vreinterpret_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_i16m8_i8m8(vint16m8_t src) +{ + vint8m8_t a = vreinterpret_i8m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_i16mf2_i32mf2(vint16mf2_t src) +{ + vint32mf2_t a = vreinterpret_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_i16m1_i32m1(vint16m1_t src) +{ + vint32m1_t a = vreinterpret_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_i16m2_i32m2(vint16m2_t src) +{ + vint32m2_t a = vreinterpret_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_i16m4_i32m4(vint16m4_t src) +{ + vint32m4_t a = vreinterpret_i32m4(src); + return a; +} + +vint32m8_t 
test_vreinterpret_v_i16m8_i32m8(vint16m8_t src) +{ + vint32m8_t a = vreinterpret_i32m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_i16m1_i64m1(vint16m1_t src) +{ + vint64m1_t a = vreinterpret_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_i16m2_i64m2(vint16m2_t src) +{ + vint64m2_t a = vreinterpret_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_i16m4_i64m4(vint16m4_t src) +{ + vint64m4_t a = vreinterpret_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_i16m8_i64m8(vint16m8_t src) +{ + vint64m8_t a = vreinterpret_i64m8(src); + return a; +} + +vint8mf2_t test_vreinterpret_v_i32mf2_i8mf2(vint32mf2_t src) +{ + vint8mf2_t a = vreinterpret_i8mf2(src); + return a; +} + +vint8m1_t test_vreinterpret_v_i32m1_i8m1(vint32m1_t src) +{ + vint8m1_t a = vreinterpret_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_i32m2_i8m2(vint32m2_t src) +{ + vint8m2_t a = vreinterpret_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_i32m4_i8m4(vint32m4_t src) +{ + vint8m4_t a = vreinterpret_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_i32m8_i8m8(vint32m8_t src) +{ + vint8m8_t a = vreinterpret_i8m8(src); + return a; +} + +vint16mf2_t test_vreinterpret_v_i32mf2_i16mf2(vint32mf2_t src) +{ + vint16mf2_t a = vreinterpret_i16mf2(src); + return a; +} + +vint16m1_t test_vreinterpret_v_i32m1_i16m1(vint32m1_t src) +{ + vint16m1_t a = vreinterpret_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_i32m2_i16m2(vint32m2_t src) +{ + vint16m2_t a = vreinterpret_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_i32m4_i16m4(vint32m4_t src) +{ + vint16m4_t a = vreinterpret_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_i32m8_i16m8(vint32m8_t src) +{ + vint16m8_t a = vreinterpret_i16m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_i32m1_i64m1(vint32m1_t src) +{ + vint64m1_t a = vreinterpret_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_i32m2_i64m2(vint32m2_t src) +{ + vint64m2_t a = vreinterpret_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_i32m4_i64m4(vint32m4_t src) +{ + vint64m4_t a = vreinterpret_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_i32m8_i64m8(vint32m8_t src) +{ + vint64m8_t a = vreinterpret_i64m8(src); + return a; +} + +vint8m1_t test_vreinterpret_v_i64m1_i8m1(vint64m1_t src) +{ + vint8m1_t a = vreinterpret_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_i64m2_i8m2(vint64m2_t src) +{ + vint8m2_t a = vreinterpret_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_i64m4_i8m4(vint64m4_t src) +{ + vint8m4_t a = vreinterpret_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_i64m8_i8m8(vint64m8_t src) +{ + vint8m8_t a = vreinterpret_i8m8(src); + return a; +} + +vint16m1_t test_vreinterpret_v_i64m1_i16m1(vint64m1_t src) +{ + vint16m1_t a = vreinterpret_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_i64m2_i16m2(vint64m2_t src) +{ + vint16m2_t a = vreinterpret_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_i64m4_i16m4(vint64m4_t src) +{ + vint16m4_t a = vreinterpret_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_i64m8_i16m8(vint64m8_t src) +{ + vint16m8_t a = vreinterpret_i16m8(src); + return a; +} + +vint32m1_t test_vreinterpret_v_i64m1_i32m1(vint64m1_t src) +{ + vint32m1_t a = vreinterpret_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_i64m2_i32m2(vint64m2_t src) +{ + vint32m2_t a = vreinterpret_i32m2(src); + return a; +} + +vint32m4_t 
test_vreinterpret_v_i64m4_i32m4(vint64m4_t src) +{ + vint32m4_t a = vreinterpret_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_i64m8_i32m8(vint64m8_t src) +{ + vint32m8_t a = vreinterpret_i32m8(src); + return a; +} + +vuint16mf4_t test_vreinterpret_v_u8mf4_u16mf4(vuint8mf4_t src) +{ + vuint16mf4_t a = vreinterpret_u16mf4(src); + return a; +} + +vuint16mf2_t test_vreinterpret_v_u8mf2_u16mf2(vuint8mf2_t src) +{ + vuint16mf2_t a = vreinterpret_u16mf2(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_u8m1_u16m1(vuint8m1_t src) +{ + vuint16m1_t a = vreinterpret_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_u8m2_u16m2(vuint8m2_t src) +{ + vuint16m2_t a = vreinterpret_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_u8m4_u16m4(vuint8m4_t src) +{ + vuint16m4_t a = vreinterpret_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_u8m8_u16m8(vuint8m8_t src) +{ + vuint16m8_t a = vreinterpret_u16m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_u8mf2_u32mf2(vuint8mf2_t src) +{ + vuint32mf2_t a = vreinterpret_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_u8m1_u32m1(vuint8m1_t src) +{ + vuint32m1_t a = vreinterpret_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_u8m2_u32m2(vuint8m2_t src) +{ + vuint32m2_t a = vreinterpret_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_u8m4_u32m4(vuint8m4_t src) +{ + vuint32m4_t a = vreinterpret_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_u8m8_u32m8(vuint8m8_t src) +{ + vuint32m8_t a = vreinterpret_u32m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_u8m1_u64m1(vuint8m1_t src) +{ + vuint64m1_t a = vreinterpret_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_u8m2_u64m2(vuint8m2_t src) +{ + vuint64m2_t a = vreinterpret_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_u8m4_u64m4(vuint8m4_t src) +{ + vuint64m4_t a = vreinterpret_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_u8m8_u64m8(vuint8m8_t src) +{ + vuint64m8_t a = vreinterpret_u64m8(src); + return a; +} + +vuint8mf4_t test_vreinterpret_v_u16mf4_u8mf4(vuint16mf4_t src) +{ + vuint8mf4_t a = vreinterpret_u8mf4(src); + return a; +} + +vuint8mf2_t test_vreinterpret_v_u16mf2_u8mf2(vuint16mf2_t src) +{ + vuint8mf2_t a = vreinterpret_u8mf2(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_u16m1_u8m1(vuint16m1_t src) +{ + vuint8m1_t a = vreinterpret_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_u16m2_u8m2(vuint16m2_t src) +{ + vuint8m2_t a = vreinterpret_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_u16m4_u8m4(vuint16m4_t src) +{ + vuint8m4_t a = vreinterpret_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_u16m8_u8m8(vuint16m8_t src) +{ + vuint8m8_t a = vreinterpret_u8m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_u16mf2_u32mf2(vuint16mf2_t src) +{ + vuint32mf2_t a = vreinterpret_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_u16m1_u32m1(vuint16m1_t src) +{ + vuint32m1_t a = vreinterpret_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_u16m2_u32m2(vuint16m2_t src) +{ + vuint32m2_t a = vreinterpret_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_u16m4_u32m4(vuint16m4_t src) +{ + vuint32m4_t a = vreinterpret_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_u16m8_u32m8(vuint16m8_t src) +{ + vuint32m8_t a = vreinterpret_u32m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_u16m1_u64m1(vuint16m1_t src) +{ + vuint64m1_t a = 
vreinterpret_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_u16m2_u64m2(vuint16m2_t src) +{ + vuint64m2_t a = vreinterpret_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_u16m4_u64m4(vuint16m4_t src) +{ + vuint64m4_t a = vreinterpret_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_u16m8_u64m8(vuint16m8_t src) +{ + vuint64m8_t a = vreinterpret_u64m8(src); + return a; +} + +vuint8mf2_t test_vreinterpret_v_u32mf2_u8mf2(vuint32mf2_t src) +{ + vuint8mf2_t a = vreinterpret_u8mf2(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_u32m1_u8m1(vuint32m1_t src) +{ + vuint8m1_t a = vreinterpret_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_u32m2_u8m2(vuint32m2_t src) +{ + vuint8m2_t a = vreinterpret_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_u32m4_u8m4(vuint32m4_t src) +{ + vuint8m4_t a = vreinterpret_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_u32m8_u8m8(vuint32m8_t src) +{ + vuint8m8_t a = vreinterpret_u8m8(src); + return a; +} + +vuint16mf2_t test_vreinterpret_v_u32mf2_u16mf2(vuint32mf2_t src) +{ + vuint16mf2_t a = vreinterpret_u16mf2(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_u32m1_u16m1(vuint32m1_t src) +{ + vuint16m1_t a = vreinterpret_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_u32m2_u16m2(vuint32m2_t src) +{ + vuint16m2_t a = vreinterpret_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_u32m4_u16m4(vuint32m4_t src) +{ + vuint16m4_t a = vreinterpret_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_u32m8_u16m8(vuint32m8_t src) +{ + vuint16m8_t a = vreinterpret_u16m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_u32m1_u64m1(vuint32m1_t src) +{ + vuint64m1_t a = vreinterpret_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_u32m2_u64m2(vuint32m2_t src) +{ + vuint64m2_t a = vreinterpret_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_u32m4_u64m4(vuint32m4_t src) +{ + vuint64m4_t a = vreinterpret_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_u32m8_u64m8(vuint32m8_t src) +{ + vuint64m8_t a = vreinterpret_u64m8(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_u64m1_u8m1(vuint64m1_t src) +{ + vuint8m1_t a = vreinterpret_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_u64m2_u8m2(vuint64m2_t src) +{ + vuint8m2_t a = vreinterpret_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_u64m4_u8m4(vuint64m4_t src) +{ + vuint8m4_t a = vreinterpret_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_u64m8_u8m8(vuint64m8_t src) +{ + vuint8m8_t a = vreinterpret_u8m8(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_u64m1_u16m1(vuint64m1_t src) +{ + vuint16m1_t a = vreinterpret_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_u64m2_u16m2(vuint64m2_t src) +{ + vuint16m2_t a = vreinterpret_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_u64m4_u16m4(vuint64m4_t src) +{ + vuint16m4_t a = vreinterpret_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_u64m8_u16m8(vuint64m8_t src) +{ + vuint16m8_t a = vreinterpret_u16m8(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_u64m1_u32m1(vuint64m1_t src) +{ + vuint32m1_t a = vreinterpret_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_u64m2_u32m2(vuint64m2_t src) +{ + vuint32m2_t a = vreinterpret_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_u64m4_u32m4(vuint64m4_t src) +{ + vuint32m4_t a = vreinterpret_u32m4(src); + return a; +} + +vuint32m8_t 
test_vreinterpret_v_u64m8_u32m8(vuint64m8_t src) +{ + vuint32m8_t a = vreinterpret_u32m8(src); + return a; +} + +vint8mf4_t test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) +{ + vint8mf4_t a = vlmul_ext_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_ext_v_i8mf8_i8mf2(vint8mf8_t op1) +{ + vint8mf2_t a = vlmul_ext_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_ext_v_i8mf8_i8m1(vint8mf8_t op1) +{ + vint8m1_t a = vlmul_ext_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8mf8_i8m2(vint8mf8_t op1) +{ + vint8m2_t a = vlmul_ext_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8mf8_i8m4(vint8mf8_t op1) +{ + vint8m4_t a = vlmul_ext_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8mf8_i8m8(vint8mf8_t op1) +{ + vint8m8_t a = vlmul_ext_i8m8(op1); + return a; +} + +vint8mf2_t test_vlmul_ext_v_i8mf4_i8mf2(vint8mf4_t op1) +{ + vint8mf2_t a = vlmul_ext_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_ext_v_i8mf4_i8m1(vint8mf4_t op1) +{ + vint8m1_t a = vlmul_ext_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8mf4_i8m2(vint8mf4_t op1) +{ + vint8m2_t a = vlmul_ext_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8mf4_i8m4(vint8mf4_t op1) +{ + vint8m4_t a = vlmul_ext_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8mf4_i8m8(vint8mf4_t op1) +{ + vint8m8_t a = vlmul_ext_i8m8(op1); + return a; +} + +vint8m1_t test_vlmul_ext_v_i8mf2_i8m1(vint8mf2_t op1) +{ + vint8m1_t a = vlmul_ext_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8mf2_i8m2(vint8mf2_t op1) +{ + vint8m2_t a = vlmul_ext_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8mf2_i8m4(vint8mf2_t op1) +{ + vint8m4_t a = vlmul_ext_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8mf2_i8m8(vint8mf2_t op1) +{ + vint8m8_t a = vlmul_ext_i8m8(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8m1_i8m2(vint8m1_t op1) +{ + vint8m2_t a = vlmul_ext_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8m1_i8m4(vint8m1_t op1) +{ + vint8m4_t a = vlmul_ext_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8m1_i8m8(vint8m1_t op1) +{ + vint8m8_t a = vlmul_ext_i8m8(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8m2_i8m4(vint8m2_t op1) +{ + vint8m4_t a = vlmul_ext_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8m2_i8m8(vint8m2_t op1) +{ + vint8m8_t a = vlmul_ext_i8m8(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8m4_i8m8(vint8m4_t op1) +{ + vint8m8_t a = vlmul_ext_i8m8(op1); + return a; +} + +vint16mf2_t test_vlmul_ext_v_i16mf4_i16mf2(vint16mf4_t op1) +{ + vint16mf2_t a = vlmul_ext_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_ext_v_i16mf4_i16m1(vint16mf4_t op1) +{ + vint16m1_t a = vlmul_ext_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_ext_v_i16mf4_i16m2(vint16mf4_t op1) +{ + vint16m2_t a = vlmul_ext_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16mf4_i16m4(vint16mf4_t op1) +{ + vint16m4_t a = vlmul_ext_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16mf4_i16m8(vint16mf4_t op1) +{ + vint16m8_t a = vlmul_ext_i16m8(op1); + return a; +} + +vint16m1_t test_vlmul_ext_v_i16mf2_i16m1(vint16mf2_t op1) +{ + vint16m1_t a = vlmul_ext_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_ext_v_i16mf2_i16m2(vint16mf2_t op1) +{ + vint16m2_t a = vlmul_ext_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16mf2_i16m4(vint16mf2_t op1) +{ + vint16m4_t a = vlmul_ext_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16mf2_i16m8(vint16mf2_t op1) +{ + vint16m8_t a = vlmul_ext_i16m8(op1); + return a; +} + +vint16m2_t 
test_vlmul_ext_v_i16m1_i16m2(vint16m1_t op1) +{ + vint16m2_t a = vlmul_ext_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16m1_i16m4(vint16m1_t op1) +{ + vint16m4_t a = vlmul_ext_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16m1_i16m8(vint16m1_t op1) +{ + vint16m8_t a = vlmul_ext_i16m8(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16m2_i16m4(vint16m2_t op1) +{ + vint16m4_t a = vlmul_ext_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16m2_i16m8(vint16m2_t op1) +{ + vint16m8_t a = vlmul_ext_i16m8(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16m4_i16m8(vint16m4_t op1) +{ + vint16m8_t a = vlmul_ext_i16m8(op1); + return a; +} + +vint32m1_t test_vlmul_ext_v_i32mf2_i32m1(vint32mf2_t op1) +{ + vint32m1_t a = vlmul_ext_i32m1(op1); + return a; +} + +vint32m2_t test_vlmul_ext_v_i32mf2_i32m2(vint32mf2_t op1) +{ + vint32m2_t a = vlmul_ext_i32m2(op1); + return a; +} + +vint32m4_t test_vlmul_ext_v_i32mf2_i32m4(vint32mf2_t op1) +{ + vint32m4_t a = vlmul_ext_i32m4(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32mf2_i32m8(vint32mf2_t op1) +{ + vint32m8_t a = vlmul_ext_i32m8(op1); + return a; +} + +vint32m2_t test_vlmul_ext_v_i32m1_i32m2(vint32m1_t op1) +{ + vint32m2_t a = vlmul_ext_i32m2(op1); + return a; +} + +vint32m4_t test_vlmul_ext_v_i32m1_i32m4(vint32m1_t op1) +{ + vint32m4_t a = vlmul_ext_i32m4(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32m1_i32m8(vint32m1_t op1) +{ + vint32m8_t a = vlmul_ext_i32m8(op1); + return a; +} + +vint32m4_t test_vlmul_ext_v_i32m2_i32m4(vint32m2_t op1) +{ + vint32m4_t a = vlmul_ext_i32m4(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32m2_i32m8(vint32m2_t op1) +{ + vint32m8_t a = vlmul_ext_i32m8(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32m4_i32m8(vint32m4_t op1) +{ + vint32m8_t a = vlmul_ext_i32m8(op1); + return a; +} + +vint64m2_t test_vlmul_ext_v_i64m1_i64m2(vint64m1_t op1) +{ + vint64m2_t a = vlmul_ext_i64m2(op1); + return a; +} + +vint64m4_t test_vlmul_ext_v_i64m1_i64m4(vint64m1_t op1) +{ + vint64m4_t a = vlmul_ext_i64m4(op1); + return a; +} + +vint64m8_t test_vlmul_ext_v_i64m1_i64m8(vint64m1_t op1) +{ + vint64m8_t a = vlmul_ext_i64m8(op1); + return a; +} + +vint64m4_t test_vlmul_ext_v_i64m2_i64m4(vint64m2_t op1) +{ + vint64m4_t a = vlmul_ext_i64m4(op1); + return a; +} + +vint64m8_t test_vlmul_ext_v_i64m2_i64m8(vint64m2_t op1) +{ + vint64m8_t a = vlmul_ext_i64m8(op1); + return a; +} + +vint64m8_t test_vlmul_ext_v_i64m4_i64m8(vint64m4_t op1) +{ + vint64m8_t a = vlmul_ext_i64m8(op1); + return a; +} + +vuint8mf4_t test_vlmul_ext_v_u8mf8_u8mf4(vuint8mf8_t op1) +{ + vuint8mf4_t a = vlmul_ext_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_ext_v_u8mf8_u8mf2(vuint8mf8_t op1) +{ + vuint8mf2_t a = vlmul_ext_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_ext_v_u8mf8_u8m1(vuint8mf8_t op1) +{ + vuint8m1_t a = vlmul_ext_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8mf8_u8m2(vuint8mf8_t op1) +{ + vuint8m2_t a = vlmul_ext_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8mf8_u8m4(vuint8mf8_t op1) +{ + vuint8m4_t a = vlmul_ext_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8mf8_u8m8(vuint8mf8_t op1) +{ + vuint8m8_t a = vlmul_ext_u8m8(op1); + return a; +} + +vuint8mf2_t test_vlmul_ext_v_u8mf4_u8mf2(vuint8mf4_t op1) +{ + vuint8mf2_t a = vlmul_ext_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_ext_v_u8mf4_u8m1(vuint8mf4_t op1) +{ + vuint8m1_t a = vlmul_ext_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8mf4_u8m2(vuint8mf4_t op1) +{ + vuint8m2_t a 
= vlmul_ext_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8mf4_u8m4(vuint8mf4_t op1) +{ + vuint8m4_t a = vlmul_ext_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8mf4_u8m8(vuint8mf4_t op1) +{ + vuint8m8_t a = vlmul_ext_u8m8(op1); + return a; +} + +vuint8m1_t test_vlmul_ext_v_u8mf2_u8m1(vuint8mf2_t op1) +{ + vuint8m1_t a = vlmul_ext_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8mf2_u8m2(vuint8mf2_t op1) +{ + vuint8m2_t a = vlmul_ext_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8mf2_u8m4(vuint8mf2_t op1) +{ + vuint8m4_t a = vlmul_ext_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8mf2_u8m8(vuint8mf2_t op1) +{ + vuint8m8_t a = vlmul_ext_u8m8(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8m1_u8m2(vuint8m1_t op1) +{ + vuint8m2_t a = vlmul_ext_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8m1_u8m4(vuint8m1_t op1) +{ + vuint8m4_t a = vlmul_ext_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8m1_u8m8(vuint8m1_t op1) +{ + vuint8m8_t a = vlmul_ext_u8m8(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8m2_u8m4(vuint8m2_t op1) +{ + vuint8m4_t a = vlmul_ext_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8m2_u8m8(vuint8m2_t op1) +{ + vuint8m8_t a = vlmul_ext_u8m8(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8m4_u8m8(vuint8m4_t op1) +{ + vuint8m8_t a = vlmul_ext_u8m8(op1); + return a; +} + +vuint16mf2_t test_vlmul_ext_v_u16mf4_u16mf2(vuint16mf4_t op1) +{ + vuint16mf2_t a = vlmul_ext_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_ext_v_u16mf4_u16m1(vuint16mf4_t op1) +{ + vuint16m1_t a = vlmul_ext_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_ext_v_u16mf4_u16m2(vuint16mf4_t op1) +{ + vuint16m2_t a = vlmul_ext_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16mf4_u16m4(vuint16mf4_t op1) +{ + vuint16m4_t a = vlmul_ext_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16mf4_u16m8(vuint16mf4_t op1) +{ + vuint16m8_t a = vlmul_ext_u16m8(op1); + return a; +} + +vuint16m1_t test_vlmul_ext_v_u16mf2_u16m1(vuint16mf2_t op1) +{ + vuint16m1_t a = vlmul_ext_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_ext_v_u16mf2_u16m2(vuint16mf2_t op1) +{ + vuint16m2_t a = vlmul_ext_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16mf2_u16m4(vuint16mf2_t op1) +{ + vuint16m4_t a = vlmul_ext_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16mf2_u16m8(vuint16mf2_t op1) +{ + vuint16m8_t a = vlmul_ext_u16m8(op1); + return a; +} + +vuint16m2_t test_vlmul_ext_v_u16m1_u16m2(vuint16m1_t op1) +{ + vuint16m2_t a = vlmul_ext_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16m1_u16m4(vuint16m1_t op1) +{ + vuint16m4_t a = vlmul_ext_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16m1_u16m8(vuint16m1_t op1) +{ + vuint16m8_t a = vlmul_ext_u16m8(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16m2_u16m4(vuint16m2_t op1) +{ + vuint16m4_t a = vlmul_ext_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16m2_u16m8(vuint16m2_t op1) +{ + vuint16m8_t a = vlmul_ext_u16m8(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16m4_u16m8(vuint16m4_t op1) +{ + vuint16m8_t a = vlmul_ext_u16m8(op1); + return a; +} + +vuint32m1_t test_vlmul_ext_v_u32mf2_u32m1(vuint32mf2_t op1) +{ + vuint32m1_t a = vlmul_ext_u32m1(op1); + return a; +} + +vuint32m2_t test_vlmul_ext_v_u32mf2_u32m2(vuint32mf2_t op1) +{ + vuint32m2_t a = vlmul_ext_u32m2(op1); + return a; +} + +vuint32m4_t test_vlmul_ext_v_u32mf2_u32m4(vuint32mf2_t op1) +{ + vuint32m4_t a = 
vlmul_ext_u32m4(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32mf2_u32m8(vuint32mf2_t op1) +{ + vuint32m8_t a = vlmul_ext_u32m8(op1); + return a; +} + +vuint32m2_t test_vlmul_ext_v_u32m1_u32m2(vuint32m1_t op1) +{ + vuint32m2_t a = vlmul_ext_u32m2(op1); + return a; +} + +vuint32m4_t test_vlmul_ext_v_u32m1_u32m4(vuint32m1_t op1) +{ + vuint32m4_t a = vlmul_ext_u32m4(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32m1_u32m8(vuint32m1_t op1) +{ + vuint32m8_t a = vlmul_ext_u32m8(op1); + return a; +} + +vuint32m4_t test_vlmul_ext_v_u32m2_u32m4(vuint32m2_t op1) +{ + vuint32m4_t a = vlmul_ext_u32m4(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32m2_u32m8(vuint32m2_t op1) +{ + vuint32m8_t a = vlmul_ext_u32m8(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32m4_u32m8(vuint32m4_t op1) +{ + vuint32m8_t a = vlmul_ext_u32m8(op1); + return a; +} + +vuint64m2_t test_vlmul_ext_v_u64m1_u64m2(vuint64m1_t op1) +{ + vuint64m2_t a = vlmul_ext_u64m2(op1); + return a; +} + +vuint64m4_t test_vlmul_ext_v_u64m1_u64m4(vuint64m1_t op1) +{ + vuint64m4_t a = vlmul_ext_u64m4(op1); + return a; +} + +vuint64m8_t test_vlmul_ext_v_u64m1_u64m8(vuint64m1_t op1) +{ + vuint64m8_t a = vlmul_ext_u64m8(op1); + return a; +} + +vuint64m4_t test_vlmul_ext_v_u64m2_u64m4(vuint64m2_t op1) +{ + vuint64m4_t a = vlmul_ext_u64m4(op1); + return a; +} + +vuint64m8_t test_vlmul_ext_v_u64m2_u64m8(vuint64m2_t op1) +{ + vuint64m8_t a = vlmul_ext_u64m8(op1); + return a; +} + +vuint64m8_t test_vlmul_ext_v_u64m4_u64m8(vuint64m4_t op1) +{ + vuint64m8_t a = vlmul_ext_u64m8(op1); + return a; +} + +vfloat32m1_t test_vlmul_ext_v_f32mf2_f32m1(vfloat32mf2_t op1) +{ + vfloat32m1_t a = vlmul_ext_f32m1(op1); + return a; +} + +vfloat32m2_t test_vlmul_ext_v_f32mf2_f32m2(vfloat32mf2_t op1) +{ + vfloat32m2_t a = vlmul_ext_f32m2(op1); + return a; +} + +vfloat32m4_t test_vlmul_ext_v_f32mf2_f32m4(vfloat32mf2_t op1) +{ + vfloat32m4_t a = vlmul_ext_f32m4(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32mf2_f32m8(vfloat32mf2_t op1) +{ + vfloat32m8_t a = vlmul_ext_f32m8(op1); + return a; +} + +vfloat32m2_t test_vlmul_ext_v_f32m1_f32m2(vfloat32m1_t op1) +{ + vfloat32m2_t a = vlmul_ext_f32m2(op1); + return a; +} + +vfloat32m4_t test_vlmul_ext_v_f32m1_f32m4(vfloat32m1_t op1) +{ + vfloat32m4_t a = vlmul_ext_f32m4(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32m1_f32m8(vfloat32m1_t op1) +{ + vfloat32m8_t a = vlmul_ext_f32m8(op1); + return a; +} + +vfloat32m4_t test_vlmul_ext_v_f32m2_f32m4(vfloat32m2_t op1) +{ + vfloat32m4_t a = vlmul_ext_f32m4(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32m2_f32m8(vfloat32m2_t op1) +{ + vfloat32m8_t a = vlmul_ext_f32m8(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32m4_f32m8(vfloat32m4_t op1) +{ + vfloat32m8_t a = vlmul_ext_f32m8(op1); + return a; +} + +vfloat64m2_t test_vlmul_ext_v_f64m1_f64m2(vfloat64m1_t op1) +{ + vfloat64m2_t a = vlmul_ext_f64m2(op1); + return a; +} + +vfloat64m4_t test_vlmul_ext_v_f64m1_f64m4(vfloat64m1_t op1) +{ + vfloat64m4_t a = vlmul_ext_f64m4(op1); + return a; +} + +vfloat64m8_t test_vlmul_ext_v_f64m1_f64m8(vfloat64m1_t op1) +{ + vfloat64m8_t a = vlmul_ext_f64m8(op1); + return a; +} + +vfloat64m4_t test_vlmul_ext_v_f64m2_f64m4(vfloat64m2_t op1) +{ + vfloat64m4_t a = vlmul_ext_f64m4(op1); + return a; +} + +vfloat64m8_t test_vlmul_ext_v_f64m2_f64m8(vfloat64m2_t op1) +{ + vfloat64m8_t a = vlmul_ext_f64m8(op1); + return a; +} + +vfloat64m8_t test_vlmul_ext_v_f64m4_f64m8(vfloat64m4_t op1) +{ + vfloat64m8_t a = vlmul_ext_f64m8(op1); + return a; +} + 
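+/* The tests below exercise vlmul_trunc_*, the counterpart of the
+   vlmul_ext_* tests above.  As implemented by the @vlmul_trunc expander
+   in vector.md in this patch, the truncation is a move from the offset-0
+   subreg of the source register group, so narrowing LMUL keeps only the
+   lowest-numbered part of the source operand.  */
+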
+vint8mf8_t test_vlmul_trunc_v_i8mf4_i8mf8(vint8mf4_t op1) +{ + vint8mf8_t a = vlmul_trunc_i8mf8(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8mf2_i8mf8(vint8mf2_t op1) +{ + vint8mf8_t a = vlmul_trunc_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8mf2_i8mf4(vint8mf2_t op1) +{ + vint8mf4_t a = vlmul_trunc_i8mf4(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m1_i8mf8(vint8m1_t op1) +{ + vint8mf8_t a = vlmul_trunc_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m1_i8mf4(vint8m1_t op1) +{ + vint8mf4_t a = vlmul_trunc_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m1_i8mf2(vint8m1_t op1) +{ + vint8mf2_t a = vlmul_trunc_i8mf2(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m2_i8mf8(vint8m2_t op1) +{ + vint8mf8_t a = vlmul_trunc_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m2_i8mf4(vint8m2_t op1) +{ + vint8mf4_t a = vlmul_trunc_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m2_i8mf2(vint8m2_t op1) +{ + vint8mf2_t a = vlmul_trunc_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_trunc_v_i8m2_i8m1(vint8m2_t op1) +{ + vint8m1_t a = vlmul_trunc_i8m1(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m4_i8mf8(vint8m4_t op1) +{ + vint8mf8_t a = vlmul_trunc_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m4_i8mf4(vint8m4_t op1) +{ + vint8mf4_t a = vlmul_trunc_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m4_i8mf2(vint8m4_t op1) +{ + vint8mf2_t a = vlmul_trunc_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_trunc_v_i8m4_i8m1(vint8m4_t op1) +{ + vint8m1_t a = vlmul_trunc_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_trunc_v_i8m4_i8m2(vint8m4_t op1) +{ + vint8m2_t a = vlmul_trunc_i8m2(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m8_i8mf8(vint8m8_t op1) +{ + vint8mf8_t a = vlmul_trunc_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m8_i8mf4(vint8m8_t op1) +{ + vint8mf4_t a = vlmul_trunc_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m8_i8mf2(vint8m8_t op1) +{ + vint8mf2_t a = vlmul_trunc_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_trunc_v_i8m8_i8m1(vint8m8_t op1) +{ + vint8m1_t a = vlmul_trunc_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_trunc_v_i8m8_i8m2(vint8m8_t op1) +{ + vint8m2_t a = vlmul_trunc_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_trunc_v_i8m8_i8m4(vint8m8_t op1) +{ + vint8m4_t a = vlmul_trunc_i8m4(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16mf2_i16mf4(vint16mf2_t op1) +{ + vint16mf4_t a = vlmul_trunc_i16mf4(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m1_i16mf4(vint16m1_t op1) +{ + vint16mf4_t a = vlmul_trunc_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m1_i16mf2(vint16m1_t op1) +{ + vint16mf2_t a = vlmul_trunc_i16mf2(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m2_i16mf4(vint16m2_t op1) +{ + vint16mf4_t a = vlmul_trunc_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m2_i16mf2(vint16m2_t op1) +{ + vint16mf2_t a = vlmul_trunc_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_trunc_v_i16m2_i16m1(vint16m2_t op1) +{ + vint16m1_t a = vlmul_trunc_i16m1(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m4_i16mf4(vint16m4_t op1) +{ + vint16mf4_t a = vlmul_trunc_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m4_i16mf2(vint16m4_t op1) +{ + vint16mf2_t a = vlmul_trunc_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_trunc_v_i16m4_i16m1(vint16m4_t op1) +{ + vint16m1_t a = vlmul_trunc_i16m1(op1); + 
return a; +} + +vint16m2_t test_vlmul_trunc_v_i16m4_i16m2(vint16m4_t op1) +{ + vint16m2_t a = vlmul_trunc_i16m2(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m8_i16mf4(vint16m8_t op1) +{ + vint16mf4_t a = vlmul_trunc_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m8_i16mf2(vint16m8_t op1) +{ + vint16mf2_t a = vlmul_trunc_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_trunc_v_i16m8_i16m1(vint16m8_t op1) +{ + vint16m1_t a = vlmul_trunc_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_trunc_v_i16m8_i16m2(vint16m8_t op1) +{ + vint16m2_t a = vlmul_trunc_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_trunc_v_i16m8_i16m4(vint16m8_t op1) +{ + vint16m4_t a = vlmul_trunc_i16m4(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m1_i32mf2(vint32m1_t op1) +{ + vint32mf2_t a = vlmul_trunc_i32mf2(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m2_i32mf2(vint32m2_t op1) +{ + vint32mf2_t a = vlmul_trunc_i32mf2(op1); + return a; +} + +vint32m1_t test_vlmul_trunc_v_i32m2_i32m1(vint32m2_t op1) +{ + vint32m1_t a = vlmul_trunc_i32m1(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m4_i32mf2(vint32m4_t op1) +{ + vint32mf2_t a = vlmul_trunc_i32mf2(op1); + return a; +} + +vint32m1_t test_vlmul_trunc_v_i32m4_i32m1(vint32m4_t op1) +{ + vint32m1_t a = vlmul_trunc_i32m1(op1); + return a; +} + +vint32m2_t test_vlmul_trunc_v_i32m4_i32m2(vint32m4_t op1) +{ + vint32m2_t a = vlmul_trunc_i32m2(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m8_i32mf2(vint32m8_t op1) +{ + vint32mf2_t a = vlmul_trunc_i32mf2(op1); + return a; +} + +vint32m1_t test_vlmul_trunc_v_i32m8_i32m1(vint32m8_t op1) +{ + vint32m1_t a = vlmul_trunc_i32m1(op1); + return a; +} + +vint32m2_t test_vlmul_trunc_v_i32m8_i32m2(vint32m8_t op1) +{ + vint32m2_t a = vlmul_trunc_i32m2(op1); + return a; +} + +vint32m4_t test_vlmul_trunc_v_i32m8_i32m4(vint32m8_t op1) +{ + vint32m4_t a = vlmul_trunc_i32m4(op1); + return a; +} + +vint64m1_t test_vlmul_trunc_v_i64m2_i64m1(vint64m2_t op1) +{ + vint64m1_t a = vlmul_trunc_i64m1(op1); + return a; +} + +vint64m1_t test_vlmul_trunc_v_i64m4_i64m1(vint64m4_t op1) +{ + vint64m1_t a = vlmul_trunc_i64m1(op1); + return a; +} + +vint64m2_t test_vlmul_trunc_v_i64m4_i64m2(vint64m4_t op1) +{ + vint64m2_t a = vlmul_trunc_i64m2(op1); + return a; +} + +vint64m1_t test_vlmul_trunc_v_i64m8_i64m1(vint64m8_t op1) +{ + vint64m1_t a = vlmul_trunc_i64m1(op1); + return a; +} + +vint64m2_t test_vlmul_trunc_v_i64m8_i64m2(vint64m8_t op1) +{ + vint64m2_t a = vlmul_trunc_i64m2(op1); + return a; +} + +vint64m4_t test_vlmul_trunc_v_i64m8_i64m4(vint64m8_t op1) +{ + vint64m4_t a = vlmul_trunc_i64m4(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8mf4_u8mf8(vuint8mf4_t op1) +{ + vuint8mf8_t a = vlmul_trunc_u8mf8(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8mf2_u8mf8(vuint8mf2_t op1) +{ + vuint8mf8_t a = vlmul_trunc_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8mf2_u8mf4(vuint8mf2_t op1) +{ + vuint8mf4_t a = vlmul_trunc_u8mf4(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m1_u8mf8(vuint8m1_t op1) +{ + vuint8mf8_t a = vlmul_trunc_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m1_u8mf4(vuint8m1_t op1) +{ + vuint8mf4_t a = vlmul_trunc_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m1_u8mf2(vuint8m1_t op1) +{ + vuint8mf2_t a = vlmul_trunc_u8mf2(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m2_u8mf8(vuint8m2_t op1) +{ + vuint8mf8_t a = vlmul_trunc_u8mf8(op1); + return a; +} + +vuint8mf4_t 
test_vlmul_trunc_v_u8m2_u8mf4(vuint8m2_t op1) +{ + vuint8mf4_t a = vlmul_trunc_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m2_u8mf2(vuint8m2_t op1) +{ + vuint8mf2_t a = vlmul_trunc_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_trunc_v_u8m2_u8m1(vuint8m2_t op1) +{ + vuint8m1_t a = vlmul_trunc_u8m1(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m4_u8mf8(vuint8m4_t op1) +{ + vuint8mf8_t a = vlmul_trunc_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m4_u8mf4(vuint8m4_t op1) +{ + vuint8mf4_t a = vlmul_trunc_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m4_u8mf2(vuint8m4_t op1) +{ + vuint8mf2_t a = vlmul_trunc_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_trunc_v_u8m4_u8m1(vuint8m4_t op1) +{ + vuint8m1_t a = vlmul_trunc_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_trunc_v_u8m4_u8m2(vuint8m4_t op1) +{ + vuint8m2_t a = vlmul_trunc_u8m2(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m8_u8mf8(vuint8m8_t op1) +{ + vuint8mf8_t a = vlmul_trunc_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m8_u8mf4(vuint8m8_t op1) +{ + vuint8mf4_t a = vlmul_trunc_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m8_u8mf2(vuint8m8_t op1) +{ + vuint8mf2_t a = vlmul_trunc_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_trunc_v_u8m8_u8m1(vuint8m8_t op1) +{ + vuint8m1_t a = vlmul_trunc_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_trunc_v_u8m8_u8m2(vuint8m8_t op1) +{ + vuint8m2_t a = vlmul_trunc_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_trunc_v_u8m8_u8m4(vuint8m8_t op1) +{ + vuint8m4_t a = vlmul_trunc_u8m4(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16mf2_u16mf4(vuint16mf2_t op1) +{ + vuint16mf4_t a = vlmul_trunc_u16mf4(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m1_u16mf4(vuint16m1_t op1) +{ + vuint16mf4_t a = vlmul_trunc_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m1_u16mf2(vuint16m1_t op1) +{ + vuint16mf2_t a = vlmul_trunc_u16mf2(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m2_u16mf4(vuint16m2_t op1) +{ + vuint16mf4_t a = vlmul_trunc_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m2_u16mf2(vuint16m2_t op1) +{ + vuint16mf2_t a = vlmul_trunc_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_trunc_v_u16m2_u16m1(vuint16m2_t op1) +{ + vuint16m1_t a = vlmul_trunc_u16m1(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m4_u16mf4(vuint16m4_t op1) +{ + vuint16mf4_t a = vlmul_trunc_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m4_u16mf2(vuint16m4_t op1) +{ + vuint16mf2_t a = vlmul_trunc_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_trunc_v_u16m4_u16m1(vuint16m4_t op1) +{ + vuint16m1_t a = vlmul_trunc_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_trunc_v_u16m4_u16m2(vuint16m4_t op1) +{ + vuint16m2_t a = vlmul_trunc_u16m2(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m8_u16mf4(vuint16m8_t op1) +{ + vuint16mf4_t a = vlmul_trunc_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m8_u16mf2(vuint16m8_t op1) +{ + vuint16mf2_t a = vlmul_trunc_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_trunc_v_u16m8_u16m1(vuint16m8_t op1) +{ + vuint16m1_t a = vlmul_trunc_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_trunc_v_u16m8_u16m2(vuint16m8_t op1) +{ + vuint16m2_t a = vlmul_trunc_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_trunc_v_u16m8_u16m4(vuint16m8_t op1) +{ + vuint16m4_t a = vlmul_trunc_u16m4(op1); + return a; +} + +vuint32mf2_t 
test_vlmul_trunc_v_u32m1_u32mf2(vuint32m1_t op1) +{ + vuint32mf2_t a = vlmul_trunc_u32mf2(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m2_u32mf2(vuint32m2_t op1) +{ + vuint32mf2_t a = vlmul_trunc_u32mf2(op1); + return a; +} + +vuint32m1_t test_vlmul_trunc_v_u32m2_u32m1(vuint32m2_t op1) +{ + vuint32m1_t a = vlmul_trunc_u32m1(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m4_u32mf2(vuint32m4_t op1) +{ + vuint32mf2_t a = vlmul_trunc_u32mf2(op1); + return a; +} + +vuint32m1_t test_vlmul_trunc_v_u32m4_u32m1(vuint32m4_t op1) +{ + vuint32m1_t a = vlmul_trunc_u32m1(op1); + return a; +} + +vuint32m2_t test_vlmul_trunc_v_u32m4_u32m2(vuint32m4_t op1) +{ + vuint32m2_t a = vlmul_trunc_u32m2(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m8_u32mf2(vuint32m8_t op1) +{ + vuint32mf2_t a = vlmul_trunc_u32mf2(op1); + return a; +} + +vuint32m1_t test_vlmul_trunc_v_u32m8_u32m1(vuint32m8_t op1) +{ + vuint32m1_t a = vlmul_trunc_u32m1(op1); + return a; +} + +vuint32m2_t test_vlmul_trunc_v_u32m8_u32m2(vuint32m8_t op1) +{ + vuint32m2_t a = vlmul_trunc_u32m2(op1); + return a; +} + +vuint32m4_t test_vlmul_trunc_v_u32m8_u32m4(vuint32m8_t op1) +{ + vuint32m4_t a = vlmul_trunc_u32m4(op1); + return a; +} + +vuint64m1_t test_vlmul_trunc_v_u64m2_u64m1(vuint64m2_t op1) +{ + vuint64m1_t a = vlmul_trunc_u64m1(op1); + return a; +} + +vuint64m1_t test_vlmul_trunc_v_u64m4_u64m1(vuint64m4_t op1) +{ + vuint64m1_t a = vlmul_trunc_u64m1(op1); + return a; +} + +vuint64m2_t test_vlmul_trunc_v_u64m4_u64m2(vuint64m4_t op1) +{ + vuint64m2_t a = vlmul_trunc_u64m2(op1); + return a; +} + +vuint64m1_t test_vlmul_trunc_v_u64m8_u64m1(vuint64m8_t op1) +{ + vuint64m1_t a = vlmul_trunc_u64m1(op1); + return a; +} + +vuint64m2_t test_vlmul_trunc_v_u64m8_u64m2(vuint64m8_t op1) +{ + vuint64m2_t a = vlmul_trunc_u64m2(op1); + return a; +} + +vuint64m4_t test_vlmul_trunc_v_u64m8_u64m4(vuint64m8_t op1) +{ + vuint64m4_t a = vlmul_trunc_u64m4(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m1_f32mf2(vfloat32m1_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_f32mf2(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m2_f32mf2(vfloat32m2_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_f32mf2(op1); + return a; +} + +vfloat32m1_t test_vlmul_trunc_v_f32m2_f32m1(vfloat32m2_t op1) +{ + vfloat32m1_t a = vlmul_trunc_f32m1(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m4_f32mf2(vfloat32m4_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_f32mf2(op1); + return a; +} + +vfloat32m1_t test_vlmul_trunc_v_f32m4_f32m1(vfloat32m4_t op1) +{ + vfloat32m1_t a = vlmul_trunc_f32m1(op1); + return a; +} + +vfloat32m2_t test_vlmul_trunc_v_f32m4_f32m2(vfloat32m4_t op1) +{ + vfloat32m2_t a = vlmul_trunc_f32m2(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m8_f32mf2(vfloat32m8_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_f32mf2(op1); + return a; +} + +vfloat32m1_t test_vlmul_trunc_v_f32m8_f32m1(vfloat32m8_t op1) +{ + vfloat32m1_t a = vlmul_trunc_f32m1(op1); + return a; +} + +vfloat32m2_t test_vlmul_trunc_v_f32m8_f32m2(vfloat32m8_t op1) +{ + vfloat32m2_t a = vlmul_trunc_f32m2(op1); + return a; +} + +vfloat32m4_t test_vlmul_trunc_v_f32m8_f32m4(vfloat32m8_t op1) +{ + vfloat32m4_t a = vlmul_trunc_f32m4(op1); + return a; +} + +vfloat64m1_t test_vlmul_trunc_v_f64m2_f64m1(vfloat64m2_t op1) +{ + vfloat64m1_t a = vlmul_trunc_f64m1(op1); + return a; +} + +vfloat64m1_t test_vlmul_trunc_v_f64m4_f64m1(vfloat64m4_t op1) +{ + vfloat64m1_t a = vlmul_trunc_f64m1(op1); + return a; +} + +vfloat64m2_t 
test_vlmul_trunc_v_f64m4_f64m2(vfloat64m4_t op1) +{ + vfloat64m2_t a = vlmul_trunc_f64m2(op1); + return a; +} + +vfloat64m1_t test_vlmul_trunc_v_f64m8_f64m1(vfloat64m8_t op1) +{ + vfloat64m1_t a = vlmul_trunc_f64m1(op1); + return a; +} + +vfloat64m2_t test_vlmul_trunc_v_f64m8_f64m2(vfloat64m8_t op1) +{ + vfloat64m2_t a = vlmul_trunc_f64m2(op1); + return a; +} + +vfloat64m4_t test_vlmul_trunc_v_f64m8_f64m4(vfloat64m8_t op1) +{ + vfloat64m4_t a = vlmul_trunc_f64m4(op1); + return a; +} \ No newline at end of file diff --git a/gcc/testsuite/g++.target/riscv/rvv/rvv-intrinsic.exp b/gcc/testsuite/g++.target/riscv/rvv/rvv-intrinsic.exp new file mode 100644 index 00000000000..c8db25f0fbd --- /dev/null +++ b/gcc/testsuite/g++.target/riscv/rvv/rvv-intrinsic.exp @@ -0,0 +1,39 @@ +# Copyright (C) 2022-2022 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# <http://www.gnu.org/licenses/>. + +# GCC testsuite that uses the `dg.exp' driver. + +# Test the front-end for C++. +# We don't need to test back-end code-gen on RV32 systems for C++ +# because it is already tested in C. +# Exit immediately if this isn't a RISC-V target. +if ![istarget riscv64*-*-*] then { + return +} + +# Load support procs. +load_lib g++-dg.exp + +# Initialize `dg'. +dg-init + +# Main loop. +set CFLAGS "-march=rv64gcv_zfh -O3" +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C]] \ "" $CFLAGS + +# All done.
+dg-finish \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/misc_func.c b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/misc_func.c new file mode 100644 index 00000000000..387da4205f9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/misc_func.c @@ -0,0 +1,2921 @@ + +/* { dg-do compile } */ +/* { dg-skip-if "test vector intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */ + +#include <stdint.h> +#include <riscv_vector.h> + + +/* Reinterpret between different types under the same SEW and LMUL */ + +vuint8mf8_t test_vreinterpret_v_i8mf8_u8mf8(vint8mf8_t src) +{ + vuint8mf8_t a = vreinterpret_v_i8mf8_u8mf8(src); + return a; +} + +vuint8mf4_t test_vreinterpret_v_i8mf4_u8mf4(vint8mf4_t src) +{ + vuint8mf4_t a = vreinterpret_v_i8mf4_u8mf4(src); + return a; +} + +vuint8mf2_t test_vreinterpret_v_i8mf2_u8mf2(vint8mf2_t src) +{ + vuint8mf2_t a = vreinterpret_v_i8mf2_u8mf2(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_i8m1_u8m1(vint8m1_t src) +{ + vuint8m1_t a = vreinterpret_v_i8m1_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_i8m2_u8m2(vint8m2_t src) +{ + vuint8m2_t a = vreinterpret_v_i8m2_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_i8m4_u8m4(vint8m4_t src) +{ + vuint8m4_t a = vreinterpret_v_i8m4_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_i8m8_u8m8(vint8m8_t src) +{ + vuint8m8_t a = vreinterpret_v_i8m8_u8m8(src); + return a; +} + +vint8mf8_t test_vreinterpret_v_u8mf8_i8mf8(vuint8mf8_t src) +{ + vint8mf8_t a = vreinterpret_v_u8mf8_i8mf8(src); + return a; +} + +vint8mf4_t test_vreinterpret_v_u8mf4_i8mf4(vuint8mf4_t src) +{ + vint8mf4_t a = vreinterpret_v_u8mf4_i8mf4(src); + return a; +} + +vint8mf2_t test_vreinterpret_v_u8mf2_i8mf2(vuint8mf2_t src) +{ + vint8mf2_t a = vreinterpret_v_u8mf2_i8mf2(src); + return a; +} + +vint8m1_t test_vreinterpret_v_u8m1_i8m1(vuint8m1_t src) +{ + vint8m1_t a = vreinterpret_v_u8m1_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_u8m2_i8m2(vuint8m2_t src) +{ + vint8m2_t a = vreinterpret_v_u8m2_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_u8m4_i8m4(vuint8m4_t src) +{ + vint8m4_t a = vreinterpret_v_u8m4_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_u8m8_i8m8(vuint8m8_t src) +{ + vint8m8_t a = vreinterpret_v_u8m8_i8m8(src); + return a; +} + +vuint16mf4_t test_vreinterpret_v_i16mf4_u16mf4(vint16mf4_t src) +{ + vuint16mf4_t a = vreinterpret_v_i16mf4_u16mf4(src); + return a; +} + +vuint16mf2_t test_vreinterpret_v_i16mf2_u16mf2(vint16mf2_t src) +{ + vuint16mf2_t a = vreinterpret_v_i16mf2_u16mf2(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_i16m1_u16m1(vint16m1_t src) +{ + vuint16m1_t a = vreinterpret_v_i16m1_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_i16m2_u16m2(vint16m2_t src) +{ + vuint16m2_t a = vreinterpret_v_i16m2_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_i16m4_u16m4(vint16m4_t src) +{ + vuint16m4_t a = vreinterpret_v_i16m4_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_i16m8_u16m8(vint16m8_t src) +{ + vuint16m8_t a = vreinterpret_v_i16m8_u16m8(src); + return a; +} + +vint16mf4_t test_vreinterpret_v_u16mf4_i16mf4(vuint16mf4_t src) +{ + vint16mf4_t a = vreinterpret_v_u16mf4_i16mf4(src); + return a; +} + +vint16mf2_t test_vreinterpret_v_u16mf2_i16mf2(vuint16mf2_t src) +{ + vint16mf2_t a = vreinterpret_v_u16mf2_i16mf2(src); + return a; +} + +vint16m1_t test_vreinterpret_v_u16m1_i16m1(vuint16m1_t src) +{ + vint16m1_t a = vreinterpret_v_u16m1_i16m1(src); + return a; +} + +vint16m2_t
test_vreinterpret_v_u16m2_i16m2(vuint16m2_t src) +{ + vint16m2_t a = vreinterpret_v_u16m2_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_u16m4_i16m4(vuint16m4_t src) +{ + vint16m4_t a = vreinterpret_v_u16m4_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_u16m8_i16m8(vuint16m8_t src) +{ + vint16m8_t a = vreinterpret_v_u16m8_i16m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_i32mf2_u32mf2(vint32mf2_t src) +{ + vuint32mf2_t a = vreinterpret_v_i32mf2_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_i32m1_u32m1(vint32m1_t src) +{ + vuint32m1_t a = vreinterpret_v_i32m1_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_i32m2_u32m2(vint32m2_t src) +{ + vuint32m2_t a = vreinterpret_v_i32m2_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_i32m4_u32m4(vint32m4_t src) +{ + vuint32m4_t a = vreinterpret_v_i32m4_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_i32m8_u32m8(vint32m8_t src) +{ + vuint32m8_t a = vreinterpret_v_i32m8_u32m8(src); + return a; +} + +vfloat32mf2_t test_vreinterpret_v_i32mf2_f32mf2(vint32mf2_t src) +{ + vfloat32mf2_t a = vreinterpret_v_i32mf2_f32mf2(src); + return a; +} + +vfloat32m1_t test_vreinterpret_v_i32m1_f32m1(vint32m1_t src) +{ + vfloat32m1_t a = vreinterpret_v_i32m1_f32m1(src); + return a; +} + +vfloat32m2_t test_vreinterpret_v_i32m2_f32m2(vint32m2_t src) +{ + vfloat32m2_t a = vreinterpret_v_i32m2_f32m2(src); + return a; +} + +vfloat32m4_t test_vreinterpret_v_i32m4_f32m4(vint32m4_t src) +{ + vfloat32m4_t a = vreinterpret_v_i32m4_f32m4(src); + return a; +} + +vfloat32m8_t test_vreinterpret_v_i32m8_f32m8(vint32m8_t src) +{ + vfloat32m8_t a = vreinterpret_v_i32m8_f32m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_u32mf2_i32mf2(vuint32mf2_t src) +{ + vint32mf2_t a = vreinterpret_v_u32mf2_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_u32m1_i32m1(vuint32m1_t src) +{ + vint32m1_t a = vreinterpret_v_u32m1_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_u32m2_i32m2(vuint32m2_t src) +{ + vint32m2_t a = vreinterpret_v_u32m2_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_u32m4_i32m4(vuint32m4_t src) +{ + vint32m4_t a = vreinterpret_v_u32m4_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_u32m8_i32m8(vuint32m8_t src) +{ + vint32m8_t a = vreinterpret_v_u32m8_i32m8(src); + return a; +} + +vfloat32mf2_t test_vreinterpret_v_u32mf2_f32mf2(vuint32mf2_t src) +{ + vfloat32mf2_t a = vreinterpret_v_u32mf2_f32mf2(src); + return a; +} + +vfloat32m1_t test_vreinterpret_v_u32m1_f32m1(vuint32m1_t src) +{ + vfloat32m1_t a = vreinterpret_v_u32m1_f32m1(src); + return a; +} + +vfloat32m2_t test_vreinterpret_v_u32m2_f32m2(vuint32m2_t src) +{ + vfloat32m2_t a = vreinterpret_v_u32m2_f32m2(src); + return a; +} + +vfloat32m4_t test_vreinterpret_v_u32m4_f32m4(vuint32m4_t src) +{ + vfloat32m4_t a = vreinterpret_v_u32m4_f32m4(src); + return a; +} + +vfloat32m8_t test_vreinterpret_v_u32m8_f32m8(vuint32m8_t src) +{ + vfloat32m8_t a = vreinterpret_v_u32m8_f32m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_f32mf2_i32mf2(vfloat32mf2_t src) +{ + vint32mf2_t a = vreinterpret_v_f32mf2_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_f32m1_i32m1(vfloat32m1_t src) +{ + vint32m1_t a = vreinterpret_v_f32m1_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_f32m2_i32m2(vfloat32m2_t src) +{ + vint32m2_t a = vreinterpret_v_f32m2_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_f32m4_i32m4(vfloat32m4_t src) +{ + vint32m4_t a = 
vreinterpret_v_f32m4_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_f32m8_i32m8(vfloat32m8_t src) +{ + vint32m8_t a = vreinterpret_v_f32m8_i32m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_f32mf2_u32mf2(vfloat32mf2_t src) +{ + vuint32mf2_t a = vreinterpret_v_f32mf2_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_f32m1_u32m1(vfloat32m1_t src) +{ + vuint32m1_t a = vreinterpret_v_f32m1_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_f32m2_u32m2(vfloat32m2_t src) +{ + vuint32m2_t a = vreinterpret_v_f32m2_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_f32m4_u32m4(vfloat32m4_t src) +{ + vuint32m4_t a = vreinterpret_v_f32m4_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_f32m8_u32m8(vfloat32m8_t src) +{ + vuint32m8_t a = vreinterpret_v_f32m8_u32m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_i64m1_u64m1(vint64m1_t src) +{ + vuint64m1_t a = vreinterpret_v_i64m1_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_i64m2_u64m2(vint64m2_t src) +{ + vuint64m2_t a = vreinterpret_v_i64m2_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_i64m4_u64m4(vint64m4_t src) +{ + vuint64m4_t a = vreinterpret_v_i64m4_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_i64m8_u64m8(vint64m8_t src) +{ + vuint64m8_t a = vreinterpret_v_i64m8_u64m8(src); + return a; +} + +vfloat64m1_t test_vreinterpret_v_i64m1_f64m1(vint64m1_t src) +{ + vfloat64m1_t a = vreinterpret_v_i64m1_f64m1(src); + return a; +} + +vfloat64m2_t test_vreinterpret_v_i64m2_f64m2(vint64m2_t src) +{ + vfloat64m2_t a = vreinterpret_v_i64m2_f64m2(src); + return a; +} + +vfloat64m4_t test_vreinterpret_v_i64m4_f64m4(vint64m4_t src) +{ + vfloat64m4_t a = vreinterpret_v_i64m4_f64m4(src); + return a; +} + +vfloat64m8_t test_vreinterpret_v_i64m8_f64m8(vint64m8_t src) +{ + vfloat64m8_t a = vreinterpret_v_i64m8_f64m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_u64m1_i64m1(vuint64m1_t src) +{ + vint64m1_t a = vreinterpret_v_u64m1_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_u64m2_i64m2(vuint64m2_t src) +{ + vint64m2_t a = vreinterpret_v_u64m2_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_u64m4_i64m4(vuint64m4_t src) +{ + vint64m4_t a = vreinterpret_v_u64m4_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_u64m8_i64m8(vuint64m8_t src) +{ + vint64m8_t a = vreinterpret_v_u64m8_i64m8(src); + return a; +} + +vfloat64m1_t test_vreinterpret_v_u64m1_f64m1(vuint64m1_t src) +{ + vfloat64m1_t a = vreinterpret_v_u64m1_f64m1(src); + return a; +} + +vfloat64m2_t test_vreinterpret_v_u64m2_f64m2(vuint64m2_t src) +{ + vfloat64m2_t a = vreinterpret_v_u64m2_f64m2(src); + return a; +} + +vfloat64m4_t test_vreinterpret_v_u64m4_f64m4(vuint64m4_t src) +{ + vfloat64m4_t a = vreinterpret_v_u64m4_f64m4(src); + return a; +} + +vfloat64m8_t test_vreinterpret_v_u64m8_f64m8(vuint64m8_t src) +{ + vfloat64m8_t a = vreinterpret_v_u64m8_f64m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_f64m1_i64m1(vfloat64m1_t src) +{ + vint64m1_t a = vreinterpret_v_f64m1_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_f64m2_i64m2(vfloat64m2_t src) +{ + vint64m2_t a = vreinterpret_v_f64m2_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_f64m4_i64m4(vfloat64m4_t src) +{ + vint64m4_t a = vreinterpret_v_f64m4_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_f64m8_i64m8(vfloat64m8_t src) +{ + vint64m8_t a = vreinterpret_v_f64m8_i64m8(src); + return a; +} + +vuint64m1_t 
test_vreinterpret_v_f64m1_u64m1(vfloat64m1_t src) +{ + vuint64m1_t a = vreinterpret_v_f64m1_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_f64m2_u64m2(vfloat64m2_t src) +{ + vuint64m2_t a = vreinterpret_v_f64m2_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_f64m4_u64m4(vfloat64m4_t src) +{ + vuint64m4_t a = vreinterpret_v_f64m4_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_f64m8_u64m8(vfloat64m8_t src) +{ + vuint64m8_t a = vreinterpret_v_f64m8_u64m8(src); + return a; +} + +/* Reinterpret between different SEW under the same LMUL */ + +vint16mf4_t test_vreinterpret_v_i8mf4_i16mf4(vint8mf4_t src) +{ + vint16mf4_t a = vreinterpret_v_i8mf4_i16mf4(src); + return a; +} + +vint16mf2_t test_vreinterpret_v_i8mf2_i16mf2(vint8mf2_t src) +{ + vint16mf2_t a = vreinterpret_v_i8mf2_i16mf2(src); + return a; +} + +vint16m1_t test_vreinterpret_v_i8m1_i16m1(vint8m1_t src) +{ + vint16m1_t a = vreinterpret_v_i8m1_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_i8m2_i16m2(vint8m2_t src) +{ + vint16m2_t a = vreinterpret_v_i8m2_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_i8m4_i16m4(vint8m4_t src) +{ + vint16m4_t a = vreinterpret_v_i8m4_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_i8m8_i16m8(vint8m8_t src) +{ + vint16m8_t a = vreinterpret_v_i8m8_i16m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_i8mf2_i32mf2(vint8mf2_t src) +{ + vint32mf2_t a = vreinterpret_v_i8mf2_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_i8m1_i32m1(vint8m1_t src) +{ + vint32m1_t a = vreinterpret_v_i8m1_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_i8m2_i32m2(vint8m2_t src) +{ + vint32m2_t a = vreinterpret_v_i8m2_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_i8m4_i32m4(vint8m4_t src) +{ + vint32m4_t a = vreinterpret_v_i8m4_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_i8m8_i32m8(vint8m8_t src) +{ + vint32m8_t a = vreinterpret_v_i8m8_i32m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_i8m1_i64m1(vint8m1_t src) +{ + vint64m1_t a = vreinterpret_v_i8m1_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_i8m2_i64m2(vint8m2_t src) +{ + vint64m2_t a = vreinterpret_v_i8m2_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_i8m4_i64m4(vint8m4_t src) +{ + vint64m4_t a = vreinterpret_v_i8m4_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_i8m8_i64m8(vint8m8_t src) +{ + vint64m8_t a = vreinterpret_v_i8m8_i64m8(src); + return a; +} + +vint8mf4_t test_vreinterpret_v_i16mf4_i8mf4(vint16mf4_t src) +{ + vint8mf4_t a = vreinterpret_v_i16mf4_i8mf4(src); + return a; +} + +vint8mf2_t test_vreinterpret_v_i16mf2_i8mf2(vint16mf2_t src) +{ + vint8mf2_t a = vreinterpret_v_i16mf2_i8mf2(src); + return a; +} + +vint8m1_t test_vreinterpret_v_i16m1_i8m1(vint16m1_t src) +{ + vint8m1_t a = vreinterpret_v_i16m1_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_i16m2_i8m2(vint16m2_t src) +{ + vint8m2_t a = vreinterpret_v_i16m2_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_i16m4_i8m4(vint16m4_t src) +{ + vint8m4_t a = vreinterpret_v_i16m4_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_i16m8_i8m8(vint16m8_t src) +{ + vint8m8_t a = vreinterpret_v_i16m8_i8m8(src); + return a; +} + +vint32mf2_t test_vreinterpret_v_i16mf2_i32mf2(vint16mf2_t src) +{ + vint32mf2_t a = vreinterpret_v_i16mf2_i32mf2(src); + return a; +} + +vint32m1_t test_vreinterpret_v_i16m1_i32m1(vint16m1_t src) +{ + vint32m1_t a = vreinterpret_v_i16m1_i32m1(src); + return a; +} + 
+vint32m2_t test_vreinterpret_v_i16m2_i32m2(vint16m2_t src) +{ + vint32m2_t a = vreinterpret_v_i16m2_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_i16m4_i32m4(vint16m4_t src) +{ + vint32m4_t a = vreinterpret_v_i16m4_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_i16m8_i32m8(vint16m8_t src) +{ + vint32m8_t a = vreinterpret_v_i16m8_i32m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_i16m1_i64m1(vint16m1_t src) +{ + vint64m1_t a = vreinterpret_v_i16m1_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_i16m2_i64m2(vint16m2_t src) +{ + vint64m2_t a = vreinterpret_v_i16m2_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_i16m4_i64m4(vint16m4_t src) +{ + vint64m4_t a = vreinterpret_v_i16m4_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_i16m8_i64m8(vint16m8_t src) +{ + vint64m8_t a = vreinterpret_v_i16m8_i64m8(src); + return a; +} + +vint8mf2_t test_vreinterpret_v_i32mf2_i8mf2(vint32mf2_t src) +{ + vint8mf2_t a = vreinterpret_v_i32mf2_i8mf2(src); + return a; +} + +vint8m1_t test_vreinterpret_v_i32m1_i8m1(vint32m1_t src) +{ + vint8m1_t a = vreinterpret_v_i32m1_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_i32m2_i8m2(vint32m2_t src) +{ + vint8m2_t a = vreinterpret_v_i32m2_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_i32m4_i8m4(vint32m4_t src) +{ + vint8m4_t a = vreinterpret_v_i32m4_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_i32m8_i8m8(vint32m8_t src) +{ + vint8m8_t a = vreinterpret_v_i32m8_i8m8(src); + return a; +} + +vint16mf2_t test_vreinterpret_v_i32mf2_i16mf2(vint32mf2_t src) +{ + vint16mf2_t a = vreinterpret_v_i32mf2_i16mf2(src); + return a; +} + +vint16m1_t test_vreinterpret_v_i32m1_i16m1(vint32m1_t src) +{ + vint16m1_t a = vreinterpret_v_i32m1_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_i32m2_i16m2(vint32m2_t src) +{ + vint16m2_t a = vreinterpret_v_i32m2_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_i32m4_i16m4(vint32m4_t src) +{ + vint16m4_t a = vreinterpret_v_i32m4_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_i32m8_i16m8(vint32m8_t src) +{ + vint16m8_t a = vreinterpret_v_i32m8_i16m8(src); + return a; +} + +vint64m1_t test_vreinterpret_v_i32m1_i64m1(vint32m1_t src) +{ + vint64m1_t a = vreinterpret_v_i32m1_i64m1(src); + return a; +} + +vint64m2_t test_vreinterpret_v_i32m2_i64m2(vint32m2_t src) +{ + vint64m2_t a = vreinterpret_v_i32m2_i64m2(src); + return a; +} + +vint64m4_t test_vreinterpret_v_i32m4_i64m4(vint32m4_t src) +{ + vint64m4_t a = vreinterpret_v_i32m4_i64m4(src); + return a; +} + +vint64m8_t test_vreinterpret_v_i32m8_i64m8(vint32m8_t src) +{ + vint64m8_t a = vreinterpret_v_i32m8_i64m8(src); + return a; +} + +vint8m1_t test_vreinterpret_v_i64m1_i8m1(vint64m1_t src) +{ + vint8m1_t a = vreinterpret_v_i64m1_i8m1(src); + return a; +} + +vint8m2_t test_vreinterpret_v_i64m2_i8m2(vint64m2_t src) +{ + vint8m2_t a = vreinterpret_v_i64m2_i8m2(src); + return a; +} + +vint8m4_t test_vreinterpret_v_i64m4_i8m4(vint64m4_t src) +{ + vint8m4_t a = vreinterpret_v_i64m4_i8m4(src); + return a; +} + +vint8m8_t test_vreinterpret_v_i64m8_i8m8(vint64m8_t src) +{ + vint8m8_t a = vreinterpret_v_i64m8_i8m8(src); + return a; +} + +vint16m1_t test_vreinterpret_v_i64m1_i16m1(vint64m1_t src) +{ + vint16m1_t a = vreinterpret_v_i64m1_i16m1(src); + return a; +} + +vint16m2_t test_vreinterpret_v_i64m2_i16m2(vint64m2_t src) +{ + vint16m2_t a = vreinterpret_v_i64m2_i16m2(src); + return a; +} + +vint16m4_t test_vreinterpret_v_i64m4_i16m4(vint64m4_t src) +{ + 
vint16m4_t a = vreinterpret_v_i64m4_i16m4(src); + return a; +} + +vint16m8_t test_vreinterpret_v_i64m8_i16m8(vint64m8_t src) +{ + vint16m8_t a = vreinterpret_v_i64m8_i16m8(src); + return a; +} + +vint32m1_t test_vreinterpret_v_i64m1_i32m1(vint64m1_t src) +{ + vint32m1_t a = vreinterpret_v_i64m1_i32m1(src); + return a; +} + +vint32m2_t test_vreinterpret_v_i64m2_i32m2(vint64m2_t src) +{ + vint32m2_t a = vreinterpret_v_i64m2_i32m2(src); + return a; +} + +vint32m4_t test_vreinterpret_v_i64m4_i32m4(vint64m4_t src) +{ + vint32m4_t a = vreinterpret_v_i64m4_i32m4(src); + return a; +} + +vint32m8_t test_vreinterpret_v_i64m8_i32m8(vint64m8_t src) +{ + vint32m8_t a = vreinterpret_v_i64m8_i32m8(src); + return a; +} + +vuint16mf4_t test_vreinterpret_v_u8mf4_u16mf4(vuint8mf4_t src) +{ + vuint16mf4_t a = vreinterpret_v_u8mf4_u16mf4(src); + return a; +} + +vuint16mf2_t test_vreinterpret_v_u8mf2_u16mf2(vuint8mf2_t src) +{ + vuint16mf2_t a = vreinterpret_v_u8mf2_u16mf2(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_u8m1_u16m1(vuint8m1_t src) +{ + vuint16m1_t a = vreinterpret_v_u8m1_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_u8m2_u16m2(vuint8m2_t src) +{ + vuint16m2_t a = vreinterpret_v_u8m2_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_u8m4_u16m4(vuint8m4_t src) +{ + vuint16m4_t a = vreinterpret_v_u8m4_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_u8m8_u16m8(vuint8m8_t src) +{ + vuint16m8_t a = vreinterpret_v_u8m8_u16m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_u8mf2_u32mf2(vuint8mf2_t src) +{ + vuint32mf2_t a = vreinterpret_v_u8mf2_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_u8m1_u32m1(vuint8m1_t src) +{ + vuint32m1_t a = vreinterpret_v_u8m1_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_u8m2_u32m2(vuint8m2_t src) +{ + vuint32m2_t a = vreinterpret_v_u8m2_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_u8m4_u32m4(vuint8m4_t src) +{ + vuint32m4_t a = vreinterpret_v_u8m4_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_u8m8_u32m8(vuint8m8_t src) +{ + vuint32m8_t a = vreinterpret_v_u8m8_u32m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_u8m1_u64m1(vuint8m1_t src) +{ + vuint64m1_t a = vreinterpret_v_u8m1_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_u8m2_u64m2(vuint8m2_t src) +{ + vuint64m2_t a = vreinterpret_v_u8m2_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_u8m4_u64m4(vuint8m4_t src) +{ + vuint64m4_t a = vreinterpret_v_u8m4_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_u8m8_u64m8(vuint8m8_t src) +{ + vuint64m8_t a = vreinterpret_v_u8m8_u64m8(src); + return a; +} + +vuint8mf4_t test_vreinterpret_v_u16mf4_u8mf4(vuint16mf4_t src) +{ + vuint8mf4_t a = vreinterpret_v_u16mf4_u8mf4(src); + return a; +} + +vuint8mf2_t test_vreinterpret_v_u16mf2_u8mf2(vuint16mf2_t src) +{ + vuint8mf2_t a = vreinterpret_v_u16mf2_u8mf2(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_u16m1_u8m1(vuint16m1_t src) +{ + vuint8m1_t a = vreinterpret_v_u16m1_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_u16m2_u8m2(vuint16m2_t src) +{ + vuint8m2_t a = vreinterpret_v_u16m2_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_u16m4_u8m4(vuint16m4_t src) +{ + vuint8m4_t a = vreinterpret_v_u16m4_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_u16m8_u8m8(vuint16m8_t src) +{ + vuint8m8_t a = vreinterpret_v_u16m8_u8m8(src); + return a; +} + +vuint32mf2_t test_vreinterpret_v_u16mf2_u32mf2(vuint16mf2_t src) +{ + 
vuint32mf2_t a = vreinterpret_v_u16mf2_u32mf2(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_u16m1_u32m1(vuint16m1_t src) +{ + vuint32m1_t a = vreinterpret_v_u16m1_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_u16m2_u32m2(vuint16m2_t src) +{ + vuint32m2_t a = vreinterpret_v_u16m2_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_u16m4_u32m4(vuint16m4_t src) +{ + vuint32m4_t a = vreinterpret_v_u16m4_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_u16m8_u32m8(vuint16m8_t src) +{ + vuint32m8_t a = vreinterpret_v_u16m8_u32m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_u16m1_u64m1(vuint16m1_t src) +{ + vuint64m1_t a = vreinterpret_v_u16m1_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_u16m2_u64m2(vuint16m2_t src) +{ + vuint64m2_t a = vreinterpret_v_u16m2_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_u16m4_u64m4(vuint16m4_t src) +{ + vuint64m4_t a = vreinterpret_v_u16m4_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_u16m8_u64m8(vuint16m8_t src) +{ + vuint64m8_t a = vreinterpret_v_u16m8_u64m8(src); + return a; +} + +vuint8mf2_t test_vreinterpret_v_u32mf2_u8mf2(vuint32mf2_t src) +{ + vuint8mf2_t a = vreinterpret_v_u32mf2_u8mf2(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_u32m1_u8m1(vuint32m1_t src) +{ + vuint8m1_t a = vreinterpret_v_u32m1_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_u32m2_u8m2(vuint32m2_t src) +{ + vuint8m2_t a = vreinterpret_v_u32m2_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_u32m4_u8m4(vuint32m4_t src) +{ + vuint8m4_t a = vreinterpret_v_u32m4_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_u32m8_u8m8(vuint32m8_t src) +{ + vuint8m8_t a = vreinterpret_v_u32m8_u8m8(src); + return a; +} + +vuint16mf2_t test_vreinterpret_v_u32mf2_u16mf2(vuint32mf2_t src) +{ + vuint16mf2_t a = vreinterpret_v_u32mf2_u16mf2(src); + return a; +} + +vuint16m1_t test_vreinterpret_v_u32m1_u16m1(vuint32m1_t src) +{ + vuint16m1_t a = vreinterpret_v_u32m1_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_u32m2_u16m2(vuint32m2_t src) +{ + vuint16m2_t a = vreinterpret_v_u32m2_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_u32m4_u16m4(vuint32m4_t src) +{ + vuint16m4_t a = vreinterpret_v_u32m4_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_u32m8_u16m8(vuint32m8_t src) +{ + vuint16m8_t a = vreinterpret_v_u32m8_u16m8(src); + return a; +} + +vuint64m1_t test_vreinterpret_v_u32m1_u64m1(vuint32m1_t src) +{ + vuint64m1_t a = vreinterpret_v_u32m1_u64m1(src); + return a; +} + +vuint64m2_t test_vreinterpret_v_u32m2_u64m2(vuint32m2_t src) +{ + vuint64m2_t a = vreinterpret_v_u32m2_u64m2(src); + return a; +} + +vuint64m4_t test_vreinterpret_v_u32m4_u64m4(vuint32m4_t src) +{ + vuint64m4_t a = vreinterpret_v_u32m4_u64m4(src); + return a; +} + +vuint64m8_t test_vreinterpret_v_u32m8_u64m8(vuint32m8_t src) +{ + vuint64m8_t a = vreinterpret_v_u32m8_u64m8(src); + return a; +} + +vuint8m1_t test_vreinterpret_v_u64m1_u8m1(vuint64m1_t src) +{ + vuint8m1_t a = vreinterpret_v_u64m1_u8m1(src); + return a; +} + +vuint8m2_t test_vreinterpret_v_u64m2_u8m2(vuint64m2_t src) +{ + vuint8m2_t a = vreinterpret_v_u64m2_u8m2(src); + return a; +} + +vuint8m4_t test_vreinterpret_v_u64m4_u8m4(vuint64m4_t src) +{ + vuint8m4_t a = vreinterpret_v_u64m4_u8m4(src); + return a; +} + +vuint8m8_t test_vreinterpret_v_u64m8_u8m8(vuint64m8_t src) +{ + vuint8m8_t a = vreinterpret_v_u64m8_u8m8(src); + return a; +} + +vuint16m1_t 
test_vreinterpret_v_u64m1_u16m1(vuint64m1_t src) +{ + vuint16m1_t a = vreinterpret_v_u64m1_u16m1(src); + return a; +} + +vuint16m2_t test_vreinterpret_v_u64m2_u16m2(vuint64m2_t src) +{ + vuint16m2_t a = vreinterpret_v_u64m2_u16m2(src); + return a; +} + +vuint16m4_t test_vreinterpret_v_u64m4_u16m4(vuint64m4_t src) +{ + vuint16m4_t a = vreinterpret_v_u64m4_u16m4(src); + return a; +} + +vuint16m8_t test_vreinterpret_v_u64m8_u16m8(vuint64m8_t src) +{ + vuint16m8_t a = vreinterpret_v_u64m8_u16m8(src); + return a; +} + +vuint32m1_t test_vreinterpret_v_u64m1_u32m1(vuint64m1_t src) +{ + vuint32m1_t a = vreinterpret_v_u64m1_u32m1(src); + return a; +} + +vuint32m2_t test_vreinterpret_v_u64m2_u32m2(vuint64m2_t src) +{ + vuint32m2_t a = vreinterpret_v_u64m2_u32m2(src); + return a; +} + +vuint32m4_t test_vreinterpret_v_u64m4_u32m4(vuint64m4_t src) +{ + vuint32m4_t a = vreinterpret_v_u64m4_u32m4(src); + return a; +} + +vuint32m8_t test_vreinterpret_v_u64m8_u32m8(vuint64m8_t src) +{ + vuint32m8_t a = vreinterpret_v_u64m8_u32m8(src); + return a; +} + +vint8mf4_t test_vlmul_ext_v_i8mf8_i8mf4(vint8mf8_t op1) +{ + vint8mf4_t a = vlmul_ext_v_i8mf8_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_ext_v_i8mf8_i8mf2(vint8mf8_t op1) +{ + vint8mf2_t a = vlmul_ext_v_i8mf8_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_ext_v_i8mf8_i8m1(vint8mf8_t op1) +{ + vint8m1_t a = vlmul_ext_v_i8mf8_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8mf8_i8m2(vint8mf8_t op1) +{ + vint8m2_t a = vlmul_ext_v_i8mf8_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8mf8_i8m4(vint8mf8_t op1) +{ + vint8m4_t a = vlmul_ext_v_i8mf8_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8mf8_i8m8(vint8mf8_t op1) +{ + vint8m8_t a = vlmul_ext_v_i8mf8_i8m8(op1); + return a; +} + +vint8mf2_t test_vlmul_ext_v_i8mf4_i8mf2(vint8mf4_t op1) +{ + vint8mf2_t a = vlmul_ext_v_i8mf4_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_ext_v_i8mf4_i8m1(vint8mf4_t op1) +{ + vint8m1_t a = vlmul_ext_v_i8mf4_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8mf4_i8m2(vint8mf4_t op1) +{ + vint8m2_t a = vlmul_ext_v_i8mf4_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8mf4_i8m4(vint8mf4_t op1) +{ + vint8m4_t a = vlmul_ext_v_i8mf4_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8mf4_i8m8(vint8mf4_t op1) +{ + vint8m8_t a = vlmul_ext_v_i8mf4_i8m8(op1); + return a; +} + +vint8m1_t test_vlmul_ext_v_i8mf2_i8m1(vint8mf2_t op1) +{ + vint8m1_t a = vlmul_ext_v_i8mf2_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8mf2_i8m2(vint8mf2_t op1) +{ + vint8m2_t a = vlmul_ext_v_i8mf2_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8mf2_i8m4(vint8mf2_t op1) +{ + vint8m4_t a = vlmul_ext_v_i8mf2_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8mf2_i8m8(vint8mf2_t op1) +{ + vint8m8_t a = vlmul_ext_v_i8mf2_i8m8(op1); + return a; +} + +vint8m2_t test_vlmul_ext_v_i8m1_i8m2(vint8m1_t op1) +{ + vint8m2_t a = vlmul_ext_v_i8m1_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8m1_i8m4(vint8m1_t op1) +{ + vint8m4_t a = vlmul_ext_v_i8m1_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8m1_i8m8(vint8m1_t op1) +{ + vint8m8_t a = vlmul_ext_v_i8m1_i8m8(op1); + return a; +} + +vint8m4_t test_vlmul_ext_v_i8m2_i8m4(vint8m2_t op1) +{ + vint8m4_t a = vlmul_ext_v_i8m2_i8m4(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8m2_i8m8(vint8m2_t op1) +{ + vint8m8_t a = vlmul_ext_v_i8m2_i8m8(op1); + return a; +} + +vint8m8_t test_vlmul_ext_v_i8m4_i8m8(vint8m4_t op1) +{ + vint8m8_t a = vlmul_ext_v_i8m4_i8m8(op1); + 
return a; +} + +vint16mf2_t test_vlmul_ext_v_i16mf4_i16mf2(vint16mf4_t op1) +{ + vint16mf2_t a = vlmul_ext_v_i16mf4_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_ext_v_i16mf4_i16m1(vint16mf4_t op1) +{ + vint16m1_t a = vlmul_ext_v_i16mf4_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_ext_v_i16mf4_i16m2(vint16mf4_t op1) +{ + vint16m2_t a = vlmul_ext_v_i16mf4_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16mf4_i16m4(vint16mf4_t op1) +{ + vint16m4_t a = vlmul_ext_v_i16mf4_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16mf4_i16m8(vint16mf4_t op1) +{ + vint16m8_t a = vlmul_ext_v_i16mf4_i16m8(op1); + return a; +} + +vint16m1_t test_vlmul_ext_v_i16mf2_i16m1(vint16mf2_t op1) +{ + vint16m1_t a = vlmul_ext_v_i16mf2_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_ext_v_i16mf2_i16m2(vint16mf2_t op1) +{ + vint16m2_t a = vlmul_ext_v_i16mf2_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16mf2_i16m4(vint16mf2_t op1) +{ + vint16m4_t a = vlmul_ext_v_i16mf2_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16mf2_i16m8(vint16mf2_t op1) +{ + vint16m8_t a = vlmul_ext_v_i16mf2_i16m8(op1); + return a; +} + +vint16m2_t test_vlmul_ext_v_i16m1_i16m2(vint16m1_t op1) +{ + vint16m2_t a = vlmul_ext_v_i16m1_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16m1_i16m4(vint16m1_t op1) +{ + vint16m4_t a = vlmul_ext_v_i16m1_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16m1_i16m8(vint16m1_t op1) +{ + vint16m8_t a = vlmul_ext_v_i16m1_i16m8(op1); + return a; +} + +vint16m4_t test_vlmul_ext_v_i16m2_i16m4(vint16m2_t op1) +{ + vint16m4_t a = vlmul_ext_v_i16m2_i16m4(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16m2_i16m8(vint16m2_t op1) +{ + vint16m8_t a = vlmul_ext_v_i16m2_i16m8(op1); + return a; +} + +vint16m8_t test_vlmul_ext_v_i16m4_i16m8(vint16m4_t op1) +{ + vint16m8_t a = vlmul_ext_v_i16m4_i16m8(op1); + return a; +} + +vint32m1_t test_vlmul_ext_v_i32mf2_i32m1(vint32mf2_t op1) +{ + vint32m1_t a = vlmul_ext_v_i32mf2_i32m1(op1); + return a; +} + +vint32m2_t test_vlmul_ext_v_i32mf2_i32m2(vint32mf2_t op1) +{ + vint32m2_t a = vlmul_ext_v_i32mf2_i32m2(op1); + return a; +} + +vint32m4_t test_vlmul_ext_v_i32mf2_i32m4(vint32mf2_t op1) +{ + vint32m4_t a = vlmul_ext_v_i32mf2_i32m4(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32mf2_i32m8(vint32mf2_t op1) +{ + vint32m8_t a = vlmul_ext_v_i32mf2_i32m8(op1); + return a; +} + +vint32m2_t test_vlmul_ext_v_i32m1_i32m2(vint32m1_t op1) +{ + vint32m2_t a = vlmul_ext_v_i32m1_i32m2(op1); + return a; +} + +vint32m4_t test_vlmul_ext_v_i32m1_i32m4(vint32m1_t op1) +{ + vint32m4_t a = vlmul_ext_v_i32m1_i32m4(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32m1_i32m8(vint32m1_t op1) +{ + vint32m8_t a = vlmul_ext_v_i32m1_i32m8(op1); + return a; +} + +vint32m4_t test_vlmul_ext_v_i32m2_i32m4(vint32m2_t op1) +{ + vint32m4_t a = vlmul_ext_v_i32m2_i32m4(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32m2_i32m8(vint32m2_t op1) +{ + vint32m8_t a = vlmul_ext_v_i32m2_i32m8(op1); + return a; +} + +vint32m8_t test_vlmul_ext_v_i32m4_i32m8(vint32m4_t op1) +{ + vint32m8_t a = vlmul_ext_v_i32m4_i32m8(op1); + return a; +} + +vint64m2_t test_vlmul_ext_v_i64m1_i64m2(vint64m1_t op1) +{ + vint64m2_t a = vlmul_ext_v_i64m1_i64m2(op1); + return a; +} + +vint64m4_t test_vlmul_ext_v_i64m1_i64m4(vint64m1_t op1) +{ + vint64m4_t a = vlmul_ext_v_i64m1_i64m4(op1); + return a; +} + +vint64m8_t test_vlmul_ext_v_i64m1_i64m8(vint64m1_t op1) +{ + vint64m8_t a = vlmul_ext_v_i64m1_i64m8(op1); + return a; +} + +vint64m4_t 
test_vlmul_ext_v_i64m2_i64m4(vint64m2_t op1) +{ + vint64m4_t a = vlmul_ext_v_i64m2_i64m4(op1); + return a; +} + +vint64m8_t test_vlmul_ext_v_i64m2_i64m8(vint64m2_t op1) +{ + vint64m8_t a = vlmul_ext_v_i64m2_i64m8(op1); + return a; +} + +vint64m8_t test_vlmul_ext_v_i64m4_i64m8(vint64m4_t op1) +{ + vint64m8_t a = vlmul_ext_v_i64m4_i64m8(op1); + return a; +} + +vuint8mf4_t test_vlmul_ext_v_u8mf8_u8mf4(vuint8mf8_t op1) +{ + vuint8mf4_t a = vlmul_ext_v_u8mf8_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_ext_v_u8mf8_u8mf2(vuint8mf8_t op1) +{ + vuint8mf2_t a = vlmul_ext_v_u8mf8_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_ext_v_u8mf8_u8m1(vuint8mf8_t op1) +{ + vuint8m1_t a = vlmul_ext_v_u8mf8_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8mf8_u8m2(vuint8mf8_t op1) +{ + vuint8m2_t a = vlmul_ext_v_u8mf8_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8mf8_u8m4(vuint8mf8_t op1) +{ + vuint8m4_t a = vlmul_ext_v_u8mf8_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8mf8_u8m8(vuint8mf8_t op1) +{ + vuint8m8_t a = vlmul_ext_v_u8mf8_u8m8(op1); + return a; +} + +vuint8mf2_t test_vlmul_ext_v_u8mf4_u8mf2(vuint8mf4_t op1) +{ + vuint8mf2_t a = vlmul_ext_v_u8mf4_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_ext_v_u8mf4_u8m1(vuint8mf4_t op1) +{ + vuint8m1_t a = vlmul_ext_v_u8mf4_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8mf4_u8m2(vuint8mf4_t op1) +{ + vuint8m2_t a = vlmul_ext_v_u8mf4_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8mf4_u8m4(vuint8mf4_t op1) +{ + vuint8m4_t a = vlmul_ext_v_u8mf4_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8mf4_u8m8(vuint8mf4_t op1) +{ + vuint8m8_t a = vlmul_ext_v_u8mf4_u8m8(op1); + return a; +} + +vuint8m1_t test_vlmul_ext_v_u8mf2_u8m1(vuint8mf2_t op1) +{ + vuint8m1_t a = vlmul_ext_v_u8mf2_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8mf2_u8m2(vuint8mf2_t op1) +{ + vuint8m2_t a = vlmul_ext_v_u8mf2_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8mf2_u8m4(vuint8mf2_t op1) +{ + vuint8m4_t a = vlmul_ext_v_u8mf2_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8mf2_u8m8(vuint8mf2_t op1) +{ + vuint8m8_t a = vlmul_ext_v_u8mf2_u8m8(op1); + return a; +} + +vuint8m2_t test_vlmul_ext_v_u8m1_u8m2(vuint8m1_t op1) +{ + vuint8m2_t a = vlmul_ext_v_u8m1_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8m1_u8m4(vuint8m1_t op1) +{ + vuint8m4_t a = vlmul_ext_v_u8m1_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8m1_u8m8(vuint8m1_t op1) +{ + vuint8m8_t a = vlmul_ext_v_u8m1_u8m8(op1); + return a; +} + +vuint8m4_t test_vlmul_ext_v_u8m2_u8m4(vuint8m2_t op1) +{ + vuint8m4_t a = vlmul_ext_v_u8m2_u8m4(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8m2_u8m8(vuint8m2_t op1) +{ + vuint8m8_t a = vlmul_ext_v_u8m2_u8m8(op1); + return a; +} + +vuint8m8_t test_vlmul_ext_v_u8m4_u8m8(vuint8m4_t op1) +{ + vuint8m8_t a = vlmul_ext_v_u8m4_u8m8(op1); + return a; +} + +vuint16mf2_t test_vlmul_ext_v_u16mf4_u16mf2(vuint16mf4_t op1) +{ + vuint16mf2_t a = vlmul_ext_v_u16mf4_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_ext_v_u16mf4_u16m1(vuint16mf4_t op1) +{ + vuint16m1_t a = vlmul_ext_v_u16mf4_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_ext_v_u16mf4_u16m2(vuint16mf4_t op1) +{ + vuint16m2_t a = vlmul_ext_v_u16mf4_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16mf4_u16m4(vuint16mf4_t op1) +{ + vuint16m4_t a = vlmul_ext_v_u16mf4_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16mf4_u16m8(vuint16mf4_t op1) +{ + vuint16m8_t a = 
vlmul_ext_v_u16mf4_u16m8(op1); + return a; +} + +vuint16m1_t test_vlmul_ext_v_u16mf2_u16m1(vuint16mf2_t op1) +{ + vuint16m1_t a = vlmul_ext_v_u16mf2_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_ext_v_u16mf2_u16m2(vuint16mf2_t op1) +{ + vuint16m2_t a = vlmul_ext_v_u16mf2_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16mf2_u16m4(vuint16mf2_t op1) +{ + vuint16m4_t a = vlmul_ext_v_u16mf2_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16mf2_u16m8(vuint16mf2_t op1) +{ + vuint16m8_t a = vlmul_ext_v_u16mf2_u16m8(op1); + return a; +} + +vuint16m2_t test_vlmul_ext_v_u16m1_u16m2(vuint16m1_t op1) +{ + vuint16m2_t a = vlmul_ext_v_u16m1_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16m1_u16m4(vuint16m1_t op1) +{ + vuint16m4_t a = vlmul_ext_v_u16m1_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16m1_u16m8(vuint16m1_t op1) +{ + vuint16m8_t a = vlmul_ext_v_u16m1_u16m8(op1); + return a; +} + +vuint16m4_t test_vlmul_ext_v_u16m2_u16m4(vuint16m2_t op1) +{ + vuint16m4_t a = vlmul_ext_v_u16m2_u16m4(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16m2_u16m8(vuint16m2_t op1) +{ + vuint16m8_t a = vlmul_ext_v_u16m2_u16m8(op1); + return a; +} + +vuint16m8_t test_vlmul_ext_v_u16m4_u16m8(vuint16m4_t op1) +{ + vuint16m8_t a = vlmul_ext_v_u16m4_u16m8(op1); + return a; +} + +vuint32m1_t test_vlmul_ext_v_u32mf2_u32m1(vuint32mf2_t op1) +{ + vuint32m1_t a = vlmul_ext_v_u32mf2_u32m1(op1); + return a; +} + +vuint32m2_t test_vlmul_ext_v_u32mf2_u32m2(vuint32mf2_t op1) +{ + vuint32m2_t a = vlmul_ext_v_u32mf2_u32m2(op1); + return a; +} + +vuint32m4_t test_vlmul_ext_v_u32mf2_u32m4(vuint32mf2_t op1) +{ + vuint32m4_t a = vlmul_ext_v_u32mf2_u32m4(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32mf2_u32m8(vuint32mf2_t op1) +{ + vuint32m8_t a = vlmul_ext_v_u32mf2_u32m8(op1); + return a; +} + +vuint32m2_t test_vlmul_ext_v_u32m1_u32m2(vuint32m1_t op1) +{ + vuint32m2_t a = vlmul_ext_v_u32m1_u32m2(op1); + return a; +} + +vuint32m4_t test_vlmul_ext_v_u32m1_u32m4(vuint32m1_t op1) +{ + vuint32m4_t a = vlmul_ext_v_u32m1_u32m4(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32m1_u32m8(vuint32m1_t op1) +{ + vuint32m8_t a = vlmul_ext_v_u32m1_u32m8(op1); + return a; +} + +vuint32m4_t test_vlmul_ext_v_u32m2_u32m4(vuint32m2_t op1) +{ + vuint32m4_t a = vlmul_ext_v_u32m2_u32m4(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32m2_u32m8(vuint32m2_t op1) +{ + vuint32m8_t a = vlmul_ext_v_u32m2_u32m8(op1); + return a; +} + +vuint32m8_t test_vlmul_ext_v_u32m4_u32m8(vuint32m4_t op1) +{ + vuint32m8_t a = vlmul_ext_v_u32m4_u32m8(op1); + return a; +} + +vuint64m2_t test_vlmul_ext_v_u64m1_u64m2(vuint64m1_t op1) +{ + vuint64m2_t a = vlmul_ext_v_u64m1_u64m2(op1); + return a; +} + +vuint64m4_t test_vlmul_ext_v_u64m1_u64m4(vuint64m1_t op1) +{ + vuint64m4_t a = vlmul_ext_v_u64m1_u64m4(op1); + return a; +} + +vuint64m8_t test_vlmul_ext_v_u64m1_u64m8(vuint64m1_t op1) +{ + vuint64m8_t a = vlmul_ext_v_u64m1_u64m8(op1); + return a; +} + +vuint64m4_t test_vlmul_ext_v_u64m2_u64m4(vuint64m2_t op1) +{ + vuint64m4_t a = vlmul_ext_v_u64m2_u64m4(op1); + return a; +} + +vuint64m8_t test_vlmul_ext_v_u64m2_u64m8(vuint64m2_t op1) +{ + vuint64m8_t a = vlmul_ext_v_u64m2_u64m8(op1); + return a; +} + +vuint64m8_t test_vlmul_ext_v_u64m4_u64m8(vuint64m4_t op1) +{ + vuint64m8_t a = vlmul_ext_v_u64m4_u64m8(op1); + return a; +} + +vfloat32m1_t test_vlmul_ext_v_f32mf2_f32m1(vfloat32mf2_t op1) +{ + vfloat32m1_t a = vlmul_ext_v_f32mf2_f32m1(op1); + return a; +} + +vfloat32m2_t 
test_vlmul_ext_v_f32mf2_f32m2(vfloat32mf2_t op1) +{ + vfloat32m2_t a = vlmul_ext_v_f32mf2_f32m2(op1); + return a; +} + +vfloat32m4_t test_vlmul_ext_v_f32mf2_f32m4(vfloat32mf2_t op1) +{ + vfloat32m4_t a = vlmul_ext_v_f32mf2_f32m4(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32mf2_f32m8(vfloat32mf2_t op1) +{ + vfloat32m8_t a = vlmul_ext_v_f32mf2_f32m8(op1); + return a; +} + +vfloat32m2_t test_vlmul_ext_v_f32m1_f32m2(vfloat32m1_t op1) +{ + vfloat32m2_t a = vlmul_ext_v_f32m1_f32m2(op1); + return a; +} + +vfloat32m4_t test_vlmul_ext_v_f32m1_f32m4(vfloat32m1_t op1) +{ + vfloat32m4_t a = vlmul_ext_v_f32m1_f32m4(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32m1_f32m8(vfloat32m1_t op1) +{ + vfloat32m8_t a = vlmul_ext_v_f32m1_f32m8(op1); + return a; +} + +vfloat32m4_t test_vlmul_ext_v_f32m2_f32m4(vfloat32m2_t op1) +{ + vfloat32m4_t a = vlmul_ext_v_f32m2_f32m4(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32m2_f32m8(vfloat32m2_t op1) +{ + vfloat32m8_t a = vlmul_ext_v_f32m2_f32m8(op1); + return a; +} + +vfloat32m8_t test_vlmul_ext_v_f32m4_f32m8(vfloat32m4_t op1) +{ + vfloat32m8_t a = vlmul_ext_v_f32m4_f32m8(op1); + return a; +} + +vfloat64m2_t test_vlmul_ext_v_f64m1_f64m2(vfloat64m1_t op1) +{ + vfloat64m2_t a = vlmul_ext_v_f64m1_f64m2(op1); + return a; +} + +vfloat64m4_t test_vlmul_ext_v_f64m1_f64m4(vfloat64m1_t op1) +{ + vfloat64m4_t a = vlmul_ext_v_f64m1_f64m4(op1); + return a; +} + +vfloat64m8_t test_vlmul_ext_v_f64m1_f64m8(vfloat64m1_t op1) +{ + vfloat64m8_t a = vlmul_ext_v_f64m1_f64m8(op1); + return a; +} + +vfloat64m4_t test_vlmul_ext_v_f64m2_f64m4(vfloat64m2_t op1) +{ + vfloat64m4_t a = vlmul_ext_v_f64m2_f64m4(op1); + return a; +} + +vfloat64m8_t test_vlmul_ext_v_f64m2_f64m8(vfloat64m2_t op1) +{ + vfloat64m8_t a = vlmul_ext_v_f64m2_f64m8(op1); + return a; +} + +vfloat64m8_t test_vlmul_ext_v_f64m4_f64m8(vfloat64m4_t op1) +{ + vfloat64m8_t a = vlmul_ext_v_f64m4_f64m8(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8mf4_i8mf8(vint8mf4_t op1) +{ + vint8mf8_t a = vlmul_trunc_v_i8mf4_i8mf8(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8mf2_i8mf8(vint8mf2_t op1) +{ + vint8mf8_t a = vlmul_trunc_v_i8mf2_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8mf2_i8mf4(vint8mf2_t op1) +{ + vint8mf4_t a = vlmul_trunc_v_i8mf2_i8mf4(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m1_i8mf8(vint8m1_t op1) +{ + vint8mf8_t a = vlmul_trunc_v_i8m1_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m1_i8mf4(vint8m1_t op1) +{ + vint8mf4_t a = vlmul_trunc_v_i8m1_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m1_i8mf2(vint8m1_t op1) +{ + vint8mf2_t a = vlmul_trunc_v_i8m1_i8mf2(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m2_i8mf8(vint8m2_t op1) +{ + vint8mf8_t a = vlmul_trunc_v_i8m2_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m2_i8mf4(vint8m2_t op1) +{ + vint8mf4_t a = vlmul_trunc_v_i8m2_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m2_i8mf2(vint8m2_t op1) +{ + vint8mf2_t a = vlmul_trunc_v_i8m2_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_trunc_v_i8m2_i8m1(vint8m2_t op1) +{ + vint8m1_t a = vlmul_trunc_v_i8m2_i8m1(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m4_i8mf8(vint8m4_t op1) +{ + vint8mf8_t a = vlmul_trunc_v_i8m4_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m4_i8mf4(vint8m4_t op1) +{ + vint8mf4_t a = vlmul_trunc_v_i8m4_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m4_i8mf2(vint8m4_t op1) +{ + vint8mf2_t a = 
vlmul_trunc_v_i8m4_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_trunc_v_i8m4_i8m1(vint8m4_t op1) +{ + vint8m1_t a = vlmul_trunc_v_i8m4_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_trunc_v_i8m4_i8m2(vint8m4_t op1) +{ + vint8m2_t a = vlmul_trunc_v_i8m4_i8m2(op1); + return a; +} + +vint8mf8_t test_vlmul_trunc_v_i8m8_i8mf8(vint8m8_t op1) +{ + vint8mf8_t a = vlmul_trunc_v_i8m8_i8mf8(op1); + return a; +} + +vint8mf4_t test_vlmul_trunc_v_i8m8_i8mf4(vint8m8_t op1) +{ + vint8mf4_t a = vlmul_trunc_v_i8m8_i8mf4(op1); + return a; +} + +vint8mf2_t test_vlmul_trunc_v_i8m8_i8mf2(vint8m8_t op1) +{ + vint8mf2_t a = vlmul_trunc_v_i8m8_i8mf2(op1); + return a; +} + +vint8m1_t test_vlmul_trunc_v_i8m8_i8m1(vint8m8_t op1) +{ + vint8m1_t a = vlmul_trunc_v_i8m8_i8m1(op1); + return a; +} + +vint8m2_t test_vlmul_trunc_v_i8m8_i8m2(vint8m8_t op1) +{ + vint8m2_t a = vlmul_trunc_v_i8m8_i8m2(op1); + return a; +} + +vint8m4_t test_vlmul_trunc_v_i8m8_i8m4(vint8m8_t op1) +{ + vint8m4_t a = vlmul_trunc_v_i8m8_i8m4(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16mf2_i16mf4(vint16mf2_t op1) +{ + vint16mf4_t a = vlmul_trunc_v_i16mf2_i16mf4(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m1_i16mf4(vint16m1_t op1) +{ + vint16mf4_t a = vlmul_trunc_v_i16m1_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m1_i16mf2(vint16m1_t op1) +{ + vint16mf2_t a = vlmul_trunc_v_i16m1_i16mf2(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m2_i16mf4(vint16m2_t op1) +{ + vint16mf4_t a = vlmul_trunc_v_i16m2_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m2_i16mf2(vint16m2_t op1) +{ + vint16mf2_t a = vlmul_trunc_v_i16m2_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_trunc_v_i16m2_i16m1(vint16m2_t op1) +{ + vint16m1_t a = vlmul_trunc_v_i16m2_i16m1(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m4_i16mf4(vint16m4_t op1) +{ + vint16mf4_t a = vlmul_trunc_v_i16m4_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m4_i16mf2(vint16m4_t op1) +{ + vint16mf2_t a = vlmul_trunc_v_i16m4_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_trunc_v_i16m4_i16m1(vint16m4_t op1) +{ + vint16m1_t a = vlmul_trunc_v_i16m4_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_trunc_v_i16m4_i16m2(vint16m4_t op1) +{ + vint16m2_t a = vlmul_trunc_v_i16m4_i16m2(op1); + return a; +} + +vint16mf4_t test_vlmul_trunc_v_i16m8_i16mf4(vint16m8_t op1) +{ + vint16mf4_t a = vlmul_trunc_v_i16m8_i16mf4(op1); + return a; +} + +vint16mf2_t test_vlmul_trunc_v_i16m8_i16mf2(vint16m8_t op1) +{ + vint16mf2_t a = vlmul_trunc_v_i16m8_i16mf2(op1); + return a; +} + +vint16m1_t test_vlmul_trunc_v_i16m8_i16m1(vint16m8_t op1) +{ + vint16m1_t a = vlmul_trunc_v_i16m8_i16m1(op1); + return a; +} + +vint16m2_t test_vlmul_trunc_v_i16m8_i16m2(vint16m8_t op1) +{ + vint16m2_t a = vlmul_trunc_v_i16m8_i16m2(op1); + return a; +} + +vint16m4_t test_vlmul_trunc_v_i16m8_i16m4(vint16m8_t op1) +{ + vint16m4_t a = vlmul_trunc_v_i16m8_i16m4(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m1_i32mf2(vint32m1_t op1) +{ + vint32mf2_t a = vlmul_trunc_v_i32m1_i32mf2(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m2_i32mf2(vint32m2_t op1) +{ + vint32mf2_t a = vlmul_trunc_v_i32m2_i32mf2(op1); + return a; +} + +vint32m1_t test_vlmul_trunc_v_i32m2_i32m1(vint32m2_t op1) +{ + vint32m1_t a = vlmul_trunc_v_i32m2_i32m1(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m4_i32mf2(vint32m4_t op1) +{ + vint32mf2_t a = vlmul_trunc_v_i32m4_i32mf2(op1); + return a; +} + +vint32m1_t 
test_vlmul_trunc_v_i32m4_i32m1(vint32m4_t op1) +{ + vint32m1_t a = vlmul_trunc_v_i32m4_i32m1(op1); + return a; +} + +vint32m2_t test_vlmul_trunc_v_i32m4_i32m2(vint32m4_t op1) +{ + vint32m2_t a = vlmul_trunc_v_i32m4_i32m2(op1); + return a; +} + +vint32mf2_t test_vlmul_trunc_v_i32m8_i32mf2(vint32m8_t op1) +{ + vint32mf2_t a = vlmul_trunc_v_i32m8_i32mf2(op1); + return a; +} + +vint32m1_t test_vlmul_trunc_v_i32m8_i32m1(vint32m8_t op1) +{ + vint32m1_t a = vlmul_trunc_v_i32m8_i32m1(op1); + return a; +} + +vint32m2_t test_vlmul_trunc_v_i32m8_i32m2(vint32m8_t op1) +{ + vint32m2_t a = vlmul_trunc_v_i32m8_i32m2(op1); + return a; +} + +vint32m4_t test_vlmul_trunc_v_i32m8_i32m4(vint32m8_t op1) +{ + vint32m4_t a = vlmul_trunc_v_i32m8_i32m4(op1); + return a; +} + +vint64m1_t test_vlmul_trunc_v_i64m2_i64m1(vint64m2_t op1) +{ + vint64m1_t a = vlmul_trunc_v_i64m2_i64m1(op1); + return a; +} + +vint64m1_t test_vlmul_trunc_v_i64m4_i64m1(vint64m4_t op1) +{ + vint64m1_t a = vlmul_trunc_v_i64m4_i64m1(op1); + return a; +} + +vint64m2_t test_vlmul_trunc_v_i64m4_i64m2(vint64m4_t op1) +{ + vint64m2_t a = vlmul_trunc_v_i64m4_i64m2(op1); + return a; +} + +vint64m1_t test_vlmul_trunc_v_i64m8_i64m1(vint64m8_t op1) +{ + vint64m1_t a = vlmul_trunc_v_i64m8_i64m1(op1); + return a; +} + +vint64m2_t test_vlmul_trunc_v_i64m8_i64m2(vint64m8_t op1) +{ + vint64m2_t a = vlmul_trunc_v_i64m8_i64m2(op1); + return a; +} + +vint64m4_t test_vlmul_trunc_v_i64m8_i64m4(vint64m8_t op1) +{ + vint64m4_t a = vlmul_trunc_v_i64m8_i64m4(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8mf4_u8mf8(vuint8mf4_t op1) +{ + vuint8mf8_t a = vlmul_trunc_v_u8mf4_u8mf8(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8mf2_u8mf8(vuint8mf2_t op1) +{ + vuint8mf8_t a = vlmul_trunc_v_u8mf2_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8mf2_u8mf4(vuint8mf2_t op1) +{ + vuint8mf4_t a = vlmul_trunc_v_u8mf2_u8mf4(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m1_u8mf8(vuint8m1_t op1) +{ + vuint8mf8_t a = vlmul_trunc_v_u8m1_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m1_u8mf4(vuint8m1_t op1) +{ + vuint8mf4_t a = vlmul_trunc_v_u8m1_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m1_u8mf2(vuint8m1_t op1) +{ + vuint8mf2_t a = vlmul_trunc_v_u8m1_u8mf2(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m2_u8mf8(vuint8m2_t op1) +{ + vuint8mf8_t a = vlmul_trunc_v_u8m2_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m2_u8mf4(vuint8m2_t op1) +{ + vuint8mf4_t a = vlmul_trunc_v_u8m2_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m2_u8mf2(vuint8m2_t op1) +{ + vuint8mf2_t a = vlmul_trunc_v_u8m2_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_trunc_v_u8m2_u8m1(vuint8m2_t op1) +{ + vuint8m1_t a = vlmul_trunc_v_u8m2_u8m1(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m4_u8mf8(vuint8m4_t op1) +{ + vuint8mf8_t a = vlmul_trunc_v_u8m4_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m4_u8mf4(vuint8m4_t op1) +{ + vuint8mf4_t a = vlmul_trunc_v_u8m4_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m4_u8mf2(vuint8m4_t op1) +{ + vuint8mf2_t a = vlmul_trunc_v_u8m4_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_trunc_v_u8m4_u8m1(vuint8m4_t op1) +{ + vuint8m1_t a = vlmul_trunc_v_u8m4_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_trunc_v_u8m4_u8m2(vuint8m4_t op1) +{ + vuint8m2_t a = vlmul_trunc_v_u8m4_u8m2(op1); + return a; +} + +vuint8mf8_t test_vlmul_trunc_v_u8m8_u8mf8(vuint8m8_t op1) +{ + vuint8mf8_t a = 
vlmul_trunc_v_u8m8_u8mf8(op1); + return a; +} + +vuint8mf4_t test_vlmul_trunc_v_u8m8_u8mf4(vuint8m8_t op1) +{ + vuint8mf4_t a = vlmul_trunc_v_u8m8_u8mf4(op1); + return a; +} + +vuint8mf2_t test_vlmul_trunc_v_u8m8_u8mf2(vuint8m8_t op1) +{ + vuint8mf2_t a = vlmul_trunc_v_u8m8_u8mf2(op1); + return a; +} + +vuint8m1_t test_vlmul_trunc_v_u8m8_u8m1(vuint8m8_t op1) +{ + vuint8m1_t a = vlmul_trunc_v_u8m8_u8m1(op1); + return a; +} + +vuint8m2_t test_vlmul_trunc_v_u8m8_u8m2(vuint8m8_t op1) +{ + vuint8m2_t a = vlmul_trunc_v_u8m8_u8m2(op1); + return a; +} + +vuint8m4_t test_vlmul_trunc_v_u8m8_u8m4(vuint8m8_t op1) +{ + vuint8m4_t a = vlmul_trunc_v_u8m8_u8m4(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16mf2_u16mf4(vuint16mf2_t op1) +{ + vuint16mf4_t a = vlmul_trunc_v_u16mf2_u16mf4(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m1_u16mf4(vuint16m1_t op1) +{ + vuint16mf4_t a = vlmul_trunc_v_u16m1_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m1_u16mf2(vuint16m1_t op1) +{ + vuint16mf2_t a = vlmul_trunc_v_u16m1_u16mf2(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m2_u16mf4(vuint16m2_t op1) +{ + vuint16mf4_t a = vlmul_trunc_v_u16m2_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m2_u16mf2(vuint16m2_t op1) +{ + vuint16mf2_t a = vlmul_trunc_v_u16m2_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_trunc_v_u16m2_u16m1(vuint16m2_t op1) +{ + vuint16m1_t a = vlmul_trunc_v_u16m2_u16m1(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m4_u16mf4(vuint16m4_t op1) +{ + vuint16mf4_t a = vlmul_trunc_v_u16m4_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m4_u16mf2(vuint16m4_t op1) +{ + vuint16mf2_t a = vlmul_trunc_v_u16m4_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_trunc_v_u16m4_u16m1(vuint16m4_t op1) +{ + vuint16m1_t a = vlmul_trunc_v_u16m4_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_trunc_v_u16m4_u16m2(vuint16m4_t op1) +{ + vuint16m2_t a = vlmul_trunc_v_u16m4_u16m2(op1); + return a; +} + +vuint16mf4_t test_vlmul_trunc_v_u16m8_u16mf4(vuint16m8_t op1) +{ + vuint16mf4_t a = vlmul_trunc_v_u16m8_u16mf4(op1); + return a; +} + +vuint16mf2_t test_vlmul_trunc_v_u16m8_u16mf2(vuint16m8_t op1) +{ + vuint16mf2_t a = vlmul_trunc_v_u16m8_u16mf2(op1); + return a; +} + +vuint16m1_t test_vlmul_trunc_v_u16m8_u16m1(vuint16m8_t op1) +{ + vuint16m1_t a = vlmul_trunc_v_u16m8_u16m1(op1); + return a; +} + +vuint16m2_t test_vlmul_trunc_v_u16m8_u16m2(vuint16m8_t op1) +{ + vuint16m2_t a = vlmul_trunc_v_u16m8_u16m2(op1); + return a; +} + +vuint16m4_t test_vlmul_trunc_v_u16m8_u16m4(vuint16m8_t op1) +{ + vuint16m4_t a = vlmul_trunc_v_u16m8_u16m4(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m1_u32mf2(vuint32m1_t op1) +{ + vuint32mf2_t a = vlmul_trunc_v_u32m1_u32mf2(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m2_u32mf2(vuint32m2_t op1) +{ + vuint32mf2_t a = vlmul_trunc_v_u32m2_u32mf2(op1); + return a; +} + +vuint32m1_t test_vlmul_trunc_v_u32m2_u32m1(vuint32m2_t op1) +{ + vuint32m1_t a = vlmul_trunc_v_u32m2_u32m1(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m4_u32mf2(vuint32m4_t op1) +{ + vuint32mf2_t a = vlmul_trunc_v_u32m4_u32mf2(op1); + return a; +} + +vuint32m1_t test_vlmul_trunc_v_u32m4_u32m1(vuint32m4_t op1) +{ + vuint32m1_t a = vlmul_trunc_v_u32m4_u32m1(op1); + return a; +} + +vuint32m2_t test_vlmul_trunc_v_u32m4_u32m2(vuint32m4_t op1) +{ + vuint32m2_t a = vlmul_trunc_v_u32m4_u32m2(op1); + return a; +} + +vuint32mf2_t test_vlmul_trunc_v_u32m8_u32mf2(vuint32m8_t op1) +{ + 
vuint32mf2_t a = vlmul_trunc_v_u32m8_u32mf2(op1); + return a; +} + +vuint32m1_t test_vlmul_trunc_v_u32m8_u32m1(vuint32m8_t op1) +{ + vuint32m1_t a = vlmul_trunc_v_u32m8_u32m1(op1); + return a; +} + +vuint32m2_t test_vlmul_trunc_v_u32m8_u32m2(vuint32m8_t op1) +{ + vuint32m2_t a = vlmul_trunc_v_u32m8_u32m2(op1); + return a; +} + +vuint32m4_t test_vlmul_trunc_v_u32m8_u32m4(vuint32m8_t op1) +{ + vuint32m4_t a = vlmul_trunc_v_u32m8_u32m4(op1); + return a; +} + +vuint64m1_t test_vlmul_trunc_v_u64m2_u64m1(vuint64m2_t op1) +{ + vuint64m1_t a = vlmul_trunc_v_u64m2_u64m1(op1); + return a; +} + +vuint64m1_t test_vlmul_trunc_v_u64m4_u64m1(vuint64m4_t op1) +{ + vuint64m1_t a = vlmul_trunc_v_u64m4_u64m1(op1); + return a; +} + +vuint64m2_t test_vlmul_trunc_v_u64m4_u64m2(vuint64m4_t op1) +{ + vuint64m2_t a = vlmul_trunc_v_u64m4_u64m2(op1); + return a; +} + +vuint64m1_t test_vlmul_trunc_v_u64m8_u64m1(vuint64m8_t op1) +{ + vuint64m1_t a = vlmul_trunc_v_u64m8_u64m1(op1); + return a; +} + +vuint64m2_t test_vlmul_trunc_v_u64m8_u64m2(vuint64m8_t op1) +{ + vuint64m2_t a = vlmul_trunc_v_u64m8_u64m2(op1); + return a; +} + +vuint64m4_t test_vlmul_trunc_v_u64m8_u64m4(vuint64m8_t op1) +{ + vuint64m4_t a = vlmul_trunc_v_u64m8_u64m4(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m1_f32mf2(vfloat32m1_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_v_f32m1_f32mf2(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m2_f32mf2(vfloat32m2_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_v_f32m2_f32mf2(op1); + return a; +} + +vfloat32m1_t test_vlmul_trunc_v_f32m2_f32m1(vfloat32m2_t op1) +{ + vfloat32m1_t a = vlmul_trunc_v_f32m2_f32m1(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m4_f32mf2(vfloat32m4_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_v_f32m4_f32mf2(op1); + return a; +} + +vfloat32m1_t test_vlmul_trunc_v_f32m4_f32m1(vfloat32m4_t op1) +{ + vfloat32m1_t a = vlmul_trunc_v_f32m4_f32m1(op1); + return a; +} + +vfloat32m2_t test_vlmul_trunc_v_f32m4_f32m2(vfloat32m4_t op1) +{ + vfloat32m2_t a = vlmul_trunc_v_f32m4_f32m2(op1); + return a; +} + +vfloat32mf2_t test_vlmul_trunc_v_f32m8_f32mf2(vfloat32m8_t op1) +{ + vfloat32mf2_t a = vlmul_trunc_v_f32m8_f32mf2(op1); + return a; +} + +vfloat32m1_t test_vlmul_trunc_v_f32m8_f32m1(vfloat32m8_t op1) +{ + vfloat32m1_t a = vlmul_trunc_v_f32m8_f32m1(op1); + return a; +} + +vfloat32m2_t test_vlmul_trunc_v_f32m8_f32m2(vfloat32m8_t op1) +{ + vfloat32m2_t a = vlmul_trunc_v_f32m8_f32m2(op1); + return a; +} + +vfloat32m4_t test_vlmul_trunc_v_f32m8_f32m4(vfloat32m8_t op1) +{ + vfloat32m4_t a = vlmul_trunc_v_f32m8_f32m4(op1); + return a; +} + +vfloat64m1_t test_vlmul_trunc_v_f64m2_f64m1(vfloat64m2_t op1) +{ + vfloat64m1_t a = vlmul_trunc_v_f64m2_f64m1(op1); + return a; +} + +vfloat64m1_t test_vlmul_trunc_v_f64m4_f64m1(vfloat64m4_t op1) +{ + vfloat64m1_t a = vlmul_trunc_v_f64m4_f64m1(op1); + return a; +} + +vfloat64m2_t test_vlmul_trunc_v_f64m4_f64m2(vfloat64m4_t op1) +{ + vfloat64m2_t a = vlmul_trunc_v_f64m4_f64m2(op1); + return a; +} + +vfloat64m1_t test_vlmul_trunc_v_f64m8_f64m1(vfloat64m8_t op1) +{ + vfloat64m1_t a = vlmul_trunc_v_f64m8_f64m1(op1); + return a; +} + +vfloat64m2_t test_vlmul_trunc_v_f64m8_f64m2(vfloat64m8_t op1) +{ + vfloat64m2_t a = vlmul_trunc_v_f64m8_f64m2(op1); + return a; +} + +vfloat64m4_t test_vlmul_trunc_v_f64m8_f64m4(vfloat64m8_t op1) +{ + vfloat64m4_t a = vlmul_trunc_v_f64m8_f64m4(op1); + return a; +} + +vint8mf8_t test_vundefined_i8mf8() +{ + vint8mf8_t a = vundefined_i8mf8(); + return a; +} + +vint8mf4_t test_vundefined_i8mf4() +{ + vint8mf4_t a = 
vundefined_i8mf4(); + return a; +} + +vint8mf2_t test_vundefined_i8mf2() +{ + vint8mf2_t a = vundefined_i8mf2(); + return a; +} + +vint8m1_t test_vundefined_i8m1() +{ + vint8m1_t a = vundefined_i8m1(); + return a; +} + +vint8m2_t test_vundefined_i8m2() +{ + vint8m2_t a = vundefined_i8m2(); + return a; +} + +vint8m4_t test_vundefined_i8m4() +{ + vint8m4_t a = vundefined_i8m4(); + return a; +} + +vint8m8_t test_vundefined_i8m8() +{ + vint8m8_t a = vundefined_i8m8(); + return a; +} + +vint16mf4_t test_vundefined_i16mf4() +{ + vint16mf4_t a = vundefined_i16mf4(); + return a; +} + +vint16mf2_t test_vundefined_i16mf2() +{ + vint16mf2_t a = vundefined_i16mf2(); + return a; +} + +vint16m1_t test_vundefined_i16m1() +{ + vint16m1_t a = vundefined_i16m1(); + return a; +} + +vint16m2_t test_vundefined_i16m2() +{ + vint16m2_t a = vundefined_i16m2(); + return a; +} + +vint16m4_t test_vundefined_i16m4() +{ + vint16m4_t a = vundefined_i16m4(); + return a; +} + +vint16m8_t test_vundefined_i16m8() +{ + vint16m8_t a = vundefined_i16m8(); + return a; +} + +vint32mf2_t test_vundefined_i32mf2() +{ + vint32mf2_t a = vundefined_i32mf2(); + return a; +} + +vint32m1_t test_vundefined_i32m1() +{ + vint32m1_t a = vundefined_i32m1(); + return a; +} + +vint32m2_t test_vundefined_i32m2() +{ + vint32m2_t a = vundefined_i32m2(); + return a; +} + +vint32m4_t test_vundefined_i32m4() +{ + vint32m4_t a = vundefined_i32m4(); + return a; +} + +vint32m8_t test_vundefined_i32m8() +{ + vint32m8_t a = vundefined_i32m8(); + return a; +} + +vint64m1_t test_vundefined_i64m1() +{ + vint64m1_t a = vundefined_i64m1(); + return a; +} + +vint64m2_t test_vundefined_i64m2() +{ + vint64m2_t a = vundefined_i64m2(); + return a; +} + +vint64m4_t test_vundefined_i64m4() +{ + vint64m4_t a = vundefined_i64m4(); + return a; +} + +vint64m8_t test_vundefined_i64m8() +{ + vint64m8_t a = vundefined_i64m8(); + return a; +} + +vuint8mf8_t test_vundefined_u8mf8() +{ + vuint8mf8_t a = vundefined_u8mf8(); + return a; +} + +vuint8mf4_t test_vundefined_u8mf4() +{ + vuint8mf4_t a = vundefined_u8mf4(); + return a; +} + +vuint8mf2_t test_vundefined_u8mf2() +{ + vuint8mf2_t a = vundefined_u8mf2(); + return a; +} + +vuint8m1_t test_vundefined_u8m1() +{ + vuint8m1_t a = vundefined_u8m1(); + return a; +} + +vuint8m2_t test_vundefined_u8m2() +{ + vuint8m2_t a = vundefined_u8m2(); + return a; +} + +vuint8m4_t test_vundefined_u8m4() +{ + vuint8m4_t a = vundefined_u8m4(); + return a; +} + +vuint8m8_t test_vundefined_u8m8() +{ + vuint8m8_t a = vundefined_u8m8(); + return a; +} + +vuint16mf4_t test_vundefined_u16mf4() +{ + vuint16mf4_t a = vundefined_u16mf4(); + return a; +} + +vuint16mf2_t test_vundefined_u16mf2() +{ + vuint16mf2_t a = vundefined_u16mf2(); + return a; +} + +vuint16m1_t test_vundefined_u16m1() +{ + vuint16m1_t a = vundefined_u16m1(); + return a; +} + +vuint16m2_t test_vundefined_u16m2() +{ + vuint16m2_t a = vundefined_u16m2(); + return a; +} + +vuint16m4_t test_vundefined_u16m4() +{ + vuint16m4_t a = vundefined_u16m4(); + return a; +} + +vuint16m8_t test_vundefined_u16m8() +{ + vuint16m8_t a = vundefined_u16m8(); + return a; +} + +vuint32mf2_t test_vundefined_u32mf2() +{ + vuint32mf2_t a = vundefined_u32mf2(); + return a; +} + +vuint32m1_t test_vundefined_u32m1() +{ + vuint32m1_t a = vundefined_u32m1(); + return a; +} + +vuint32m2_t test_vundefined_u32m2() +{ + vuint32m2_t a = vundefined_u32m2(); + return a; +} + +vuint32m4_t test_vundefined_u32m4() +{ + vuint32m4_t a = vundefined_u32m4(); + return a; +} + +vuint32m8_t test_vundefined_u32m8() +{ + 
vuint32m8_t a = vundefined_u32m8(); + return a; +} + +vuint64m1_t test_vundefined_u64m1() +{ + vuint64m1_t a = vundefined_u64m1(); + return a; +} + +vuint64m2_t test_vundefined_u64m2() +{ + vuint64m2_t a = vundefined_u64m2(); + return a; +} + +vuint64m4_t test_vundefined_u64m4() +{ + vuint64m4_t a = vundefined_u64m4(); + return a; +} + +vuint64m8_t test_vundefined_u64m8() +{ + vuint64m8_t a = vundefined_u64m8(); + return a; +} + +vfloat32mf2_t test_vundefined_f32mf2() +{ + vfloat32mf2_t a = vundefined_f32mf2(); + return a; +} + +vfloat32m1_t test_vundefined_f32m1() +{ + vfloat32m1_t a = vundefined_f32m1(); + return a; +} + +vfloat32m2_t test_vundefined_f32m2() +{ + vfloat32m2_t a = vundefined_f32m2(); + return a; +} + +vfloat32m4_t test_vundefined_f32m4() +{ + vfloat32m4_t a = vundefined_f32m4(); + return a; +} + +vfloat32m8_t test_vundefined_f32m8() +{ + vfloat32m8_t a = vundefined_f32m8(); + return a; +} + +vfloat64m1_t test_vundefined_f64m1() +{ + vfloat64m1_t a = vundefined_f64m1(); + return a; +} + +vfloat64m2_t test_vundefined_f64m2() +{ + vfloat64m2_t a = vundefined_f64m2(); + return a; +} + +vfloat64m4_t test_vundefined_f64m4() +{ + vfloat64m4_t a = vundefined_f64m4(); + return a; +} + +vfloat64m8_t test_vundefined_f64m8() +{ + vfloat64m8_t a = vundefined_f64m8(); + return a; +} From patchwork Tue May 31 08:50:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54552 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E4314383665B for ; Tue, 31 May 2022 08:57:28 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpproxy21.qq.com (smtpbg704.qq.com [203.205.195.105]) by sourceware.org (Postfix) with ESMTPS id DFC9E3834E42 for ; Tue, 31 May 2022 08:50:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org DFC9E3834E42 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987049tpqjlloa Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:48 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: 6NsApsECK7dIatawgx6zpbI/wg1LOv9YzPFZ/a3SgirnFCgf06ifvKN3RXUqU pxEV+lTBq5zXpEcANtllYnCZzUomJnYgy94Vj0aIcJPIcslxpJg3yhA+SBaLxndnY42uDqd avOulvwHlytERxaaZH5PBySeqZzkrcEDOmhT4NaYcNo/96AfBIdDHSp82Av/IJ/KjckCHHu j4orpd+jMx9ZXIwcxcwCjCi0DCyM8QaUJGInnHaW/5zkluHWPDEft7DazBRgGPBmLw0z+mK OutSZlkuJGtxnk5CHH/kOahrpyVEd8RUmWQhuqXx9Pc3WPp+6AhYLrOWREQSSOtnsG1KKLg D8g2F/KpPMd33a1wQBeXSbpmv8nsVIAzvkUU5tjsTlSzA+OwGk= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 11/21] Add calling function support Date: Tue, 31 May 2022 16:50:02 +0800 Message-Id: <20220531085012.269719-12-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign8 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham 
autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv.cc (struct riscv_arg_info): Add calling convention support. (riscv_get_arg_info): Add calling convention support. (riscv_function_arg_advance): Add calling convention support. (riscv_pass_by_reference): Add calling convention support. * config/riscv/riscv.h (GCC_RISCV_H): include . (V_RETURN): New macro define. (MAX_ARGS_IN_VECTOR_REGISTERS): New macro define. (MAX_ARGS_IN_MASK_REGISTERS): New macro define. (V_ARG_FIRST): New macro define. (V_ARG_LAST): New macro define. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/custom/calling_convention_1.c: New test. * gcc.target/riscv/rvv/custom/rvv-custom.exp: New test. --- gcc/config/riscv/riscv.cc | 90 +++++++++++++++++++ gcc/config/riscv/riscv.h | 14 +++ .../riscv/rvv/custom/calling_convention_1.c | 46 ++++++++++ .../riscv/rvv/custom/rvv-custom.exp | 47 ++++++++++ 4 files changed, 197 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/custom/calling_convention_1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/custom/rvv-custom.exp diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index e88057e992a..832c1754002 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -181,6 +181,18 @@ struct riscv_arg_info { /* The offset of the first register used, provided num_fprs is nonzero. */ unsigned int fpr_offset; + + /* The number of vector registers allocated to this argument. */ + unsigned int num_vrs; + + /* The offset of the first register used, provided num_vrs is nonzero. */ + unsigned int vr_offset; + + /* The number of mask registers allocated to this argument. */ + unsigned int num_mrs; + + /* The offset of the first register used, provided num_mrs is nonzero. */ + unsigned int mr_offset; }; /* Information about an address described by riscv_address_type. @@ -3225,11 +3237,13 @@ riscv_get_arg_info (struct riscv_arg_info *info, const CUMULATIVE_ARGS *cum, unsigned num_bytes, num_words; unsigned fpr_base = return_p ? FP_RETURN : FP_ARG_FIRST; unsigned gpr_base = return_p ? GP_RETURN : GP_ARG_FIRST; + unsigned vr_base = return_p ? V_RETURN : V_ARG_FIRST; unsigned alignment = riscv_function_arg_boundary (mode, type); memset (info, 0, sizeof (*info)); info->gpr_offset = cum->num_gprs; info->fpr_offset = cum->num_fprs; + info->mr_offset = cum->num_mrs; if (named) { @@ -3292,6 +3306,67 @@ riscv_get_arg_info (struct riscv_arg_info *info, const CUMULATIVE_ARGS *cum, gregno, TYPE_MODE (fields[1].type), fields[1].offset); } + /* Pass vectors in VRs. For an argument that contains scalable vectors, + for example foo (vint8m1_t a), we pass it in VRs to reduce + redundant register spills. The maximum number of vector argument + registers is MAX_ARGS_IN_VECTOR_REGISTERS. */ + if (rvv_mode_p (mode)) + { + /* For a vector return value, we use V_RETURN by default.
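+ (For illustration, using the macros this patch adds to riscv.h: V_RETURN is V_ARG_FIRST, i.e. v8, so for example a vint8m1_t return value comes back in v8, while a mask (vbool*_t) return value comes back in v0, V_REG_FIRST, matching the two cases handled just below.)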
*/ + if (return_p) + { + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) + return gen_rtx_REG (mode, V_REG_FIRST); + else + return gen_rtx_REG (mode, vr_base); + } + /* The first mask argument is passed in v0; any further mask arguments + are allocated like vector arguments, in v8, v9, etc. */ + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) + { + info->num_mrs = 1; + + if (info->mr_offset + info->num_mrs <= MAX_ARGS_IN_MASK_REGISTERS) + return gen_rtx_REG (mode, V_REG_FIRST); + } + /* The number of vector registers needed to pass this argument. + When the mode size is less than a full vector, a single + vector register is used. */ + int nvecs; + nvecs = known_le (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR) ? 1 : + exact_div (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR).to_constant (); + int align = rvv_regsize (mode); + for (int i = 0; i + nvecs <= MAX_ARGS_IN_VECTOR_REGISTERS; i += 1) + { + if (!cum->used_vrs[i] && (i + 8) % align == 0) + { + bool find_space = true; + int j = 1; + for (; j < nvecs; j += 1) + { + if (cum->used_vrs[i + j]) + { + find_space = false; + break; + } + } + if (find_space) + { + info->num_vrs = nvecs; + info->vr_offset = i; + return gen_rtx_REG (mode, vr_base + i); + } + else + { + /* Skip the j registers that cannot be used. */ + i += j; + } + } + } + info->num_vrs = 0; + info->num_mrs = 0; + return NULL_RTX; + } } /* Work out the size of the argument. */ @@ -3344,6 +3419,15 @@ riscv_function_arg_advance (cumulative_args_t cum_v, argument on the stack. */ cum->num_fprs = info.fpr_offset + info.num_fprs; cum->num_gprs = info.gpr_offset + info.num_gprs; + if (info.num_vrs > 0) + { + for (unsigned int i = 0; i < info.num_vrs; i += 1) + { + /* Mark the vector registers used by this argument. */ + cum->used_vrs[info.vr_offset + i] = true; + } + } + cum->num_mrs = info.mr_offset + info.num_mrs; } /* Implement TARGET_ARG_PARTIAL_BYTES. */ @@ -3401,6 +3485,12 @@ riscv_pass_by_reference (cumulative_args_t cum_v, const function_arg_info &arg) /* Don't pass by reference if we can use a floating-point register. */ riscv_get_arg_info (&info, cum, arg.mode, arg.type, arg.named, false); if (info.num_fprs) + return false; + /* Don't pass by reference if we can use an RVV vector register. */ + if (info.num_vrs) + return false; + /* Don't pass by reference if we can use an RVV mask register. */ + if (info.num_mrs) return false; } diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index c54c984e70b..5de745bc824 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_RISCV_H #define GCC_RISCV_H +#include +#include "config/riscv/riscv-opts.h" /* Target CPU builtins. */ @@ -620,8 +621,13 @@ enum reg_class #define GP_RETURN GP_ARG_FIRST #define FP_RETURN (UNITS_PER_FP_ARG == 0 ? GP_RETURN : FP_ARG_FIRST) +#define V_RETURN V_ARG_FIRST #define MAX_ARGS_IN_REGISTERS (riscv_abi == ABI_ILP32E ? 6 : 8) +/* Follow the LLVM calling convention: at most 16 vector registers + and 1 mask register are used to pass function arguments. */ +#define MAX_ARGS_IN_VECTOR_REGISTERS (16) +#define MAX_ARGS_IN_MASK_REGISTERS (1) /* Symbolic macros for the first/last argument registers. */ @@ -630,6 +636,8 @@ enum reg_class #define GP_TEMP_FIRST (GP_REG_FIRST + 5) #define FP_ARG_FIRST (FP_REG_FIRST + 10) #define FP_ARG_LAST (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1) +#define V_ARG_FIRST (V_REG_FIRST + 8) +#define V_ARG_LAST (V_ARG_FIRST + MAX_ARGS_IN_VECTOR_REGISTERS - 1) #define CALLEE_SAVED_REG_NUMBER(REGNO) \ ((REGNO) >= 8 && (REGNO) <= 9 ?
(REGNO) - 8 : \ @@ -657,6 +665,12 @@ typedef struct { /* Number of floating-point registers used so far, likewise. */ unsigned int num_fprs; + + /* The used state of args in vectors, 1 for used by prev arg, initial to 0 */ + bool used_vrs[MAX_ARGS_IN_VECTOR_REGISTERS]; + + /* Number of mask registers used so far, up to MAX_ARGS_IN_MASK_REGISTERS. */ + unsigned int num_mrs; } CUMULATIVE_ARGS; /* Initialize a variable CUM of type CUMULATIVE_ARGS diff --git a/gcc/testsuite/gcc.target/riscv/rvv/custom/calling_convention_1.c b/gcc/testsuite/gcc.target/riscv/rvv/custom/calling_convention_1.c new file mode 100644 index 00000000000..5f79da2a9d6 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/custom/calling_convention_1.c @@ -0,0 +1,46 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3" } */ + +#include +#include + +void +test ( + vint8mf2_t a, vint8m4_t c, vint8m4_t d,vint8m2_t e, vint8m2_t b, + vint8m1_t f,vint8m1_t f2, vint8m1_t f3, vint8m1_t f4, vint8m2_t g, vint8m2_t h, + vbool16_t m1, vbool4_t m2, + int8_t *a1, int8_t *b1, int8_t *c1, + size_t vl) +{ + /* f4 => a0, g => a1, h => a2, m2 => a3, + a1 => a4, b1 => a5, c1 => a6, vl => a7 + + m1 => v0 + a => v8, c => v12, d => v16, e => v10, + b => v20, f => v9, f2 => v22, f3 => v23 */ + + vse8_v_i8mf2_m(m1, a1, a, vl); + + vse8_v_i8m2_m(m2, b1, b, vl); + + vse8_v_i8m4(c1, c, vl); + vse8_v_i8m4(c1, d, vl); + + vse8_v_i8m2(c1, e, vl); + + vse8_v_i8m1(c1, f, vl); + vse8_v_i8m1(c1, f2, vl); + vse8_v_i8m1(c1, f3, vl); + vse8_v_i8m1(c1, f4, vl); + + vse8_v_i8m2(c1, g, vl); + vse8_v_i8m2(c1, h, vl); +} +/* { dg-final { scan-assembler-times {vse8.v\s+v8,\s*\(a4\),\s*v0.t} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v20,\s*\(a5\),\s*v0.t} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v12,\s*\(a6\)} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v16,\s*\(a6\)} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v10,\s*\(a6\)} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v9,\s*\(a6\)} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v22,\s*\(a6\)} 1 } } */ +/* { dg-final { scan-assembler-times {vse8.v\s+v23,\s*\(a6\)} 1 } } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/rvv/custom/rvv-custom.exp b/gcc/testsuite/gcc.target/riscv/rvv/custom/rvv-custom.exp new file mode 100644 index 00000000000..4956ac0b184 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/custom/rvv-custom.exp @@ -0,0 +1,47 @@ +# Copyright (C) 2022-2022 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't a RISC-V target. +if ![istarget riscv*-*-*] then { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +# If a testcase doesn't have special options, use these. 
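+# (For example, assuming the defaults below are not overridden and the target is riscv64, the flags handed to dg-runtest work out to "-pedantic-errors -ansi -march=rv64gcv_zfh -std=gnu11".)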
+global DEFAULT_CFLAGS +if ![info exists DEFAULT_CFLAGS] then { + set DEFAULT_CFLAGS "-pedantic-errors -ansi" +} + +set gcc_march "rv64gcv_zfh" +if [istarget riscv32-*-*] then { + set gcc_march "rv32gcv_zfh" +} + +# Initialize `dg'. +dg-init + +# Main loop. +set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -std=gnu11" +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.c]] \ + "" $CFLAGS + +# All done. +dg-finish \ No newline at end of file From patchwork Tue May 31 08:50:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54554 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 50A193955CA1 for ; Tue, 31 May 2022 08:58:58 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg150.qq.com (smtpbg150.qq.com [18.132.163.193]) by sourceware.org (Postfix) with ESMTPS id ED2623836654 for ; Tue, 31 May 2022 08:50:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org ED2623836654 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987052tsa9oceb Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:51 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: 4LFlwc+MlXnEbUX9d0YsY5r9y8I1EdUsF4MQ2xZ37/Ke1lmgeVPazM/begawA JG68a9IPM78+os8v4tRg6ry8RvWveCCZ9bsvv4g6zcWrVgquokbNwfnpRQYHNSezwkCFjPh Rwj6+zlad8qaNg+tWCRpdciA9f8ZfkJ03wRuVKKYwtuMUED+rBiggp9vjoV9JzpXQ7rs4e0 B8q4IR52h2PKxbPcODkHBC1QfJmGt7CEbLGgiqwDRfpB4DM3lA9AyIl9DKD9z3184PQLVQi IC8+botOO61NxvzMiZY1yK9+LmWMjNFNpY6WXWK8DyRwM1qumDI5COCerHRKh8cxTsd93kD FiRYi3rCd6AfaJ66Z/UPmxiaJHsRg7QUur6DMf/ X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 12/21] Add set get intrinsic support Date: Tue, 31 May 2022 16:50:03 +0800 Message-Id: <20220531085012.269719-13-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign3 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-vector-builtins-functions.cc (vset::assemble_name): New function. (vset::get_argument_types): New function. (vset::expand): New function. (vget::assemble_name): New function. (vget::get_argument_types): New function. (vget::expand): New function. * config/riscv/riscv-vector-builtins-functions.def (vset): New macro define. (vget): New macro define. 
* config/riscv/riscv-vector-builtins-functions.h (class vset): New class. (class vget): New class. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/set-get.C: New test. * gcc.target/riscv/rvv/intrinsic/set-get.c: New test. --- .../riscv/riscv-vector-builtins-functions.cc | 73 ++ .../riscv/riscv-vector-builtins-functions.def | 6 + .../riscv/riscv-vector-builtins-functions.h | 28 + gcc/testsuite/g++.target/riscv/rvv/set-get.C | 730 ++++++++++++++++++ .../gcc.target/riscv/rvv/intrinsic/set-get.c | 730 ++++++++++++++++++ 5 files changed, 1567 insertions(+) create mode 100644 gcc/testsuite/g++.target/riscv/rvv/set-get.C create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/intrinsic/set-get.c diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.cc b/gcc/config/riscv/riscv-vector-builtins-functions.cc index fa39eedcd86..9d2895c3d3e 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.cc +++ b/gcc/config/riscv/riscv-vector-builtins-functions.cc @@ -1510,6 +1510,79 @@ vundefined::expand (const function_instance &, tree, rtx target) const return target; } +/* A function implementation for vset functions. */ +char * +vset::assemble_name (function_instance &instance) +{ + machine_mode tmode = instance.get_arg_pattern ().arg_list[0]; + machine_mode smode = instance.get_arg_pattern ().arg_list[2]; + if (GET_MODE_INNER (tmode) != GET_MODE_INNER (smode)) + return nullptr; + + if (tmode == smode) + return nullptr; + + if (known_lt (GET_MODE_SIZE (tmode), GET_MODE_SIZE (smode))) + return nullptr; + + intrinsic_rename (instance, 0, 2); + append_name (instance.get_base_name ()); + return finish_name (); +} + +void +vset::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + misc::get_argument_types (instance, argument_types); + argument_types.quick_push (size_type_node); + argument_types.quick_push (get_dt_t_with_index (instance, 2)); +} + +rtx +vset::expand (const function_instance &instance, tree exp, rtx target) const +{ + enum insn_code icode = code_for_vset (instance.get_arg_pattern ().arg_list[0]); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vget functions. */ +char * +vget::assemble_name (function_instance &instance) +{ + machine_mode tmode = instance.get_arg_pattern ().arg_list[0]; + machine_mode smode = instance.get_arg_pattern ().arg_list[1]; + if (GET_MODE_INNER (tmode) != GET_MODE_INNER (smode)) + return nullptr; + + if (tmode == smode) + return nullptr; + + if (known_gt (GET_MODE_SIZE (tmode), GET_MODE_SIZE (smode))) + return nullptr; + + bool unsigned_p = instance.get_data_type_list ()[0] == DT_unsigned; + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (mode2data_type_str (tmode, unsigned_p, false)); + return finish_name (); +} + +void +vget::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + misc::get_argument_types (instance, argument_types); + argument_types.quick_push (size_type_node); +} + +rtx +vget::expand (const function_instance &instance, tree exp, rtx target) const +{ + enum insn_code icode = code_for_vget (instance.get_arg_pattern ().arg_list[0]); + return expand_builtin_insn (icode, exp, target, instance); +} + /* A function implementation for loadstore functions. 
*/ char * loadstore::assemble_name (function_instance &instance) diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def index deb32ccd031..739ae60fff5 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -56,6 +56,12 @@ DEF_RVV_FUNCTION(vlmul_trunc, vlmul_trunc, (2, VITER(VLMULTRUNC, signed), VITER( DEF_RVV_FUNCTION(vundefined, vundefined, (1, VITER(VI, signed)), PAT_none, PRED_none, OP_none) DEF_RVV_FUNCTION(vundefined, vundefined, (1, VITER(VI, unsigned)), PAT_none, PRED_none, OP_none) DEF_RVV_FUNCTION(vundefined, vundefined, (1, VITER(VF, signed)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vset, vset, (3, VITER(VSETI, signed), VATTR(0, VSETI, signed), VITER(VFULL, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vset, vset, (3, VITER(VSETI, unsigned), VATTR(0, VSETI, unsigned), VITER(VFULL, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vset, vset, (3, VITER(VSETF, signed), VATTR(0, VSETF, signed), VITER(VFULL, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vget, vget, (2, VITER(VGETI, signed), VITER(VFULL, signed)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vget, vget, (2, VITER(VGETI, unsigned), VITER(VFULL, unsigned)), PAT_none, PRED_none, OP_v) +DEF_RVV_FUNCTION(vget, vget, (2, VITER(VGETF, signed), VITER(VFULL, signed)), PAT_none, PRED_none, OP_v) /* 7. Vector Loads and Stores. */ DEF_RVV_FUNCTION(vle, vle, (2, VITER(VI, signed), VATTR(0, VSUB, c_ptr)), pat_mask_tail, pred_all, OP_v) DEF_RVV_FUNCTION(vle, vle, (2, VITER(VI, unsigned), VATTR(0, VSUB, c_uptr)), pat_mask_tail, pred_all, OP_v) diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.h b/gcc/config/riscv/riscv-vector-builtins-functions.h index c9e1b2a34ca..90063005024 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.h +++ b/gcc/config/riscv/riscv-vector-builtins-functions.h @@ -584,6 +584,34 @@ public: virtual rtx expand (const function_instance &, tree, rtx) const override; }; +/* A function_base for vset functions. */ +class vset : public misc +{ +public: + // use the same construction function as the misc + using misc::misc; + + virtual char * assemble_name (function_instance &) override; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vget functions. */ +class vget : public misc +{ +public: + // use the same construction function as the misc + using misc::misc; + + virtual char * assemble_name (function_instance &) override; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + /* A function_base for loadstore functions. 
*/ class loadstore : public function_builder { diff --git a/gcc/testsuite/g++.target/riscv/rvv/set-get.C b/gcc/testsuite/g++.target/riscv/rvv/set-get.C new file mode 100644 index 00000000000..7c8deb96a39 --- /dev/null +++ b/gcc/testsuite/g++.target/riscv/rvv/set-get.C @@ -0,0 +1,730 @@ +/* { dg-do compile } */ +/* { dg-skip-if "test vector intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */ + +#include +#include + + +vint8m2_t +test_vset_v_i8m1_i8m2 (vint8m2_t dest, vint8m1_t val) +{ + return vset(dest, 1, val); +} + +vint8m4_t +test_vset_v_i8m1_i8m4 (vint8m4_t dest, vint8m1_t val) +{ + return vset(dest, 1, val); +} + +vint8m4_t +test_vset_v_i8m2_i8m4 (vint8m4_t dest, vint8m2_t val) +{ + return vset(dest, 1, val); +} + +vint8m8_t +test_vset_v_i8m1_i8m8 (vint8m8_t dest, vint8m1_t val) +{ + return vset(dest, 1, val); +} + +vint8m8_t +test_vset_v_i8m2_i8m8 (vint8m8_t dest, vint8m2_t val) +{ + return vset(dest, 1, val); +} + +vint8m8_t +test_vset_v_i8m4_i8m8 (vint8m8_t dest, vint8m4_t val) +{ + return vset(dest, 1, val); +} + +vint8m1_t +test_vget_v_i8m2_i8m1 (vint8m2_t src) +{ + return vget_i8m1(src, 1); +} + +vint8m1_t +test_vget_v_i8m4_i8m1 (vint8m4_t src) +{ + return vget_i8m1(src, 1); +} + +vint8m1_t +test_vget_v_i8m8_i8m1 (vint8m8_t src) +{ + return vget_i8m1(src, 1); +} + +vint8m2_t +test_vget_v_i8m4_i8m2 (vint8m4_t src) +{ + return vget_i8m2(src, 1); +} + +vint8m2_t +test_vget_v_i8m8_i8m2 (vint8m8_t src) +{ + return vget_i8m2(src, 1); +} + +vint8m4_t +test_vget_v_i8m8_i8m4 (vint8m8_t src) +{ + return vget_i8m4(src, 1); +} + +vint16m2_t +test_vset_v_i16m1_i16m2 (vint16m2_t dest, vint16m1_t val) +{ + return vset(dest, 1, val); +} + +vint16m4_t +test_vset_v_i16m1_i16m4 (vint16m4_t dest, vint16m1_t val) +{ + return vset(dest, 1, val); +} + +vint16m4_t +test_vset_v_i16m2_i16m4 (vint16m4_t dest, vint16m2_t val) +{ + return vset(dest, 1, val); +} + +vint16m8_t +test_vset_v_i16m1_i16m8 (vint16m8_t dest, vint16m1_t val) +{ + return vset(dest, 1, val); +} + +vint16m8_t +test_vset_v_i16m2_i16m8 (vint16m8_t dest, vint16m2_t val) +{ + return vset(dest, 1, val); +} + +vint16m8_t +test_vset_v_i16m4_i16m8 (vint16m8_t dest, vint16m4_t val) +{ + return vset(dest, 1, val); +} + +vint16m1_t +test_vget_v_i16m2_i16m1 (vint16m2_t src) +{ + return vget_i16m1(src, 1); +} + +vint16m1_t +test_vget_v_i16m4_i16m1 (vint16m4_t src) +{ + return vget_i16m1(src, 1); +} + +vint16m1_t +test_vget_v_i16m8_i16m1 (vint16m8_t src) +{ + return vget_i16m1(src, 1); +} + +vint16m2_t +test_vget_v_i16m4_i16m2 (vint16m4_t src) +{ + return vget_i16m2(src, 1); +} + +vint16m2_t +test_vget_v_i16m8_i16m2 (vint16m8_t src) +{ + return vget_i16m2(src, 1); +} + +vint16m4_t +test_vget_v_i16m8_i16m4 (vint16m8_t src) +{ + return vget_i16m4(src, 1); +} + +vint32m2_t +test_vset_v_i32m1_i32m2 (vint32m2_t dest, vint32m1_t val) +{ + return vset(dest, 1, val); +} + +vint32m4_t +test_vset_v_i32m1_i32m4 (vint32m4_t dest, vint32m1_t val) +{ + return vset(dest, 1, val); +} + +vint32m4_t +test_vset_v_i32m2_i32m4 (vint32m4_t dest, vint32m2_t val) +{ + return vset(dest, 1, val); +} + +vint32m8_t +test_vset_v_i32m1_i32m8 (vint32m8_t dest, vint32m1_t val) +{ + return vset(dest, 1, val); +} + +vint32m8_t +test_vset_v_i32m2_i32m8 (vint32m8_t dest, vint32m2_t val) +{ + return vset(dest, 1, val); +} + +vint32m8_t +test_vset_v_i32m4_i32m8 (vint32m8_t dest, vint32m4_t val) +{ + return vset(dest, 1, val); +} + +vint32m1_t +test_vget_v_i32m2_i32m1 (vint32m2_t src) +{ + return vget_i32m1(src, 1); +} + +vint32m1_t +test_vget_v_i32m4_i32m1 (vint32m4_t src) +{ + return 
vget_i32m1(src, 1); +} + +vint32m1_t +test_vget_v_i32m8_i32m1 (vint32m8_t src) +{ + return vget_i32m1(src, 1); +} + +vint32m2_t +test_vget_v_i32m4_i32m2 (vint32m4_t src) +{ + return vget_i32m2(src, 1); +} + +vint32m2_t +test_vget_v_i32m8_i32m2 (vint32m8_t src) +{ + return vget_i32m2(src, 1); +} + +vint32m4_t +test_vget_v_i32m8_i32m4 (vint32m8_t src) +{ + return vget_i32m4(src, 1); +} + +vint64m2_t +test_vset_v_i64m1_i64m2 (vint64m2_t dest, vint64m1_t val) +{ + return vset(dest, 1, val); +} + +vint64m4_t +test_vset_v_i64m1_i64m4 (vint64m4_t dest, vint64m1_t val) +{ + return vset(dest, 1, val); +} + +vint64m4_t +test_vset_v_i64m2_i64m4 (vint64m4_t dest, vint64m2_t val) +{ + return vset(dest, 1, val); +} + +vint64m8_t +test_vset_v_i64m1_i64m8 (vint64m8_t dest, vint64m1_t val) +{ + return vset(dest, 1, val); +} + +vint64m8_t +test_vset_v_i64m2_i64m8 (vint64m8_t dest, vint64m2_t val) +{ + return vset(dest, 1, val); +} + +vint64m8_t +test_vset_v_i64m4_i64m8 (vint64m8_t dest, vint64m4_t val) +{ + return vset(dest, 1, val); +} + +vint64m1_t +test_vget_v_i64m2_i64m1 (vint64m2_t src) +{ + return vget_i64m1(src, 1); +} + +vint64m1_t +test_vget_v_i64m4_i64m1 (vint64m4_t src) +{ + return vget_i64m1(src, 1); +} + +vint64m1_t +test_vget_v_i64m8_i64m1 (vint64m8_t src) +{ + return vget_i64m1(src, 1); +} + +vint64m2_t +test_vget_v_i64m4_i64m2 (vint64m4_t src) +{ + return vget_i64m2(src, 1); +} + +vint64m2_t +test_vget_v_i64m8_i64m2 (vint64m8_t src) +{ + return vget_i64m2(src, 1); +} + +vint64m4_t +test_vget_v_i64m8_i64m4 (vint64m8_t src) +{ + return vget_i64m4(src, 1); +} + +vuint8m2_t +test_vset_v_u8m1_u8m2 (vuint8m2_t dest, vuint8m1_t val) +{ + return vset(dest, 1, val); +} + +vuint8m4_t +test_vset_v_u8m1_u8m4 (vuint8m4_t dest, vuint8m1_t val) +{ + return vset(dest, 1, val); +} + +vuint8m4_t +test_vset_v_u8m2_u8m4 (vuint8m4_t dest, vuint8m2_t val) +{ + return vset(dest, 1, val); +} + +vuint8m8_t +test_vset_v_u8m1_u8m8 (vuint8m8_t dest, vuint8m1_t val) +{ + return vset(dest, 1, val); +} + +vuint8m8_t +test_vset_v_u8m2_u8m8 (vuint8m8_t dest, vuint8m2_t val) +{ + return vset(dest, 1, val); +} + +vuint8m8_t +test_vset_v_u8m4_u8m8 (vuint8m8_t dest, vuint8m4_t val) +{ + return vset(dest, 1, val); +} + +vuint8m1_t +test_vget_v_u8m2_u8m1 (vuint8m2_t src) +{ + return vget_u8m1(src, 1); +} + +vuint8m1_t +test_vget_v_u8m4_u8m1 (vuint8m4_t src) +{ + return vget_u8m1(src, 1); +} + +vuint8m1_t +test_vget_v_u8m8_u8m1 (vuint8m8_t src) +{ + return vget_u8m1(src, 1); +} + +vuint8m2_t +test_vget_v_u8m4_u8m2 (vuint8m4_t src) +{ + return vget_u8m2(src, 1); +} + +vuint8m2_t +test_vget_v_u8m8_u8m2 (vuint8m8_t src) +{ + return vget_u8m2(src, 1); +} + +vuint8m4_t +test_vget_v_u8m8_u8m4 (vuint8m8_t src) +{ + return vget_u8m4(src, 1); +} + +vuint16m2_t +test_vset_v_u16m1_u16m2 (vuint16m2_t dest, vuint16m1_t val) +{ + return vset(dest, 1, val); +} + +vuint16m4_t +test_vset_v_u16m1_u16m4 (vuint16m4_t dest, vuint16m1_t val) +{ + return vset(dest, 1, val); +} + +vuint16m4_t +test_vset_v_u16m2_u16m4 (vuint16m4_t dest, vuint16m2_t val) +{ + return vset(dest, 1, val); +} + +vuint16m8_t +test_vset_v_u16m1_u16m8 (vuint16m8_t dest, vuint16m1_t val) +{ + return vset(dest, 1, val); +} + +vuint16m8_t +test_vset_v_u16m2_u16m8 (vuint16m8_t dest, vuint16m2_t val) +{ + return vset(dest, 1, val); +} + +vuint16m8_t +test_vset_v_u16m4_u16m8 (vuint16m8_t dest, vuint16m4_t val) +{ + return vset(dest, 1, val); +} + +vuint16m1_t +test_vget_v_u16m2_u16m1 (vuint16m2_t src) +{ + return vget_u16m1(src, 1); +} + +vuint16m1_t +test_vget_v_u16m4_u16m1 
(vuint16m4_t src) +{ + return vget_u16m1(src, 1); +} + +vuint16m1_t +test_vget_v_u16m8_u16m1 (vuint16m8_t src) +{ + return vget_u16m1(src, 1); +} + +vuint16m2_t +test_vget_v_u16m4_u16m2 (vuint16m4_t src) +{ + return vget_u16m2(src, 1); +} + +vuint16m2_t +test_vget_v_u16m8_u16m2 (vuint16m8_t src) +{ + return vget_u16m2(src, 1); +} + +vuint16m4_t +test_vget_v_u16m8_u16m4 (vuint16m8_t src) +{ + return vget_u16m4(src, 1); +} + +vuint32m2_t +test_vset_v_u32m1_u32m2 (vuint32m2_t dest, vuint32m1_t val) +{ + return vset(dest, 1, val); +} + +vuint32m4_t +test_vset_v_u32m1_u32m4 (vuint32m4_t dest, vuint32m1_t val) +{ + return vset(dest, 1, val); +} + +vuint32m4_t +test_vset_v_u32m2_u32m4 (vuint32m4_t dest, vuint32m2_t val) +{ + return vset(dest, 1, val); +} + +vuint32m8_t +test_vset_v_u32m1_u32m8 (vuint32m8_t dest, vuint32m1_t val) +{ + return vset(dest, 1, val); +} + +vuint32m8_t +test_vset_v_u32m2_u32m8 (vuint32m8_t dest, vuint32m2_t val) +{ + return vset(dest, 1, val); +} + +vuint32m8_t +test_vset_v_u32m4_u32m8 (vuint32m8_t dest, vuint32m4_t val) +{ + return vset(dest, 1, val); +} + +vuint32m1_t +test_vget_v_u32m2_u32m1 (vuint32m2_t src) +{ + return vget_u32m1(src, 1); +} + +vuint32m1_t +test_vget_v_u32m4_u32m1 (vuint32m4_t src) +{ + return vget_u32m1(src, 1); +} + +vuint32m1_t +test_vget_v_u32m8_u32m1 (vuint32m8_t src) +{ + return vget_u32m1(src, 1); +} + +vuint32m2_t +test_vget_v_u32m4_u32m2 (vuint32m4_t src) +{ + return vget_u32m2(src, 1); +} + +vuint32m2_t +test_vget_v_u32m8_u32m2 (vuint32m8_t src) +{ + return vget_u32m2(src, 1); +} + +vuint32m4_t +test_vget_v_u32m8_u32m4 (vuint32m8_t src) +{ + return vget_u32m4(src, 1); +} + +vuint64m2_t +test_vset_v_u64m1_u64m2 (vuint64m2_t dest, vuint64m1_t val) +{ + return vset(dest, 1, val); +} + +vuint64m4_t +test_vset_v_u64m1_u64m4 (vuint64m4_t dest, vuint64m1_t val) +{ + return vset(dest, 1, val); +} + +vuint64m4_t +test_vset_v_u64m2_u64m4 (vuint64m4_t dest, vuint64m2_t val) +{ + return vset(dest, 1, val); +} + +vuint64m8_t +test_vset_v_u64m1_u64m8 (vuint64m8_t dest, vuint64m1_t val) +{ + return vset(dest, 1, val); +} + +vuint64m8_t +test_vset_v_u64m2_u64m8 (vuint64m8_t dest, vuint64m2_t val) +{ + return vset(dest, 1, val); +} + +vuint64m8_t +test_vset_v_u64m4_u64m8 (vuint64m8_t dest, vuint64m4_t val) +{ + return vset(dest, 1, val); +} + +vuint64m1_t +test_vget_v_u64m2_u64m1 (vuint64m2_t src) +{ + return vget_u64m1(src, 1); +} + +vuint64m1_t +test_vget_v_u64m4_u64m1 (vuint64m4_t src) +{ + return vget_u64m1(src, 1); +} + +vuint64m1_t +test_vget_v_u64m8_u64m1 (vuint64m8_t src) +{ + return vget_u64m1(src, 1); +} + +vuint64m2_t +test_vget_v_u64m4_u64m2 (vuint64m4_t src) +{ + return vget_u64m2(src, 1); +} + +vuint64m2_t +test_vget_v_u64m8_u64m2 (vuint64m8_t src) +{ + return vget_u64m2(src, 1); +} + +vuint64m4_t +test_vget_v_u64m8_u64m4 (vuint64m8_t src) +{ + return vget_u64m4(src, 1); +} + +vfloat32m2_t +test_vset_v_f32m1_f32m2 (vfloat32m2_t dest, vfloat32m1_t val) +{ + return vset(dest, 1, val); +} + +vfloat32m4_t +test_vset_v_f32m1_f32m4 (vfloat32m4_t dest, vfloat32m1_t val) +{ + return vset(dest, 1, val); +} + +vfloat32m4_t +test_vset_v_f32m2_f32m4 (vfloat32m4_t dest, vfloat32m2_t val) +{ + return vset(dest, 1, val); +} + +vfloat32m8_t +test_vset_v_f32m1_f32m8 (vfloat32m8_t dest, vfloat32m1_t val) +{ + return vset(dest, 1, val); +} + +vfloat32m8_t +test_vset_v_f32m2_f32m8 (vfloat32m8_t dest, vfloat32m2_t val) +{ + return vset(dest, 1, val); +} + +vfloat32m8_t +test_vset_v_f32m4_f32m8 (vfloat32m8_t dest, vfloat32m4_t val) +{ + return vset(dest, 1, val); 
+} + +vfloat32m1_t +test_vget_v_f32m2_f32m1 (vfloat32m2_t src) +{ + return vget_f32m1(src, 1); +} + +vfloat32m1_t +test_vget_v_f32m4_f32m1 (vfloat32m4_t src) +{ + return vget_f32m1(src, 1); +} + +vfloat32m1_t +test_vget_v_f32m8_f32m1 (vfloat32m8_t src) +{ + return vget_f32m1(src, 1); +} + +vfloat32m2_t +test_vget_v_f32m4_f32m2 (vfloat32m4_t src) +{ + return vget_f32m2(src, 1); +} + +vfloat32m2_t +test_vget_v_f32m8_f32m2 (vfloat32m8_t src) +{ + return vget_f32m2(src, 1); +} + +vfloat32m4_t +test_vget_v_f32m8_f32m4 (vfloat32m8_t src) +{ + return vget_f32m4(src, 1); +} + +vfloat64m2_t +test_vset_v_f64m1_f64m2 (vfloat64m2_t dest, vfloat64m1_t val) +{ + return vset(dest, 1, val); +} + +vfloat64m4_t +test_vset_v_f64m1_f64m4 (vfloat64m4_t dest, vfloat64m1_t val) +{ + return vset(dest, 1, val); +} + +vfloat64m4_t +test_vset_v_f64m2_f64m4 (vfloat64m4_t dest, vfloat64m2_t val) +{ + return vset(dest, 1, val); +} + +vfloat64m8_t +test_vset_v_f64m1_f64m8 (vfloat64m8_t dest, vfloat64m1_t val) +{ + return vset(dest, 1, val); +} + +vfloat64m8_t +test_vset_v_f64m2_f64m8 (vfloat64m8_t dest, vfloat64m2_t val) +{ + return vset(dest, 1, val); +} + +vfloat64m8_t +test_vset_v_f64m4_f64m8 (vfloat64m8_t dest, vfloat64m4_t val) +{ + return vset(dest, 1, val); +} + +vfloat64m1_t +test_vget_v_f64m2_f64m1 (vfloat64m2_t src) +{ + return vget_f64m1(src, 1); +} + +vfloat64m1_t +test_vget_v_f64m4_f64m1 (vfloat64m4_t src) +{ + return vget_f64m1(src, 1); +} + +vfloat64m1_t +test_vget_v_f64m8_f64m1 (vfloat64m8_t src) +{ + return vget_f64m1(src, 1); +} + +vfloat64m2_t +test_vget_v_f64m4_f64m2 (vfloat64m4_t src) +{ + return vget_f64m2(src, 1); +} + +vfloat64m2_t +test_vget_v_f64m8_f64m2 (vfloat64m8_t src) +{ + return vget_f64m2(src, 1); +} + +vfloat64m4_t +test_vget_v_f64m8_f64m4 (vfloat64m8_t src) +{ + return vget_f64m4(src, 1); +} +/* { dg-final { scan-assembler-times {vmv1r\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 60 } } */ +/* { dg-final { scan-assembler-times {vmv2r\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 40 } } */ +/* { dg-final { scan-assembler-times {vmv4r\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 20 } } */ + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/set-get.c b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/set-get.c new file mode 100644 index 00000000000..33d5a129aae --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/intrinsic/set-get.c @@ -0,0 +1,730 @@ + +/* { dg-do compile } */ +/* { dg-skip-if "test vector intrinsic" { *-*-* } { "*" } { "-march=rv*v*" } } */ + +#include +#include + + +vint8m2_t +test_vset_v_i8m1_i8m2 (vint8m2_t dest, vint8m1_t val) +{ + return vset_v_i8m1_i8m2(dest, 1, val); +} + +vint8m4_t +test_vset_v_i8m1_i8m4 (vint8m4_t dest, vint8m1_t val) +{ + return vset_v_i8m1_i8m4(dest, 1, val); +} + +vint8m4_t +test_vset_v_i8m2_i8m4 (vint8m4_t dest, vint8m2_t val) +{ + return vset_v_i8m2_i8m4(dest, 1, val); +} + +vint8m8_t +test_vset_v_i8m1_i8m8 (vint8m8_t dest, vint8m1_t val) +{ + return vset_v_i8m1_i8m8(dest, 1, val); +} + +vint8m8_t +test_vset_v_i8m2_i8m8 (vint8m8_t dest, vint8m2_t val) +{ + return vset_v_i8m2_i8m8(dest, 1, val); +} + +vint8m8_t +test_vset_v_i8m4_i8m8 (vint8m8_t dest, vint8m4_t val) +{ + return vset_v_i8m4_i8m8(dest, 1, val); +} + +vint8m1_t +test_vget_v_i8m2_i8m1 (vint8m2_t src) +{ + return vget_v_i8m2_i8m1(src, 1); +} + +vint8m1_t +test_vget_v_i8m4_i8m1 (vint8m4_t src) +{ + return vget_v_i8m4_i8m1(src, 1); +} + +vint8m1_t +test_vget_v_i8m8_i8m1 (vint8m8_t src) +{ + return 
vget_v_i8m8_i8m1(src, 1); +} + +vint8m2_t +test_vget_v_i8m4_i8m2 (vint8m4_t src) +{ + return vget_v_i8m4_i8m2(src, 1); +} + +vint8m2_t +test_vget_v_i8m8_i8m2 (vint8m8_t src) +{ + return vget_v_i8m8_i8m2(src, 1); +} + +vint8m4_t +test_vget_v_i8m8_i8m4 (vint8m8_t src) +{ + return vget_v_i8m8_i8m4(src, 1); +} + +vint16m2_t +test_vset_v_i16m1_i16m2 (vint16m2_t dest, vint16m1_t val) +{ + return vset_v_i16m1_i16m2(dest, 1, val); +} + +vint16m4_t +test_vset_v_i16m1_i16m4 (vint16m4_t dest, vint16m1_t val) +{ + return vset_v_i16m1_i16m4(dest, 1, val); +} + +vint16m4_t +test_vset_v_i16m2_i16m4 (vint16m4_t dest, vint16m2_t val) +{ + return vset_v_i16m2_i16m4(dest, 1, val); +} + +vint16m8_t +test_vset_v_i16m1_i16m8 (vint16m8_t dest, vint16m1_t val) +{ + return vset_v_i16m1_i16m8(dest, 1, val); +} + +vint16m8_t +test_vset_v_i16m2_i16m8 (vint16m8_t dest, vint16m2_t val) +{ + return vset_v_i16m2_i16m8(dest, 1, val); +} + +vint16m8_t +test_vset_v_i16m4_i16m8 (vint16m8_t dest, vint16m4_t val) +{ + return vset_v_i16m4_i16m8(dest, 1, val); +} + +vint16m1_t +test_vget_v_i16m2_i16m1 (vint16m2_t src) +{ + return vget_v_i16m2_i16m1(src, 1); +} + +vint16m1_t +test_vget_v_i16m4_i16m1 (vint16m4_t src) +{ + return vget_v_i16m4_i16m1(src, 1); +} + +vint16m1_t +test_vget_v_i16m8_i16m1 (vint16m8_t src) +{ + return vget_v_i16m8_i16m1(src, 1); +} + +vint16m2_t +test_vget_v_i16m4_i16m2 (vint16m4_t src) +{ + return vget_v_i16m4_i16m2(src, 1); +} + +vint16m2_t +test_vget_v_i16m8_i16m2 (vint16m8_t src) +{ + return vget_v_i16m8_i16m2(src, 1); +} + +vint16m4_t +test_vget_v_i16m8_i16m4 (vint16m8_t src) +{ + return vget_v_i16m8_i16m4(src, 1); +} + +vint32m2_t +test_vset_v_i32m1_i32m2 (vint32m2_t dest, vint32m1_t val) +{ + return vset_v_i32m1_i32m2(dest, 1, val); +} + +vint32m4_t +test_vset_v_i32m1_i32m4 (vint32m4_t dest, vint32m1_t val) +{ + return vset_v_i32m1_i32m4(dest, 1, val); +} + +vint32m4_t +test_vset_v_i32m2_i32m4 (vint32m4_t dest, vint32m2_t val) +{ + return vset_v_i32m2_i32m4(dest, 1, val); +} + +vint32m8_t +test_vset_v_i32m1_i32m8 (vint32m8_t dest, vint32m1_t val) +{ + return vset_v_i32m1_i32m8(dest, 1, val); +} + +vint32m8_t +test_vset_v_i32m2_i32m8 (vint32m8_t dest, vint32m2_t val) +{ + return vset_v_i32m2_i32m8(dest, 1, val); +} + +vint32m8_t +test_vset_v_i32m4_i32m8 (vint32m8_t dest, vint32m4_t val) +{ + return vset_v_i32m4_i32m8(dest, 1, val); +} + +vint32m1_t +test_vget_v_i32m2_i32m1 (vint32m2_t src) +{ + return vget_v_i32m2_i32m1(src, 1); +} + +vint32m1_t +test_vget_v_i32m4_i32m1 (vint32m4_t src) +{ + return vget_v_i32m4_i32m1(src, 1); +} + +vint32m1_t +test_vget_v_i32m8_i32m1 (vint32m8_t src) +{ + return vget_v_i32m8_i32m1(src, 1); +} + +vint32m2_t +test_vget_v_i32m4_i32m2 (vint32m4_t src) +{ + return vget_v_i32m4_i32m2(src, 1); +} + +vint32m2_t +test_vget_v_i32m8_i32m2 (vint32m8_t src) +{ + return vget_v_i32m8_i32m2(src, 1); +} + +vint32m4_t +test_vget_v_i32m8_i32m4 (vint32m8_t src) +{ + return vget_v_i32m8_i32m4(src, 1); +} + +vint64m2_t +test_vset_v_i64m1_i64m2 (vint64m2_t dest, vint64m1_t val) +{ + return vset_v_i64m1_i64m2(dest, 1, val); +} + +vint64m4_t +test_vset_v_i64m1_i64m4 (vint64m4_t dest, vint64m1_t val) +{ + return vset_v_i64m1_i64m4(dest, 1, val); +} + +vint64m4_t +test_vset_v_i64m2_i64m4 (vint64m4_t dest, vint64m2_t val) +{ + return vset_v_i64m2_i64m4(dest, 1, val); +} + +vint64m8_t +test_vset_v_i64m1_i64m8 (vint64m8_t dest, vint64m1_t val) +{ + return vset_v_i64m1_i64m8(dest, 1, val); +} + +vint64m8_t +test_vset_v_i64m2_i64m8 (vint64m8_t dest, vint64m2_t val) +{ + return 
vset_v_i64m2_i64m8(dest, 1, val); +} + +vint64m8_t +test_vset_v_i64m4_i64m8 (vint64m8_t dest, vint64m4_t val) +{ + return vset_v_i64m4_i64m8(dest, 1, val); +} + +vint64m1_t +test_vget_v_i64m2_i64m1 (vint64m2_t src) +{ + return vget_v_i64m2_i64m1(src, 1); +} + +vint64m1_t +test_vget_v_i64m4_i64m1 (vint64m4_t src) +{ + return vget_v_i64m4_i64m1(src, 1); +} + +vint64m1_t +test_vget_v_i64m8_i64m1 (vint64m8_t src) +{ + return vget_v_i64m8_i64m1(src, 1); +} + +vint64m2_t +test_vget_v_i64m4_i64m2 (vint64m4_t src) +{ + return vget_v_i64m4_i64m2(src, 1); +} + +vint64m2_t +test_vget_v_i64m8_i64m2 (vint64m8_t src) +{ + return vget_v_i64m8_i64m2(src, 1); +} + +vint64m4_t +test_vget_v_i64m8_i64m4 (vint64m8_t src) +{ + return vget_v_i64m8_i64m4(src, 1); +} + +vuint8m2_t +test_vset_v_u8m1_u8m2 (vuint8m2_t dest, vuint8m1_t val) +{ + return vset_v_u8m1_u8m2(dest, 1, val); +} + +vuint8m4_t +test_vset_v_u8m1_u8m4 (vuint8m4_t dest, vuint8m1_t val) +{ + return vset_v_u8m1_u8m4(dest, 1, val); +} + +vuint8m4_t +test_vset_v_u8m2_u8m4 (vuint8m4_t dest, vuint8m2_t val) +{ + return vset_v_u8m2_u8m4(dest, 1, val); +} + +vuint8m8_t +test_vset_v_u8m1_u8m8 (vuint8m8_t dest, vuint8m1_t val) +{ + return vset_v_u8m1_u8m8(dest, 1, val); +} + +vuint8m8_t +test_vset_v_u8m2_u8m8 (vuint8m8_t dest, vuint8m2_t val) +{ + return vset_v_u8m2_u8m8(dest, 1, val); +} + +vuint8m8_t +test_vset_v_u8m4_u8m8 (vuint8m8_t dest, vuint8m4_t val) +{ + return vset_v_u8m4_u8m8(dest, 1, val); +} + +vuint8m1_t +test_vget_v_u8m2_u8m1 (vuint8m2_t src) +{ + return vget_v_u8m2_u8m1(src, 1); +} + +vuint8m1_t +test_vget_v_u8m4_u8m1 (vuint8m4_t src) +{ + return vget_v_u8m4_u8m1(src, 1); +} + +vuint8m1_t +test_vget_v_u8m8_u8m1 (vuint8m8_t src) +{ + return vget_v_u8m8_u8m1(src, 1); +} + +vuint8m2_t +test_vget_v_u8m4_u8m2 (vuint8m4_t src) +{ + return vget_v_u8m4_u8m2(src, 1); +} + +vuint8m2_t +test_vget_v_u8m8_u8m2 (vuint8m8_t src) +{ + return vget_v_u8m8_u8m2(src, 1); +} + +vuint8m4_t +test_vget_v_u8m8_u8m4 (vuint8m8_t src) +{ + return vget_v_u8m8_u8m4(src, 1); +} + +vuint16m2_t +test_vset_v_u16m1_u16m2 (vuint16m2_t dest, vuint16m1_t val) +{ + return vset_v_u16m1_u16m2(dest, 1, val); +} + +vuint16m4_t +test_vset_v_u16m1_u16m4 (vuint16m4_t dest, vuint16m1_t val) +{ + return vset_v_u16m1_u16m4(dest, 1, val); +} + +vuint16m4_t +test_vset_v_u16m2_u16m4 (vuint16m4_t dest, vuint16m2_t val) +{ + return vset_v_u16m2_u16m4(dest, 1, val); +} + +vuint16m8_t +test_vset_v_u16m1_u16m8 (vuint16m8_t dest, vuint16m1_t val) +{ + return vset_v_u16m1_u16m8(dest, 1, val); +} + +vuint16m8_t +test_vset_v_u16m2_u16m8 (vuint16m8_t dest, vuint16m2_t val) +{ + return vset_v_u16m2_u16m8(dest, 1, val); +} + +vuint16m8_t +test_vset_v_u16m4_u16m8 (vuint16m8_t dest, vuint16m4_t val) +{ + return vset_v_u16m4_u16m8(dest, 1, val); +} + +vuint16m1_t +test_vget_v_u16m2_u16m1 (vuint16m2_t src) +{ + return vget_v_u16m2_u16m1(src, 1); +} + +vuint16m1_t +test_vget_v_u16m4_u16m1 (vuint16m4_t src) +{ + return vget_v_u16m4_u16m1(src, 1); +} + +vuint16m1_t +test_vget_v_u16m8_u16m1 (vuint16m8_t src) +{ + return vget_v_u16m8_u16m1(src, 1); +} + +vuint16m2_t +test_vget_v_u16m4_u16m2 (vuint16m4_t src) +{ + return vget_v_u16m4_u16m2(src, 1); +} + +vuint16m2_t +test_vget_v_u16m8_u16m2 (vuint16m8_t src) +{ + return vget_v_u16m8_u16m2(src, 1); +} + +vuint16m4_t +test_vget_v_u16m8_u16m4 (vuint16m8_t src) +{ + return vget_v_u16m8_u16m4(src, 1); +} + +vuint32m2_t +test_vset_v_u32m1_u32m2 (vuint32m2_t dest, vuint32m1_t val) +{ + return vset_v_u32m1_u32m2(dest, 1, val); +} + +vuint32m4_t +test_vset_v_u32m1_u32m4 
(vuint32m4_t dest, vuint32m1_t val) +{ + return vset_v_u32m1_u32m4(dest, 1, val); +} + +vuint32m4_t +test_vset_v_u32m2_u32m4 (vuint32m4_t dest, vuint32m2_t val) +{ + return vset_v_u32m2_u32m4(dest, 1, val); +} + +vuint32m8_t +test_vset_v_u32m1_u32m8 (vuint32m8_t dest, vuint32m1_t val) +{ + return vset_v_u32m1_u32m8(dest, 1, val); +} + +vuint32m8_t +test_vset_v_u32m2_u32m8 (vuint32m8_t dest, vuint32m2_t val) +{ + return vset_v_u32m2_u32m8(dest, 1, val); +} + +vuint32m8_t +test_vset_v_u32m4_u32m8 (vuint32m8_t dest, vuint32m4_t val) +{ + return vset_v_u32m4_u32m8(dest, 1, val); +} + +vuint32m1_t +test_vget_v_u32m2_u32m1 (vuint32m2_t src) +{ + return vget_v_u32m2_u32m1(src, 1); +} + +vuint32m1_t +test_vget_v_u32m4_u32m1 (vuint32m4_t src) +{ + return vget_v_u32m4_u32m1(src, 1); +} + +vuint32m1_t +test_vget_v_u32m8_u32m1 (vuint32m8_t src) +{ + return vget_v_u32m8_u32m1(src, 1); +} + +vuint32m2_t +test_vget_v_u32m4_u32m2 (vuint32m4_t src) +{ + return vget_v_u32m4_u32m2(src, 1); +} + +vuint32m2_t +test_vget_v_u32m8_u32m2 (vuint32m8_t src) +{ + return vget_v_u32m8_u32m2(src, 1); +} + +vuint32m4_t +test_vget_v_u32m8_u32m4 (vuint32m8_t src) +{ + return vget_v_u32m8_u32m4(src, 1); +} + +vuint64m2_t +test_vset_v_u64m1_u64m2 (vuint64m2_t dest, vuint64m1_t val) +{ + return vset_v_u64m1_u64m2(dest, 1, val); +} + +vuint64m4_t +test_vset_v_u64m1_u64m4 (vuint64m4_t dest, vuint64m1_t val) +{ + return vset_v_u64m1_u64m4(dest, 1, val); +} + +vuint64m4_t +test_vset_v_u64m2_u64m4 (vuint64m4_t dest, vuint64m2_t val) +{ + return vset_v_u64m2_u64m4(dest, 1, val); +} + +vuint64m8_t +test_vset_v_u64m1_u64m8 (vuint64m8_t dest, vuint64m1_t val) +{ + return vset_v_u64m1_u64m8(dest, 1, val); +} + +vuint64m8_t +test_vset_v_u64m2_u64m8 (vuint64m8_t dest, vuint64m2_t val) +{ + return vset_v_u64m2_u64m8(dest, 1, val); +} + +vuint64m8_t +test_vset_v_u64m4_u64m8 (vuint64m8_t dest, vuint64m4_t val) +{ + return vset_v_u64m4_u64m8(dest, 1, val); +} + +vuint64m1_t +test_vget_v_u64m2_u64m1 (vuint64m2_t src) +{ + return vget_v_u64m2_u64m1(src, 1); +} + +vuint64m1_t +test_vget_v_u64m4_u64m1 (vuint64m4_t src) +{ + return vget_v_u64m4_u64m1(src, 1); +} + +vuint64m1_t +test_vget_v_u64m8_u64m1 (vuint64m8_t src) +{ + return vget_v_u64m8_u64m1(src, 1); +} + +vuint64m2_t +test_vget_v_u64m4_u64m2 (vuint64m4_t src) +{ + return vget_v_u64m4_u64m2(src, 1); +} + +vuint64m2_t +test_vget_v_u64m8_u64m2 (vuint64m8_t src) +{ + return vget_v_u64m8_u64m2(src, 1); +} + +vuint64m4_t +test_vget_v_u64m8_u64m4 (vuint64m8_t src) +{ + return vget_v_u64m8_u64m4(src, 1); +} + +vfloat32m2_t +test_vset_v_f32m1_f32m2 (vfloat32m2_t dest, vfloat32m1_t val) +{ + return vset_v_f32m1_f32m2(dest, 1, val); +} + +vfloat32m4_t +test_vset_v_f32m1_f32m4 (vfloat32m4_t dest, vfloat32m1_t val) +{ + return vset_v_f32m1_f32m4(dest, 1, val); +} + +vfloat32m4_t +test_vset_v_f32m2_f32m4 (vfloat32m4_t dest, vfloat32m2_t val) +{ + return vset_v_f32m2_f32m4(dest, 1, val); +} + +vfloat32m8_t +test_vset_v_f32m1_f32m8 (vfloat32m8_t dest, vfloat32m1_t val) +{ + return vset_v_f32m1_f32m8(dest, 1, val); +} + +vfloat32m8_t +test_vset_v_f32m2_f32m8 (vfloat32m8_t dest, vfloat32m2_t val) +{ + return vset_v_f32m2_f32m8(dest, 1, val); +} + +vfloat32m8_t +test_vset_v_f32m4_f32m8 (vfloat32m8_t dest, vfloat32m4_t val) +{ + return vset_v_f32m4_f32m8(dest, 1, val); +} + +vfloat32m1_t +test_vget_v_f32m2_f32m1 (vfloat32m2_t src) +{ + return vget_v_f32m2_f32m1(src, 1); +} + +vfloat32m1_t +test_vget_v_f32m4_f32m1 (vfloat32m4_t src) +{ + return vget_v_f32m4_f32m1(src, 1); +} + +vfloat32m1_t 
+test_vget_v_f32m8_f32m1 (vfloat32m8_t src) +{ + return vget_v_f32m8_f32m1(src, 1); +} + +vfloat32m2_t +test_vget_v_f32m4_f32m2 (vfloat32m4_t src) +{ + return vget_v_f32m4_f32m2(src, 1); +} + +vfloat32m2_t +test_vget_v_f32m8_f32m2 (vfloat32m8_t src) +{ + return vget_v_f32m8_f32m2(src, 1); +} + +vfloat32m4_t +test_vget_v_f32m8_f32m4 (vfloat32m8_t src) +{ + return vget_v_f32m8_f32m4(src, 1); +} + +vfloat64m2_t +test_vset_v_f64m1_f64m2 (vfloat64m2_t dest, vfloat64m1_t val) +{ + return vset_v_f64m1_f64m2(dest, 1, val); +} + +vfloat64m4_t +test_vset_v_f64m1_f64m4 (vfloat64m4_t dest, vfloat64m1_t val) +{ + return vset_v_f64m1_f64m4(dest, 1, val); +} + +vfloat64m4_t +test_vset_v_f64m2_f64m4 (vfloat64m4_t dest, vfloat64m2_t val) +{ + return vset_v_f64m2_f64m4(dest, 1, val); +} + +vfloat64m8_t +test_vset_v_f64m1_f64m8 (vfloat64m8_t dest, vfloat64m1_t val) +{ + return vset_v_f64m1_f64m8(dest, 1, val); +} + +vfloat64m8_t +test_vset_v_f64m2_f64m8 (vfloat64m8_t dest, vfloat64m2_t val) +{ + return vset_v_f64m2_f64m8(dest, 1, val); +} + +vfloat64m8_t +test_vset_v_f64m4_f64m8 (vfloat64m8_t dest, vfloat64m4_t val) +{ + return vset_v_f64m4_f64m8(dest, 1, val); +} + +vfloat64m1_t +test_vget_v_f64m2_f64m1 (vfloat64m2_t src) +{ + return vget_v_f64m2_f64m1(src, 1); +} + +vfloat64m1_t +test_vget_v_f64m4_f64m1 (vfloat64m4_t src) +{ + return vget_v_f64m4_f64m1(src, 1); +} + +vfloat64m1_t +test_vget_v_f64m8_f64m1 (vfloat64m8_t src) +{ + return vget_v_f64m8_f64m1(src, 1); +} + +vfloat64m2_t +test_vget_v_f64m4_f64m2 (vfloat64m4_t src) +{ + return vget_v_f64m4_f64m2(src, 1); +} + +vfloat64m2_t +test_vget_v_f64m8_f64m2 (vfloat64m8_t src) +{ + return vget_v_f64m8_f64m2(src, 1); +} + +vfloat64m4_t +test_vget_v_f64m8_f64m4 (vfloat64m8_t src) +{ + return vget_v_f64m8_f64m4(src, 1); +} +/* { dg-final { scan-assembler-times {vmv1r\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 60 } } */ +/* { dg-final { scan-assembler-times {vmv2r\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 40 } } */ +/* { dg-final { scan-assembler-times {vmv4r\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 20 } } */ From patchwork Tue May 31 08:50:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54555 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EC258395A061 for ; Tue, 31 May 2022 08:59:36 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg152.qq.com (smtpbg152.qq.com [13.245.186.79]) by sourceware.org (Postfix) with ESMTPS id DE9A13834E42 for ; Tue, 31 May 2022 08:51:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org DE9A13834E42 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987055tr11u3ir Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:50:54 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: 2j+C2ndjE44LGVTMMecgNqpoq1HvOsxbiuDfh2eHQVNIjoUHD33yLfwR+UcGE SNYAXW3+XOZSaiDhV7lo6o2A5HQM2Tpn9QPR4mEUHhCsGnGwwPy/Ta2REnRyJ6V3QGgLPUN ZjhMqqo3sSKQeCU0GST8ORdBqXbl1Ra1B0ru01C4Yn6Ajc41fxKeh8W06l1RdJf1KOh+/Ck 
LkC1gJkWSkKL2ku0N2NQ+tQexnL0XNujPDPfFhfBDhiYwjVkKqFdE0q7DX+Td/F/ajJS1qO oqFtdgCRjNpHYKE/DVjIYxBDVY7QlU7dMegYezO7XkeeI0Zin62hWcLB8MI7z30lNR0qy9+ a1j497cUGZ2kIk/WsoOWQfXFbt/NI4VJDl8RkpL X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 13/21] Adjust scalable frame and full testcases Date: Tue, 31 May 2022 16:50:04 +0800 Message-Id: <20220531085012.269719-14-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign10 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-vector.cc (rvv_adjust_frame): Adjust frame manipulation for RVV scalable vector. * config/riscv/riscv-vector.h (rvv_adjust_frame): Adjust frame manipulation for RVV scalable vector. * config/riscv/riscv.cc (riscv_compute_frame_info): Adjust frame manipulation for RVV scalable vector. (riscv_first_stack_step): Adjust frame manipulation for RVV scalable vector. (riscv_expand_prologue): Adjust frame manipulation for RVV scalable vector. (riscv_expand_epilogue): Adjust frame manipulation for RVV scalable vector. (riscv_dwarf_poly_indeterminate_value): New function. (riscv_estimated_poly_value): New function. (TARGET_DWARF_POLY_INDETERMINATE_VALUE): New targethook. (TARGET_ESTIMATED_POLY_VALUE): New targethook. * config/riscv/riscv.h (RISCV_PROLOGUE_TEMP2_REGNUM): New macro define. (RISCV_PROLOGUE_TEMP2): New macro define. (RISCV_DWARF_VLENB): New macro define. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/stack/rvv-stack.exp: New test. * gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c: New test. * gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c: New test. * gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c: New test. * gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c: New test. * gcc.target/riscv/rvv/stack/stack-check-scalar.c: New test. * gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c: New test. * gcc.target/riscv/rvv/stack/stack-check-vector_1.c: New test. * gcc.target/riscv/rvv/stack/stack-check-vector_2.c: New test. 
--- gcc/config/riscv/riscv-vector.cc | 33 +++ gcc/config/riscv/riscv-vector.h | 1 + gcc/config/riscv/riscv.cc | 275 ++++++++++++----- gcc/config/riscv/riscv.h | 4 + .../gcc.target/riscv/rvv/stack/rvv-stack.exp | 47 +++ .../rvv/stack/stack-check-alloca-scalar.c | 53 ++++ .../rvv/stack/stack-check-alloca-vector.c | 45 +++ .../stack/stack-check-save-restore-scalar.c | 48 +++ .../stack/stack-check-save-restore-vector.c | 62 ++++ .../riscv/rvv/stack/stack-check-scalar.c | 205 +++++++++++++ .../rvv/stack/stack-check-vararg-scalar.c | 33 +++ .../riscv/rvv/stack/stack-check-vector_1.c | 277 ++++++++++++++++++ .../riscv/rvv/stack/stack-check-vector_2.c | 141 +++++++++ 13 files changed, 1143 insertions(+), 81 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index d09fc1b8e49..4cb5e79421d 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -846,6 +846,39 @@ rvv_expand_poly_move (machine_mode mode, rtx dest, rtx clobber, rtx src) emit_insn (gen_rtx_SET (dest, riscv_add_offset (clobber, dest, constant))); } +/* Adjust frame of vector for prologue && epilogue. */ +void +rvv_adjust_frame (rtx target, poly_int64 offset, bool epilogue) +{ + rtx clobber = RISCV_PROLOGUE_TEMP (Pmode); + rtx space = RISCV_PROLOGUE_TEMP2 (Pmode); + rtx insn, dwarf, adjust_frame_rtx; + + rvv_expand_poly_move (Pmode, space, clobber, gen_int_mode (offset, Pmode)); + + if (epilogue) + { + insn = gen_add3_insn (target, target, space); + } + else + { + insn = gen_sub3_insn (target, target, space); + } + + insn = emit_insn (insn); + + RTX_FRAME_RELATED_P (insn) = 1; + + adjust_frame_rtx = + gen_rtx_SET (target, + plus_constant (Pmode, target, epilogue ? offset : -offset)); + + dwarf = alloc_reg_note (REG_FRAME_RELATED_EXPR, + copy_rtx (adjust_frame_rtx), NULL_RTX); + + REG_NOTES (insn) = dwarf; +} + /* Helper functions for handling sew=64 on RV32 system. */ bool imm32_p (rtx a) diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h index b70cf676e26..98f47ea0ec1 100644 --- a/gcc/config/riscv/riscv-vector.h +++ b/gcc/config/riscv/riscv-vector.h @@ -22,4 +22,5 @@ #define GCC_RISCV_VECTOR_H void rvv_report_required (void); void rvv_expand_poly_move (machine_mode, rtx, rtx, rtx); +void rvv_adjust_frame (rtx, poly_int64, bool); #endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 832c1754002..29106bbf6fe 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -4357,7 +4357,11 @@ riscv_compute_frame_info (void) padding. */ frame->arg_pointer_offset = offset - crtl->args.pretend_args_size; frame->total_size = offset; - + + /* Calculate the constant offset of a scalable frame. 
We Handle the constant + and scalable part of frame seperatly. */ + frame->constant_offset = riscv_stack_align (frame->total_size.coeffs[0]) - + riscv_stack_align (frame->total_size.coeffs[1]); /* Next points the incoming stack pointer and any incoming arguments. */ /* Only use save/restore routines when the GPRs are atop the frame. */ @@ -4538,21 +4542,27 @@ riscv_restore_reg (rtx reg, rtx mem) static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame) { - if (SMALL_OPERAND (frame->total_size.to_constant())) - return frame->total_size.to_constant (); + HOST_WIDE_INT frame_total_size; + if (!frame->total_size.is_constant()) + frame_total_size = frame->constant_offset; + else + frame_total_size = frame->total_size.to_constant(); + + if (SMALL_OPERAND (frame_total_size)) + return frame_total_size; HOST_WIDE_INT min_first_step = RISCV_STACK_ALIGN ((frame->total_size - frame->fp_sp_offset).to_constant()); HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8; - HOST_WIDE_INT min_second_step = frame->total_size.to_constant() - max_first_step; + HOST_WIDE_INT min_second_step = frame_total_size - max_first_step; gcc_assert (min_first_step <= max_first_step); /* As an optimization, use the least-significant bits of the total frame size, so that the second adjustment step is just LUI + ADD. */ if (!SMALL_OPERAND (min_second_step) - && frame->total_size.to_constant() % IMM_REACH < IMM_REACH / 2 - && frame->total_size.to_constant() % IMM_REACH >= min_first_step) - return frame->total_size.to_constant() % IMM_REACH; + && frame_total_size % IMM_REACH < IMM_REACH / 2 + && frame_total_size % IMM_REACH >= min_first_step) + return frame_total_size % IMM_REACH; if (TARGET_RVC) { @@ -4625,12 +4635,12 @@ void riscv_expand_prologue (void) { struct riscv_frame_info *frame = &cfun->machine->frame; - HOST_WIDE_INT size = frame->total_size.to_constant (); + poly_int64 size = frame->total_size; unsigned mask = frame->mask; rtx insn; if (flag_stack_usage_info) - current_function_static_stack_size = size; + current_function_static_stack_size = constant_lower_bound (size); if (cfun->machine->naked_p) return; @@ -4640,7 +4650,6 @@ riscv_expand_prologue (void) { rtx dwarf = NULL_RTX; dwarf = riscv_adjust_libcall_cfi_prologue (); - size -= frame->save_libcall_adjustment; insn = emit_insn (riscv_gen_gpr_save_insn (frame)); frame->mask = 0; /* Temporarily fib that we need not save GPRs. */ @@ -4652,11 +4661,14 @@ riscv_expand_prologue (void) /* Save the registers. */ if ((frame->mask | frame->fmask) != 0) { - HOST_WIDE_INT step1 = MIN (size, riscv_first_stack_step (frame)); + HOST_WIDE_INT step1 = riscv_first_stack_step (frame); + if (size.is_constant ()) + step1 = MIN (size.to_constant(), step1); + gcc_assert (SMALL_OPERAND (-step1)); insn = gen_add3_insn (stack_pointer_rtx, - stack_pointer_rtx, - GEN_INT (-step1)); + stack_pointer_rtx, + GEN_INT (-step1)); RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; size -= step1; riscv_for_each_saved_reg (size, riscv_save_reg, false, false); @@ -4667,34 +4679,56 @@ riscv_expand_prologue (void) /* Set up the frame pointer, if we're using one. */ if (frame_pointer_needed) { + poly_int64 offset = frame->hard_frame_pointer_offset - size; insn = gen_add3_insn (hard_frame_pointer_rtx, stack_pointer_rtx, - GEN_INT ((frame->hard_frame_pointer_offset - size).to_constant ())); + GEN_INT (offset.to_constant ())); RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; riscv_emit_stack_tie (); } /* Allocate the rest of the frame. 
*/ - if (size > 0) + if (known_gt (size, 0)) { - if (SMALL_OPERAND (-size)) - { - insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, - GEN_INT (-size)); - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; - } - else - { - riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (-size)); - emit_insn (gen_add3_insn (stack_pointer_rtx, - stack_pointer_rtx, - RISCV_PROLOGUE_TEMP (Pmode))); - - /* Describe the effect of the previous instructions. */ - insn = plus_constant (Pmode, stack_pointer_rtx, -size); - insn = gen_rtx_SET (stack_pointer_rtx, insn); - riscv_set_frame_expr (insn); - } + /* Two step adjustment, first for scalar frame, second for vector frame. */ + poly_int64 poly_offset (0, 0); + if (!size.is_constant ()) + { + HOST_WIDE_INT factor = size.coeffs[1]; + poly_offset.coeffs[0] = factor; + poly_offset.coeffs[1] = factor; + size -= poly_offset; + } + + /* First step for scalar frame. */ + HOST_WIDE_INT size_value = size.to_constant (); + if (size_value > 0) + { + if (SMALL_OPERAND (-size_value)) + { + insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, + GEN_INT (-size_value)); + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; + } + else + { + riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (size_value)); + emit_insn (gen_sub3_insn (stack_pointer_rtx, + stack_pointer_rtx, + RISCV_PROLOGUE_TEMP (Pmode))); + + /* Describe the effect of the previous instructions. */ + insn = plus_constant (Pmode, stack_pointer_rtx, -size_value); + insn = gen_rtx_SET (stack_pointer_rtx, insn); + riscv_set_frame_expr (insn); + } + } + + /* Second step for vector frame */ + if (known_gt (poly_offset, 0)) + { + rvv_adjust_frame (stack_pointer_rtx, poly_offset, false); + } } } @@ -4734,7 +4768,8 @@ riscv_expand_epilogue (int style) Start off by assuming that no registers need to be restored. */ struct riscv_frame_info *frame = &cfun->machine->frame; unsigned mask = frame->mask; - HOST_WIDE_INT step1 = frame->total_size.to_constant (); + poly_int64 step1 = frame->total_size; + poly_int64 restore_offset; /* For restore register */ HOST_WIDE_INT step2 = 0; bool use_restore_libcall = ((style == NORMAL_RETURN) && riscv_use_save_libcall (frame)); @@ -4742,8 +4777,8 @@ riscv_expand_epilogue (int style) rtx insn; /* We need to add memory barrier to prevent read from deallocated stack. */ - bool need_barrier_p = known_ne (get_frame_size (), - cfun->machine->frame.arg_pointer_offset); + bool need_barrier_p = known_ne (get_frame_size () + + cfun->machine->frame.arg_pointer_offset, 0); if (cfun->machine->naked_p) { @@ -4763,6 +4798,21 @@ riscv_expand_epilogue (int style) /* Reset the epilogue cfa info before starting to emit the epilogue. */ epilogue_cfa_sp_offset = 0; + if (use_restore_libcall) + { + step1 -= frame->save_libcall_adjustment; + frame->mask = 0; /* Temporarily fib that we need not save GPRs. */ + } + + /* If we need to restore registers, deallocate as much stack as + possible in the second step without going out of range. */ + if ((frame->mask | frame->fmask) != 0) + { + step2 = riscv_first_stack_step (frame); + step1 -= step2; + restore_offset = step1; + } + /* Move past any dynamic stack allocations. 
*/ if (cfun->calls_alloca) { @@ -4770,21 +4820,18 @@ riscv_expand_epilogue (int style) riscv_emit_stack_tie (); need_barrier_p = false; - rtx adjust = GEN_INT (-frame->hard_frame_pointer_offset.to_constant ()); - if (!SMALL_OPERAND (INTVAL (adjust))) - { - riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); - adjust = RISCV_PROLOGUE_TEMP (Pmode); - } - - insn = emit_insn ( - gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx, - adjust)); + gcc_assert (frame_pointer_needed); + poly_int64 offset = frame->hard_frame_pointer_offset - step1; + insn = emit_insn (gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx, + GEN_INT (-offset.to_constant ()))); + /* By using hard_frame_pointer_rtx, it can skip the adjust of step1 + and go directly to the position of step2 */ + step1 = 0; rtx dwarf = NULL_RTX; rtx cfa_adjust_value = gen_rtx_PLUS ( - Pmode, hard_frame_pointer_rtx, - GEN_INT (-frame->hard_frame_pointer_offset.to_constant ())); + Pmode, hard_frame_pointer_rtx, + GEN_INT (-offset.to_constant ())); rtx cfa_adjust_rtx = gen_rtx_SET (stack_pointer_rtx, cfa_adjust_value); dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, cfa_adjust_rtx, dwarf); RTX_FRAME_RELATED_P (insn) = 1; @@ -4792,14 +4839,6 @@ riscv_expand_epilogue (int style) REG_NOTES (insn) = dwarf; } - /* If we need to restore registers, deallocate as much stack as - possible in the second step without going out of range. */ - if ((frame->mask | frame->fmask) != 0) - { - step2 = riscv_first_stack_step (frame); - step1 -= step2; - } - /* Set TARGET to BASE + STEP1. */ if (known_gt (step1, 0)) { @@ -4807,25 +4846,38 @@ riscv_expand_epilogue (int style) riscv_emit_stack_tie (); need_barrier_p = false; - /* Get an rtx for STEP1 that we can add to BASE. */ - rtx adjust = GEN_INT (step1); - if (!SMALL_OPERAND (step1)) - { - riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); - adjust = RISCV_PROLOGUE_TEMP (Pmode); - } + /* First step for vector frame */ + if (!step1.is_constant ()) + { + HOST_WIDE_INT factor = step1.coeffs[1]; + poly_int64 poly_offset (factor, factor); + rvv_adjust_frame (stack_pointer_rtx, poly_offset, true); + step1 -= poly_offset; + } - insn = emit_insn ( - gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, adjust)); + /* Second step for scalar frame. */ + HOST_WIDE_INT scalar_step1 = step1.to_constant (); + if (scalar_step1 > 0) + { + rtx adjust = GEN_INT (scalar_step1); + if (!SMALL_OPERAND (scalar_step1)) + { + riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); + adjust = RISCV_PROLOGUE_TEMP (Pmode); + } - rtx dwarf = NULL_RTX; - rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx, - GEN_INT (step2)); + insn = emit_insn ( + gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, adjust)); - dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf); - RTX_FRAME_RELATED_P (insn) = 1; + rtx dwarf = NULL_RTX; + rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx, + GEN_INT (step2)); - REG_NOTES (insn) = dwarf; + dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf); + RTX_FRAME_RELATED_P (insn) = 1; + + REG_NOTES (insn) = dwarf; + } } else if (frame_pointer_needed) { @@ -4834,18 +4886,11 @@ riscv_expand_epilogue (int style) epilogue_cfa_sp_offset = step2; } - if (use_restore_libcall) - frame->mask = 0; /* Temporarily fib that we need not save GPRs. */ - /* Restore the registers. 
*/ - riscv_for_each_saved_reg (frame->total_size - step2, riscv_restore_reg, - true, style == EXCEPTION_RETURN); - - if (use_restore_libcall) + if ((frame->mask | frame->fmask) != 0) { - frame->mask = mask; /* Undo the above fib. */ - gcc_assert (step2 >= frame->save_libcall_adjustment); - step2 -= frame->save_libcall_adjustment; + riscv_for_each_saved_reg (restore_offset, riscv_restore_reg, + true, style == EXCEPTION_RETURN); } if (need_barrier_p) @@ -4868,6 +4913,7 @@ riscv_expand_epilogue (int style) if (use_restore_libcall) { + frame->mask = mask; /* Undo the above fib. */ rtx dwarf = riscv_adjust_libcall_cfi_epilogue (); insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask)))); RTX_FRAME_RELATED_P (insn) = 1; @@ -6118,6 +6164,67 @@ riscv_regmode_natural_size (machine_mode mode) return UNITS_PER_WORD; } +/* Implement the TARGET_DWARF_POLY_INDETERMINATE_VALUE hook. */ + +static unsigned int +riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor, + int *offset) +{ + /* Polynomial invariant 1 == (VLENB / 8) - 1. */ + gcc_assert (i == 1); + *factor = 8; + *offset = 1; + return RISCV_DWARF_VLENB; +} + +/* Implement TARGET_ESTIMATED_POLY_VALUE. + Look into the tuning structure for an estimate. + KIND specifies the type of requested estimate: min, max or likely. + For cores with a known RVV width all three estimates are the same. + For generic RVV tuning we want to distinguish the maximum estimate from + the minimum and likely ones. + The likely estimate is the same as the minimum in that case to give a + conservative behavior of auto-vectorizing with RVV when it is a win + even for 128-bit RVV. + When RVV width information is available VAL.coeffs[1] is multiplied by + the number of VQ chunks over the initial Advanced SIMD 128 bits. */ + +static HOST_WIDE_INT +riscv_estimated_poly_value (poly_int64 val, + poly_value_estimate_kind kind = POLY_VALUE_LIKELY) +{ + unsigned int width_source = + BITS_PER_RISCV_VECTOR.is_constant () + ? (unsigned int)BITS_PER_RISCV_VECTOR.to_constant () + : (unsigned int)RVV_SCALABLE; + + /* If there is no core-specific information then the minimum and likely + values are based on 128-bit vectors and the maximum is based on + the architectural maximum of 2048 bits. */ + if (width_source == RVV_SCALABLE) + switch (kind) + { + case POLY_VALUE_MIN: + case POLY_VALUE_LIKELY: + return val.coeffs[0]; + + case POLY_VALUE_MAX: + return val.coeffs[0] + val.coeffs[1] * 15; + } + + /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the + lowest as likely. This could be made more general if future -mtune + options need it to be. */ + if (kind == POLY_VALUE_MAX) + width_source = 1 << floor_log2 (width_source); + else + width_source = least_bit_hwi (width_source); + + /* If the core provides width information, use that. */ + HOST_WIDE_INT over_128 = width_source - 128; + return val.coeffs[0] + val.coeffs[1] * over_128 / 128; +} + /* Initialize the GCC target structure. 
*/ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" @@ -6336,6 +6443,12 @@ riscv_regmode_natural_size (machine_mode mode) #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE riscv_mangle_type +#undef TARGET_DWARF_POLY_INDETERMINATE_VALUE +#define TARGET_DWARF_POLY_INDETERMINATE_VALUE riscv_dwarf_poly_indeterminate_value + +#undef TARGET_ESTIMATED_POLY_VALUE +#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-riscv.h" diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index 5de745bc824..03eb92900be 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -396,6 +396,8 @@ ASM_MISA_SPEC #define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST) #define RISCV_PROLOGUE_TEMP(MODE) gen_rtx_REG (MODE, RISCV_PROLOGUE_TEMP_REGNUM) +#define RISCV_PROLOGUE_TEMP2_REGNUM (GP_TEMP_FIRST + 1) +#define RISCV_PROLOGUE_TEMP2(MODE) gen_rtx_REG (MODE, RISCV_PROLOGUE_TEMP2_REGNUM) #define RISCV_CALL_ADDRESS_TEMP_REGNUM (GP_TEMP_FIRST + 1) #define RISCV_CALL_ADDRESS_TEMP(MODE) \ @@ -1085,4 +1087,6 @@ extern void riscv_remove_unneeded_save_restore_calls (void); #define REGMODE_NATURAL_SIZE(MODE) riscv_regmode_natural_size (MODE) +#define RISCV_DWARF_VLENB (4096 + 0xc22) + #endif /* ! GCC_RISCV_H */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp b/gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp new file mode 100644 index 00000000000..9a558f32ed0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp @@ -0,0 +1,47 @@ +# Copyright (C) 2022-2022 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't a RISC-V target. +if ![istarget riscv*-*-*] then { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +set gcc_march_list [list "-march=rv64gcv" "-march=rv64gv"] +if [istarget riscv32-*-*] then { + set gcc_march_list [list "-march=rv32gcv" "-march=rv32gv"] +} + +# If a testcase doesn't have special options, use these. +global DEFAULT_CFLAGS +if ![info exists DEFAULT_CFLAGS] then { + set DEFAULT_CFLAGS " -ansi -pedantic-errors" +} + +# Initialize `dg'. +dg-init + +# Main loop. +foreach march $gcc_march_list { + gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \ + $march $DEFAULT_CFLAGS +} +# All done. 
+dg-finish diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c new file mode 100644 index 00000000000..58c8e6de603 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c @@ -0,0 +1,53 @@ + +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void f2 (char*); +void f3 (char*, ...); + +/* +** stach_check_alloca_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-48 +** sw ra,12\(sp\) +** sw s0,8\(sp\) +** addi s0,sp,16 +** ... +** addi a0,a0,23 +** andi a0,a0,-16 +** sub sp,sp,a0 +** ... +** addi sp,s0,-16 +** lw ra,12\(sp\) +** lw s0,8\(sp\) +** addi sp,sp,48 +** jr ra +*/ +void stach_check_alloca_1 (int y, ...) +{ + char* pStr = (char*)__builtin_alloca(y); + f2(pStr); +} + +/* +** stach_check_alloca_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-48 +** sw ra,44\(sp\) +** sw s0,40\(sp\) +** addi s0,sp,48 +** addi a0,a0,23 +** andi a0,a0,-16 +** sub sp,sp,a0 +** ... +** addi sp,s0,-48 +** lw ra,44\(sp\) +** lw s0,40\(sp\) +** addi sp,sp,48 +** jr ra +*/ +void stach_check_alloca_2 (int y) +{ + char* pStr = (char*)__builtin_alloca(y); + f3(pStr, pStr, pStr, pStr, pStr, pStr, pStr, pStr, 2, pStr, pStr, pStr, 1); +} + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c new file mode 100644 index 00000000000..12c05b337cb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c @@ -0,0 +1,45 @@ + +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +void f (char*); + +/* +** stach_check_alloca_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-48 +** sw ra,12\(sp\) +** sw s0,8\(sp\) +** addi s0,sp,16 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** addi a1,a1,23 +** andi a1,a1,-16 +** sub sp,sp,a1 +** ... +** addi sp,s0,-16 +** lw ra,12\(sp\) +** lw s0,8\(sp\) +** addi sp,sp,48 +** jr ra +*/ +void stach_check_alloca_1 (vuint8m1_t data, uint8_t *base, int y, ...) 
+{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m1_t *)base = data; + char* pStr = (char*)__builtin_alloca(y); + f(pStr); +} + + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c new file mode 100644 index 00000000000..72179791677 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c @@ -0,0 +1,48 @@ + +/* { dg-do compile } */ +/* { dg-options "-msave-restore -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + + +void fn2 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8); +void fn3 (char*); + + +/* +** stack_save_restore_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** call t0,__riscv_save_0 +** addi sp,sp,-32 +** fs(w|d) fs0,24\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** ... +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** fl(w|d) fs0,24\(sp\) +** addi sp,sp,32 +** tail __riscv_restore_0 +*/ +int stack_save_restore_1 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8) +{ + char d[8000]; + float f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13; + asm volatile ("nop" + : "=f" (f1), "=f" (f2), "=f" (f3), "=f" (f4), "=f" (f5), "=f" (f6), + "=f" (f7), "=f" (f8), "=f" (f9), "=f" (f10), "=f" (f11), + "=f" (f12), "=f" (f13) + : + :); + asm volatile ("nop" + : + : "f" (f1), "f" (f2), "f" (f3), "f" (f4), "f" (f5), "f" (f6), + "f" (f7), "f" (f8), "f" (f9), "f" (f10), "f" (f11), + "f" (f12), "f" (f13) + :); + fn2 (a1, a2, a3, a4, a5, a6, a7, a8); + fn3(d); + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c new file mode 100644 index 00000000000..694ce4669c0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c @@ -0,0 +1,62 @@ + +/* { dg-do compile } */ +/* { dg-options "-msave-restore -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include + +void fn2 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8); +void fn3 (char*); + +/* +** stack_save_restore_2: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** call t0,__riscv_save_0 +** addi sp,sp,-32 +** fs(w|d) fs0,24\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** fl(w|d) fs0,24\(sp\) +** addi sp,sp,32 +** tail __riscv_restore_0 +*/ +int stack_save_restore_2 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8, + vuint8m1_t data, uint8_t *base) +{ + char d[8000]; + float f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13; + asm volatile ("nop" + : "=f" (f1), "=f" (f2), "=f" (f3), "=f" (f4), "=f" (f5), "=f" (f6), + "=f" (f7), "=f" (f8), "=f" (f9), "=f" (f10), "=f" (f11), + "=f" (f12), "=f" (f13) + : + :); + asm volatile ("nop" + : + : "f" (f1), "f" (f2), "f" (f3), "f" (f4), "f" (f5), "f" (f6), + "f" (f7), "f" (f8), "f" (f9), "f" (f10), "f" (f11), + "f" (f12), "f" (f13) + :); + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m1_t *)base = data; + fn2 (a1, a2, a3, a4, a5, a6, a7, a8); + fn3(d); + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c new file mode 100644 index 00000000000..4400470b650 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c @@ -0,0 +1,205 @@ +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +void f (uint8_t *); + +/* GPR: 16, local: 16, total: 32 +** stack_offset1_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** addi a0,sp,4 +** call f +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +void stack_offset1_1 () +{ + uint8_t local[10]; + f(local); +} + +/* GPR: 16, local: 2016, total: 2032 +** stack_offset1_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** mv a0,sp +** call f +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset1_2 () +{ + uint8_t local[2016]; + f(local); +} + +/* GPR: 16, local: 6000, total: 6016 +** stack_offset2_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-1920 +** sw ra,1916\(sp\) +** li t0,4096 +** sub sp,sp,t0 +** li a0,-4096 +** addi a0,a0,-1904 +** li a5,4096 +** addi a5,a5,1904 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** li t0,4096 +** add sp,sp,t0 +** lw ra,1916\(sp\) +** addi sp,sp,1920 +** jr ra +*/ +void stack_offset2_1 () +{ + uint8_t local[6000]; + f(local); +} + +/* GPR: 16, local: 2032, total: 2048 +** stack_offset3_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** addi sp,sp,-2016 +** addi a0,sp,12 +** call f +** addi sp,sp,2016 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset3_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-16 +** addi a0,sp,12 +** call f +** addi sp,sp,16 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_1 () +{ + uint8_t local[2017]; + f(local); +} + +/* GPR: 16, local: 2112, total: 2128 +** stack_offset3_2: { target { { 
any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-96 +** sw ra,92\(sp\) +** addi sp,sp,-2032 +** li a0,-4096 +** addi a0,a0,1996 +** li a5,4096 +** addi a5,a5,-1984 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** addi sp,sp,2032 +** lw ra,92\(sp\) +** addi sp,sp,96 +** jr ra +*/ +/* +** stack_offset3_2: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-96 +** li a0,-4096 +** addi a0,a0,1996 +** li a5,4096 +** addi a5,a5,-1984 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** addi sp,sp,96 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_2 () +{ + uint8_t local[2100]; + f(local); +} + +/* GPR: 16, local: 8000, total: 8016 +** stack_offset4_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** li a0,-8192 +** addi a0,a0,192 +** li a5,8192 +** addi a5,a5,-192 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset4_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** li t0,4096 +** addi t0,t0,1888 +** sub sp,sp,t0 +** li a0,-8192 +** addi a0,a0,192 +** li a5,8192 +** addi a5,a5,-192 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** li t0,4096 +** addi t0,t0,1888 +** add sp,sp,t0 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset4_1 () +{ + uint8_t local[8000]; + f(local); +} + +/* GPR: 16, local: 3056, total: 3072 +** stack_offset5_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-1040 +** li a0,-4096 +** addi a0,a0,1048 +** li a5,4096 +** addi a5,a5,-1040 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** addi sp,sp,1040 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset5_1 () +{ + uint8_t local[3048]; + f(local); +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c new file mode 100644 index 00000000000..ffc90a02f65 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options "-std=c11" } */ + +#include +#include + +int va_sum(int a, int count, ...) 
+{ + va_list ap; + va_start(ap, count); + for (int i = count; i > 0; i--) + { + int arg = va_arg(ap, int); + a = a + arg; + } + va_end(ap); + return a; +} + +int main() +{ + int sum = 0; + int a = 1; + int b = 2; + int c = 3; + int d = 4; + sum = va_sum(sum, 4, a, b, c, d); + + if (sum != 10) + { + abort(); + } +} \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c new file mode 100644 index 00000000000..afca87532ae --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c @@ -0,0 +1,277 @@ +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include + +void f (uint8_t *); +void f2 (vuint8m1_t); + +/* GPR: 16, local: 16+vlenb, total: 32+vlenb +** stack_offset1_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +void stack_offset1_1 (vuint8m1_t data) +{ + uint8_t local[10]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 2016+vlenb, total: 2032+vlenb +** stack_offset1_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset1_2 (vuint8m1_t data) +{ + uint8_t local[2016]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 6000+vlenb, total: 6016+vlenb +** stack_offset2_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-1920 +** sw ra,1916\(sp\) +** li t0,4096 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** li t0,4096 +** add sp,sp,t0 +** lw ra,1916\(sp\) +** addi sp,sp,1920 +** jr ra +*/ +void stack_offset2_1 (vuint8m1_t data) +{ + uint8_t local[6000]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 2032+vlenb, total: 2048+vlenb +** stack_offset3_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** addi sp,sp,-2016 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,2016 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset3_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-16 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,16 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_1 (vuint8m1_t data) +{ + uint8_t local[2017]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 2112+vlenb, total: 2128+vlenb +** stack_offset3_2: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-96 +** sw ra,92\(sp\) +** addi sp,sp,-2032 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,2032 +** lw ra,92\(sp\) +** addi sp,sp,96 +** jr ra +*/ +/* +** stack_offset3_2: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-96 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,96 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_2 (vuint8m1_t data) +{ + uint8_t local[2100]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 8000+vlenb, total: 8016+vlenb +** stack_offset4_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset4_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** li t0,4096 +** addi t0,t0,1888 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** li t0,4096 +** addi t0,t0,1888 +** add sp,sp,t0 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset4_1 (vuint8m1_t data) +{ + uint8_t local[8000]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 3056+vlenb, total: 3072+vlenb +** stack_offset5_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-1040 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,1040 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset5_1 (vuint8m1_t data) +{ + uint8_t local[3048]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c new file mode 100644 index 00000000000..7938fa6261c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c @@ -0,0 +1,141 @@ +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include + +void f (uint8_t *); +void f2 (vuint8m1_t); + +/* 1*vlenb +** stack_offset1_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** jr ra +*/ +void stack_offset1_1 (vuint8m1_t data, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m1_t *)base = data; +} + +/* 8*vlenb +** stack_offset1_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** slli t1,t0,3 +** sub sp,sp,t1 +** ... +** csrr t0,vlenb +** slli t1,t0,3 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_2 (vuint8m8_t data, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m8_t *)base = data; +} + +/* 3*vlenb +** stack_offset1_3: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** slli t1,t0,2 +** sub t1,t1,t0 +** sub sp,sp,t1 +** ... +** csrr t0,vlenb +** slli t1,t0,2 +** sub t1,t1,t0 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_3 (vuint8m1_t data, vuint8m2_t data2, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m2_t *)base = data2; + *(vuint8m1_t *)base = data; +} + +/* 9*vlenb +** stack_offset1_4: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** slli t1,t0,3 +** add t1,t1,t0 +** sub sp,sp,t1 +** ... 
+** csrr t0,vlenb +** slli t1,t0,3 +** add t1,t1,t0 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_4 (vuint8m1_t data, vuint8m8_t data2, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m8_t *)base = data2; + *(vuint8m1_t *)base = data; +} + +/* 10*vlenb +** stack_offset1_5: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** li t1,10 +** mul t1,t1,t0 +** sub sp,sp,t1 +** ... +** csrr t0,vlenb +** li t1,10 +** mul t1,t1,t0 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_5 (vuint8m2_t data, vuint8m8_t data2, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m8_t *)base = data2; + *(vuint8m2_t *)base = data; +} From patchwork Tue May 31 08:50:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54556 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 84FEE3857BBC for ; Tue, 31 May 2022 09:00:33 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg151.qq.com (smtpbg151.qq.com [18.169.211.239]) by sourceware.org (Postfix) with ESMTPS id 223A538356AE for ; Tue, 31 May 2022 08:51:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 223A538356AE Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987072tmoq2ff5 Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:51:12 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: FVwG7769hyjCyxEx39OXN/eUbGdrXBH3cTtO6iGIyC2j/ObY7JEJV90sfRj6T mTWQb2rSibOzvUFunteRJEe1N2+7VOT5VRw4l038vfA6Q/ux6UxEg3O7gCA5Sm7pEh8wR+I 2VrGgqqgrQYcSwNsTIC/Fikkz9m4y3Y1UUo5FeIOMPty1PheW9r43HRsOwPaMQT09n8frNN bwoZsyf/F+ZjuXH+IPOIbAitOK+21b/iSC28OxNZ7ruqplh/ue6CM71EKvSWGW5o3zmJVO3 wdTMnPr/RsVCRWR/Ppcel+tp4P/kjELYnZxtHCauGPdaEooOMkkJ8byAQEmSFyBLrB2+EhJ GqJgXAlnTyn4cEmTpvI7w75ypgYjWKvUIp7kfTySVSPRVt4lfY= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 15/21] Add integer intrinsics Date: Tue, 31 May 2022 16:50:06 +0800 Message-Id: <20220531085012.269719-16-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign4 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE, URIBL_BLACK autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , 
gcc/ChangeLog:

        * config/riscv/riscv-protos.h: New functions.
        * config/riscv/riscv-vector-builtins-functions.cc
        (basic_alu::assemble_name): New function.
        (basic_alu::get_return_type): New function.
        (unop::get_argument_types): New function.
        (unop::can_be_overloaded_p): New function.
        (binop::get_argument_types): New function.
        (binop::can_be_overloaded_p): New function.
        (wbinop::assemble_name): New function.
        (ternop::get_argument_types): New function.
        (ternop::can_be_overloaded_p): New function.
        (vadd::expand): New function.
        (vsub::expand): New function.
        (vrsub::expand): New function.
        (vneg::expand): New function.
        (vwadd::expand): New function.
        (vwsub::expand): New function.
        (vwaddu::expand): New function.
        (vwsubu::expand): New function.
        (vwcvt::expand): New function.
        (vwcvtu::expand): New function.
        (vsext::assemble_name): New function.
        (vsext::expand): New function.
        (vzext::assemble_name): New function.
        (vzext::expand): New function.
        (vadc::expand): New function.
        (vsbc::expand): New function.
        (vmadc::assemble_name): New function.
        (vmadc::expand): New function.
        (vmsbc::assemble_name): New function.
        (vmsbc::expand): New function.
        (vand::expand): New function.
        (vor::expand): New function.
        (vxor::expand): New function.
        (vnot::expand): New function.
        (vshift::get_argument_types): New function.
        (vsll::expand): New function.
        (vsrl::expand): New function.
        (vsra::expand): New function.
        (vnsrl::expand): New function.
        (vnsra::expand): New function.
        (vncvt::expand): New function.
        (vcmp::assemble_name): New function.
        (vmseq::expand): New function.
        (vmsne::expand): New function.
        (vmslt::expand): New function.
        (vmsltu::expand): New function.
        (vmsgt::expand): New function.
        (vmsgtu::expand): New function.
        (vmsle::expand): New function.
        (vmsleu::expand): New function.
        (vmsge::expand): New function.
        (vmsgeu::expand): New function.
        (vmin::expand): New function.
        (vminu::expand): New function.
        (vmax::expand): New function.
        (vmaxu::expand): New function.
        (vmul::expand): New function.
        (vmulh::expand): New function.
        (vmulhu::expand): New function.
        (vmulhsu::expand): New function.
        (vdiv::expand): New function.
        (vdivu::expand): New function.
        (vrem::expand): New function.
        (vremu::expand): New function.
        (vwmul::expand): New function.
        (vwmulu::expand): New function.
        (vwmulsu::expand): New function.
        (vmacc::expand): New function.
        (vnmsac::expand): New function.
        (vmadd::expand): New function.
        (vnmsub::expand): New function.
        (vwmacc::expand): New function.
        (vwmaccu::expand): New function.
        (vwmaccsu::expand): New function.
        (vwmaccus::expand): New function.
        (vmerge::get_position_of_dest_arg): New function.
        (vmerge::expand): New function.
        (vmv::get_argument_types): New function.
        (vmv::can_be_overloaded_p): New function.
        (vmv::expand): New function.
        * config/riscv/riscv-vector-builtins-functions.def
        (vadd): New macro define.
        (vsub): New macro define.
        (vrsub): New macro define.
        (vneg): New macro define.
        (vwadd): New macro define.
        (vwsub): New macro define.
        (vwaddu): New macro define.
        (vwsubu): New macro define.
        (vwcvt): New macro define.
        (vwcvtu): New macro define.
        (vsext): New macro define.
        (vzext): New macro define.
        (vadc): New macro define.
        (vsbc): New macro define.
        (vmadc): New macro define.
        (vmsbc): New macro define.
        (vand): New macro define.
        (vor): New macro define.
        (vxor): New macro define.
        (vnot): New macro define.
(vsll): New macro define. (vsrl): New macro define. (vsra): New macro define. (vnsrl): New macro define. (vnsra): New macro define. (vncvt): New macro define. (vmseq): New macro define. (vmsne): New macro define. (vmslt): New macro define. (vmsltu): New macro define. (vmsgt): New macro define. (vmsgtu): New macro define. (vmsle): New macro define. (vmsleu): New macro define. (vmsge): New macro define. (vmsgeu): New macro define. (vmin): New macro define. (vminu): New macro define. (vmax): New macro define. (vmaxu): New macro define. (vmul): New macro define. (vmulh): New macro define. (vmulhu): New macro define. (vmulhsu): New macro define. (vdiv): New macro define. (vdivu): New macro define. (vrem): New macro define. (vremu): New macro define. (vwmul): New macro define. (vwmulu): New macro define. (vwmulsu): New macro define. (vmacc): New macro define. (vnmsac): New macro define. (vmadd): New macro define. (vnmsub): New macro define. (vwmacc): New macro define. (vwmaccu): New macro define. (vwmaccsu): New macro define. (vwmaccus): New macro define. (vmerge): New macro define. (vmv): New macro define. * config/riscv/riscv-vector-builtins-functions.h (class basic_alu): New class. (class unop): New class. (class binop): New class. (class wbinop): New class. (class ternop): New class. (class vadd): New class. (class vsub): New class. (class vrsub): New class. (class vneg): New class. (class vwadd): New class. (class vwsub): New class. (class vwaddu): New class. (class vwsubu): New class. (class vwcvt): New class. (class vwcvtu): New class. (class vsext): New class. (class vzext): New class. (class vadc): New class. (class vsbc): New class. (class vmadc): New class. (class vmsbc): New class. (class vand): New class. (class vor): New class. (class vxor): New class. (class vnot): New class. (class vshift): New class. (class vsll): New class. (class vsrl): New class. (class vsra): New class. (class vnsrl): New class. (class vnsra): New class. (class vncvt): New class. (class vcmp): New class. (class vmseq): New class. (class vmsne): New class. (class vmslt): New class. (class vmsltu): New class. (class vmsgt): New class. (class vmsgtu): New class. (class vmsle): New class. (class vmsleu): New class. (class vmsge): New class. (class vmsgeu): New class. (class vmin): New class. (class vminu): New class. (class vmax): New class. (class vmaxu): New class. (class vmul): New class. (class vmulh): New class. (class vmulhu): New class. (class vmulhsu): New class. (class vdiv): New class. (class vdivu): New class. (class vrem): New class. (class vremu): New class. (class vwmul): New class. (class vwmulu): New class. (class vwmulsu): New class. (class vmacc): New class. (class vnmsac): New class. (class vmadd): New class. (class vnmsub): New class. (class vwmacc): New class. (class vwmaccu): New class. (class vwmaccsu): New class. (class vwmaccus): New class. (class vmerge): New class. (class vmv): New class. * config/riscv/riscv-vector.cc (modify_operands): Fix for UNSPEC_VSSUBU. (emit_op6): New function. (emit_op7): New function. * config/riscv/riscv.cc (riscv_print_operand): Add %v and %V assembly support. (riscv_modes_tieable_p): Adjust for RVV modes. * config/riscv/riscv.md: Add more code support. * config/riscv/vector-iterators.md: Fix iterators. * config/riscv/vector.md (@v_vx): New pattern. (@v_vxm): New pattern. (@vms_vx): New pattern. (@vadd_vv): New pattern. (@vsub_vv): New pattern. (@vadd_vx_internal): New pattern. (@vadd_vx_32bit): New pattern. (@vsub_vx_internal): New pattern. 
(@vsub_vx_32bit): New pattern. (@vrsub_vx_internal): New pattern. (@vrsub_vx_32bit): New pattern. (@vneg_v): New pattern. (@vw_vv): New pattern. (@vw_vx): New pattern. (@vw_wv): New pattern. (@vw_wx): New pattern. (@vwcvt_x_x_v): New pattern. (@vext_vf2): New pattern. (@vext_vf4): New pattern. (@vext_vf8): New pattern. (@vadc_vvm): New pattern. (@vsbc_vvm): New pattern. (@vadc_vxm_internal): New pattern. (@vadc_vxm_32bit): New pattern. (@vsbc_vxm_internal): New pattern. (@vsbc_vxm_32bit): New pattern. (@vmadc_vvm): New pattern. (@vmsbc_vvm): New pattern. (@vmadc_vxm_internal): New pattern. (@vmadc_vxm_32bit): New pattern. (@vmsbc_vxm_internal): New pattern. (@vmsbc_vxm_32bit): New pattern. (@vmadc_vv): New pattern. (@vmsbc_vv): New pattern. (@vmadc_vx_internal): New pattern. (@vmadc_vx_32bit): New pattern. (@vmsbc_vx_internal): New pattern. (@vmsbc_vx_32bit): New pattern. (@v_vv): New pattern. (@v_vx_internal): New pattern. (@v_vx_32bit): New pattern. (@vnot_v): New pattern. (@v_vx): New pattern. (@vn_wv): New pattern. (@vn_wx): New pattern. (@vncvt_x_x_w): New pattern. (@vms_vv): New pattern. (@vms_vx_internal): New pattern. (@vms_vx_32bit): New pattern. (*vms_vx): New pattern. (*vms_vx_32bit): New pattern. (@vmul_vv): New pattern. (@vmul_vx_internal): New pattern. (@vmul_vx_32bit): New pattern. (@vmulh_vv): New pattern. (@vmulh_vx_internal): New pattern. (@vmulh_vx_32bit): New pattern. (@vmulhsu_vv): New pattern. (@vmulhsu_vx_internal): New pattern. (@vmulhsu_vx_32bit): New pattern. (@vwmul_vv): New pattern. (@vwmul_vx): New pattern. (@vwmulsu_vv): New pattern. (@vwmulsu_vx): New pattern. (@v_vv): New pattern. (@v_vx_internal): New pattern. (@v_vx_32bit): New pattern. (@vwmacc_vv): New pattern. (@vwmacc_vx): New pattern. (@vwmaccsu_vv): New pattern. (@vwmaccsu_vx): New pattern. (@vwmaccus_vx): New pattern. (@vmerge_vvm): New pattern. (@vmerge_vxm_internal): New pattern. (@vmerge_vxm_32bit): New pattern. (@vmv_v_v): New pattern. --- gcc/config/riscv/riscv-protos.h | 24 + .../riscv/riscv-vector-builtins-functions.cc | 991 +++++++ .../riscv/riscv-vector-builtins-functions.def | 98 + .../riscv/riscv-vector-builtins-functions.h | 701 +++++ gcc/config/riscv/riscv-vector.cc | 41 +- gcc/config/riscv/riscv.cc | 53 + gcc/config/riscv/riscv.md | 62 +- gcc/config/riscv/vector-iterators.md | 3 - gcc/config/riscv/vector.md | 2575 ++++++++++++++++- 9 files changed, 4502 insertions(+), 46 deletions(-) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index fd8906e47de..c47ab8e079d 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -145,6 +145,30 @@ emit_op5 ( bool (*imm_p) (rtx), int i, bool reverse ); +extern void +emit_op6 ( + unsigned int unspec, + machine_mode Vmode, machine_mode VSImode, machine_mode VMSImode, + machine_mode VSUBmode, + rtx *operands, + rtx (*gen_vx) (rtx, rtx, rtx, rtx, rtx, rtx), + rtx (*gen_vx_32bit) (rtx, rtx, rtx, rtx, rtx, rtx), + rtx (*gen_vv) (rtx, rtx, rtx, rtx, rtx, rtx), + bool (*imm_p) (rtx), + int i, bool reverse +); +extern void +emit_op7 ( + unsigned int unspec, + machine_mode Vmode, machine_mode VSImode, machine_mode VMSImode, + machine_mode VSUBmode, + rtx *operands, + rtx (*gen_vx) (rtx, rtx, rtx, rtx, rtx, rtx, rtx), + rtx (*gen_vx_32bit) (rtx, rtx, rtx, rtx, rtx, rtx, rtx), + rtx (*gen_vv) (rtx, rtx, rtx, rtx, rtx, rtx, rtx), + bool (*imm_p) (rtx), + int i, bool reverse +); /* We classify builtin types into two classes: 1. 
General builtin class which is using the diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.cc b/gcc/config/riscv/riscv-vector-builtins-functions.cc index 0726465f146..6e0fd0b3570 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.cc +++ b/gcc/config/riscv/riscv-vector-builtins-functions.cc @@ -2072,6 +2072,997 @@ vleff::expand (const function_instance &instance, tree exp, rtx target) const return expand_builtin_insn (icode, exp, target, instance); } +/* A function implementation for basic_alu functions. */ +char * +basic_alu::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + if (this->can_be_overloaded_p (instance)) + { + append_name (instance.get_base_name ()); + if (instance.get_operation () == OP_v_x || + instance.get_operation () == OP_v_v || + instance.get_operation () == OP_v_f) + append_name ("_v"); + if (instance.get_operation () == OP_x_x_v || + instance.get_operation () == OP_x_x_w) + append_name ("_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); + } + return nullptr; +} + +tree +basic_alu::get_return_type (const function_instance &instance) const +{ + return get_dt_t_with_index (instance, 0); +} + +/* A function implementation for unary functions. */ +void +unop::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + argument_types.quick_push (get_dt_t_with_index (instance, 1)); +} + +bool +unop::can_be_overloaded_p (const function_instance &instance) const +{ + if (instance.get_pred () == PRED_none) + return false; + + return true; +} + +/* A function implementation for binary functions. */ +void +binop::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + for (unsigned int i = 1; i < instance.get_arg_pattern ().arg_len; i++) + { + if (i == 2 && vector_scalar_operation_p (instance.get_operation ())) + { + machine_mode mode = GET_MODE_INNER (instance.get_arg_pattern ().arg_list[i]); + bool unsigned_p = is_dt_unsigned (instance.get_data_type_list ()[i]); + argument_types.quick_push (get_dt_t (mode, unsigned_p)); + } + else + argument_types.quick_push (get_dt_t_with_index (instance, i)); + } +} + +bool +binop::can_be_overloaded_p (const function_instance &) const +{ + return true; +} + +/* A function implementation for widen binary functions. */ +char * +wbinop::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name (instance.get_base_name ()); + append_name (get_operation_str (instance.get_operation ())); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +/* A function implementation for ternary functions. */ +void +ternop::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + if (vector_scalar_operation_p (instance.get_operation ())) + { + machine_mode mode = GET_MODE_INNER (instance.get_arg_pattern ().arg_list[1]); + bool unsigned_p = is_dt_unsigned (instance.get_data_type_list ()[1]); + argument_types.quick_push (get_dt_t (mode, unsigned_p)); + } + else + argument_types.quick_push (get_dt_t_with_index (instance, 1)); + for (unsigned int i = 2; i < instance.get_arg_pattern ().arg_len; i++) + argument_types.quick_push (get_dt_t_with_index (instance, i)); +} + +bool +ternop::can_be_overloaded_p (const function_instance &) const +{ + return true; +} + +/* A function implementation for vadd functions. 
*/ +rtx +vadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vadd_vv (mode); + else + icode = code_for_v_vx (UNSPEC_VADD, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +rtx +vsub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vsub_vv (mode); + else + icode = code_for_v_vx (UNSPEC_VSUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vrsub functions. */ +rtx +vrsub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_v_vx (UNSPEC_VRSUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vneg functions. */ +rtx +vneg::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vneg_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwadd functions. */ +rtx +vwadd::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vw_vv (PLUS, SIGN_EXTEND, mode); + else if (instance.get_operation () == OP_vx) + icode = code_for_vw_vx (PLUS, SIGN_EXTEND, mode); + else if (instance.get_operation () == OP_wv) + icode = code_for_vw_wv (PLUS, SIGN_EXTEND, mode); + else + icode = code_for_vw_wx (PLUS, SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwsub functions. */ +rtx +vwsub::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vw_vv (MINUS, SIGN_EXTEND, mode); + else if (instance.get_operation () == OP_vx) + icode = code_for_vw_vx (MINUS, SIGN_EXTEND, mode); + else if (instance.get_operation () == OP_wv) + icode = code_for_vw_wv (MINUS, SIGN_EXTEND, mode); + else + icode = code_for_vw_wx (MINUS, SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwaddu functions. */ +rtx +vwaddu::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vw_vv (PLUS, ZERO_EXTEND, mode); + else if (instance.get_operation () == OP_vx) + icode = code_for_vw_vx (PLUS, ZERO_EXTEND, mode); + else if (instance.get_operation () == OP_wv) + icode = code_for_vw_wv (PLUS, ZERO_EXTEND, mode); + else + icode = code_for_vw_wx (PLUS, ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwsubu functions. 
*/ +rtx +vwsubu::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vw_vv (MINUS, ZERO_EXTEND, mode); + else if (instance.get_operation () == OP_vx) + icode = code_for_vw_vx (MINUS, ZERO_EXTEND, mode); + else if (instance.get_operation () == OP_wv) + icode = code_for_vw_wv (MINUS, ZERO_EXTEND, mode); + else + icode = code_for_vw_wx (MINUS, ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwcvt functions. */ +rtx +vwcvt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vwcvt_x_x_v (SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwcvtu functions. */ +rtx +vwcvtu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vwcvt_x_x_v (ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vsext functions. */ +char * +vsext::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name (instance.get_base_name ()); + append_name (get_operation_str (instance.get_operation ())); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vsext::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vf2) + icode = code_for_vext_vf2 (SIGN_EXTEND, mode); + else if (instance.get_operation () == OP_vf4) + icode = code_for_vext_vf4 (SIGN_EXTEND, mode); + else + icode = code_for_vext_vf8 (SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vzext functions. */ +char * +vzext::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name (instance.get_base_name ()); + append_name (get_operation_str (instance.get_operation ())); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vzext::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vf2) + icode = code_for_vext_vf2 (ZERO_EXTEND, mode); + else if (instance.get_operation () == OP_vf4) + icode = code_for_vext_vf4 (ZERO_EXTEND, mode); + else + icode = code_for_vext_vf8 (ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vadc functions. */ +rtx +vadc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vvm) + icode = code_for_vadc_vvm (mode); + else + icode = code_for_v_vxm (UNSPEC_VADC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vsbc functions. 
*/ +rtx +vsbc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vvm) + icode = code_for_vsbc_vvm (mode); + else + icode = code_for_v_vxm (UNSPEC_VSBC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmadc functions. */ +char * +vmadc::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + return finish_name (); +} + +rtx +vmadc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vvm) + icode = code_for_vmadc_vvm (mode); + else if (instance.get_operation () == OP_vv) + icode = code_for_vmadc_vv (mode); + else if (instance.get_operation () == OP_vxm) + icode = code_for_v_vxm (UNSPEC_VMADC, mode); + else + icode = code_for_v_vx (UNSPEC_VMADC, mode); + + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsbc functions. */ +char * +vmsbc::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + return finish_name (); +} + +rtx +vmsbc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vvm) + icode = code_for_vmsbc_vvm (mode); + else if (instance.get_operation () == OP_vv) + icode = code_for_vmsbc_vv (mode); + else if (instance.get_operation () == OP_vxm) + icode = code_for_v_vxm (UNSPEC_VMSBC, mode); + else + icode = code_for_v_vx (UNSPEC_VMSBC, mode); + + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vand functions. */ +rtx +vand::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (AND, mode); + else + icode = code_for_v_vx (UNSPEC_VAND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vor functions. */ +rtx +vor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (IOR, mode); + else + icode = code_for_v_vx (UNSPEC_VIOX, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vxor functions. */ +rtx +vxor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (XOR, mode); + else + icode = code_for_v_vx (UNSPEC_VXOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnot functions. */ +rtx +vnot::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + + enum insn_code icode = code_for_vnot_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vshift functions. 
*/ +void +vshift::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + argument_types.quick_push (get_dt_t_with_index (instance, 1)); + if (instance.get_operation () == OP_vx || instance.get_operation () == OP_wx) + argument_types.quick_push (size_type_node); + else + argument_types.quick_push (get_dt_t_with_index (instance, 2)); +} + +/* A function implementation for vsll functions. */ +rtx +vsll::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (ASHIFT, mode); + else + icode = code_for_v_vx (ASHIFT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vsrl functions. */ +rtx +vsrl::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (LSHIFTRT, mode); + else + icode = code_for_v_vx (LSHIFTRT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vsra functions. */ +rtx +vsra::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (ASHIFTRT, mode); + else + icode = code_for_v_vx (ASHIFTRT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnsrl functions. */ +rtx +vnsrl::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode; + if (instance.get_operation () == OP_wv) + icode = code_for_vn_wv (LSHIFTRT, mode); + else + icode = code_for_vn_wx (LSHIFTRT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnsra functions. */ +rtx +vnsra::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode; + if (instance.get_operation () == OP_wv) + icode = code_for_vn_wv (ASHIFTRT, mode); + else + icode = code_for_vn_wx (ASHIFTRT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vncvt functions. */ +rtx +vncvt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vncvt_x_x_w (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vcmp functions. */ +char * +vcmp::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +/* A function implementation for vmseq functions. */ +rtx +vmseq::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (EQ, mode); + else + icode = code_for_vms_vx (EQ, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsne functions. 
*/ +rtx +vmsne::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (NE, mode); + else + icode = code_for_vms_vx (NE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmslt functions. */ +rtx +vmslt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (LT, mode); + else + icode = code_for_vms_vx (LT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsltu functions. */ +rtx +vmsltu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (LTU, mode); + else + icode = code_for_vms_vx (LTU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsgt functions. */ +rtx +vmsgt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (GT, mode); + else + icode = code_for_vms_vx (GT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsgtu functions. */ +rtx +vmsgtu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (GTU, mode); + else + icode = code_for_vms_vx (GTU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsle functions. */ +rtx +vmsle::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (LE, mode); + else + icode = code_for_vms_vx (LE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsleu functions. */ +rtx +vmsleu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (LEU, mode); + else + icode = code_for_vms_vx (LEU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsge functions. */ +rtx +vmsge::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (GE, mode); + else + icode = code_for_vms_vx (GE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsgeu functions. 
*/ +rtx +vmsgeu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vms_vv (GEU, mode); + else + icode = code_for_vms_vx (GEU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmin functions. */ +rtx +vmin::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (SMIN, mode); + else + icode = code_for_v_vx (UNSPEC_VMIN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vminu functions. */ +rtx +vminu::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UMIN, mode); + else + icode = code_for_v_vx (UNSPEC_VMINU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmax functions. */ +rtx +vmax::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (SMAX, mode); + else + icode = code_for_v_vx (UNSPEC_VMAX, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmaxu functions. */ +rtx +vmaxu::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UMAX, mode); + else + icode = code_for_v_vx (UNSPEC_VMAXU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmul functions. */ +rtx +vmul::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmul_vv (mode); + else + icode = code_for_v_vx (UNSPEC_VMUL, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmulh functions. */ +rtx +vmulh::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmulh_vv (UNSPEC_VMULH, mode); + else + icode = code_for_v_vx (UNSPEC_VMULH, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmulhu functions. */ +rtx +vmulhu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmulh_vv (UNSPEC_VMULHU, mode); + else + icode = code_for_v_vx (UNSPEC_VMULHU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmulhsu functions. 
*/ +rtx +vmulhsu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmulhsu_vv (mode); + else + icode = code_for_v_vx (UNSPEC_VMULHSU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vdiv functions. */ +rtx +vdiv::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (DIV, mode); + else + icode = code_for_v_vx (UNSPEC_VDIV, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vdivu functions. */ +rtx +vdivu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UDIV, mode); + else + icode = code_for_v_vx (UNSPEC_VDIVU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vrem functions. */ +rtx +vrem::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (MOD, mode); + else + icode = code_for_v_vx (UNSPEC_VREM, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vremu functions. */ +rtx +vremu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UMOD, mode); + else + icode = code_for_v_vx (UNSPEC_VREMU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmul functions. */ +rtx +vwmul::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vwmul_vv (SIGN_EXTEND, mode); + else + icode = code_for_vwmul_vx (SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmulu functions. */ +rtx +vwmulu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vwmul_vv (ZERO_EXTEND, mode); + else + icode = code_for_vwmul_vx (ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmulsusu functions. */ +rtx +vwmulsu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vwmulsu_vv (mode); + else + icode = code_for_vwmulsu_vx (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmacc functions. 
*/ +rtx +vmacc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_MACC, mode); + else + icode = code_for_v_vx (UNSPEC_MACC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnmsac functions. */ +rtx +vnmsac::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_NMSAC, mode); + else + icode = code_for_v_vx (UNSPEC_NMSAC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmadd functions. */ +rtx +vmadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_MADD, mode); + else + icode = code_for_v_vx (UNSPEC_MADD, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnmsub functions. */ +rtx +vnmsub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_NMSUB, mode); + else + icode = code_for_v_vx (UNSPEC_NMSUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmacc functions. */ +rtx +vwmacc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vwmacc_vv (SIGN_EXTEND, mode); + else + icode = code_for_vwmacc_vx (SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmaccu functions. */ +rtx +vwmaccu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vwmacc_vv (ZERO_EXTEND, mode); + else + icode = code_for_vwmacc_vx (ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmaccsu functions. */ +rtx +vwmaccsu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vwmaccsu_vv (mode); + else + icode = code_for_vwmaccsu_vx (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwmaccus functions. */ +rtx +vwmaccus::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vwmaccus_vx (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmerge functions. 
*/ +size_t +vmerge::get_position_of_dest_arg (enum predication_index) const +{ + return 1; +} + +rtx +vmerge::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vvm) + icode = code_for_vmerge_vvm (mode); + else + icode = code_for_v_vxm (UNSPEC_VMERGE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmv functions. */ +void +vmv::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + if (instance.get_operation () == OP_v_x) + argument_types.quick_push (get_dt_t_with_index (instance, 1)); + else + argument_types.quick_push (get_dt_t_with_index (instance, 0)); +} + +bool +vmv::can_be_overloaded_p (const function_instance &instance) const +{ + if (instance.get_operation () == OP_v_v) + return true; + + if (instance.get_pred () == PRED_tu) + return true; + + return false; +} + +rtx +vmv::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_v_x) + icode = code_for_v_v_x (UNSPEC_VMV, mode); + else + icode = code_for_vmv_v_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + } // end namespace riscv_vector using namespace riscv_vector; diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def index 6d82b1c933d..bf9d42e6d67 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -139,6 +139,104 @@ DEF_RVV_FUNCTION(vsoxei, vsoxei, (4, VITER(V128UNITSI, unsigned), VATTR(0, VSUB, DEF_RVV_FUNCTION(vleff, vleff, (2, VITER(VI, signed), VATTR(0, VSUB, c_ptr)), pat_mask_tail, pred_all, OP_v) DEF_RVV_FUNCTION(vleff, vleff, (2, VITER(VI, unsigned), VATTR(0, VSUB, c_uptr)), pat_mask_tail, pred_all, OP_v) DEF_RVV_FUNCTION(vleff, vleff, (2, VITER(VF, signed), VATTR(0, VSUB, c_ptr)), pat_mask_tail, pred_all, OP_v) +/* 11. Vector Integer Arithmetic Instructions. 
*/ +DEF_RVV_FUNCTION(vadd, vadd, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vadd, vadd, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsub, vsub, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsub, vsub, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vrsub, vrsub, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vx) +DEF_RVV_FUNCTION(vrsub, vrsub, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vx) +DEF_RVV_FUNCTION(vneg, vneg, (2, VITER(VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vwadd, vwadd, (3, VATTR(1, VW, signed), VITER(VWI, signed), VATTR(1, VWI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwsub, vwsub, (3, VATTR(1, VW, signed), VITER(VWI, signed), VATTR(1, VWI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwaddu, vwaddu, (3, VATTR(1, VW, unsigned), VITER(VWI, unsigned), VATTR(1, VWI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwsubu, vwsubu, (3, VATTR(1, VW, unsigned), VITER(VWI, unsigned), VATTR(1, VWI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwadd, vwadd, (3, VATTR(2, VW, signed), VATTR(2, VW, signed), VITER(VWI, signed)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vwsub, vwsub, (3, VATTR(2, VW, signed), VATTR(2, VW, signed), VITER(VWI, signed)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vwaddu, vwaddu, (3, VATTR(2, VW, unsigned), VATTR(2, VW, unsigned), VITER(VWI, unsigned)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vwsubu, vwsubu, (3, VATTR(2, VW, unsigned), VATTR(2, VW, unsigned), VITER(VWI, unsigned)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vwcvt, vwcvt, (2, VATTR(1, VW, signed), VITER(VWI, signed)), pat_mask_tail, pred_all, OP_x_x_v) +DEF_RVV_FUNCTION(vwcvtu, vwcvtu, (2, VATTR(1, VW, unsigned), VITER(VWI, unsigned)), pat_mask_tail, pred_all, OP_x_x_v) +DEF_RVV_FUNCTION(vsext, vsext, (2, VATTR(1, VW, signed), VITER(VWI, signed)), pat_mask_tail, pred_all, OP_vf2) +DEF_RVV_FUNCTION(vsext, vsext, (2, VATTR(1, VQW, signed), VITER(VQWI, signed)), pat_mask_tail, pred_all, OP_vf4) +DEF_RVV_FUNCTION(vsext, vsext, (2, VATTR(1, VOW, signed), VITER(VOWI, signed)), pat_mask_tail, pred_all, OP_vf8) +DEF_RVV_FUNCTION(vzext, vzext, (2, VATTR(1, VW, unsigned), VITER(VWI, unsigned)), pat_mask_tail, pred_all, OP_vf2) +DEF_RVV_FUNCTION(vzext, vzext, (2, VATTR(1, VQW, unsigned), VITER(VQWI, unsigned)), pat_mask_tail, pred_all, OP_vf4) +DEF_RVV_FUNCTION(vzext, vzext, (2, VATTR(1, VOW, unsigned), VITER(VOWI, unsigned)), pat_mask_tail, pred_all, OP_vf8) +DEF_RVV_FUNCTION(vadc, vadc, (4, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed), VATTR(0, VM, signed)), PAT_tail, pred_tail, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vsbc, vsbc, (4, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed), VATTR(0, VM, signed)), PAT_tail, pred_tail, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vadc, vadc, (4, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VM, unsigned)), PAT_tail, pred_tail, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vsbc, vsbc, (4, 
VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VM, unsigned)), PAT_tail, pred_tail, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmadc, vmadc, (4, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed), VATTR(1, VM, signed)), PAT_none, PRED_void, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmadc, vmadc, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), PAT_none, PRED_void, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsbc, vmsbc, (4, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed), VATTR(1, VM, signed)), PAT_none, PRED_void, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmsbc, vmsbc, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), PAT_none, PRED_void, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmadc, vmadc, (4, VATTR(1, VM, signed), VITER(VI, unsigned), VATTR(1, VI, unsigned), VATTR(1, VM, signed)), PAT_none, PRED_void, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmadc, vmadc, (3, VATTR(1, VM, signed), VITER(VI, unsigned), VATTR(1, VI, unsigned)), PAT_none, PRED_void, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsbc, vmsbc, (4, VATTR(1, VM, signed), VITER(VI, unsigned), VATTR(1, VI, unsigned), VATTR(1, VM, signed)), PAT_none, PRED_void, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmsbc, vmsbc, (3, VATTR(1, VM, signed), VITER(VI, unsigned), VATTR(1, VI, unsigned)), PAT_none, PRED_void, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vand, vand, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vor, vor, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vxor, vxor, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vand, vand, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vor, vor, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vxor, vxor, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnot, vnot, (2, VITER(VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vnot, vnot, (2, VITER(VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vsll, vsll, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsll, vsll, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsrl, vsrl, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsra, vsra, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnsrl, vnsrl, (3, VITER(VWI, unsigned), VATTR(0, VW, unsigned), VATTR(0, VWI, unsigned)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vnsra, vnsra, (3, VITER(VWI, signed), VATTR(0, VW, signed), VATTR(0, VWI, unsigned)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vncvt, vncvt, (2, VITER(VWI, signed), VATTR(0, VW, signed)), pat_mask_tail, pred_all, OP_x_x_w) +DEF_RVV_FUNCTION(vncvt, vncvt, (2, VITER(VWI, unsigned), VATTR(0, VW, unsigned)), pat_mask_tail, pred_all, OP_x_x_w) +DEF_RVV_FUNCTION(vmseq, vmseq, (3, VATTR(1, VM, signed), VITER(VI, 
signed), VATTR(1, VI, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmseq, vmseq, (3, VATTR(1, VM, unsigned), VITER(VI, unsigned), VATTR(1, VI, unsigned)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsne, vmsne, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsne, vmsne, (3, VATTR(1, VM, unsigned), VITER(VI, unsigned), VATTR(1, VI, unsigned)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmslt, vmslt, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsltu, vmsltu, (3, VATTR(1, VM, unsigned), VITER(VI, unsigned), VATTR(1, VI, unsigned)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsgt, vmsgt, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsgtu, vmsgtu, (3, VATTR(1, VM, unsigned), VITER(VI, unsigned), VATTR(1, VI, unsigned)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsle, vmsle, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsleu, vmsleu, (3, VATTR(1, VM, unsigned), VITER(VI, unsigned), VATTR(1, VI, unsigned)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsge, vmsge, (3, VATTR(1, VM, signed), VITER(VI, signed), VATTR(1, VI, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmsgeu, vmsgeu, (3, VATTR(1, VM, unsigned), VITER(VI, unsigned), VATTR(1, VI, unsigned)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmin, vmin, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vminu, vminu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmax, vmax, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmaxu, vmaxu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmul, vmul, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmul, vmul, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmulh, vmulh, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmulhu, vmulhu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmulhsu, vmulhsu, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vdiv, vdiv, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vdivu, vdivu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vrem, vrem, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vremu, vremu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, 
pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmul, vwmul, (3, VATTR(1, VW, signed), VITER(VWI, signed), VATTR(1, VWI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmulu, vwmulu, (3, VATTR(1, VW, unsigned), VITER(VWI, unsigned), VATTR(1, VWI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmulsu, vwmulsu, (3, VATTR(1, VW, signed), VITER(VWI, signed), VATTR(1, VWI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmacc, vmacc, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnmsac, vnmsac, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmadd, vmadd, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnmsub, vnmsub, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmacc, vmacc, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnmsac, vnmsac, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vmadd, vmadd, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnmsub, vnmsub, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmacc, vwmacc, (3, VATTR(1, VW, signed), VITER(VWI, signed), VATTR(1, VWI, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmaccu, vwmaccu, (3, VATTR(1, VW, unsigned), VITER(VWI, unsigned), VATTR(1, VWI, unsigned)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmaccsu, vwmaccsu, (3, VATTR(1, VW, signed), VITER(VWI, signed), VATTR(1, VWI, unsigned)), pat_mask_tail_dest, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vwmaccus, vwmaccus, (3, VATTR(1, VW, signed), VITER(VWI, unsigned), VATTR(1, VWI, signed)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vmerge, vmerge, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), PAT_tail | PAT_merge, pred_tail, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmerge, vmerge, (3, VITER(VI, unsigned),VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), PAT_tail | PAT_merge, pred_tail, OP_vvm | OP_vxm) +DEF_RVV_FUNCTION(vmerge, vmerge, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), PAT_tail | PAT_merge, pred_tail, OP_vvm) +DEF_RVV_FUNCTION(vmv, vmv, (2, VITER(VI, signed), VATTR(0, VSUB, signed)), PAT_tail, pred_tail, OP_v_v | OP_v_x) +DEF_RVV_FUNCTION(vmv, vmv, (2, VITER(VI, unsigned), VATTR(0, VSUB, unsigned)), PAT_tail, pred_tail, OP_v_v | OP_v_x) +DEF_RVV_FUNCTION(vmv, vmv, (2, VITER(VF, signed), VATTR(0, VSUB, signed)), PAT_tail, pred_tail, OP_v_v) #undef REQUIRED_EXTENSIONS #undef DEF_RVV_FUNCTION #undef VITER diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.h b/gcc/config/riscv/riscv-vector-builtins-functions.h index a37e21876a6..bde03e8d49d 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.h +++ b/gcc/config/riscv/riscv-vector-builtins-functions.h @@ -831,6 +831,707 @@ public: virtual rtx expand (const function_instance &, tree, rtx) const override; }; +/* A function_base for basic_alu functions. 
*/ +class basic_alu : public function_builder +{ +public: + // use the same construction function as the function_builder + using function_builder::function_builder; + + virtual char * assemble_name (function_instance &) override; + + virtual tree get_return_type (const function_instance &) const override; +}; + +/* A function_base for unary functions. */ +class unop : public basic_alu +{ +public: + // use the same construction function as the basic_alu + using basic_alu::basic_alu; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; +}; + +/* A function_base for binary functions. */ +class binop : public basic_alu +{ +public: + // use the same construction function as the basic_alu + using basic_alu::basic_alu; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; +}; + +/* A function_base for widen binary functions. */ +class wbinop : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual char * assemble_name (function_instance &) override; +}; + +/* A function_base for ternary functions. */ +class ternop : public basic_alu +{ +public: + // use the same construction function as the function_builder + using basic_alu::basic_alu; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; +}; + +/* A function_base for vadd functions. */ +class vadd : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsub functions. */ +class vsub : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vrsub functions. */ +class vrsub : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vneg functions. */ +class vneg : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwadd functions. */ +class vwadd : public wbinop +{ +public: + // use the same construction function as the wbinop + using wbinop::wbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwsub functions. */ +class vwsub : public wbinop +{ +public: + // use the same construction function as the wbinop + using wbinop::wbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwaddu functions. */ +class vwaddu : public wbinop +{ +public: + // use the same construction function as the wbinop + using wbinop::wbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwsubu functions. 
*/ +class vwsubu : public wbinop +{ +public: + // use the same construction function as the wbinop + using wbinop::wbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwcvt functions. */ +class vwcvt : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwcvtu functions. */ +class vwcvtu : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsext functions. */ +class vsext : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vzext functions. */ +class vzext : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vadc functions. */ +class vadc : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsbc functions. */ +class vsbc : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmadc functions. */ +class vmadc : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsbc functions. */ +class vmsbc : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vand functions. */ +class vand : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vor functions. */ +class vor : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vxor functions. */ +class vxor : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnot functions. */ +class vnot : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vshift functions. 
*/ +class vshift : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual void get_argument_types (const function_instance &, vec &) const override; +}; + +/* A function_base for vsll functions. */ +class vsll : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsrl functions. */ +class vsrl : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsra functions. */ +class vsra : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnsrl functions. */ +class vnsrl : public vshift +{ +public: + // use the same construction function as the binop + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnsra functions. */ +class vnsra : public vshift +{ +public: + // use the same construction function as the binop + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vncvt functions. */ +class vncvt : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vcmp functions. */ +class vcmp : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual char * assemble_name (function_instance &) override; +}; + +/* A function_base for vmseq functions. */ +class vmseq : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsne functions. */ +class vmsne : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmslt functions. */ +class vmslt : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsltu functions. */ +class vmsltu : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsgt functions. */ +class vmsgt : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsgtu functions. */ +class vmsgtu : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsle functions. 
*/ +class vmsle : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsleu functions. */ +class vmsleu : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsge functions. */ +class vmsge : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsgeu functions. */ +class vmsgeu : public vcmp +{ +public: + // use the same construction function as the binop + using vcmp::vcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmin functions. */ +class vmin : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vminu functions. */ +class vminu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmax functions. */ +class vmax : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmaxu functions. */ +class vmaxu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmul functions. */ +class vmul : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmulh functions. */ +class vmulh : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmulhu functions. */ +class vmulhu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmulhsu functions. */ +class vmulhsu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vdiv functions. */ +class vdiv : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vdivu functions. */ +class vdivu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vrem functions. 
*/ +class vrem : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vremu functions. */ +class vremu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmul functions. */ +class vwmul : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmulu functions. */ +class vwmulu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmulsusu functions. */ +class vwmulsu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmacc functions. */ +class vmacc : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnmsac functions. */ +class vnmsac : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmadd functions. */ +class vmadd : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnmsub functions. */ +class vnmsub : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmacc functions. */ +class vwmacc : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmaccu functions. */ +class vwmaccu : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmaccsu functions. */ +class vwmaccsu : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwmaccus functions. */ +class vwmaccus : public ternop +{ +public: + // use the same construction function as the ternop + using ternop::ternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmerge functions. */ +class vmerge : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual size_t get_position_of_dest_arg (enum predication_index) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmv functions. 
*/ +class vmv : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; } // namespace riscv_vector diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index 175d6da4695..1d53c50a751 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -937,7 +937,7 @@ modify_operands (machine_mode Vmode, machine_mode VSImode, { if (imm32_p (operands[i])) { - if (!imm5_p (operands[i])) + if (!imm5_p (operands[i]) || unspec == UNSPEC_VSSUBU) operands[i] = force_reg (SImode, operands[i]); return GEN_VX_32BIT; } @@ -963,7 +963,7 @@ modify_operands (machine_mode Vmode, machine_mode VSImode, } else { - if (!imm5_p (operands[i])) + if (!imm5_p (operands[i]) || unspec == UNSPEC_VSSUBU) operands[i] = force_reg (VSUBmode, operands[i]); return GEN_VX; } @@ -1018,4 +1018,41 @@ emit_op5 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, emit_insn ( (*gen) (operands[0], operands[1], operands[2], operands[3], operands[4])); +} + +/* Helper functions for handling sew=64 on RV32 system. */ +void +emit_op6 (unsigned int unspec ATTRIBUTE_UNUSED, machine_mode Vmode, + machine_mode VSImode, machine_mode VMSImode, machine_mode VSUBmode, + rtx *operands, gen_6 *gen_vx, gen_6 *gen_vx_32bit, gen_6 *gen_vv, + imm_p *imm5_p, int i, bool reverse) +{ + enum GEN_CLASS gen_class = modify_operands ( + Vmode, VSImode, VMSImode, VSUBmode, operands, imm5_p, i, reverse, unspec); + + gen_6 *gen = gen_class == GEN_VX ? gen_vx + : gen_class == GEN_VV ? gen_vv + : gen_vx_32bit; + + emit_insn ((*gen) (operands[0], operands[1], operands[2], operands[3], + operands[4], operands[5])); +} + + +/* Helper functions for handling sew=64 on RV32 system. */ +void +emit_op7 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, + machine_mode VMSImode, machine_mode VSUBmode, rtx *operands, + gen_7 *gen_vx, gen_7 *gen_vx_32bit, gen_7 *gen_vv, imm_p *imm5_p, + int i, bool reverse) +{ + enum GEN_CLASS gen_class = modify_operands ( + Vmode, VSImode, VMSImode, VSUBmode, operands, imm5_p, i, reverse, unspec); + + gen_7 *gen = gen_class == GEN_VX ? gen_vx + : gen_class == GEN_VV ? 
gen_vv + : gen_vx_32bit; + + emit_insn ((*gen) (operands[0], operands[1], operands[2], operands[3], + operands[4], force_reg_for_over_uimm (operands[5]), operands[6])); } \ No newline at end of file diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 29106bbf6fe..664798b9108 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -3964,6 +3964,28 @@ riscv_print_operand (FILE *file, rtx op, int letter) output_addr_const (file, newop); break; } + case 'v': + { + rtx elt; + if (!const_vec_duplicate_p (op, &elt)) + output_operand_lossage ("invalid vector constant"); + else if (GET_MODE_CLASS (GET_MODE (op)) == MODE_VECTOR_INT) + asm_fprintf (file, "%wd", INTVAL (elt)); + else + output_operand_lossage ("invalid vector constant"); + break; + } + case 'V': + { + rtx elt; + if (!const_vec_duplicate_p (op, &elt)) + output_operand_lossage ("invalid vector constant"); + else if (GET_MODE_CLASS (GET_MODE (op)) == MODE_VECTOR_INT) + asm_fprintf (file, "%wd", -INTVAL (elt)); + else + output_operand_lossage ("invalid vector constant"); + break; + } default: switch (code) { @@ -3980,6 +4002,19 @@ riscv_print_operand (FILE *file, rtx op, int letter) output_address (mode, XEXP (op, 0)); break; + case CONST_VECTOR: + { + rtx imm; + if (!const_vec_duplicate_p (op, &imm)) + { + output_operand_lossage ("invalid immediate value for vector"); + break; + } + gcc_assert (CONST_INT_P (imm)); + asm_fprintf (file, "%wd", INTVAL (imm)); + break; + } + default: if (letter == 'z' && op == CONST0_RTX (GET_MODE (op))) fputs (reg_names[GP_REG_FIRST], file); @@ -5196,6 +5231,24 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode) static bool riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2) { + if (rvv_mode_p (mode1) && rvv_mode_p (mode2)) + { + /* Only allow normal vector modes to be tied. */ + return true; + } + else if (rvv_mode_p (mode1) || rvv_mode_p (mode2)) + { + /* If only one is vector mode, then don't allow scaler and vector mode to be tied. */ + return false; + } + else if (TARGET_VECTOR && ( + (GET_MODE_CLASS (mode1) == MODE_FLOAT && GET_MODE_CLASS (mode2) != MODE_FLOAT) + || (GET_MODE_CLASS (mode1) != MODE_FLOAT && GET_MODE_CLASS (mode2) == MODE_FLOAT))) + { + /* When V extension is enabled, that implied F or D Extension is also enabled. + In this situation, disable float and scaler mode to be tied. */ + return false; + } return (mode1 == mode2 || !(GET_MODE_CLASS (mode1) == MODE_FLOAT && GET_MODE_CLASS (mode2) == MODE_FLOAT)); diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index ae4f5b50214..f1d4fce24ca 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -471,7 +471,9 @@ (gt "") (gtu "u") (ge "") (geu "u") (lt "") (ltu "u") - (le "") (leu "u")]) + (le "") (leu "u") + (fix "") (unsigned_fix "u") + (float "") (unsigned_float "u")]) ;; is like , but the signed form expands to "s" rather than "". 
(define_code_attr su [(sign_extend "s") (zero_extend "u")]) @@ -480,19 +482,52 @@ (define_code_attr optab [(ashift "ashl") (ashiftrt "ashr") (lshiftrt "lshr") + (mult "mul") (div "div") (mod "mod") (udiv "udiv") (umod "umod") + (eq "eq") + (ne "ne") (ge "ge") (le "le") (gt "gt") (lt "lt") + (geu "geu") + (leu "leu") + (gtu "gtu") + (ltu "ltu") (ior "ior") (xor "xor") (and "and") (plus "add") - (minus "sub")]) + (minus "sub") + (smax "smax") + (umax "umax") + (smin "smin") + (umin "umin") + (us_plus "usadd") + (ss_plus "ssadd") + (us_minus "ussub") + (ss_minus "sssub") + (neg "neg") + (not "one_cmpl") + (abs "abs") + (fix "fix_trunc") + (unsigned_fix "fixuns_trunc") + (float "float") + (unsigned_float "floatuns") + (sqrt "sqrt") + (unordered "unordered") + (ordered "ordered") + (unlt "unlt") + (unle "unle") + (unge "unge") + (ungt "ungt") + (uneq "uneq") + (ltgt "ltgt") + (sign_extend "extend") + (zero_extend "zero_extend")]) ;; expands to the name of the insn that implements a particular code. (define_code_attr insn [(ashift "sll") @@ -506,7 +541,28 @@ (xor "xor") (and "and") (plus "add") - (minus "sub")]) + (minus "sub") + (smax "max") + (umax "maxu") + (smin "min") + (umin "minu") + (us_plus "saddu") + (ss_plus "sadd") + (us_minus "ssubu") + (ss_minus "ssub") + (eq "eq") + (ne "ne") + (ge "ge") + (le "le") + (gt "gt") + (lt "lt") + (geu "geu") + (leu "leu") + (gtu "gtu") + (ltu "ltu") + (neg "neg") + (not "not") + (mult "mul")]) ;; Ghost instructions produce no real code and introduce no hazards. ;; They exist purely to express an effect on dataflow. diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 501980d822f..748025d4080 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -792,9 +792,6 @@ UNSPEC_VMIN UNSPEC_VMINU UNSPEC_VMAX UNSPEC_VMAXU UNSPEC_VMUL UNSPEC_VMULH UNSPEC_VMULHU UNSPEC_VMULHSU UNSPEC_VDIV UNSPEC_VDIVU UNSPEC_VREM UNSPEC_VREMU - UNSPEC_VSADD UNSPEC_VSADDU UNSPEC_VSSUB UNSPEC_VSSUBU - UNSPEC_VAADD UNSPEC_VAADDU UNSPEC_VASUB UNSPEC_VASUBU - UNSPEC_VSMUL ]) (define_int_iterator VXMOP [ diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index fc7ec77dfc4..cb8bdc5781f 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -974,6 +974,228 @@ [(set_attr "type" "vleff") (set_attr "mode" "")]) +;; ------------------------------------------------------------------------------- +;; ---- expands for some insns, to support sew64 on TARGET_32BIT +;; ------------------------------------------------------------------------------- +(define_expand "@v_vx" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand: 1 "vector_reg_or_const0_operand") + (match_operand:VI 2 "vector_reg_or_const0_operand") + (match_operand:VI 3 "register_operand") + (match_operand: 4 "reg_or_const_int_operand") + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + ] VXOP)] + "TARGET_VECTOR" + { + emit_op7 ( + , + mode, mode, mode, + mode, + operands, + gen_v_vx_internal, + gen_v_vx_32bit, + gen_v_vv, + satisfies_constraint_, + 4, false + ); + DONE; + } +) + +;; vrsub +(define_expand "@v_vx" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand: 1 "vector_reg_or_const0_operand") + (match_operand:VI 2 "vector_reg_or_const0_operand") + (match_operand:VI 3 "register_operand") + (match_operand: 4 "reg_or_const_int_operand") + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + ] VXROP)] + 
"TARGET_VECTOR" + { + emit_op7 ( + , + mode, mode, mode, + mode, + operands, + gen_v_vx_internal, + gen_v_vx_32bit, + gen_vsub_vv, + satisfies_constraint_, + 4, true + ); + DONE; + } +) + +;; mvx no mask +(define_expand "@v_vx" + [(unspec [ + (match_operand: 0 "register_operand") + (match_operand:VI 1 "register_operand") + (match_operand: 2 "reg_or_const_int_operand") + (match_operand 3 "p_reg_or_const_csr_operand") + (match_operand 4 "const_int_operand") + ] MVXOP)] + "TARGET_VECTOR" + { + emit_op5 ( + 0, + mode, mode, mode, + mode, + operands, + gen_v_vx_internal, + gen_v_vx_32bit, + gen_v_vv, + satisfies_constraint_, + 2, false + ); + DONE; + } +) + +;; vxm no tail policy +(define_expand "@v_vxm" + [(unspec [ + (match_operand: 0 "register_operand") + (match_operand:VI 1 "register_operand") + (match_operand: 2 "reg_or_const_int_operand") + (match_operand: 3 "register_operand") + (match_operand 4 "p_reg_or_const_csr_operand") + (match_operand 5 "const_int_operand") + ] VXMOP_NO_POLICY)] + "TARGET_VECTOR" + { + emit_op6 ( + , + mode, mode, mode, + mode, + operands, + gen_v_vxm_internal, + gen_v_vxm_32bit, + gen_v_vvm, + satisfies_constraint_, + 2, false + ); + DONE; + } +) + +;; compare +(define_expand "@vms_vx" + [(match_operand: 0 "register_operand") + (match_operand: 1 "vector_reg_or_const0_operand") + (match_operand: 2 "vector_reg_or_const0_operand") + (cmp_all: (match_operand:VI 3 "register_operand") + (vec_duplicate:VI (match_operand: 4 "reg_or_const_int_operand"))) + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + ] + "TARGET_VECTOR" + { + emit_op7 ( + 0, + mode, mode, mode, + mode, + operands, + gen_vms_vx_internal, + gen_vms_vx_32bit, + gen_vms_vv, + satisfies_constraint_, + 4, false + ); + DONE; + } +) + +;; vxm +(define_expand "@v_vxm" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand:VI 1 "vector_reg_or_const0_operand") + (match_operand:VI 2 "register_operand") + (match_operand: 3 "reg_or_const_int_operand") + (match_operand: 4 "register_operand") + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + ] VXMOP)] + "TARGET_VECTOR" + { + emit_op7 ( + , + mode, mode, mode, + mode, + operands, + gen_v_vxm_internal, + gen_v_vxm_32bit, + gen_v_vvm, + satisfies_constraint_, + 3, false + ); + DONE; + } +) + +;; mac +(define_expand "@v_vx" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand: 1 "vector_reg_or_const0_operand") + (match_operand:VI 2 "register_operand") + (match_operand: 3 "reg_or_const_int_operand") + (match_operand:VI 4 "register_operand") + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + ] MACOP)] + "TARGET_VECTOR" + { + emit_op7 ( + , + mode, mode, mode, + mode, + operands, + gen_v_vx_internal, + gen_v_vx_32bit, + gen_v_vv, + satisfies_constraint_, + 3, false + ); + DONE; + } +) + +;; vmerge +(define_expand "@v_vxm" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand: 1 "register_operand") + (match_operand:VI 2 "vector_reg_or_const0_operand") + (match_operand:VI 3 "register_operand") + (match_operand: 4 "reg_or_const_int_operand") + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + ] VMERGEOP)] + "TARGET_VECTOR" + { + emit_op7 ( + , + mode, mode, mode, + mode, + operands, + gen_v_vxm_internal, + gen_v_vxm_32bit, + gen_v_vvm, + satisfies_constraint_, + 4, false + ); + DONE; + } +) + ;; vmv.v.x (define_expand "@v_v_x" [(unspec [ @@ -984,22 +1206,2322 @@ 
(match_operand 4 "const_int_operand") ] VMVOP)] "TARGET_VECTOR" - { - emit_op5 ( - , - mode, mode, mode, - mode, - operands, - gen_v_v_x_internal, - gen_v_v_x_32bit, - NULL, - satisfies_constraint_, - 2, false - ); - DONE; - } -) + { + emit_op5 ( + , + mode, mode, mode, + mode, + operands, + gen_v_v_x_internal, + gen_v_v_x_32bit, + NULL, + satisfies_constraint_, + 2, false + ); + DONE; + } +) + +;; ------------------------------------------------------------------------------- +;; ---- 11. Vector Integer Arithmetic Instructions +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 11.1 Vector Single-Width Integer Add and Subtract +;; - 11.2 Vector Widening Integer Add/Subtract +;; - 11.3 Vector Integer Extension +;; - 11.4 Vector Integer Add-with-Carry/Subtract-with-Borrow Instructions +;; - 11.5 Vector Bitwise Logical Instructions +;; - 11.6 Vector Single-Width Bit Shift Instructions +;; - 11.7 Vector Narrowing Integer Right Shift Instructions +;; - 11.8 Vector Integer Comparision Instructions +;; - 11.9 Vector Integer Min/Max Instructions +;; - 11.10 Vector Single-Width Integer Multiply Instructions +;; - 11.11 Vector Integer Divide Instructions +;; - 11.12 Vector Widening Integer Multiply Instructions +;; - 11.13 Vector Single-Width Integer Multiply-Add Instructions +;; - 11.14 Vector Widening Integer Multiply-Add Instructions +;; - 11.15 Vector Integer Merge Instructions +;; - 11.16 Vector Integer Move Instructions +;; ------------------------------------------------------------------------------- + +;; Vector-Vector Integer Add: vadd.vv. +;; optimize the const vector that all elments are +;; 5-bit signed immediate by using vadd.vi. +(define_insn "@vadd_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd, vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm, J,J,J,J") + (plus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr, vr,vr,vr,vr") + (match_operand:VI 4 "vector_arith_operand" "vr,vr,vi,vi, vr,vr,vi,vi")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J, 0,J,0,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vadd.vv\t%0,%3,%4,%1.t + vadd.vv\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%v4,%1.t + vadd.vi\t%0,%3,%v4,%1.t + vadd.vv\t%0,%3,%4 + vadd.vv\t%0,%3,%4 + vadd.vi\t%0,%3,%v4 + vadd.vi\t%0,%3,%v4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Vector Integer subtract:vsub.vv +;; Since RVV doesn't have vsub.vi. +;; Optimize this case using vadd.vi for const vector +;; that all elements are 5-bit signed immediate neg value. 
+(define_insn "@vsub_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (minus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:VI 4 "vector_neg_arith_operand" "vr,vj,vr,vj,vr,vj,vr,vj")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vsub.vv\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%V4,%1.t + vsub.vv\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%V4,%1.t + vsub.vv\t%0,%3,%4 + vadd.vi\t%0,%3,%V4 + vsub.vv\t%0,%3,%4 + vadd.vi\t%0,%3,%V4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-scalar Integer add: vadd.vx +;; Optimize the const vector that all elements +;; are 5-bit signed immediate value with +;; vadd.vi. +(define_insn "@vadd_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd, vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm, J,J,J,J") + (plus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr, vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,r,Ws5,Ws5, r,r,Ws5,Ws5"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J, 0,J,0,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vadd.vx\t%0,%3,%4,%1.t + vadd.vx\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%4,%1.t + vadd.vx\t%0,%3,%4 + vadd.vx\t%0,%3,%4 + vadd.vi\t%0,%3,%4 + vadd.vi\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +(define_insn "@vadd_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (plus:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + ] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vadd.vx\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%4,%1.t + vadd.vx\t%0,%3,%4,%1.t + vadd.vi\t%0,%3,%4,%1.t + vadd.vx\t%0,%3,%4 + vadd.vi\t%0,%3,%4 + vadd.vx\t%0,%3,%4 + vadd.vi\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-scalar Integer sub: vsub.vx +;; Since RVV doesn't have vsub.vi +;; Optimize the const vector that all elements +;; are 5-bit signed immediate neg value with +;; vadd.vi. 
+(define_insn "@vsub_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (minus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_neg_simm5_operand" "r,Wn5,r,Wn5,r,Wn5,r,Wn5"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + { + const char *tail = satisfies_constraint_J (operands[1]) ? "" : ",%1.t"; + char buf[64] = {0}; + if (satisfies_constraint_Wn5 (operands[4])) + { + const char *insn = "vadd.vi\t%0,%3"; + snprintf (buf, sizeof (buf), "%s,%d%s", insn, (int)(-INTVAL (operands[4])), tail); + } + else + { + const char *insn = "vsub.vx\t%0,%3,%4"; + snprintf (buf, sizeof (buf), "%s%s", insn, tail); + } + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +(define_insn "@vsub_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (minus:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_neg_simm5_operand" "r,Wn5,r,Wn5,r,Wn5,r,Wn5")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + { + const char *tail = satisfies_constraint_J (operands[1]) ? "" : ",%1.t"; + char buf[64] = {0}; + if (satisfies_constraint_Wn5 (operands[4])) + { + const char *insn = "vadd.vi\t%0,%3"; + snprintf (buf, sizeof (buf), "%s,%d%s", insn, (int)(-INTVAL (operands[4])), tail); + } + else + { + const char *insn = "vsub.vx\t%0,%3,%4"; + snprintf (buf, sizeof (buf), "%s%s", insn, tail); + } + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Scalar and Vector-immediate +;; Integer Reverse Sub. 
+(define_insn "@vrsub_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (minus:VI + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5")) + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vrsub.vx\t%0,%3,%4,%1.t + vrsub.vi\t%0,%3,%4,%1.t + vrsub.vx\t%0,%3,%4,%1.t + vrsub.vi\t%0,%3,%4,%1.t + vrsub.vx\t%0,%3,%4 + vrsub.vi\t%0,%3,%4 + vrsub.vx\t%0,%3,%4 + vrsub.vi\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +(define_insn "@vrsub_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (minus:V64BITI + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5"))) + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr")) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vrsub.vx\t%0,%3,%4,%1.t + vrsub.vi\t%0,%3,%4,%1.t + vrsub.vx\t%0,%3,%4,%1.t + vrsub.vi\t%0,%3,%4,%1.t + vrsub.vx\t%0,%3,%4 + vrsub.vi\t%0,%3,%4 + vrsub.vx\t%0,%3,%4 + vrsub.vi\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; pseudo-instruction vneg.v vd,vs = vrsub.vx vd,vs,x0. +(define_insn "@vneg_v" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (neg:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vneg.v\t%0,%3,%1.t + vneg.v\t%0,%3,%1.t + vneg.v\t%0,%3 + vneg.v\t%0,%3" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Signed/Unsigned Integer Add/Subtract. +(define_insn "@vw_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus_minus: + (any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr")) + (any_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vw.vv\t%0,%3,%4,%1.t + vw.vv\t%0,%3,%4,%1.t + vw.vv\t%0,%3,%4 + vw.vv\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Signed/Unsigned Integer Add/Subtract. 
+(define_insn "@vw_vx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (plus_minus: + (any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr")) + (any_extend: + (vec_duplicate:VWI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vw.vx\t%0,%3,%4,%1.t + vw.vx\t%0,%3,zero,%1.t + vw.vx\t%0,%3,%4,%1.t + vw.vx\t%0,%3,zero,%1.t + vw.vx\t%0,%3,%4 + vw.vx\t%0,%3,zero + vw.vx\t%0,%3,%4 + vw.vx\t%0,%3,zero" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Signed/Unsigned Integer Add/Subtract. +(define_insn "@vw_wv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus_minus: + (match_operand: 3 "register_operand" "vr,vr,vr,vr") + (any_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vw.wv\t%0,%3,%4,%1.t + vw.wv\t%0,%3,%4,%1.t + vw.wv\t%0,%3,%4 + vw.wv\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Signed/Unsigned Integer Add/Subtract. +(define_insn "@vw_wx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (plus_minus: + (match_operand: 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (any_extend: + (unspec:VWI + [(match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J") + ] UNSPEC_VEC_DUPLICATE))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vw.wx\t%0,%3,%4,%1.t + vw.wx\t%0,%3,zero,%1.t + vw.wx\t%0,%3,%4,%1.t + vw.wx\t%0,%3,zero,%1.t + vw.wx\t%0,%3,%4 + vw.wx\t%0,%3,zero + vw.wx\t%0,%3,%4 + vw.wx\t%0,%3,zero" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; pseudo-instruction vwcvt.x.x.v vd,vs,vm = vwadd.vx vd,vs,x0,vm. +(define_insn "@vwcvt_x_x_v" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr"))] UNSPEC_DOUBLE_WIDEN) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwcvt.x.x.v\t%0,%3,%1.t + vwcvt.x.x.v\t%0,%3,%1.t + vwcvt.x.x.v\t%0,%3 + vwcvt.x.x.v\t%0,%3" + [(set_attr "type" "vwcvt") + (set_attr "mode" "")]) + +;; Vector Double-Widening Sign-extend and Zero-extend. 
+(define_insn "@vext_vf2" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vext.vf2\t%0,%3,%1.t + vext.vf2\t%0,%3,%1.t + vext.vf2\t%0,%3 + vext.vf2\t%0,%3" + [(set_attr "type" "vwcvt") + (set_attr "mode" "")]) + +;; Vector Quad-Widening Sign-extend and Zero-extend. +(define_insn "@vext_vf4" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_extend: + (match_operand:VQWI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vext.vf4\t%0,%3,%1.t + vext.vf4\t%0,%3,%1.t + vext.vf4\t%0,%3 + vext.vf4\t%0,%3" + [(set_attr "type" "vwcvt") + (set_attr "mode" "")]) + +;; Vector Oct-Widening Sign-extend and Zero-extend. +(define_insn "@vext_vf8" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_extend: + (match_operand:VOWI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vext.vf8\t%0,%3,%1.t + vext.vf8\t%0,%3,%1.t + vext.vf8\t%0,%3 + vext.vf8\t%0,%3" + [(set_attr "type" "vwcvt") + (set_attr "mode" "")]) + +;; Vector Integer Add-with-Carry/Subtract-with-Borrow Instructions +;; For vadc and vsbc, the instruction encoding is reserved if the destination +;; vector register is v0. +;; Vector-Vector Produce sum with carry. +(define_insn "@vadc_vvm" + [(set (match_operand:VI 0 "register_operand" "=&vd,&vd,&vd,&vd") + (unspec:VI + [(match_operand:VI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (plus:VI + (plus:VI + (match_operand:VI 2 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 3 "vector_arith_operand" "vr,vi,vr,vi")) + (if_then_else:VI + (match_operand: 4 "register_operand" "vm,vm,vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vadc.vvm\t%0,%2,%3,%4 + vadc.vim\t%0,%2,%v3,%4 + vadc.vvm\t%0,%2,%3,%4 + vadc.vim\t%0,%2,%v3,%4" + [(set_attr "type" "vadc") + (set_attr "mode" "")]) + +;; Vector-Vector Produce difference with borrow. 
+(define_insn "@vsbc_vvm" + [(set (match_operand:VI 0 "register_operand" "=&vd,&vd") + (unspec:VI + [(match_operand:VI 1 "vector_reg_or_const0_operand" "0,J") + (minus:VI + (minus:VI + (match_operand:VI 2 "register_operand" "vr,vr") + (match_operand:VI 3 "register_operand" "vr,vr")) + (if_then_else:VI + (match_operand: 4 "register_operand" "vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vsbc.vvm\t%0,%2,%3,%4" + [(set_attr "type" "vadc") + (set_attr "mode" "")]) + +;; Vector-Scalar Produce sum with carry. +(define_insn "@vadc_vxm_internal" + [(set (match_operand:VI 0 "register_operand" "=&vd,&vd,&vd,&vd") + (unspec:VI + [(match_operand:VI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (plus:VI + (plus:VI + (match_operand:VI 2 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 3 "reg_or_simm5_operand" "r,Ws5,r,Ws5"))) + (if_then_else:VI + (match_operand: 4 "register_operand" "vm,vm,vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vadc.vxm\t%0,%2,%3,%4 + vadc.vim\t%0,%2,%3,%4 + vadc.vxm\t%0,%2,%3,%4 + vadc.vim\t%0,%2,%3,%4" + [(set_attr "type" "vadc") + (set_attr "mode" "")]) + +(define_insn "@vadc_vxm_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=&vd,&vd,&vd,&vd") + (unspec:V64BITI + [(match_operand:V64BITI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (plus:V64BITI + (plus:V64BITI + (match_operand:V64BITI 2 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 3 "reg_or_simm5_operand" "r,Ws5,r,Ws5")))) + (if_then_else:V64BITI + (match_operand: 4 "register_operand" "vm,vm,vm,vm") + (vec_duplicate:V64BITI (const_int 1)) + (vec_duplicate:V64BITI (const_int 0)))) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vadc.vxm\t%0,%2,%3,%4 + vadc.vim\t%0,%2,%3,%4 + vadc.vxm\t%0,%2,%3,%4 + vadc.vim\t%0,%2,%3,%4" + [(set_attr "type" "vadc") + (set_attr "mode" "")]) + +;; Vector-Vector Scalar difference with borrow. 
+(define_insn "@vsbc_vxm_internal" + [(set (match_operand:VI 0 "register_operand" "=&vd,&vd,&vd,&vd") + (unspec:VI + [(match_operand:VI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (minus:VI + (minus:VI + (match_operand:VI 2 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 3 "reg_or_0_operand" "r,J,r,J"))) + (if_then_else:VI + (match_operand: 4 "register_operand" "vm,vm,vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vsbc.vxm\t%0,%2,%3,%4 + vsbc.vxm\t%0,%2,zero,%4 + vsbc.vxm\t%0,%2,%3,%4 + vsbc.vxm\t%0,%2,zero,%4" + [(set_attr "type" "vadc") + (set_attr "mode" "")]) + +(define_insn "@vsbc_vxm_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=&vd,&vd,&vd,&vd") + (unspec:V64BITI + [(match_operand:V64BITI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (minus:V64BITI + (minus:V64BITI + (match_operand:V64BITI 2 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 3 "reg_or_0_operand" "r,J,r,J")))) + (if_then_else:V64BITI + (match_operand: 4 "register_operand" "vm,vm,vm,vm") + (vec_duplicate:V64BITI (const_int 1)) + (vec_duplicate:V64BITI (const_int 0)))) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vsbc.vxm \t%0,%2,%3,%4 + vsbc.vxm \t%0,%2,zero,%4 + vsbc.vxm \t%0,%2,%3,%4 + vsbc.vxm \t%0,%2,zero,%4" + [(set_attr "type" "vadc") + (set_attr "mode" "")]) + +;; Vector-Vector Produce carry out in mask register format. +(define_insn "@vmadc_vvm" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(plus:VI + (plus:VI + (match_operand:VI 1 "register_operand" "vr,vr") + (match_operand:VI 2 "vector_arith_operand" "vr,vi")) + (if_then_else:VI + (match_operand: 3 "register_operand" "vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmadc.vvm\t%0,%1,%2,%3 + vmadc.vim\t%0,%1,%v2,%3" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +;; Vector-Vector Produce borrow out in mask register format. +(define_insn "@vmsbc_vvm" + [(set (match_operand: 0 "register_operand" "=&vr") + (unspec: + [(minus:VI + (minus:VI + (match_operand:VI 1 "register_operand" "vr") + (match_operand:VI 2 "register_operand" "vr")) + (if_then_else:VI + (match_operand: 3 "register_operand" "vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 4 "p_reg_or_const_csr_operand" "rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmsbc.vvm\t%0,%1,%2,%3" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +;; Vector-Scalar Produce carry out in mask register format. 
+(define_insn "@vmadc_vxm_internal" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(plus:VI + (plus:VI + (match_operand:VI 1 "register_operand" "vr,vr") + (vec_duplicate:VI + (match_operand: 2 "reg_or_simm5_operand" "r,Ws5"))) + (if_then_else:VI + (match_operand: 3 "register_operand" "vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmadc.vxm\t%0,%1,%2,%3 + vmadc.vim\t%0,%1,%2,%3" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +(define_insn "@vmadc_vxm_32bit" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(plus:V64BITI + (plus:V64BITI + (match_operand:V64BITI 1 "register_operand" "vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 2 "reg_or_simm5_operand" "r,Ws5")))) + (if_then_else:V64BITI + (match_operand: 3 "register_operand" "vm,vm") + (vec_duplicate:V64BITI (const_int 1)) + (vec_duplicate:V64BITI (const_int 0)))) + (match_operand:SI 4 "csr_operand" "rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmadc.vxm\t%0,%1,%2,%3 + vmadc.vim\t%0,%1,%2,%3" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +;; Vector-Scalar Produce borrow out in mask register format. +(define_insn "@vmsbc_vxm_internal" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(minus:VI + (minus:VI + (match_operand:VI 1 "register_operand" "vr,vr") + (vec_duplicate:VI + (match_operand: 2 "reg_or_0_operand" "r,J"))) + (if_then_else:VI + (match_operand: 3 "register_operand" "vm,vm") + (vec_duplicate:VI (const_int 1)) + (vec_duplicate:VI (const_int 0)))) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmsbc.vxm\t%0,%1,%2,%3 + vmsbc.vxm\t%0,%1,zero,%3" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +(define_insn "@vmsbc_vxm_32bit" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(minus:V64BITI + (minus:V64BITI + (match_operand:V64BITI 1 "register_operand" "vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 2 "reg_or_0_operand" "r,J")))) + (if_then_else:V64BITI + (match_operand: 3 "register_operand" "vm,vm") + (vec_duplicate:V64BITI (const_int 1)) + (vec_duplicate:V64BITI (const_int 0)))) + (match_operand:SI 4 "csr_operand" "rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmsbc.vxm\t%0,%1,%2,%3 + vmsbc.vxm\t%0,%1,zero,%3" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +;; Vector-Vector Produce carry out in mask register format. +(define_insn "@vmadc_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(plus:VI + (match_operand:VI 1 "register_operand" "vr,vr") + (match_operand:VI 2 "vector_arith_operand" "vr,vi")) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmadc.vv\t%0,%1,%2 + vmadc.vi\t%0,%1,%v2" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +;; Vector-Vector Produce borrow out in mask register format. 
+(define_insn "@vmsbc_vv" + [(set (match_operand: 0 "register_operand" "=&vr") + (unspec: + [(minus:VI + (match_operand:VI 1 "register_operand" "vr") + (match_operand:VI 2 "register_operand" "vr")) + (match_operand 3 "p_reg_or_const_csr_operand" "rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmsbc.vv\t%0,%1,%2" + [(set_attr "type" "vmadc") + (set_attr "mode" "")]) + +;; Vector-Scalar Produce carry out in mask register format. +(define_insn "@vmadc_vx_internal" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(plus:VI + (match_operand:VI 1 "register_operand" "vr,vr") + (vec_duplicate:VI + (match_operand: 2 "reg_or_simm5_operand" "r,Ws5"))) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmadc.vx\t%0,%1,%2 + vmadc.vi\t%0,%1,%2" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +(define_insn "@vmadc_vx_32bit" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(plus:V64BITI + (match_operand:V64BITI 1 "register_operand" "vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 2 "reg_or_simm5_operand" "r,Ws5")))) + (match_operand:SI 3 "csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmadc.vx \t%0,%1,%2 + vmadc.vi \t%0,%1,%2" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Scalar Produce borrow out in mask register format. +(define_insn "@vmsbc_vx_internal" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(minus:VI + (match_operand:VI 1 "register_operand" "vr,vr") + (vec_duplicate:VI + (match_operand: 2 "reg_or_0_operand" "r,J"))) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmsbc.vx\t%0,%1,%2 + vmsbc.vx\t%0,%1,zero" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +(define_insn "@vmsbc_vx_32bit" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(minus:V64BITI + (match_operand:V64BITI 1 "register_operand" "vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 2 "reg_or_0_operand" "r,J")))) + (match_operand:SI 3 "csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmsbc.vx \t%0,%1,%2 + vmsbc.vx \t%0,%1,zero" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Vector Bitwise logical operations. 
+(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_bitwise:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:VI 4 "vector_arith_operand" "vr,vi,vr,vi,vr,vi,vr,vi")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4 + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Scalar Bitwise logical operations. +(define_insn "@v_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_bitwise:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4 + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4" + [(set_attr "type" "vlogical") + (set_attr "mode" "")]) + +(define_insn "@v_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_bitwise:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + ] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4 + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4" + [(set_attr "type" "vlogical") + (set_attr "mode" "")]) + +;; pseudo-instruction vnot.v vd,vs = vxor.vi vd,vs,-1. +(define_insn "@vnot_v" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (not:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vnot.v\t%0,%3,%1.t + vnot.v\t%0,%3,%1.t + vnot.v\t%0,%3 + vnot.v\t%0,%3" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Vector Bit shift operations. 
+(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_shift:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:VI 4 "vector_arith_operand" "vr,vk,vr,vk,vr,vk,vr,vk")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4 + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Scalar Bit shift operations. +(define_insn "@v_vx" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_shift:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand 4 "p_reg_or_uimm5_operand" "r,K,r,K,r,K,r,K")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4 + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4" + [(set_attr "type" "vshift") + (set_attr "mode" "")]) + +;; Vector-Vector Narrowing Integer Right Shift Instructions. +(define_insn "@vn_wv" + [(set (match_operand:VWI 0 "register_operand" "=vd,vd,&vd,vd,&vd, vd,vd,&vd,vd,&vd, vr,vr,&vr,vr,&vr, vr,vr,&vr,vr,&vr") + (unspec:VWI + [(unspec:VWI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,vm, vm,vm,vm,vm,vm, J,J,J,J,J, J,J,J,J,J") + (truncate:VWI + (any_shiftrt: + (match_operand: 3 "register_operand" "0,vr,vr,0,vr, 0,vr,vr,0,vr, 0,vr,vr,0,vr, 0,vr,vr,0,vr") + (match_operand:VWI 4 "vector_shift_operand" "vr,0,vr,vk,vk, vr,0,vr,vk,vk, vr,0,vr,vk,vk, vr,0,vr,vk,vk"))) + (match_operand:VWI 2 "vector_reg_or_const0_operand" "0,0,0,0,0, J,J,J,J,J, 0,0,0,0,0, J,J,J,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK, rK,rK,rK,rK,rK, rK,rK,rK,rK,rK, rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wi\t%0,%3,%v4 + vn.wi\t%0,%3,%v4 + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wi\t%0,%3,%v4 + vn.wi\t%0,%3,%v4" + [(set_attr "type" "vshift") + (set_attr "mode" "")]) + +;; Vector-Scalar Narrowing Integer Right Shift Instructions. 
+(define_insn "@vn_wx" + [(set (match_operand:VWI 0 "register_operand" "=vd,&vd,vd,&vd, vd,&vd,vd,&vd, vr,&vr,vr,&vr, vr,&vr,vr,&vr") + (unspec:VWI + [(unspec:VWI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm, vm,vm,vm,vm, J,J,J,J, J,J,J,J") + (truncate:VWI + (any_shiftrt: + (match_operand: 3 "register_operand" "0,vr,0,vr, 0,vr,0,vr, 0,vr,0,vr, 0,vr,0,vr") + (match_operand 4 "p_reg_or_uimm5_operand" "r,r,K,K, r,r,K,K, r,r,K,K, r,r,K,K"))) + (match_operand:VWI 2 "vector_reg_or_const0_operand" "0,0,0,0, J,J,J,J, 0,0,0,0, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vn.wx\t%0,%3,%4,%1.t + vn.wx\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%4,%1.t + vn.wx\t%0,%3,%4,%1.t + vn.wx\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%4,%1.t + vn.wx\t%0,%3,%4 + vn.wx\t%0,%3,%4 + vn.wi\t%0,%3,%4 + vn.wi\t%0,%3,%4 + vn.wx\t%0,%3,%4 + vn.wx\t%0,%3,%4 + vn.wi\t%0,%3,%4 + vn.wi\t%0,%3,%4" + [(set_attr "type" "vshift") + (set_attr "mode" "")]) + +;; pseudo-instruction vncvt.x.x.w vd,vs,vm = vnsrl.wx vd,vs,x0,vm. +(define_insn "@vncvt_x_x_w" + [(set (match_operand:VWI 0 "register_operand" "=vd,&vd, vd,&vd, vr,&vr, vr,&vr") + (unspec:VWI + [(unspec:VWI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm, vm,vm, J,J, J,J") + (truncate:VWI + (match_operand: 3 "register_operand" "0,vr, 0,vr, 0,vr, 0,vr")) + (match_operand:VWI 2 "vector_reg_or_const0_operand" "0,0, J,J, 0,0, J,J") + ] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK, rK,rK, rK,rK, rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vncvt.x.x.w\t%0,%3,%1.t + vncvt.x.x.w\t%0,%3,%1.t + vncvt.x.x.w\t%0,%3,%1.t + vncvt.x.x.w\t%0,%3,%1.t + vncvt.x.x.w\t%0,%3 + vncvt.x.x.w\t%0,%3 + vncvt.x.x.w\t%0,%3 + vncvt.x.x.w\t%0,%3" + [(set_attr "type" "vshift") + (set_attr "mode" "")]) + +;; Vector-Vector Integer Comparision Instructions. 
+(define_insn "@vms_vv" + [(set (match_operand: 0 "register_operand" "=vr,vr,vm,&vr,vr,vm,&vr, vr,vr,vm,&vr,vr,vm,&vr, vr,vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,0,vm,vm,0,vm, vm,vm,0,vm,vm,0,vm, J,J,J,J,J") + (cmp_noltge: + (match_operand:VI 3 "register_operand" "0,vr,vr,vr,0,vr,vr, 0,vr,vr,vr,0,vr,vr, 0,vr,vr,0,vr") + (match_operand:VI 4 "vector_arith_operand" "vr,0,vr,vr,vi,vi,vi, vr,0,vr,vr,vi,vi,vi, vr,0,vr,vi,vi")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0,0,0,0, J,J,J,J,J,J,J, J,J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vv\t%0,%3,%4 + vms.vv\t%0,%3,%4 + vms.vv\t%0,%3,%4 + vms.vi\t%0,%3,%v4 + vms.vi\t%0,%3,%v4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +(define_insn "@vms_vv" + [(set (match_operand: 0 "register_operand" "=vr,vr,vm,&vr,vr,vm,&vr, vr,vr,vm,&vr,vr,vm,&vr, vr,vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,0,vm,vm,0,vm, vm,vm,0,vm,vm,0,vm, J,J,J,J,J") + (cmp_ltge: + (match_operand:VI 3 "register_operand" "0,vr,vr,vr,0,vr,vr, 0,vr,vr,vr,0,vr,vr, 0,vr,vr,0,vr") + (match_operand:VI 4 "vector_neg_arith_operand" "vr,0,vr,vr,vj,vj,vj, vr,0,vr,vr,vj,vj,vj, vr,0,vr,vj,vj")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0,0,0,0, J,J,J,J,J,J,J, J,J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vv\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vi\t%0,%3,%v4,%1.t + vms.vv\t%0,%3,%4 + vms.vv\t%0,%3,%4 + vms.vv\t%0,%3,%4 + vms.vi\t%0,%3,%v4 + vms.vi\t%0,%3,%v4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +;; Vector-Scalar Integer Comparision Instructions. 
+(define_insn "@vms_vx_internal" + [(set (match_operand: 0 "register_operand" "=vr,vm,&vr,vr,vm,&vr, vr,vm,&vr,vr,vm,&vr, vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,0,vm,vm,0,vm, vm,0,vm,vm,0,vm, J,J,J,J") + (cmp_noltge: + (match_operand:VI 3 "register_operand" "0,vr,vr,0,vr,vr, 0,vr,vr,0,vr,vr, 0,vr,0,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,r,r,Ws5,Ws5,Ws5, r,r,r,Ws5,Ws5,Ws5, r,r,Ws5,Ws5"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0,0,0, J,J,J,J,J,J, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4 + vms.vx\t%0,%3,%4 + vms.vi\t%0,%3,%4 + vms.vi\t%0,%3,%4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +(define_insn "@vms_vx_32bit" + [(set (match_operand: 0 "register_operand" "=vr,vm,&vr,vr,vm,&vr, vr,vm,&vr,vr,vm,&vr, vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,0,vm,vm,0,vm, vm,0,vm,vm,0,vm, J,J,J,J") + (cmp_noltge: + (match_operand:V64BITI 3 "register_operand" "0,vr,vr,0,vr,vr, 0,vr,vr,0,vr,vr, 0,vr,0,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_simm5_operand" "r,r,r,Ws5,Ws5,Ws5, r,r,r,Ws5,Ws5,Ws5, r,r,Ws5,Ws5")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0,0,0, J,J,J,J,J,J, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4 + vms.vx\t%0,%3,%4 + vms.vi\t%0,%3,%4 + vms.vi\t%0,%3,%4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +(define_insn "@vms_vx_internal" + [(set (match_operand: 0 "register_operand" "=vr,vm,&vr,vr,vm,&vr, vr,vm,&vr,vr,vm,&vr, vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,0,vm,vm,0,vm, vm,0,vm,vm,0,vm, J,J,J,J") + (cmp_lt: + (match_operand:VI 3 "register_operand" "0,vr,vr,0,vr,vr, 0,vr,vr,0,vr,vr, 0,vr,0,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_neg_simm5_operand" "r,r,r,Wn5,Wn5,Wn5, r,r,r,Wn5,Wn5,Wn5, r,r,Wn5,Wn5"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0,0,0, J,J,J,J,J,J, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + 
vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4 + vms.vx\t%0,%3,%4 + vms.vi\t%0,%3,%4 + vms.vi\t%0,%3,%4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +(define_insn "@vms_vx_32bit" + [(set (match_operand: 0 "register_operand" "=vr,vm,&vr,vr,vm,&vr, vr,vm,&vr,vr,vm,&vr, vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,0,vm,vm,0,vm, vm,0,vm,vm,0,vm, J,J,J,J") + (cmp_lt: + (match_operand:V64BITI 3 "register_operand" "0,vr,vr,0,vr,vr, 0,vr,vr,0,vr,vr, 0,vr,0,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_neg_simm5_operand" "r,r,r,Wn5,Wn5,Wn5, r,r,r,Wn5,Wn5,Wn5, r,r,Wn5,Wn5")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0,0,0, J,J,J,J,J,J, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK, rK,rK,rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4 + vms.vx\t%0,%3,%4 + vms.vi\t%0,%3,%4 + vms.vi\t%0,%3,%4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +(define_expand "@vms_vx_internal" + [(parallel + [(set (match_operand: 0 "register_operand") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand") + (cmp_ge: + (match_operand:VI 3 "register_operand") + (vec_duplicate:VI + (match_operand: 4 "reg_or_neg_simm5_operand"))) + (match_operand: 2 "vector_reg_or_const0_operand")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV)) + (clobber (scratch:))])] + "TARGET_VECTOR" + { + }) + +(define_expand "@vms_vx_32bit" + [(parallel + [(set (match_operand: 0 "register_operand") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand") + (cmp_ge: + (match_operand:V64BITI 3 "register_operand") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_neg_simm5_operand")))) + (match_operand: 2 "vector_reg_or_const0_operand")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV)) + (clobber (scratch:))])] + "TARGET_VECTOR" + { + }) + +;; When destination EEW is smaller than this source EEW, destination register and source register overlap in the +;; lowest-numbered part of the source register grop according to vector spec(at 5.2. Vector Operands). All +;; insructions are: integer and floating-point compare instructions, narrowing integer right shift instructions, +;; narrowing fixed-point clip instructions. +;; So change case 1 to case 2 to allow the overlap case, and let overlap case has high priority. 
+;; case 1: case 2: +;; match_operand 0 &vr match_operand 0 vr, ?&vr +;; match_operand 3 vr match_operand 3 0, vr +(define_insn "*vms_vx" + [(set (match_operand: 0 "register_operand" "=vm, vd,&vd,vd,&vd, &vd,vd,vd,&vd, vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "0, vm,vm,vm,vm, vm,vm,vm,vm, J,J,J,J") + (cmp_ge: + (match_operand:VI 3 "register_operand" "vr, 0,vr,0,vr, vr,0,0,vr, 0,vr,0,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_neg_simm5_operand" "r, r,r,Wn5,Wn5, r,r,Wn5,Wn5, r,r,Wn5,Wn5"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0, 0,0,0,0, J,J,J,J, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK, rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV)) + (clobber (match_scratch: 7 "=&vr,X,X,X,X, X,X,X,X, X,X,X,X"))] + "TARGET_VECTOR" + "@ + vms.vx\t%0,%3,%4,%1.t,%7 + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4 + vms.vx\t%0,%3,%4 + vms.vi\t%0,%3,%4 + vms.vi\t%0,%3,%4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +(define_insn "*vms_vx_32bit" + [(set (match_operand: 0 "register_operand" "=vm, vd,&vd,vd,&vd, &vd,vd,vd,&vd, vr,&vr,vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "0, vm,vm,vm,vm, vm,vm,vm,vm, J,J,J,J") + (cmp_ge: + (match_operand:V64BITI 3 "register_operand" "vr, 0,vr,0,vr, vr,0,0,vr, 0,vr,0,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_neg_simm5_operand" "r, r,r,Wn5,Wn5, r,r,Wn5,Wn5, r,r,Wn5,Wn5")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0, 0,0,0,0, J,J,J,J, J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK, rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV)) + (clobber (match_scratch: 7 "=&vr,X,X,X,X, X,X,X,X, X,X,X,X"))] + "TARGET_VECTOR" + "@ + vms.vx\t%0,%3,%4,%1.t,%7 + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vi\t%0,%3,%4,%1.t + vms.vx\t%0,%3,%4 + vms.vx\t%0,%3,%4 + vms.vi\t%0,%3,%4 + vms.vi\t%0,%3,%4" + [(set_attr "type" "vcmp") + (set_attr "mode" "")]) + +;; Vector-Vector Integer Signed/Unsigned Minimum/Maximum. +(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_minmax:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4 + v.vv\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Scalar Signed/Unsigned min/max. 
+(define_insn "@v_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_minmax:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +(define_insn "@v_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_minmax:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Vector Signed multiply, returning low bits of product. +(define_insn "@vmul_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (mult:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmul.vv\t%0,%3,%4,%1.t + vmul.vv\t%0,%3,%4,%1.t + vmul.vv\t%0,%3,%4 + vmul.vv\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Scalar Signed multiply, returning low bits of product. 
+(define_insn "@vmul_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (mult:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmul.vx\t%0,%3,%4,%1.t + vmul.vx\t%0,%3,zero,%1.t + vmul.vx\t%0,%3,%4,%1.t + vmul.vx\t%0,%3,zero,%1.t + vmul.vx\t%0,%3,%4 + vmul.vx\t%0,%3,zero + vmul.vx\t%0,%3,%4 + vmul.vx\t%0,%3,zero" + [(set_attr "type" "vmul") + (set_attr "mode" "")]) + +(define_insn "@vmul_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (mult:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmul.vx\t%0,%3,%4,%1.t + vmul.vx\t%0,%3,zero,%1.t + vmul.vx\t%0,%3,%4,%1.t + vmul.vx\t%0,%3,zero,%1.t + vmul.vx\t%0,%3,%4 + vmul.vx\t%0,%3,zero + vmul.vx\t%0,%3,%4 + vmul.vx\t%0,%3,zero" + [(set_attr "type" "vmul") + (set_attr "mode" "")]) + +;; Vector-Vector Signed/Unsigned highpart multiply, returning high bits of product. +(define_insn "@vmulh_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")] + MUL_HIGHPART) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmulh.vv\t%0,%3,%4,%1.t + vmulh.vv\t%0,%3,%4,%1.t + vmulh.vv\t%0,%3,%4 + vmulh.vv\t%0,%3,%4" + [(set_attr "type" "vmulh") + (set_attr "mode" "")]) + +;; Vector-Scalar Signed/Unsigned multiply, returning high bits of product. 
+(define_insn "@vmulh_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")] + MUL_HIGHPART) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmulh.vx\t%0,%3,%4,%1.t + vmulh.vx\t%0,%3,zero,%1.t + vmulh.vx\t%0,%3,%4,%1.t + vmulh.vx\t%0,%3,zero,%1.t + vmulh.vx\t%0,%3,%4 + vmulh.vx\t%0,%3,zero + vmulh.vx\t%0,%3,%4 + vmulh.vx\t%0,%3,zero" + [(set_attr "type" "vmulh") + (set_attr "mode" "")]) + +(define_insn "@vmulh_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V64BITI + [(match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")] + MUL_HIGHPART) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmulh.vx\t%0,%3,%4,%1.t + vmulh.vx\t%0,%3,zero,%1.t + vmulh.vx\t%0,%3,%4,%1.t + vmulh.vx\t%0,%3,zero,%1.t + vmulh.vx\t%0,%3,%4 + vmulh.vx\t%0,%3,zero + vmulh.vx\t%0,%3,%4 + vmulh.vx\t%0,%3,zero" + [(set_attr "type" "vmulh") + (set_attr "mode" "")]) + +;; Vector-Vector Signed(vs2)-Unsigned multiply, returning high bits of product. +(define_insn "@vmulhsu_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")] UNSPEC_VMULHSU) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmulhsu.vv\t%0,%3,%4,%1.t + vmulhsu.vv\t%0,%3,%4,%1.t + vmulhsu.vv\t%0,%3,%4 + vmulhsu.vv\t%0,%3,%4" + [(set_attr "type" "vmulh") + (set_attr "mode" "")]) + +;; Vector-Scalar Signed(vs2)-Unsigned multiply, returning high bits of product. 
+(define_insn "@vmulhsu_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")] + UNSPEC_VMULHSU) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmulhsu.vx\t%0,%3,%4,%1.t + vmulhsu.vx\t%0,%3,zero,%1.t + vmulhsu.vx\t%0,%3,%4,%1.t + vmulhsu.vx\t%0,%3,zero,%1.t + vmulhsu.vx\t%0,%3,%4 + vmulhsu.vx\t%0,%3,zero + vmulhsu.vx\t%0,%3,%4 + vmulhsu.vx\t%0,%3,zero" + [(set_attr "type" "vmulh") + (set_attr "mode" "")]) + +(define_insn "@vmulhsu_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V64BITI + [(match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")] + UNSPEC_VMULHSU) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmulhsu.vx\t%0,%3,%4,%1.t + vmulhsu.vx\t%0,%3,zero,%1.t + vmulhsu.vx\t%0,%3,%4,%1.t + vmulhsu.vx\t%0,%3,zero,%1.t + vmulhsu.vx\t%0,%3,%4 + vmulhsu.vx\t%0,%3,zero + vmulhsu.vx\t%0,%3,%4 + vmulhsu.vx\t%0,%3,zero" + [(set_attr "type" "vmulh") + (set_attr "mode" "")]) + +;; Vector-Vector Signed/Unsigned divide/remainder. +(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_div:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4 + v.vv\t%0,%3,%4" + [(set_attr "type" "vdiv") + (set_attr "mode" "")]) + +;; Vector-Scalar Signed/Unsigned divide/remainder. 
+(define_insn "@v_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_div:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero" + [(set_attr "type" "vdiv") + (set_attr "mode" "")]) + +(define_insn "@v_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_div:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + ] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero" + [(set_attr "type" "vdiv") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Signed/Unsigned Integer multiply. +(define_insn "@vwmul_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (mult: + (any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr")) + (any_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmul.vv\t%0,%3,%4,%1.t + vwmul.vv\t%0,%3,%4,%1.t + vwmul.vv\t%0,%3,%4 + vwmul.vv\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Signed/Unsigned Integer multiply. 
+(define_insn "@vwmul_vx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (mult: + (any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr")) + (any_extend: + (vec_duplicate:VWI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmul.vx\t%0,%3,%4,%1.t + vwmul.vx\t%0,%3,zero,%1.t + vwmul.vx\t%0,%3,%4,%1.t + vwmul.vx\t%0,%3,zero,%1.t + vwmul.vx\t%0,%3,%4 + vwmul.vx\t%0,%3,zero + vwmul.vx\t%0,%3,%4 + vwmul.vx\t%0,%3,zero" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Signed-Unsigned Integer multiply. +(define_insn "@vwmulsu_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (mult: + (sign_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr")) + (zero_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmulsu.vv\t%0,%3,%4,%1.t + vwmulsu.vv\t%0,%3,%4,%1.t + vwmulsu.vv\t%0,%3,%4 + vwmulsu.vv\t%0,%3,%4" + [(set_attr "type" "vwmul") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Signed-Unsigned Integer multiply. +(define_insn "@vwmulsu_vx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (mult: + (sign_extend: + (match_operand:VWI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr")) + (zero_extend: + (vec_duplicate:VWI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmulsu.vx\t%0,%3,%4,%1.t + vwmulsu.vx\t%0,%3,zero,%1.t + vwmulsu.vx\t%0,%3,%4,%1.t + vwmulsu.vx\t%0,%3,zero,%1.t + vwmulsu.vx\t%0,%3,%4 + vwmulsu.vx\t%0,%3,zero + vwmulsu.vx\t%0,%3,%4 + vwmulsu.vx\t%0,%3,zero" + [(set_attr "type" "vwmul") + (set_attr "mode" "")]) + +;; Vector-Vector Single-Width Integer Multiply-Add Instructions. +(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (unspec:VI + [(match_operand:VI 2 "register_operand" "0,0") + (match_operand:VI 3 "register_operand" "vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr")] IMAC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4" + [(set_attr "type" "vmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Single-Width Integer Multiply-Add Instructions. 
+(define_insn "@v_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VI + [(match_operand:VI 2 "register_operand" "0,0,0,0") + (vec_duplicate:VI + (match_operand: 3 "reg_or_0_operand" "r,J,r,J")) + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")] IMAC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,zero,%4,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,zero,%4" + [(set_attr "type" "vmadd") + (set_attr "mode" "")]) + +(define_insn "@v_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:V64BITI + [(match_operand:V64BITI 2 "register_operand" "0,0,0,0") + (vec_duplicate:V64BITI + (sign_extend: (match_operand:SI 3 "reg_or_0_operand" "r,J,r,J"))) + (match_operand:V64BITI 4 "register_operand" "vr,vr,vr,vr")] IMAC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,zero,%4,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,zero,%4" + [(set_attr "type" "vmadd") + (set_attr "mode" "")]) + +;; Vector-Vector Widening signed-integer multiply-add, overwrite addend. +;; Vector-Vector Widening unsigned-integer multiply-add, overwrite addend. +(define_insn "@vwmacc_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (plus: + (mult: + (any_extend: + (match_operand:VWI 3 "register_operand" "vr,vr")) + (any_extend: + (match_operand:VWI 4 "register_operand" "vr,vr"))) + (match_operand: 2 "register_operand" "0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmacc.vv\t%0,%3,%4,%1.t + vwmacc.vv\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening signed-integer multiply-add, overwrite addend. +;; Vector-Scalar Widening unsigned-integer multiply-add, overwrite addend. +(define_insn "@vwmacc_vx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus: + (mult: + (any_extend: + (vec_duplicate:VWI + (match_operand: 3 "reg_or_0_operand" "r,J,r,J"))) + (any_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "register_operand" "0,0,0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmacc.vx\t%0,%3,%4,%1.t + vwmacc.vx\t%0,zero,%4,%1.t + vwmacc.vx\t%0,%3,%4 + vwmacc.vx\t%0,zero,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Vector Widening signed-unsigned-integer multiply-add, overwrite addend. 
+(define_insn "@vwmaccsu_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (plus: + (mult: + (sign_extend: + (match_operand:VWI 3 "register_operand" "vr,vr")) + (zero_extend: + (match_operand:VWI 4 "register_operand" "vr,vr"))) + (match_operand: 2 "register_operand" "0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmaccsu.vv\t%0,%3,%4,%1.t + vwmaccsu.vv\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening signed-unsigned-integer multiply-add, overwrite addend. +(define_insn "@vwmaccsu_vx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus: + (mult: + (sign_extend: + (vec_duplicate:VWI + (match_operand: 3 "reg_or_0_operand" "r,J,r,J"))) + (zero_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "register_operand" "0,0,0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmaccsu.vx\t%0,%3,%4,%1.t + vwmaccsu.vx\t%0,zero,%4,%1.t + vwmaccsu.vx\t%0,%3,%4 + vwmaccsu.vx\t%0,zero,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening unsigned-signed-integer multiply-add, overwrite addend. +(define_insn "@vwmaccus_vx" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus: + (mult: + (zero_extend: + (vec_duplicate:VWI + (match_operand: 3 "reg_or_0_operand" "r,J,r,J"))) + (sign_extend: + (match_operand:VWI 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "register_operand" "0,0,0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwmaccus.vx\t%0,%3,%4,%1.t + vwmaccus.vx\t%0,zero,%4,%1.t + vwmaccus.vx\t%0,%3,%4 + vwmaccus.vx\t%0,zero,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Vector integer and float merge. +(define_insn "@vmerge_vvm" + [(set (match_operand:V 0 "register_operand" "=vd,vd,vd,vd") + (unspec:V + [(match_operand:V 2 "vector_reg_or_const0_operand" "0,0,J,J") + (unspec:V + [(match_operand: 1 "register_operand" "vm,vm,vm,vm") + (match_operand:V 3 "register_operand" "vr,vr,vr,vr") + (match_operand:V 4 "vector_arith_operand" "vr,vi,vr,vi")] UNSPEC_MERGE) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmerge.vvm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%v4,%1 + vmerge.vvm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%v4,%1" + [(set_attr "type" "vmerge") + (set_attr "mode" "")]) + +;; Vector-Scalar integer merge. 
+(define_insn "@vmerge_vxm_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd") + (unspec:VI + [(match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J") + (unspec:VI + [(match_operand: 1 "register_operand" "vm,vm,vm,vm") + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5"))] UNSPEC_MERGE) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmerge.vxm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%4,%1 + vmerge.vxm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%4,%1" + [(set_attr "type" "vmerge") + (set_attr "mode" "")]) + +(define_insn "@vmerge_vxm_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd") + (unspec:V64BITI + [(match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J") + (unspec:V64BITI + [(match_operand: 1 "register_operand" "vm,vm,vm,vm") + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5")))] UNSPEC_MERGE) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmerge.vxm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%4,%1 + vmerge.vxm\t%0,%3,%4,%1 + vmerge.vim\t%0,%3,%4,%1" + [(set_attr "type" "vmerge") + (set_attr "mode" "")]) +;; Vector-Vector Integer/Float Move. +(define_insn "@vmv_v_v" + [(set (match_operand:V 0 "register_operand" "=vr,vr") + (unspec:V + [(match_operand:V 1 "vector_reg_or_const0_operand" "0,J") + (unspec:V + [(match_operand:V 2 "register_operand" "vr,vr")] UNSPEC_MOVE) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmv.v.v\t%0,%2" + [(set_attr "type" "vmove") + (set_attr "mode" "")]) + ;; Vector-Scalar Integer Move. (define_insn "@vmv_v_x_internal" [(set (match_operand:VI 0 "register_operand" "=vr,vr,vr,vr") @@ -1055,29 +3577,6 @@ [(set_attr "type" "vmove") (set_attr "mode" "")]) -;; Vector-Scalar integer merge. 
-(define_insn "@vmerge_vxm_internal" - [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd") - (unspec:VI - [(match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J") - (unspec:VI - [(match_operand: 1 "register_operand" "vm,vm,vm,vm") - (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") - (vec_duplicate:VI - (match_operand: 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5"))] UNSPEC_MERGE) - (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") - (match_operand 6 "const_int_operand") - (reg:SI VL_REGNUM) - (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] - "TARGET_VECTOR" - "@ - vmerge.vxm\t%0,%3,%4,%1 - vmerge.vim\t%0,%3,%4,%1 - vmerge.vxm\t%0,%3,%4,%1 - vmerge.vim\t%0,%3,%4,%1" - [(set_attr "type" "vmerge") - (set_attr "mode" "")]) - ;; vmclr.m vd -> vmxor.mm vd,vd,vd # Clear mask register (define_insn "@vmclr_m" [(set (match_operand:VB 0 "register_operand" "=vr") From patchwork Tue May 31 08:50:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 54557 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EC6AA386F463 for ; Tue, 31 May 2022 09:01:56 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg152.qq.com (smtpbg152.qq.com [13.245.186.79]) by sourceware.org (Postfix) with ESMTPS id A508438356B6 for ; Tue, 31 May 2022 08:52:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A508438356B6 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp84t1653987144t9ofps4y Received: from server1.localdomain ( [42.247.22.65]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 31 May 2022 16:52:23 +0800 (CST) X-QQ-SSF: 01400000002000B0F000000A0000000 X-QQ-FEAT: aFXyU9/pqzVgBsVaeyA/RInjUVma9GxVvksuVvy/5uF2plGX01VydmQTlNEcQ jF7XeqArZgc/Y0ZA0lNz9lCbSKgvj/tfsD0k9VnBv4F/JmmAmogDERc854QIz9UxJx8HzhP mqJE1lO/jdEEO7XWgc+eSVl2KolNCPgVGXfr3EWMEnGlLi/LkwOLhTzs77lAWTVBBMey1jw dAKH0bbhtL32isu8F1cTZ/tgv/y2zoQhOD0UUZHaEnLPBIhupYSSaecf1Ngyb1xNP/MokOM sZr6AMvqR3m6CaP29T7cBhsVZbCQhgQHfoh3Ei2gddhqcI4oqVTfp5X12LJAtyebI/cVEqV Dp74PU5mkef3Bcvzd/Ge3drufWQsjpwv9kJHTxqzNAkZl6lcXQ= X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 18/21] Add rest intrinsic support Date: Tue, 31 May 2022 16:50:09 +0800 Message-Id: <20220531085012.269719-19-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai> References: <20220531085012.269719-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign4 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: 
zhongjuzhe gcc/ChangeLog: * config/riscv/riscv-vector-builtins-functions.cc (reduceop::assemble_name): New function. (reduceop::get_argument_types): New function. (reduceop::get_mask_type): New function. (vsadd::expand): New function. (vsaddu::expand): New function. (vaadd::expand): New function. (vaaddu::expand): New function. (vssub::expand): New function. (vssubu::expand): New function. (vasub::expand): New function. (vasubu::expand): New function. (vssrl::expand): New function. (vssra::expand): New function. (vsmul::expand): New function. (vnclip::expand): New function. (vnclipu::expand): New function. (funop::call_properties): New function. (fbinop::call_properties): New function. (fwbinop::call_properties): New function. (fternop::call_properties): New function. (vfadd::expand): New function. (vfsub::expand): New function. (vfmul::expand): New function. (vfdiv::expand): New function. (vfrsub::expand): New function. (vfrdiv::expand): New function. (vfneg::expand): New function. (vfwadd::expand): New function. (vfwsub::expand): New function. (vfwmul::expand): New function. (vfmacc::expand): New function. (vfmsac::expand): New function. (vfnmacc::expand): New function. (vfnmsac::expand): New function. (vfmadd::expand): New function. (vfnmadd::expand): New function. (vfmsub::expand): New function. (vfnmsub::expand): New function. (vfwmacc::expand): New function. (vfwnmacc::expand): New function. (vfwmsac::expand): New function. (vfwnmsac::expand): New function. (vfsqrt::expand): New function. (vfrsqrt7::expand): New function. (vfrec7::expand): New function. (vfmax::expand): New function. (vfmin::expand): New function. (vfsgnj::expand): New function. (vfsgnjn::expand): New function. (vfsgnjx::expand): New function. (vfabs::expand): New function. (vfcmp::assemble_name): New function. (vmfeq::expand): New function. (vmfne::expand): New function. (vmflt::expand): New function. (vmfgt::expand): New function. (vmfle::expand): New function. (vmfge::expand): New function. (vfclass::expand): New function. (vfmerge::get_position_of_dest_arg): New function. (vfmerge::expand): New function. (vfmv::can_be_overloaded_p): New function. (vfmv::expand): New function. (vfcvt_f2i::assemble_name): New function. (vfcvt_f2i::expand): New function. (vfcvt_f2u::assemble_name): New function. (vfcvt_f2u::expand): New function. (vfcvt_rtz_f2i::assemble_name): New function. (vfcvt_rtz_f2i::expand): New function. (vfcvt_rtz_f2u::assemble_name): New function. (vfcvt_rtz_f2u::expand): New function. (vfcvt_i2f::assemble_name): New function. (vfcvt_i2f::expand): New function. (vfcvt_u2f::assemble_name): New function. (vfcvt_u2f::expand): New function. (vfwcvt_f2i::assemble_name): New function. (vfwcvt_f2i::expand): New function. (vfwcvt_f2u::assemble_name): New function. (vfwcvt_f2u::expand): New function. (vfwcvt_rtz_f2i::assemble_name): New function. (vfwcvt_rtz_f2i::expand): New function. (vfwcvt_rtz_f2u::assemble_name): New function. (vfwcvt_rtz_f2u::expand): New function. (vfwcvt_i2f::assemble_name): New function. (vfwcvt_i2f::expand): New function. (vfwcvt_u2f::assemble_name): New function. (vfwcvt_u2f::expand): New function. (vfwcvt_f2f::assemble_name): New function. (vfwcvt_f2f::expand): New function. (vfncvt_f2i::assemble_name): New function. (vfncvt_f2i::expand): New function. (vfncvt_f2u::assemble_name): New function. (vfncvt_f2u::expand): New function. (vfncvt_rtz_f2i::assemble_name): New function. (vfncvt_rtz_f2i::expand): New function. (vfncvt_rtz_f2u::assemble_name): New function. 
(vfncvt_rtz_f2u::expand): New function. (vfncvt_i2f::assemble_name): New function. (vfncvt_i2f::expand): New function. (vfncvt_u2f::assemble_name): New function. (vfncvt_u2f::expand): New function. (vfncvt_f2f::assemble_name): New function. (vfncvt_f2f::expand): New function. (vfncvt_f2rodf::assemble_name): New function. (vfncvt_f2rodf::expand): New function. (vredsum::expand): New function. (vredmax::expand): New function. (vredmaxu::expand): New function. (vredmin::expand): New function. (vredminu::expand): New function. (vredand::expand): New function. (vredor::expand): New function. (vredxor::expand): New function. (vwredsum::expand): New function. (vwredsumu::expand): New function. (freduceop::call_properties): New function. (vfredosum::expand): New function. (vfredusum::expand): New function. (vfredmax::expand): New function. (vfredmin::expand): New function. (vfwredosum::expand): New function. (vfwredusum::expand): New function. (vmand::expand): New function. (vmor::expand): New function. (vmxor::expand): New function. (vmnand::expand): New function. (vmnor::expand): New function. (vmxnor::expand): New function. (vmandn::expand): New function. (vmorn::expand): New function. (vmmv::expand): New function. (vmnot::expand): New function. (vmclr::get_argument_types): New function. (vmclr::can_be_overloaded_p): New function. (vmclr::expand): New function. (vmset::get_argument_types): New function. (vmset::can_be_overloaded_p): New function. (vmset::expand): New function. (vcpop::get_return_type): New function. (vcpop::expand): New function. (vfirst::get_return_type): New function. (vfirst::expand): New function. (vmsbf::expand): New function. (vmsif::expand): New function. (vmsof::expand): New function. (viota::can_be_overloaded_p): New function. (viota::expand): New function. (vid::get_argument_types): New function. (vid::can_be_overloaded_p): New function. (vid::expand): New function. (vmv_x_s::assemble_name): New function. (vmv_x_s::expand): New function. (vmv_s_x::assemble_name): New function. (vmv_s_x::expand): New function. (vfmv_f_s::assemble_name): New function. (vfmv_f_s::expand): New function. (vfmv_s_f::assemble_name): New function. (vfmv_s_f::expand): New function. (vslideup::expand): New function. (vslidedown::expand): New function. (vslide1up::expand): New function. (vslide1down::expand): New function. (vfslide1up::expand): New function. (vfslide1down::expand): New function. (vrgather::expand): New function. (vrgatherei16::expand): New function. (vcompress::get_position_of_dest_arg): New function. (vcompress::expand): New function. * config/riscv/riscv-vector-builtins-functions.def (vsadd): New macro define. (vsaddu): New macro define. (vaadd): New macro define. (vaaddu): New macro define. (vssub): New macro define. (vssubu): New macro define. (vasub): New macro define. (vasubu): New macro define. (vsmul): New macro define. (vssrl): New macro define. (vssra): New macro define. (vnclip): New macro define. (vnclipu): New macro define. (vfadd): New macro define. (vfsub): New macro define. (vfmul): New macro define. (vfdiv): New macro define. (vfrsub): New macro define. (vfrdiv): New macro define. (vfneg): New macro define. (vfwadd): New macro define. (vfwsub): New macro define. (vfwmul): New macro define. (vfmacc): New macro define. (vfmsac): New macro define. (vfnmacc): New macro define. (vfnmsac): New macro define. (vfmadd): New macro define. (vfnmadd): New macro define. (vfmsub): New macro define. (vfnmsub): New macro define. (vfwmacc): New macro define. 
(vfwmsac): New macro define. (vfwnmacc): New macro define. (vfwnmsac): New macro define. (vfsqrt): New macro define. (vfrsqrt7): New macro define. (vfrec7): New macro define. (vfmax): New macro define. (vfmin): New macro define. (vfsgnj): New macro define. (vfsgnjn): New macro define. (vfsgnjx): New macro define. (vfabs): New macro define. (vmfeq): New macro define. (vmfne): New macro define. (vmflt): New macro define. (vmfle): New macro define. (vmfgt): New macro define. (vmfge): New macro define. (vfclass): New macro define. (vfmerge): New macro define. (vfmv): New macro define. (vfcvt_x_f_v): New macro define. (vfcvt_xu_f_v): New macro define. (vfcvt_rtz_x_f_v): New macro define. (vfcvt_rtz_xu_f_v): New macro define. (vfcvt_f_x_v): New macro define. (vfcvt_f_xu_v): New macro define. (vfwcvt_x_f_v): New macro define. (vfwcvt_xu_f_v): New macro define. (vfwcvt_rtz_x_f_v): New macro define. (vfwcvt_rtz_xu_f_v): New macro define. (vfwcvt_f_x_v): New macro define. (vfwcvt_f_xu_v): New macro define. (vfwcvt_f_f_v): New macro define. (vfncvt_x_f_w): New macro define. (vfncvt_xu_f_w): New macro define. (vfncvt_rtz_x_f_w): New macro define. (vfncvt_rtz_xu_f_w): New macro define. (vfncvt_f_x_w): New macro define. (vfncvt_f_xu_w): New macro define. (vfncvt_f_f_w): New macro define. (vfncvt_rod_f_f_w): New macro define. (vredsum): New macro define. (vredmax): New macro define. (vredmaxu): New macro define. (vredmin): New macro define. (vredminu): New macro define. (vredand): New macro define. (vredor): New macro define. (vredxor): New macro define. (vwredsum): New macro define. (vwredsumu): New macro define. (vfredosum): New macro define. (vfredusum): New macro define. (vfredmax): New macro define. (vfredmin): New macro define. (vfwredosum): New macro define. (vfwredusum): New macro define. (vmand): New macro define. (vmor): New macro define. (vmxor): New macro define. (vmnand): New macro define. (vmnor): New macro define. (vmxnor): New macro define. (vmandn): New macro define. (vmorn): New macro define. (vmmv): New macro define. (vmnot): New macro define. (vmclr): New macro define. (vmset): New macro define. (vcpop): New macro define. (vfirst): New macro define. (vmsbf): New macro define. (vmsif): New macro define. (vmsof): New macro define. (viota): New macro define. (vid): New macro define. (vmv_x_s): New macro define. (vmv_s_x): New macro define. (vfmv_f_s): New macro define. (vfmv_s_f): New macro define. (vslideup): New macro define. (vslidedown): New macro define. (vslide1up): New macro define. (vslide1down): New macro define. (vfslide1up): New macro define. (vfslide1down): New macro define. (vrgather): New macro define. (vrgatherei16): New macro define. (vcompress): New macro define. * config/riscv/riscv-vector-builtins-functions.h (class reduceop): New class. (class vsadd): New class. (class vsaddu): New class. (class vaadd): New class. (class vaaddu): New class. (class vssub): New class. (class vssubu): New class. (class vasub): New class. (class vasubu): New class. (class vssrl): New class. (class vssra): New class. (class vsmul): New class. (class vnclip): New class. (class vnclipu): New class. (class funop): New class. (class fbinop): New class. (class fwbinop): New class. (class fternop): New class. (class vfadd): New class. (class vfsub): New class. (class vfmul): New class. (class vfdiv): New class. (class vfrsub): New class. (class vfrdiv): New class. (class vfneg): New class. (class vfwadd): New class. (class vfwsub): New class. (class vfwmul): New class. (class vfmacc): New class. 
(class vfmsac): New class. (class vfnmacc): New class. (class vfnmsac): New class. (class vfmadd): New class. (class vfnmadd): New class. (class vfmsub): New class. (class vfnmsub): New class. (class vfwmacc): New class. (class vfwmsac): New class. (class vfwnmacc): New class. (class vfwnmsac): New class. (class vfsqrt): New class. (class vfrsqrt7): New class. (class vfrec7): New class. (class vfmax): New class. (class vfmin): New class. (class vfsgnj): New class. (class vfsgnjn): New class. (class vfsgnjx): New class. (class vfabs): New class. (class vfcmp): New class. (class vmfeq): New class. (class vmfne): New class. (class vmflt): New class. (class vmfle): New class. (class vmfgt): New class. (class vmfge): New class. (class vfclass): New class. (class vfmerge): New class. (class vfmv): New class. (class vfcvt_f2i): New class. (class vfcvt_f2u): New class. (class vfcvt_rtz_f2i): New class. (class vfcvt_rtz_f2u): New class. (class vfcvt_i2f): New class. (class vfcvt_u2f): New class. (class vfwcvt_f2i): New class. (class vfwcvt_f2u): New class. (class vfwcvt_rtz_f2i): New class. (class vfwcvt_rtz_f2u): New class. (class vfwcvt_i2f): New class. (class vfwcvt_u2f): New class. (class vfwcvt_f2f): New class. (class vfncvt_f2i): New class. (class vfncvt_f2u): New class. (class vfncvt_rtz_f2i): New class. (class vfncvt_rtz_f2u): New class. (class vfncvt_i2f): New class. (class vfncvt_u2f): New class. (class vfncvt_f2f): New class. (class vfncvt_f2rodf): New class. (class vredsum): New class. (class vredmax): New class. (class vredmaxu): New class. (class vredmin): New class. (class vredminu): New class. (class vredand): New class. (class vredor): New class. (class vredxor): New class. (class vwredsum): New class. (class vwredsumu): New class. (class freduceop): New class. (class vfredosum): New class. (class vfredusum): New class. (class vfredmax): New class. (class vfredmin): New class. (class vfwredosum): New class. (class vfwredusum): New class. (class vmand): New class. (class vmor): New class. (class vmxor): New class. (class vmnand): New class. (class vmnor): New class. (class vmxnor): New class. (class vmandn): New class. (class vmorn): New class. (class vmmv): New class. (class vmnot): New class. (class vmclr): New class. (class vmset): New class. (class vcpop): New class. (class vfirst): New class. (class vmsbf): New class. (class vmsif): New class. (class vmsof): New class. (class viota): New class. (class vid): New class. (class vmv_x_s): New class. (class vmv_s_x): New class. (class vfmv_f_s): New class. (class vfmv_s_f): New class. (class vslideup): New class. (class vslidedown): New class. (class vslide1up): New class. (class vslide1down): New class. (class vfslide1up): New class. (class vfslide1down): New class. (class vrgather): New class. (class vrgatherei16): New class. (class vcompress): New class. * config/riscv/riscv-vector.cc (rvv_adjust_frame): Change to static function. (enum GEN_CLASS): Change to static function. (modify_operands): Change to static function. (emit_op5_vmv_s_x): Change to static function. (emit_op5): Change to static function. (emit_op7_slide1): Change to static function. (emit_op7): Change to static function. * config/riscv/vector-iterators.md: Fix iterstors. * config/riscv/vector.md (@v_s_x): New pattern. (@vslide1_vx): New pattern. (vmv_vlx2_help): New pattern. (@vfmv_v_f): New pattern. (@v_vv): New pattern. (@vmclr_m): New pattern. (@vsssub_vv): New pattern. (@vussub_vv): New pattern. (@vmset_m): New pattern. (@v_vx_internal): New pattern. 
(@v_vx_32bit): New pattern. (@vsssub_vx_internal): New pattern. (@vussub_vx_internal): New pattern. (@vsssub_vx_32bit): New pattern. (@vussub_vx_32bit): New pattern. (@v_vv): New pattern. (@v_vx_internal): New pattern. (@v_vx_32bit): New pattern. (@v_vv): New pattern. (@v_vx): New pattern. (@vn_wv): New pattern. (@vn_wx): New pattern. (@vf_vv): New pattern. (@vf_vf): New pattern. (@vfr_vf): New pattern. (@vfw_vv): New pattern. (@vfw_vf): New pattern. (@vfw_wv): New pattern. (@vfw_wf): New pattern. (@vfwmul_vv): New pattern. (@vfwmul_vf): New pattern. (@vf_vv): New pattern. (@vf_vf): New pattern. (@vfwmacc_vv): New pattern. (@vfwmsac_vv): New pattern. (@vfwmacc_vf): New pattern. (@vfwmsac_vf): New pattern. (@vfwnmacc_vv): New pattern. (@vfwnmsac_vv): New pattern. (@vfwnmacc_vf): New pattern. (@vfwnmsac_vf): New pattern. (@vfsqrt_v): New pattern. (@vf_v): New pattern. (@vfsgnj_vv): New pattern. (@vfsgnj_vf): New pattern. (@vfneg_v): New pattern. (@vfabs_v): New pattern. (@vmf_vv): New pattern. (@vmf_vf): New pattern. (@vfclass_v): New pattern. (@vfmerge_vfm): New pattern. (@vfcvt_x_f_v): New pattern. (@vfcvt_rtz_x_f_v): New pattern. (@vfcvt_f_x_v): New pattern. (@vfwcvt_x_f_v): New pattern. (@vfwcvt_rtz_x_f_v): New pattern. (@vfwcvt_f_x_v): New pattern. (@vfwcvt_f_f_v): New pattern. (@vfncvt_x_f_w): New pattern. (@vfncvt_rtz_x_f_w): New pattern. (@vfncvt_f_x_w): New pattern. (@vfncvt_f_f_w): New pattern. (@vfncvt_rod_f_f_w): New pattern. (@vred_vs): New pattern. (@vwredsum_vs): New pattern. (@vfred_vs): New pattern. (@vfwredusum_vs): New pattern. (@vfwredosum_vs): New pattern. (@vm_mm): New pattern. (@vmn_mm): New pattern. (@vmnot_mm): New pattern. (@vmmv_m): New pattern. (@vmnot_m): New pattern. (@vcpop__m): New pattern. (@vfirst__m): New pattern. (@vm_m): New pattern. (@viota_m): New pattern. (@vid_v): New pattern. (@vmv_x_s): New pattern. (vmv_x_s_di_internal): New pattern. (vmv_x_s_lo): New pattern. (vmv_x_s_hi): New pattern. (@vmv_s_x_internal): New pattern. (@vmv_s_x_32bit): New pattern. (@vfmv_f_s): New pattern. (@vfmv_s_f): New pattern. (@vslide_vx): New pattern. (@vslide1_vx_internal): New pattern. (@vslide1_vx_32bit): New pattern. (@vfslide1_vf): New pattern. (@vrgather_vv): New pattern. (@vrgatherei16_vv): New pattern. (@vrgather_vx): New pattern. (@vcompress_vm): New pattern. --- .../riscv/riscv-vector-builtins-functions.cc | 1703 +++++++++++ .../riscv/riscv-vector-builtins-functions.def | 148 + .../riscv/riscv-vector-builtins-functions.h | 1367 +++++++++ gcc/config/riscv/riscv-vector.cc | 145 +- gcc/config/riscv/vector-iterators.md | 3 + gcc/config/riscv/vector.md | 2508 ++++++++++++++++- 6 files changed, 5830 insertions(+), 44 deletions(-) diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.cc b/gcc/config/riscv/riscv-vector-builtins-functions.cc index 6e0fd0b3570..fe3a477347e 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.cc +++ b/gcc/config/riscv/riscv-vector-builtins-functions.cc @@ -3063,6 +3063,1709 @@ vmv::expand (const function_instance &instance, tree exp, rtx target) const return expand_builtin_insn (icode, exp, target, instance); } +/* A function implementation for reduction functions. 
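   A reduction combines every element of a source vector with element 0 of a
   second operand and writes the result into element 0 of the destination.
   As a hedged usage sketch (the intrinsic spelling and argument list are
   assumed for illustration, not quoted from this patch):

       vint32m1_t r = vredsum_vs_i32m1_i32m1 (vec, scalar, vl);
       (r[0] = scalar[0] + vec[0] + vec[1] + ...; the remaining elements
        follow the tail policy)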
*/ +char * +reduceop::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +void +reduceop::get_argument_types (const function_instance &instance, + vec &argument_types) const +{ + for (unsigned int i = 1; i < instance.get_arg_pattern ().arg_len; i++) + argument_types.quick_push (get_dt_t_with_index (instance, i)); +} + +tree +reduceop::get_mask_type (tree, const function_instance &, + const vec &argument_types) const +{ + machine_mode mask_mode; + gcc_assert (rvv_get_mask_mode (TYPE_MODE (argument_types[0])).exists (&mask_mode)); + return mode2mask_t (mask_mode); +} + +/* A function implementation for vsadd functions. */ +rtx +vsadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (SS_PLUS, mode); + else + icode = code_for_v_vx (UNSPEC_VSADD, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vsaddu functions. */ +rtx +vsaddu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (US_PLUS, mode); + else + icode = code_for_v_vx (UNSPEC_VSADDU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vaadd functions. */ +rtx +vaadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_AADD, mode); + else + icode = code_for_v_vx (UNSPEC_VAADD, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vaaddu functions. */ +rtx +vaaddu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_AADDU, mode); + else + icode = code_for_v_vx (UNSPEC_VAADDU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vssub functions. */ +rtx +vssub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vsssub_vv (mode); + else + icode = code_for_v_vx (UNSPEC_VSSUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vssubu functions. */ +rtx +vssubu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vussub_vv (mode); + else + icode = code_for_v_vx (UNSPEC_VSSUBU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vasub functions. 
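   vaadd/vaaddu and vasub/vasubu are the averaging fixed-point ops: the sum or
   difference is shifted right by one bit and rounded as selected by the vxrm
   CSR, hence the dedicated unspecs rather than plain plus/minus.  A rough
   usage sketch (intrinsic spelling assumed for illustration):

       vint32m1_t d = vasub_vv_i32m1 (a, b, vl);
       (d[i] = (a[i] - b[i]) >> 1, rounded per vxrm)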
*/ +rtx +vasub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_ASUB, mode); + else + icode = code_for_v_vx (UNSPEC_VASUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vasubu functions. */ +rtx +vasubu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_ASUBU, mode); + else + icode = code_for_v_vx (UNSPEC_VASUBU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vssrl functions. */ +rtx +vssrl::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_SSRL, mode); + else + icode = code_for_v_vx (UNSPEC_SSRL, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vssra functions. */ +rtx +vssra::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_SSRA, mode); + else + icode = code_for_v_vx (UNSPEC_SSRA, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vsmul functions. */ +rtx +vsmul::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_v_vv (UNSPEC_SMUL, mode); + else + icode = code_for_v_vx (UNSPEC_VSMUL, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnclip functions. */ +rtx +vnclip::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_wv) + icode = code_for_vn_wv (UNSPEC_SIGNED_CLIP, mode); + else + icode = code_for_vn_wx (UNSPEC_SIGNED_CLIP, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vnclipu functions. */ +rtx +vnclipu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_wv) + icode = code_for_vn_wv (UNSPEC_UNSIGNED_CLIP, mode); + else + icode = code_for_vn_wx (UNSPEC_UNSIGNED_CLIP, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for funop functions. */ +unsigned int +funop::call_properties () const +{ + return CP_RAISE_FP_EXCEPTIONS; +} + +/* A function implementation for fbinop functions. */ +unsigned int +fbinop::call_properties () const +{ + return CP_RAISE_FP_EXCEPTIONS; +} + +/* A function implementation for fwbinop functions. */ +unsigned int +fwbinop::call_properties () const +{ + return CP_RAISE_FP_EXCEPTIONS; +} + +/* A function implementation for fternop functions. 
*/ +unsigned int +fternop::call_properties () const +{ + return CP_RAISE_FP_EXCEPTIONS; +} + +/* A function implementation for vfadd functions. */ +rtx +vfadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (PLUS, mode); + else + icode = code_for_vf_vf (PLUS, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfsub functions. */ +rtx +vfsub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (MINUS, mode); + else + icode = code_for_vf_vf (MINUS, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmul functions. */ +rtx +vfmul::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (MULT, mode); + else + icode = code_for_vf_vf (MULT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfdiv functions. */ +rtx +vfdiv::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (DIV, mode); + else + icode = code_for_vf_vf (DIV, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfrsub and vfrdiv functions. */ +rtx +vfrsub::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfr_vf (MINUS, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfrdiv functions. */ +rtx +vfrdiv::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfr_vf (DIV, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfneg functions. */ +rtx +vfneg::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfneg_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwadd functions. */ +rtx +vfwadd::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfw_vv (PLUS, mode); + else if (instance.get_operation () == OP_vf) + icode = code_for_vfw_vf (PLUS, mode); + else if (instance.get_operation () == OP_wv) + icode = code_for_vfw_wv (PLUS, mode); + else + icode = code_for_vfw_wf (PLUS, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwsub functions. 
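   Widening add/sub come in four forms, which is what the OP_vv/OP_vf/OP_wv/OP_wf
   branches in these expanders select between: .vv/.vf take two single-width
   inputs, while .wv/.wf take a first operand that is already double-width.
   A hedged usage sketch (intrinsic spelling assumed for illustration):

       vfloat64m2_t d = vfwsub_wv_f64m2 (wide, narrow, vl);
       (d[i] = wide[i] - (double) narrow[i])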
*/ +rtx +vfwsub::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfw_vv (MINUS, mode); + else if (instance.get_operation () == OP_vf) + icode = code_for_vfw_vf (MINUS, mode); + else if (instance.get_operation () == OP_wv) + icode = code_for_vfw_wv (MINUS, mode); + else + icode = code_for_vfw_wf (MINUS, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwmul functions. */ +rtx +vfwmul::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfwmul_vv (mode); + else + icode = code_for_vfwmul_vf (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmacc functions. */ +rtx +vfmacc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_MACC, mode); + else + icode = code_for_vf_vf (UNSPEC_MACC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmsac functions. */ +rtx +vfmsac::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_MSAC, mode); + else + icode = code_for_vf_vf (UNSPEC_MSAC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfnmacc functions. */ +rtx +vfnmacc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_NMACC, mode); + else + icode = code_for_vf_vf (UNSPEC_NMACC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfnmsac functions. */ +rtx +vfnmsac::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_NMSAC, mode); + else + icode = code_for_vf_vf (UNSPEC_NMSAC, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmadd functions. */ +rtx +vfmadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_MADD, mode); + else + icode = code_for_vf_vf (UNSPEC_MADD, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfnmadd functions. 
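   The fused multiply-add variants differ only in operand roles and signs:
   vf(n)macc/vf(n)msac multiply the two source operands and accumulate into the
   destination, while vf(n)madd/vf(n)msub multiply the destination by one source
   and add or subtract the other.  A rough usage sketch (names assumed):

       vd = vfmacc_vv_f32m1 (vd, vs1, vs2, vl);   (vd[i] = vs1[i] * vs2[i] + vd[i])
       vd = vfmadd_vv_f32m1 (vd, vs1, vs2, vl);   (vd[i] = vs1[i] * vd[i] + vs2[i])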
*/ +rtx +vfnmadd::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_NMADD, mode); + else + icode = code_for_vf_vf (UNSPEC_NMADD, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmsub functions. */ +rtx +vfmsub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_MSUB, mode); + else + icode = code_for_vf_vf (UNSPEC_MSUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfnmsub functions. */ +rtx +vfnmsub::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (UNSPEC_NMSUB, mode); + else + icode = code_for_vf_vf (UNSPEC_NMSUB, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwmacc functions. */ +rtx +vfwmacc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfwmacc_vv (mode); + else + icode = code_for_vfwmacc_vf (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwnmacc functions. */ +rtx +vfwnmacc::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfwnmacc_vv (mode); + else + icode = code_for_vfwnmacc_vf (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwmsac functions. */ +rtx +vfwmsac::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfwmsac_vv (mode); + else + icode = code_for_vfwmsac_vf (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwnmsac functions. */ +rtx +vfwnmsac::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[2]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfwnmsac_vv (mode); + else + icode = code_for_vfwnmsac_vf (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfsqrt functions. */ +rtx +vfsqrt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfsqrt_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfrsqrt7 functions. 
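   vfrsqrt7 and vfrec7 are the estimate instructions: they return an
   approximation of 1/sqrt(x) and 1/x respectively, accurate to about 7 bits,
   so they map to single-operand patterns rather than to a division.  A hedged
   usage sketch (intrinsic spelling assumed for illustration):

       vfloat32m1_t e = vfrsqrt7_v_f32m1 (x, vl);   (e[i] ~= 1.0f / sqrtf (x[i]))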
*/ +rtx +vfrsqrt7::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vf_v (UNSPEC_RSQRT7, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfrec7 functions. */ +rtx +vfrec7::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vf_v (UNSPEC_REC7, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmax functions. */ +rtx +vfmax::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (SMAX, mode); + else + icode = code_for_vf_vf (SMAX, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmin functions. */ +rtx +vfmin::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vf_vv (SMIN, mode); + else + icode = code_for_vf_vf (SMIN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfsgnj functions. */ +rtx +vfsgnj::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfsgnj_vv (UNSPEC_COPYSIGN, mode); + else + icode = code_for_vfsgnj_vf (UNSPEC_COPYSIGN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfsgnjn functions. */ +rtx +vfsgnjn::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfsgnj_vv (UNSPEC_NCOPYSIGN, mode); + else + icode = code_for_vfsgnj_vf (UNSPEC_NCOPYSIGN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfsgnjx functions. */ +rtx +vfsgnjx::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vfsgnj_vv (UNSPEC_XORSIGN, mode); + else + icode = code_for_vfsgnj_vf (UNSPEC_XORSIGN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfabs functions. */ +rtx +vfabs::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfabs_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcmp functions. */ +char * +vfcmp::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name (instance.get_base_name ()); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +/* A function implementation for vmfeq functions. 
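   The floating-point compares produce a mask vector rather than a value of the
   source type, which is why these expanders take the machine mode from
   arg_list[1] (the vector operand) instead of from the result.  A rough usage
   sketch (intrinsic spelling assumed for illustration):

       vbool32_t m = vmfeq_vv_f32m1_b32 (a, b, vl);   (m[i] = (a[i] == b[i]))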
*/ +rtx +vmfeq::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmf_vv (EQ, mode); + else + icode = code_for_vmf_vf (EQ, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmfne functions. */ +rtx +vmfne::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmf_vv (NE, mode); + else + icode = code_for_vmf_vf (NE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmflt functions. */ +rtx +vmflt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmf_vv (LT, mode); + else + icode = code_for_vmf_vf (LT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmfgt functions. */ +rtx +vmfgt::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmf_vv (GT, mode); + else + icode = code_for_vmf_vf (GT, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmfle functions. */ +rtx +vmfle::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmf_vv (LE, mode); + else + icode = code_for_vmf_vf (LE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmfge functions. */ +rtx +vmfge::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode; + if (instance.get_operation () == OP_vv) + icode = code_for_vmf_vv (GE, mode); + else + icode = code_for_vmf_vf (GE, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfclass functions. */ +rtx +vfclass::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfclass_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmerge functions. */ +size_t +vfmerge::get_position_of_dest_arg (enum predication_index) const +{ + return 1; +} + +rtx +vfmerge::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfmerge_vfm (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmv functions. 
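   vfmv.v.f splats a single floating-point scalar across all elements of the
   result; for the plain vector-copy form the expander below reuses the
   integer vmv.v.v pattern.  A hedged usage sketch (intrinsic spelling
   assumed for illustration):

       vfloat32m1_t v = vfmv_v_f_f32m1 (x, vl);   (v[i] = x)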
*/ +bool +vfmv::can_be_overloaded_p (const function_instance &instance) const +{ + if (instance.get_pred () == PRED_tu) + return true; + + return false; +} + +rtx +vfmv::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode; + if (instance.get_operation () == OP_v_f) + icode = code_for_vfmv_v_f (mode); + else + icode = code_for_vmv_v_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcvt_x_f_v functions. */ +char * +vfcvt_f2i::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfcvt_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfcvt_f2i::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfcvt_x_f_v (mode, UNSPEC_FLOAT_TO_SIGNED_INT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcvt_xu_f_v functions. */ +char * +vfcvt_f2u::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfcvt_xu"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfcvt_f2u::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfcvt_x_f_v (mode, UNSPEC_FLOAT_TO_UNSIGNED_INT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcvt_rtz_x_f_v functions. */ +char * +vfcvt_rtz_f2i::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfcvt_rtz_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfcvt_rtz_f2i::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfcvt_rtz_x_f_v (mode, FIX); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcvt_rtz_xu_f_v functions. */ +char * +vfcvt_rtz_f2u::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfcvt_rtz_xu"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfcvt_rtz_f2u::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfcvt_rtz_x_f_v (mode, UNSIGNED_FIX); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcvt_f_x_v functions. */ +char * +vfcvt_i2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfcvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfcvt_i2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfcvt_f_x_v (mode, FLOAT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfcvt_f_xu_v functions. 
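   The signed and unsigned integer-to-float conversions share one machine
   pattern, distinguished only by the FLOAT/UNSIGNED_FLOAT code passed to
   code_for_vfcvt_f_x_v, and both assemble the same "vfcvt_f" name prefix,
   leaving the source element type to tell them apart.  A rough usage sketch
   (intrinsic spelling assumed for illustration):

       vfloat32m1_t f = vfcvt_f_xu_v_f32m1 (u, vl);   (f[i] = (float) u[i], u unsigned)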
*/ +char * +vfcvt_u2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfcvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfcvt_u2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfcvt_f_x_v (mode, UNSIGNED_FLOAT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_x_f_v functions. */ +char * +vfwcvt_f2i::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_f2i::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_x_f_v (mode, UNSPEC_FLOAT_TO_SIGNED_INT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_xu_f_v functions. */ +char * +vfwcvt_f2u::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_xu"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_f2u::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_x_f_v (mode, UNSPEC_FLOAT_TO_UNSIGNED_INT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_rtz_x_f_v functions. */ +char * +vfwcvt_rtz_f2i::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_rtz_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_rtz_f2i::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_rtz_x_f_v (mode, FIX); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_rtz_xu_f_v functions. */ +char * +vfwcvt_rtz_f2u::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_rtz_xu"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_rtz_f2u::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_rtz_x_f_v (mode, UNSIGNED_FIX); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_f_x_v functions. */ +char * +vfwcvt_i2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_i2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_f_x_v (mode, FLOAT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_f_xu_v functions. 
*/ +char * +vfwcvt_u2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_u2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_f_x_v (mode, UNSIGNED_FLOAT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwcvt_f_f_v functions. */ +char * +vfwcvt_f2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfwcvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfwcvt_f2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwcvt_f_f_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_x_f_w functions. */ +char * +vfncvt_f2i::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_f2i::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfncvt_x_f_w (mode, UNSPEC_FLOAT_TO_SIGNED_INT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_xu_f_w functions. */ +char * +vfncvt_f2u::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_xu"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_f2u::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfncvt_x_f_w (mode, UNSPEC_FLOAT_TO_UNSIGNED_INT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_rtz_x_f_w functions. */ +char * +vfncvt_rtz_f2i::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_rtz_x"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_rtz_f2i::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfncvt_rtz_x_f_w (mode, FIX); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_rtz_xu_f_w functions. */ +char * +vfncvt_rtz_f2u::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_rtz_xu"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_rtz_f2u::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfncvt_rtz_x_f_w (mode, UNSIGNED_FIX); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_f_x_w functions. 
*/ +char * +vfncvt_i2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_i2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfncvt_f_x_w (mode, FLOAT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_f_xu_w functions. */ +char * +vfncvt_u2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_u2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfncvt_f_x_w (mode, UNSIGNED_FLOAT); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_f_f_w functions. */ +char * +vfncvt_f2f::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_f2f::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfncvt_f_f_w (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfncvt_rod_f_f_w functions. */ +char * +vfncvt_f2rodf::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + append_name ("vfncvt_rod_f"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vfncvt_f2rodf::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vfncvt_rod_f_f_w (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredsum functions. */ +rtx +vredsum::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_SUM, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredmax functions. */ +rtx +vredmax::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_MAX, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredmaxu functions. */ +rtx +vredmaxu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_MAXU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredmin functions. 
*/ +rtx +vredmin::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_MIN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredminu functions. */ +rtx +vredminu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_MINU, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredand functions. */ +rtx +vredand::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_AND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredor functions. */ +rtx +vredor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_OR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vredxor functions. */ +rtx +vredxor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vred_vs (UNSPEC_REDUC_XOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwredsum functions. */ +rtx +vwredsum::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vwredsum_vs (SIGN_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vwredsumu functions. */ +rtx +vwredsumu::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vwredsum_vs (ZERO_EXTEND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for freduceop functions. */ +unsigned int +freduceop::call_properties () const +{ + return CP_RAISE_FP_EXCEPTIONS; +} + +/* A function implementation for vfredosum functions. */ +rtx +vfredosum::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfred_vs (UNSPEC_REDUC_ORDERED_SUM, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfredusum functions. */ +rtx +vfredusum::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfred_vs (UNSPEC_REDUC_UNORDERED_SUM, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfredmax functions. 
*/ +rtx +vfredmax::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfred_vs (UNSPEC_REDUC_MAX, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfredmin functions. */ +rtx +vfredmin::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfred_vs (UNSPEC_REDUC_MIN, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwredosum functions. */ +rtx +vfwredosum::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwredosum_vs (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfwredusum functions. */ +rtx +vfwredusum::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfwredusum_vs (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmand functions. */ +rtx +vmand::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vm_mm (AND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmor functions. */ +rtx +vmor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vm_mm (IOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmxor functions. */ +rtx +vmxor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vm_mm (XOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmnand functions. */ +rtx +vmnand::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmn_mm (AND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmnor functions. */ +rtx +vmnor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmn_mm (IOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmxnor functions. */ +rtx +vmxnor::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmn_mm (XOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmlogicn functions. */ +rtx +vmandn::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmnot_mm (AND, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmorn functions. 
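   The mask-logical builtins operate on whole mask registers: vmand/vmor/vmxor
   are the plain forms, vmnand/vmnor/vmxnor negate the result, and vmandn/vmorn
   combine one operand with the complement of the other; vmmv and vmnot are the
   copy and complement pseudo-ops built on top of them.  A hedged usage sketch
   (intrinsic spelling assumed for illustration):

       vbool32_t m = vmorn_mm_b32 (a, b, vl);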
*/ +rtx +vmorn::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmnot_mm (IOR, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmmv functions. */ +rtx +vmmv::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmmv_m (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmnot functions. */ +rtx +vmnot::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmnot_m (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmclr functions. */ +void +vmclr::get_argument_types (const function_instance &, + vec &) const +{ +} + +bool +vmclr::can_be_overloaded_p (const function_instance &) const +{ + return false; +} + +rtx +vmclr::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmclr_m (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmset functions. */ +void +vmset::get_argument_types (const function_instance &, + vec &) const +{ +} + +bool +vmset::can_be_overloaded_p (const function_instance &) const +{ + return false; +} + +rtx +vmset::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vmset_m (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vcpop functions. */ +tree +vcpop::get_return_type (const function_instance &) const +{ + return long_unsigned_type_node; +} + +rtx +vcpop::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vcpop_m (mode, Pmode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfirst functions. */ +tree +vfirst::get_return_type (const function_instance &) const +{ + return long_integer_type_node; +} + +rtx +vfirst::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[0]; + enum insn_code icode = code_for_vfirst_m (mode, Pmode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsbf functions. */ +rtx +vmsbf::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vm_m (UNSPEC_SBF, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsif functions. */ +rtx +vmsif::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vm_m (UNSPEC_SIF, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmsof functions. 
*/ +rtx +vmsof::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vm_m (UNSPEC_SOF, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for viota functions. */ +bool +viota::can_be_overloaded_p (const function_instance &instance) const +{ + if (instance.get_pred () == PRED_void || instance.get_pred () == PRED_ta || + instance.get_pred () == PRED_tama) + return false; + + return true; +} + +rtx +viota::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_viota_m (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vid functions. */ +void +vid::get_argument_types (const function_instance &, + vec &) const +{ +} + +bool +vid::can_be_overloaded_p (const function_instance &instance) const +{ + if (instance.get_pred () == PRED_void || instance.get_pred () == PRED_ta || + instance.get_pred () == PRED_tama) + return false; + + return true; +} + +rtx +vid::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vid_v (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmv_x_s functions. */ +char * +vmv_x_s::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name ("vmv_x"); + return finish_name (); +} + +rtx +vmv_x_s::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vmv_x_s (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vmv_s_x functions. */ +char * +vmv_s_x::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0); + if (instance.get_pred () == PRED_ta) + return nullptr; + append_name ("vmv_s"); + append_name (get_pred_str (instance.get_pred (), true)); + return finish_name (); +} + +rtx +vmv_s_x::expand (const function_instance &instance, tree exp, rtx target) const +{ + + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_v_s_x (UNSPEC_VMVS, mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmv_f_s functions. */ +char * +vfmv_f_s::assemble_name (function_instance &instance) +{ + intrinsic_rename (instance, 0, 1); + append_name ("vfmv_f"); + return finish_name (); +} + +rtx +vfmv_f_s::expand (const function_instance &instance, tree exp, rtx target) const +{ + machine_mode mode = instance.get_arg_pattern ().arg_list[1]; + enum insn_code icode = code_for_vfmv_f_s (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + +/* A function implementation for vfmv_s_f functions. 
 */
+char *
+vfmv_s_f::assemble_name (function_instance &instance)
+{
+  intrinsic_rename (instance, 0);
+  if (instance.get_pred () == PRED_ta)
+    return nullptr;
+  append_name ("vfmv_s");
+  append_name (get_pred_str (instance.get_pred (), true));
+  return finish_name ();
+}
+
+rtx
+vfmv_s_f::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vfmv_s_f (mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vslideup functions. */
+rtx
+vslideup::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vslide_vx (UNSPEC_SLIDEUP, mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vslidedown functions. */
+rtx
+vslidedown::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vslide_vx (UNSPEC_SLIDEDOWN, mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vslide1up functions. */
+rtx
+vslide1up::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vslide1_vx (UNSPEC_SLIDE1UP, mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vslide1down functions. */
+rtx
+vslide1down::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vslide1_vx (UNSPEC_SLIDE1DOWN, mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vfslide1up functions. */
+rtx
+vfslide1up::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vfslide1_vf (UNSPEC_SLIDE1UP, mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vfslide1down functions. */
+rtx
+vfslide1down::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vfslide1_vf (UNSPEC_SLIDE1DOWN, mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vrgather functions. */
+rtx
+vrgather::expand (const function_instance &instance, tree exp, rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode;
+  if (instance.get_operation () == OP_vv)
+    icode = code_for_vrgather_vv (mode);
+  else
+    icode = code_for_vrgather_vx (mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vrgatherei16 functions. */
+rtx
+vrgatherei16::expand (const function_instance &instance, tree exp,
+                      rtx target) const
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum insn_code icode = code_for_vrgatherei16_vv (mode);
+  return expand_builtin_insn (icode, exp, target, instance);
+}
+
+/* A function implementation for vcompress functions.
*/ +size_t +vcompress::get_position_of_dest_arg (enum predication_index) const +{ + return 1; +} + +rtx +vcompress::expand (const function_instance &instance, tree exp, + rtx target) const +{ + machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum insn_code icode = code_for_vcompress_vm (mode); + return expand_builtin_insn (icode, exp, target, instance); +} + } // end namespace riscv_vector using namespace riscv_vector; diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def b/gcc/config/riscv/riscv-vector-builtins-functions.def index bf9d42e6d67..efe89b3e270 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.def +++ b/gcc/config/riscv/riscv-vector-builtins-functions.def @@ -237,6 +237,154 @@ DEF_RVV_FUNCTION(vmerge, vmerge, (3, VITER(VF, signed), VATTR(0, VF, signed), VA DEF_RVV_FUNCTION(vmv, vmv, (2, VITER(VI, signed), VATTR(0, VSUB, signed)), PAT_tail, pred_tail, OP_v_v | OP_v_x) DEF_RVV_FUNCTION(vmv, vmv, (2, VITER(VI, unsigned), VATTR(0, VSUB, unsigned)), PAT_tail, pred_tail, OP_v_v | OP_v_x) DEF_RVV_FUNCTION(vmv, vmv, (2, VITER(VF, signed), VATTR(0, VSUB, signed)), PAT_tail, pred_tail, OP_v_v) +/* 12. Vector Fixed-Point Arithmetic Instructions. */ +DEF_RVV_FUNCTION(vsadd, vsadd, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsaddu, vsaddu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vaadd, vaadd, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vaaddu, vaaddu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vssub, vssub, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vssubu, vssubu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vasub, vasub, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vasubu, vasubu, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vsmul, vsmul, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, signed)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vssrl, vssrl, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vssra, vssra, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VI, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vnclip, vnclip, (3, VITER(VWI, signed), VATTR(0, VW, signed), VATTR(0, VWI, unsigned)), pat_mask_tail, pred_all, OP_wv | OP_wx) +DEF_RVV_FUNCTION(vnclipu, vnclipu, (3, VITER(VWI, unsigned), VATTR(0, VW, unsigned), VATTR(0, VWI, unsigned)), pat_mask_tail, pred_all, OP_wv | OP_wx) +/* 13. Vector Floating-Point Arithmetic Instructions. 
*/ +DEF_RVV_FUNCTION(vfadd, vfadd, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfsub, vfsub, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfmul, vfmul, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfdiv, vfdiv, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfrsub, vfrsub, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vf) +DEF_RVV_FUNCTION(vfrdiv, vfrdiv, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vf) +DEF_RVV_FUNCTION(vfneg, vfneg, (2, VITER(VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vfwadd, vfwadd, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfwsub, vfwsub, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfwadd, vfwadd, (3, VATTR(2, VW, signed), VATTR(2, VW, signed), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_wv | OP_wf) +DEF_RVV_FUNCTION(vfwsub, vfwsub, (3, VATTR(2, VW, signed), VATTR(2, VW, signed), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_wv | OP_wf) +DEF_RVV_FUNCTION(vfwmul, vfwmul, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf | OP_wv | OP_wf) +DEF_RVV_FUNCTION(vfmacc, vfmacc, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfmsac, vfmsac, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfnmacc, vfnmacc, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfnmsac, vfnmsac, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfmadd, vfmadd, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfnmadd, vfnmadd, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfmsub, vfmsub, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfnmsub, vfnmsub, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfwmacc, vfwmacc, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfwmsac, vfwmsac, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfwnmacc, vfwnmacc, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfwnmsac, vfwnmsac, (3, VATTR(1, VW, signed), VITER(VWF, signed), VATTR(1, VWF, signed)), pat_mask_tail_dest, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfsqrt, vfsqrt, (2, VITER(VF, signed), VATTR(0, VF, signed)), 
pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vfrsqrt7, vfrsqrt7, (2, VITER(VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vfrec7, vfrec7, (2, VITER(VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vfmax, vfmax, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfmin, vfmin, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfsgnj, vfsgnj, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfsgnjn, vfsgnjn, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfsgnjx, vfsgnjx, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfabs, vfabs, (2, VITER(VF, signed), VATTR(0, VF, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vmfeq, vmfeq, (3, VATTR(1, VM, signed), VITER(VF, signed), VATTR(1, VF, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vmfne, vmfne, (3, VATTR(1, VM, signed), VITER(VF, signed), VATTR(1, VF, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vmflt, vmflt, (3, VATTR(1, VM, signed), VITER(VF, signed), VATTR(1, VF, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vmfle, vmfle, (3, VATTR(1, VM, signed), VITER(VF, signed), VATTR(1, VF, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vmfgt, vmfgt, (3, VATTR(1, VM, signed), VITER(VF, signed), VATTR(1, VF, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vmfge, vmfge, (3, VATTR(1, VM, signed), VITER(VF, signed), VATTR(1, VF, signed)), pat_mask_ignore_tp, pred_mask, OP_vv | OP_vf) +DEF_RVV_FUNCTION(vfclass, vfclass, (2, VATTR(1, VMAP, unsigned), VITER(VF, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vfmerge, vfmerge, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VF, signed)), PAT_tail | PAT_merge, pred_tail, OP_vfm) +DEF_RVV_FUNCTION(vfmv, vfmv, (2, VITER(VF, signed), VATTR(0, VSUB, signed)), PAT_tail, pred_tail, OP_v_f) +DEF_RVV_FUNCTION(vfcvt_x_f_v, vfcvt_f2i, (2, VATTR(1, VMAP, signed), VITER(VF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfcvt_xu_f_v, vfcvt_f2u, (2, VATTR(1, VMAP, unsigned), VITER(VF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfcvt_rtz_x_f_v, vfcvt_rtz_f2i, (2, VATTR(1, VMAP, signed), VITER(VF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfcvt_rtz_xu_f_v, vfcvt_rtz_f2u, (2, VATTR(1, VMAP, unsigned), VITER(VF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfcvt_f_x_v, vfcvt_i2f, (2, VITER(VF, signed), VATTR(0, VMAP, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfcvt_f_xu_v, vfcvt_u2f, (2, VITER(VF, signed), VATTR(0, VMAP, unsigned)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_x_f_v, vfwcvt_f2i, (2, VATTR(1, VWMAP, signed), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_xu_f_v, vfwcvt_f2u, (2, VATTR(1, VWMAP, unsigned), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_rtz_x_f_v, vfwcvt_rtz_f2i, (2, VATTR(1, VWMAP, signed), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_rtz_xu_f_v, vfwcvt_rtz_f2u, (2, VATTR(1, 
VWMAP, unsigned), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_f_x_v, vfwcvt_i2f, (2, VATTR(1, VWFMAP, signed), VITER(VWINOQI, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_f_xu_v, vfwcvt_u2f, (2, VATTR(1, VWFMAP, signed), VITER(VWINOQI, unsigned)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfwcvt_f_f_v, vfwcvt_f2f, (2, VATTR(1, VW, signed), VITER(VWF, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_x_f_w, vfncvt_f2i, (2, VITER(VWINOQI, signed), VATTR(0, VWFMAP, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_xu_f_w, vfncvt_f2u, (2, VITER(VWINOQI, unsigned), VATTR(0, VWFMAP, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_rtz_x_f_w, vfncvt_rtz_f2i, (2, VITER(VWINOQI, signed), VATTR(0, VWFMAP, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_rtz_xu_f_w, vfncvt_rtz_f2u, (2, VITER(VWINOQI, unsigned), VATTR(0, VWFMAP, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_f_x_w, vfncvt_i2f, (2, VITER(VWF, signed), VATTR(0, VWMAP, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_f_xu_w, vfncvt_u2f, (2, VITER(VWF, signed), VATTR(0, VWMAP, unsigned)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_f_f_w, vfncvt_f2f, (2, VITER(VWF, signed), VATTR(0, VW, signed)), pat_mask_tail, pred_all, OP_none) +DEF_RVV_FUNCTION(vfncvt_rod_f_f_w, vfncvt_f2rodf, (2, VITER(VWF, signed), VATTR(0, VW, signed)), pat_mask_tail, pred_all, OP_none) +/* 14. Vector Reduction Operations. */ +DEF_RVV_FUNCTION(vredsum, vredsum, (3, VATTR(1, VLMUL1, signed), VITER(VI, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredsum, vredsum, (3, VATTR(1, VLMUL1, unsigned), VITER(VI, unsigned), VATTR(1, VLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredmax, vredmax, (3, VATTR(1, VLMUL1, signed), VITER(VI, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredmaxu, vredmaxu, (3, VATTR(1, VLMUL1, unsigned), VITER(VI, unsigned), VATTR(1, VLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredmin, vredmin, (3, VATTR(1, VLMUL1, signed), VITER(VI, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredminu, vredminu, (3, VATTR(1, VLMUL1, unsigned), VITER(VI, unsigned), VATTR(1, VLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredand, vredand, (3, VATTR(1, VLMUL1, signed), VITER(VI, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredand, vredand, (3, VATTR(1, VLMUL1, unsigned), VITER(VI, unsigned), VATTR(1, VLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredor, vredor, (3, VATTR(1, VLMUL1, signed), VITER(VI, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredor, vredor, (3, VATTR(1, VLMUL1, unsigned), VITER(VI, unsigned), VATTR(1, VLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredxor, vredxor, (3, VATTR(1, VLMUL1, signed), VITER(VI, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vredxor, vredxor, (3, VATTR(1, VLMUL1, unsigned), VITER(VI, unsigned), VATTR(1, VLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vwredsum, vwredsum, (3, VATTR(1, 
VWLMUL1, signed), VITER(VWREDI, signed), VATTR(1, VWLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vwredsumu, vwredsumu, (3, VATTR(1, VWLMUL1, unsigned), VITER(VWREDI, unsigned), VATTR(1, VWLMUL1, unsigned)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vfredosum, vfredosum, (3, VATTR(1, VLMUL1, signed), VITER(VF, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vfredusum, vfredusum, (3, VATTR(1, VLMUL1, signed), VITER(VF, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vfredmax, vfredmax, (3, VATTR(1, VLMUL1, signed), VITER(VF, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vfredmin, vfredmin, (3, VATTR(1, VLMUL1, signed), VITER(VF, signed), VATTR(1, VLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vfwredosum, vfwredosum, (3, VATTR(1, VWLMUL1, signed), VITER(VWREDF, signed), VATTR(1, VWLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +DEF_RVV_FUNCTION(vfwredusum, vfwredusum, (3, VATTR(1, VWLMUL1, signed), VITER(VWREDF, signed), VATTR(1, VWLMUL1, signed)), pat_void_dest_ignore_mp, pred_reduce, OP_vs) +/* 15. Vector Mask Instructions. */ +DEF_RVV_FUNCTION(vmand, vmand, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmor, vmor, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmxor, vmxor, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmnand, vmnand, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmnor, vmnor, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmxnor, vmxnor, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmandn, vmandn, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmorn, vmorn, (3, VITER(VB, signed), VATTR(0, VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_mm) +DEF_RVV_FUNCTION(vmmv, vmmv, (2, VITER(VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_m) +DEF_RVV_FUNCTION(vmnot, vmnot, (2, VITER(VB, signed), VATTR(0, VB, signed)), PAT_none, PRED_void, OP_m) +DEF_RVV_FUNCTION(vmclr, vmclr, (1, VITER(VB, signed)), PAT_none, PRED_void, OP_m) +DEF_RVV_FUNCTION(vmset, vmset, (1, VITER(VB, signed)), PAT_none, PRED_void, OP_m) +DEF_RVV_FUNCTION(vcpop, vcpop, (2, VITER(VB, signed), VATTR(0, VB, signed)), pat_mask_ignore_policy, pred_mask2, OP_m) +DEF_RVV_FUNCTION(vfirst, vfirst, (2, VITER(VB, signed), VATTR(0, VB, signed)), pat_mask_ignore_policy, pred_mask2, OP_m) +DEF_RVV_FUNCTION(vmsbf, vmsbf, (2, VITER(VB, signed), VATTR(0, VB, signed)), pat_mask_ignore_tp, pred_mask, OP_m) +DEF_RVV_FUNCTION(vmsif, vmsif, (2, VITER(VB, signed), VATTR(0, VB, signed)), pat_mask_ignore_tp, pred_mask, OP_m) +DEF_RVV_FUNCTION(vmsof, vmsof, (2, VITER(VB, signed), VATTR(0, VB, signed)), pat_mask_ignore_tp, pred_mask, OP_m) +DEF_RVV_FUNCTION(viota, viota, (2, VITER(VI, unsigned), VATTR(0, VM, signed)), pat_mask_tail, pred_all, OP_m) +DEF_RVV_FUNCTION(vid, vid, (1, VITER(VI, signed)), pat_mask_tail, pred_all, OP_v) +DEF_RVV_FUNCTION(vid, vid, (1, VITER(VI, unsigned)), pat_mask_tail, 
pred_all, OP_v) +/* 16. Vector Permutation Instructions. */ +DEF_RVV_FUNCTION(vmv_x_s, vmv_x_s, (2, VATTR(1, VSUB, signed), VITER(VI, signed)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vmv_x_s, vmv_x_s, (2, VATTR(1, VSUB, unsigned), VITER(VI, unsigned)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vmv_s_x, vmv_s_x, (2, VITER(VI, signed), VATTR(0, VSUB, signed)), pat_tail_void_dest, pred_tail, OP_none) +DEF_RVV_FUNCTION(vmv_s_x, vmv_s_x, (2, VITER(VI, unsigned), VATTR(0, VSUB, unsigned)), pat_tail_void_dest, pred_tail, OP_none) +DEF_RVV_FUNCTION(vfmv_f_s, vfmv_f_s, (2, VATTR(1, VSUB, signed), VITER(VF, signed)), PAT_none, PRED_none, OP_none) +DEF_RVV_FUNCTION(vfmv_s_f, vfmv_s_f, (2, VITER(VF, signed), VATTR(0, VSUB, signed)), pat_tail_void_dest, pred_tail, OP_none) +DEF_RVV_FUNCTION(vslideup, vslideup, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VSUB, signed)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslideup, vslideup, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VSUB, unsigned)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslideup, vslideup, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VSUB, signed)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslidedown, vslidedown, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VSUB, signed)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslidedown, vslidedown, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VSUB, unsigned)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslidedown, vslidedown, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VSUB, signed)), pat_mask_tail_dest, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslide1up, vslide1up, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VSUB, signed)), pat_mask_tail, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslide1up, vslide1up, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VSUB, unsigned)), pat_mask_tail, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslide1down, vslide1down, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VSUB, signed)), pat_mask_tail, pred_all, OP_vx) +DEF_RVV_FUNCTION(vslide1down, vslide1down, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VSUB, unsigned)), pat_mask_tail, pred_all, OP_vx) +DEF_RVV_FUNCTION(vfslide1up, vfslide1up, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VSUB, signed)), pat_mask_tail, pred_all, OP_vf) +DEF_RVV_FUNCTION(vfslide1down, vfslide1down, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VSUB, signed)), pat_mask_tail, pred_all, OP_vf) +DEF_RVV_FUNCTION(vrgather, vrgather, (3, VITER(VI, signed), VATTR(0, VI, signed), VATTR(0, VMAP, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vrgather, vrgather, (3, VITER(VI, unsigned), VATTR(0, VI, unsigned), VATTR(0, VMAP, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vrgather, vrgather, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VMAP, unsigned)), pat_mask_tail, pred_all, OP_vv | OP_vx) +DEF_RVV_FUNCTION(vrgatherei16, vrgatherei16, (3, VITER(VI16, signed), VATTR(0, VI16, signed), VATTR(0, VMAPI16, unsigned)), pat_mask_tail, pred_all, OP_vv) +DEF_RVV_FUNCTION(vrgatherei16, vrgatherei16, (3, VITER(VI16, unsigned), VATTR(0, VI16, unsigned), VATTR(0, VMAPI16, unsigned)), pat_mask_tail, pred_all, OP_vv) +DEF_RVV_FUNCTION(vrgatherei16, vrgatherei16, (3, VITER(VF, signed), VATTR(0, VF, signed), VATTR(0, VMAPI16, unsigned)), pat_mask_tail, pred_all, OP_vv) +DEF_RVV_FUNCTION(vcompress, vcompress, (2, VITER(VI, signed), 
VATTR(0, VI, signed)), PAT_tail | PAT_void_dest | PAT_merge, pred_tail, OP_vm) +DEF_RVV_FUNCTION(vcompress, vcompress, (2, VITER(VI, unsigned), VATTR(0, VI, unsigned)), PAT_tail | PAT_void_dest | PAT_merge, pred_tail, OP_vm) +DEF_RVV_FUNCTION(vcompress, vcompress, (2, VITER(VF, signed), VATTR(0, VF, signed)), PAT_tail | PAT_void_dest | PAT_merge, pred_tail, OP_vm) #undef REQUIRED_EXTENSIONS #undef DEF_RVV_FUNCTION #undef VITER diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.h b/gcc/config/riscv/riscv-vector-builtins-functions.h index bde03e8d49d..85ed9d1ae26 100644 --- a/gcc/config/riscv/riscv-vector-builtins-functions.h +++ b/gcc/config/riscv/riscv-vector-builtins-functions.h @@ -1533,6 +1533,1373 @@ public: virtual rtx expand (const function_instance &, tree, rtx) const override; }; +/* A function_base for reduction functions. */ +class reduceop : public basic_alu +{ +public: + // use the same construction function as the basic_alu + using basic_alu::basic_alu; + + virtual char * assemble_name (function_instance &) override; + + virtual tree get_mask_type (tree, const function_instance &, const vec &) const override; + + virtual void get_argument_types (const function_instance &, vec &) const override; +}; + +/* A function_base for vsadd functions. */ +class vsadd : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsaddu functions. */ +class vsaddu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vaadd functions. */ +class vaadd : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vaaddu functions. */ +class vaaddu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vssub functions. */ +class vssub : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vssubu functions. */ +class vssubu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vasub functions. */ +class vasub : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vasubu functions. */ +class vasubu : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vssrl functions. */ +class vssrl : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vssra functions. 
*/ +class vssra : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vsmul functions. */ +class vsmul : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnclip functions. */ +class vnclip : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vnclipu functions. */ +class vnclipu : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for funop functions. */ +class funop : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual unsigned int call_properties () const override; +}; + +/* A function_base for fbinop functions. */ +class fbinop : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual unsigned int call_properties () const override; +}; + +/* A function_base for fbinop functions. */ +class fwbinop : public wbinop +{ +public: + // use the same construction function as the wbinop + using wbinop::wbinop; + + virtual unsigned int call_properties () const override; +}; + +/* A function_base for fternop functions. */ +class fternop : public ternop +{ +public: + // use the same construction function as the binop + using ternop::ternop; + + virtual unsigned int call_properties () const override; +}; + +/* A function_base for vfadd functions. */ +class vfadd : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfsub functions. */ +class vfsub : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmul functions. */ +class vfmul : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfdiv functions. */ +class vfdiv : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfrsub functions. */ +class vfrsub : public fbinop +{ +public: + // use the same construction function as the binop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfrdiv functions. */ +class vfrdiv : public fbinop +{ +public: + // use the same construction function as the binop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfneg functions. 
*/ +class vfneg : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwadd functions. */ +class vfwadd : public fwbinop +{ +public: + // use the same construction function as the fwbinop + using fwbinop::fwbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwsub functions. */ +class vfwsub : public fwbinop +{ +public: + // use the same construction function as the fwbinop + using fwbinop::fwbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwmul functions. */ +class vfwmul : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmacc functions. */ +class vfmacc : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmsac functions. */ +class vfmsac : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfnmacc functions. */ +class vfnmacc : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfnmsac functions. */ +class vfnmsac : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmadd functions. */ +class vfmadd : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfnmadd functions. */ +class vfnmadd : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmsub functions. */ +class vfmsub : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfnmsub functions. */ +class vfnmsub : public fternop +{ +public: + // use the same construction function as the fternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwmacc functions. */ +class vfwmacc : public fternop +{ +public: + // use the same construction function as the ternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwmsac functions. */ +class vfwmsac : public fternop +{ +public: + // use the same construction function as the ternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwnmacc functions. 
*/ +class vfwnmacc : public fternop +{ +public: + // use the same construction function as the ternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwnmsac functions. */ +class vfwnmsac : public fternop +{ +public: + // use the same construction function as the ternop + using fternop::fternop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfsqrt functions. */ +class vfsqrt : public funop +{ +public: + // use the same construction function as the unop + using funop::funop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfrsqrt7 functions. */ +class vfrsqrt7 : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfrec7 functions. */ +class vfrec7 : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmax functions. */ +class vfmax : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmin functions. */ +class vfmin : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfsgnj, vfsgnjn and vfsgnjx functions. */ +class vfsgnj : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfsgnjn functions. */ +class vfsgnjn : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfsgnjx functions. */ +class vfsgnjx : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfabs functions. */ +class vfabs : public funop +{ +public: + // use the same construction function as the unop + using funop::funop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcmp functions. */ +class vfcmp : public fbinop +{ +public: + // use the same construction function as the fbinop + using fbinop::fbinop; + + virtual char * assemble_name (function_instance &) override; +}; + +/* A function_base for vmfeq functions. */ +class vmfeq : public vfcmp +{ +public: + // use the same construction function as the vfcmp + using vfcmp::vfcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmfne functions. */ +class vmfne : public vfcmp +{ +public: + // use the same construction function as the vfcmp + using vfcmp::vfcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmflt functions. 
*/ +class vmflt : public vfcmp +{ +public: + // use the same construction function as the vfcmp + using vfcmp::vfcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmfle functions. */ +class vmfle : public vfcmp +{ +public: + // use the same construction function as the vfcmp + using vfcmp::vfcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmfgt functions. */ +class vmfgt : public vfcmp +{ +public: + // use the same construction function as the vfcmp + using vfcmp::vfcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmfge functions. */ +class vmfge : public vfcmp +{ +public: + // use the same construction function as the vfcmp + using vfcmp::vfcmp; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfclass functions. */ +class vfclass : public unop +{ +public: + // use the same construction function as the binop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmerge functions. */ +class vfmerge : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual size_t get_position_of_dest_arg (enum predication_index) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmv functions. */ +class vfmv : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual bool can_be_overloaded_p (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcvt_x_f_v functions. */ +class vfcvt_f2i : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcvt_xu_f_v functions. */ +class vfcvt_f2u : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcvt_rtz_x_f_v functions. */ +class vfcvt_rtz_f2i : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcvt_rtz_xu_f_v functions. */ +class vfcvt_rtz_f2u : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcvt_f_x_v functions. */ +class vfcvt_i2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfcvt_f_xu_v functions. 
*/ +class vfcvt_u2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_x_f_v functions. */ +class vfwcvt_f2i : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_xu_f_v functions. */ +class vfwcvt_f2u : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_rtz_x_f_v functions. */ +class vfwcvt_rtz_f2i : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_rtz_xu_f_v functions. */ +class vfwcvt_rtz_f2u : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_f_x_v functions. */ +class vfwcvt_i2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_f_xu_v functions. */ +class vfwcvt_u2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwcvt_f_f_v functions. */ +class vfwcvt_f2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_x_f_w functions. */ +class vfncvt_f2i : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_xu_f_w functions. */ +class vfncvt_f2u : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_rtz_x_f_w functions. */ +class vfncvt_rtz_f2i : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_rtz_xu_f_w functions. 
*/ +class vfncvt_rtz_f2u : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_f_x_w functions. */ +class vfncvt_i2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_f_xu_w functions. */ +class vfncvt_u2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_f_f_w functions. */ +class vfncvt_f2f : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfncvt_rod_f_f_w functions. */ +class vfncvt_f2rodf : public funop +{ +public: + // use the same construction function as the funop + using funop::funop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredsum functions. */ +class vredsum : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredmax functions. */ +class vredmax : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredmaxu functions. */ +class vredmaxu : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredmin functions. */ +class vredmin : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredminu functions. */ +class vredminu : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredand functions. */ +class vredand : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredor functions. */ +class vredor : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vredxor functions. 
*/ +class vredxor : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwredsum functions. */ +class vwredsum : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vwredsumu functions. */ +class vwredsumu : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for freduceop functions. */ +class freduceop : public reduceop +{ +public: + // use the same construction function as the reduceop + using reduceop::reduceop; + + virtual unsigned int call_properties () const override; +}; + +/* A function_base for vfredosum functions. */ +class vfredosum : public freduceop +{ +public: + // use the same construction function as the freduceop + using freduceop::freduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfredusum functions. */ +class vfredusum : public freduceop +{ +public: + // use the same construction function as the freduceop + using freduceop::freduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfredmax functions. */ +class vfredmax : public freduceop +{ +public: + // use the same construction function as the freduceop + using freduceop::freduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfredmin functions. */ +class vfredmin : public freduceop +{ +public: + // use the same construction function as the freduceop + using freduceop::freduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwredosum functions. */ +class vfwredosum : public freduceop +{ +public: + // use the same construction function as the freduceop + using freduceop::freduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfwredusum functions. */ +class vfwredusum : public freduceop +{ +public: + // use the same construction function as the freduceop + using freduceop::freduceop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmand functions. */ +class vmand : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmor functions. */ +class vmor : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmxor functions. */ +class vmxor : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmnand functions. 
*/ +class vmnand : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmnor functions. */ +class vmnor : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmxnor functions. */ +class vmxnor : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmandn functions. */ +class vmandn : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmorn functions. */ +class vmorn : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmmv functions. */ +class vmmv : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmnot functions. */ +class vmnot : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmclr functions. */ +class vmclr : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmset functions. */ +class vmset : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vcpop functions. */ +class vcpop : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual tree get_return_type (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfirst functions. */ +class vfirst : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual tree get_return_type (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsbf functions. */ +class vmsbf : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsif functions. 
*/ +class vmsif : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmsof functions. */ +class vmsof : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for viota functions. */ +class viota : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual bool can_be_overloaded_p (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vid functions. */ +class vid : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual void get_argument_types (const function_instance &, vec &) const override; + + virtual bool can_be_overloaded_p (const function_instance &) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmv_x_s functions. */ +class vmv_x_s : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vmv_s_x functions. */ +class vmv_s_x : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmv_f_s functions. */ +class vfmv_f_s : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfmv_s_f functions. */ +class vfmv_s_f : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual char * assemble_name (function_instance &) override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vslideup functions. */ +class vslideup : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vslidedown functions. */ +class vslidedown : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vslide1up functions. */ +class vslide1up : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vslide1down functions. */ +class vslide1down : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfslide1up functions. 
*/ +class vfslide1up : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vfslide1down functions. */ +class vfslide1down : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vrgather functions. */ +class vrgather : public vshift +{ +public: + // use the same construction function as the vshift + using vshift::vshift; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +/* A function_base for vrgather functions. */ +class vrgatherei16 : public binop +{ +public: + // use the same construction function as the binop + using binop::binop; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + +class vcompress : public unop +{ +public: + // use the same construction function as the unop + using unop::unop; + + virtual size_t get_position_of_dest_arg (enum predication_index) const override; + + virtual rtx expand (const function_instance &, tree, rtx) const override; +}; + } // namespace riscv_vector #endif // end GCC_RISCV_VECTOR_BUILTINS_FUNCTIONS_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc index 1d53c50a751..e892fb05a95 100644 --- a/gcc/config/riscv/riscv-vector.cc +++ b/gcc/config/riscv/riscv-vector.cc @@ -905,7 +905,7 @@ rvv_adjust_frame (rtx target, poly_int64 offset, bool epilogue) } /* Helper functions for handling sew=64 on RV32 system. */ -bool +static bool imm32_p (rtx a) { if (!CONST_SCALAR_INT_P (a)) @@ -928,7 +928,7 @@ enum GEN_CLASS }; /* Helper functions for handling sew=64 on RV32 system. */ -enum GEN_CLASS +static enum GEN_CLASS modify_operands (machine_mode Vmode, machine_mode VSImode, machine_mode VMSImode, machine_mode VSUBmode, rtx *operands, bool (*imm5_p) (rtx), int i, bool reverse, unsigned int unspec) @@ -970,7 +970,7 @@ modify_operands (machine_mode Vmode, machine_mode VSImode, } /* Helper functions for handling sew=64 on RV32 system. */ -bool +static bool emit_op5_vmv_v_x (machine_mode Vmode, machine_mode VSImode, machine_mode VMSImode, machine_mode VSUBmode, rtx *operands, int i) @@ -994,6 +994,51 @@ emit_op5_vmv_v_x (machine_mode Vmode, machine_mode VSImode, return false; } +/* Helper functions for handling sew=64 on RV32 system. 
*/ +static bool +emit_op5_vmv_s_x (machine_mode Vmode, machine_mode VSImode, + machine_mode VSUBmode, rtx *operands, int i) +{ + if (!TARGET_64BIT && VSUBmode == DImode) + { + if (!imm32_p (operands[i])) + { + rtx s = operands[i]; + if (CONST_SCALAR_INT_P (s)) + { + s = force_reg (DImode, s); + } + + rtx hi = gen_highpart (SImode, s); + rtx lo = gen_lowpart (SImode, s); + rtx vlx2 = gen_vlx2 (operands[3], Vmode, VSImode); + + rtx vret = operands[0]; + rtx vd = operands[1]; + if (vd == const0_rtx) + { + vd = gen_reg_rtx (Vmode); + } + rtx vd_si = gen_lowpart (VSImode, vd); + + emit_insn (gen_vslide_vx (UNSPEC_SLIDEDOWN, VSImode, vd_si, + const0_rtx, vd_si, vd_si, const2_rtx, vlx2, + operands[4])); + emit_insn (gen_vslide1_vx_internal (UNSPEC_SLIDE1UP, VSImode, vd_si, + const0_rtx, vd_si, vd_si, hi, + vlx2, operands[4])); + emit_insn (gen_vslide1_vx_internal (UNSPEC_SLIDE1UP, VSImode, vd_si, + const0_rtx, vd_si, vd_si, lo, vlx2, + operands[4])); + + emit_insn (gen_rtx_SET (vret, gen_lowpart (Vmode, vd_si))); + + return true; + } + } + return false; +} + /* Helper functions for handling sew=64 on RV32 system. */ void emit_op5 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, @@ -1008,6 +1053,13 @@ emit_op5 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, return; } } + else if (unspec == UNSPEC_VMVS) + { + if (emit_op5_vmv_s_x (Vmode, VSImode, VSUBmode, operands, i)) + { + return; + } + } enum GEN_CLASS gen_class = modify_operands ( Vmode, VSImode, VMSImode, VSUBmode, operands, imm5_p, i, reverse, unspec); @@ -1038,6 +1090,85 @@ emit_op6 (unsigned int unspec ATTRIBUTE_UNUSED, machine_mode Vmode, operands[4], operands[5])); } +/* Helper functions for handling sew=64 on RV32 system. */ +static bool +emit_op7_slide1 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, + machine_mode VSUBmode, rtx *operands, int i) +{ + if (!TARGET_64BIT && VSUBmode == DImode) + { + if (!imm32_p (operands[i])) + { + rtx s = operands[i]; + if (CONST_SCALAR_INT_P (s)) + { + s = force_reg (DImode, s); + } + + rtx hi = gen_highpart (SImode, s); + rtx lo = gen_lowpart (SImode, s); + + rtx vret = operands[0]; + rtx mask = operands[1]; + rtx vs = operands[3]; + rtx avl = operands[5]; + rtx vlx2 = gen_vlx2 (avl, Vmode, VSImode); + rtx vs_si = gen_lowpart (VSImode, vs); + rtx vtemp; + if (rtx_equal_p (operands[2], const0_rtx)) + { + vtemp = gen_reg_rtx (VSImode); + } + else + { + vtemp = gen_lowpart (VSImode, operands[2]); + } + + if (unspec == UNSPEC_SLIDE1UP) + { + rtx v1 = gen_reg_rtx (VSImode); + + emit_insn (gen_vslide1_vx_internal (UNSPEC_SLIDE1UP, VSImode, v1, + const0_rtx, const0_rtx, vs_si, + hi, vlx2, operands[6])); + emit_insn (gen_vslide1_vx_internal (UNSPEC_SLIDE1UP, VSImode, + vtemp, const0_rtx, const0_rtx, + v1, lo, vlx2, operands[6])); + } + else + { + emit_insn (gen_vslide1_vx_internal ( + UNSPEC_SLIDE1DOWN, VSImode, vtemp, const0_rtx, const0_rtx, + vs_si, force_reg (GET_MODE (lo), lo), vlx2, operands[6])); + emit_insn (gen_vslide1_vx_internal ( + UNSPEC_SLIDE1DOWN, VSImode, vtemp, const0_rtx, const0_rtx, + vtemp, force_reg (GET_MODE (hi), hi), vlx2, operands[6])); + } + + if (rtx_equal_p (mask, const0_rtx)) + { + emit_insn (gen_rtx_SET (vret, gen_lowpart (Vmode, vtemp))); + } + else + { + rtx dest = operands[2]; + if (rtx_equal_p (dest, const0_rtx)) + { + dest = vret; + } + emit_insn (gen_vmerge_vvm (Vmode, dest, mask, dest, dest, + gen_lowpart (Vmode, vtemp), + force_reg_for_over_uimm (avl), + operands[6])); + + emit_insn (gen_rtx_SET (vret, dest)); + } + + return 
true; + } + } + return false; +} /* Helper functions for handling sew=64 on RV32 system. */ void @@ -1046,6 +1177,14 @@ emit_op7 (unsigned int unspec, machine_mode Vmode, machine_mode VSImode, gen_7 *gen_vx, gen_7 *gen_vx_32bit, gen_7 *gen_vv, imm_p *imm5_p, int i, bool reverse) { + if (unspec == UNSPEC_SLIDE1UP || unspec == UNSPEC_SLIDE1DOWN) + { + if (emit_op7_slide1 (unspec, Vmode, VSImode, VSUBmode, operands, i)) + { + return; + } + } + enum GEN_CLASS gen_class = modify_operands ( Vmode, VSImode, VMSImode, VSUBmode, operands, imm5_p, i, reverse, unspec); diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 748025d4080..501980d822f 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -792,6 +792,9 @@ UNSPEC_VMIN UNSPEC_VMINU UNSPEC_VMAX UNSPEC_VMAXU UNSPEC_VMUL UNSPEC_VMULH UNSPEC_VMULHU UNSPEC_VMULHSU UNSPEC_VDIV UNSPEC_VDIVU UNSPEC_VREM UNSPEC_VREMU + UNSPEC_VSADD UNSPEC_VSADDU UNSPEC_VSSUB UNSPEC_VSSUBU + UNSPEC_VAADD UNSPEC_VAADDU UNSPEC_VASUB UNSPEC_VASUBU + UNSPEC_VSMUL ]) (define_int_iterator VXMOP [ diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index cb8bdc5781f..118d3ce6f61 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -1222,6 +1222,77 @@ } ) +;; vmv.s.x +(define_expand "@v_s_x" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand:VI 1 "vector_reg_or_const0_operand") + (match_operand: 2 "reg_or_const_int_operand") + (match_operand 3 "p_reg_or_const_csr_operand") + (match_operand 4 "const_int_operand") + ] VMVSOP)] + "TARGET_VECTOR" + { + emit_op5 ( + , + mode, mode, mode, + mode, + operands, + gen_v_s_x_internal, + gen_v_s_x_32bit, + NULL, + satisfies_constraint_, + 2, false + ); + DONE; + } +) + +;; vslide1 +(define_expand "@vslide1_vx" + [(unspec [ + (match_operand:VI 0 "register_operand") + (match_operand: 1 "vector_reg_or_const0_operand") + (match_operand:VI 2 "vector_reg_or_const0_operand") + (match_operand:VI 3 "register_operand") + (match_operand: 4 "reg_or_const_int_operand") + (match_operand 5 "reg_or_const_int_operand") + (match_operand 6 "const_int_operand") + ] VSLIDE1)] + "TARGET_VECTOR" + { + emit_op7 ( + , + mode, mode, mode, + mode, + operands, + gen_vslide1_vx_internal, + gen_vslide1_vx_32bit, + NULL, + satisfies_constraint_, + 4, false + ); + DONE; + } +) + +;; helper expand to double the vl operand +(define_expand "vmv_vlx2_help" + [ + (set (match_operand:SI 0 "register_operand") + (ashift:SI (match_operand:SI 1 "register_operand") + (const_int 1))) + (set (match_operand:SI 2 "register_operand") + (ltu:SI (match_dup 0) (match_dup 1))) + (set (match_dup 2) + (minus:SI (reg:SI X0_REGNUM) (match_dup 2))) + (set (match_dup 0) + (ior:SI (match_dup 0) (match_dup 2))) + ] + "TARGET_VECTOR" + "" +) + ;; ------------------------------------------------------------------------------- ;; ---- 11. Vector Integer Arithmetic Instructions ;; ------------------------------------------------------------------------------- @@ -3521,14 +3592,14 @@ "vmv.v.v\t%0,%2" [(set_attr "type" "vmove") (set_attr "mode" "")]) - + ;; Vector-Scalar Integer Move. 
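+;; A rough sketch of what the @v_s_x and @vslide1 expanders above do for
+;; SEW=64 elements on RV32 (through emit_op5/emit_op7 in riscv-vector.cc),
+;; where the 64-bit scalar does not fit in one X register: the vector operand
+;; is reinterpreted as SEW=32, the AVL is doubled via vmv_vlx2_help, and the
+;; two 32-bit halves of the scalar are slid in one after the other, roughly
+;; (register names are illustrative only):
+;;
+;;   vslide1up.vx  vtmp, vs(e32), hi32(rs1)   ;; high half first
+;;   vslide1up.vx  vd(e32), vtmp, lo32(rs1)   ;; low half ends up at index 0
+;;
+;; vmv_vlx2_help computes the doubled AVL with unsigned saturation, i.e. the
+;; equivalent of:  vlx2 = avl << 1; if (vlx2 < avl) vlx2 = ~0u;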
(define_insn "@vmv_v_x_internal" [(set (match_operand:VI 0 "register_operand" "=vr,vr,vr,vr") (unspec:VI [(match_operand:VI 1 "vector_reg_or_const0_operand" "0,0,J,J") (vec_duplicate:VI - (match_operand: 2 "reg_or_simm5_operand" "r,Ws5,r,Ws5")) + (match_operand: 2 "reg_or_simm5_operand" "r,Ws5,r,Ws5")) (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") (match_operand 4 "const_int_operand") (reg:SI VL_REGNUM) @@ -3561,46 +3632,2401 @@ [(set_attr "type" "vmove") (set_attr "mode" "")]) -;; Vector-Scalar Floating-Point Move. -(define_insn "@vfmv_v_f" - [(set (match_operand:VF 0 "register_operand" "=vr,vr") - (unspec:VF - [(match_operand:VF 1 "vector_reg_or_const0_operand" "0,J") - (vec_duplicate:VF - (match_operand: 2 "register_operand" "f,f")) - (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") - (match_operand 4 "const_int_operand") +;; ------------------------------------------------------------------------------- +;; ---- 12. Vector Fixed-Point Arithmetic Instructions +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 12.1 Vector Single-Width Saturating Add and Subtract +;; - 12.2 Vector Single-Width Aaveraging Add and Subtract +;; - 12.3 Vector Single-Width Fractional Multiply with Rounding and Saturation +;; - 12.5 Vector Single-Width Scaling Shift Instructions +;; - 12.6 Vector Narrowing Fixed-Point Clip Instructions +;; ------------------------------------------------------------------------------- + +;; Vector-Vector Single-Width Saturating Add. +(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_satplus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:VI 4 "vector_arith_operand" "vr,vi,vr,vi,vr,vi,vr,vi")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] - "TARGET_VECTOR" - "vfmv.v.f\t%0,%2" - [(set_attr "type" "vmove") - (set_attr "mode" "")]) + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4 + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) -;; vmclr.m vd -> vmxor.mm vd,vd,vd # Clear mask register -(define_insn "@vmclr_m" - [(set (match_operand:VB 0 "register_operand" "=vr") - (unspec:VB - [(vec_duplicate:VB (const_int 0)) - (match_operand 1 "p_reg_or_const_csr_operand" "rK") - (match_operand 2 "const_int_operand") - (reg:SI VL_REGNUM) - (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] - "TARGET_VECTOR" - "vmclr.m\t%0" - [(set_attr "type" "vmask") - (set_attr "mode" "")]) +;; Vector-Vector Single-Width Saturating Sub. 
+(define_insn "@vsssub_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (ss_minus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:VI 4 "vector_neg_arith_operand" "vr,vj,vr,vj,vr,vj,vr,vj")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vssub.vv\t%0,%3,%4,%1.t + vsadd.vi\t%0,%3,%V4,%1.t + vssub.vv\t%0,%3,%4,%1.t + vsadd.vi\t%0,%3,%V4,%1.t + vssub.vv\t%0,%3,%4 + vsadd.vi\t%0,%3,%V4 + vssub.vv\t%0,%3,%4 + vsadd.vi\t%0,%3,%V4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +(define_insn "@vussub_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (us_minus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vssubu.vv\t%0,%3,%4,%1.t + vssubu.vv\t%0,%3,%4,%1.t + vssubu.vv\t%0,%3,%4 + vssubu.vv\t%0,%3,%4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) -;; vmset.m vd -> vmxnor.mm vd,vd,vd # Set mask register -(define_insn "@vmset_m" - [(set (match_operand:VB 0 "register_operand" "=vr") - (unspec:VB - [(vec_duplicate:VB (const_int 1)) - (match_operand 1 "p_reg_or_const_csr_operand" "rK") - (match_operand 2 "const_int_operand") - (reg:SI VL_REGNUM) - (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] - "TARGET_VECTOR" - "vmset.m\t%0" - [(set_attr "type" "vmask") - (set_attr "mode" "")]) \ No newline at end of file +;; Vector-Scalar Single-Width Saturating Add. 
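+;; Note: the ISA provides vsadd.vi/vsaddu.vi but no vssub.vi, so the
+;; vsssub_vx templates below fold a small negative constant into vsadd.vi
+;; with the negated value instead of forcing it into a register; e.g.
+;; (illustrative only) a vssub of the constant -5 is printed as
+;;   vsadd.vi vd,vs2,5
+;; The unsigned vussub_vx patterns accept only a register scalar operand.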
+(define_insn "@v_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_satplus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4 + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +(define_insn "@v_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (any_satplus:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_simm5_operand" "r,Ws5,r,Ws5,r,Ws5,r,Ws5")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4 + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Single-Width Saturating Sub. +(define_insn "@vsssub_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (ss_minus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_neg_simm5_operand" "r,Wn5,r,Wn5,r,Wn5,r,Wn5"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + { + const char *tail = satisfies_constraint_J (operands[1]) ? 
"" : ",%1.t"; + char buf[64] = {0}; + if (satisfies_constraint_Wn5 (operands[4])) + { + const char *insn = "vsadd.vi\t%0,%3"; + snprintf (buf, sizeof (buf), "%s,%d%s", insn, (int)(-INTVAL (operands[4])), tail); + } + else + { + const char *insn = "vssub.vx\t%0,%3,%4"; + snprintf (buf, sizeof (buf), "%s%s", insn, tail); + } + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +(define_insn "@vussub_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (us_minus:VI + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "register_operand" "r,r,r,r"))) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vssubu.vx\t%0,%3,%4,%1.t + vssubu.vx\t%0,%3,%4,%1.t + vssubu.vx\t%0,%3,%4 + vssubu.vx\t%0,%3,%4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +(define_insn "@vsssub_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (ss_minus:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_neg_simm5_operand" "r,Wn5,r,Wn5,r,Wn5,r,Wn5")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + { + const char *tail = satisfies_constraint_J (operands[1]) ? "" : ",%1.t"; + char buf[64] = {0}; + if (satisfies_constraint_Wn5 (operands[4])) + { + const char *insn = "vsadd.vi\t%0,%3"; + snprintf (buf, sizeof (buf), "%s,%d%s", insn, (int)(-INTVAL (operands[4])), tail); + } + else + { + const char *insn = "vssub.vx\t%0,%3,%4"; + snprintf (buf, sizeof (buf), "%s%s", insn, tail); + } + output_asm_insn (buf, operands); + return ""; + } + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +(define_insn "@vussub_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (us_minus:V64BITI + (match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "register_operand" "r,r,r,r")))) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vssubu.vx\t%0,%3,%4,%1.t + vssubu.vx\t%0,%3,%4,%1.t + vssubu.vx\t%0,%3,%4 + vssubu.vx\t%0,%3,%4" + [(set_attr "type" "vsarith") + (set_attr "mode" "")]) + +;; Vector-Vector Single-Width Averaging Add and Subtract. +;; Vector-Vector Single-Width Fractional Multiply with Rounding and Saturation. 
+(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VI 4 "register_operand" "vr,vr,vr,vr")] SAT_OP) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vv\t%0,%3,%4 + v.vv\t%0,%3,%4" + [(set_attr "type" "") + (set_attr "mode" "")]) + +;; Vector-Scalar Single-Width Averaging Add and Subtract. +;; Vector-Scalar Single-Width Fractional Multiply with Rounding and Saturation. +(define_insn "@v_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:VI + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J"))] SAT_OP) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero" + [(set_attr "type" "") + (set_attr "mode" "")]) + +(define_insn "@v_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V64BITI + [(match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")))] SAT_OP) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4,%1.t + v.vx\t%0,%3,zero,%1.t + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero + v.vx\t%0,%3,%4 + v.vx\t%0,%3,zero" + [(set_attr "type" "") + (set_attr "mode" "")]) + +;; Vector-Vector Single-Width Scaling Shift Instructions. 
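+;; Note: the averaging (vaadd/vasub), fractional-multiply (vsmul),
+;; scaling-shift (vssra/vssrl) and narrowing-clip operations are expressed
+;; as bare unspecs rather than native RTL arithmetic, presumably because
+;; their results depend on the fixed-point rounding mode held in the vxrm
+;; CSR and must not be simplified behind its back.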
+(define_insn "@v_vv" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand:VI 4 "vector_shift_operand" "vr,vk,vr,vk,vr,vk,vr,vk")] SSHIFT) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4,%1.t + v.vi\t%0,%3,%v4,%1.t + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4 + v.vv\t%0,%3,%4 + v.vi\t%0,%3,%v4" + [(set_attr "type" "vscaleshift") + (set_attr "mode" "")]) + +;; Vector-Scalar Single-Width Scaling Shift Instructions. +(define_insn "@v_vx" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand 4 "p_reg_or_uimm5_operand" "r,K,r,K,r,K,r,K")] SSHIFT) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4,%1.t + v.vi\t%0,%3,%4,%1.t + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4 + v.vx\t%0,%3,%4 + v.vi\t%0,%3,%4" + [(set_attr "type" "vscaleshift") + (set_attr "mode" "")]) + +;; Vector-Vector signed/unsigned clip. +(define_insn "@vn_wv" + [(set (match_operand:VWI 0 "register_operand" "=vd,vd,&vd,vd,&vd, vd,vd,&vd,vd,&vd, vr,vr,&vr,vr,&vr, vr,vr,&vr,vr,&vr") + (unspec:VWI + [(unspec:VWI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,vm, vm,vm,vm,vm,vm, J,J,J,J,J, J,J,J,J,J") + (unspec:VWI + [(match_operand: 3 "register_operand" "0,vr,vr,0,vr, 0,vr,vr,0,vr, 0,vr,vr,0,vr, 0,vr,vr,0,vr") + (match_operand:VWI 4 "vector_shift_operand" "vr,0,vr,vk,vk, vr,0,vr,vk,vk, vr,0,vr,vk,vk, vr,0,vr,vk,vk")] CLIP) + (match_operand:VWI 2 "vector_reg_or_const0_operand" "0,0,0,0,0, J,J,J,J,J, 0,0,0,0,0, J,J,J,J,J") + ] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK, rK,rK,rK,rK,rK, rK,rK,rK,rK,rK, rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wv\t%0,%3,%4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wi\t%0,%3,%v4,%1.t + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wi\t%0,%3,%v4 + vn.wi\t%0,%3,%v4 + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wv\t%0,%3,%4 + vn.wi\t%0,%3,%v4 + vn.wi\t%0,%3,%v4" + [(set_attr "type" "vclip") + (set_attr "mode" "")]) + +;; Vector-Scalar signed/unsigned clip. 
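+;; Note: the destination of the narrowing clips is half the width of wide
+;; source operand 3, so every alternative either ties the destination to one
+;; of the sources ("0") or marks it earlyclobber ("&"), presumably to respect
+;; the overlap restrictions RVV places on narrowing instructions.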
+(define_insn "@vn_wx"
+ [(set (match_operand:VWI 0 "register_operand" "=vd,&vd,vd,&vd, vd,&vd,vd,&vd, vr,&vr,vr,&vr, vr,?&vr,vr,&vr")
+ (unspec:VWI
+ [(unspec:VWI
+ [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm, vm,vm,vm,vm, J,J,J,J, J,J,J,J")
+ (unspec:VWI
+ [(match_operand: 3 "register_operand" "0,vr,0,vr, 0,vr,0,vr, 0,vr,0,vr, 0,vr,0,vr")
+ (match_operand 4 "p_reg_or_uimm5_operand" "r,r,K,K, r,r,K,K, r,r,K,K, r,r,K,K")] CLIP)
+ (match_operand:VWI 2 "vector_reg_or_const0_operand" "0,0,0,0, J,J,J,J, 0,0,0,0, J,J,J,J")
+ ] UNSPEC_SELECT)
+ (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK,rK")
+ (match_operand 6 "const_int_operand")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))]
+ "TARGET_VECTOR"
+ "@
+ vn.wx\t%0,%3,%4,%1.t
+ vn.wx\t%0,%3,%4,%1.t
+ vn.wi\t%0,%3,%4,%1.t
+ vn.wi\t%0,%3,%4,%1.t
+ vn.wx\t%0,%3,%4,%1.t
+ vn.wx\t%0,%3,%4,%1.t
+ vn.wi\t%0,%3,%4,%1.t
+ vn.wi\t%0,%3,%4,%1.t
+ vn.wx\t%0,%3,%4
+ vn.wx\t%0,%3,%4
+ vn.wi\t%0,%3,%4
+ vn.wi\t%0,%3,%4
+ vn.wx\t%0,%3,%4
+ vn.wx\t%0,%3,%4
+ vn.wi\t%0,%3,%4
+ vn.wi\t%0,%3,%4"
+ [(set_attr "type" "vclip")
+ (set_attr "mode" "")])
+
+;; -------------------------------------------------------------------------------
+;; ---- 13. Vector Floating-Point Arithmetic Instructions
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - 13.2 Vector Single-Width Floating-Point Add/Subtract Instructions
+;; - 13.3 Vector Widening Floating-Point Add/Subtract Instructions
+;; - 13.4 Vector Single-Width Floating-Point Multiply/Divide Instructions
+;; - 13.5 Vector Widening Floating-Point Multiply
+;; - 13.6 Vector Single-Width Floating-Point Fused Multiply-Add Instructions
+;; - 13.7 Vector Widening Floating-Point Fused Multiply-Add Instructions
+;; - 13.8 Vector Floating-Point Square-Root Instruction
+;; - 13.9 Vector Floating-Point Reciprocal Square-Root Estimate Instruction
+;; - 13.10 Vector Floating-Point Reciprocal Estimate Instruction
+;; - 13.11 Vector Floating-Point MIN/MAX Instructions
+;; - 13.12 Vector Floating-Point Sign-Injection Instructions
+;; - 13.13 Vector Floating-Point Compare Instructions
+;; - 13.14 Vector Floating-Point Classify Instruction
+;; - 13.15 Vector Floating-Point Merge Instruction
+;; - 13.16 Vector Floating-Point Move Instruction
+;; - 13.17 Single-Width Floating-Point/Integer Type-Convert Instructions
+;; - 13.18 Widening Floating-Point/Integer Type-Convert Instructions
+;; - 13.19 Narrowing Floating-Point/Integer Type-Convert Instructions
+;; -------------------------------------------------------------------------------
+
+;; Vector-Vector Single-Width Floating-Point Add/Subtract Instructions.
+;; Vector-Vector Single-Width Floating-Point Multiply/Divide Instructions.
+;; Vector-Vector Single-Width Floating-Point MIN/MAX Instructions.
+(define_insn "@vf_vv" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_fop:VF + (match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VF 4 "register_operand" "vr,vr,vr,vr")) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vf.vv\t%0,%3,%4,%1.t + vf.vv\t%0,%3,%4,%1.t + vf.vv\t%0,%3,%4 + vf.vv\t%0,%3,%4" + [(set_attr "type" "") + (set_attr "mode" "")]) + +;; Vector-Scalar Single-Width Floating-Point Add/Subtract Instructions. +;; Vector-Scalar Single-Width Floating-Point Multiply/Divide Instrucions. +;; Vector-Scalar Single-Width Floating-Point MIN/MAX Instrucions. +(define_insn "@vf_vf" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_fop:VF + (match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VF + (match_operand: 4 "register_operand" "f,f,f,f"))) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vf.vf\t%0,%3,%4,%1.t + vf.vf\t%0,%3,%4,%1.t + vf.vf\t%0,%3,%4 + vf.vf\t%0,%3,%4" + [(set_attr "type" "") + (set_attr "mode" "")]) + +;; Floating-Point Reverse Sub/Div. +(define_insn "@vfr_vf" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (minus_div:VF + (vec_duplicate:VF + (match_operand: 4 "register_operand" "f,f,f,f")) + (match_operand:VF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfr.vf\t%0,%3,%4,%1.t + vfr.vf\t%0,%3,%4,%1.t + vfr.vf\t%0,%3,%4 + vfr.vf\t%0,%3,%4" + [(set_attr "type" "varith") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Float Add/Subtract. +(define_insn "@vfw_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus_minus: + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")) + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfw.vv\t%0,%3,%4,%1.t + vfw.vv\t%0,%3,%4,%1.t + vfw.vv\t%0,%3,%4 + vfw.vv\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Float Add/Subtract. 
+(define_insn "@vfw_vf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus_minus: + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")) + (float_extend: + (vec_duplicate:VWF + (match_operand: 4 "register_operand" "f,f,f,f")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfw.vf\t%0,%3,%4,%1.t + vfw.vf\t%0,%3,%4,%1.t + vfw.vf\t%0,%3,%4 + vfw.vf\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Float Add/Subtract. +(define_insn "@vfw_wv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus_minus: + (match_operand: 3 "register_operand" "vr,vr,vr,vr") + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfw.wv\t%0,%3,%4,%1.t + vfw.wv\t%0,%3,%4,%1.t + vfw.wv\t%0,%3,%4 + vfw.wv\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Float Add/Subtract. +(define_insn "@vfw_wf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (plus_minus: + (match_operand: 3 "register_operand" "vr,vr,vr,vr") + (float_extend: + (vec_duplicate:VWF + (match_operand: 4 "register_operand" "f,f,f,f")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfw.wf\t%0,%3,%4,%1.t + vfw.wf\t%0,%3,%4,%1.t + vfw.wf\t%0,%3,%4 + vfw.wf\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Vector Widening Float multiply. +(define_insn "@vfwmul_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (mult: + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")) + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr,vr,vr"))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwmul.vv\t%0,%3,%4,%1.t + vfwmul.vv\t%0,%3,%4,%1.t + vfwmul.vv\t%0,%3,%4 + vfwmul.vv\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening Float multiply. 
+(define_insn "@vfwmul_vf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (mult: + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")) + (float_extend: + (vec_duplicate:VWF + (match_operand: 4 "register_operand" "f,f,f,f")))) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwmul.vf\t%0,%3,%4,%1.t + vfwmul.vf\t%0,%3,%4,%1.t + vfwmul.vf\t%0,%3,%4 + vfwmul.vf\t%0,%3,%4" + [(set_attr "type" "vwarith") + (set_attr "mode" "")]) + +;; Vector-Vector Single-Width Floating-Point Fused Multiply-Add Instrucions. +(define_insn "@vf_vv" + [(set (match_operand:VF 0 "register_operand" "=vd,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (unspec:VF + [(match_operand:VF 2 "register_operand" "0,0") + (match_operand:VF 3 "register_operand" "vr,vr") + (match_operand:VF 4 "register_operand" "vr,vr")] FMAC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vf.vv\t%0,%3,%4,%1.t + vf.vv\t%0,%3,%4" + [(set_attr "type" "vmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Single-Width Floating-Point Fused Multiply-Add Instrucions. +(define_insn "@vf_vf" + [(set (match_operand:VF 0 "register_operand" "=vd,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (unspec:VF + [(match_operand:VF 2 "register_operand" "0,0") + (vec_duplicate:VF + (match_operand: 3 "register_operand" "f,f")) + (match_operand:VF 4 "register_operand" "vr,vr")] FMAC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vf.vf\t%0,%3,%4,%1.t + vf.vf\t%0,%3,%4" + [(set_attr "type" "vmadd") + (set_attr "mode" "")]) + +;; Vector-Vector Widening multiply-accumulate, overwrite addend. +;; Vector-Vector Widening multiply-subtract-accumulate, overwrite addend. 
+(define_insn "@vfwmacc_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (plus: + (mult: + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr")) + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr"))) + (match_operand: 2 "register_operand" "0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwmacc.vv\t%0,%3,%4,%1.t + vfwmacc.vv\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +(define_insn "@vfwmsac_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (minus: + (mult: + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr")) + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr"))) + (match_operand: 2 "register_operand" "0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwmsac.vv\t%0,%3,%4,%1.t + vfwmsac.vv\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening multiply-accumulate, overwrite addend. +;; Vector-Scalar Widening multiply-subtract-accumulate, overwrite addend. +(define_insn "@vfwmacc_vf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (plus: + (mult: + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr")) + (float_extend: + (vec_duplicate:VWF + (match_operand: 3 "register_operand" "f,f")))) + (match_operand: 2 "register_operand" "0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwmacc.vf\t%0,%3,%4,%1.t + vfwmacc.vf\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +(define_insn "@vfwmsac_vf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (minus: + (mult: + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr")) + (float_extend: + (vec_duplicate:VWF + (match_operand: 3 "register_operand" "f,f")))) + (match_operand: 2 "register_operand" "0,0")) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwmsac.vf\t%0,%3,%4,%1.t + vfwmsac.vf\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Vector Widening negate-(multiply-accumulate), overwrite addend. +;; Vector-Vector Widening negate-(multiply-subtract-accumulate), overwrite addend. 
+(define_insn "@vfwnmacc_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (neg: + (plus: + (mult: + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr")) + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr"))) + (match_operand: 2 "register_operand" "0,0"))) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwnmacc.vv\t%0,%3,%4,%1.t + vfwnmacc.vv\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +(define_insn "@vfwnmsac_vv" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (neg: + (minus: + (mult: + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr")) + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr"))) + (match_operand: 2 "register_operand" "0,0"))) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwnmsac.vv\t%0,%3,%4,%1.t + vfwnmsac.vv\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Vector-Scalar Widening negate-(multiply-accumulate), overwrite addend. +;; Vector-Scalar Widening negate-(multiply-subtract-accumulate), overwrite addend. +(define_insn "@vfwnmacc_vf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (neg: + (plus: + (mult: + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr")) + (float_extend: + (vec_duplicate:VWF + (match_operand: 3 "register_operand" "f,f")))) + (match_operand: 2 "register_operand" "0,0"))) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwnmacc.vf\t%0,%3,%4,%1.t + vfwnmacc.vf\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +(define_insn "@vfwnmsac_vf" + [(set (match_operand: 0 "register_operand" "=&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,J") + (neg: + (minus: + (mult: + (float_extend: + (match_operand:VWF 4 "register_operand" "vr,vr")) + (float_extend: + (vec_duplicate:VWF + (match_operand: 3 "register_operand" "f,f")))) + (match_operand: 2 "register_operand" "0,0"))) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwnmsac.vf\t%0,%3,%4,%1.t + vfwnmsac.vf\t%0,%3,%4" + [(set_attr "type" "vwmadd") + (set_attr "mode" "")]) + +;; Floating-Point square root. 
+(define_insn "@vfsqrt_v" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (sqrt:VF + (match_operand:VF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfsqrt.v\t%0,%3,%1.t + vfsqrt.v\t%0,%3,%1.t + vfsqrt.v\t%0,%3 + vfsqrt.v\t%0,%3" + [(set_attr "type" "vfsqrt") + (set_attr "mode" "")]) + +;; Floating-Point Reciprocal Square-Root Estimate. +;; Floating-Point Reciprocal Estimate. +(define_insn "@vf_v" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VF + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr")] RECIPROCAL) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vf.v\t%0,%3,%1.t + vf.v\t%0,%3,%1.t + vf.v\t%0,%3 + vf.v\t%0,%3" + [(set_attr "type" "vdiv") + (set_attr "mode" "")]) + +;; Vector-Vector Floating-Point Sign-Injection. +(define_insn "@vfsgnj_vv" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VF + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (match_operand:VF 4 "register_operand" "vr,vr,vr,vr")] COPYSIGNS) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfsgnj.vv\t%0,%3,%4,%1.t + vfsgnj.vv\t%0,%3,%4,%1.t + vfsgnj.vv\t%0,%3,%4 + vfsgnj.vv\t%0,%3,%4" + [(set_attr "type" "vfsgnj") + (set_attr "mode" "")]) + +;; Vector-Scalar Floating-Point Sign-Injection. +(define_insn "@vfsgnj_vf" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VF + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (vec_duplicate:VF + (match_operand: 4 "register_operand" "f,f,f,f"))] COPYSIGNS) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfsgnj.vf\t%0,%3,%4,%1.t + vfsgnj.vf\t%0,%3,%4,%1.t + vfsgnj.vf\t%0,%3,%4 + vfsgnj.vf\t%0,%3,%4" + [(set_attr "type" "vfsgnj") + (set_attr "mode" "")]) + +;; vfneg.v vd,vs = vfsgnjn.vv vd,vs,vs. 
+(define_insn "@vfneg_v"
+ [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr")
+ (unspec:VF
+ [(unspec:VF
+ [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J")
+ (neg:VF
+ (match_operand:VF 3 "register_operand" "vr,vr,vr,vr"))
+ (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT)
+ (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK")
+ (match_operand 5 "const_int_operand")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))]
+ "TARGET_VECTOR"
+ "@
+ vfneg.v\t%0,%3,%1.t
+ vfneg.v\t%0,%3,%1.t
+ vfneg.v\t%0,%3
+ vfneg.v\t%0,%3"
+ [(set_attr "type" "vfsgnj")
+ (set_attr "mode" "")])
+
+;; vfabs.v vd,vs = vfsgnjx.vv vd,vs,vs.
+(define_insn "@vfabs_v"
+ [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr")
+ (unspec:VF
+ [(unspec:VF
+ [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J")
+ (abs:VF
+ (match_operand:VF 3 "register_operand" "vr,vr,vr,vr"))
+ (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT)
+ (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK")
+ (match_operand 5 "const_int_operand")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))]
+ "TARGET_VECTOR"
+ "@
+ vfabs.v\t%0,%3,%1.t
+ vfabs.v\t%0,%3,%1.t
+ vfabs.v\t%0,%3
+ vfabs.v\t%0,%3"
+ [(set_attr "type" "vfsgnj")
+ (set_attr "mode" "")])
+
+;; Vector-Vector Floating-Point Compare Instructions.
+(define_insn "@vmf_vv"
+ [(set (match_operand: 0 "register_operand" "=vr,vr,vm,&vr, vr,vr,vm,&vr, vr,vr,&vr")
+ (unspec:
+ [(unspec:
+ [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,0,vm, vm,vm,0,vm, J,J,J")
+ (any_fcmp:
+ (match_operand:VF 3 "register_operand" "0,vr,vr,vr, 0,vr,vr,vr, 0,vr,vr")
+ (match_operand:VF 4 "register_operand" "vr,0,vr,vr, vr,0,vr,vr, vr,0,vr"))
+ (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0,0, J,J,J,J, J,J,J")] UNSPEC_SELECT)
+ (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK, rK,rK,rK,rK, rK,rK,rK")
+ (match_operand 6 "const_int_operand")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))]
+ "TARGET_VECTOR"
+ "@
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4,%1.t
+ vmf.vv\t%0,%3,%4
+ vmf.vv\t%0,%3,%4
+ vmf.vv\t%0,%3,%4"
+ [(set_attr "type" "vcmp")
+ (set_attr "mode" "")])
+
+;; Vector-Scalar Floating-Point Compare Instructions.
+(define_insn "@vmf_vf"
+ [(set (match_operand: 0 "register_operand" "=vr,vm,&vr, vr,vm,&vr, vr,&vr")
+ (unspec:
+ [(unspec:
+ [(match_operand: 1 "vector_reg_or_const0_operand" "vm,0,vm, vm,0,vm, J,J")
+ (any_fcmp:
+ (match_operand:VF 3 "register_operand" "0,vr,vr, 0,vr,vr, 0,vr")
+ (vec_duplicate:VF
+ (match_operand: 4 "register_operand" "f,f,f, f,f,f, f,f")))
+ (match_operand: 2 "vector_reg_or_const0_operand" "0,0,0, J,J,J, J,J")] UNSPEC_SELECT)
+ (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK, rK,rK,rK, rK,rK")
+ (match_operand 6 "const_int_operand")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))]
+ "TARGET_VECTOR"
+ "@
+ vmf.vf\t%0,%3,%4,%1.t
+ vmf.vf\t%0,%3,%4,%1.t
+ vmf.vf\t%0,%3,%4,%1.t
+ vmf.vf\t%0,%3,%4,%1.t
+ vmf.vf\t%0,%3,%4,%1.t
+ vmf.vf\t%0,%3,%4,%1.t
+ vmf.vf\t%0,%3,%4
+ vmf.vf\t%0,%3,%4"
+ [(set_attr "type" "vcmp")
+ (set_attr "mode" "")])
+
+;; Vector-Vector Floating-Point Comparison with no trapping.
+;; These are used by auto-vectorization.
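+;; Note on the expander below: "ltgt" is built from two ordered greater-than
+;; compares with the operands swapped, OR-ed into one mask.  The unordered
+;; predicates first build a mask of lanes where neither input is NaN
+;; (vmfeq a,a AND vmfeq b,b); plain "unordered" is the inverse of that mask,
+;; while unlt/unle/unge/ungt run the ordered compare under that mask with the
+;; excluded (NaN) lanes forced to 1 through an all-ones maskedoff operand.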
+(define_expand "@vmf_vv" + [(set (match_operand: 0 "register_operand") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand") + (any_fcmp_no_trapping: + (match_operand:VF 3 "register_operand") + (match_operand:VF 4 "register_operand")) + (match_operand: 2 "vector_reg_or_const0_operand")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand") + (match_operand 6 "const_int_operand")] UNSPEC_RVV))] + "TARGET_VECTOR" +{ + rtx mask = gen_reg_rtx (mode); + if (strcmp ("", "ltgt") == 0) + { + emit_insn (gen_vmf_vv (GT, mode, operands[0], + operands[1], operands[2], operands[3], operands[4], + operands[5], operands[6])); + emit_insn (gen_vmf_vv (GT, mode, mask, + operands[1], operands[2], operands[4], operands[3], + operands[5], operands[6])); + emit_insn (gen_vm_mm (IOR, mode, operands[0], operands[0], mask, + operands[5], operands[6])); + } + else + { + /* Example of implementing isgreater() + vmfeq.vv v0, va, va ;; Only set where A is not NaN. + vmfeq.vv v1, vb, vb ;; Only set where B is not NaN. + vmand.mm v0, v0, v1 ;; Only set where A and B are ordered, + vmfgt.vv v0, va, vb, v0.t ;; so only set flags on ordered values. */ + emit_insn (gen_vmf_vv (EQ, mode, operands[0], + operands[1], operands[2], operands[3], operands[3], + operands[5], operands[6])); + emit_insn (gen_vmf_vv (EQ, mode, mask, + operands[1], operands[2], operands[4], operands[4], + operands[5], operands[6])); + emit_insn (gen_vm_mm (AND, mode, operands[0], operands[0], mask, + operands[5], operands[6])); + + rtx all_ones = gen_reg_rtx (mode); + emit_insn (gen_vmset_m (all_ones, operands[5], + rvv_gen_policy ())); + + if (strcmp ("", "ordered") != 0) + { + if (strcmp ("", "unordered") == 0) + emit_insn (gen_vmnot_m (mode, operands[0], operands[0], operands[5], operands[6])); + else + { + enum rtx_code code = strcmp ("", "unlt") == 0 ? LT : + strcmp ("", "unle") == 0 ? LE : + strcmp ("", "unge") == 0 ? GE : + strcmp ("", "ungt") == 0 ? GT : EQ; + emit_insn (gen_vmf_vv (code, mode, operands[0], + operands[0], all_ones, operands[3], operands[4], + operands[5], operands[6])); + } + } + } + DONE; +}) + +;; Floating-Point Classify Instruction. +(define_insn "@vfclass_v" + [(set (match_operand: 0 "register_operand" "=vd,vd,vr,vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr")] UNSPEC_FCLASS) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfclass.v\t%0,%3,%1.t + vfclass.v\t%0,%3,%1.t + vfclass.v\t%0,%3 + vfclass.v\t%0,%3" + [(set_attr "type" "vfclass") + (set_attr "mode" "")]) + +;; Vector-Scalar Floating-Point merge. 
+(define_insn "@vfmerge_vfm" + [(set (match_operand:VF 0 "register_operand" "=vd,vd") + (unspec:VF + [(match_operand:VF 2 "vector_reg_or_const0_operand" "0,J") + (unspec:VF + [(match_operand: 1 "register_operand" "vm,vm") + (match_operand:VF 3 "register_operand" "vr,vr") + (vec_duplicate:VF + (match_operand: 4 "register_operand" "f,f"))] UNSPEC_MERGE) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfmerge.vfm\t%0,%3,%4,%1 + vfmerge.vfm\t%0,%3,%4,%1" + [(set_attr "type" "vmerge") + (set_attr "mode" "")]) + +;; Vector-Scalar Floating-Point Move. +(define_insn "@vfmv_v_f" + [(set (match_operand:VF 0 "register_operand" "=vr,vr") + (unspec:VF + [(match_operand:VF 1 "vector_reg_or_const0_operand" "0,J") + (vec_duplicate:VF + (match_operand: 2 "register_operand" "f,f")) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vfmv.v.f\t%0,%2" + [(set_attr "type" "vmove") + (set_attr "mode" "")]) + +;; Convert float to unsigned integer. +;; Convert float to signed integer. +(define_insn "@vfcvt_x_f_v" + [(set (match_operand: 0 "register_operand" "=vd,vd,vr,vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr")] FCVT) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfcvt.x.f.v\t%0,%3,%1.t + vfcvt.x.f.v\t%0,%3,%1.t + vfcvt.x.f.v\t%0,%3 + vfcvt.x.f.v\t%0,%3" + [(set_attr "type" "vfcvt") + (set_attr "mode" "")]) + +;; Convert float to unsigned integer, truncating. +;; Convert float to signed integer, truncating. +(define_insn "@vfcvt_rtz_x_f_v" + [(set (match_operand: 0 "register_operand" "=vd,vd,vr,vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_fix: + (match_operand:VF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfcvt.rtz.x.f.v\t%0,%3,%1.t + vfcvt.rtz.x.f.v\t%0,%3,%1.t + vfcvt.rtz.x.f.v\t%0,%3 + vfcvt.rtz.x.f.v\t%0,%3" + [(set_attr "type" "vfcvt") + (set_attr "mode" "")]) + +;; Convert unsigned integer to float. +;; Convert signed integer to float. +(define_insn "@vfcvt_f_x_v" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_float:VF + (match_operand: 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfcvt.f.x.v\t%0,%3,%1.t + vfcvt.f.x.v\t%0,%3,%1.t + vfcvt.f.x.v\t%0,%3 + vfcvt.f.x.v\t%0,%3" + [(set_attr "type" "vfcvt") + (set_attr "mode" "")]) + +;; Convert float to double-width unsigned integer. +;; Convert float to double-width signed integer. 
+(define_insn "@vfwcvt_x_f_v" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")] FCVT) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwcvt.x.f.v\t%0,%3,%1.t + vfwcvt.x.f.v\t%0,%3,%1.t + vfwcvt.x.f.v\t%0,%3 + vfwcvt.x.f.v\t%0,%3" + [(set_attr "type" "vfwcvt") + (set_attr "mode" "")]) + +;; Convert float to double-width unsigned integer, truncating. +;; Convert float to double-width signed integer, truncating. +(define_insn "@vfwcvt_rtz_x_f_v" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_fix: + (match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwcvt.rtz.x.f.v\t%0,%3,%1.t + vfwcvt.rtz.x.f.v\t%0,%3,%1.t + vfwcvt.rtz.x.f.v\t%0,%3 + vfwcvt.rtz.x.f.v\t%0,%3" + [(set_attr "type" "vfwcvt") + (set_attr "mode" "")]) + +;; Convert unsigned integer to double-width float. +;; Convert signed integer to double-width float. +(define_insn "@vfwcvt_f_x_v" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_float: + (match_operand:VWINOQI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwcvt.f.x.v\t%0,%3,%1.t + vfwcvt.f.x.v\t%0,%3,%1.t + vfwcvt.f.x.v\t%0,%3 + vfwcvt.f.x.v\t%0,%3" + [(set_attr "type" "vfwcvt") + (set_attr "mode" "")]) + +;; Convert single-width float to double-width float +(define_insn "@vfwcvt_f_f_v" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (float_extend: + (match_operand:VWF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwcvt.f.f.v\t%0,%3,%1.t + vfwcvt.f.f.v\t%0,%3,%1.t + vfwcvt.f.f.v\t%0,%3 + vfwcvt.f.f.v\t%0,%3" + [(set_attr "type" "vfwcvt") + (set_attr "mode" "")]) + +;; Convert double-width float to unsigned integer. +;; Convert double-width float to signed integer. 
+(define_insn "@vfncvt_x_f_w" + [(set (match_operand:VWINOQI 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VWINOQI + [(unspec:VWINOQI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VWINOQI + [(match_operand: 3 "register_operand" "vr,vr,vr,vr")] FCVT) + (match_operand:VWINOQI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfncvt.x.f.w\t%0,%3,%1.t + vfncvt.x.f.w\t%0,%3,%1.t + vfncvt.x.f.w\t%0,%3 + vfncvt.x.f.w\t%0,%3" + [(set_attr "type" "vfncvt") + (set_attr "mode" "")]) + +;; Convert double-width float to unsigned integer, truncating. +;; Convert double-width float to signed integer, truncating. +(define_insn "@vfncvt_rtz_x_f_w" + [(set (match_operand:VWINOQI 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VWINOQI + [(unspec:VWINOQI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_fix:VWINOQI + (match_operand: 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VWINOQI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfncvt.rtz.x.f.w\t%0,%3,%1.t + vfncvt.rtz.x.f.w\t%0,%3,%1.t + vfncvt.rtz.x.f.w\t%0,%3 + vfncvt.rtz.x.f.w\t%0,%3" + [(set_attr "type" "vfncvt") + (set_attr "mode" "")]) + +;; Convert double-width unsigned integer to float. +;; Convert double-width signed integer to float. +(define_insn "@vfncvt_f_x_w" + [(set (match_operand:VWF 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VWF + [(unspec:VWF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (any_float:VWF + (match_operand: 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VWF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfncvt.f.x.w\t%0,%3,%1.t + vfncvt.f.x.w\t%0,%3,%1.t + vfncvt.f.x.w\t%0,%3 + vfncvt.f.x.w\t%0,%3" + [(set_attr "type" "vfncvt") + (set_attr "mode" "")]) + +;; Convert double-width float to single-width float. +(define_insn "@vfncvt_f_f_w" + [(set (match_operand:VWF 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VWF + [(unspec:VWF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (float_truncate:VWF + (match_operand: 3 "register_operand" "vr,vr,vr,vr")) + (match_operand:VWF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfncvt.f.f.w\t%0,%3,%1.t + vfncvt.f.f.w\t%0,%3,%1.t + vfncvt.f.f.w\t%0,%3 + vfncvt.f.f.w\t%0,%3" + [(set_attr "type" "vfncvt") + (set_attr "mode" "")]) + +;; Convert double-width float to single-width float, rounding towards odd. 
+(define_insn "@vfncvt_rod_f_f_w" + [(set (match_operand:VWF 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VWF + [(unspec:VWF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VWF + [(float_extend:VWF + (match_operand: 3 "register_operand" "vr,vr,vr,vr"))] UNSPEC_ROD) + (match_operand:VWF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfncvt.rod.f.f.w\t%0,%3,%1.t + vfncvt.rod.f.f.w\t%0,%3,%1.t + vfncvt.rod.f.f.w\t%0,%3 + vfncvt.rod.f.f.w\t%0,%3" + [(set_attr "type" "vfncvt") + (set_attr "mode" "")]) + +;; ------------------------------------------------------------------------------- +;; ---- 14. Vector Reduction Operations +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 14.1 Vector Single-Width Integer Reduction Instructions +;; - 14.2 Vector Widening Integer Reduction Instructions +;; - 14.3 Vector Single-Width Floating-Point Reduction +;; - 14.4 Vector Widening Floating-Point Reduction Instructions +;; ------------------------------------------------------------------------------- + +;; Integer simple-reductions. +(define_insn "@vred_vs" + [(set (match_operand: 0 "register_operand" "=vr,vr,vr,vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J") + (match_operand:VI 3 "register_operand" "vr,vr,vr,vr") + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] REDUC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vred.vs\t%0,%3,%4,%1.t + vred.vs\t%0,%3,%4,%1.t + vred.vs\t%0,%3,%4 + vred.vs\t%0,%3,%4" + [(set_attr "type" "vreduc") + (set_attr "mode" "")]) + +;; Signed/Unsigned sum reduction into double-width accumulator. +(define_insn "@vwredsum_vs" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J") + (any_extend: + (match_operand:VWREDI 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] UNSPEC_REDUC_SUM) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vwredsum.vs\t%0,%3,%4,%1.t + vwredsum.vs\t%0,%3,%4,%1.t + vwredsum.vs\t%0,%3,%4 + vwredsum.vs\t%0,%3,%4" + [(set_attr "type" "vwreduc") + (set_attr "mode" "")]) + +;; Floating-Point simple-reductions. 
+(define_insn "@vfred_vs" + [(set (match_operand: 0 "register_operand" "=vr,vr,vr,vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J") + (match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] FREDUC) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfred.vs\t%0,%3,%4,%1.t + vfred.vs\t%0,%3,%4,%1.t + vfred.vs\t%0,%3,%4 + vfred.vs\t%0,%3,%4" + [(set_attr "type" "vreduc") + (set_attr "mode" "")]) + +;; unordered sum reduction into double-width accumulator. +(define_insn "@vfwredusum_vs" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J") + (float_extend: + (match_operand:VWREDF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] UNSPEC_REDUC_UNORDERED_SUM) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwredusum.vs\t%0,%3,%4,%1.t + vfwredusum.vs\t%0,%3,%4,%1.t + vfwredusum.vs\t%0,%3,%4 + vfwredusum.vs\t%0,%3,%4" + [(set_attr "type" "vwreduc") + (set_attr "mode" "")]) + +;; ordered sum reduction into double-width accumulator. +(define_insn "@vfwredosum_vs" + [(set (match_operand: 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec: + [(unspec: + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec: + [(match_operand: 2 "vector_reg_or_const0_operand" "0,J,0,J") + (float_extend: + (match_operand:VWREDF 3 "register_operand" "vr,vr,vr,vr")) + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] UNSPEC_REDUC_ORDERED_SUM) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfwredosum.vs\t%0,%3,%4,%1.t + vfwredosum.vs\t%0,%3,%4,%1.t + vfwredosum.vs\t%0,%3,%4 + vfwredosum.vs\t%0,%3,%4" + [(set_attr "type" "vwreduc") + (set_attr "mode" "")]) + +;; ------------------------------------------------------------------------------- +;; ---- 15. Vector Mask Instructions +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 15.1 Vector Mask-Register Logical Instructions +;; - 15.2 Vector mask population count vpopc +;; - 15.3 vfirst find-first-set mask bit +;; - 15.4 vmsbf.m set-before-first mask bit +;; - 15.5 vmsif.m set-including-fisrt mask bit +;; - 15.6 vmsof.m set-only-first mask bit +;; - 15.8 Vector Iota Instruction +;; - 15.9 Vector Element Index Instructions +;; ------------------------------------------------------------------------------- + +;; Vector Mask-Register Logical Instructions. 
+(define_insn "@vm_mm" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(any_bitwise:VB + (match_operand:VB 1 "register_operand" "vr") + (match_operand:VB 2 "register_operand" "vr")) + (match_operand 3 "p_reg_or_const_csr_operand" "rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vm.mm\t%0,%1,%2" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +(define_insn "@vmn_mm" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(not:VB + (any_bitwise:VB + (match_operand:VB 1 "register_operand" "vr") + (match_operand:VB 2 "register_operand" "vr"))) + (match_operand 3 "p_reg_or_const_csr_operand" "rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vm.mm\t%0,%1,%2" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +(define_insn "@vmnot_mm" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(any_logicalnot:VB + (match_operand:VB 1 "register_operand" "vr") + (not:VB + (match_operand:VB 2 "register_operand" "vr"))) + (match_operand 3 "p_reg_or_const_csr_operand" "rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmn.mm\t%0,%1,%2" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +;; vmmv.m vd,vs -> vmand.mm vd,vs,vs # Copy mask register +(define_insn "@vmmv_m" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(match_operand:VB 1 "register_operand" "vr") + (match_operand 2 "p_reg_or_const_csr_operand" "rK") + (match_operand 3 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmmv.m\t%0,%1" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +;; vmclr.m vd -> vmxor.mm vd,vd,vd # Clear mask register +(define_insn "@vmclr_m" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(vec_duplicate:VB (const_int 0)) + (match_operand 1 "p_reg_or_const_csr_operand" "rK") + (match_operand 2 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmclr.m\t%0" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +;; vmset.m vd -> vmxnor.mm vd,vd,vd # Set mask register +(define_insn "@vmset_m" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(vec_duplicate:VB (const_int 1)) + (match_operand 1 "p_reg_or_const_csr_operand" "rK") + (match_operand 2 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmset.m\t%0" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +;; vmnot.m vd,vs -> vmnand.mm vd,vs,vs # Invert bits +(define_insn "@vmnot_m" + [(set (match_operand:VB 0 "register_operand" "=vr") + (unspec:VB + [(not:VB + (match_operand:VB 1 "register_operand" "vr")) + (match_operand 2 "p_reg_or_const_csr_operand" "rK") + (match_operand 3 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmnot.m\t%0,%1" + [(set_attr "type" "vmask") + (set_attr "mode" "")]) + +;; Vector mask population count vpopc +(define_insn "@vcpop__m" + [(set (match_operand:X 0 "register_operand" "=r,r") + (unspec:X + [(unspec:VB + [(match_operand:VB 1 "vector_reg_or_const0_operand" "vm,J") + (match_operand:VB 2 "register_operand" "vr,vr") + ] UNSPEC_VCPOP) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI 
VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vcpop.m\t%0,%2,%1.t + vcpop.m\t%0,%2" + [(set_attr "type" "vcpop") + (set_attr "mode" "")]) + +;; vfirst find-first-set mask bit +(define_insn "@vfirst__m" + [(set (match_operand:X 0 "register_operand" "=r,r") + (unspec:X + [(unspec:VB + [(match_operand:VB 1 "vector_reg_or_const0_operand" "vm,J") + (match_operand:VB 2 "register_operand" "vr,vr")] UNSPEC_FIRST) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfirst.m\t%0,%2,%1.t + vfirst.m\t%0,%2" + [(set_attr "type" "vmsetbit") + (set_attr "mode" "")]) + +;; vmsbf.m set-before-first mask bit. +;; vmsif.m set-including-fisrt mask bit. +;; vmsof.m set-only-first mask bit. +(define_insn "@vm_m" + [(set (match_operand:VB 0 "register_operand" "=&vr,&vr,&vr") + (unspec:VB + [(unspec:VB + [(match_operand:VB 1 "vector_reg_or_const0_operand" "vm,vm,J") + (unspec:VB + [(match_operand:VB 3 "register_operand" "vr,vr,vr")] MASK_SET) + (match_operand:VB 2 "vector_reg_or_const0_operand" "0,J,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vm.m\t%0,%3,%1.t + vm.m\t%0,%3,%1.t + vm.m\t%0,%3" + [(set_attr "type" "vmsetbit") + (set_attr "mode" "")]) + +;; Vector Iota Instruction. +(define_insn "@viota_m" + [(set (match_operand:VI 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VI + [(match_operand: 3 "register_operand" "vr,vr,vr,vr")] UNSPEC_IOTA) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + viota.m\t%0,%3,%1.t + viota.m\t%0,%3,%1.t + viota.m\t%0,%3 + viota.m\t%0,%3" + [(set_attr "type" "viota") + (set_attr "mode" "")]) + +;; Vector Element Index Instructions. +(define_insn "@vid_v" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VI + [(match_operand 3 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 4 "const_int_operand")] UNSPEC_ID) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vid.v\t%0,%1.t + vid.v\t%0,%1.t + vid.v\t%0 + vid.v\t%0" + [(set_attr "type" "vid") + (set_attr "mode" "")]) + +;; ------------------------------------------------------------------------------- +;; ---- 16. Vector Permutation Instructions +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 16.1 Integer Scalar Move Instructions +;; - 16.2 Floating-Point Scalar Move Instructions +;; - 16.3 Vector slide Instructins +;; - 16.4 Vector Register Gather Instructions +;; - 16.5 Vector Compress Instructions +;; ------------------------------------------------------------------------------- + +;; Integer Scalar Move Instructions. 
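The expanders that follow read element 0 of a vector into a scalar register. On rv32 a 64-bit element cannot fit in one X register, so the V64BITI expander extracts it as two SImode halves, shifting the whole vector right by 32 to expose the high half. In scalar terms (a model only, not code from the patch):

    #include <stdint.h>

    /* lo is vmv.x.s of element 0; hi is vmv.x.s of element 0 after the
       vector has been shifted right by 32 bits, as the expander does.  */
    void
    split_elem0 (uint64_t elem0, uint32_t *lo, uint32_t *hi)
    {
      *lo = (uint32_t) elem0;
      *hi = (uint32_t) (elem0 >> 32);
    }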
+(define_insn "@vmv_x_s" + [(set (match_operand: 0 "register_operand" "=r") + (unspec: + [(vec_select: + (match_operand:VNOT64BITI 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmv.x.s\t%0,%1" + [(set_attr "type" "vmv_x_s") + (set_attr "mode" "")]) + +(define_expand "@vmv_x_s" + [(set (match_operand: 0 "register_operand") + (unspec: + [(vec_select: + (match_operand:V64BITI 1 "register_operand") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + { + if (!TARGET_64BIT) + { + rtx vector = gen_reg_rtx (mode); + rtx shift = gen_reg_rtx (Pmode); + shift = force_reg (Pmode, GEN_INT (32)); + + rtx lo = gen_lowpart (Pmode, operands[0]); + rtx hi = gen_highpart (Pmode, operands[0]); + emit_insn (gen_vlshr_vx (vector, + const0_rtx, const0_rtx, operands[1], + shift, GEN_INT(1), rvv_gen_policy ())); + emit_insn (gen_vmv_x_s_lo (lo, operands[1])); + emit_insn (gen_vmv_x_s_hi (hi, vector)); + DONE; + } + + emit_insn (gen_vmv_x_s_di_internal (operands[0], operands[1])); + DONE; + }) + +(define_insn "vmv_x_s_di_internal" + [(set (match_operand: 0 "register_operand" "=r") + (unspec: + [(vec_select: + (match_operand:V64BITI 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmv.x.s\t%0,%1" + [(set_attr "type" "vmv_x_s") + (set_attr "mode" "")]) + +(define_insn "vmv_x_s_lo" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI + [(vec_select:DI + (match_operand:V64BITI 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_LO))] + "TARGET_VECTOR" + "vmv.x.s\t%0,%1" + [(set_attr "type" "vmv_x_s") + (set_attr "mode" "")]) + +(define_insn "vmv_x_s_hi" + [(set (match_operand:SI 0 "register_operand" "=r") + (unspec:SI + [(vec_select:DI + (match_operand:V64BITI 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_HI))] + "TARGET_VECTOR" + "vmv.x.s\t%0,%1" + [(set_attr "type" "vmv_x_s") + (set_attr "mode" "")]) + +(define_insn "@vmv_s_x_internal" + [(set (match_operand:VI 0 "register_operand" "=vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(vec_duplicate:VI + (match_operand: 2 "reg_or_0_operand" "r,J,r,J")) + (match_operand:VI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (const_int 1)] UNSPEC_VMV_SX) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmv.s.x\t%0,%2 + vmv.s.x\t%0,zero + vmv.s.x\t%0,%2 + vmv.s.x\t%0,zero" + [(set_attr "type" "vmv_s_x") + (set_attr "mode" "")]) + +(define_insn "@vmv_s_x_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(vec_duplicate:V64BITI + (sign_extend: + (match_operand:SI 2 "reg_or_0_operand" "r,J,r,J"))) + (match_operand:V64BITI 1 "vector_reg_or_const0_operand" "0,0,J,J") + (const_int 1)] UNSPEC_VMV_SX) + (match_operand:SI 3 "csr_operand" "rK,rK,rK,rK") + (match_operand:SI 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vmv.s.x\t%0,%2 + vmv.s.x\t%0,zero + vmv.s.x\t%0,%2 + vmv.s.x\t%0,zero" + [(set_attr "type" "vmv_s_x") + (set_attr "mode" "")]) + +;; This pattern is used by auto-vectorization to +;; initiate a vector whose value of element 0 is +;; zero. We dont't want to use subreg to generate +;; transformation between floating-point and integer. 
+(define_insn "@vmv_s_x_internal" + [(set (match_operand:VF 0 "register_operand" "=vr") + (unspec:VF + [(unspec:VF + [(const_int 0) + (const_int 1)] UNSPEC_VMV_SX) + (match_operand 1 "p_reg_or_const_csr_operand" "rK") + (match_operand 2 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vmv.s.x\t%0,zero" + [(set_attr "type" "vmv_s_x") + (set_attr "mode" "")]) + +;; Floating-Point Scalar Move Instructions. +(define_insn "@vfmv_f_s" + [(set (match_operand: 0 "register_operand" "=f") + (unspec: + [(vec_select: + (match_operand:VF 1 "register_operand" "vr") + (parallel [(const_int 0)])) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vfmv.f.s\t%0,%1" + [(set_attr "type" "vfmv_f_s") + (set_attr "mode" "")]) + +(define_insn "@vfmv_s_f" + [(set (match_operand:VF 0 "register_operand" "=vr,vr") + (unspec:VF + [(unspec:VF + [(vec_duplicate:VF + (match_operand: 2 "register_operand" "f,f")) + (match_operand:VF 1 "vector_reg_or_const0_operand" "0,J") + (const_int 1)] UNSPEC_VMV_SX) + (match_operand 3 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 4 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vfmv.s.f\t%0,%2" + [(set_attr "type" "vfmv_s_f") + (set_attr "mode" "")]) + +;; Vector Slideup/Slidedown Instructions. +(define_insn "@vslide_vx" + [(set (match_operand:V 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec:V + [(unspec:V + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V + [(match_operand:V 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + (match_operand:V 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand 4 "p_reg_or_uimm5_operand" "r,K,r,K,r,K,r,K")] SLIDE_UP) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vslide.vx\t%0,%3,%4,%1.t + vslide.vi\t%0,%3,%4,%1.t + vslide.vx\t%0,%3,%4,%1.t + vslide.vi\t%0,%3,%4,%1.t + vslide.vx\t%0,%3,%4 + vslide.vi\t%0,%3,%4 + vslide.vx\t%0,%3,%4 + vslide.vi\t%0,%3,%4" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +(define_insn "@vslide_vx" + [(set (match_operand:V 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V + [(unspec:V + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V + [(match_operand:V 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J") + (match_operand:V 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand 4 "p_reg_or_uimm5_operand" "r,K,r,K,r,K,r,K")] SLIDE_DOWN) + (match_dup 2)] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vslide.vx\t%0,%3,%4,%1.t + vslide.vi\t%0,%3,%4,%1.t + vslide.vx\t%0,%3,%4,%1.t + vslide.vi\t%0,%3,%4,%1.t + vslide.vx\t%0,%3,%4 + vslide.vi\t%0,%3,%4 + vslide.vx\t%0,%3,%4 + vslide.vi\t%0,%3,%4" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +;; Vector Integer Slide1up/Slide1down Instructions. 
+(define_insn "@vslide1_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")] SLIDE1_UP) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +(define_insn "@vslide1_vx_internal" + [(set (match_operand:VI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:VI + [(unspec:VI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:VI + [(match_operand:VI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand: 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J")] SLIDE1_DOWN) + (match_operand:VI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +(define_insn "@vslide1_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V64BITI + [(match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (sign_extend: (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J"))] SLIDE1_UP) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +(define_insn "@vslide1_vx_32bit" + [(set (match_operand:V64BITI 0 "register_operand" "=vd,vd,vd,vd,vr,vr,vr,vr") + (unspec:V64BITI + [(unspec:V64BITI + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V64BITI + [(match_operand:V64BITI 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (sign_extend: (match_operand:SI 4 "reg_or_0_operand" "r,J,r,J,r,J,r,J"))] SLIDE1_DOWN) + (match_operand:V64BITI 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand:SI 5 "csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand:SI 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vslide1.vx\t%0,%3,%4,%1.t + 
vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4,%1.t + vslide1.vx\t%0,%3,zero,%1.t + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero + vslide1.vx\t%0,%3,%4 + vslide1.vx\t%0,%3,zero" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +;; Vector Floating-Point Slide1up/Slide1down Instructions. +(define_insn "@vfslide1_vf" + [(set (match_operand:VF 0 "register_operand" "=vd,vd,vr,vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VF + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (match_operand: 4 "register_operand" "f,f,f,f")] SLIDE1_DOWN) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfslide1.vf\t%0,%3,%4,%1.t + vfslide1.vf\t%0,%3,%4,%1.t + vfslide1.vf\t%0,%3,%4 + vfslide1.vf\t%0,%3,%4" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +(define_insn "@vfslide1_vf" + [(set (match_operand:VF 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:VF + [(unspec:VF + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:VF + [(match_operand:VF 3 "register_operand" "vr,vr,vr,vr") + (match_operand: 4 "register_operand" "f,f,f,f")] SLIDE1_UP) + (match_operand:VF 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vfslide1.vf\t%0,%3,%4,%1.t + vfslide1.vf\t%0,%3,%4,%1.t + vfslide1.vf\t%0,%3,%4 + vfslide1.vf\t%0,%3,%4" + [(set_attr "type" "vslide") + (set_attr "mode" "")]) + +;; Vector-Vector vrgater instruction. +(define_insn "@vrgather_vv" + [(set (match_operand:V 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:V + [(unspec:V + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:V + [(match_operand:V 3 "register_operand" "vr,vr,vr,vr") + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] UNSPEC_RGATHER) + (match_operand:V 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vrgather.vv\t%0,%3,%4,%1.t + vrgather.vv\t%0,%3,%4,%1.t + vrgather.vv\t%0,%3,%4 + vrgather.vv\t%0,%3,%4" + [(set_attr "type" "vgather") + (set_attr "mode" "")]) + +;; Vector-Vector vrgaterei16 instruction. +(define_insn "@vrgatherei16_vv" + [(set (match_operand:V16 0 "register_operand" "=&vr,&vr,&vr,&vr") + (unspec:V16 + [(unspec:V16 + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,J,J") + (unspec:V16 + [(match_operand:V16 3 "register_operand" "vr,vr,vr,vr") + (match_operand: 4 "register_operand" "vr,vr,vr,vr")] UNSPEC_RGATHEREI16) + (match_operand:V16 2 "vector_reg_or_const0_operand" "0,J,0,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vrgatherei16.vv\t%0,%3,%4,%1.t + vrgatherei16.vv\t%0,%3,%4,%1.t + vrgatherei16.vv\t%0,%3,%4 + vrgatherei16.vv\t%0,%3,%4" + [(set_attr "type" "vgather") + (set_attr "mode" "")]) + +;; Vector-Scalar vrgater instruction. 
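With a scalar index, vrgather.vx reads the same source element for every destination element, i.e. it broadcasts one selected lane (lanes whose index is out of range read as zero). Roughly, assuming k stays in bounds (illustrative names):

    void
    broadcast_lane (int *restrict out, const int *restrict in,
                    unsigned k, int n)
    {
      for (int i = 0; i < n; i++)
        out[i] = in[k];   /* every destination element takes source lane k */
    }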
+(define_insn "@vrgather_vx" + [(set (match_operand:V 0 "register_operand" "=&vr,&vr,&vr,&vr,&vr,&vr,&vr,&vr") + (unspec:V + [(unspec:V + [(match_operand: 1 "vector_reg_or_const0_operand" "vm,vm,vm,vm,J,J,J,J") + (unspec:V + [(match_operand:V 3 "register_operand" "vr,vr,vr,vr,vr,vr,vr,vr") + (match_operand 4 "p_reg_or_uimm5_operand" "r,K,r,K,r,K,r,K")] UNSPEC_RGATHER) + (match_operand:V 2 "vector_reg_or_const0_operand" "0,0,J,J,0,0,J,J")] UNSPEC_SELECT) + (match_operand 5 "p_reg_or_const_csr_operand" "rK,rK,rK,rK,rK,rK,rK,rK") + (match_operand 6 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "@ + vrgather.vx\t%0,%3,%4,%1.t + vrgather.vi\t%0,%3,%4,%1.t + vrgather.vx\t%0,%3,%4,%1.t + vrgather.vi\t%0,%3,%4,%1.t + vrgather.vx\t%0,%3,%4 + vrgather.vi\t%0,%3,%4 + vrgather.vx\t%0,%3,%4 + vrgather.vi\t%0,%3,%4" + [(set_attr "type" "vgather") + (set_attr "mode" "")]) + +;; Vector Compress Instruction. +(define_insn "@vcompress_vm" + [(set (match_operand:V 0 "register_operand" "=&vr,&vr") + (unspec:V + [(unspec:V + [(match_operand: 1 "register_operand" "vm,vm") + (match_operand:V 2 "vector_reg_or_const0_operand" "0,J") + (match_operand:V 3 "register_operand" "vr,vr")] UNSPEC_COMPRESS) + (match_operand 4 "p_reg_or_const_csr_operand" "rK,rK") + (match_operand 5 "const_int_operand") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_RVV))] + "TARGET_VECTOR" + "vcompress.vm\t%0,%3,%1" + [(set_attr "type" "vcompress") + (set_attr "mode" "")]) \ No newline at end of file