From patchwork Fri Mar 3 04:52:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison
X-Patchwork-Id: 65949
Date: Thu, 2 Mar 2023 23:52:42 -0500
From: Michael Collison
Subject: [PATCH 01/07] RISC-V: Add auto-vectorization support
To: gcc-patches
List-Id: Gcc-patches mailing list

This patch adds foundational support in the form of:

1. New predicates
2. New function prototypes
3. Exporting emit_vlmax_vsetvl to global scope
4. Add a new command line option -mriscv-vector-lmul

gcc/ChangeLog:

    * config/riscv/riscv-protos.h (riscv_classify_vlmul_field):
    New external declaration.
    (riscv_vector_preferred_simd_mode): Ditto.
    (riscv_tuple_mode_p): Ditto.
    (riscv_vector_mask_mode_p): Ditto.
    (riscv_classify_nf): Ditto.
    (riscv_vlmul_regsize): Ditto.
    (riscv_vector_preferred_simd_mode): Ditto.
    (riscv_vector_get_mask_mode): Ditto.
    (emit_vlmax_vsetvl): Ditto.
    (get_mask_policy_no_pred): Ditto.
    (get_tail_policy_no_pred): Ditto.
    * config/riscv/riscv-opts.h (riscv_vector_bits_enum): New enum.
    (riscv_vector_lmul_enum): Ditto.
    (vlmul_field_enum): Ditto.
    * config/riscv/riscv-v.cc (emit_vlmax_vsetvl):
    Remove static scope.
    * config/riscv/riscv.opt (riscv_vector_lmul):
    New option -mriscv-vector-lmul.
    * config/riscv/predicates.md (p_reg_or_const_csr_operand):
    New predicate.
    (vector_reg_or_const_dup_operand): Ditto.
---
 gcc/config/riscv/predicates.md  | 13 +++++++++++
 gcc/config/riscv/riscv-opts.h   | 40 +++++++++++++++++++++++++++++++++
 gcc/config/riscv/riscv-protos.h | 16 +++++++++++++
 gcc/config/riscv/riscv-v.cc     |  2 +-
 gcc/config/riscv/riscv.opt      | 20 +++++++++++++++++
 5 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 7bc7c0b4f4d..31517ae4606 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -264,6 +264,14 @@
 })
 
 ;; Predicates for the V extension.
+(define_special_predicate "p_reg_or_const_csr_operand"
+  (match_code "reg, subreg, const_int")
+{
+  if (CONST_INT_P (op))
+    return satisfies_constraint_K (op);
+  return GET_MODE (op) == Pmode;
+})
+
 (define_special_predicate "vector_length_operand"
   (ior (match_operand 0 "pmode_register_operand")
        (match_operand 0 "const_csr_operand")))
@@ -287,6 +295,11 @@
   (ior (match_operand 0 "register_operand")
        (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
 
+(define_predicate "vector_reg_or_const_dup_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_test "const_vec_duplicate_p (op)
+          && !CONST_POLY_INT_P (CONST_VECTOR_ELT (op, 0))")))
+
 (define_predicate "vector_mask_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "vector_all_trues_mask_operand")))
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index ff398c0a2ae..2057a14e153 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -67,6 +67,46 @@ enum stack_protector_guard {
   SSP_GLOBAL            /* global canary */
 };
 
+/* RVV vector register sizes.  */
+enum riscv_vector_bits_enum
+{
+  RVV_SCALABLE,
+  RVV_NOT_IMPLEMENTED = RVV_SCALABLE,
+  RVV_64 = 64,
+  RVV_128 = 128,
+  RVV_256 = 256,
+  RVV_512 = 512,
+  RVV_1024 = 1024,
+  RVV_2048 = 2048,
+  RVV_4096 = 4096,
+  RVV_8192 = 8192,
+  RVV_16384 = 16384,
+  RVV_32768 = 32768,
+  RVV_65536 = 65536
+};
+
+/* vectorization factor.  */
+enum riscv_vector_lmul_enum
+{
+  RVV_LMUL1 = 1,
+  RVV_LMUL2 = 2,
+  RVV_LMUL4 = 4,
+  RVV_LMUL8 = 8
+};
+
+enum vlmul_field_enum
+{
+  VLMUL_FIELD_000, /* LMUL = 1 */
+  VLMUL_FIELD_001, /* LMUL = 2 */
+  VLMUL_FIELD_010, /* LMUL = 4 */
+  VLMUL_FIELD_011, /* LMUL = 8 */
+  VLMUL_FIELD_100, /* RESERVED */
+  VLMUL_FIELD_101, /* LMUL = 1/8 */
+  VLMUL_FIELD_110, /* LMUL = 1/4 */
+  VLMUL_FIELD_111, /* LMUL = 1/2 */
+  MAX_VLMUL_FIELD
+};
+
 #define MASK_ZICSR    (1 << 0)
 #define MASK_ZIFENCEI (1 << 1)
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 37c634eca1d..70c8dc4ce69 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -200,4 +200,19 @@ const unsigned int RISCV_BUILTIN_SHIFT = 1;
 /* Mask that selects the riscv_builtin_class part of a function code.  */
 const unsigned int RISCV_BUILTIN_CLASS = (1 << RISCV_BUILTIN_SHIFT) - 1;
 
+/* Routines implemented in riscv-v.cc.  */
+
+namespace riscv_vector {
+extern unsigned int riscv_classify_vlmul_field (enum machine_mode m);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf);
+extern bool riscv_tuple_mode_p (machine_mode);
+extern bool riscv_vector_mask_mode_p (machine_mode);
+extern int riscv_classify_nf (machine_mode);
+extern int riscv_vlmul_regsize (machine_mode);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf);
+extern opt_machine_mode riscv_vector_get_mask_mode (machine_mode mode);
+extern rtx emit_vlmax_vsetvl (machine_mode vmode);
+extern rtx get_mask_policy_no_pred ();
+extern rtx get_tail_policy_no_pred ();
+}
 #endif /* ! GCC_RISCV_PROTOS_H */
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 59c25c65cd5..58007cc16eb 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -108,7 +108,7 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
       && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
-static rtx
+rtx
 emit_vlmax_vsetvl (machine_mode vmode)
 {
   rtx vl = gen_reg_rtx (Pmode);
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 95535235354..27005fb0f4a 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -70,6 +70,26 @@ Enum(abi_type) String(lp64f) Value(ABI_LP64F)
 EnumValue
 Enum(abi_type) String(lp64d) Value(ABI_LP64D)
 
+Enum
+Name(riscv_vector_lmul) Type(enum riscv_vector_lmul_enum)
+The possible vectorization factor:
+
+EnumValue
+Enum(riscv_vector_lmul) String(1) Value(RVV_LMUL1)
+
+EnumValue
+Enum(riscv_vector_lmul) String(2) Value(RVV_LMUL2)
+
+EnumValue
+Enum(riscv_vector_lmul) String(4) Value(RVV_LMUL4)
+
+EnumValue
+Enum(riscv_vector_lmul) String(8) Value(RVV_LMUL8)
+
+mriscv-vector-lmul=
+Target RejectNegative Joined Enum(riscv_vector_lmul) Var(riscv_vector_lmul) Init(RVV_LMUL1)
+-mriscv-vector-lmul=    Set the vf using lmul in auto-vectorization.
+
 mfdiv
 Target Mask(FDIV)
 Use hardware floating-point divide and square root instructions.

From patchwork Fri Mar 3 04:52:55 2023
X-Patchwork-Submitter: Michael Collison
X-Patchwork-Id: 65950
Date: Thu, 2 Mar 2023 23:52:55 -0500
From: Michael Collison
Subject: [PATCH 02/07] RISC-V: Add auto-vectorization support
To: gcc-patches

This patch adds foundational support by making two functions that handle
predication policies globally visible.

gcc/ChangeLog:

    * config/riscv/riscv-vector-builtins.cc (get_tail_policy_for_pred):
    Remove static declaration to make externally visible.
    (get_mask_policy_for_pred): Ditto.
    * config/riscv/riscv-vector-builtins.h (get_tail_policy_for_pred):
    New external declaration.
    (get_mask_policy_for_pred): Ditto.
---
 gcc/config/riscv/riscv-vector-builtins.cc | 4 ++--
 gcc/config/riscv/riscv-vector-builtins.h  | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 2e92ece3b64..90fc73a5bcf 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -1850,7 +1850,7 @@ use_real_merge_p (enum predication_type_index pred)
 
 /* Get TAIL policy for predication. If predication indicates TU, return the TU.
    Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_tail_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tu || pred == PRED_TYPE_tum || pred == PRED_TYPE_tumu)
@@ -1860,7 +1860,7 @@ get_tail_policy_for_pred (enum predication_type_index pred)
 
 /* Get MASK policy for predication. If predication indicates MU, return the MU.
    Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
 get_mask_policy_for_pred (enum predication_type_index pred)
 {
   if (pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu)
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index ede08c6a480..135e2463b1e 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -433,6 +433,8 @@ extern const char *const operand_suffixes[NUM_OP_TYPES];
 extern const rvv_builtin_suffixes type_suffixes[NUM_VECTOR_TYPES + 1];
 extern const char *const predication_suffixes[NUM_PRED_TYPES];
 extern rvv_builtin_types_t builtin_types[NUM_VECTOR_TYPES + 1];
+extern rtx get_tail_policy_for_pred (enum predication_type_index pred);
+extern rtx get_mask_policy_for_pred (enum predication_type_index pred);
 
 inline bool
 function_instance::operator!= (const function_instance &other) const

From patchwork Fri Mar 3 04:53:03 2023
X-Patchwork-Submitter: Michael Collison
X-Patchwork-Id: 65951
Date: Thu, 2 Mar 2023 23:53:03 -0500
From: Michael Collison
Subject: [PATCH 03/07] RISC-V: Add auto-vectorization support
To: gcc-patches

This patch adds two new files to support the vector cost model and modifies
the Makefile fragment to build the cost model C++ file. Due to its large size,
this patch is provided as an attachment.

gcc/ChangeLog:

    * config.gcc (riscv-vector-cost.o): New object file to build.
    * config/riscv/riscv-vector-cost.cc: New file for riscv vector cost
    model.
    * config/riscv/riscv-vector-cost.h: New header file for riscv vector
    cost model.
    * config/riscv/t-riscv: Add make rule for riscv-vector-cost.o.
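
For readers skimming the attachment, the general shape of the cost lookup can be sketched in isolation. This is only an illustration under stated assumptions, not the code in the patch: `toy_cost_table`, `insn_class`, and `costs_n_insns` are hypothetical names, the deltas are made-up numbers, and the only detail taken from GCC is that COSTS_N_INSNS (N) scales to N * 4. It shows the "baseline of one instruction plus a per-class delta when optimizing for speed" pattern that most cases in get_cost appear to follow.

```cpp
#include <cassert>

// Same scaling as GCC's COSTS_N_INSNS macro: one instruction is 4 units.
constexpr int costs_n_insns(int n) { return n * 4; }

// Hypothetical instruction classes; the real table switches on RTX codes.
enum class insn_class { alu, load, store, mult };

// Hypothetical per-class deltas.  The numbers a caller supplies are
// illustrative, not measured RISC-V costs.
struct toy_cost_table {
  int alu_delta;
  int load_delta;
  int store_delta;
  int mult_delta;

  // Start from the baseline of one instruction; add the class delta
  // only when optimizing for speed.
  int get_cost(insn_class c, bool speed) const {
    int cost = costs_n_insns(1);
    if (!speed)
      return cost;
    switch (c) {
    case insn_class::alu:
      return cost + alu_delta;
    case insn_class::load:
      return cost + load_delta;
    case insn_class::store:
      return cost + store_delta;
    case insn_class::mult:
      return cost + mult_delta;
    }
    return cost;
  }
};
```

With `toy_cost_table t{0, 8, 8, 4}`, a load costed for speed comes out as 12 units, while the same query with `speed` false stays at the 4-unit baseline.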
From eb995818cd5f77f85e8df93b690b00ce1fd1aa35 Mon Sep 17 00:00:00 2001 From: Michael Collison Date: Thu, 2 Mar 2023 12:27:36 -0500 Subject: [PATCH] Autovectorization patch set 2 --- gcc/config.gcc | 2 +- gcc/config/riscv/riscv-vector-cost.cc | 620 ++++++++++++++++++++++++++ gcc/config/riscv/riscv-vector-cost.h | 400 +++++++++++++++++ gcc/config/riscv/t-riscv | 5 + 4 files changed, 1026 insertions(+), 1 deletion(-) create mode 100644 gcc/config/riscv/riscv-vector-cost.cc create mode 100644 gcc/config/riscv/riscv-vector-cost.h diff --git a/gcc/config.gcc b/gcc/config.gcc index c070e6ecd2e..a4017777187 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -530,7 +530,7 @@ pru-*-*) riscv*) cpu_type=riscv extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o" - extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o" + extra_objs="${extra_objs} riscv-vector-cost.o riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o" d_target_objs="riscv-d.o" extra_headers="riscv_vector.h" target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.cc" diff --git a/gcc/config/riscv/riscv-vector-cost.cc b/gcc/config/riscv/riscv-vector-cost.cc new file mode 100644 index 00000000000..5a33b20843a --- /dev/null +++ b/gcc/config/riscv/riscv-vector-cost.cc @@ -0,0 +1,620 @@ +/* Cost model implementation for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2023 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. 
+ + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + . */ + +#define INCLUDE_STRING +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "backend.h" +#include "rtl.h" +#include "regs.h" +#include "insn-config.h" +#include "insn-attr.h" +#include "recog.h" +#include "rtlanal.h" +#include "output.h" +#include "alias.h" +#include "tree.h" +#include "stringpool.h" +#include "attribs.h" +#include "varasm.h" +#include "stor-layout.h" +#include "calls.h" +#include "function.h" +#include "explow.h" +#include "memmodel.h" +#include "emit-rtl.h" +#include "reload.h" +#include "tm_p.h" +#include "target.h" +#include "basic-block.h" +#include "expr.h" +#include "optabs.h" +#include "bitmap.h" +#include "df.h" +#include "diagnostic.h" +#include "builtins.h" +#include "predict.h" +#include "tree-pass.h" +#include "opts.h" +#include "langhooks.h" +#include "rtl-iter.h" +#include "gimple.h" +#include "cfghooks.h" +#include "cfgloop.h" +#include "fold-const.h" +#include "gimple-iterator.h" +#include "tree-vectorizer.h" +#include "tree-ssa-loop-niter.h" +#include "riscv-vector-builtins.h" + +/* This file should be included last. */ +#include "riscv-vector-cost.h" +#include "target-def.h" + +bool vector_insn_cost_table::get_cost(rtx x, machine_mode mode, int *cost, + bool speed) const { + rtx op0, op1, op2; + enum rtx_code code = GET_CODE(x); + scalar_int_mode int_mode; + + /* By default, assume that everything has equivalent cost to the + cheapest instruction. Any additional costs are applied as a delta + above this default. */ + *cost = COSTS_N_INSNS(1); + + switch (code) { + case SET: + /* The cost depends entirely on the operands to SET. 
*/ + *cost = 0; + op0 = SET_DEST(x); + op1 = SET_SRC(x); + + switch (GET_CODE(op0)) { + case MEM: + if (speed) { + *cost += store->cost(x, mode); + } + + //*cost += rtx_cost(op1, mode, SET, 1, speed); + return true; + + case SUBREG: + if (!REG_P(SUBREG_REG(op0))) + *cost += rtx_cost(SUBREG_REG(op0), VOIDmode, SET, 0, speed); + + /* Fall through. */ + case REG: + /* The cost is one per vector-register copied. */ + if (VECTOR_MODE_P(GET_MODE(op0))) { + *cost = mov->cost(x, mode); + } else + /* Cost is just the cost of the RHS of the set. */ + *cost += rtx_cost(op1, mode, SET, 1, speed); + return true; + + case ZERO_EXTRACT: + case SIGN_EXTRACT: + /* Bit-field insertion. Strip any redundant widening of + the RHS to meet the width of the target. */ + if (SUBREG_P(op1)) + op1 = SUBREG_REG(op1); + if ((GET_CODE(op1) == ZERO_EXTEND || GET_CODE(op1) == SIGN_EXTEND) && + CONST_INT_P(XEXP(op0, 1)) && + is_a(GET_MODE(XEXP(op1, 0)), &int_mode) && + GET_MODE_BITSIZE(int_mode) >= INTVAL(XEXP(op0, 1))) + op1 = XEXP(op1, 0); + + if (CONST_INT_P(op1)) { + /* MOV immediate is assumed to always be cheap. */ + *cost = COSTS_N_INSNS(1); + } else { + /* BFM. */ + if (speed) + *cost += alu->cost(x, mode); + *cost += rtx_cost(op1, VOIDmode, (enum rtx_code)code, 1, speed); + } + + return true; + + default: + /* We can't make sense of this, assume default cost. */ + *cost = COSTS_N_INSNS(1); + return false; + } + return false; + + case MEM: + if (speed) { + *cost += load->cost(x, mode); + } + + return true; + + case NEG: + op0 = XEXP(x, 0); + + if (speed) { + /* FNEG. */ + *cost += alu->cost(x, mode); + } + return false; + + if (GET_MODE_CLASS(mode) == MODE_INT) { + if (GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMPARE || + GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMM_COMPARE) { + /* CSETM. */ + *cost += rtx_cost(XEXP(op0, 0), VOIDmode, NEG, 0, speed); + return true; + } + + /* Cost this as SUB wzr, X. 
*/ + op0 = CONST0_RTX(mode); + op1 = XEXP(x, 0); + goto cost_minus; + } + return false; + + case COMPARE: + op0 = XEXP(x, 0); + op1 = XEXP(x, 1); + + if (op1 == const0_rtx && GET_CODE(op0) == AND) { + x = op0; + mode = GET_MODE(op0); + goto cost_logic; + } + + if (GET_MODE_CLASS(GET_MODE(op0)) == MODE_INT) { + /* TODO: A write to the CC flags possibly costs extra, this + needs encoding in the cost tables. */ + + mode = GET_MODE(op0); + /* ANDS. */ + if (GET_CODE(op0) == AND) { + x = op0; + goto cost_logic; + } + + if (GET_CODE(op0) == PLUS) { + /* ADDS (and CMN alias). */ + x = op0; + goto cost_plus; + } + + if (GET_CODE(op0) == MINUS) { + /* SUBS. */ + x = op0; + goto cost_minus; + } + + if (GET_CODE(op0) == ZERO_EXTRACT && op1 == const0_rtx && + CONST_INT_P(XEXP(op0, 1)) && CONST_INT_P(XEXP(op0, 2))) { + /* COMPARE of ZERO_EXTRACT form of TST-immediate. + Handle it here directly rather than going to cost_logic + since we know the immediate generated for the TST is valid + so we can avoid creating an intermediate rtx for it only + for costing purposes. */ + if (speed) + *cost += alu->cost(x, mode); + + *cost += rtx_cost(XEXP(op0, 0), GET_MODE(op0), ZERO_EXTRACT, 0, speed); + return true; + } + + if (GET_CODE(op1) == NEG) { + /* CMN. */ + if (speed) + *cost += alu->cost(x, mode); + + *cost += rtx_cost(op0, mode, COMPARE, 0, speed); + *cost += rtx_cost(XEXP(op1, 0), mode, NEG, 1, speed); + return true; + } + + /* CMP. + + Compare can freely swap the order of operands, and + canonicalization puts the more complex operation first. + But the integer MINUS logic expects the shift/extend + operation in op1. */ + if (!(REG_P(op0) || (SUBREG_P(op0) && REG_P(SUBREG_REG(op0))))) { + op0 = XEXP(x, 1); + op1 = XEXP(x, 0); + } + goto cost_minus; + } + + if (VECTOR_MODE_P(mode)) { + /* Vector compare. 
*/ + if (speed) + *cost += alu->cost(x, mode); + + return false; + } + return false; + + case MINUS: { + op0 = XEXP(x, 0); + op1 = XEXP(x, 1); + + cost_minus: + *cost += rtx_cost(op0, mode, MINUS, 0, speed); + + return true; + } + + case PLUS: { + op0 = XEXP(x, 0); + op1 = XEXP(x, 1); + + cost_plus: + if (GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMPARE || + GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMM_COMPARE) { + /* CSINC. */ + *cost += rtx_cost(XEXP(op0, 0), mode, PLUS, 0, speed); + *cost += rtx_cost(op1, mode, PLUS, 1, speed); + return true; + } + + *cost += rtx_cost(op1, mode, PLUS, 1, speed); + + return true; + } + + case BSWAP: + *cost = COSTS_N_INSNS(1); + + if (speed) { + *cost += alu->cost(x, mode); + } + return false; + + case IOR: + *cost = COSTS_N_INSNS(1); + + if (speed) { + *cost += alu->cost(x, mode); + } + return true; + + case XOR: + case AND: + cost_logic: + if (speed) + *cost += alu->cost(x, mode); + return true; + + case NOT: + *cost += alu->cost(x, mode); + return false; + + case ZERO_EXTEND: + + op0 = XEXP(x, 0); + /* If a value is written in SI mode, then zero extended to DI + mode, the operation will in general be free as a write to + a 'w' register implicitly zeroes the upper bits of an 'x' + register. However, if this is + + (set (reg) (zero_extend (reg))) + + we must cost the explicit register move. */ + if (mode == DImode && GET_MODE(op0) == SImode) { + int op_cost = rtx_cost(op0, VOIDmode, ZERO_EXTEND, 0, speed); + + /* If OP_COST is non-zero, then the cost of the zero extend + is effectively the cost of the inner operation. Otherwise + we have a MOV instruction and we take the cost from the MOV + itself. This is true independently of whether we are + optimizing for space or time. */ + if (op_cost) + *cost = op_cost; + + return true; + } else if (MEM_P(op0)) { + /* All loads can zero extend to any size for free. */ + *cost = rtx_cost(op0, VOIDmode, ZERO_EXTEND, 0, speed); + return true; + } + + if (speed) { + /* UMOV. 
*/ + *cost += alu->cost(x, mode); + } + return false; + + case SIGN_EXTEND: + if (MEM_P(XEXP(x, 0))) { + if (speed) { + *cost += load->cost(x, mode); + } + return true; + } + + if (speed) { + *cost += alu->cost(x, mode); + } + return false; + + case ASHIFT: + op0 = XEXP(x, 0); + op1 = XEXP(x, 1); + + if (CONST_INT_P(op1)) { + if (speed) { + *cost += alu->cost(x, mode); + } + + /* We can incorporate zero/sign extend for free. */ + if (GET_CODE(op0) == ZERO_EXTEND || GET_CODE(op0) == SIGN_EXTEND) + op0 = XEXP(op0, 0); + + *cost += rtx_cost(op0, VOIDmode, ASHIFT, 0, speed); + return true; + } else { + if (speed) + /* Vector shift (register). */ + *cost += alu->cost(x, mode); + return false; /* All arguments need to be in registers. */ + } + + case ROTATE: + case ROTATERT: + case LSHIFTRT: + case ASHIFTRT: + op0 = XEXP(x, 0); + op1 = XEXP(x, 1); + + if (CONST_INT_P(op1)) { + /* ASR (immediate) and friends. */ + if (speed) { + *cost += alu->cost(x, mode); + } + + *cost += rtx_cost(op0, mode, (enum rtx_code)code, 0, speed); + return true; + } else { + if (VECTOR_MODE_P(mode)) { + if (speed) + /* Vector shift (register). */ + *cost += alu->cost(x, mode); + } + return false; /* All arguments need to be in registers. */ + } + + case SYMBOL_REF: + return true; + + case HIGH: + case LO_SUM: + /* ADRP/ADD (immediate). */ + if (speed) + *cost += alu->cost(x, mode); + return true; + + case ZERO_EXTRACT: + case SIGN_EXTRACT: + /* UBFX/SBFX. */ + if (speed) { + *cost += alu->cost(x, mode); + } + + /* We can trust that the immediates used will be correct (there + are no by-register forms), so we need only cost op0. */ + *cost += rtx_cost(XEXP(x, 0), VOIDmode, (enum rtx_code)code, 0, speed); + return true; + + case MULT: + *cost += mult->cost(x, mode); + return true; + + case MOD: + case UMOD: + if (speed) { + /* Slighly prefer UMOD over SMOD. */ + *cost += alu->cost(x, mode); + } + return false; /* All arguments need to be in registers. 
*/ + + case DIV: + case UDIV: + case SQRT: + if (speed) { + *cost += alu->cost(x, mode); + } + return false; /* All arguments need to be in registers. */ + + case IF_THEN_ELSE: + if (speed) { + *cost += if_then_else->cost(x, mode); + } + return true; + + case EQ: + case NE: + case GT: + case GTU: + case LT: + case LTU: + case GE: + case GEU: + case LE: + case LEU: + + return false; /* All arguments must be in registers. */ + + case FMA: + op0 = XEXP(x, 0); + op1 = XEXP(x, 1); + op2 = XEXP(x, 2); + + if (speed) { + *cost += alu->cost(x, mode); + } + + /* FMSUB, FNMADD, and FNMSUB are free. */ + if (GET_CODE(op0) == NEG) + op0 = XEXP(op0, 0); + + if (GET_CODE(op2) == NEG) + op2 = XEXP(op2, 0); + + /* aarch64_fnma4_elt_to_64v2df has the NEG as operand 1, + and the by-element operand as operand 0. */ + if (GET_CODE(op1) == NEG) + op1 = XEXP(op1, 0); + + /* Catch vector-by-element operations. The by-element operand can + either be (vec_duplicate (vec_select (x))) or just + (vec_select (x)), depending on whether we are multiplying by + a vector or a scalar. + + Canonicalization is not very good in these cases, FMA4 will put the + by-element operand as operand 0, FNMA4 will have it as operand 1. */ + if (GET_CODE(op0) == VEC_DUPLICATE) + op0 = XEXP(op0, 0); + else if (GET_CODE(op1) == VEC_DUPLICATE) + op1 = XEXP(op1, 0); + + if (GET_CODE(op0) == VEC_SELECT) + op0 = XEXP(op0, 0); + else if (GET_CODE(op1) == VEC_SELECT) + op1 = XEXP(op1, 0); + + /* If the remaining parameters are not registers, + get the cost to put them into registers. */ + *cost += rtx_cost(op0, mode, FMA, 0, speed); + *cost += rtx_cost(op1, mode, FMA, 1, speed); + *cost += rtx_cost(op2, mode, FMA, 2, speed); + return true; + + case FLOAT: + case UNSIGNED_FLOAT: + return false; + + case FLOAT_EXTEND: + if (speed) { + /* Vector extend. */ + *cost += alu->cost(x, mode); + } + return false; + + case FLOAT_TRUNCATE: + if (speed) { + /* Vector conversion. 
*/ + *cost += alu->cost(x, mode); + } + return false; + + case FIX: + case UNSIGNED_FIX: + x = XEXP(x, 0); + if (speed) { + *cost += alu->cost(x, mode); + } + + *cost += rtx_cost(x, VOIDmode, (enum rtx_code)code, 0, speed); + return true; + + case ABS: + /* ABS (vector). */ + if (speed) + *cost += alu->cost(x, mode); + return false; + + case SMAX: + case SMIN: + if (speed) { + *cost += alu->cost(x, mode); + } + return false; + + case UNSPEC: + break; + + case TRUNCATE: + break; + case CONST_VECTOR: { + *cost = mov->cost(x, mode); + break; + } + case VEC_CONCAT: + /* depending on the operation, either DUP or INS. + For now, keep default costing. */ + break; + case VEC_DUPLICATE: + /* Load using a DUP. */ + *cost = dup->cost(x, mode); + return false; + case VEC_SELECT: { + rtx op0 = XEXP(x, 0); + *cost = rtx_cost(op0, GET_MODE(op0), VEC_SELECT, 0, speed); + + /* cost subreg of 0 as free, otherwise as DUP */ + rtx op1 = XEXP(x, 1); + if (vec_series_lowpart_p(mode, GET_MODE(op1), op1)) + ; + else if (vec_series_highpart_p(mode, GET_MODE(op1), op1)) + *cost = dup->cost(x, mode); + else + *cost = extract->cost(x, mode); + return true; + } + default: + break; + } + + if (dump_file) + fprintf(dump_file, "\nFailed to cost RTX. 
Assuming default cost.\n"); + + return true; +} + +extern int riscv_builtin_vectorization_cost (enum vect_cost_for_stmt, tree, int); + +riscv_vector_costs::riscv_vector_costs(vec_info *vinfo, bool costing_for_scalar) + : vector_costs(vinfo, costing_for_scalar) {} + +unsigned riscv_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, + stmt_vec_info stmt_info, slp_tree, + tree vectype, int misalign, + vect_cost_model_location where) { + int stmt_cost + = riscv_builtin_vectorization_cost (kind, vectype, misalign); + return record_stmt_cost(stmt_info, where, count * stmt_cost); +} + +void riscv_vector_costs::finish_cost(const vector_costs *uncast_scalar_costs) { + auto *scalar_costs = + static_cast<const riscv_vector_costs *>(uncast_scalar_costs); + loop_vec_info loop_vinfo = dyn_cast<loop_vec_info>(m_vinfo); + if (loop_vinfo) + m_costs[vect_body] = 1; + vector_costs::finish_cost(scalar_costs); +} + +bool riscv_vector_costs::better_main_loop_than_p( + const vector_costs *uncast_other) const { + auto other = static_cast<const riscv_vector_costs *>(uncast_other); + + return vector_costs::better_main_loop_than_p(other); +} diff --git a/gcc/config/riscv/riscv-vector-cost.h b/gcc/config/riscv/riscv-vector-cost.h new file mode 100644 index 00000000000..ef398915a18 --- /dev/null +++ b/gcc/config/riscv/riscv-vector-cost.h @@ -0,0 +1,400 @@ +/* Cost model definitions for RISC-V 'V' Extension for GNU compiler. + Copyright (C) 2022-2023 Free Software Foundation, Inc. + Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. 
+ +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#ifndef GCC_RISCV_VECTOR_COST_H +#define GCC_RISCV_VECTOR_COST_H + +enum vector_tune_type { + VECTOR_TUNE_GENERIC, +}; + +struct vector_insn_scale_table { + const int load; + const int store; + const int alu; + const int mult; + const int mov; + const int dup; + const int extract; + const int if_then_else; +}; + +struct vector_stmt_scale_table { + const int scalar_int_stmt_cost; /* Cost of any int scalar operation, + excluding load and store. */ + const int scalar_fp_stmt_cost; /* Cost of any fp scalar operation, + excluding load and store. */ + const int scalar_load_cost; /* Cost of scalar load. */ + const int scalar_store_cost; /* Cost of scalar store. */ + const int vec_int_stmt_cost; /* Cost of any int vector operation, + excluding load, store, permute, + vector-to-scalar and + scalar-to-vector operation. */ + const int vec_fp_stmt_cost; /* Cost of any fp vector operation, + excluding load, store, permute, + vector-to-scalar and + scalar-to-vector operation. */ + const int vec_permute_cost; /* Cost of permute operation. */ + const int vec_to_scalar_cost; /* Cost of vec-to-scalar operation. */ + const int scalar_to_vec_cost; /* Cost of scalar-to-vector + operation. */ + const int vec_align_load_cost; /* Cost of aligned vector load. */ + const int vec_unalign_load_cost; /* Cost of unaligned vector load. */ + const int vec_unalign_store_cost; /* Cost of unaligned vector store. */ + const int vec_store_cost; /* Cost of vector store. */ + const int cond_taken_branch_cost; /* Cost of taken branch. */ + const int cond_not_taken_branch_cost; /* Cost of not taken branch. */ +}; + +/* Information about vector code that we're in the process of costing. 
*/ +class riscv_vector_costs : public vector_costs { +public: + riscv_vector_costs(vec_info *, bool); + + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, + stmt_vec_info stmt_info, slp_tree, tree vectype, + int misalign, + vect_cost_model_location where) override; + void finish_cost(const vector_costs *) override; + bool better_main_loop_than_p(const vector_costs *other) const override; +}; + +template <typename T> class vector_insn_cost { +public: + vector_insn_cost(const T *_scale_table) : m_scale_table(_scale_table) {} + ~vector_insn_cost() {} + + virtual int scale(RTX_CODE) const { return 1; } + + virtual unsigned cost(rtx x, machine_mode mode) const { + return riscv_vector::riscv_classify_nf(mode) * riscv_vector::riscv_vlmul_regsize(mode) * + scale(x == NULL_RTX ? UNKNOWN : GET_CODE(x)); + } + +protected: + const T *m_scale_table; +}; + +template <typename T> class vector_cost_table { +public: + vector_cost_table(const T *) {} + ~vector_cost_table() {} + + virtual bool get_cost(rtx, machine_mode, int *, bool) const { return 1; } +}; + +class vector_alu_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { return m_scale_table->alu; } +}; + +class vector_load_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { return m_scale_table->load; } +}; + +class vector_store_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { return m_scale_table->store; } +}; + +class vector_mult_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { 
return m_scale_table->mult; } +}; + +class vector_mov_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { return m_scale_table->mov; } +}; + +class vector_dup_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { return m_scale_table->dup; } +}; + +class vector_extract_cost : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { return m_scale_table->extract; } +}; + +class vector_if_then_else_cost + : public vector_insn_cost<vector_insn_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->if_then_else; + } +}; + +class vector_insn_cost_table + : public vector_cost_table<vector_insn_scale_table> { +public: + vector_insn_cost_table(const vector_insn_scale_table *_scale_table) + : vector_cost_table(_scale_table) { + load = new vector_load_cost(_scale_table); + store = new vector_store_cost(_scale_table); + alu = new vector_alu_cost(_scale_table); + mult = new vector_mult_cost(_scale_table); + mov = new vector_mov_cost(_scale_table); + dup = new vector_dup_cost(_scale_table); + extract = new vector_extract_cost(_scale_table); + if_then_else = new vector_if_then_else_cost(_scale_table); + } + + bool get_cost(rtx, machine_mode, int *, bool) const override; + +public: + const vector_insn_cost<vector_insn_scale_table> *load; + const vector_insn_cost<vector_insn_scale_table> *store; + const vector_insn_cost<vector_insn_scale_table> *alu; + const vector_insn_cost<vector_insn_scale_table> *mult; + const vector_insn_cost<vector_insn_scale_table> *mov; + const vector_insn_cost<vector_insn_scale_table> *dup; + const vector_insn_cost<vector_insn_scale_table> *extract; + const vector_insn_cost<vector_insn_scale_table> *if_then_else; +}; + +// ==================== vector stmt 
cost========================= +class vector_scalar_int_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->scalar_int_stmt_cost; + } +}; + +class vector_scalar_fp_cost : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->scalar_fp_stmt_cost; + } +}; + +class vector_scalar_load_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->scalar_load_cost; + } +}; + +class vector_scalar_store_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->scalar_store_cost; + } +}; + +class vector_vec_int_cost : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_int_stmt_cost; + } +}; + +class vector_vec_fp_cost : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_fp_stmt_cost; + } +}; + +class vector_vec_permute_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_permute_cost; + } +}; + +class vector_vec_to_scalar_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use 
the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_to_scalar_cost; + } +}; + +class vector_scalar_to_vec_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->scalar_to_vec_cost; + } +}; + +class vector_vec_align_load_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_align_load_cost; + } +}; + +class vector_vec_unalign_load_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_unalign_load_cost; + } +}; + +class vector_vec_unalign_store_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_unalign_store_cost; + } +}; + +class vector_vec_store_cost : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->vec_store_cost; + } +}; + +class vector_cond_taken_branch_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->cond_taken_branch_cost; + } +}; + +class vector_cond_not_taken_branch_cost + : public vector_insn_cost<vector_stmt_scale_table> { +public: + // use the same construction function as the 
vector_insn_cost + using vector_insn_cost::vector_insn_cost; + + int scale(RTX_CODE) const override { + return m_scale_table->cond_not_taken_branch_cost; + } +}; + +class vector_stmt_cost_table + : public vector_cost_table<vector_stmt_scale_table> { +public: + vector_stmt_cost_table(const vector_stmt_scale_table *_scale_table) + : vector_cost_table(_scale_table) { + scalar_int = new vector_scalar_int_cost(_scale_table); + scalar_fp = new vector_scalar_fp_cost(_scale_table); + scalar_load = new vector_scalar_load_cost(_scale_table); + scalar_store = new vector_scalar_store_cost(_scale_table); + vec_int = new vector_vec_int_cost(_scale_table); + vec_fp = new vector_vec_fp_cost(_scale_table); + vec_permute = new vector_vec_permute_cost(_scale_table); + vec_to_scalar = new vector_vec_to_scalar_cost(_scale_table); + scalar_to_vec = new vector_scalar_to_vec_cost(_scale_table); + vec_align_load = new vector_vec_align_load_cost(_scale_table); + vec_unalign_load = new vector_vec_unalign_load_cost(_scale_table); + vec_unalign_store = new vector_vec_unalign_store_cost(_scale_table); + vec_store = new vector_vec_store_cost(_scale_table); + cond_taken_branch = new vector_cond_taken_branch_cost(_scale_table); + cond_not_taken_branch = new vector_cond_not_taken_branch_cost(_scale_table); + } + +public: + const vector_insn_cost<vector_stmt_scale_table> *scalar_int; + const vector_insn_cost<vector_stmt_scale_table> *scalar_fp; + const vector_insn_cost<vector_stmt_scale_table> *scalar_load; + const vector_insn_cost<vector_stmt_scale_table> *scalar_store; + const vector_insn_cost<vector_stmt_scale_table> *vec_int; + const vector_insn_cost<vector_stmt_scale_table> *vec_fp; + const vector_insn_cost<vector_stmt_scale_table> *vec_permute; + const vector_insn_cost<vector_stmt_scale_table> *vec_to_scalar; + const vector_insn_cost<vector_stmt_scale_table> *scalar_to_vec; + const vector_insn_cost<vector_stmt_scale_table> *vec_align_load; + const vector_insn_cost<vector_stmt_scale_table> *vec_unalign_load; + const vector_insn_cost<vector_stmt_scale_table> *vec_unalign_store; + const vector_insn_cost<vector_stmt_scale_table> *vec_store; + const vector_insn_cost<vector_stmt_scale_table> *cond_taken_branch; + const vector_insn_cost<vector_stmt_scale_table> *cond_not_taken_branch; +}; + +#endif // GCC_RISCV_VECTOR_COST_H diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv 
index d30e0235356..095169741bb 100644 --- a/gcc/config/riscv/t-riscv +++ b/gcc/config/riscv/t-riscv @@ -51,6 +51,11 @@ riscv-c.o: $(srcdir)/config/riscv/riscv-c.cc $(CONFIG_H) $(SYSTEM_H) \ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ $(srcdir)/config/riscv/riscv-c.cc +riscv-vector-cost.o: $(srcdir)/config/riscv/riscv-vector-cost.cc $(CONFIG_H) $(SYSTEM_H) \ + coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H) $(TARGET_H) + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \ + $(srcdir)/config/riscv/riscv-vector-cost.cc + riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \ $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) \ $(TARGET_H) tree-pass.h df.h rtl-ssa.h cfgcleanup.h insn-config.h \ -- 2.34.1 From patchwork Fri Mar 3 04:53:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Michael Collison X-Patchwork-Id: 65952
Date: Thu, 2 Mar 2023 23:53:14 -0500 From: Michael Collison Subject: [PATCH 04/07] RISC-V: Add
auto-vectorization support To: gcc-patches This patch adds support for functions used in implementing various portions of autovectorization support. gcc/ChangeLog:     * config/riscv/riscv-v.cc (riscv_classify_vlmul_field):     New function.     (riscv_vector_preferred_simd_mode): Ditto.     (get_mask_policy_no_pred): Ditto.     (get_tail_policy_no_pred): Ditto.     (riscv_tuple_mode_p): Ditto.     (riscv_classify_nf): Ditto.     (riscv_vlmul_regsize): Ditto.     (riscv_vector_mask_mode_p): Ditto.     (riscv_vector_get_mask_mode): Ditto. ---  gcc/config/riscv/riscv-v.cc | 176 ++++++++++++++++++++++++++++++++++++  1 file changed, 176 insertions(+)  { @@ -162,6 +199,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)    return ratio;  } +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */ + +machine_mode +riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf) +{ +  if (!TARGET_VECTOR) +    return word_mode; + +  switch (mode) +    { +    case E_QImode: +      return vf == 1   ? VNx8QImode +         : vf == 2 ? VNx16QImode +         : vf == 4 ? VNx32QImode +               : VNx64QImode; +      break; +    case E_HImode: +      return vf == 1   ? VNx4HImode +         : vf == 2 ? VNx8HImode +         : vf == 4 ? VNx16HImode +               : VNx32HImode; +      break; +    case E_SImode: +      return vf == 1   ? VNx2SImode +         : vf == 2 ? 
VNx4SImode +         : vf == 4 ? VNx8SImode +               : VNx16SImode; +      break; +    case E_DImode: +      if (riscv_vector_elen_flags != MASK_VECTOR_ELEN_32 +      && riscv_vector_elen_flags != MASK_VECTOR_ELEN_FP_32) +    return vf == 1     ? VNx1DImode +           : vf == 2 ? VNx2DImode +           : vf == 4 ? VNx4DImode +             : VNx8DImode; +      break; +    case E_SFmode: +      if (TARGET_HARD_FLOAT && riscv_vector_elen_flags != MASK_VECTOR_ELEN_32 +      && riscv_vector_elen_flags != MASK_VECTOR_ELEN_64) +    return vf == 1     ? VNx2SFmode +           : vf == 2 ? VNx4SFmode +           : vf == 4 ? VNx8SFmode +             : VNx16SFmode; +      break; +    case E_DFmode: +      if (TARGET_DOUBLE_FLOAT && TARGET_VECTOR_ELEN_FP_64) +    return vf == 1     ? VNx1DFmode +           : vf == 2 ? VNx2DFmode +           : vf == 4 ? VNx4DFmode +             : VNx8DFmode; +      break; +    default: +      break; +    } + +  return word_mode; +} +  /* Emit an RVV unmask && vl mov from SRC to DEST.  */  static void  emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len, @@ -374,6 +469,87 @@ get_avl_type_rtx (enum avl_type type)    return gen_int_mode (type, Pmode);  } +rtx +get_mask_policy_no_pred () +{ +  return get_mask_policy_for_pred(PRED_TYPE_none); +} + +rtx +get_tail_policy_no_pred () +{ +  return get_tail_policy_for_pred(PRED_TYPE_none); +} + +/* Return true if it is a RVV tuple mode. */ +bool +riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED) +{ +  return false; +} + +/* Return nf for a machine mode. */ +int +riscv_classify_nf (machine_mode mode) +{ +  switch (mode) +    { + +    default: +      break; +    } + +  return 1; +} + +/* Return vlmul register size for a machine mode. 
*/ +int +riscv_vlmul_regsize (machine_mode mode) +{ +  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) +    return 1; +  switch (riscv_classify_vlmul_field (mode)) +    { +    case VLMUL_FIELD_001: +      return 2; +    case VLMUL_FIELD_010: +      return 4; +    case VLMUL_FIELD_011: +      return 8; +    case VLMUL_FIELD_100: +      gcc_unreachable (); +    default: +      return 1; +    } +} + +/* Return true if it is a RVV mask mode. */ +bool +riscv_vector_mask_mode_p (machine_mode mode) +{ +  return (mode == VNx1BImode || mode == VNx2BImode || mode == VNx4BImode +      || mode == VNx8BImode || mode == VNx16BImode || mode == VNx32BImode +      || mode == VNx64BImode); +} + +/* Implement TARGET_VECTORIZE_GET_MASK_MODE for RVV.  */ + +opt_machine_mode +riscv_vector_get_mask_mode (machine_mode mode) +{ +  machine_mode mask_mode; +  int nf = 1; +  if (riscv_tuple_mode_p (mode)) +    nf = riscv_classify_nf (mode); + +  FOR_EACH_MODE_IN_CLASS (mask_mode, MODE_VECTOR_BOOL) +  if (GET_MODE_INNER (mask_mode) == BImode +      && known_eq (GET_MODE_NUNITS (mask_mode) * nf, GET_MODE_NUNITS (mode)) +      && riscv_vector_mask_mode_p (mask_mode)) +    return mask_mode; +  return default_get_mask_mode (mode); +} +  /* Return the RVV vector mode that has NUNITS elements of mode INNER_MODE.     This function is not only used by builtins, but also will be used by     auto-vectorization in the future.  
*/ diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 58007cc16eb..58f69e259c0 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -39,9 +39,11 @@  #include "emit-rtl.h"  #include "tm_p.h"  #include "target.h" +#include "targhooks.h"  #include "expr.h"  #include "optabs.h"  #include "tm-constrs.h" +#include "riscv-vector-builtins.h"  using namespace riscv_vector; @@ -108,6 +110,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,        && IN_RANGE (INTVAL (elt), minval, maxval));  } +/* Return the vlmul field for a specific machine mode. */ +unsigned int +riscv_classify_vlmul_field (enum machine_mode mode) +{ +  /* Make the decision based on the mode's enum value rather than its +     properties, so that we keep the correct classification regardless +     of -mriscv-vector-bits.  */ +  switch (mode) +    { +    case E_VNx8BImode: +      return VLMUL_FIELD_111; + +    case E_VNx4BImode: +      return VLMUL_FIELD_110; + +    case E_VNx2BImode: +      return VLMUL_FIELD_101; + +    case E_VNx16BImode: +      return VLMUL_FIELD_000; + +    case E_VNx32BImode: +      return VLMUL_FIELD_001; + +    case E_VNx64BImode: +      return VLMUL_FIELD_010; + +    default: +      break; +    } + +  /* we don't care about VLMUL for Mask */ +  return VLMUL_FIELD_000; +} +  rtx  emit_vlmax_vsetvl (machine_mode vmode) From patchwork Fri Mar 3 04:53:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Michael Collison X-Patchwork-Id: 65953
Date: Thu, 2 Mar 2023 23:53:25 -0500 From: Michael Collison Subject: [PATCH 05/07] RISC-V: Add auto-vectorization support To: gcc-patches This patch adds support for registering target hooks for basic autovectorization support as well as basic tuning information for the vector extension. gcc/ChangeLog:     * config/riscv/riscv-cores.def (RISCV_TUNE):     Add VECTOR_TUNE_INFO parameter.     * common/config/riscv/riscv-common.cc (RISCV_TUNE):     Add VECTOR_TUNE_INFO parameter.     * config/riscv/riscv.cc (riscv_vector_tune_param):     New struct for vector tuning information.     (riscv_tune_info): Add vector_tune_param.     (vector_tune_param): New static variable.     (riscv_vectorization_factor): New variable.     (generic_rvv_insn_scale_table): New struct.     
(generic_rvv_stmt_scale_table): New struct.     (generic_rvv_insn_cost_table): New vector insn cost table.     (generic_rvv_stmt_cost_table): New vector statement cost table.     (generic_rvv_tune_info): New rvv tuning table.     (RISCV_TUNE): Add VECTOR_TUNE_INFO parameter.     (riscv_rtx_costs): Return vector estimate if vector mode.     (riscv_option_override): Set vector_tune_param.     (riscv_option_override): Set riscv_vectorization_factor.     (riscv_estimated_poly_value): Implement     TARGET_ESTIMATED_POLY_VALUE.     (riscv_preferred_simd_mode): Implement     TARGET_VECTORIZE_PREFERRED_SIMD_MODE.     (riscv_autovectorize_vector_modes): Implement     TARGET_AUTOVECTORIZE_VECTOR_MODES.     (riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE.     (riscv_empty_mask_is_expensive): Implement     TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.     (riscv_builtin_vectorization_cost): Implement     TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST.     (riscv_vectorize_create_costs): Implement     TARGET_VECTORIZE_CREATE_COSTS.     (TARGET_ESTIMATED_POLY_VALUE): Register target macro.     (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto.     (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.     (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.     (TARGET_VECTORIZE_GET_MASK_MODE): Ditto.     (TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.     (TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.     (TARGET_VECTORIZE_CREATE_COSTS): Ditto ---  gcc/common/config/riscv/riscv-common.cc |   2 +-  gcc/config/riscv/riscv-cores.def        |  14 +-  gcc/config/riscv/riscv.cc               | 321 +++++++++++++++++++++++-  3 files changed, 325 insertions(+), 12 deletions(-)  static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *); @@ -403,8 +469,8 @@ static const unsigned gpr_save_reg_order[] = {  /* A table describing all the processors GCC knows about.  
*/  static const struct riscv_tune_info riscv_tune_info_table[] = { -#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO)    \ -  { TUNE_NAME, PIPELINE_MODEL, & TUNE_INFO}, +#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)    \ +  { TUNE_NAME, PIPELINE_MODEL, & TUNE_INFO, &VECTOR_TUNE_INFO},  #include "riscv-cores.def"  }; @@ -2237,8 +2303,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN       Cost Model need to be well analyzed and supported in the future. */    if (riscv_v_ext_vector_mode_p (mode))      { -      *total = COSTS_N_INSNS (1); -      return true; +      return vector_tune_param->rvv_insn_costs_table->get_cost (x, mode, total, speed);      }    bool float_mode_p = FLOAT_MODE_P (mode); @@ -6079,6 +6144,7 @@ riscv_option_override (void)                 RISCV_TUNE_STRING_DEFAULT));    riscv_microarchitecture = cpu->microarchitecture;    tune_param = optimize_size ? &optimize_size_tune_info : cpu->tune_param; +  vector_tune_param = cpu->vector_tune_param;    /* Use -mtune's setting for slow_unaligned_access, even when optimizing       for size.  For architectures that trap and emulate unaligned accesses, @@ -6198,6 +6264,10 @@ riscv_option_override (void)    /* Convert -march to a chunks count.  */    riscv_vector_chunks = riscv_convert_vector_bits (); + +  if (TARGET_VECTOR) +    riscv_vectorization_factor = riscv_vector_lmul; +  }  /* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */ @@ -6892,6 +6962,218 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor,    return RISCV_DWARF_VLENB;  } +/* Implement TARGET_ESTIMATED_POLY_VALUE. +   Look into the tuning structure for an estimate. +   KIND specifies the type of requested estimate: min, max or likely. +   For cores with a known RVV width all three estimates are the same. +   For generic RVV tuning we want to distinguish the maximum estimate from +   the minimum and likely ones. 
+   The likely estimate is the same as the minimum in that case to give a +   conservative behavior of auto-vectorizing with RVV when it is a win +   even for 128-bit RVV. +   When RVV width information is available VAL.coeffs[1] is multiplied by +   the number of VQ chunks over the initial Advanced SIMD 128 bits.  */ + +static HOST_WIDE_INT +riscv_estimated_poly_value (poly_int64 val, +                            poly_value_estimate_kind kind = POLY_VALUE_LIKELY) +{ +  unsigned int width_source = +      BITS_PER_RISCV_VECTOR.is_constant () +          ? (unsigned int)BITS_PER_RISCV_VECTOR.to_constant () +          : (unsigned int)RVV_SCALABLE; + +  /* If there is no core-specific information then the minimum and likely +     values are based on 128-bit vectors and the maximum is based on +     the architectural maximum of 2048 bits.  */ +  if (width_source == RVV_SCALABLE) +    switch (kind) +      { +      case POLY_VALUE_MIN: +      case POLY_VALUE_LIKELY: +        return val.coeffs[0]; + +      case POLY_VALUE_MAX: +        return val.coeffs[0] + val.coeffs[1] * 15; +      } + +  /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the +     lowest as likely.  This could be made more general if future -mtune +     options need it to be.  */ +  if (kind == POLY_VALUE_MAX) +    width_source = 1 << floor_log2 (width_source); +  else +    width_source = least_bit_hwi (width_source); + +  /* If the core provides width information, use that.  */ +  HOST_WIDE_INT over_128 = width_source - 128; +  return val.coeffs[0] + val.coeffs[1] * over_128 / 128; +} + +/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */ + +static machine_mode +riscv_preferred_simd_mode (scalar_mode mode) +{ +  machine_mode vmode = +    riscv_vector::riscv_vector_preferred_simd_mode (mode, riscv_vectorization_factor); +  if (VECTOR_MODE_P (vmode)) +    return vmode; + +  return word_mode; +} + +/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV.  
*/ +static unsigned int +riscv_autovectorize_vector_modes (vector_modes *modes, bool) +{ +  if (!TARGET_VECTOR) +    return 0; + +  if (riscv_vectorization_factor == RVV_LMUL1) +    { +      modes->safe_push (VNx16QImode); +      modes->safe_push (VNx8QImode); +      modes->safe_push (VNx4QImode); +      modes->safe_push (VNx2QImode); +    } +  else if (riscv_vectorization_factor == RVV_LMUL2) +    { +      modes->safe_push (VNx32QImode); +      modes->safe_push (VNx16QImode); +      modes->safe_push (VNx8QImode); +      modes->safe_push (VNx4QImode); +    } +  else if (riscv_vectorization_factor == RVV_LMUL4) +    { +      modes->safe_push (VNx64QImode); +      modes->safe_push (VNx32QImode); +      modes->safe_push (VNx16QImode); +      modes->safe_push (VNx8QImode); +    } +  else +    { +      modes->safe_push (VNx64QImode); +      modes->safe_push (VNx32QImode); +      modes->safe_push (VNx16QImode); +    } + +  return 0; +} + +/* Implement TARGET_VECTORIZE_GET_MASK_MODE.  */ + +static opt_machine_mode +riscv_get_mask_mode (machine_mode mode) +{ +  machine_mode mask_mode = VOIDmode; +  if (TARGET_VECTOR && +      riscv_vector::riscv_vector_get_mask_mode (mode).exists (&mask_mode)) +    return mask_mode; + +  return default_get_mask_mode (mode); +} + +/* Implement TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.  Assume for now that +   it isn't worth branching around empty masked ops (including masked +   stores).  */ + +static bool +riscv_empty_mask_is_expensive (unsigned) +{ +  return false; +} + +/* Implement targetm.vectorize.builtin_vectorization_cost.  
*/ +int +riscv_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost, +                                  tree vectype, int misalign ATTRIBUTE_UNUSED) +{ +  unsigned elements; +  bool fp = false; +  rtx x = NULL_RTX; +  machine_mode mode = VOIDmode; + +  if (vectype != NULL) +    { +      fp = FLOAT_TYPE_P (vectype); +      mode = TYPE_MODE (vectype); +    } + +  switch (type_of_cost) +    { +    case scalar_stmt: +      return fp ? vector_tune_param->rvv_stmt_costs_table->scalar_fp->cost ( +                      x, mode) +                : vector_tune_param->rvv_stmt_costs_table->scalar_int->cost ( +                      x, mode); + +    case scalar_load: +      return vector_tune_param->rvv_stmt_costs_table->scalar_load->cost (x, + mode); + +    case scalar_store: +      return vector_tune_param->rvv_stmt_costs_table->scalar_store->cost (x, + mode); + +    case vector_stmt: +      return fp ? vector_tune_param->rvv_stmt_costs_table->vec_fp->cost (x, + mode) +                : vector_tune_param->rvv_stmt_costs_table->vec_int->cost (x, + mode); + +    case vector_load: +      return vector_tune_param->rvv_stmt_costs_table->vec_align_load->cost ( +          x, mode); + +    case vector_store: +      return vector_tune_param->rvv_stmt_costs_table->vec_store->cost (x, mode); + +    case vec_to_scalar: +      return vector_tune_param->rvv_stmt_costs_table->vec_to_scalar->cost ( +          x, mode); + +    case scalar_to_vec: +      return vector_tune_param->rvv_stmt_costs_table->scalar_to_vec->cost ( +          x, mode); + +    case unaligned_load: +    case vector_gather_load: +      return vector_tune_param->rvv_stmt_costs_table->vec_unalign_load->cost ( +          x, mode); + +    case unaligned_store: +    case vector_scatter_store: +      return vector_tune_param->rvv_stmt_costs_table->vec_unalign_store->cost ( +          x, mode); + +    case cond_branch_taken: +      return vector_tune_param->rvv_stmt_costs_table->cond_taken_branch->cost ( +          
x, mode); + +    case cond_branch_not_taken: +      return vector_tune_param->rvv_stmt_costs_table->cond_not_taken_branch +          ->cost (x, mode); + +    case vec_perm: +      return vector_tune_param->rvv_stmt_costs_table->vec_permute->cost (x, + mode); + +    case vec_promote_demote: +      return fp ? vector_tune_param->rvv_stmt_costs_table->vec_fp->cost (x, + mode) +                : vector_tune_param->rvv_stmt_costs_table->vec_int->cost (x, + mode); + +    case vec_construct: +      elements = estimated_poly_value (TYPE_VECTOR_SUBPARTS (vectype)); +      return elements / 2 + 1; + +    default: +      gcc_unreachable (); +    } +} +  /* Return true if a shift-amount matches the trailing cleared bits on     a bitmask.  */ @@ -6901,6 +7183,13 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)    return shamt == ctz_hwi (mask);  } +/* Implement TARGET_VECTORIZE_CREATE_COSTS.  */ +vector_costs * +riscv_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) +{ +  return new riscv_vector_costs (vinfo, costing_for_scalar); +} +  /* Initialize the GCC target structure.  
*/  #undef TARGET_ASM_ALIGNED_HI_OP  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" @@ -7143,6 +7432,30 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)  #undef TARGET_VERIFY_TYPE_CONTEXT  #define TARGET_VERIFY_TYPE_CONTEXT riscv_verify_type_context +#undef TARGET_ESTIMATED_POLY_VALUE +#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value + +#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST +#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST riscv_builtin_vectorization_cost + +#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE +#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode + +#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES +#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES riscv_autovectorize_vector_modes + +#undef TARGET_VECTORIZE_GET_MASK_MODE +#define TARGET_VECTORIZE_GET_MASK_MODE riscv_get_mask_mode + +#undef TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE +#define TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE riscv_empty_mask_is_expensive + +#undef TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK +#define TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK riscv_loop_len_override_mask + +#undef TARGET_VECTORIZE_CREATE_COSTS +#define TARGET_VECTORIZE_CREATE_COSTS riscv_vectorize_create_costs +  #undef TARGET_VECTOR_ALIGNMENT  #define TARGET_VECTOR_ALIGNMENT riscv_vector_alignment diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index ebc1ed7d7e4..6b8d92af986 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -246,7 +246,7 @@ static const riscv_cpu_info riscv_cpu_tables[] =  static const char *riscv_tunes[] =  { -#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO) \ +#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)    \      TUNE_NAME,  #include "../../../config/riscv/riscv-cores.def"      NULL diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def index 2a834cae21d..4feb0366222 100644 --- 
a/gcc/config/riscv/riscv-cores.def +++ b/gcc/config/riscv/riscv-cores.def @@ -30,15 +30,15 @@     identifier, reference to riscv.cc.  */  #ifndef RISCV_TUNE -#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO) +#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)  #endif -RISCV_TUNE("rocket", generic, rocket_tune_info) -RISCV_TUNE("sifive-3-series", generic, rocket_tune_info) -RISCV_TUNE("sifive-5-series", generic, rocket_tune_info) -RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info) -RISCV_TUNE("thead-c906", generic, thead_c906_tune_info) -RISCV_TUNE("size", generic, optimize_size_tune_info) +RISCV_TUNE("rocket", generic, rocket_tune_info, generic_rvv_tune_info) +RISCV_TUNE("sifive-3-series", generic, rocket_tune_info, generic_rvv_tune_info) +RISCV_TUNE("sifive-5-series", generic, rocket_tune_info, generic_rvv_tune_info) +RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info, generic_rvv_tune_info) +RISCV_TUNE("thead-c906", generic, thead_c906_tune_info, generic_rvv_tune_info) +RISCV_TUNE("size", generic, optimize_size_tune_info, generic_rvv_tune_info)  #undef RISCV_TUNE diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index f11b7949a49..16b38ba4d76 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -60,6 +60,16 @@ along with GCC; see the file COPYING3.  If not see  #include "opts.h"  #include "tm-constrs.h"  #include "rtl-iter.h" +#include "gimple.h" +#include "cfghooks.h" +#include "cfgloop.h" +#include "cfgrtl.h" +#include "sel-sched.h" +#include "fold-const.h" +#include "gimple-iterator.h" +#include "gimple-expr.h" +#include "tree-vectorizer.h" +#include "riscv-vector-cost.h"  /* This file should be included last.  */  #include "target-def.h" @@ -238,6 +248,12 @@ struct riscv_tune_param    bool slow_unaligned_access;  }; +/* Cost for vector insn classes.  
*/ +struct riscv_vector_tune_param { +    const vector_insn_cost_table* rvv_insn_costs_table; +    const vector_stmt_cost_table* rvv_stmt_costs_table; +}; +  /* Information about one micro-arch we know about.  */  struct riscv_tune_info {    /* This micro-arch canonical name.  */ @@ -248,6 +264,9 @@ struct riscv_tune_info {    /* Tuning parameters for this micro-arch.  */    const struct riscv_tune_param *tune_param; + +  /* Tuning vector parameters for this micro-arch.  */ +  const struct riscv_vector_tune_param *vector_tune_param;  };  /* Global variables for machine-dependent things.  */ @@ -266,6 +285,9 @@ static int epilogue_cfa_sp_offset;  /* Which tuning parameters to use.  */  static const struct riscv_tune_param *tune_param; +/* Which vector tuning parameters to use.  */ +static const struct riscv_vector_tune_param *vector_tune_param; +  /* Which automaton to use for tuning.  */  enum riscv_microarchitecture_type riscv_microarchitecture; @@ -275,6 +297,9 @@ poly_uint16 riscv_vector_chunks;  /* The number of bytes in a vector chunk.  */  unsigned riscv_bytes_per_vector_chunk; +/* Prefer vf for auto-vectorizer.  */ +unsigned riscv_vectorization_factor; +  /* Index R is the smallest register class that contains register R.  
*/  const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {    GR_REGS,    GR_REGS,    GR_REGS,    GR_REGS, @@ -367,6 +392,47 @@ static const struct riscv_tune_param optimize_size_tune_info = {    false,                    /* slow_unaligned_access */  }; +static const vector_insn_scale_table generic_rvv_insn_scale_table = { +    4, /*load*/ +    1, /*store*/ +    1, /*alu*/ +    1, /*mult*/ +    1, /*movi*/ +    1, /*dup*/ +    1, /*extract*/ +    1, /*if_then_else*/ +}; + +static const vector_stmt_scale_table generic_rvv_stmt_scale_table = { +    1, /* scalar_int_stmt_cost  */ +    1, /* scalar_fp_stmt_cost  */ +    1, /* scalar_load_cost  */ +    1, /* scalar_store_cost  */ +    1, /* vec_int_stmt_cost  */ +    1, /* vec_fp_stmt_cost  */ +    1, /* vec_permute_cost  */ +    1, /* vec_to_scalar_cost  */ +    1, /* scalar_to_vec_cost  */ +    1, /* vec_align_load_cost  */ +    1, /* vec_unalign_load_cost  */ +    1, /* vec_unalign_store_cost  */ +    1, /* vec_store_cost  */ +    1, /* cond_taken_branch_cost  */ +    1 /* cond_not_taken_branch_cost  */ +}; + +static const vector_insn_cost_table* generic_rvv_insn_cost_table = +            new vector_insn_cost_table(&generic_rvv_insn_scale_table); + +static const vector_stmt_cost_table* generic_rvv_stmt_cost_table = +            new vector_stmt_cost_table(&generic_rvv_stmt_scale_table); + +/* Costs to use when optimizing for riscv vector.  
*/ +static const struct riscv_vector_tune_param generic_rvv_tune_info = { +  generic_rvv_insn_cost_table, +  generic_rvv_stmt_cost_table +}; +  static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *); From patchwork Fri Mar 3 04:53:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Michael Collison X-Patchwork-Id: 65954 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3BF60383E682 for ; Fri, 3 Mar 2023 04:54:19 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-qv1-xf30.google.com (mail-qv1-xf30.google.com [IPv6:2607:f8b0:4864:20::f30]) by sourceware.org (Postfix) with ESMTPS id E4EA5384644A for ; Fri, 3 Mar 2023 04:53:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E4EA5384644A Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-qv1-xf30.google.com with SMTP id f1so1022507qvx.13 for ; Thu, 02 Mar 2023 20:53:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:content-language:to:subject:from :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=NNWdMamffT+eNWkln+jxwXjqJiAbH4q2J5ciu7yB1Vo=; b=vNzWgsGOtMYKw24hukAiVO5ppnXE0KZ7JjRHyi2+jH8Dadr0apfjXqTSQT5XJXE4a3 Bp2KiL65wIL2cvhdhmvNaoUtm8mY3IDmlsIEYoG2vP+UbZtjhQnqNIE2vUHGkV18UnmE eXMYD8IzUdlMiZa+uxQzM1XiuvUHmYb4bjGFCCErbelzJTLzySv6e9seXByABcOVYIJf VvOePzBuEvTvQVmuZ2sNy09tSztp451jJoFo50RKoDaYI9VlUuCasSYs/rSoLuENur6W e3yZdG2bTeVC1miX/5wdEWJUk04A/y16J/9hymjjym1J9uvYLM3BwdGnlmfUwVLbvmmB +q2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:content-language:to:subject:from :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=NNWdMamffT+eNWkln+jxwXjqJiAbH4q2J5ciu7yB1Vo=; b=d5SJ7JLr3GH8y6WJJ1vUfR4fCGJSl9qgOiosNLfJXmq3GU1b+tVnXjXKvyMlxK3beZ IieF8H2Lo//gZJc5rHmRqJaTarCd21UV3GsE60SNrIWHQ/wk3VHRlfxzHyMJuRYu91Ra WZvQ8H89MqAk2OjNpcLMhgb1LCyv4PHKP0SrSyejjWjA1VB243n0Sr79W9ODNkHoECLb RWgzOwGvaTJiGETwRXmgc53d+iQhwcFCWPGQLLKGxkI2zdCA6y3ucIBbP1ShfwnjfZiJ b2OpyzxoqeMcouuqUh067bK+ngu4dX4tm5LLmNY/pv/yjLRtmcf3e/78DFCWkPvUsJBI x9JA== X-Gm-Message-State: AO0yUKWmFjxfxV4bc4lFV71lQ+05FmAYIzDMeGCuH/h7o6xCq+sEeJy5 x3KOI0WyVZ4r2ZLcayAdGHZHNg8FIVAt9Xc+ X-Google-Smtp-Source: AK7set9oYiNdH9FIKoVzhS4fCYGjIKm3UlO5fRTDy+gmNq9osvbFvDWPaq26YB1FHSA5/smxUdjwLA== X-Received: by 2002:a05:6214:f09:b0:577:5b89:577e with SMTP id gw9-20020a0562140f0900b005775b89577emr1069426qvb.32.1677819215863; Thu, 02 Mar 2023 20:53:35 -0800 (PST) Received: from [192.168.86.117] ([136.57.172.92]) by smtp.gmail.com with ESMTPSA id 74-20020a370a4d000000b0073b878e3f30sm1015960qkk.59.2023.03.02.20.53.35 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 02 Mar 2023 20:53:35 -0800 (PST) Message-ID: <927ed290-1340-5793-2c7f-8e0359cd0cea@rivosinc.com> Date: Thu, 2 Mar 2023 23:53:35 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 From: Michael Collison Subject: [PATCH 06/07] RISC-V: Add auto-vectorization support To: gcc-patches Content-Language: en-US X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: 
Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"

This patch adds patterns that provide basic autovectorization support
for integer adds and subtracts.

gcc/ChangeLog:

    * config/riscv/riscv.md (riscv_classify_vlmul_field):
    New external declaration.
    (riscv_vector_preferred_simd_mode): Include
    vector-iterators.md.
    * config/riscv/vector-auto.md: New file containing
    autovectorization patterns.
    * config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
    New unspecs for autovectorization patterns.
    * config/riscv/vector.md: Remove include of vector-iterators.md
    and include vector-auto.md.
---
 gcc/config/riscv/riscv.md            |   1 +
 gcc/config/riscv/vector-auto.md      | 172 +++++++++++++++++++++++++++
 gcc/config/riscv/vector-iterators.md |   2 +
 gcc/config/riscv/vector.md           |   4 +-
 4 files changed, 177 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/riscv/vector-auto.md

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 05924e9bbf1..c34124095f7 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -131,6 +131,7 @@
 (include "predicates.md")
 (include "constraints.md")
 (include "iterators.md")
+(include "vector-iterators.md")
 
 ;; ....................
 ;;
diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
new file mode 100644
index 00000000000..e5a19663d18
--- /dev/null
+++ b/gcc/config/riscv/vector-auto.md
@@ -0,0 +1,172 @@
+;; Machine description for RISC-V 'V' Extension for GNU compiler.
+;; Copyright (C) 2022-2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
+;; Contributed by Michael Collison (collison@rivosinc.com), Rivos Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Addition
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vadd.vv
+;; - vadd.vx
+;; - vadd.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "add<mode>3"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_arith_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_add<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "vector_reg_or_const_dup_operand")
+   (match_operand:VI 4 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[2], operands[3],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+(define_expand "len_add<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_reg_or_const_dup_operand")
+   (match_operand 3 "p_reg_or_const_csr_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = operands[3];
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Subtraction
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vsub.vv
+;; - vsub.vx
+;; - vadd.vi
+;; - vrsub.vx
+;; - vrsub.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "sub<mode>3"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_sub<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_sub<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "register_operand")
+   (match_operand:VI 4 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_sub<mode>(operands[0], mask, merge, operands[2], operands[3],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "len_sub<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand 3 "p_reg_or_const_csr_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = operands[3];
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_sub<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index cb817abcfde..1a88e511d1b 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -34,6 +34,8 @@
   UNSPEC_VMULHU
   UNSPEC_VMULHSU
 
+  UNSPEC_VADD
+  UNSPEC_VSUB
   UNSPEC_VADC
   UNSPEC_VSBC
   UNSPEC_VMADC
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 69b7cafbf17..87d85d3a415 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -26,8 +26,6
@@  ;; - Auto-vectorization (TBD)  ;; - Combine optimization (TBD) -(include "vector-iterators.md") -  (define_constants [     (INVALID_ATTRIBUTE            255)     (X0_REGNUM                      0) @@ -336,6 +334,8 @@         (symbol_ref "INTVAL (operands[4])")]      (const_int INVALID_ATTRIBUTE))) +(include "vector-auto.md") +  ;; -----------------------------------------------------------------  ;; ---- Miscellaneous Operations  ;; ----------------------------------------------------------------- From patchwork Fri Mar 3 04:53:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Michael Collison X-Patchwork-Id: 65955 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 79C0B385840F for ; Fri, 3 Mar 2023 04:54:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-qt1-x82b.google.com (mail-qt1-x82b.google.com [IPv6:2607:f8b0:4864:20::82b]) by sourceware.org (Postfix) with ESMTPS id 795EC383FBB6 for ; Fri, 3 Mar 2023 04:53:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 795EC383FBB6 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivosinc.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com Received: by mail-qt1-x82b.google.com with SMTP id cf14so1706006qtb.10 for ; Thu, 02 Mar 2023 20:53:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:content-language:to:subject:from :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=Fh3AxmA49n+kQWlSqyZvf8rzYcF47ucV9eudaUIwM7s=; b=2pjTWJFUq735R9Hz7vKaWkz12QHn8tjEFqaXM4KJmtgxaqutlg5uJYxOXzv3FEgNUI 
8oHslZG/LLwLd1Co01QWPPeUhoWopj89fTQifiWvt8f8DSbtQ4u4CoihM7kPYJCC+nNa dEfN/GCwZU56QGUzB/LKE5sL2cJeisXWUnGZqoZD2cDbYiks7yNNnnTSfZ7R1mtALZK8 Q/J2DiloisLs8iRRzNWRDZzye2dS79xVTEY0kBIAP/6ifslye2WfMww9CbQlS3uCz27Z 91EIliAqb2jOP43xi2k1sGKHTDw6LjMheJIgLxLq/7NM46nmz7WnAYKY1bsQSGQ0NtOn Q0yg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:content-language:to:subject:from :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=Fh3AxmA49n+kQWlSqyZvf8rzYcF47ucV9eudaUIwM7s=; b=qz0kAhagXO8BhpL4VrOACPaPApB/zad77q9vtu8VDRg3dIUVFhGW6jT3T/jSic1Hhl ZHFG/FseLcGz3CQZHuXEvbffr5jmaAmSbQmHx3tORqq0RwxRYYnapYHg8bLG27vlm5aw qjm23S4vydTn1NQU98rZ/QnSsU3uRqtA369uohbFgJgvGhXQbuUt0E6rAvLfukkFT3zS GyG2eyVKkr4rNfvIWa/HzplX1kgscgD+PUHVcY1XG4VuypLN3Zsnc4sX1J8TcrDIcEIu ge2x8pVRp4ZNnuorjDjgCWu/bVFmLysPtraM9cPDXEZQ6CiSV2cuTuqp9RyLgidSZGuN EVeg== X-Gm-Message-State: AO0yUKUqc3HwWbObN3oSPywr6fjZ2nxwgGeHT0/5LCQD0venoTy8BCet iEX6yz9ULDUo44H39bvOSscVYjgfvDHxJTJr X-Google-Smtp-Source: AK7set84z6yF2T4rtqBPeXGs+I6ZXyO1w2vVDixrcQAyFuPbmAEauR8VqekmO/MF3oPL8OvHissavw== X-Received: by 2002:ac8:4904:0:b0:3ba:2b4:7b39 with SMTP id e4-20020ac84904000000b003ba02b47b39mr769173qtq.46.1677819223061; Thu, 02 Mar 2023 20:53:43 -0800 (PST) Received: from [192.168.86.117] ([136.57.172.92]) by smtp.gmail.com with ESMTPSA id w15-20020ac8718f000000b003b9bf862c04sm1037449qto.55.2023.03.02.20.53.42 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 02 Mar 2023 20:53:42 -0800 (PST) Message-ID: Date: Thu, 2 Mar 2023 23:53:42 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 From: Michael Collison Subject: [PATCH 07/07] RISC-V: Add auto-vectorization support To: gcc-patches Content-Language: en-US X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, 
SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" This patch adds tests for autovectorization of integer add and subtract. gcc/testsuite/ChangeLog:     * gcc.target/riscv/rvv/autovec: New directory     for autovectorization tests.     * gcc.target/riscv/rvv/autovec/loop-add-rv32.c: New     test to verify code generation of vector add on rv32.     * gcc.target/riscv/rvv/autovec/loop-add.c: New     test to verify code generation of vector add on rv64.     * gcc.target/riscv/rvv/autovec/loop-sub-rv32.c: New     test to verify code generation of vector subtract on rv32.     * gcc.target/riscv/rvv/autovec/loop-sub.c: New     test to verify code generation of vector subtract on rv64. 
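For readers who want to see what these tests exercise, here is a minimal
sketch (not part of the patch) of the loop shape the TEST_TYPE macro
generates, instantiated for int32_t only.  The check_vadd_int32_t driver
is a hypothetical helper added for illustration; the actual tests are
compile-only and simply count vadd.vv or vsub.vv in the assembly output.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the TEST_TYPE macro in the new tests: an element-wise
   loop that the vectorizer is expected to turn into vadd.vv.  */
#define TEST_TYPE(TYPE)                                   \
  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)   \
  {                                                       \
    for (int i = 0; i < n; i++)                           \
      dst[i] = a[i] + b[i];                               \
  }

TEST_TYPE (int32_t)

/* Hypothetical driver (an assumption, not in the patch): runs the loop
   on small arrays and checks the scalar semantics.  Returns 1 on
   success.  */
int
check_vadd_int32_t (void)
{
  int32_t a[8], b[8], dst[8];

  for (int i = 0; i < 8; i++)
    {
      a[i] = i;
      b[i] = 10 * i;
    }

  vadd_int32_t (dst, a, b, 8);

  for (int i = 0; i < 8; i++)
    assert (dst[i] == 11 * i);

  return 1;
}
```

Built with the options used by the tests (-O2 -ftree-vectorize and an
-march that includes the V extension), the loop body is the part that
the dg-final scan-assembler-times directive looks for.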
---
 .../riscv/rvv/autovec/loop-add-rv32.c         | 24 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-add.c   | 24 +++++++++++++++++++
 .../riscv/rvv/autovec/loop-sub-rv32.c         | 24 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/loop-sub.c   | 24 +++++++++++++++++++
 4 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
new file mode 100644
index 00000000000..bdc3b6892e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] + b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
new file mode 100644
index 00000000000..d7f992c7d27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] + b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
new file mode 100644
index 00000000000..7d0a40ec539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] - b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
new file mode 100644
index 00000000000..c8900884f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] - b[i];                \
+  }
+
+/* *int8_t not autovec currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
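
The tests in this patch are compile-only: each one instantiates the loop for six element types and counts the resulting vadd.vv or vsub.vv instructions in the assembly. As a rough illustration of what a single TEST_TYPE instantiation expands to, the sketch below reuses the macro shape from loop-add.c for two of the types and checks the scalar results on the host. It does not exercise the vectorizer or any RISC-V target. The helper run_vadd_checks and its input values are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as TEST_TYPE in loop-add.c: one element-wise add loop
   per element type.  */
#define TEST_TYPE(TYPE)                                         \
  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)         \
  {                                                             \
    for (int i = 0; i < n; i++)                                 \
      dst[i] = a[i] + b[i];                                     \
  }

/* Expands to vadd_int16_t and vadd_uint32_t.  */
TEST_TYPE (int16_t)
TEST_TYPE (uint32_t)

/* Hypothetical helper (not in the patch): runs both loops on small
   arrays and returns the number of elements it verified.  */
int
run_vadd_checks (void)
{
  int16_t a16[4] = { 1, -2, 3, -4 };
  int16_t b16[4] = { 10, 20, 30, 40 };
  int16_t d16[4];
  uint32_t a32[4] = { 1u, 2u, 3u, 0xffffffffu };
  uint32_t b32[4] = { 1u, 1u, 1u, 1u };
  uint32_t d32[4];
  int checked = 0;

  vadd_int16_t (d16, a16, b16, 4);
  vadd_uint32_t (d32, a32, b32, 4);

  for (int i = 0; i < 4; i++)
    {
      assert (d16[i] == (int16_t) (a16[i] + b16[i]));
      assert (d32[i] == (uint32_t) (a32[i] + b32[i]));
      checked += 2;
    }

  /* Unsigned addition wraps modulo 2^32.  */
  assert (d32[3] == 0u);
  return checked;
}
```

Calling run_vadd_checks () from a small main on any host is enough to confirm the loop semantics. The instruction counts themselves can only be checked by building with the -march and -mabi options given in each test's dg-additional-options line.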