From patchwork Fri Mar  3 04:52:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65949
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 107C03850874
	for <patchwork@sourceware.org>; Fri,  3 Mar 2023 04:53:01 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-qt1-x833.google.com (mail-qt1-x833.google.com
 [IPv6:2607:f8b0:4864:20::833])
 by sourceware.org (Postfix) with ESMTPS id 43BF93858CDB
 for <gcc-patches@gcc.gnu.org>; Fri,  3 Mar 2023 04:52:44 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 43BF93858CDB
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-qt1-x833.google.com with SMTP id h19so1722626qtk.7
 for <gcc-patches@gcc.gnu.org>; Thu, 02 Mar 2023 20:52:44 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:from:to:cc:subject:date
 :message-id:reply-to;
 bh=eknxH/T0RuOkYWLSr2kWkCtooA5JrT2JlcaWRYvHlmM=;
 b=lW5pThQsGROCrDkbt5azrlWGNAD4bh+9KbS/sBsjjA5aKRJRS8kInYGo+57GMKtb0a
 d/fPfqFC3NMhJGw7vMvCAU7m8Ip8GLfTqKRP6Jxs7czJlUQvn75GQ/Bp1tMn7vDVwijw
 rU+rj7CT8Q6Lz17f2uU94gzhpKk1NRSpmQ3YlcGufVmSK4utt3RAbeF1g2OKQjsM1xkL
 tAagMv/G9EXTxm+D913tRqQg6H2WckBWFw/TqSkYoh0HRByJumle+Cz/GhPuS+aCsye1
 DB4RGt1Mn4fI/80509w2B6dRTyjsJZ3l0JtXPQidkMNldGJyBalg4QrDEHYZygNbpdLG
 ++AA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:x-gm-message-state:from:to
 :cc:subject:date:message-id:reply-to;
 bh=eknxH/T0RuOkYWLSr2kWkCtooA5JrT2JlcaWRYvHlmM=;
 b=oYFgseOg2TV0+Wc/Kd/ZdJIqXhw4sueaIdAmn2xrAOnjGYwpqq3sn2pX/8LFzTnRkb
 imlx09iLFfteeg9iQ+rpjatxN9F2lv3nIqP73VMi1p4o1DO4gTJNWj9SIR4wWgOo7ujY
 Z+7AtwLMJnDdxhdVYl23LwZ1M4dbv5AC2zeNyZ3+9HGt+zUjT2U2krGJWItUavjrxsDf
 ZkcdSoo5HtG3jTIt/yV9EYnFH5deGY1y3tpj1CiWtoGrLCcFBYChU+4byvO6xpYlkyYZ
 PFb/cGtAPjQKAN4S9uTjgEItiYCdNnd0J0wTnBgIWhrRCbfb2sgDVuSppvUWP4a+Msgb
 F8jA==
X-Gm-Message-State: AO0yUKXb2Tebuldz7QFmu3Acwgpzxk5reJ8LZDVZpkFKQ/n0j20Fib/U
 lOhe6NgHhPb1/Z5wF6Uiw6AEYluGEWzBBIZ0
X-Google-Smtp-Source: 
 AK7set9qIfJssHHziP8yY8vkSqFnDGIS+wsDuo9+QnZJ/ZODfPEwnYGA7SuGto8Rqbv7kNSubgQ4CQ==
X-Received: by 2002:a05:622a:1a01:b0:3b9:bc8c:c1f6 with SMTP id
 f1-20020a05622a1a0100b003b9bc8cc1f6mr7649399qtb.1.1677819163403;
 Thu, 02 Mar 2023 20:52:43 -0800 (PST)
Received: from [192.168.86.117] ([136.57.172.92])
 by smtp.gmail.com with ESMTPSA id
 l19-20020a05620a211300b007423c122457sm1009039qkl.63.2023.03.02.20.52.43
 for <gcc-patches@gcc.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 02 Mar 2023 20:52:43 -0800 (PST)
Message-ID: <c0790eac-0656-ed9c-5426-9e83d786ff30@rivosinc.com>
Date: Thu, 2 Mar 2023 23:52:42 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 01/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds foundational support in the form of:

1. New predicates

2. New function prototypes

3. Export of emit_vlmax_vsetvl to global scope

4. A new command-line option, -mriscv-vector-lmul=

gcc/ChangeLog:

     * config/riscv/riscv-protos.h (riscv_classify_vlmul_field):
     New external declaration.
     (riscv_vector_preferred_simd_mode): Ditto.
     (riscv_tuple_mode_p): Ditto.
     (riscv_vector_mask_mode_p): Ditto.
     (riscv_classify_nf): Ditto.
     (riscv_vlmul_regsize): Ditto.
     (riscv_vector_preferred_simd_mode): Ditto.
     (riscv_vector_get_mask_mode): Ditto.
     (emit_vlmax_vsetvl): Ditto.
     (get_mask_policy_no_pred): Ditto.
     (get_tail_policy_no_pred): Ditto.
     * config/riscv/riscv-opts.h (riscv_vector_bits_enum): New enum.
     (riscv_vector_lmul_enum): Ditto.
     (vlmul_field_enum): Ditto.
     * config/riscv/riscv-v.cc (emit_vlmax_vsetvl):
     Remove static scope.
     * config/riscv/riscv.opt (riscv_vector_lmul):
     New option -mriscv-vector-lmul=.
     * config/riscv/predicates.md (p_reg_or_const_csr_operand):
     New predicate.
     (vector_reg_or_const_dup_operand): Ditto.
---
  gcc/config/riscv/predicates.md  | 13 +++++++++++
  gcc/config/riscv/riscv-opts.h   | 40 +++++++++++++++++++++++++++++++++
  gcc/config/riscv/riscv-protos.h | 16 +++++++++++++
  gcc/config/riscv/riscv-v.cc     |  2 +-
  gcc/config/riscv/riscv.opt      | 20 +++++++++++++++++
  5 files changed, 90 insertions(+), 1 deletion(-)
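
For reference, the relationship between the -mriscv-vector-lmul= values and
the vlmul field encoding added to riscv-opts.h can be sketched as standalone
C++. The two enums are copied from the hunk below; lmul_ratio, decode_vlmul
and encode_lmul are hypothetical names used only for illustration, and this
is not the implementation of riscv_classify_vlmul_field, which is not part
of this patch.

```cpp
#include <cassert>

// Copied from the riscv-opts.h hunk in this patch.
enum riscv_vector_lmul_enum
{
  RVV_LMUL1 = 1,
  RVV_LMUL2 = 2,
  RVV_LMUL4 = 4,
  RVV_LMUL8 = 8
};

enum vlmul_field_enum
{
  VLMUL_FIELD_000, /* LMUL = 1 */
  VLMUL_FIELD_001, /* LMUL = 2 */
  VLMUL_FIELD_010, /* LMUL = 4 */
  VLMUL_FIELD_011, /* LMUL = 8 */
  VLMUL_FIELD_100, /* RESERVED */
  VLMUL_FIELD_101, /* LMUL = 1/8 */
  VLMUL_FIELD_110, /* LMUL = 1/4 */
  VLMUL_FIELD_111, /* LMUL = 1/2 */
  MAX_VLMUL_FIELD
};

// Hypothetical helper type: LMUL expressed as the fraction num/den.
struct lmul_ratio
{
  unsigned num;
  unsigned den;
};

// Hypothetical: decode a vlmul field.  {0, 0} marks the reserved encoding.
inline lmul_ratio
decode_vlmul (vlmul_field_enum field)
{
  switch (field)
    {
    case VLMUL_FIELD_000: return {1, 1};
    case VLMUL_FIELD_001: return {2, 1};
    case VLMUL_FIELD_010: return {4, 1};
    case VLMUL_FIELD_011: return {8, 1};
    case VLMUL_FIELD_101: return {1, 8};
    case VLMUL_FIELD_110: return {1, 4};
    case VLMUL_FIELD_111: return {1, 2};
    default: return {0, 0};
    }
}

// Hypothetical: map a -mriscv-vector-lmul= value to its vlmul field.
// Only the integral LMUL values are selectable through the option.
inline vlmul_field_enum
encode_lmul (riscv_vector_lmul_enum lmul)
{
  switch (lmul)
    {
    case RVV_LMUL2: return VLMUL_FIELD_001;
    case RVV_LMUL4: return VLMUL_FIELD_010;
    case RVV_LMUL8: return VLMUL_FIELD_011;
    default: return VLMUL_FIELD_000;
    }
}
```

Under these assumptions, decode_vlmul (encode_lmul (RVV_LMUL4)) yields 4/1,
and the fractional encodings (1/8, 1/4, 1/2) are reachable only through the
field enum, not through the new option.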

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 7bc7c0b4f4d..31517ae4606 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -264,6 +264,14 @@
  })

  ;; Predicates for the V extension.
+(define_special_predicate "p_reg_or_const_csr_operand"
+  (match_code "reg, subreg, const_int")
+{
+  if (CONST_INT_P (op))
+    return satisfies_constraint_K (op);
+  return GET_MODE (op) == Pmode;
+})
+
  (define_special_predicate "vector_length_operand"
    (ior (match_operand 0 "pmode_register_operand")
         (match_operand 0 "const_csr_operand")))
@@ -287,6 +295,11 @@
    (ior (match_operand 0 "register_operand")
         (match_test "op == CONSTM1_RTX (GET_MODE (op))")))

+(define_predicate "vector_reg_or_const_dup_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_test "const_vec_duplicate_p (op)
+          && !CONST_POLY_INT_P (CONST_VECTOR_ELT (op, 0))")))
+
  (define_predicate "vector_mask_operand"
    (ior (match_operand 0 "register_operand")
         (match_operand 0 "vector_all_trues_mask_operand")))
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index ff398c0a2ae..2057a14e153 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -67,6 +67,46 @@ enum stack_protector_guard {
    SSP_GLOBAL            /* global canary */
  };

+/* RVV vector register sizes.  */
+enum riscv_vector_bits_enum
+{
+  RVV_SCALABLE,
+  RVV_NOT_IMPLEMENTED = RVV_SCALABLE,
+  RVV_64 = 64,
+  RVV_128 = 128,
+  RVV_256 = 256,
+  RVV_512 = 512,
+  RVV_1024 = 1024,
+  RVV_2048 = 2048,
+  RVV_4096 = 4096,
+  RVV_8192 = 8192,
+  RVV_16384 = 16384,
+  RVV_32768 = 32768,
+  RVV_65536 = 65536
+};
+
+/* Vectorization factor, expressed as an RVV LMUL value.  */
+enum riscv_vector_lmul_enum
+{
+  RVV_LMUL1 = 1,
+  RVV_LMUL2 = 2,
+  RVV_LMUL4 = 4,
+  RVV_LMUL8 = 8
+};
+
+enum vlmul_field_enum
+{
+  VLMUL_FIELD_000, /* LMUL = 1 */
+  VLMUL_FIELD_001, /* LMUL = 2 */
+  VLMUL_FIELD_010, /* LMUL = 4 */
+  VLMUL_FIELD_011, /* LMUL = 8 */
+  VLMUL_FIELD_100, /* RESERVED */
+  VLMUL_FIELD_101, /* LMUL = 1/8 */
+  VLMUL_FIELD_110, /* LMUL = 1/4 */
+  VLMUL_FIELD_111, /* LMUL = 1/2 */
+  MAX_VLMUL_FIELD
+};
+
  #define MASK_ZICSR    (1 << 0)
  #define MASK_ZIFENCEI (1 << 1)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 37c634eca1d..70c8dc4ce69 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -200,4 +200,19 @@ const unsigned int RISCV_BUILTIN_SHIFT = 1;
  /* Mask that selects the riscv_builtin_class part of a function code.  */
  const unsigned int RISCV_BUILTIN_CLASS = (1 << RISCV_BUILTIN_SHIFT) - 1;

+/* Routines implemented in riscv-v.cc.  */
+
+namespace riscv_vector {
+extern unsigned int riscv_classify_vlmul_field (enum machine_mode m);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf);
+extern bool riscv_tuple_mode_p (machine_mode);
+extern bool riscv_vector_mask_mode_p (machine_mode);
+extern int riscv_classify_nf (machine_mode);
+extern int riscv_vlmul_regsize (machine_mode);
+extern machine_mode riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf);
+extern opt_machine_mode riscv_vector_get_mask_mode (machine_mode mode);
+extern rtx emit_vlmax_vsetvl (machine_mode vmode);
+extern rtx get_mask_policy_no_pred ();
+extern rtx get_tail_policy_no_pred ();
+}
  #endif /* ! GCC_RISCV_PROTOS_H */
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 59c25c65cd5..58007cc16eb 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -108,7 +108,7 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
        && IN_RANGE (INTVAL (elt), minval, maxval));
  }

-static rtx
+rtx
  emit_vlmax_vsetvl (machine_mode vmode)
  {
    rtx vl = gen_reg_rtx (Pmode);
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 95535235354..27005fb0f4a 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -70,6 +70,26 @@ Enum(abi_type) String(lp64f) Value(ABI_LP64F)
  EnumValue
  Enum(abi_type) String(lp64d) Value(ABI_LP64D)

+Enum
+Name(riscv_vector_lmul) Type(enum riscv_vector_lmul_enum)
+The possible vectorization factors:
+
+EnumValue
+Enum(riscv_vector_lmul) String(1) Value(RVV_LMUL1)
+
+EnumValue
+Enum(riscv_vector_lmul) String(2) Value(RVV_LMUL2)
+
+EnumValue
+Enum(riscv_vector_lmul) String(4) Value(RVV_LMUL4)
+
+EnumValue
+Enum(riscv_vector_lmul) String(8) Value(RVV_LMUL8)
+
+mriscv-vector-lmul=
+Target RejectNegative Joined Enum(riscv_vector_lmul) Var(riscv_vector_lmul) Init(RVV_LMUL1)
+-mriscv-vector-lmul=<lmul>    Set the vf using lmul in auto-vectorization.
+
  mfdiv
  Target Mask(FDIV)

From patchwork Fri Mar  3 04:52:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65950
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 90A20384FB7A
	for <patchwork@sourceware.org>; Fri,  3 Mar 2023 04:53:12 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-qt1-x829.google.com (mail-qt1-x829.google.com
 [IPv6:2607:f8b0:4864:20::829])
 by sourceware.org (Postfix) with ESMTPS id C58B7385B50C
 for <gcc-patches@gcc.gnu.org>; Fri,  3 Mar 2023 04:52:56 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C58B7385B50C
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-qt1-x829.google.com with SMTP id s12so1695972qtq.11
 for <gcc-patches@gcc.gnu.org>; Thu, 02 Mar 2023 20:52:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:from:to:cc:subject:date
 :message-id:reply-to;
 bh=6QKW991zfW0dw9V1XZxaOyiNvFtp5nbAd0gKdfGUnE4=;
 b=KQ5vlxVmG5xRT/E6OtMURJBlWhlETGAadF+HSYBAMAyUt2wiP6pQobOOVCt1l8MgV0
 pvlCJ32/kw+TaY1XVVAnf+pWRE3mZ52wVfbHwVpcTj+LaaCOt7PU5CWOsc2teUShagXy
 IXuTVYEeS6boyV2FGAr/FcGK/Qai95+cGwfFE2knCQA40JNTi9XrIudz1aq3UsEnVdTj
 IlUYzVWlxYT2KF8GsSrjKk5pxgt2cby+bqd/R1vQAeovBAni10RqXrnoy44cJYbscHw4
 jSAqaXCnu9jkjqx5PWfm2CjsiNVb8h/MpIYQTpWdUKDEfAtKazTvCuSl/eAOXHzjXPtC
 T72Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:x-gm-message-state:from:to
 :cc:subject:date:message-id:reply-to;
 bh=6QKW991zfW0dw9V1XZxaOyiNvFtp5nbAd0gKdfGUnE4=;
 b=4oBrgdsz1nfD/ExktwekIDs/wi7o/YEKJfWHChg8b8v0tX+KMOPGpibv6ezO1pcCfR
 qWxyCTc0Ki9YmzRUdgejEdvKnJvdfVkI/5urEFOIFFUMzD0sm+9Tv2uEmGfWT0jgL7Xs
 bUXzVMzfj4xD99A7Ao1ZNB5sapOMsmEI0wP4EDR+rtYH2vhV3/3GHgpBsX4mW46cOkIK
 tkpSHBy3zFQkzxLEqXyOUHX6vAh9NB+eqaY6HsK8nOjrBled/eG8W2CVaHJxnvkZaI6Y
 qwPInhLppe3kV3wjYKscOC7mqNrwhCoE4j381Hl3K0WqNXkgkD3htWBokNyZ/KJSC+cQ
 ejDg==
X-Gm-Message-State: AO0yUKVyIwwnDzelGN2W8yCYMG723lEhExY3j5X0bvXiuREy1DNAQYQc
 GYYG/HX8BqYUDg7WFrsR0F+ZV+DgcCoF6xQL
X-Google-Smtp-Source: 
 AK7set+8Mr4Q5+WAhCGk/5fGtg4RdxVL5lUktmnRzKWrsyfINunQk9A5N3SdiWaWlFxKAfxIPFyZ5g==
X-Received: by 2002:ac8:5c90:0:b0:3b6:9c63:5ca1 with SMTP id
 r16-20020ac85c90000000b003b69c635ca1mr1101804qta.43.1677819175874;
 Thu, 02 Mar 2023 20:52:55 -0800 (PST)
Received: from [192.168.86.117] ([136.57.172.92])
 by smtp.gmail.com with ESMTPSA id
 q25-20020a37f719000000b007343fceee5fsm1044998qkj.8.2023.03.02.20.52.55
 for <gcc-patches@gcc.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 02 Mar 2023 20:52:55 -0800 (PST)
Message-ID: <a6305e96-ff71-cde6-9b91-4333489a47ed@rivosinc.com>
Date: Thu, 2 Mar 2023 23:52:55 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 02/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds foundational support by making two functions that handle
predication policies globally visible.

gcc/ChangeLog:

     * config/riscv/riscv-vector-builtins.cc (get_tail_policy_for_pred):
     Remove static declaration to make externally visible.
     (get_mask_policy_for_pred): Ditto.
     * config/riscv/riscv-vector-builtins.h (get_tail_policy_for_pred):
     New external declaration.
     (get_mask_policy_for_pred): Ditto.
---
  gcc/config/riscv/riscv-vector-builtins.cc | 4 ++--
  gcc/config/riscv/riscv-vector-builtins.h  | 2 ++
  2 files changed, 4 insertions(+), 2 deletions(-)
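
As a reading aid, the selection logic of the two exported functions can be
modelled in standalone C++. The conditions on the PRED_TYPE_* values follow
the context lines in the diff below; everything else is an assumption made
for illustration: the real functions return an rtx, while this sketch
returns a hypothetical policy enum, and the pred_type enum here is a
stand-in that lists only a subset of predication_type_index.

```cpp
#include <cassert>

// Stand-in for predication_type_index; only the names that appear in
// the diff plus two illustrative others are listed.
enum pred_type
{
  PRED_TYPE_none,
  PRED_TYPE_m,
  PRED_TYPE_tu,
  PRED_TYPE_tum,
  PRED_TYPE_tumu,
  PRED_TYPE_mu
};

// Hypothetical result type.  POLICY_DEFAULT stands for "the preferred
// default configuration" mentioned in the source comments.
enum policy
{
  POLICY_UNDISTURBED,
  POLICY_DEFAULT
};

// Tail policy: undisturbed when the predication names TU.
inline policy
tail_policy_for_pred (pred_type pred)
{
  if (pred == PRED_TYPE_tu || pred == PRED_TYPE_tum || pred == PRED_TYPE_tumu)
    return POLICY_UNDISTURBED;
  return POLICY_DEFAULT;
}

// Mask policy: undisturbed when the predication names MU.
inline policy
mask_policy_for_pred (pred_type pred)
{
  if (pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu)
    return POLICY_UNDISTURBED;
  return POLICY_DEFAULT;
}
```

In this model PRED_TYPE_tumu is the only value for which both policies are
undisturbed; PRED_TYPE_tum affects only the tail policy and PRED_TYPE_mu
only the mask policy.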

diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 2e92ece3b64..90fc73a5bcf 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -1850,7 +1850,7 @@ use_real_merge_p (enum predication_type_index pred)

  /* Get TAIL policy for predication. If predication indicates TU, return the TU.
     Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
  get_tail_policy_for_pred (enum predication_type_index pred)
  {
    if (pred == PRED_TYPE_tu || pred == PRED_TYPE_tum || pred == PRED_TYPE_tumu)
@@ -1860,7 +1860,7 @@ get_tail_policy_for_pred (enum predication_type_index pred)

  /* Get MASK policy for predication. If predication indicates MU, return the MU.
     Otherwise, return the prefer default configuration.  */
-static rtx
+rtx
  get_mask_policy_for_pred (enum predication_type_index pred)
  {
    if (pred == PRED_TYPE_tumu || pred == PRED_TYPE_mu)
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index ede08c6a480..135e2463b1e 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -433,6 +433,8 @@ extern const char *const operand_suffixes[NUM_OP_TYPES];
  extern const rvv_builtin_suffixes type_suffixes[NUM_VECTOR_TYPES + 1];
  extern const char *const predication_suffixes[NUM_PRED_TYPES];
  extern rvv_builtin_types_t builtin_types[NUM_VECTOR_TYPES + 1];
+extern rtx get_tail_policy_for_pred (enum predication_type_index pred);
+extern rtx get_mask_policy_for_pred (enum predication_type_index pred);

  inline bool
  function_instance::operator!= (const function_instance &other) const

From patchwork Fri Mar  3 04:53:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65951
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 9E04D38493D6
	for <patchwork@sourceware.org>; Fri,  3 Mar 2023 04:53:27 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-qt1-x831.google.com (mail-qt1-x831.google.com
 [IPv6:2607:f8b0:4864:20::831])
 by sourceware.org (Postfix) with ESMTPS id D43833850208
 for <gcc-patches@gcc.gnu.org>; Fri,  3 Mar 2023 04:53:04 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D43833850208
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-qt1-x831.google.com with SMTP id d7so1690945qtr.12
 for <gcc-patches@gcc.gnu.org>; Thu, 02 Mar 2023 20:53:04 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=content-language:to:subject:from:user-agent:mime-version:date
 :message-id:from:to:cc:subject:date:message-id:reply-to;
 bh=wfku3vxdP2USxInhYqICUdofN+sq3KuFayzWhnVk/1g=;
 b=Hf7UDDkK5VjEOLIwqJP/hmctwZ1FQP3rFZcTYAASExbONcDL+Y096hgNGcl6+NGHmL
 a1RxuCvh6gcPOMUq42ibdA5LmeNdvasKf5/7+ZmiNgVNnxU6fBPhWC+agP8SDHzEeN88
 OC34P6Ktdhl7bJEVXaWyYoX9+VVZfPInkXlq5/1hlXTwWAPvWEO3dDoLo7wyaGYwqlFN
 S0bM6n8uIF++VNW5UqQ6aeTJVTUxFJy5Qr1Kb9v/MGj7kwizJw8qMnfuFTZAhrHQbkn0
 QWxSBRxd+OrXfE+ILHzANhFLSyeDDNhspuenCD0pT6ghg68Tw1UQJLJ7L3dl+BaRppAe
 5auQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-language:to:subject:from:user-agent:mime-version:date
 :message-id:x-gm-message-state:from:to:cc:subject:date:message-id
 :reply-to;
 bh=wfku3vxdP2USxInhYqICUdofN+sq3KuFayzWhnVk/1g=;
 b=8AM4jKaY/30NMQQ5+1qx3hWohTa5I8aTFmH0VskzeseNgVPGpgrfrmPK3dsFNpHp3S
 ZyLUwYluXdp7/oqrc80i+Y4WFyCK/DXQNHyfuD1A0Fp4ZMH48pSZCzd4aWgxB5lz9cSu
 DBb9uOzMuLLoaqCCcvcvpavnLFO4GXz/lP1HiD+urrjkUnYQfWQ4f8bXqWqcXf5DU+71
 umCXacAYddyaBkN8XObLOeHMyOLILZDY9Peauovc3Lp/8Ot8cOuxa0gNASU/ucX8KM53
 VaT0wbMNyZGkU4xA5bQ8tabK+uCDmng4iufb/qK8kt5C3CKfCWoutO2Y+/AB1ORGODRn
 Xt6A==
X-Gm-Message-State: AO0yUKV5v+jWzC4dbTxfpztwjrNo08OgEtw/0EWo8iMi7B6qEclCz9jV
 XZ/xil30gMBw4IBc5VEROkXCePn06Oeh4OBV
X-Google-Smtp-Source: 
 AK7set+e3bDoKz7j6BKewfr/BZUNSdHi53/kcThrZqGvFCsp50Lo+WNp4MlLk++nypyapiHsLpLFjw==
X-Received: by 2002:ac8:5c4a:0:b0:3bf:b950:f684 with SMTP id
 j10-20020ac85c4a000000b003bfb950f684mr1126954qtj.53.1677819184116;
 Thu, 02 Mar 2023 20:53:04 -0800 (PST)
Received: from [192.168.86.117] ([136.57.172.92])
 by smtp.gmail.com with ESMTPSA id
 r25-20020ac87959000000b003bfc1f49ad1sm1010876qtt.87.2023.03.02.20.53.03
 for <gcc-patches@gcc.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 02 Mar 2023 20:53:03 -0800 (PST)
Message-ID: <e79c40af-4269-f950-131e-926f813b9f76@rivosinc.com>
Date: Thu, 2 Mar 2023 23:53:03 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 03/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds two new files to support the vector cost model and
modifies the Makefile fragment to build the cost model C++ file. Due to
its large size, this patch is provided as an attachment.

gcc/ChangeLog:

     * config.gcc (riscv-vector-cost.o): New object file to build.
     * config/riscv/riscv-vector-cost.cc: New file for RISC-V vector cost
     model.
     * config/riscv/riscv-vector-cost.h: New header file for RISC-V vector
     cost model.
     * config/riscv/t-riscv: Add make rule for riscv-vector-cost.o.

From eb995818cd5f77f85e8df93b690b00ce1fd1aa35 Mon Sep 17 00:00:00 2001
From: Michael Collison <collison@rivosinc.com>
Date: Thu, 2 Mar 2023 12:27:36 -0500
Subject: [PATCH] Autovectorization patch set 2

---
 gcc/config.gcc                        |   2 +-
 gcc/config/riscv/riscv-vector-cost.cc | 620 ++++++++++++++++++++++++++
 gcc/config/riscv/riscv-vector-cost.h  | 400 +++++++++++++++++
 gcc/config/riscv/t-riscv              |   5 +
 4 files changed, 1026 insertions(+), 1 deletion(-)
 create mode 100644 gcc/config/riscv/riscv-vector-cost.cc
 create mode 100644 gcc/config/riscv/riscv-vector-cost.h
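
The costing pattern that get_cost applies in several cases below (start
from the cost of one instruction, then add a per-class delta only when
optimizing for speed) can be sketched as standalone C++. This is a
simplified model, not the code in riscv-vector-cost.cc: cost_table,
op_class and insn_cost are hypothetical names, the deltas are plain
integers rather than the objects behind alu->cost and load->cost, and
costs_n_insns reproduces GCC's COSTS_N_INSNS (N), which rtl.h defines as
N * 4.

```cpp
#include <cassert>

// Stand-in for COSTS_N_INSNS from GCC's rtl.h.
constexpr int
costs_n_insns (int n)
{
  return n * 4;
}

// Hypothetical flat table of per-class deltas.
struct cost_table
{
  int alu;
  int load;
};

// Hypothetical operation classes for the sketch.
enum op_class
{
  OP_ALU,
  OP_LOAD
};

// Base cost of one instruction; the class delta is added only when
// optimizing for speed, as in the MEM and IOR cases of get_cost.
inline int
insn_cost (const cost_table &table, op_class cls, bool speed)
{
  int cost = costs_n_insns (1);
  if (!speed)
    return cost;
  switch (cls)
    {
    case OP_ALU:
      cost += table.alu;
      break;
    case OP_LOAD:
      cost += table.load;
      break;
    }
  return cost;
}
```

With a table of {2, 6}, an ALU operation costs 4 when optimizing for size
and 6 when optimizing for speed, and a load costs 10 when optimizing for
speed.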

diff --git a/gcc/config.gcc b/gcc/config.gcc
index c070e6ecd2e..a4017777187 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -530,7 +530,7 @@ pru-*-*)
 riscv*)
 	cpu_type=riscv
 	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
-	extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
+	extra_objs="${extra_objs} riscv-vector-cost.o riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
 	d_target_objs="riscv-d.o"
 	extra_headers="riscv_vector.h"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/riscv/riscv-vector-builtins.cc"
diff --git a/gcc/config/riscv/riscv-vector-cost.cc b/gcc/config/riscv/riscv-vector-cost.cc
new file mode 100644
index 00000000000..5a33b20843a
--- /dev/null
+++ b/gcc/config/riscv/riscv-vector-cost.cc
@@ -0,0 +1,620 @@
+/* Cost model implementation for RISC-V 'V' Extension for GNU compiler.
+   Copyright (C) 2022-2023 Free Software Foundation, Inc.
+   Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#define INCLUDE_STRING
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "backend.h"
+#include "rtl.h"
+#include "regs.h"
+#include "insn-config.h"
+#include "insn-attr.h"
+#include "recog.h"
+#include "rtlanal.h"
+#include "output.h"
+#include "alias.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "varasm.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "function.h"
+#include "explow.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
+#include "reload.h"
+#include "tm_p.h"
+#include "target.h"
+#include "basic-block.h"
+#include "expr.h"
+#include "optabs.h"
+#include "bitmap.h"
+#include "df.h"
+#include "diagnostic.h"
+#include "builtins.h"
+#include "predict.h"
+#include "tree-pass.h"
+#include "opts.h"
+#include "langhooks.h"
+#include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "tree-vectorizer.h"
+#include "tree-ssa-loop-niter.h"
+#include "riscv-vector-builtins.h"
+
+/* This file should be included last.  */
+#include "riscv-vector-cost.h"
+#include "target-def.h"
+
+bool vector_insn_cost_table::get_cost(rtx x, machine_mode mode, int *cost,
+                                      bool speed) const {
+  rtx op0, op1, op2;
+  enum rtx_code code = GET_CODE(x);
+  scalar_int_mode int_mode;
+
+  /* By default, assume that everything has equivalent cost to the
+     cheapest instruction.  Any additional costs are applied as a delta
+     above this default.  */
+  *cost = COSTS_N_INSNS(1);
+
+  switch (code) {
+  case SET:
+    /* The cost depends entirely on the operands to SET.  */
+    *cost = 0;
+    op0 = SET_DEST(x);
+    op1 = SET_SRC(x);
+
+    switch (GET_CODE(op0)) {
+    case MEM:
+      if (speed) {
+        *cost += store->cost(x, mode);
+      }
+
+      //*cost += rtx_cost(op1, mode, SET, 1, speed);
+      return true;
+
+    case SUBREG:
+      if (!REG_P(SUBREG_REG(op0)))
+        *cost += rtx_cost(SUBREG_REG(op0), VOIDmode, SET, 0, speed);
+
+      /* Fall through.  */
+    case REG:
+      /* The cost is one per vector-register copied.  */
+      if (VECTOR_MODE_P(GET_MODE(op0))) {
+        *cost = mov->cost(x, mode);
+      } else
+        /* Cost is just the cost of the RHS of the set.  */
+        *cost += rtx_cost(op1, mode, SET, 1, speed);
+      return true;
+
+    case ZERO_EXTRACT:
+    case SIGN_EXTRACT:
+      /* Bit-field insertion.  Strip any redundant widening of
+         the RHS to meet the width of the target.  */
+      if (SUBREG_P(op1))
+        op1 = SUBREG_REG(op1);
+      if ((GET_CODE(op1) == ZERO_EXTEND || GET_CODE(op1) == SIGN_EXTEND) &&
+          CONST_INT_P(XEXP(op0, 1)) &&
+          is_a<scalar_int_mode>(GET_MODE(XEXP(op1, 0)), &int_mode) &&
+          GET_MODE_BITSIZE(int_mode) >= INTVAL(XEXP(op0, 1)))
+        op1 = XEXP(op1, 0);
+
+      if (CONST_INT_P(op1)) {
+        /* MOV immediate is assumed to always be cheap.  */
+        *cost = COSTS_N_INSNS(1);
+      } else {
+        /* BFM.  */
+        if (speed)
+          *cost += alu->cost(x, mode);
+        *cost += rtx_cost(op1, VOIDmode, (enum rtx_code)code, 1, speed);
+      }
+
+      return true;
+
+    default:
+      /* We can't make sense of this, assume default cost.  */
+      *cost = COSTS_N_INSNS(1);
+      return false;
+    }
+    return false;
+
+  case MEM:
+    if (speed) {
+      *cost += load->cost(x, mode);
+    }
+
+    return true;
+
+  case NEG:
+    op0 = XEXP(x, 0);
+
+    if (speed) {
+      /* FNEG.  */
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+    if (GET_MODE_CLASS(mode) == MODE_INT) {
+      if (GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMPARE ||
+          GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMM_COMPARE) {
+        /* CSETM.  */
+        *cost += rtx_cost(XEXP(op0, 0), VOIDmode, NEG, 0, speed);
+        return true;
+      }
+
+      /* Cost this as SUB wzr, X.  */
+      op0 = CONST0_RTX(mode);
+      op1 = XEXP(x, 0);
+      goto cost_minus;
+    }
+    return false;
+
+  case COMPARE:
+    op0 = XEXP(x, 0);
+    op1 = XEXP(x, 1);
+
+    if (op1 == const0_rtx && GET_CODE(op0) == AND) {
+      x = op0;
+      mode = GET_MODE(op0);
+      goto cost_logic;
+    }
+
+    if (GET_MODE_CLASS(GET_MODE(op0)) == MODE_INT) {
+      /* TODO: A write to the CC flags possibly costs extra, this
+	 needs encoding in the cost tables.  */
+
+      mode = GET_MODE(op0);
+      /* ANDS.  */
+      if (GET_CODE(op0) == AND) {
+        x = op0;
+        goto cost_logic;
+      }
+
+      if (GET_CODE(op0) == PLUS) {
+        /* ADDS (and CMN alias).  */
+        x = op0;
+        goto cost_plus;
+      }
+
+      if (GET_CODE(op0) == MINUS) {
+        /* SUBS.  */
+        x = op0;
+        goto cost_minus;
+      }
+
+      if (GET_CODE(op0) == ZERO_EXTRACT && op1 == const0_rtx &&
+          CONST_INT_P(XEXP(op0, 1)) && CONST_INT_P(XEXP(op0, 2))) {
+        /* COMPARE of ZERO_EXTRACT form of TST-immediate.
+	   Handle it here directly rather than going to cost_logic
+	   since we know the immediate generated for the TST is valid
+	   so we can avoid creating an intermediate rtx for it only
+	   for costing purposes.  */
+        if (speed)
+          *cost += alu->cost(x, mode);
+
+        *cost += rtx_cost(XEXP(op0, 0), GET_MODE(op0), ZERO_EXTRACT, 0, speed);
+        return true;
+      }
+
+      if (GET_CODE(op1) == NEG) {
+        /* CMN.  */
+        if (speed)
+          *cost += alu->cost(x, mode);
+
+        *cost += rtx_cost(op0, mode, COMPARE, 0, speed);
+        *cost += rtx_cost(XEXP(op1, 0), mode, NEG, 1, speed);
+        return true;
+      }
+
+      /* CMP.
+
+	 Compare can freely swap the order of operands, and
+         canonicalization puts the more complex operation first.
+         But the integer MINUS logic expects the shift/extend
+         operation in op1.  */
+      if (!(REG_P(op0) || (SUBREG_P(op0) && REG_P(SUBREG_REG(op0))))) {
+        op0 = XEXP(x, 1);
+        op1 = XEXP(x, 0);
+      }
+      goto cost_minus;
+    }
+
+    if (VECTOR_MODE_P(mode)) {
+      /* Vector compare.  */
+      if (speed)
+        *cost += alu->cost(x, mode);
+
+      return false;
+    }
+    return false;
+
+  case MINUS: {
+    op0 = XEXP(x, 0);
+    op1 = XEXP(x, 1);
+
+    cost_minus:
+    *cost += rtx_cost(op0, mode, MINUS, 0, speed);
+
+    return true;
+  }
+
+  case PLUS: {
+    op0 = XEXP(x, 0);
+    op1 = XEXP(x, 1);
+
+    cost_plus:
+    if (GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMPARE ||
+        GET_RTX_CLASS(GET_CODE(op0)) == RTX_COMM_COMPARE) {
+      /* CSINC.  */
+      *cost += rtx_cost(XEXP(op0, 0), mode, PLUS, 0, speed);
+      *cost += rtx_cost(op1, mode, PLUS, 1, speed);
+      return true;
+    }
+
+    *cost += rtx_cost(op1, mode, PLUS, 1, speed);
+
+    return true;
+  }
+
+  case BSWAP:
+    *cost = COSTS_N_INSNS(1);
+
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+  case IOR:
+    *cost = COSTS_N_INSNS(1);
+
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+    return true;
+
+  case XOR:
+  case AND:
+  cost_logic:
+    if (speed)
+      *cost += alu->cost(x, mode);
+    return true;
+
+  case NOT:
+    *cost += alu->cost(x, mode);
+    return false;
+
+  case ZERO_EXTEND:
+
+    op0 = XEXP(x, 0);
+    /* If a value is written in SI mode, then zero extended to DI
+       mode, the operation will in general be free as a write to
+       a 'w' register implicitly zeroes the upper bits of an 'x'
+       register.  However, if this is
+
+       (set (reg) (zero_extend (reg)))
+
+       we must cost the explicit register move.  */
+    if (mode == DImode && GET_MODE(op0) == SImode) {
+      int op_cost = rtx_cost(op0, VOIDmode, ZERO_EXTEND, 0, speed);
+
+      /* If OP_COST is non-zero, then the cost of the zero extend
+         is effectively the cost of the inner operation.  Otherwise
+         we have a MOV instruction and we take the cost from the MOV
+         itself.  This is true independently of whether we are
+         optimizing for space or time.  */
+      if (op_cost)
+        *cost = op_cost;
+
+      return true;
+    } else if (MEM_P(op0)) {
+      /* All loads can zero extend to any size for free.  */
+      *cost = rtx_cost(op0, VOIDmode, ZERO_EXTEND, 0, speed);
+      return true;
+    }
+
+    if (speed) {
+      /* UMOV.  */
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+  case SIGN_EXTEND:
+    if (MEM_P(XEXP(x, 0))) {
+      if (speed) {
+        *cost += load->cost(x, mode);
+      }
+      return true;
+    }
+
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+  case ASHIFT:
+    op0 = XEXP(x, 0);
+    op1 = XEXP(x, 1);
+
+    if (CONST_INT_P(op1)) {
+      if (speed) {
+        *cost += alu->cost(x, mode);
+      }
+
+      /* We can incorporate zero/sign extend for free.  */
+      if (GET_CODE(op0) == ZERO_EXTEND || GET_CODE(op0) == SIGN_EXTEND)
+        op0 = XEXP(op0, 0);
+
+      *cost += rtx_cost(op0, VOIDmode, ASHIFT, 0, speed);
+      return true;
+    } else {
+      if (speed)
+        /* Vector shift (register).  */
+        *cost += alu->cost(x, mode);
+      return false; /* All arguments need to be in registers.  */
+    }
+
+  case ROTATE:
+  case ROTATERT:
+  case LSHIFTRT:
+  case ASHIFTRT:
+    op0 = XEXP(x, 0);
+    op1 = XEXP(x, 1);
+
+    if (CONST_INT_P(op1)) {
+      /* ASR (immediate) and friends.  */
+      if (speed) {
+        *cost += alu->cost(x, mode);
+      }
+
+      *cost += rtx_cost(op0, mode, (enum rtx_code)code, 0, speed);
+      return true;
+    } else {
+      if (VECTOR_MODE_P(mode)) {
+        if (speed)
+          /* Vector shift (register).  */
+          *cost += alu->cost(x, mode);
+      }
+      return false; /* All arguments need to be in registers.  */
+    }
+
+  case SYMBOL_REF:
+    return true;
+
+  case HIGH:
+  case LO_SUM:
+    /* ADRP/ADD (immediate).  */
+    if (speed)
+      *cost += alu->cost(x, mode);
+    return true;
+
+  case ZERO_EXTRACT:
+  case SIGN_EXTRACT:
+    /* UBFX/SBFX.  */
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+
+    /* We can trust that the immediates used will be correct (there
+       are no by-register forms), so we need only cost op0.  */
+    *cost += rtx_cost(XEXP(x, 0), VOIDmode, (enum rtx_code)code, 0, speed);
+    return true;
+
+  case MULT:
+    *cost += mult->cost(x, mode);
+    return true;
+
+  case MOD:
+  case UMOD:
+    if (speed) {
+      /* Slightly prefer UMOD over SMOD.  */
+      *cost += alu->cost(x, mode);
+    }
+    return false; /* All arguments need to be in registers.  */
+
+  case DIV:
+  case UDIV:
+  case SQRT:
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+    return false; /* All arguments need to be in registers.  */
+
+  case IF_THEN_ELSE:
+    if (speed) {
+      *cost += if_then_else->cost(x, mode);
+    }
+    return true;
+
+  case EQ:
+  case NE:
+  case GT:
+  case GTU:
+  case LT:
+  case LTU:
+  case GE:
+  case GEU:
+  case LE:
+  case LEU:
+
+    return false; /* All arguments must be in registers.  */
+
+  case FMA:
+    op0 = XEXP(x, 0);
+    op1 = XEXP(x, 1);
+    op2 = XEXP(x, 2);
+
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+
+    /* FMSUB, FNMADD, and FNMSUB are free.  */
+    if (GET_CODE(op0) == NEG)
+      op0 = XEXP(op0, 0);
+
+    if (GET_CODE(op2) == NEG)
+      op2 = XEXP(op2, 0);
+
+    /* aarch64_fnma4_elt_to_64v2df has the NEG as operand 1,
+       and the by-element operand as operand 0.  */
+    if (GET_CODE(op1) == NEG)
+      op1 = XEXP(op1, 0);
+
+    /* Catch vector-by-element operations.  The by-element operand can
+       either be (vec_duplicate (vec_select (x))) or just
+       (vec_select (x)), depending on whether we are multiplying by
+       a vector or a scalar.
+
+       Canonicalization is not very good in these cases, FMA4 will put the
+       by-element operand as operand 0, FNMA4 will have it as operand 1.  */
+    if (GET_CODE(op0) == VEC_DUPLICATE)
+      op0 = XEXP(op0, 0);
+    else if (GET_CODE(op1) == VEC_DUPLICATE)
+      op1 = XEXP(op1, 0);
+
+    if (GET_CODE(op0) == VEC_SELECT)
+      op0 = XEXP(op0, 0);
+    else if (GET_CODE(op1) == VEC_SELECT)
+      op1 = XEXP(op1, 0);
+
+    /* If the remaining parameters are not registers,
+       get the cost to put them into registers.  */
+    *cost += rtx_cost(op0, mode, FMA, 0, speed);
+    *cost += rtx_cost(op1, mode, FMA, 1, speed);
+    *cost += rtx_cost(op2, mode, FMA, 2, speed);
+    return true;
+
+  case FLOAT:
+  case UNSIGNED_FLOAT:
+    return false;
+
+  case FLOAT_EXTEND:
+    if (speed) {
+      /* Vector conversion (widening).  */
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+  case FLOAT_TRUNCATE:
+    if (speed) {
+      /* Vector conversion (narrowing).  */
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+  case FIX:
+  case UNSIGNED_FIX:
+    x = XEXP(x, 0);
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+
+    *cost += rtx_cost(x, VOIDmode, (enum rtx_code)code, 0, speed);
+    return true;
+
+  case ABS:
+    /* ABS (vector).  */
+    if (speed)
+      *cost += alu->cost(x, mode);
+    return false;
+
+  case SMAX:
+  case SMIN:
+    if (speed) {
+      *cost += alu->cost(x, mode);
+    }
+    return false;
+
+  case UNSPEC:
+    break;
+
+  case TRUNCATE:
+    break;
+  case CONST_VECTOR: {
+    *cost = mov->cost(x, mode);
+    break;
+  }
+  case VEC_CONCAT:
+    /* Depending on the operation, either DUP or INS.
+       For now, keep default costing.  */
+    break;
+  case VEC_DUPLICATE:
+    /* Load using a DUP.  */
+    *cost = dup->cost(x, mode);
+    return false;
+  case VEC_SELECT: {
+    rtx op0 = XEXP(x, 0);
+    *cost = rtx_cost(op0, GET_MODE(op0), VEC_SELECT, 0, speed);
+
+    /* Low-part select is free, high-part is a DUP, otherwise an extract.  */
+    rtx op1 = XEXP(x, 1);
+    if (vec_series_lowpart_p(mode, GET_MODE(op1), op1))
+      ;
+    else if (vec_series_highpart_p(mode, GET_MODE(op1), op1))
+      *cost = dup->cost(x, mode);
+    else
+      *cost = extract->cost(x, mode);
+    return true;
+  }
+  default:
+    break;
+  }
+
+  if (dump_file)
+    fprintf(dump_file, "\nFailed to cost RTX.  Assuming default cost.\n");
+
+  return true;
+}
+
+extern int riscv_builtin_vectorization_cost (enum vect_cost_for_stmt, tree, int);
+
+riscv_vector_costs::riscv_vector_costs(vec_info *vinfo, bool costing_for_scalar)
+  : vector_costs(vinfo, costing_for_scalar) {}
+
+unsigned riscv_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
+					    stmt_vec_info stmt_info, slp_tree,
+					    tree vectype, int misalign,
+					    vect_cost_model_location where) {
+  int stmt_cost
+    = riscv_builtin_vectorization_cost (kind, vectype, misalign);
+  return record_stmt_cost(stmt_info, where, count * stmt_cost);
+}
+
+void riscv_vector_costs::finish_cost(const vector_costs *uncast_scalar_costs) {
+  auto *scalar_costs =
+    static_cast<const riscv_vector_costs *>(uncast_scalar_costs);
+  loop_vec_info loop_vinfo = dyn_cast<loop_vec_info>(m_vinfo);
+  if (loop_vinfo)
+    m_costs[vect_body] = 1;
+  vector_costs::finish_cost(scalar_costs);
+}
+
+bool riscv_vector_costs::better_main_loop_than_p(
+						 const vector_costs *uncast_other) const {
+  auto other = static_cast<const riscv_vector_costs *>(uncast_other);
+
+  return vector_costs::better_main_loop_than_p(other);
+}
diff --git a/gcc/config/riscv/riscv-vector-cost.h b/gcc/config/riscv/riscv-vector-cost.h
new file mode 100644
index 00000000000..ef398915a18
--- /dev/null
+++ b/gcc/config/riscv/riscv-vector-cost.h
@@ -0,0 +1,400 @@
+/* Cost model definitions for RISC-V 'V' Extension for GNU compiler.
+   Copyright (C) 2022-2023 Free Software Foundation, Inc.
+   Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_RISCV_VECTOR_COST_H
+#define GCC_RISCV_VECTOR_COST_H
+
+enum vector_tune_type {
+  VECTOR_TUNE_GENERIC,
+};
+
+struct vector_insn_scale_table {
+  const int load;
+  const int store;
+  const int alu;
+  const int mult;
+  const int mov;
+  const int dup;
+  const int extract;
+  const int if_then_else;
+};
+
+struct vector_stmt_scale_table {
+  const int scalar_int_stmt_cost;       /* Cost of any int scalar operation,
+                                         excluding load and store.  */
+  const int scalar_fp_stmt_cost;        /* Cost of any fp scalar operation,
+                                         excluding load and store.  */
+  const int scalar_load_cost;           /* Cost of scalar load.  */
+  const int scalar_store_cost;          /* Cost of scalar store.  */
+  const int vec_int_stmt_cost;          /* Cost of any int vector operation,
+                                         excluding load, store, permute,
+                                         vector-to-scalar and
+                                         scalar-to-vector operation.  */
+  const int vec_fp_stmt_cost;           /* Cost of any fp vector operation,
+                                         excluding load, store, permute,
+                                         vector-to-scalar and
+                                         scalar-to-vector operation.  */
+  const int vec_permute_cost;           /* Cost of permute operation.  */
+  const int vec_to_scalar_cost;         /* Cost of vec-to-scalar operation.  */
+  const int scalar_to_vec_cost;         /* Cost of scalar-to-vector
+                                         operation.  */
+  const int vec_align_load_cost;        /* Cost of aligned vector load.  */
+  const int vec_unalign_load_cost;      /* Cost of unaligned vector load.  */
+  const int vec_unalign_store_cost;     /* Cost of unaligned vector store.  */
+  const int vec_store_cost;             /* Cost of vector store.  */
+  const int cond_taken_branch_cost;     /* Cost of taken branch.  */
+  const int cond_not_taken_branch_cost; /* Cost of not taken branch.  */
+};
+
+/* Information about vector code that we're in the process of costing.  */
+class riscv_vector_costs : public vector_costs {
+public:
+  riscv_vector_costs(vec_info *, bool);
+
+  unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind,
+			      stmt_vec_info stmt_info, slp_tree, tree vectype,
+			      int misalign,
+			      vect_cost_model_location where) override;
+  void finish_cost(const vector_costs *) override;
+  bool better_main_loop_than_p(const vector_costs *other) const override;
+};
+
+template <typename T> class vector_insn_cost {
+public:
+  vector_insn_cost(const T *_scale_table) : m_scale_table(_scale_table) {}
+  virtual ~vector_insn_cost() {}
+
+  virtual int scale(RTX_CODE) const { return 1; }
+
+  virtual unsigned cost(rtx x, machine_mode mode) const {
+    return riscv_vector::riscv_classify_nf(mode) * riscv_vector::riscv_vlmul_regsize(mode) *
+           scale(x == NULL_RTX ? UNKNOWN : GET_CODE(x));
+  }
+
+protected:
+  const T *m_scale_table;
+};
+
+template <typename T> class vector_cost_table {
+public:
+  vector_cost_table(const T *) {}
+  virtual ~vector_cost_table() {}
+
+  virtual bool get_cost(rtx, machine_mode, int *, bool) const { return true; }
+};
+
+class vector_alu_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->alu; }
+};
+
+class vector_load_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->load; }
+};
+
+class vector_store_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->store; }
+};
+
+class vector_mult_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->mult; }
+};
+
+class vector_mov_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->mov; }
+};
+
+class vector_dup_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->dup; }
+};
+
+class vector_extract_cost : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override { return m_scale_table->extract; }
+};
+
+class vector_if_then_else_cost
+    : public vector_insn_cost<vector_insn_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->if_then_else;
+  }
+};
+
+class vector_insn_cost_table
+    : public vector_cost_table<vector_insn_scale_table> {
+public:
+  vector_insn_cost_table(const vector_insn_scale_table *_scale_table)
+      : vector_cost_table(_scale_table) {
+    load = new vector_load_cost(_scale_table);
+    store = new vector_store_cost(_scale_table);
+    alu = new vector_alu_cost(_scale_table);
+    mult = new vector_mult_cost(_scale_table);
+    mov = new vector_mov_cost(_scale_table);
+    dup = new vector_dup_cost(_scale_table);
+    extract = new vector_extract_cost(_scale_table);
+    if_then_else = new vector_if_then_else_cost(_scale_table);
+  }
+
+  bool get_cost(rtx, machine_mode, int *, bool) const override;
+
+public:
+  const vector_insn_cost<vector_insn_scale_table> *load;
+  const vector_insn_cost<vector_insn_scale_table> *store;
+  const vector_insn_cost<vector_insn_scale_table> *alu;
+  const vector_insn_cost<vector_insn_scale_table> *mult;
+  const vector_insn_cost<vector_insn_scale_table> *mov;
+  const vector_insn_cost<vector_insn_scale_table> *dup;
+  const vector_insn_cost<vector_insn_scale_table> *extract;
+  const vector_insn_cost<vector_insn_scale_table> *if_then_else;
+};
+
+// ==================== vector stmt cost=========================
+class vector_scalar_int_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->scalar_int_stmt_cost;
+  }
+};
+
+class vector_scalar_fp_cost : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->scalar_fp_stmt_cost;
+  }
+};
+
+class vector_scalar_load_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->scalar_load_cost;
+  }
+};
+
+class vector_scalar_store_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->scalar_store_cost;
+  }
+};
+
+class vector_vec_int_cost : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_int_stmt_cost;
+  }
+};
+
+class vector_vec_fp_cost : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_fp_stmt_cost;
+  }
+};
+
+class vector_vec_permute_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_permute_cost;
+  }
+};
+
+class vector_vec_to_scalar_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_to_scalar_cost;
+  }
+};
+
+class vector_scalar_to_vec_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->scalar_to_vec_cost;
+  }
+};
+
+class vector_vec_align_load_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_align_load_cost;
+  }
+};
+
+class vector_vec_unalign_load_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_unalign_load_cost;
+  }
+};
+
+class vector_vec_unalign_store_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_unalign_store_cost;
+  }
+};
+
+class vector_vec_store_cost : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->vec_store_cost;
+  }
+};
+
+class vector_cond_taken_branch_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->cond_taken_branch_cost;
+  }
+};
+
+class vector_cond_not_taken_branch_cost
+    : public vector_insn_cost<vector_stmt_scale_table> {
+public:
+  // use the same construction function as the vector_insn_cost
+  using vector_insn_cost::vector_insn_cost;
+
+  int scale(RTX_CODE) const override {
+    return m_scale_table->cond_not_taken_branch_cost;
+  }
+};
+
+class vector_stmt_cost_table
+    : public vector_cost_table<vector_stmt_scale_table> {
+public:
+  vector_stmt_cost_table(const vector_stmt_scale_table *_scale_table)
+      : vector_cost_table(_scale_table) {
+    scalar_int = new vector_scalar_int_cost(_scale_table);
+    scalar_fp = new vector_scalar_fp_cost(_scale_table);
+    scalar_load = new vector_scalar_load_cost(_scale_table);
+    scalar_store = new vector_scalar_store_cost(_scale_table);
+    vec_int = new vector_vec_int_cost(_scale_table);
+    vec_fp = new vector_vec_fp_cost(_scale_table);
+    vec_permute = new vector_vec_permute_cost(_scale_table);
+    vec_to_scalar = new vector_vec_to_scalar_cost(_scale_table);
+    scalar_to_vec = new vector_scalar_to_vec_cost(_scale_table);
+    vec_align_load = new vector_vec_align_load_cost(_scale_table);
+    vec_unalign_load = new vector_vec_unalign_load_cost(_scale_table);
+    vec_unalign_store = new vector_vec_unalign_store_cost(_scale_table);
+    vec_store = new vector_vec_store_cost(_scale_table);
+    cond_taken_branch = new vector_cond_taken_branch_cost(_scale_table);
+    cond_not_taken_branch = new vector_cond_not_taken_branch_cost(_scale_table);
+  }
+
+public:
+  const vector_insn_cost<vector_stmt_scale_table> *scalar_int;
+  const vector_insn_cost<vector_stmt_scale_table> *scalar_fp;
+  const vector_insn_cost<vector_stmt_scale_table> *scalar_load;
+  const vector_insn_cost<vector_stmt_scale_table> *scalar_store;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_int;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_fp;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_permute;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_to_scalar;
+  const vector_insn_cost<vector_stmt_scale_table> *scalar_to_vec;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_align_load;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_unalign_load;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_unalign_store;
+  const vector_insn_cost<vector_stmt_scale_table> *vec_store;
+  const vector_insn_cost<vector_stmt_scale_table> *cond_taken_branch;
+  const vector_insn_cost<vector_stmt_scale_table> *cond_not_taken_branch;
+};
+
+#endif // GCC_RISCV_VECTOR_COST_H
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index d30e0235356..095169741bb 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -51,6 +51,11 @@ riscv-c.o: $(srcdir)/config/riscv/riscv-c.cc $(CONFIG_H) $(SYSTEM_H) \
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/riscv/riscv-c.cc
 
+riscv-vector-cost.o: $(srcdir)/config/riscv/riscv-vector-cost.cc $(CONFIG_H) $(SYSTEM_H) \
+    coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H) $(TARGET_H)
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+		$(srcdir)/config/riscv/riscv-vector-cost.cc
+
 riscv-vsetvl.o: $(srcdir)/config/riscv/riscv-vsetvl.cc \
   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(REGS_H) \
   $(TARGET_H) tree-pass.h df.h rtl-ssa.h cfgcleanup.h insn-config.h \
-- 
2.34.1


From patchwork Fri Mar  3 04:53:14 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65952
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 6C10C384B04E
	for <patchwork@sourceware.org>; Fri,  3 Mar 2023 04:53:34 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-qt1-x82b.google.com (mail-qt1-x82b.google.com
 [IPv6:2607:f8b0:4864:20::82b])
 by sourceware.org (Postfix) with ESMTPS id 15133384F00C
 for <gcc-patches@gcc.gnu.org>; Fri,  3 Mar 2023 04:53:16 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 15133384F00C
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-qt1-x82b.google.com with SMTP id cf14so1705384qtb.10
 for <gcc-patches@gcc.gnu.org>; Thu, 02 Mar 2023 20:53:16 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:from:to:cc:subject:date
 :message-id:reply-to;
 bh=FPO0X5Bjn0SaOLebzQFeV7JmnGyXqX24tQrz8/6R4jw=;
 b=CLjRyqKD0VRqNPb0auGmNIrMuPlHjwi1t6FDYbF/CTSYulSSzQe0K6NXrdO6sWOsG3
 4u45bAEDMpwLN/5XuB0ywjNIlGhk+WxEPYnQS5Wlii3MGFgOE871KT32UNkTJk/sAzDS
 pCvlJxcflYC8ghvTSXbzlFTY90hPOXMzf9KQL8WEQglCc0iVF6hwZGFb+4eMApvLl3M2
 tdVQek95msg7d1BTN74iVFAwtUIZoRm/vMOHPMQjSDoyHLKgwpj4sNRMXkCE5h3j66SW
 Jj9V98OqRan4awuc/5LN3ag2GEn7ECsTKPdUeXgUzHm7Boo+TmNjpnSpg3MpYvJE6DGO
 sQHg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:x-gm-message-state:from:to
 :cc:subject:date:message-id:reply-to;
 bh=FPO0X5Bjn0SaOLebzQFeV7JmnGyXqX24tQrz8/6R4jw=;
 b=V2uKi/wMukjrqI6+/qlOW3SBjcaYlMru8l1+sQ9qTUZNBvM/1mKdoBvgb+gid3I2YO
 +Ne4XmAqJhNW9Gumt/1dHLXfyB+oZaIB3nszP0DUda/a+vTSB2R1UEoEuQfIv7lTdlAg
 8K7VHRAh6LR/eEdQKHH4NzWzjc7BPDr0+1As20zDY0l0K+9Hiwxir4jclPno+XHnqdjr
 zs0qZINZS24L3rcuLjiJiJE93fgewhxGBggw4KPyLCuogHxSPpf4XBAkxUeE6YQbyBjj
 tHLJNIwPmxtLHhWDC49rtivL53XO5X597XPRhmfdk2pE6/doyNS3WGzUdorPV95l+qYr
 HBZg==
X-Gm-Message-State: AO0yUKVv4zQL0huPdX2AdJ+b6dFrIB4f2e2RpTh+tzJmnBLFedc5c3Gh
 6x6+eQrzjL60itSbD8mRklp8ER7ppoP0TqHb
X-Google-Smtp-Source: 
 AK7set89Bz3gk+Cr2THV3XB5rGikBjNTC5hL9CuJD7vWo94SMj9NCg/eWMrWpUFE7ViwZotXrprl3g==
X-Received: by 2002:ac8:5882:0:b0:3b9:b761:b0aa with SMTP id
 t2-20020ac85882000000b003b9b761b0aamr979913qta.11.1677819195197;
 Thu, 02 Mar 2023 20:53:15 -0800 (PST)
Received: from [192.168.86.117] ([136.57.172.92])
 by smtp.gmail.com with ESMTPSA id
 bl14-20020a05620a1a8e00b00706bc44fda8sm1002653qkb.79.2023.03.02.20.53.14
 for <gcc-patches@gcc.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 02 Mar 2023 20:53:14 -0800 (PST)
Message-ID: <abc3ee25-4d56-47ec-63de-3fcc7ce0591a@rivosinc.com>
Date: Thu, 2 Mar 2023 23:53:14 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 04/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds helper functions used to implement various parts of
auto-vectorization support.

gcc/ChangeLog:

     * config/riscv/riscv-v.cc (riscv_classify_vlmul_field):
     New function.
     (riscv_vector_preferred_simd_mode): Ditto.
     (get_mask_policy_no_pred): Ditto.
     (get_tail_policy_no_pred): Ditto.
     (riscv_tuple_mode_p): Ditto.
     (riscv_classify_nf): Ditto.
     (riscv_vlmul_regsize): Ditto.
     (riscv_vector_mask_mode_p): Ditto.
     (riscv_vector_get_mask_mode): Ditto.
---
  gcc/config/riscv/riscv-v.cc | 176 ++++++++++++++++++++++++++++++++++++
  1 file changed, 176 insertions(+)
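
Reviewer note, not part of the commit: riscv_classify_nf and
riscv_vlmul_regsize feed the cost hook in the riscv-vector-cost.h hunk
earlier in this series, which multiplies nf, the register count, and a
per-class scale. The block below is only a minimal standalone sketch of
that arithmetic. The Vlmul enum and the function names are hypothetical
stand-ins, not the identifiers used by the patch, and the fractional-LMUL
handling simply mirrors the "default: return 1" arm.

```cpp
#include <cassert>

// Hypothetical stand-in for the VLMUL settings; these enumerator names are
// illustrative only and are not the VLMUL_FIELD_* names used by the patch.
enum class Vlmul { MF8, MF4, MF2, M1, M2, M4, M8 };

// Number of vector registers one value occupies.  Fractional LMUL and
// LMUL = 1 both count as a single register, as in riscv_vlmul_regsize.
constexpr int vlmul_regsize(Vlmul v) {
  return v == Vlmul::M2 ? 2 : v == Vlmul::M4 ? 4 : v == Vlmul::M8 ? 8 : 1;
}

// Sketch of the cost product: nf * register count * per-class scale.
constexpr unsigned insn_cost(int nf, Vlmul v, int scale) {
  return static_cast<unsigned>(nf * vlmul_regsize(v) * scale);
}

// Compile-time checks of the arithmetic above.
static_assert(insn_cost(1, Vlmul::M1, 1) == 1, "LMUL=1, scale 1");
static_assert(insn_cost(1, Vlmul::MF2, 1) == 1, "fractional LMUL counts as 1");
static_assert(insn_cost(1, Vlmul::M4, 2) == 8, "LMUL=4, scale 2");
```

With riscv_classify_nf always returning 1 in this patch, the cost is
effectively the register count times the scale-table entry.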

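Reviewer note, not part of the commit: the switch in
riscv_vector_preferred_simd_mode picks a VNx<N> mode whose N depends on
the element width and on vf. As far as I can tell from the table, N is
(64 / element bits) times a multiplier of 1, 2, 4 or 8. The block below
is a rough sketch of that pattern only. The helper names are hypothetical,
and it ignores the TARGET_VECTOR and ELEN checks that make the real
function fall back to word_mode.

```cpp
#include <cassert>

// Hypothetical helper: multiplier selected by the vf argument.  Any value
// other than 1, 2 or 4 takes the last arm, as in the patch's ternaries.
constexpr unsigned multiplier_for_vf(unsigned vf) {
  return vf == 1 ? 1 : vf == 2 ? 2 : vf == 4 ? 4 : 8;
}

// Hypothetical helper: the N in the VNx<N> mode name for an element of
// ELEM_BITS bits (8, 16, 32 or 64).
constexpr unsigned preferred_nunits(unsigned elem_bits, unsigned vf) {
  return (64 / elem_bits) * multiplier_for_vf(vf);
}

// Spot checks against entries in the switch.
static_assert(preferred_nunits(8, 1) == 8, "QImode, vf 1: VNx8QImode");
static_assert(preferred_nunits(32, 2) == 4, "SImode, vf 2: VNx4SImode");
static_assert(preferred_nunits(64, 3) == 8, "DImode, other vf: VNx8DImode");
```
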
  {
@@ -162,6 +199,64 @@ calculate_ratio (unsigned int sew, enum vlmul_type vlmul)
    return ratio;
  }

+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE for RVV.  */
+
+machine_mode
+riscv_vector_preferred_simd_mode (scalar_mode mode, unsigned vf)
+{
+  if (!TARGET_VECTOR)
+    return word_mode;
+
+  switch (mode)
+    {
+    case E_QImode:
+      return vf == 1   ? VNx8QImode
+         : vf == 2 ? VNx16QImode
+         : vf == 4 ? VNx32QImode
+               : VNx64QImode;
+      break;
+    case E_HImode:
+      return vf == 1   ? VNx4HImode
+         : vf == 2 ? VNx8HImode
+         : vf == 4 ? VNx16HImode
+               : VNx32HImode;
+      break;
+    case E_SImode:
+      return vf == 1   ? VNx2SImode
+         : vf == 2 ? VNx4SImode
+         : vf == 4 ? VNx8SImode
+               : VNx16SImode;
+      break;
+    case E_DImode:
+      if (riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+      && riscv_vector_elen_flags != MASK_VECTOR_ELEN_FP_32)
+    return vf == 1     ? VNx1DImode
+           : vf == 2 ? VNx2DImode
+           : vf == 4 ? VNx4DImode
+             : VNx8DImode;
+      break;
+    case E_SFmode:
+      if (TARGET_HARD_FLOAT && riscv_vector_elen_flags != MASK_VECTOR_ELEN_32
+      && riscv_vector_elen_flags != MASK_VECTOR_ELEN_64)
+    return vf == 1     ? VNx2SFmode
+           : vf == 2 ? VNx4SFmode
+           : vf == 4 ? VNx8SFmode
+             : VNx16SFmode;
+      break;
+    case E_DFmode:
+      if (TARGET_DOUBLE_FLOAT && TARGET_VECTOR_ELEN_FP_64)
+    return vf == 1     ? VNx1DFmode
+           : vf == 2 ? VNx2DFmode
+           : vf == 4 ? VNx4DFmode
+             : VNx8DFmode;
+      break;
+    default:
+      break;
+    }
+
+  return word_mode;
+}
+
  /* Emit an RVV unmask && vl mov from SRC to DEST.  */
  static void
  emit_pred_op (unsigned icode, rtx mask, rtx dest, rtx src, rtx len,
@@ -374,6 +469,87 @@ get_avl_type_rtx (enum avl_type type)
    return gen_int_mode (type, Pmode);
  }

+rtx
+get_mask_policy_no_pred ()
+{
+  return get_mask_policy_for_pred(PRED_TYPE_none);
+}
+
+rtx
+get_tail_policy_no_pred ()
+{
+  return get_tail_policy_for_pred (PRED_TYPE_none);
+}
+
+/* Return true if MODE is an RVV tuple mode.  */
+bool
+riscv_tuple_mode_p (machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+/* Return nf for a machine mode. */
+int
+riscv_classify_nf (machine_mode mode)
+{
+  switch (mode)
+    {
+
+    default:
+      break;
+    }
+
+  return 1;
+}
+
+/* Return vlmul register size for a machine mode. */
+int
+riscv_vlmul_regsize (machine_mode mode)
+{
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
+    return 1;
+  switch (riscv_classify_vlmul_field (mode))
+    {
+    case VLMUL_FIELD_001:
+      return 2;
+    case VLMUL_FIELD_010:
+      return 4;
+    case VLMUL_FIELD_011:
+      return 8;
+    case VLMUL_FIELD_100:
+      gcc_unreachable ();
+    default:
+      return 1;
+    }
+}
+
+/* Return true if it is a RVV mask mode. */
+bool
+riscv_vector_mask_mode_p (machine_mode mode)
+{
+  return (mode == VNx1BImode || mode == VNx2BImode || mode == VNx4BImode
+      || mode == VNx8BImode || mode == VNx16BImode || mode == VNx32BImode
+      || mode == VNx64BImode);
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE for RVV.  */
+
+opt_machine_mode
+riscv_vector_get_mask_mode (machine_mode mode)
+{
+  machine_mode mask_mode;
+  int nf = 1;
+  if (riscv_tuple_mode_p (mode))
+    nf = riscv_classify_nf (mode);
+
+  FOR_EACH_MODE_IN_CLASS (mask_mode, MODE_VECTOR_BOOL)
+  if (GET_MODE_INNER (mask_mode) == BImode
+      && known_eq (GET_MODE_NUNITS (mask_mode) * nf, GET_MODE_NUNITS (mode))
+      && riscv_vector_mask_mode_p (mask_mode))
+    return mask_mode;
+  return default_get_mask_mode (mode);
+}
+
  /* Return the RVV vector mode that has NUNITS elements of mode INNER_MODE.
     This function is not only used by builtins, but also will be used by
     auto-vectorization in the future.  */

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 58007cc16eb..58f69e259c0 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -39,9 +39,11 @@
  #include "emit-rtl.h"
  #include "tm_p.h"
  #include "target.h"
+#include "targhooks.h"
  #include "expr.h"
  #include "optabs.h"
  #include "tm-constrs.h"
+#include "riscv-vector-builtins.h"

  using namespace riscv_vector;

@@ -108,6 +110,41 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
        && IN_RANGE (INTVAL (elt), minval, maxval));
  }

+/* Return the vlmul field for a specific machine mode. */
+unsigned int
+riscv_classify_vlmul_field (enum machine_mode mode)
+{
+  /* Make the decision based on the mode's enum value rather than its
+     properties, so that we keep the correct classification regardless
+     of -mriscv-vector-bits.  */
+  switch (mode)
+    {
+    case E_VNx8BImode:
+      return VLMUL_FIELD_111;
+
+    case E_VNx4BImode:
+      return VLMUL_FIELD_110;
+
+    case E_VNx2BImode:
+      return VLMUL_FIELD_101;
+
+    case E_VNx16BImode:
+      return VLMUL_FIELD_000;
+
+    case E_VNx32BImode:
+      return VLMUL_FIELD_001;
+
+    case E_VNx64BImode:
+      return VLMUL_FIELD_010;
+
+    default:
+      break;
+    }
+
+  /* we don't care about VLMUL for Mask */
+  return VLMUL_FIELD_000;
+}
+
  rtx
  emit_vlmax_vsetvl (machine_mode vmode)

From patchwork Fri Mar  3 04:53:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65953
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id BB51A383EC4B
	for <patchwork@sourceware.org>; Fri,  3 Mar 2023 04:53:46 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-qt1-x829.google.com (mail-qt1-x829.google.com
 [IPv6:2607:f8b0:4864:20::829])
 by sourceware.org (Postfix) with ESMTPS id A095038515FD
 for <gcc-patches@gcc.gnu.org>; Fri,  3 Mar 2023 04:53:26 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A095038515FD
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-qt1-x829.google.com with SMTP id c18so1736824qte.5
 for <gcc-patches@gcc.gnu.org>; Thu, 02 Mar 2023 20:53:26 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:from:to:cc:subject:date
 :message-id:reply-to;
 bh=i6SnaafxPCkWdg7nZywvKlXaEBftLbKirFYFeulg/7Q=;
 b=cOBXMidLtuyDEGjGFELHcccr2zs4uC1JwjC652Xv6C9G0A3YM/64tuxu0LM118LYE/
 pOY4TiLZffRgSeJN66rF3XoAKY1Qzrljws8cCxH8zvdhiDEgQcdf7e0DdzHP2wKBL0dP
 Q1yAeHg3kvWfTfFWXUX3bXlutZ11Yx/RnEyxKQSctILdh/omZpvtFOOCJspidHHa8YDf
 +gSFi3PG6b0n9nocrWCHC7PYhvy1sgR8/srMJzQbIcjvLtV8L9MOHAJQq0Bz4PenL1K6
 aF4jNZIyZcUfJzRDPlaz5Cc51ycWJFy45zKacFaioeTSdkYoWIjGPEUM08IzGVuTd5VX
 9CKA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:x-gm-message-state:from:to
 :cc:subject:date:message-id:reply-to;
 bh=i6SnaafxPCkWdg7nZywvKlXaEBftLbKirFYFeulg/7Q=;
 b=YPOb7mDtxLXcYAzVi3RiGxWHWC4uIo1e7YhZblTcwQom5+06BSB6KCtxv8touGkBYv
 suffxTHD1WGRVUM/EF+rBy5/zTiY+byztKUSTiuLRRquHIy02Ggzg3p6fSYJcWbioL1n
 yb4nH+N6WcpcSaFHp1oXABudJC/4YoqJjxQrsdtu/jwYT94OEvMEkcXL+XtlI9EYdN8D
 4LvuuG4YlAHIcxGay19DaXT1UIl78tAu1WxX0yxI2746m80MlAnvkT0KP35YRX13pxlG
 XHNGH4Lc7FIPFwrg65jCQw11MCvs+em7krABUgji+R7Pr6ueraRLz+mSueQpQPSq1sbS
 Hbmg==
X-Gm-Message-State: AO0yUKV+ISng5BR05ARUCZnpHvc7zYHv+Aeg2JRLEzUthOCY4mB1gTpY
 aDFGHPsxLwQqwzarKUSfgixjjmk42ZEE3DCC
X-Google-Smtp-Source: 
 AK7set9JgBFXQExPLJ2QM3Fyve4FOyLML8mkGQkzaCuFK+KrK/OMH6V/Eyp7HecXWpLmgtS9s3TxrQ==
X-Received: by 2002:ac8:7d42:0:b0:3bf:d688:2a77 with SMTP id
 h2-20020ac87d42000000b003bfd6882a77mr821493qtb.64.1677819205701;
 Thu, 02 Mar 2023 20:53:25 -0800 (PST)
Received: from [192.168.86.117] ([136.57.172.92])
 by smtp.gmail.com with ESMTPSA id
 l5-20020ac87245000000b003b9a426d626sm1071865qtp.22.2023.03.02.20.53.25
 for <gcc-patches@gcc.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 02 Mar 2023 20:53:25 -0800 (PST)
Message-ID: <d2107aec-938f-0581-244c-4c08ee08190e@rivosinc.com>
Date: Thu, 2 Mar 2023 23:53:25 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 05/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,
 SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch registers the target hooks needed for basic autovectorization
and adds basic tuning information for the vector extension.

gcc/ChangeLog:

     * config/riscv/riscv-cores.def (RISCV_TUNE):
     Add VECTOR_TUNE_INFO parameter.
     * common/config/riscv/riscv-common.cc (RISCV_TUNE):
     Add VECTOR_TUNE_INFO parameter.
     * config/riscv/riscv.cc (riscv_vector_tune_param):
     New struct for vector tuning information.
     (riscv_tune_info): Add vector_tune_param.
     (vector_tune_param): New static variable.
     (riscv_vectorization_factor): New variable.
     (generic_rvv_insn_scale_table): New struct.
     (generic_rvv_stmt_scale_table): New struct.
     (generic_rvv_insn_cost_table): New vector insn cost table.
     (generic_rvv_stmt_cost_table): New vector statement cost table.
     (generic_rvv_tune_info): New rvv tuning table.
     (RISCV_TUNE): Add VECTOR_TUNE_INFO parameter.
     (riscv_rtx_costs): Return vector estimate if vector mode.
     (riscv_option_override): Set vector_tune_param.
     (riscv_option_override): Set riscv_vectorization_factor.
     (riscv_estimated_poly_value): Implement
     TARGET_ESTIMATED_POLY_VALUE.
     (riscv_preferred_simd_mode): Implement
     TARGET_VECTORIZE_PREFERRED_SIMD_MODE.
     (riscv_autovectorize_vector_modes): Implement
     TARGET_AUTOVECTORIZE_VECTOR_MODES.
     (riscv_get_mask_mode): Implement TARGET_VECTORIZE_GET_MASK_MODE.
     (riscv_empty_mask_is_expensive): Implement
     TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.
     (riscv_builtin_vectorization_cost): Implement
     TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST.
     (riscv_vectorize_create_costs): Implement
     TARGET_VECTORIZE_CREATE_COSTS.
     (TARGET_ESTIMATED_POLY_VALUE): Register target macro.
     (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto.
     (TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
     (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
     (TARGET_VECTORIZE_GET_MASK_MODE): Ditto.
     (TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE): Ditto.
     (TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK): Ditto.
     (TARGET_VECTORIZE_CREATE_COSTS): Ditto.
---
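For illustration only, the way riscv_autovectorize_vector_modes orders its
candidate modes by LMUL can be modelled outside of GCC. The sketch below is a
standalone approximation: the Lmul enum, the candidate_modes function and the
use of strings for mode names are all assumptions made for this note, not GCC
interfaces. The lists themselves mirror the ones pushed in the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the RVV_LMUL1..RVV_LMUL8 settings in the patch.
enum class Lmul { M1 = 1, M2 = 2, M4 = 4, M8 = 8 };

// Return the candidate vector modes, widest first, as plain strings.
// In GCC these would be machine_mode values pushed onto a vector_modes list.
std::vector<std::string> candidate_modes (Lmul lmul)
{
  switch (lmul)
    {
    case Lmul::M1:
      return {"VNx16QI", "VNx8QI", "VNx4QI", "VNx2QI"};
    case Lmul::M2:
      return {"VNx32QI", "VNx16QI", "VNx8QI", "VNx4QI"};
    case Lmul::M4:
      return {"VNx64QI", "VNx32QI", "VNx16QI", "VNx8QI"};
    case Lmul::M8:
      // The patch pushes only three modes in its final branch.
      return {"VNx64QI", "VNx32QI", "VNx16QI"};
    }
  return {};
}
```

The vectorizer tries the modes in the order given, so the first entry acts as
the preferred width for the selected LMUL.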
  gcc/common/config/riscv/riscv-common.cc |   2 +-
  gcc/config/riscv/riscv-cores.def        |  14 +-
  gcc/config/riscv/riscv.cc               | 321 +++++++++++++++++++++++-
  3 files changed, 325 insertions(+), 12 deletions(-)
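
As a rough aid for reading riscv_estimated_poly_value, the arithmetic can be
sketched in standalone C++. Everything named here (Poly2, EstimateKind,
kScalable, estimate_poly) is a hypothetical stand-in for poly_int64,
poly_value_estimate_kind and RVV_SCALABLE. The sketch also assumes a single
known width rather than the bitmask of widths the patch allows.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for poly_value_estimate_kind.
enum class EstimateKind { Min, Max, Likely };

// Hypothetical two-coefficient polynomial: c0 + c1 * X, where X is the
// runtime indeterminate (extra 128-bit chunks beyond the first).
struct Poly2
{
  std::int64_t c0;
  std::int64_t c1;
};

// Assumed marker for "vector width not known at compile time".
constexpr unsigned kScalable = 0;

std::int64_t estimate_poly (Poly2 val, EstimateKind kind, unsigned width_bits)
{
  if (width_bits == kScalable)
    // Min and likely assume the smallest width; max uses the factor of 15
    // that appears in the patch.
    return kind == EstimateKind::Max ? val.c0 + val.c1 * 15 : val.c0;

  // A known width: scale c1 by the number of 128-bit chunks past the first.
  assert (width_bits >= 128 && width_bits % 128 == 0);
  std::int64_t over_128 = static_cast<std::int64_t> (width_bits) - 128;
  return val.c0 + val.c1 * over_128 / 128;
}
```

Under these assumptions a value of 4 + 4X estimates to 4 for the minimum and
likely cases, to 64 for the scalable maximum, and to 8 when the width is
known to be 256 bits.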

  static tree riscv_handle_type_attribute (tree *, tree, tree, int, bool *);

@@ -403,8 +469,8 @@ static const unsigned gpr_save_reg_order[] = {

  /* A table describing all the processors GCC knows about.  */
  static const struct riscv_tune_info riscv_tune_info_table[] = {
-#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO)    \
-  { TUNE_NAME, PIPELINE_MODEL, & TUNE_INFO},
+#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)    \
+  { TUNE_NAME, PIPELINE_MODEL, & TUNE_INFO, &VECTOR_TUNE_INFO},
  #include "riscv-cores.def"
  };

@@ -2237,8 +2303,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
       Cost Model need to be well analyzed and supported in the future. */
    if (riscv_v_ext_vector_mode_p (mode))
      {
-      *total = COSTS_N_INSNS (1);
-      return true;
+      return vector_tune_param->rvv_insn_costs_table->get_cost (x, mode, total, speed);
      }

    bool float_mode_p = FLOAT_MODE_P (mode);
@@ -6079,6 +6144,7 @@ riscv_option_override (void)
                 RISCV_TUNE_STRING_DEFAULT));
    riscv_microarchitecture = cpu->microarchitecture;
    tune_param = optimize_size ? &optimize_size_tune_info : cpu->tune_param;
+  vector_tune_param = cpu->vector_tune_param;

    /* Use -mtune's setting for slow_unaligned_access, even when optimizing
       for size.  For architectures that trap and emulate unaligned accesses,
@@ -6198,6 +6264,10 @@ riscv_option_override (void)

    /* Convert -march to a chunks count.  */
    riscv_vector_chunks = riscv_convert_vector_bits ();
+
+  if (TARGET_VECTOR)
+    riscv_vectorization_factor = riscv_vector_lmul;
+
  }

  /* Implement TARGET_CONDITIONAL_REGISTER_USAGE.  */
@@ -6892,6 +6962,218 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor,
    return RISCV_DWARF_VLENB;
  }

+/* Implement TARGET_ESTIMATED_POLY_VALUE.
+   Look into the tuning structure for an estimate.
+   KIND specifies the type of requested estimate: min, max or likely.
+   For cores with a known RVV width all three estimates are the same.
+   For generic RVV tuning we want to distinguish the maximum estimate from
+   the minimum and likely ones.
+   The likely estimate is the same as the minimum in that case to give a
+   conservative behavior of auto-vectorizing with RVV when it is a win
+   even for 128-bit RVV.
+   When RVV width information is available VAL.coeffs[1] is multiplied by
+   the number of 128-bit chunks beyond the initial 128 bits.  */
+
+static HOST_WIDE_INT
+riscv_estimated_poly_value (poly_int64 val,
+                            poly_value_estimate_kind kind = POLY_VALUE_LIKELY)
+{
+  unsigned int width_source =
+      BITS_PER_RISCV_VECTOR.is_constant ()
+          ? (unsigned int)BITS_PER_RISCV_VECTOR.to_constant ()
+          : (unsigned int)RVV_SCALABLE;
+
+  /* If there is no core-specific information then the minimum and likely
+     values are based on 128-bit vectors and the maximum is based on
+     the architectural maximum of 2048 bits.  */
+  if (width_source == RVV_SCALABLE)
+    switch (kind)
+      {
+      case POLY_VALUE_MIN:
+      case POLY_VALUE_LIKELY:
+        return val.coeffs[0];
+
+      case POLY_VALUE_MAX:
+        return val.coeffs[0] + val.coeffs[1] * 15;
+      }
+
+  /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the
+     lowest as likely.  This could be made more general if future -mtune
+     options need it to be.  */
+  if (kind == POLY_VALUE_MAX)
+    width_source = 1 << floor_log2 (width_source);
+  else
+    width_source = least_bit_hwi (width_source);
+
+  /* If the core provides width information, use that.  */
+  HOST_WIDE_INT over_128 = width_source - 128;
+  return val.coeffs[0] + val.coeffs[1] * over_128 / 128;
+}
+
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
+
+static machine_mode
+riscv_preferred_simd_mode (scalar_mode mode)
+{
+  machine_mode vmode =
+    riscv_vector::riscv_vector_preferred_simd_mode (mode, riscv_vectorization_factor);
+  if (VECTOR_MODE_P (vmode))
+    return vmode;
+
+  return word_mode;
+}
+
+/* Implement TARGET_AUTOVECTORIZE_VECTOR_MODES for RVV.  */
+static unsigned int
+riscv_autovectorize_vector_modes (vector_modes *modes, bool)
+{
+  if (!TARGET_VECTOR)
+    return 0;
+
+  if (riscv_vectorization_factor == RVV_LMUL1)
+    {
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+      modes->safe_push (VNx4QImode);
+      modes->safe_push (VNx2QImode);
+    }
+  else if (riscv_vectorization_factor == RVV_LMUL2)
+    {
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+      modes->safe_push (VNx4QImode);
+    }
+  else if (riscv_vectorization_factor == RVV_LMUL4)
+    {
+      modes->safe_push (VNx64QImode);
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+      modes->safe_push (VNx8QImode);
+    }
+  else
+    {
+      modes->safe_push (VNx64QImode);
+      modes->safe_push (VNx32QImode);
+      modes->safe_push (VNx16QImode);
+    }
+
+  return 0;
+}
+
+/* Implement TARGET_VECTORIZE_GET_MASK_MODE.  */
+
+static opt_machine_mode
+riscv_get_mask_mode (machine_mode mode)
+{
+  machine_mode mask_mode = VOIDmode;
+  if (TARGET_VECTOR &&
+      riscv_vector::riscv_vector_get_mask_mode (mode).exists (&mask_mode))
+    return mask_mode;
+
+  return default_get_mask_mode (mode);
+}
+
+/* Implement TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE.  Assume for now that
+   it isn't worth branching around empty masked ops (including masked
+   stores).  */
+
+static bool
+riscv_empty_mask_is_expensive (unsigned)
+{
+  return false;
+}
+
+/* Implement targetm.vectorize.builtin_vectorization_cost.  */
+int
+riscv_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
+                                  tree vectype, int misalign ATTRIBUTE_UNUSED)
+{
+  unsigned elements;
+  bool fp = false;
+  rtx x = NULL_RTX;
+  machine_mode mode = VOIDmode;
+
+  if (vectype != NULL)
+    {
+      fp = FLOAT_TYPE_P (vectype);
+      mode = TYPE_MODE (vectype);
+    }
+
+  switch (type_of_cost)
+    {
+    case scalar_stmt:
+      return fp ? vector_tune_param->rvv_stmt_costs_table->scalar_fp->cost (
+                      x, mode)
+                : vector_tune_param->rvv_stmt_costs_table->scalar_int->cost (
+                      x, mode);
+
+    case scalar_load:
+      return vector_tune_param->rvv_stmt_costs_table->scalar_load->cost (x,
+                                                                         mode);
+
+    case scalar_store:
+      return vector_tune_param->rvv_stmt_costs_table->scalar_store->cost (x,
+                                                                          mode);
+
+    case vector_stmt:
+      return fp ? vector_tune_param->rvv_stmt_costs_table->vec_fp->cost (x,
+                                                                         mode)
+                : vector_tune_param->rvv_stmt_costs_table->vec_int->cost (x,
+                                                                          mode);
+
+    case vector_load:
+      return vector_tune_param->rvv_stmt_costs_table->vec_align_load->cost (
+          x, mode);
+
+    case vector_store:
+      return vector_tune_param->rvv_stmt_costs_table->vec_store->cost (x, mode);
+
+    case vec_to_scalar:
+      return vector_tune_param->rvv_stmt_costs_table->vec_to_scalar->cost (
+          x, mode);
+
+    case scalar_to_vec:
+      return vector_tune_param->rvv_stmt_costs_table->scalar_to_vec->cost (
+          x, mode);
+
+    case unaligned_load:
+    case vector_gather_load:
+      return vector_tune_param->rvv_stmt_costs_table->vec_unalign_load->cost (
+          x, mode);
+
+    case unaligned_store:
+    case vector_scatter_store:
+      return vector_tune_param->rvv_stmt_costs_table->vec_unalign_store->cost (
+          x, mode);
+
+    case cond_branch_taken:
+      return vector_tune_param->rvv_stmt_costs_table->cond_taken_branch->cost (
+          x, mode);
+
+    case cond_branch_not_taken:
+      return vector_tune_param->rvv_stmt_costs_table->cond_not_taken_branch
+          ->cost (x, mode);
+
+    case vec_perm:
+      return vector_tune_param->rvv_stmt_costs_table->vec_permute->cost (x,
+                                                                         mode);
+
+    case vec_promote_demote:
+      return fp ? vector_tune_param->rvv_stmt_costs_table->vec_fp->cost (x,
+                                                                         mode)
+                : vector_tune_param->rvv_stmt_costs_table->vec_int->cost (x,
+                                                                          mode);
+
+    case vec_construct:
+      elements = estimated_poly_value (TYPE_VECTOR_SUBPARTS (vectype));
+      return elements / 2 + 1;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
  /* Return true if a shift-amount matches the trailing cleared bits on
     a bitmask.  */

@@ -6901,6 +7183,13 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
    return shamt == ctz_hwi (mask);
  }

+/* Implement TARGET_VECTORIZE_CREATE_COSTS.  */
+vector_costs *
+riscv_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar)
+{
+  return new riscv_vector_costs (vinfo, costing_for_scalar);
+}
+
  /* Initialize the GCC target structure.  */
  #undef TARGET_ASM_ALIGNED_HI_OP
  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -7143,6 +7432,30 @@ riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
  #undef TARGET_VERIFY_TYPE_CONTEXT
  #define TARGET_VERIFY_TYPE_CONTEXT riscv_verify_type_context

+#undef TARGET_ESTIMATED_POLY_VALUE
+#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value
+
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST riscv_builtin_vectorization_cost
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE riscv_preferred_simd_mode
+
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES riscv_autovectorize_vector_modes
+
+#undef TARGET_VECTORIZE_GET_MASK_MODE
+#define TARGET_VECTORIZE_GET_MASK_MODE riscv_get_mask_mode
+
+#undef TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE
+#define TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE riscv_empty_mask_is_expensive
+
+#undef TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK
+#define TARGET_VECTORIZE_LOOP_LEN_OVERRIDE_MASK riscv_loop_len_override_mask
+
+#undef TARGET_VECTORIZE_CREATE_COSTS
+#define TARGET_VECTORIZE_CREATE_COSTS riscv_vectorize_create_costs
+
  #undef TARGET_VECTOR_ALIGNMENT
  #define TARGET_VECTOR_ALIGNMENT riscv_vector_alignment

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index ebc1ed7d7e4..6b8d92af986 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -246,7 +246,7 @@ static const riscv_cpu_info riscv_cpu_tables[] =

  static const char *riscv_tunes[] =
  {
-#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO) \
+#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)    \
      TUNE_NAME,
  #include "../../../config/riscv/riscv-cores.def"
      NULL
diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index 2a834cae21d..4feb0366222 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -30,15 +30,15 @@
     identifier, reference to riscv.cc.  */

  #ifndef RISCV_TUNE
-#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO)
+#define RISCV_TUNE(TUNE_NAME, PIPELINE_MODEL, TUNE_INFO, VECTOR_TUNE_INFO)
  #endif

-RISCV_TUNE("rocket", generic, rocket_tune_info)
-RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
-RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
-RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
-RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
-RISCV_TUNE("size", generic, optimize_size_tune_info)
+RISCV_TUNE("rocket", generic, rocket_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("sifive-3-series", generic, rocket_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("sifive-5-series", generic, rocket_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("thead-c906", generic, thead_c906_tune_info, generic_rvv_tune_info)
+RISCV_TUNE("size", generic, optimize_size_tune_info, generic_rvv_tune_info)

  #undef RISCV_TUNE

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f11b7949a49..16b38ba4d76 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -60,6 +60,16 @@ along with GCC; see the file COPYING3.  If not see
  #include "opts.h"
  #include "tm-constrs.h"
  #include "rtl-iter.h"
+#include "gimple.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "cfgrtl.h"
+#include "sel-sched.h"
+#include "fold-const.h"
+#include "gimple-iterator.h"
+#include "gimple-expr.h"
+#include "tree-vectorizer.h"
+#include "riscv-vector-cost.h"

  /* This file should be included last.  */
  #include "target-def.h"
@@ -238,6 +248,12 @@ struct riscv_tune_param
    bool slow_unaligned_access;
  };

+/* Cost for vector insn classes.  */
+struct riscv_vector_tune_param {
+    const vector_insn_cost_table* rvv_insn_costs_table;
+    const vector_stmt_cost_table* rvv_stmt_costs_table;
+};
+
  /* Information about one micro-arch we know about.  */
  struct riscv_tune_info {
    /* This micro-arch canonical name.  */
@@ -248,6 +264,9 @@ struct riscv_tune_info {

    /* Tuning parameters for this micro-arch.  */
    const struct riscv_tune_param *tune_param;
+
+  /* Vector tuning parameters for this micro-arch.  */
+  const struct riscv_vector_tune_param *vector_tune_param;
  };

  /* Global variables for machine-dependent things.  */
@@ -266,6 +285,9 @@ static int epilogue_cfa_sp_offset;
  /* Which tuning parameters to use.  */
  static const struct riscv_tune_param *tune_param;

+/* Which vector tuning parameters to use.  */
+static const struct riscv_vector_tune_param *vector_tune_param;
+
  /* Which automaton to use for tuning.  */
  enum riscv_microarchitecture_type riscv_microarchitecture;

@@ -275,6 +297,9 @@ poly_uint16 riscv_vector_chunks;
  /* The number of bytes in a vector chunk.  */
  unsigned riscv_bytes_per_vector_chunk;

+/* Preferred vectorization factor for the auto-vectorizer.  */
+unsigned riscv_vectorization_factor;
+
  /* Index R is the smallest register class that contains register R.  */
  const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {
    GR_REGS,    GR_REGS,    GR_REGS,    GR_REGS,
@@ -367,6 +392,47 @@ static const struct riscv_tune_param optimize_size_tune_info = {
    false,                    /* slow_unaligned_access */
  };

+static const vector_insn_scale_table generic_rvv_insn_scale_table = {
+    4, /*load*/
+    1, /*store*/
+    1, /*alu*/
+    1, /*mult*/
+    1, /*movi*/
+    1, /*dup*/
+    1, /*extract*/
+    1, /*if_then_else*/
+};
+
+static const vector_stmt_scale_table generic_rvv_stmt_scale_table = {
+    1, /* scalar_int_stmt_cost  */
+    1, /* scalar_fp_stmt_cost  */
+    1, /* scalar_load_cost  */
+    1, /* scalar_store_cost  */
+    1, /* vec_int_stmt_cost  */
+    1, /* vec_fp_stmt_cost  */
+    1, /* vec_permute_cost  */
+    1, /* vec_to_scalar_cost  */
+    1, /* scalar_to_vec_cost  */
+    1, /* vec_align_load_cost  */
+    1, /* vec_unalign_load_cost  */
+    1, /* vec_unalign_store_cost  */
+    1, /* vec_store_cost  */
+    1, /* cond_taken_branch_cost  */
+    1 /* cond_not_taken_branch_cost  */
+};
+
+static const vector_insn_cost_table* generic_rvv_insn_cost_table =
+            new vector_insn_cost_table(&generic_rvv_insn_scale_table);
+
+static const vector_stmt_cost_table* generic_rvv_stmt_cost_table =
+            new vector_stmt_cost_table(&generic_rvv_stmt_scale_table);
+
+/* Costs to use when optimizing for riscv vector.  */
+static const struct riscv_vector_tune_param generic_rvv_tune_info = {
+  generic_rvv_insn_cost_table,
+  generic_rvv_stmt_cost_table
+};
+
  static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);

From patchwork Fri Mar  3 04:53:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65954
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 3BF60383E682
	for <patchwork@sourceware.org>; Fri,  3 Mar 2023 04:54:19 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail-qv1-xf30.google.com (mail-qv1-xf30.google.com
 [IPv6:2607:f8b0:4864:20::f30])
 by sourceware.org (Postfix) with ESMTPS id E4EA5384644A
 for <gcc-patches@gcc.gnu.org>; Fri,  3 Mar 2023 04:53:39 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E4EA5384644A
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivosinc.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivosinc.com
Received: by mail-qv1-xf30.google.com with SMTP id f1so1022507qvx.13
 for <gcc-patches@gcc.gnu.org>; Thu, 02 Mar 2023 20:53:39 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=rivosinc-com.20210112.gappssmtp.com; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:from:to:cc:subject:date
 :message-id:reply-to;
 bh=NNWdMamffT+eNWkln+jxwXjqJiAbH4q2J5ciu7yB1Vo=;
 b=vNzWgsGOtMYKw24hukAiVO5ppnXE0KZ7JjRHyi2+jH8Dadr0apfjXqTSQT5XJXE4a3
 Bp2KiL65wIL2cvhdhmvNaoUtm8mY3IDmlsIEYoG2vP+UbZtjhQnqNIE2vUHGkV18UnmE
 eXMYD8IzUdlMiZa+uxQzM1XiuvUHmYb4bjGFCCErbelzJTLzySv6e9seXByABcOVYIJf
 VvOePzBuEvTvQVmuZ2sNy09tSztp451jJoFo50RKoDaYI9VlUuCasSYs/rSoLuENur6W
 e3yZdG2bTeVC1miX/5wdEWJUk04A/y16J/9hymjjym1J9uvYLM3BwdGnlmfUwVLbvmmB
 +q2A==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20210112;
 h=content-transfer-encoding:content-language:to:subject:from
 :user-agent:mime-version:date:message-id:x-gm-message-state:from:to
 :cc:subject:date:message-id:reply-to;
 bh=NNWdMamffT+eNWkln+jxwXjqJiAbH4q2J5ciu7yB1Vo=;
 b=d5SJ7JLr3GH8y6WJJ1vUfR4fCGJSl9qgOiosNLfJXmq3GU1b+tVnXjXKvyMlxK3beZ
 IieF8H2Lo//gZJc5rHmRqJaTarCd21UV3GsE60SNrIWHQ/wk3VHRlfxzHyMJuRYu91Ra
 WZvQ8H89MqAk2OjNpcLMhgb1LCyv4PHKP0SrSyejjWjA1VB243n0Sr79W9ODNkHoECLb
 RWgzOwGvaTJiGETwRXmgc53d+iQhwcFCWPGQLLKGxkI2zdCA6y3ucIBbP1ShfwnjfZiJ
 b2OpyzxoqeMcouuqUh067bK+ngu4dX4tm5LLmNY/pv/yjLRtmcf3e/78DFCWkPvUsJBI
 x9JA==
X-Gm-Message-State: AO0yUKWmFjxfxV4bc4lFV71lQ+05FmAYIzDMeGCuH/h7o6xCq+sEeJy5
 x3KOI0WyVZ4r2ZLcayAdGHZHNg8FIVAt9Xc+
X-Google-Smtp-Source: 
 AK7set9oYiNdH9FIKoVzhS4fCYGjIKm3UlO5fRTDy+gmNq9osvbFvDWPaq26YB1FHSA5/smxUdjwLA==
X-Received: by 2002:a05:6214:f09:b0:577:5b89:577e with SMTP id
 gw9-20020a0562140f0900b005775b89577emr1069426qvb.32.1677819215863;
 Thu, 02 Mar 2023 20:53:35 -0800 (PST)
Received: from [192.168.86.117] ([136.57.172.92])
 by smtp.gmail.com with ESMTPSA id
 74-20020a370a4d000000b0073b878e3f30sm1015960qkk.59.2023.03.02.20.53.35
 for <gcc-patches@gcc.gnu.org>
 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);
 Thu, 02 Mar 2023 20:53:35 -0800 (PST)
Message-ID: <927ed290-1340-5793-2c7f-8e0359cd0cea@rivosinc.com>
Date: Thu, 2 Mar 2023 23:53:35 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 06/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_SHORT,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds patterns that provide basic autovectorization support 
for integer adds and subtracts.

gcc/ChangeLog:

     * config/riscv/riscv.md (riscv_classify_vlmul_field):
     New external declaration.
     (riscv_vector_preferred_simd_mode): Include
     vector-iterators.md.
     * config/riscv/vector-auto.md: New file containing
     autovectorization patterns.
     * config/riscv/vector-iterators.md (UNSPEC_VADD/UNSPEC_VSUB):
     New unspecs for autovectorization patterns.
     * config/riscv/vector.md: Remove include of vector-iterators.md
     and include vector-auto.md.
---
  gcc/config/riscv/riscv.md            |   1 +
  gcc/config/riscv/vector-auto.md      | 172 +++++++++++++++++++++++++++
  gcc/config/riscv/vector-iterators.md |   2 +
  gcc/config/riscv/vector.md           |   4 +-
  4 files changed, 177 insertions(+), 2 deletions(-)
  create mode 100644 gcc/config/riscv/vector-auto.md
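
For context, a loop of the following shape is the kind of code the new
add<mode>3 expander is intended to serve. This is an illustrative sketch and
not a testcase from the series: the function name add_arrays is made up for
this note, and __restrict is a compiler extension used here only to rule out
aliasing. Whether the loop is actually vectorized depends on the rest of the
series and on options such as -O3 -march=rv64gcv.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Element-wise integer addition.  Each iteration is independent, so the
// vectorizer can map the body onto a vector add when a suitable add<mode>3
// pattern is available for the chosen vector mode.
void add_arrays (std::int32_t *__restrict dst,
                 const std::int32_t *__restrict a,
                 const std::int32_t *__restrict b,
                 std::size_t n)
{
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = a[i] + b[i];
}
```

The scalar semantics are the same whether or not vectorization happens, so
the function can be checked on any host before looking at the RISC-V
assembly.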

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 05924e9bbf1..c34124095f7 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -131,6 +131,7 @@
  (include "predicates.md")
  (include "constraints.md")
  (include "iterators.md")
+(include "vector-iterators.md")

  ;; ....................
  ;;
diff --git a/gcc/config/riscv/vector-auto.md b/gcc/config/riscv/vector-auto.md
new file mode 100644
index 00000000000..e5a19663d18
--- /dev/null
+++ b/gcc/config/riscv/vector-auto.md
@@ -0,0 +1,172 @@
+;; Machine description for RISC-V 'V' Extension for GNU compiler.
+;; Copyright (C) 2022-2023 Free Software Foundation, Inc.
+;; Contributed by Juzhe Zhong (juzhe.zhong@rivai.ai), RiVAI Technologies Ltd.
+;; Contributed by Michael Collison (collison@rivosinc.com), Rivos Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Addition
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vadd.vv
+;; - vadd.vx
+;; - vadd.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "add<mode>3"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_arith_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_add<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "vector_reg_or_const_dup_operand")
+   (match_operand:VI 4 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[2], operands[3],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+(define_expand "len_add<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "vector_reg_or_const_dup_operand")
+   (match_operand 3 "p_reg_or_const_csr_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = operands[3];
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_add<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+  DONE;
+})
+
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Subtraction
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vsub.vv
+;; - vsub.vx
+;; - vadd.vi
+;; - vrsub.vx
+;; - vrsub.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "sub<mode>3"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_sub<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "cond_sub<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:<VM> 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand:VI 3 "register_operand")
+   (match_operand:VI 4 "register_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = operands[4];
+  rtx vl = emit_vlmax_vsetvl (<MODE>mode);
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = operands[1];
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_sub<mode>(operands[0], mask, merge, operands[2], operands[3],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
+
+(define_expand "len_sub<mode>"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand:VI 1 "register_operand")
+   (match_operand:VI 2 "register_operand")
+   (match_operand 3 "p_reg_or_const_csr_operand")]
+  "TARGET_VECTOR"
+{
+  using namespace riscv_vector;
+
+  rtx merge = gen_rtx_UNSPEC (<MODE>mode, gen_rtvec (1, const0_rtx), UNSPEC_VUNDEF);
+  rtx vl = operands[3];
+  rtx mask_policy = get_mask_policy_no_pred();
+  rtx tail_policy = get_tail_policy_no_pred();
+  rtx mask = CONSTM1_RTX(<VM>mode);
+  rtx vlmax_avl_p = get_avl_type_rtx(NONVLMAX);
+
+  emit_insn(gen_pred_sub<mode>(operands[0], mask, merge, operands[1], operands[2],
+                vl, tail_policy, mask_policy, vlmax_avl_p));
+
+  DONE;
+})
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index cb817abcfde..1a88e511d1b 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -34,6 +34,8 @@
    UNSPEC_VMULHU
    UNSPEC_VMULHSU

+  UNSPEC_VADD
+  UNSPEC_VSUB
    UNSPEC_VADC
    UNSPEC_VSBC
    UNSPEC_VMADC
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 69b7cafbf17..87d85d3a415 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -26,8 +26,6 @@
  ;; - Auto-vectorization (TBD)
  ;; - Combine optimization (TBD)

-(include "vector-iterators.md")
-
  (define_constants [
     (INVALID_ATTRIBUTE            255)
     (X0_REGNUM                      0)
@@ -336,6 +334,8 @@
         (symbol_ref "INTVAL (operands[4])")]
      (const_int INVALID_ATTRIBUTE)))

+(include "vector-auto.md")
+
  ;; -----------------------------------------------------------------
  ;; ---- Miscellaneous Operations
  ;; -----------------------------------------------------------------

From patchwork Fri Mar  3 04:53:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Collison <collison@rivosinc.com>
X-Patchwork-Id: 65955
Message-ID: <eefb0311-e12b-307f-fe70-c3e4641bb402@rivosinc.com>
Date: Thu, 2 Mar 2023 23:53:42 -0500
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.7.1
From: Michael Collison <collison@rivosinc.com>
Subject: [PATCH 07/07] RISC-V: Add auto-vectorization support
To: gcc-patches <gcc-patches@gcc.gnu.org>
Content-Language: en-US
X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, BODY_8BITS,
 DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

This patch adds tests for autovectorization of integer add and subtract.

gcc/testsuite/ChangeLog:

     * gcc.target/riscv/rvv/autovec: New directory
     for autovectorization tests.
     * gcc.target/riscv/rvv/autovec/loop-add-rv32.c: New
     test to verify code generation of vector add on rv32.
     * gcc.target/riscv/rvv/autovec/loop-add.c: New
     test to verify code generation of vector add on rv64.
     * gcc.target/riscv/rvv/autovec/loop-sub-rv32.c: New
     test to verify code generation of vector subtract on rv32.
     * gcc.target/riscv/rvv/autovec/loop-sub.c: New
     test to verify code generation of vector subtract on rv64.
---
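Note (a sketch, not part of the patch): the tests added here are
compile-only. Each file instantiates six functions and expects six matches
for vadd.vv or vsub.vv, which presumably corresponds to one vector
instruction per function. A runtime companion that checks the computed
values could look like the sketch below. CHECK_TYPE and the check_vadd_*
functions are hypothetical names, and TEST_ALL is reshaped here to take the
macro it applies; none of this exists in the patch.

```c
#include <stdint.h>

/* Same loop shape as the TEST_TYPE macro in the add tests.  */
#define TEST_TYPE(TYPE)                                   \
  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)   \
  {                                                       \
    for (int i = 0; i < n; i++)                           \
      dst[i] = a[i] + b[i];                               \
  }

/* Hypothetical companion: returns 1 when vadd_TYPE matches the expected
   element-by-element result, 0 otherwise.  N is odd so that a vectorized
   loop also has to handle a partial final iteration.  */
#define CHECK_TYPE(TYPE)                                  \
  int check_vadd_##TYPE (void)                            \
  {                                                       \
    enum { N = 37 };                                      \
    TYPE a[N], b[N], dst[N];                              \
    for (int i = 0; i < N; i++)                           \
      {                                                   \
        a[i] = (TYPE) i;                                  \
        b[i] = (TYPE) (2 * i + 1);                        \
        dst[i] = 0;                                       \
      }                                                   \
    vadd_##TYPE (dst, a, b, N);                           \
    for (int i = 0; i < N; i++)                           \
      if (dst[i] != (TYPE) (3 * i + 1))                   \
        return 0;                                         \
    return 1;                                             \
  }

/* Apply macro T to the six element types covered by the tests.  */
#define TEST_ALL(T)                                       \
  T (int16_t) T (uint16_t)                                \
  T (int32_t) T (uint32_t)                                \
  T (int64_t) T (uint64_t)

TEST_ALL (TEST_TYPE)
TEST_ALL (CHECK_TYPE)
```

Calling each check_vadd_* function from main and aborting on a zero return
would turn this into a dg-do run style test; the subtract variant would
swap the operator and the expected value.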
  .../riscv/rvv/autovec/loop-add-rv32.c         | 24 +++++++++++++++++++
  .../gcc.target/riscv/rvv/autovec/loop-add.c   | 24 +++++++++++++++++++
  .../riscv/rvv/autovec/loop-sub-rv32.c         | 24 +++++++++++++++++++
  .../gcc.target/riscv/rvv/autovec/loop-sub.c   | 24 +++++++++++++++++++
  4 files changed, 96 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
new file mode 100644
index 00000000000..bdc3b6892e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] + b[i];                \
+  }
+
+/* int8_t and uint8_t are not autovectorized currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
new file mode 100644
index 00000000000..d7f992c7d27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-add.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vadd_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] + b[i];                \
+  }
+
+/* int8_t and uint8_t are not autovectorized currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
new file mode 100644
index 00000000000..7d0a40ec539
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub-rv32.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv32gcv -mabi=ilp32d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vsub_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] - b[i];                \
+  }
+
+/* int8_t and uint8_t are not autovectorized currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
new file mode 100644
index 00000000000..c8900884f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/loop-sub.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O2 -ftree-vectorize -march=rv64gcv -mabi=lp64d" } */
+
+#include <stdint.h>
+
+#define TEST_TYPE(TYPE)                 \
+  void vsub_##TYPE (TYPE *dst, TYPE *a, TYPE *b, int n)    \
+  {                            \
+    for (int i = 0; i < n; i++)                \
+      dst[i] = a[i] - b[i];                \
+  }
+
+/* int8_t and uint8_t are not autovectorized currently. */
+#define TEST_ALL()    \
+ TEST_TYPE(int16_t)    \
+ TEST_TYPE(uint16_t)    \
+ TEST_TYPE(int32_t)    \
+ TEST_TYPE(uint32_t)    \
+ TEST_TYPE(int64_t)    \
+ TEST_TYPE(uint64_t)
+
+TEST_ALL()
+
+/* { dg-final { scan-assembler-times {\tvsub\.vv} 6 } } */