From patchwork Sat Jun 10 00:37:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 70856 X-Patchwork-Delegate: kito.cheng@gmail.com Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F18453856624 for ; Sat, 10 Jun 2023 00:38:55 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbguseast3.qq.com (smtpbguseast3.qq.com [54.243.244.52]) by sourceware.org (Postfix) with ESMTPS id 6364938323F1 for ; Sat, 10 Jun 2023 00:37:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6364938323F1 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp80t1686357460tza3c1hk Received: from rios-cad5.localdomain ( [58.60.1.11]) by bizesmtp.qq.com (ESMTP) with id ; Sat, 10 Jun 2023 08:37:38 +0800 (CST) X-QQ-SSF: 01400000000000F0S000000A0000000 X-QQ-FEAT: XlKBH1Cks64ibfiVfrlhUb/PVACJKhPQWwk0y3D9SDi23YtHWZfNuMmzhhtrF +mQvZEL40mwDH4ztwBsGk229tiJXIR3647LDurHdMDCirhrjZvq+LLWievKQjhv3axpYVAA yOUE79oJGkBBRkWQcZQSqRQqZ2npMmm/Y6JoigBbDm8frDKusYhCKjEJSnH+sQgY8WFChLj 0QogYZ/t8s/PoBROwGwohYpkNI3ZxQEffuKGW6Y95ZElPAnv1lplRe8zNjeBnWiGBopUDSY uKS9kjWwWNXv007v7IthlxgeIo/x6uLXzPs5O90ur/FdKcuMNfKA9rW0yblJLirUdJFFOlA mBjDuq6+NxqJzxlBFzgH8UZ2i/AWH8Ob0n7sTYqEBnn0soNXYBW4Ti+yrLPWNhCyafg4Txb do0wLpjS50o= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 9928688754418754342 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Cc: kito.cheng@sifive.com, palmer@rivosinc.com, rdapp.gcc@gmail.com, jeffreyalaw@gmail.com, Juzhe-Zhong Subject: [PATCH] RISC-V: Enable select_vl for RVV auto-vectorization Date: Sat, 10 Jun 2023 08:37:37 +0800 Message-Id: <20230610003737.1679827-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" From: Juzhe-Zhong Consider this following example: void vec_add(int32_t *restrict c, int32_t *restrict a, int32_t *restrict b, int N) { for (long i = 0; i < N; i++) { c[i] = a[i] + b[i]; } } After this patch: vec_add: ble a3,zero,.L5 .L3: vsetvli a5,a3,e32,m1,ta,ma vle32.v v2,0(a1) vle32.v v1,0(a2) vsetvli a6,zero,e32,m1,ta,ma ===> redundant vsetvl. slli a4,a5,2 vadd.vv v1,v1,v2 sub a3,a3,a5 vsetvli zero,a5,e32,m1,ta,ma ===> redundant vsetvl. vse32.v v1,0(a0) add a1,a1,a4 add a2,a2,a4 add a0,a0,a4 bne a3,zero,.L3 .L5: ret We can get close-to-optimal codegen but with some redundant vsetvls. This is not the big issue which will be easily addressed in RISC-V backend. I am going to add a standalone PASS "AVL propagation" (avlprop) to addresse such issue. gcc/ChangeLog: * config/riscv/autovec.md (select_vl): New pattern. * config/riscv/riscv-protos.h (expand_select_vl): New function. * config/riscv/riscv-v.cc (expand_select_vl): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/ternop/ternop-2.c: Adapt test. * gcc.target/riscv/rvv/autovec/ternop/ternop-5.c: Ditto. * gcc.target/riscv/rvv/autovec/partial/select_vl-1.c: New test. --- gcc/config/riscv/autovec.md | 14 ++++++++++ gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-v.cc | 12 +++++++++ .../riscv/rvv/autovec/partial/select_vl-1.c | 26 +++++++++++++++++++ .../riscv/rvv/autovec/ternop/ternop-2.c | 2 +- .../riscv/rvv/autovec/ternop/ternop-5.c | 2 +- 6 files changed, 55 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index 9f4492db23c..b7070099f29 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -626,3 +626,17 @@ } [(set_attr "type" "vimuladd") (set_attr "mode" "")]) + +;; ========================================================================= +;; == SELECT_VL +;; ========================================================================= + +(define_expand "select_vl" + [(match_operand:P 0 "register_operand") + (match_operand:P 1 "vector_length_operand") + (match_operand:P 2 "")] + "TARGET_VECTOR" +{ + riscv_vector::expand_select_vl (operands); + DONE; +}) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 66c1f535d60..6db3a46c682 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -246,6 +246,7 @@ void expand_vec_series (rtx, rtx, rtx); void expand_vec_init (rtx, rtx); void expand_vcond (rtx *); void expand_vec_perm (rtx, rtx, rtx, rtx); +void expand_select_vl (rtx *); /* Rounding mode bitfield for fixed point VXRM. */ enum vxrm_field_enum { diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 477a22cd2b0..e1b85a5af91 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -2447,4 +2447,16 @@ expand_vec_perm_const (machine_mode vmode, machine_mode op_mode, rtx target, return ret; } +/* Generate no side effects vsetvl to get the vector length. */ +void +expand_select_vl (rtx *ops) +{ + poly_int64 nunits = rtx_to_poly_int64 (ops[2]); + /* We arbitrary picked QImode as inner scalar mode to get vector mode. + since vsetvl only demand ratio. We let VSETVL PASS to optimize it. */ + scalar_int_mode mode = QImode; + machine_mode rvv_mode = get_vector_mode (mode, nunits).require (); + emit_insn (gen_no_side_effects_vsetvl_rtx (rvv_mode, ops[0], ops[1])); +} + } // namespace riscv_vector diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c new file mode 100644 index 00000000000..74bbf40ee9f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=scalable -fno-vect-cost-model -fno-tree-loop-distribute-patterns -fdump-tree-optimized-details" } */ + +#include + +#define TEST_TYPE(TYPE) \ + __attribute__ ((noipa)) void select_vl_##TYPE (TYPE *__restrict dst, \ + TYPE *__restrict a, int n) \ + { \ + for (int i = 0; i < n; i++) \ + dst[i] = a[i]; \ + } + +#define TEST_ALL() \ + TEST_TYPE (int8_t) \ + TEST_TYPE (uint8_t) \ + TEST_TYPE (int16_t) \ + TEST_TYPE (uint16_t) \ + TEST_TYPE (int32_t) \ + TEST_TYPE (uint32_t) \ + TEST_TYPE (int64_t) \ + TEST_TYPE (uint64_t) \ + TEST_TYPE (float) \ + TEST_TYPE (double) + +TEST_ALL () diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-2.c index 89eeaf6315f..e52e07ddd09 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-2.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-schedule-insns" } */ #include diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-5.c index a9a7198feb4..49c85efbf3a 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-5.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/ternop/ternop-5.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-schedule-insns" } */ #include