From patchwork Thu Jan 11 08:23:29 2024
X-Patchwork-Submitter: =?utf-8?b?6ZKf5bGF5ZOy?=
X-Patchwork-Id: 83850
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Increase scalar_to_vec_cost from 1 to 3
Date: Thu, 11 Jan 2024 16:23:29 +0800
Message-Id: <20240111082329.1198064-1-juzhe.zhong@rivai.ai>

This patch fixes the following inefficient vectorized code:

        vsetvli a5,zero,e8,mf2,ta,ma
        li      a2,17
        vid.v   v1
        li      a4,-32768
        vsetvli zero,zero,e16,m1,ta,ma
        addiw   a4,a4,104
        vmv.v.i v3,15
        lui     a1,%hi(a)
        li      a0,19
        vsetvli zero,zero,e8,mf2,ta,ma
        vadd.vx v1,v1,a2
        sb      a0,%lo(a)(a1)
        vsetvli zero,zero,e16,m1,ta,ma
        vzext.vf2       v2,v1
        vmv.v.x v1,a4
        vminu.vv        v2,v2,v3
        vsrl.vv v1,v1,v2
        vslidedown.vi   v1,v1,1
        vmv.x.s a0,v1
        snez    a0,a0
        ret

The reason is that scalar_to_vec_cost is too low.

A VEC_SET always expands to a slide plus a scalar move instruction, so the
current scalar_to_vec_cost = 1 is not reasonable.

I tried setting it to 2, but that failed to fix this case; it has to be 3.

Whether it is the scalar move or the slide instruction, I believe both are
more costly than normal vector instructions (e.g. vadd.vv), so setting the
cost to 3 looks reasonable to me.

After this patch:

        lui     a5,%hi(a)
        li      a4,19
        sb      a4,%lo(a)(a5)
        li      a0,0
        ret

Tested on both RV32 and RV64 with no regressions. OK for trunk?

	PR target/113281

gcc/ChangeLog:

	* config/riscv/riscv.cc: Set scalar_to_vec_cost as 3.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr113209.c: Adapt test.
	* gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c: New test.

---
 gcc/config/riscv/riscv.cc                      |  4 ++--
 .../vect/costmodel/riscv/rvv/pr113281-1.c      | 18 ++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/pr113209.c    |  2 +-
 3 files changed, 21 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index df9799d9c5e..bcfb3c15a39 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -366,7 +366,7 @@ static const common_vector_cost rvv_vls_vector_cost = {
   1, /* gather_load_cost */
   1, /* scatter_store_cost */
   1, /* vec_to_scalar_cost */
-  1, /* scalar_to_vec_cost */
+  3, /* scalar_to_vec_cost */
   1, /* permute_cost */
   1, /* align_load_cost */
   1, /* align_store_cost */
@@ -382,7 +382,7 @@ static const scalable_vector_cost rvv_vla_vector_cost = {
   1, /* gather_load_cost */
   1, /* scatter_store_cost */
   1, /* vec_to_scalar_cost */
-  1, /* scalar_to_vec_cost */
+  3, /* scalar_to_vec_cost */
   1, /* permute_cost */
   1, /* align_load_cost */
   1, /* align_store_cost */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c
new file mode 100644
index 00000000000..331cf961a1f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -ftree-vectorize -fdump-tree-vect-details" } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c
index 081ee369394..70aae151000 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113209.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3" } */
+/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 -fno-vect-cost-model" } */
 
 int b, c, d, f, i, a;
 int e[1] = {0};
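
For reference only (not part of the patch): a minimal scalar sketch of the loop in pr113281-1.c, to show which values the non-vectorized code is expected to produce. The names loop_result, scalar_reference and check_matches_scalar_asm are hypothetical helpers introduced for this note; only the loop body is taken from the test. On the last iteration a is 18, and 32872 >> 18 is 0, so b ends as 0 and a ends as 19, which lines up with the "li a4,19" / "li a0,0" sequence in the scalar output of the patch.

```c
#include <assert.h>

/* Hypothetical helper type: the state left behind by the test's loop. */
struct loop_result {
    unsigned char a; /* final value of the counter (a global in the test) */
    short b;         /* last value assigned inside the loop */
    int ret;         /* what main() in pr113281-1.c would return */
};

/* Runs the loop from pr113281-1.c as plain scalar C. */
struct loop_result scalar_reference(void)
{
    struct loop_result r;

    r.b = r.a = 0;
    for (; r.a != 19; r.a++)
        if (r.a)
            /* 32872 fits in 16 bits, so any shift count >= 16 gives 0. */
            r.b = (short)(32872 >> r.a);

    r.ret = (r.b == 0) ? 0 : 1;
    return r;
}

/* Checks the values that the scalar assembly materializes as constants. */
void check_matches_scalar_asm(void)
{
    struct loop_result r = scalar_reference();

    assert(r.a == 19); /* li a4,19 ; sb a4,%lo(a)(a5) */
    assert(r.b == 0);
    assert(r.ret == 0); /* li a0,0 */
}
```

Calling check_matches_scalar_asm() from any small driver should pass silently. This only evaluates the scalar semantics of the test source; it says nothing about how the vectorizer costs the loop.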