From patchwork Wed Jan 10 03:06:50 2024
X-Patchwork-Submitter: =?utf-8?b?6ZKf5bGF5ZOy?=
X-Patchwork-Id: 83717
X-Patchwork-Delegate: rdapp.gcc@gmail.com
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com,
 rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH V2] RISC-V: Minor tweak dynamic cost model
Date: Wed, 10 Jan 2024 11:06:50 +0800
Message-Id: <20240110030650.1338056-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

v2 update: Robustify tests.

While working on the cost model, I noticed one case where the dynamic LMUL
cost model doesn't work well.

Before this patch:

foo:
        lui a4,%hi(.LANCHOR0)
        li a0,1953
        li a1,63
        addi a4,a4,%lo(.LANCHOR0)
        li a3,64
        vsetvli a2,zero,e32,mf2,ta,ma
        vmv.v.x v5,a0
        vmv.v.x v4,a1
        vid.v v3
.L2:
        vsetvli a5,a3,e32,mf2,ta,ma
        vadd.vi v2,v3,1
        vadd.vv v1,v3,v5
        mv a2,a5
        vmacc.vv v1,v2,v4
        slli a1,a5,2
        vse32.v v1,0(a4)
        sub a3,a3,a5
        add a4,a4,a1
        vsetvli a5,zero,e32,mf2,ta,ma
        vmv.v.x v1,a2
        vadd.vv v3,v3,v1
        bne a3,zero,.L2
        li a0,0
        ret

This is unexpected: the loop uses scalable (VLA) vectors with LMUL = MF2,
which wastes computation resources.

Ideally, we should use LMUL = M8 VLS modes.

The root cause is that the dynamic LMUL heuristic dominates the VLS
heuristic, so adapt the cost model heuristic.

After this patch:

foo:
        lui a4,%hi(.LANCHOR0)
        addi a4,a4,%lo(.LANCHOR0)
        li a3,4096
        li a5,32
        li a1,2016
        addi a2,a4,128
        addiw a3,a3,-32
        vsetvli zero,a5,e32,m8,ta,ma
        li a0,0
        vid.v v8
        vsll.vi v8,v8,6
        vadd.vx v16,v8,a1
        vadd.vx v8,v8,a3
        vse32.v v16,0(a4)
        vse32.v v8,0(a2)
        ret

Tested on both RV32 and RV64 with no regressions.

OK for trunk?

gcc/ChangeLog:

        * config/riscv/riscv-vector-costs.cc (costs::better_main_loop_than_p): Minor tweak.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c: Fix test.
        * gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c: Ditto.
        * gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c: Ditto.

---
 gcc/config/riscv/riscv-vector-costs.cc                      | 3 ++-
 .../gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c         | 5 ++---
 .../gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c         | 5 ++---
 .../gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c         | 7 +++----
 4 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index f4a1a789f23..e53f4a186f3 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -994,7 +994,8 @@ costs::better_main_loop_than_p (const vector_costs *uncast_other) const
                                vect_vf_for_cost (other_loop_vinfo));
 
   /* Apply the unrolling heuristic described above m_unrolled_vls_niters.  */
-  if (bool (m_unrolled_vls_stmts) != bool (other->m_unrolled_vls_stmts))
+  if (bool (m_unrolled_vls_stmts) != bool (other->m_unrolled_vls_stmts)
+      && m_cost_type != other->m_cost_type)
     {
       bool this_prefer_unrolled = this->prefer_unrolled_loop ();
       bool other_prefer_unrolled = other->prefer_unrolled_loop ();
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c
index 3ddffa37fe4..89a6c678960 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c
@@ -3,7 +3,7 @@
 
 #include
 
-#define N 40
+#define N 48
 
 int a[N];
 
@@ -22,7 +22,6 @@ foo (){
   return 0;
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*8,\s*e32,\s*m2,\s*t[au],\s*m[au]} 1 } } */
 /* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*16,\s*e32,\s*m4,\s*t[au],\s*m[au]} 1 } } */
-/* { dg-final { scan-assembler-times {vsetivli} 2 } } */
+/* { dg-final { scan-assembler-times {vsetivli} 1 } } */
 /* { dg-final { scan-assembler-not {vsetvli} } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c
index 7625ec5c4b1..86732ef2ce5 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c
@@ -3,7 +3,7 @@
 
 #include
 
-#define N 40
+#define N 64
 
 int a[N];
 
@@ -22,7 +22,6 @@ foo (){
   return 0;
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*8,\s*e32,\s*m2,\s*t[au],\s*m[au]} 1 } } */
 /* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m8,\s*t[au],\s*m[au]} 1 } } */
-/* { dg-final { scan-assembler-times {vsetivli} 1 } } */
+/* { dg-final { scan-assembler-not {vsetivli} } } */
 /* { dg-final { scan-assembler-times {vsetvli} 1 } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c
index 7625ec5c4b1..a1fcb3f3443 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c
@@ -1,9 +1,9 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 --param=riscv-autovec-lmul=m8 -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 --param=riscv-autovec-lmul=dynamic -fno-schedule-insns -fno-schedule-insns2" } */
 
 #include
 
-#define N 40
+#define N 64
 
 int a[N];
 
@@ -22,7 +22,6 @@ foo (){
   return 0;
 }
 
-/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*8,\s*e32,\s*m2,\s*t[au],\s*m[au]} 1 } } */
 /* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m8,\s*t[au],\s*m[au]} 1 } } */
-/* { dg-final { scan-assembler-times {vsetivli} 1 } } */
+/* { dg-final { scan-assembler-not {vsetivli} } } */
 /* { dg-final { scan-assembler-times {vsetvli} 1 } } */
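
Note on the two listings in the commit message: as a sanity check that the
MF2 loop and the M8 straight-line sequence store the same data, the per-lane
arithmetic of each can be replayed on the host. The sketch below is a model
read off the assembly quoted in the message, not part of the patch. The names
before_listing, after_listing and verify_listings are made up for
illustration, and the model assumes 64 int elements and ignores how the
hardware splits the MF2 loop into iterations.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Number of 32-bit elements written by both listings (li a3,64 in the
// "before" code; two 32-element stores in the "after" code).
constexpr int N = 64;

// "Before" listing (VLA, LMUL = MF2). For element index i the loop body does:
//   v2 = v3 + 1          (vadd.vi)
//   v1 = v3 + 1953       (vadd.vv with v5 = 1953)
//   v1 += v2 * 63        (vmacc.vv with v4 = 63)
// where v3 holds the running element indices.
std::array<int32_t, N> before_listing() {
  std::array<int32_t, N> a{};
  for (int i = 0; i < N; ++i)
    a[i] = (i + 1953) + (i + 1) * 63;
  return a;
}

// "After" listing (VLS, LMUL = M8, vl = 32). For lane l in [0, 32):
//   v8  = l << 6               (vid.v + vsll.vi)
//   v16 = v8 + 2016            (vadd.vx with a1 = 2016), stored at a[l]
//   v8  = v8 + (4096 - 32)     (vadd.vx with a3),        stored at a[l + 32]
std::array<int32_t, N> after_listing() {
  std::array<int32_t, N> a{};
  constexpr int VL = 32;
  for (int lane = 0; lane < VL; ++lane) {
    const int32_t shifted = lane << 6;
    a[lane] = shifted + 2016;
    a[lane + VL] = shifted + (4096 - 32);
  }
  return a;
}

// Hypothetical helper: asserts that both models produce identical arrays.
void verify_listings() {
  assert(before_listing() == after_listing());
}
```

Both models reduce to a[i] = 64 * i + 2016, so the new code generation only
changes how the result is computed (two M8 stores instead of a strip-mined
MF2 loop), not what is stored.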