From: Xiao Zeng
To: gcc-patches@gcc.gnu.org
Cc: jeffreyalaw@gmail.com, research_trasio@irq.a4lg.com, kito.cheng@gmail.com, palmer@dabbelt.com, zhengyu@eswincomputing.com, Xiao Zeng
Subject: [PATCH v2] RISC-V: Add Zfbfmin extension
Date: Sat, 1 Jun 2024 15:45:47 +0800
Message-Id: <20240601074547.80271-1-zengxiao@eswincomputing.com>
X-Patchwork-Submitter: Xiao Zeng
X-Patchwork-Id: 91275
X-Patchwork-Delegate: jlaw@ventanamicro.com

1 In the previous patch, the libcall for BF16 was implemented:

2 RISC-V provides the Zfbfmin extension, which completes the "Scalar BF16 Converts":

3 This patch replaces the libcalls with Zfbfmin extension instructions.

4 The previous test cases are reused in:

gcc/ChangeLog:

	* config/riscv/iterators.md: Add mode_iterator between
	floating-point modes and BFmode.
	* config/riscv/riscv.cc (riscv_output_move): Handle BFmode move
	for zfbfmin.
	* config/riscv/riscv.md (trunc<mode>bf2): New pattern for BFmode.
	(extendbfsf2): Ditto.
	(*movhf_hardfloat): Add BFmode.
	(*mov<mode>_hardfloat): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zfbfmin-bf16_arithmetic.c: New test.
	* gcc.target/riscv/zfbfmin-bf16_comparison.c: New test.
	* gcc.target/riscv/zfbfmin-bf16_float_libcall_convert.c: New test.
	* gcc.target/riscv/zfbfmin-bf16_integer_libcall_convert.c: New test.
---
 gcc/config/riscv/iterators.md                 |  6 +-
 gcc/config/riscv/riscv.cc                     |  4 +-
 gcc/config/riscv/riscv.md                     | 49 ++++++++++++--
 .../riscv/zfbfmin-bf16_arithmetic.c           | 35 ++++++++++
 .../riscv/zfbfmin-bf16_comparison.c           | 33 ++++++++++
 .../zfbfmin-bf16_float_libcall_convert.c      | 45 +++++++++++++
 .../zfbfmin-bf16_integer_libcall_convert.c    | 66 +++++++++++++++++++
 7 files changed, 228 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_arithmetic.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_comparison.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_float_libcall_convert.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_integer_libcall_convert.c

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 3c139bc2e30..1e37e843023 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -78,9 +78,13 @@
 ;; Iterator for floating-point modes that can be loaded into X registers.
 (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
 
-;; Iterator for floating-point modes of BF16
+;; Iterator for floating-point modes of BF16.
 (define_mode_iterator HFBF [HF BF])
 
+;; Conversion between floating-point modes and BF16.
+;; SF to BF16 have hardware instructions.
+(define_mode_iterator FBF [HF DF TF])
+
 ;; -------------------------------------------------------------------
 ;; Mode attributes
 ;; -------------------------------------------------------------------
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 10af38a5a81..c5c4c777349 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4310,7 +4310,7 @@ riscv_output_move (rtx dest, rtx src)
         switch (width)
           {
           case 2:
-            if (TARGET_ZFHMIN)
+            if (TARGET_ZFHMIN || TARGET_ZFBFMIN)
               return "fmv.x.h\t%0,%1";
             /* Using fmv.x.s + sign-extend to emulate fmv.x.h. */
             return "fmv.x.s\t%0,%1;slli\t%0,%0,16;srai\t%0,%0,16";
@@ -4366,7 +4366,7 @@ riscv_output_move (rtx dest, rtx src)
             switch (width)
               {
               case 2:
-                if (TARGET_ZFHMIN)
+                if (TARGET_ZFHMIN || TARGET_ZFBFMIN)
                   return "fmv.h.x\t%0,%z1";
                 /* High 16 bits should be all-1, otherwise HW will treated
                    as a n-bit canonical NaN, but isn't matter for softfloat. */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 25d341ec987..e57bfcf616a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1763,6 +1763,31 @@
   [(set_attr "type" "fcvt")
    (set_attr "mode" "HF")])
 
+(define_insn "truncsfbf2"
+  [(set (match_operand:BF 0 "register_operand" "=f")
+       (float_truncate:BF
+           (match_operand:SF 1 "register_operand" " f")))]
+  "TARGET_ZFBFMIN"
+  "fcvt.bf16.s\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "BF")])
+
+;; The conversion of HF/DF/TF to BF needs to be done with SF if there is a
+;; chance to generate at least one instruction, otherwise just using
+;; libfunc __trunc[h|d|t]fbf2.
+(define_expand "trunc<mode>bf2"
+  [(set (match_operand:BF 0 "register_operand" "=f")
+       (float_truncate:BF
+           (match_operand:FBF 1 "register_operand" " f")))]
+  "TARGET_ZFBFMIN"
+  {
+    convert_move (operands[0],
+                  convert_modes (SFmode, <MODE>mode, operands[1], 0), 0);
+    DONE;
+  }
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "BF")])
+
 ;;
 ;; ....................
 ;;
@@ -1907,6 +1932,15 @@
   [(set_attr "type" "fcvt")
    (set_attr "mode" "SF")])
 
+(define_insn "extendbfsf2"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+       (float_extend:SF
+           (match_operand:BF 1 "register_operand" " f")))]
+  "TARGET_ZFBFMIN"
+  "fcvt.s.bf16\t%0,%1"
+  [(set_attr "type" "fcvt")
+   (set_attr "mode" "SF")])
+
 (define_insn "extendsfdf2"
   [(set (match_operand:DF 0 "register_operand" "=f")
        (float_extend:DF
@@ -1936,16 +1970,17 @@
   DONE;
 })
 
-(define_insn "*movhf_hardfloat"
-  [(set (match_operand:HF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r, *r,*r,*m")
-        (match_operand:HF 1 "move_operand" " f,zfli,G,m,f,G,*r,*f,*G*r,*m,*r"))]
-  "TARGET_ZFHMIN
-   && (register_operand (operands[0], HFmode)
-       || reg_or_0_operand (operands[1], HFmode))"
+(define_insn "*mov<mode>_hardfloat"
+  [(set (match_operand:HFBF 0 "nonimmediate_operand" "=f, f,f,f,m,m,*f,*r, *r,*r,*m")
+        (match_operand:HFBF 1 "move_operand" " f,zfli,G,m,f,G,*r,*f,*G*r,*m,*r"))]
+  "((TARGET_ZFHMIN && <MODE>mode == HFmode)
+    || (TARGET_ZFBFMIN && <MODE>mode == BFmode))
+   && (register_operand (operands[0], <MODE>mode)
+       || reg_or_0_operand (operands[1], <MODE>mode))"
   { return riscv_output_move (operands[0], operands[1]); }
   [(set_attr "move_type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
    (set_attr "type" "fmove,fmove,mtc,fpload,fpstore,store,mtc,mfc,move,load,store")
-   (set_attr "mode" "HF")])
+   (set_attr "mode" "<MODE>")])
 
 (define_insn "*mov<mode>_softfloat"
   [(set (match_operand:HFBF 0 "nonimmediate_operand" "=f, r,r,m,*f,*r")
diff --git a/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_arithmetic.c b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_arithmetic.c
new file mode 100644
index 00000000000..2ca170345af
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_arithmetic.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32i_zfbfmin -mabi=ilp32f -mcmodel=medany -O" { target { rv32 } } } */
+/* { dg-options "-march=rv64i_zfbfmin -mabi=lp64f -mcmodel=medany -O" { target { rv64 } } } */
+
+/* 1) bf -> sf               fcvt.s.bf16 */
+/* 2) sf1 [+|-|*|/] sf2      f[add|sub|mul|div].s */
+/* 3) sf -> bf               fcvt.bf16.s */
+extern __bf16 bf;
+extern __bf16 bf1;
+extern __bf16 bf2;
+
+void bf_add_bf () { bf = bf1 + bf2; }
+
+void bf_sub_bf () { bf = bf1 - bf2; }
+
+void bf_mul_bf () { bf = bf1 * bf2; }
+
+void bf_div_bf () { bf = bf1 / bf2; }
+
+void bf_add_const () { bf = bf1 + 3.14f; }
+
+void const_sub_bf () { bf = 3.14f - bf2; }
+
+void bf_mul_const () { bf = bf1 * 3.14f; }
+
+void const_div_bf () { bf = 3.14f / bf2; }
+
+void bf_inc () { ++bf; }
+
+void bf_dec () { --bf; }
+
+/* { dg-final { scan-assembler-times "fcvt.s.bf16" 14 } } */
+/* { dg-final { scan-assembler-times "fcvt.bf16.s" 10 } } */
+
+/* { dg-final { scan-assembler-not "call" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_comparison.c b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_comparison.c
new file mode 100644
index 00000000000..62d532063c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_comparison.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32i_zfbfmin -mabi=ilp32f -O" { target { rv32 } } } */
+/* { dg-options "-march=rv64i_zfbfmin -mabi=lp64f -O" { target { rv64 } } } */
+
+/* 1) bf -> sf                  fcvt.s.bf16 */
+/* 2) sf1 [<|<=|>|>=|==] sf2    f[lt|le|gt|ge|eq].s */
+extern __bf16 bf;
+extern __bf16 bf1;
+extern __bf16 bf2;
+
+void bf_lt_bf () { bf = (bf1 < bf2) ? bf1 : bf2; }
+
+void bf_le_bf () { bf = (bf1 <= bf2) ? bf1 : bf2; }
+
+void bf_gt_bf () { bf = (bf1 > bf2) ? bf1 : bf2; }
+
+void bf_ge_bf () { bf = (bf1 >= bf2) ? bf1 : bf2; }
+
+void bf_eq_bf () { bf = (bf1 == bf2) ? bf1 : bf2; }
+
+void bf_lt_const () { bf = (bf1 < 3.14f) ? bf1 : bf2; }
+
+void bf_le_const () { bf = (bf1 <= 3.14f) ? bf1 : bf2; }
+
+void const_gt_bf () { bf = (3.14f > bf2) ? bf1 : bf2; }
+
+void const_ge_bf () { bf = (3.14f >= bf2) ? bf1 : bf2; }
+
+void bf_eq_const () { bf = (bf1 == 3.14f) ? bf1 : bf2; }
+
+/* { dg-final { scan-assembler-times "fcvt.s.bf16" 15 } } */
+
+/* { dg-final { scan-assembler-not "call" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_float_libcall_convert.c b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_float_libcall_convert.c
new file mode 100644
index 00000000000..95e65996e28
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_float_libcall_convert.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32i_zfbfmin -mabi=ilp32f -O" { target { rv32 } } } */
+/* { dg-options "-march=rv64i_zfbfmin -mabi=lp64f -O" { target { rv64 } } } */
+
+
+/* 1) float -> BF16
+ *    hf -> bf == hf -> sf -> bf      fcvt.s.h + fcvt.bf16.s
+ *    sf -> bf == sf -> bf            fcvt.bf16.s
+ *    df -> bf == df -> sf -> bf      __truncdfsf2 + fcvt.bf16.s
+ *    tf -> bf == tf -> sf -> bf      __trunctfsf2 + fcvt.bf16.s
+*/
+
+/* 2) BF16 -> float
+ *    bf -> hf == bf -> sf -> hf      fcvt.s.bf16 + fcvt.h.s
+ *    bf -> sf == bf -> sf            fcvt.s.bf16
+ *    bf -> df == bf -> sf -> df      fcvt.s.bf16 + __extendsfdf2
+ *    bf -> tf == bf -> sf -> tf      fcvt.s.bf16 + __extendsftf2
+*/
+
+extern __bf16 bf;
+extern _Float16 hf;
+extern float sf;
+extern double df;
+extern long double tf;
+
+void hf_to_bf () { bf = hf; }
+void bf_to_hf () { hf = bf; }
+
+void sf_to_bf () { bf = sf; }
+void bf_to_sf () { sf = bf; }
+
+void df_to_bf () { bf = df; }
+void bf_to_df () { df = bf; }
+
+void tf_to_bf () { bf = tf; }
+void bf_to_tf () { tf = bf; }
+
+/* { dg-final { scan-assembler-times "fcvt.bf16.s" 4 } } */
+/* { dg-final { scan-assembler-times "fcvt.s.bf16" 4 } } */
+/* { dg-final { scan-assembler-times "fcvt.h.s" 1 } } */
+/* { dg-final { scan-assembler-times "fcvt.s.h" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__truncdfsf2" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__trunctfsf2" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__extendsfdf2" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__extendsftf2" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_integer_libcall_convert.c b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_integer_libcall_convert.c
new file mode 100644
index 00000000000..1a998bfe2f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zfbfmin-bf16_integer_libcall_convert.c
@@ -0,0 +1,66 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32i_zfbfmin -mabi=ilp32f -O" { target { rv32 } } } */
+/* { dg-options "-march=rv64i_zfbfmin -mabi=lp64f -O" { target { rv64 } } } */
+
+/* 1) Integer -> BF16
+ *    qi/hi -> bf             fcvt.s.w + fcvt.bf16.s
+ *    uqi/uhi -> bf           fcvt.s.wu + fcvt.bf16.s
+ *
+ *    si/di/ti -> bf          __float[s|d|t]ibf
+ *    usi/udi/uti -> bf       __floatun[s|d|t]ibf
+*/
+
+/* 2) BF16 -> Integer
+ *    bf -> qi/hi/si/di       fcvt.s.bf16 + fcvt.[w|l].s
+ *    bf -> uqi/uhi/usi/udi   fcvt.s.bf16 + fcvt.[w|l]u.s
+ *    bf -> ti/uti            fcvt.s.bf16 + __fix[uns]sfti
+*/
+
+extern __bf16 bf;
+
+extern signed char qi;
+extern unsigned char uqi;
+extern signed short hi;
+extern unsigned short uhi;
+extern signed int si;
+extern unsigned int usi;
+extern signed long long di;
+extern unsigned long long udi;
+
+void qi_to_bf () { bf = qi; }
+void uqi_to_bf () { bf = uqi; }
+void bf_to_qi () { qi = bf; }
+void bf_to_uqi () { uqi = bf; }
+
+void hi_to_bf () { bf = hi; }
+void uhi_to_bf () { bf = uhi; }
+void bf_to_hi () { hi = bf; }
+void bf_to_uhi () { uhi = bf; }
+
+void si_to_bf () { bf = si; }
+void usi_to_bf () { bf = usi; }
+void bf_to_si () { si = bf; }
+void bf_to_usi () { usi = bf; }
+
+void di_to_bf () { bf = di; }
+void udi_to_bf () { bf = udi; }
+void bf_to_di () { di = bf; }
+void bf_to_udi () { udi = bf; }
+
+#if __riscv_xlen == 64
+extern signed __int128 ti;
+extern unsigned __int128 uti;
+void ti_to_bf () { bf = ti; } /* { dg-final { scan-assembler-times "call\t__floattibf" 1 { target { rv64 } } } } */
+void uti_to_bf () { bf = uti; } /* { dg-final { scan-assembler-times "call\t__floatuntibf" 1 { target { rv64 } } } } */
+void bf_to_ti () { ti = bf; } /* { dg-final { scan-assembler-times "call\t__fixsfti" 1 { target { rv64 } } } } */
+void bf_to_uti () { uti = bf; } /* { dg-final { scan-assembler-times "call\t__fixunssfti" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-times "fcvt.bf16.s" 4 } } */
+/* { dg-final { scan-assembler-times "fcvt.s.bf16" 8 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "fcvt.s.bf16" 10 { target { rv64 } } } } */
+
+/* { dg-final { scan-assembler-times "call\t__floatsibf" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__floatunsibf" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__floatdibf" 1 } } */
+/* { dg-final { scan-assembler-times "call\t__floatundibf" 1 } } */
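
Note for readers: the arithmetic test expects every BF16 operation to expand to a widening fcvt.s.bf16, an SF operation, and a narrowing fcvt.bf16.s. The C sketch below models that sequence in software to show what the two instructions compute. It is an illustration only and not part of the patch: the helper names are hypothetical, it assumes the default round-to-nearest-even rounding mode and canonical quiet NaN results, and it does not model the floating-point exception flags.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of GCC or libgcc.  */

/* Rough model of fcvt.bf16.s: narrow an SF value to BF16 bits.
   Assumes round-to-nearest, ties-to-even; exception flags are ignored.  */
static uint16_t sf_to_bf16_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    if ((u & 0x7fffffffu) > 0x7f800000u)     /* any NaN input */
        return 0x7fc0;                        /* canonical quiet NaN */
    /* Add 0x7fff, plus 1 when the kept LSB is odd, so ties go to even.  */
    uint32_t bias = 0x7fffu + ((u >> 16) & 1u);
    return (uint16_t)((u + bias) >> 16);
}

/* Rough model of fcvt.s.bf16: widening is exact, because BF16 is the
   upper 16 bits of the SF encoding.  */
static float bf16_bits_to_sf(uint16_t h)
{
    uint32_t u = (uint32_t)h << 16;
    if ((u & 0x7fffffffu) > 0x7f800000u)     /* any NaN input */
        u = 0x7fc00000u;                      /* canonical quiet NaN */
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Models bf = bf1 + bf2: widen both operands, add in SF, narrow once.  */
static uint16_t bf16_add_bits(uint16_t a, uint16_t b)
{
    return sf_to_bf16_bits(bf16_bits_to_sf(a) + bf16_bits_to_sf(b));
}
```

Under these assumptions, narrowing 3.14f (bit pattern 0x4048f5c3) rounds up to 0x4049, which widens back to 3.140625f.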