RISC-V: Add autovec FP unary operations.

Message ID 490fd4af-75d2-de76-fa74-f9ebb478b8b8@gmail.com
State Superseded
Headers
Series RISC-V: Add autovec FP unary operations. |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 fail Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 fail Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-arm fail Patch failed to apply

Commit Message

Robin Dapp June 14, 2023, 3:31 p.m. UTC
  Hi,

this patch adds floating-point autovec expanders for vfneg, vfabs as well as
vfsqrt and the accompanying tests.  vfrsqrt7 will be added at a later time.

Similary to the binop tests, there are flavors for zvfh now.  Prerequisites
as before.

Regards
 Robin

gcc/ChangeLog:

	* config/riscv/autovec.md (<optab><mode>2): Add unop expanders.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
	* gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
---
 gcc/config/riscv/autovec.md                   | 36 ++++++++++++++++++-
 .../riscv/rvv/autovec/unop/abs-run.c          |  6 ++--
 .../riscv/rvv/autovec/unop/abs-rv32gcv.c      |  3 +-
 .../riscv/rvv/autovec/unop/abs-rv64gcv.c      |  3 +-
 .../riscv/rvv/autovec/unop/abs-template.h     | 14 +++++++-
 .../riscv/rvv/autovec/unop/abs-zvfh-run.c     | 35 ++++++++++++++++++
 .../riscv/rvv/autovec/unop/vfsqrt-run.c       | 29 +++++++++++++++
 .../riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c   | 10 ++++++
 .../riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c   | 10 ++++++
 .../riscv/rvv/autovec/unop/vfsqrt-template.h  | 31 ++++++++++++++++
 .../riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c  | 32 +++++++++++++++++
 .../riscv/rvv/autovec/unop/vneg-run.c         |  6 ++--
 .../riscv/rvv/autovec/unop/vneg-rv32gcv.c     |  3 +-
 .../riscv/rvv/autovec/unop/vneg-rv64gcv.c     |  3 +-
 .../riscv/rvv/autovec/unop/vneg-template.h    |  5 ++-
 .../riscv/rvv/autovec/unop/vneg-zvfh-run.c    | 26 ++++++++++++++
 16 files changed, 241 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
  

Comments

Jeff Law June 14, 2023, 7:43 p.m. UTC | #1
On 6/14/23 09:31, Robin Dapp wrote:
> Hi,
> 
> this patch adds floating-point autovec expanders for vfneg, vfabs as well as
> vfsqrt and the accompanying tests.  vfrsqrt7 will be added at a later time.
So with vrsqrt7 I think the question turns into will we be able to use 
it effectively.  With its limited initial accuracy, we'll be stuck with 
another round of Newton-Raphson or Goldschmidt, so we're not likely 
going to beat the latency of a standard vsqrt.  We can use it to improve 
throughput though since it does pipeline (using the fmacs of course, so 
there's a definite trade-off if the fmacs are already saturated).


> 
> Similary to the binop tests, there are flavors for zvfh now.  Prerequisites
> as before.
> 
> Regards
>   Robin
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/autovec.md (<optab><mode>2): Add unop expanders.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
> 	* gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
> 	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
> 	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
> 	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
> 	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
> 	* gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
> 	* gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
LGTM.  So if Juzhe is happy with it, then it's good to go once 
dependencies are resolved.

jeff
  
钟居哲 June 14, 2023, 9:15 p.m. UTC | #2
Hi, Jeff.  Thanks for quick approval.

When I reviewed the patch:
(define_expand "<optab><mode>2"
  [(set (match_operand:VF 0 "register_operand")
    (any_float_unop_nofrm:VF
     (match_operand:VF 1 "register_operand")))]
  "TARGET_VECTOR"
{
  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
  DONE;
})

There could be issue here of FP16 vector. 
Since let's see VF iterator:
(define_mode_iterator VF [
  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
....

You can see For all FP16 mode, we use predicate "TARGET_VECTOR_ELEN_FP_16"
which is true when either TARGET_ZVFHM or TARGET_ZVFHMIN.
The reason we do that since most floating-point instructions are using same iterators that we can't add TARGET_ZVFHMIN or TARGET_ZVFH
in naive way. Some instructions pattern are using VF for example vle16.v which should be enabled as long as TARGET_ZVFHMIN wheras
the instructions like vfneg.v need TARGET_ZVFH.

So I do the experiment:
void
f (_Float16 *restrict a, _Float16 *restrict b)
{
  for (int i = 0; i < 100; ++i)
    {
      a[i] = -b[i];
    }
}

with compile option:
-march=rv64gcv_zvfhmin --param=riscv-autovec-preference=fixed-vlmax -O3

ICE happens:
auto.c:26:1: error: unable to generate reloads for:
(insn 8 7 9 2 (set (reg:VNx8HF 186 [ vect__6.7 ])
        (if_then_else:VNx8HF (unspec:VNx8BI [
                    (const_vector:VNx8BI [
                            (const_int 1 [0x1]) repeated x8
                        ])
                    (const_int 8 [0x8])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (neg:VNx8HF (reg:VNx8HF 134 [ vect__4.6 ]))
            (unspec:VNx8HF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "auto.c":24:14 6631 {pred_negvnx8hf}
     (expr_list:REG_DEAD (reg:VNx8HF 134 [ vect__4.6 ])
        (nil)))

The reason of ICE is that we have enabled auto-vectorzation pattern of vfneg.v when TARGET_ZVFHMIN according to VF iterators but
the instructions pattern of vfneg.v is correctly disabled and only enabled when TARGET_ZVFH since we have this attribute for each
RVV instruction pattern:
(define_attr "fp_vector_disabled" "no,yes"
  (cond [
    (and (eq_attr "type" "vfmov,vfalu,vfmul,vfdiv,
        vfwalu,vfwmul,vfmuladd,vfwmuladd,
        vfsqrt,vfrecp,vfminmax,vfsgnj,vfcmp,
        vfclass,vfmerge,
        vfncvtitof,vfwcvtftoi,vfcvtftoi,vfcvtitof,
        vfredo,vfredu,vfwredo,vfwredu,
        vfslide1up,vfslide1down")
   (and (eq_attr "mode" "VNx1HF,VNx2HF,VNx4HF,VNx8HF,VNx16HF,VNx32HF,VNx64HF")
        (match_test "!TARGET_ZVFH")))
    (const_string "yes")

    ;; The mode records as QI for the FP16 <=> INT8 instruction.
    (and (eq_attr "type" "vfncvtftoi,vfwcvtitof")
   (and (eq_attr "mode" "VNx1QI,VNx2QI,VNx4QI,VNx8QI,VNx16QI,VNx32QI,VNx64QI")
        (match_test "!TARGET_ZVFH")))
    (const_string "yes")
  ]
  (const_string "no")))

When I slightly change the pattern as follows:
(define_expand "<optab><mode>2"
  [(set (match_operand:VF 0 "register_operand")
    (any_float_unop_nofrm:VF
     (match_operand:VF 1 "register_operand")))]
  "TARGET_VECTOR && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)"
{
  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
  DONE;
})

Add && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
to condition.

It works for both TARGET_ZVFH and TARGET_ZVFHMIN
-march=rv64gcv_zvfhmin:
f:
        li      a4,2147450880
        li      a5,-2147450880
        addi    a4,a4,-1
        addi    a5,a5,1
        slli    a3,a5,32
        slli    a2,a4,32
        mv      a5,a4
        li      a4,-2147450880
        addi    a6,a1,200
        add     a3,a3,a4
        add     a2,a2,a5
.L2:
        ld      a5,0(a1)
        addi    a0,a0,8
        addi    a1,a1,8
        not     a4,a5
        and     a5,a5,a2
        and     a4,a4,a3
        sub     a5,a3,a5
        xor     a5,a4,a5
        sd      a5,-8(a0)
        bne     a1,a6,.L2
        ret

-march=rv64gcv_zvfh:
f:
        vsetivli        zero,8,e16,m1,ta,ma
        addi    a4,a1,16
        addi    a5,a0,16
        vle16.v v1,0(a1)
        vfneg.v v1,v1
        vse16.v v1,0(a0)
        addi    a2,a1,32
        addi    a3,a0,32
        vle16.v v1,0(a4)
        vfneg.v v1,v1
        vse16.v v1,0(a5)
        addi    a4,a1,48
        addi    a5,a0,48
        vle16.v v1,0(a2)
        vfneg.v v1,v1
        vse16.v v1,0(a3)
        addi    a2,a1,64
        addi    a3,a0,64
        vle16.v v1,0(a4)
        vfneg.v v1,v1
        vse16.v v1,0(a5)
        addi    a4,a1,80
        addi    a5,a0,80
        vle16.v v1,0(a2)
        vfneg.v v1,v1
        vse16.v v1,0(a3)
....


This is what we expected, TARGET_ZVFH enable auto-vectorization wheras no auto-vectorization when TARGET_ZVFHMIN since
vfneg.v is not allowed in TARGET_ZVFHMIN.

However, I think adding !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
is an ugly implementation and not easy to maintain since we will need add this condition to each floating-point patterns.

So, give me some time to figure out an elegant way to support auto-vectorization.

Thanks.


juzhe.zhong@rivai.ai
 
From: Jeff Law
Date: 2023-06-15 03:43
To: Robin Dapp; gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai
Subject: Re: [PATCH] RISC-V: Add autovec FP unary operations.
 
 
On 6/14/23 09:31, Robin Dapp wrote:
> Hi,
> 
> this patch adds floating-point autovec expanders for vfneg, vfabs as well as
> vfsqrt and the accompanying tests.  vfrsqrt7 will be added at a later time.
So with vrsqrt7 I think the question turns into will we be able to use 
it effectively.  With its limited initial accuracy, we'll be stuck with 
another round of Newton-Raphson or Goldschmidt, so we're not likely 
going to beat the latency of a standard vsqrt.  We can use it to improve 
throughput though since it does pipeline (using the fmacs of course, so 
there's a definite trade-off if the fmacs are already saturated).
 
 
> 
> Similary to the binop tests, there are flavors for zvfh now.  Prerequisites
> as before.
> 
> Regards
>   Robin
> 
> gcc/ChangeLog:
> 
> * config/riscv/autovec.md (<optab><mode>2): Add unop expanders.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
LGTM.  So if Juzhe is happy with it, then it's good to go once 
dependencies are resolved.
 
jeff
  
钟居哲 June 14, 2023, 9:27 p.m. UTC | #3
After several tries:

(define_mode_iterator VF_AUTO [
  (VNx1HF "TARGET_ZVFH && TARGET_MIN_VLEN < 128")
  (VNx2HF "TARGET_ZVFH")
  (VNx4HF "TARGET_ZVFH")
  (VNx8HF "TARGET_ZVFH")
  (VNx16HF "TARGET_ZVFH")
  (VNx32HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
  (VNx64HF "TARGET_ZVFH && TARGET_MIN_VLEN >= 128")

  (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
  (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
  (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
  (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
  (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
  (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
  (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
  (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
  (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
  (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])


I think we should add VF_AUTO change iterator into using TARGET_ZVFH.
Then it also works now. -march=zvfhmin no auto-vectorization , -march=zvfh has auto-vectorization.

Feel free to comments more solutions.

Thanks.


juzhe.zhong@rivai.ai
 
From: 钟居哲
Date: 2023-06-15 05:15
To: Jeff Law; rdapp.gcc; gcc-patches; palmer; kito.cheng
Subject: Re: Re: [PATCH] RISC-V: Add autovec FP unary operations.
Hi, Jeff.  Thanks for quick approval.

When I reviewed the patch:
(define_expand "<optab><mode>2"
  [(set (match_operand:VF 0 "register_operand")
    (any_float_unop_nofrm:VF
     (match_operand:VF 1 "register_operand")))]
  "TARGET_VECTOR"
{
  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
  DONE;
})

There could be issue here of FP16 vector. 
Since let's see VF iterator:
(define_mode_iterator VF [
  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
....

You can see For all FP16 mode, we use predicate "TARGET_VECTOR_ELEN_FP_16"
which is true when either TARGET_ZVFHM or TARGET_ZVFHMIN.
The reason we do that since most floating-point instructions are using same iterators that we can't add TARGET_ZVFHMIN or TARGET_ZVFH
in naive way. Some instructions pattern are using VF for example vle16.v which should be enabled as long as TARGET_ZVFHMIN wheras
the instructions like vfneg.v need TARGET_ZVFH.

So I do the experiment:
void
f (_Float16 *restrict a, _Float16 *restrict b)
{
  for (int i = 0; i < 100; ++i)
    {
      a[i] = -b[i];
    }
}

with compile option:
-march=rv64gcv_zvfhmin --param=riscv-autovec-preference=fixed-vlmax -O3

ICE happens:
auto.c:26:1: error: unable to generate reloads for:
(insn 8 7 9 2 (set (reg:VNx8HF 186 [ vect__6.7 ])
        (if_then_else:VNx8HF (unspec:VNx8BI [
                    (const_vector:VNx8BI [
                            (const_int 1 [0x1]) repeated x8
                        ])
                    (const_int 8 [0x8])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (neg:VNx8HF (reg:VNx8HF 134 [ vect__4.6 ]))
            (unspec:VNx8HF [
                    (reg:SI 0 zero)
                ] UNSPEC_VUNDEF))) "auto.c":24:14 6631 {pred_negvnx8hf}
     (expr_list:REG_DEAD (reg:VNx8HF 134 [ vect__4.6 ])
        (nil)))

The reason of ICE is that we have enabled auto-vectorzation pattern of vfneg.v when TARGET_ZVFHMIN according to VF iterators but
the instructions pattern of vfneg.v is correctly disabled and only enabled when TARGET_ZVFH since we have this attribute for each
RVV instruction pattern:
(define_attr "fp_vector_disabled" "no,yes"
  (cond [
    (and (eq_attr "type" "vfmov,vfalu,vfmul,vfdiv,
        vfwalu,vfwmul,vfmuladd,vfwmuladd,
        vfsqrt,vfrecp,vfminmax,vfsgnj,vfcmp,
        vfclass,vfmerge,
        vfncvtitof,vfwcvtftoi,vfcvtftoi,vfcvtitof,
        vfredo,vfredu,vfwredo,vfwredu,
        vfslide1up,vfslide1down")
   (and (eq_attr "mode" "VNx1HF,VNx2HF,VNx4HF,VNx8HF,VNx16HF,VNx32HF,VNx64HF")
        (match_test "!TARGET_ZVFH")))
    (const_string "yes")

    ;; The mode records as QI for the FP16 <=> INT8 instruction.
    (and (eq_attr "type" "vfncvtftoi,vfwcvtitof")
   (and (eq_attr "mode" "VNx1QI,VNx2QI,VNx4QI,VNx8QI,VNx16QI,VNx32QI,VNx64QI")
        (match_test "!TARGET_ZVFH")))
    (const_string "yes")
  ]
  (const_string "no")))

When I slightly change the pattern as follows:
(define_expand "<optab><mode>2"
  [(set (match_operand:VF 0 "register_operand")
    (any_float_unop_nofrm:VF
     (match_operand:VF 1 "register_operand")))]
  "TARGET_VECTOR && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)"
{
  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
  DONE;
})

Add && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
to condition.

It works for both TARGET_ZVFH and TARGET_ZVFHMIN
-march=rv64gcv_zvfhmin:
f:
        li      a4,2147450880
        li      a5,-2147450880
        addi    a4,a4,-1
        addi    a5,a5,1
        slli    a3,a5,32
        slli    a2,a4,32
        mv      a5,a4
        li      a4,-2147450880
        addi    a6,a1,200
        add     a3,a3,a4
        add     a2,a2,a5
.L2:
        ld      a5,0(a1)
        addi    a0,a0,8
        addi    a1,a1,8
        not     a4,a5
        and     a5,a5,a2
        and     a4,a4,a3
        sub     a5,a3,a5
        xor     a5,a4,a5
        sd      a5,-8(a0)
        bne     a1,a6,.L2
        ret

-march=rv64gcv_zvfh:
f:
        vsetivli        zero,8,e16,m1,ta,ma
        addi    a4,a1,16
        addi    a5,a0,16
        vle16.v v1,0(a1)
        vfneg.v v1,v1
        vse16.v v1,0(a0)
        addi    a2,a1,32
        addi    a3,a0,32
        vle16.v v1,0(a4)
        vfneg.v v1,v1
        vse16.v v1,0(a5)
        addi    a4,a1,48
        addi    a5,a0,48
        vle16.v v1,0(a2)
        vfneg.v v1,v1
        vse16.v v1,0(a3)
        addi    a2,a1,64
        addi    a3,a0,64
        vle16.v v1,0(a4)
        vfneg.v v1,v1
        vse16.v v1,0(a5)
        addi    a4,a1,80
        addi    a5,a0,80
        vle16.v v1,0(a2)
        vfneg.v v1,v1
        vse16.v v1,0(a3)
....


This is what we expected, TARGET_ZVFH enable auto-vectorization wheras no auto-vectorization when TARGET_ZVFHMIN since
vfneg.v is not allowed in TARGET_ZVFHMIN.

However, I think adding !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
is an ugly implementation and not easy to maintain since we will need add this condition to each floating-point patterns.

So, give me some time to figure out an elegant way to support auto-vectorization.

Thanks.


juzhe.zhong@rivai.ai
 
From: Jeff Law
Date: 2023-06-15 03:43
To: Robin Dapp; gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai
Subject: Re: [PATCH] RISC-V: Add autovec FP unary operations.
 
 
On 6/14/23 09:31, Robin Dapp wrote:
> Hi,
> 
> this patch adds floating-point autovec expanders for vfneg, vfabs as well as
> vfsqrt and the accompanying tests.  vfrsqrt7 will be added at a later time.
So with vrsqrt7 I think the question turns into will we be able to use 
it effectively.  With its limited initial accuracy, we'll be stuck with 
another round of Newton-Raphson or Goldschmidt, so we're not likely 
going to beat the latency of a standard vsqrt.  We can use it to improve 
throughput though since it does pipeline (using the fmacs of course, so 
there's a definite trade-off if the fmacs are already saturated).
 
 
> 
> Similary to the binop tests, there are flavors for zvfh now.  Prerequisites
> as before.
> 
> Regards
>   Robin
> 
> gcc/ChangeLog:
> 
> * config/riscv/autovec.md (<optab><mode>2): Add unop expanders.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
> * gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
> * gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
> * gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
LGTM.  So if Juzhe is happy with it, then it's good to go once 
dependencies are resolved.
 
jeff
  
钟居哲 June 15, 2023, 2:31 a.m. UTC | #4
After several considerations, I think we may need to add VF_AUTO iterators (with predicate TARGET_ZVFH for vector HF mode) for FP autovec.
Add add testcase of these unary operations with -march=rv64gc_zvfhmin to make sure they don't
cause any ICE and vectorizations.

like https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621322.html
this patch.

Thanks.


juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2023-06-14 23:31
To: gcc-patches; palmer; Kito Cheng; juzhe.zhong@rivai.ai; jeffreyalaw
CC: rdapp.gcc
Subject: [PATCH] RISC-V: Add autovec FP unary operations.
Hi,
 
this patch adds floating-point autovec expanders for vfneg, vfabs as well as
vfsqrt and the accompanying tests.  vfrsqrt7 will be added at a later time.
 
Similary to the binop tests, there are flavors for zvfh now.  Prerequisites
as before.
 
Regards
Robin
 
gcc/ChangeLog:
 
* config/riscv/autovec.md (<optab><mode>2): Add unop expanders.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/unop/abs-run.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-template.h: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-run.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c: Add FP.
* gcc.target/riscv/rvv/autovec/unop/vneg-template.h: Add FP.
* gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h: New test.
* gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c: New test.
* gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c: New test.
---
gcc/config/riscv/autovec.md                   | 36 ++++++++++++++++++-
.../riscv/rvv/autovec/unop/abs-run.c          |  6 ++--
.../riscv/rvv/autovec/unop/abs-rv32gcv.c      |  3 +-
.../riscv/rvv/autovec/unop/abs-rv64gcv.c      |  3 +-
.../riscv/rvv/autovec/unop/abs-template.h     | 14 +++++++-
.../riscv/rvv/autovec/unop/abs-zvfh-run.c     | 35 ++++++++++++++++++
.../riscv/rvv/autovec/unop/vfsqrt-run.c       | 29 +++++++++++++++
.../riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c   | 10 ++++++
.../riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c   | 10 ++++++
.../riscv/rvv/autovec/unop/vfsqrt-template.h  | 31 ++++++++++++++++
.../riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c  | 32 +++++++++++++++++
.../riscv/rvv/autovec/unop/vneg-run.c         |  6 ++--
.../riscv/rvv/autovec/unop/vneg-rv32gcv.c     |  3 +-
.../riscv/rvv/autovec/unop/vneg-rv64gcv.c     |  3 +-
.../riscv/rvv/autovec/unop/vneg-template.h    |  5 ++-
.../riscv/rvv/autovec/unop/vneg-zvfh-run.c    | 26 ++++++++++++++
16 files changed, 241 insertions(+), 11 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
 
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 1c6d793cae0..72154400f1f 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -498,7 +498,7 @@ (define_expand "<optab><mode>2"
})
;; -------------------------------------------------------------------------------
-;; - ABS expansion to vmslt and vneg
+;; - [INT] ABS expansion to vmslt and vneg.
;; -------------------------------------------------------------------------------
(define_expand "abs<mode>2"
@@ -517,6 +517,40 @@ (define_expand "abs<mode>2"
   DONE;
})
+;; -------------------------------------------------------------------------------
+;; ---- [FP] Unary operations
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - vfneg.v/vfabs.v
+;; -------------------------------------------------------------------------------
+(define_expand "<optab><mode>2"
+  [(set (match_operand:VF 0 "register_operand")
+    (any_float_unop_nofrm:VF
+     (match_operand:VF 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
+;; -------------------------------------------------------------------------------
+;; - [FP] Square root
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - vfsqrt.v
+;; -------------------------------------------------------------------------------
+(define_expand "<optab><mode>2"
+  [(set (match_operand:VF 0 "register_operand")
+    (any_float_unop:VF
+     (match_operand:VF 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_fp_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
;; =========================================================================
;; == Ternary arithmetic
;; =========================================================================
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
index d93a7c768d2..18c7a55e23d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
@@ -1,5 +1,5 @@
/* { dg-do run { target { riscv_vector_hw } } } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "abs-template.h"
@@ -30,7 +30,9 @@
  RUN(int8_t)                                 \
  RUN(int16_t)                                 \
  RUN(int32_t)                                 \
- RUN(int64_t)
+ RUN(int64_t) \
+ RUN(float)                                 \
+ RUN(double) \
int main ()
{
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
index a8b92c9450f..dea790ccc2d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
@@ -1,8 +1,9 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "abs-template.h"
/* { dg-final { scan-assembler-times {\tvseti?vli\s+[a-z0-9,]+,ta,mu} 4 } } */
/* { dg-final { scan-assembler-times {\tvmslt\.vi} 4 } } */
/* { dg-final { scan-assembler-times {\tvneg.v\sv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
index 2e7f0864ee7..b58f1aa3496 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
@@ -1,8 +1,9 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "abs-template.h"
/* { dg-final { scan-assembler-times {\tvseti?vli\s+[a-z0-9,]+,ta,mu} 4 } } */
/* { dg-final { scan-assembler-times {\tvmslt\.vi} 4 } } */
/* { dg-final { scan-assembler-times {\tvneg.v\sv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
index 882de9f4efb..b86d04bfbc8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
@@ -1,5 +1,6 @@
#include <stdlib.h>
#include <stdint-gcc.h>
+#include <math.h>
#define TEST_TYPE(TYPE) \
   __attribute__((noipa)) \
@@ -17,10 +18,21 @@
       dst[i] = llabs (a[i]); \
   }
+#define TEST_TYPE3(TYPE) \
+  __attribute__((noipa)) \
+  void vabs_##TYPE (TYPE *dst, TYPE *a, int n) \
+  { \
+    for (int i = 0; i < n; i++) \
+      dst[i] = fabs (a[i]); \
+  }
+
#define TEST_ALL() \
  TEST_TYPE(int8_t) \
  TEST_TYPE(int16_t) \
  TEST_TYPE(int32_t) \
- TEST_TYPE2(int64_t)
+ TEST_TYPE2(int64_t) \
+ TEST_TYPE3(_Float16) \
+ TEST_TYPE3(float) \
+ TEST_TYPE3(double) \
TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
new file mode 100644
index 00000000000..9b1c26381d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
@@ -0,0 +1,35 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "abs-template.h"
+
+#include <assert.h>
+
+#define SZ 128
+
+#define RUN(TYPE) \
+  TYPE a##TYPE[SZ]; \
+  for (int i = 0; i < SZ; i++) \
+    {                             \
+      if (i & 1) \
+ a##TYPE[i] = i - 64;            \
+      else \
+ a##TYPE[i] = i;            \
+    }                             \
+  vabs_##TYPE (a##TYPE, a##TYPE, SZ);         \
+  for (int i = 0; i < SZ; i++) \
+    { \
+      if (i & 1) \
+ assert (a##TYPE[i] == abs (i - 64));    \
+      else \
+ assert (a##TYPE[i] == i); \
+    }
+
+
+#define RUN_ALL()                         \
+ RUN(_Float16)                                 \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
new file mode 100644
index 00000000000..8038a87bdc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
@@ -0,0 +1,29 @@
+/* { dg-do run { target { riscv_vector_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+#include <assert.h>
+
+#define SZ 255
+
+#define EPS 1e-5
+
+#define RUN(TYPE) \
+  TYPE a##TYPE[SZ]; \
+  for (int i = 0; i < SZ; i++) \
+  {                             \
+    a##TYPE[i] = (TYPE)i;             \
+  }                             \
+  vsqrt_##TYPE (a##TYPE, a##TYPE, SZ);         \
+  for (int i = 0; i < SZ; i++) \
+    assert (__builtin_fabs (a##TYPE[i] - __builtin_sqrtf((TYPE)i)) < EPS);
+
+#define RUN_ALL()                         \
+ RUN(float)                                 \
+ RUN(double)                                 \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
new file mode 100644
index 00000000000..96c6f959925
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+   it here instead of in the template directly.  */
+TEST_TYPE3(_Float16)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
new file mode 100644
index 00000000000..ea724e9548f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+   it here instead of in the template directly.  */
+TEST_TYPE3(_Float16)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
new file mode 100644
index 00000000000..314ea646bec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
@@ -0,0 +1,31 @@
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE) \
+  __attribute__((noipa)) \
+  void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n) \
+  { \
+    for (int i = 0; i < n; i++) \
+      dst[i] = __builtin_sqrtf (a[i]); \
+  }
+
+#define TEST_TYPE2(TYPE) \
+  __attribute__((noipa)) \
+  void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n) \
+  { \
+    for (int i = 0; i < n; i++) \
+      dst[i] = __builtin_sqrt (a[i]); \
+  }
+
+#define TEST_TYPE3(TYPE) \
+  __attribute__((noipa)) \
+  void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n) \
+  { \
+    for (int i = 0; i < n; i++) \
+      dst[i] = __builtin_sqrtf16 (a[i]); \
+  }
+
+#define TEST_ALL() \
+ TEST_TYPE(float) \
+ TEST_TYPE2(double) \
+
+TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
new file mode 100644
index 00000000000..655bc1c42dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
@@ -0,0 +1,32 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+   it here instead of in the template directly.  */
+TEST_TYPE3(_Float16)
+
+#include <assert.h>
+
+#define SZ 255
+
+#define EPS 1e-5
+
+#define RUN(TYPE) \
+  TYPE a##TYPE[SZ]; \
+  for (int i = 0; i < SZ; i++) \
+  {                             \
+    a##TYPE[i] = (TYPE)i;             \
+  }                             \
+  vsqrt_##TYPE (a##TYPE, a##TYPE, SZ);         \
+  for (int i = 0; i < SZ; i++) \
+    assert (__builtin_fabs (a##TYPE[i] - __builtin_sqrtf((TYPE)i)) < EPS);
+
+#define RUN_ALL()                         \
+ RUN(_Float16)                                 \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
index 98c7f30ec56..4805538f252 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
@@ -1,5 +1,5 @@
/* { dg-do run { target { riscv_vector_hw } } } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "vneg-template.h"
@@ -21,7 +21,9 @@
  RUN(int8_t)                                 \
  RUN(int16_t)                                 \
  RUN(int32_t)                                 \
- RUN(int64_t)
+ RUN(int64_t) \
+ RUN(float)                                 \
+ RUN(double) \
int main ()
{
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
index 69d9ebb0953..4a9ceb5faf2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "vneg-template.h"
/* { dg-final { scan-assembler-times {\tvneg\.v} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
index d2c2e17c13e..2c5e2bd2a0b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
@@ -1,6 +1,7 @@
/* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
#include "vneg-template.h"
/* { dg-final { scan-assembler-times {\tvneg\.v} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
index 93e690f3cec..892d9d72c38 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
@@ -13,6 +13,9 @@
  TEST_TYPE(int8_t) \
  TEST_TYPE(int16_t) \
  TEST_TYPE(int32_t) \
- TEST_TYPE(int64_t)
+ TEST_TYPE(int64_t) \
+ TEST_TYPE(_Float16) \
+ TEST_TYPE(float) \
+ TEST_TYPE(double) \
TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
new file mode 100644
index 00000000000..e9de7a003c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
@@ -0,0 +1,26 @@
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vneg-template.h"
+
+#include <assert.h>
+
+#define SZ 255
+
+#define RUN(TYPE) \
+  TYPE a##TYPE[SZ]; \
+  for (int i = 0; i < SZ; i++) \
+  {                             \
+    a##TYPE[i] = i - 127;             \
+  }                             \
+  vneg_##TYPE (a##TYPE, a##TYPE, SZ); \
+  for (int i = 0; i < SZ; i++) \
+    assert (a##TYPE[i] == -(i - 127));
+
+#define RUN_ALL()                         \
+ RUN(_Float16)                                 \
+
+int main ()
+{
+  RUN_ALL()
+}
-- 
2.40.1
  
Robin Dapp June 15, 2023, 8:24 a.m. UTC | #5
Hi Juzhe,

I like the iterator solution better,  I added it to the
binops V2 patch with a comment and will post it in a while.
Also realized there is already a testcase and the "enabled"
attribute is set properly now but I hadn't rebased to the
current master branch in a while...

Btw. I'm currently running the testsuite with rv64gcv_zfhmin
default march and see some additional FAILs.  Will report back.

Regards
 Robin
  
Robin Dapp June 15, 2023, 2:33 p.m. UTC | #6
> Btw. I'm currently running the testsuite with rv64gcv_zfhmin
> default march and see some additional FAILs.  Will report back.

Reporting back - the FAILs are a combination of an older qemu
version and not fully comprehensive target selectors.  I'm going
to send a V2 for the testsuite patch as well.

Regards
 Robin
  
Jeff Law June 17, 2023, 2:17 a.m. UTC | #7
On 6/14/23 15:15, 钟居哲 wrote:
> Hi, Jeff.  Thanks for quick approval.
> 
> When I reviewed the patch:
> (define_expand "<optab><mode>2"
>    [(set (match_operand:VF 0 "register_operand")
>      (any_float_unop_nofrm:VF
>       (match_operand:VF 1 "register_operand")))]
> "TARGET_VECTOR"
> {
>    insn_code icode = code_for_pred (<CODE>, <MODE>mode);
>    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
>    DONE;
> })
> 
> There could be issue here of FP16 vector.
> Since let's see VF iterator:
> (define_mode_iterator VF [
>    (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
>    (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
>    (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
>    (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
>    (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
>    (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
>    (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
> ....
> 
> You can see For all FP16 mode, we use predicate "TARGET_VECTOR_ELEN_FP_16"
> which is true when either TARGET_ZVFHM or TARGET_ZVFHMIN.
> The reason we do that since most floating-point instructions are using 
> same iterators that we can't add TARGET_ZVFHMIN or TARGET_ZVFH
> in naive way. Some instructions pattern are using VF for example vle16.v 
> which should be enabled as long as TARGET_ZVFHMIN wheras
> the instructions like vfneg.v need TARGET_ZVFH.
> 
> So I do the experiment:
> void
> f (_Float16 *restrict a, _Float16 *restrict b)
> {
> for (int i = 0; i < 100; ++i)
>      {
> a[i] = -b[i];
>      }
> }
> 
> with compile option:
> -march=rv64gcv_zvfhmin --param=riscv-autovec-preference=fixed-vlmax -O3
> 
> ICE happens:
> auto.c:26:1: error: unable to generate reloads for:
> (insn 8 7 9 2 (set (reg:VNx8HF 186 [ vect__6.7 ])
>          (if_then_else:VNx8HF (unspec:VNx8BI [
>                      (const_vector:VNx8BI [
>                              (const_int 1 [0x1]) repeated x8
>                          ])
>                      (const_int 8 [0x8])
>                      (const_int 2 [0x2]) repeated x2
>                      (const_int 0 [0])
>                      (reg:SI 66 vl)
>                      (reg:SI 67 vtype)
>                  ] UNSPEC_VPREDICATE)
>              (neg:VNx8HF (reg:VNx8HF 134 [ vect__4.6 ]))
>              (unspec:VNx8HF [
>                      (reg:SI 0 zero)
>                  ] UNSPEC_VUNDEF))) "auto.c":24:14 6631 {pred_negvnx8hf}
>       (expr_list:REG_DEAD (reg:VNx8HF 134 [ vect__4.6 ])
>          (nil)))
> 
> The reason of ICE is that we have enabled auto-vectorzation pattern of 
> vfneg.v when TARGET_ZVFHMIN according to VF iterators but
> the instructions pattern of vfneg.v is correctly disabled and only 
> enabled when TARGET_ZVFH since we have this attribute for each
> RVV instruction pattern:
> (define_attr "fp_vector_disabled" "no,yes"
>    (cond [
>      (and (eq_attr "type" "vfmov,vfalu,vfmul,vfdiv,
>          vfwalu,vfwmul,vfmuladd,vfwmuladd,
>          vfsqrt,vfrecp,vfminmax,vfsgnj,vfcmp,
>          vfclass,vfmerge,
>          vfncvtitof,vfwcvtftoi,vfcvtftoi,vfcvtitof,
>          vfredo,vfredu,vfwredo,vfwredu,
>          vfslide1up,vfslide1down")
>     (and (eq_attr "mode" 
> "VNx1HF,VNx2HF,VNx4HF,VNx8HF,VNx16HF,VNx32HF,VNx64HF")
>          (match_test "!TARGET_ZVFH")))
>      (const_string "yes")
> 
> ;; The mode records as QI for the FP16 <=> INT8 instruction.
>      (and (eq_attr "type" "vfncvtftoi,vfwcvtitof")
>     (and (eq_attr "mode" 
> "VNx1QI,VNx2QI,VNx4QI,VNx8QI,VNx16QI,VNx32QI,VNx64QI")
>          (match_test "!TARGET_ZVFH")))
>      (const_string "yes")
>    ]
>    (const_string "no")))
> 
> When I slightly change the pattern as follows:
> (define_expand "<optab><mode>2"
>    [(set (match_operand:VF 0 "register_operand")
>      (any_float_unop_nofrm:VF
>       (match_operand:VF 1 "register_operand")))]
> "TARGET_VECTOR && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)"
> {
>    insn_code icode = code_for_pred (<CODE>, <MODE>mode);
>    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
>    DONE;
> })
> 
> Add && !(GET_MODE_INNER (<MODE>mode) == HFmode && !TARGET_ZVFH)
> to condition.
> 
> It works for both TARGET_ZVFH and TARGET_ZVFHMIN
> -march=rv64gcv_zvfhmin:
> f:
>          li      a4,2147450880
>          li      a5,-2147450880
>          addi    a4,a4,-1
>          addi    a5,a5,1
>          slli    a3,a5,32
>          slli    a2,a4,32
>          mv      a5,a4
>          li      a4,-2147450880
>          addi    a6,a1,200
>          add     a3,a3,a4
>          add     a2,a2,a5
> .L2:
>          ld      a5,0(a1)
>          addi    a0,a0,8
>          addi    a1,a1,8
>          not     a4,a5
>          and     a5,a5,a2
>          and     a4,a4,a3
>          sub     a5,a3,a5
>          xor     a5,a4,a5
>          sd      a5,-8(a0)
>          bne     a1,a6,.L2
>          ret
> 
> -march=rv64gcv_zvfh:
> f:
>          vsetivli        zero,8,e16,m1,ta,ma
>          addi    a4,a1,16
>          addi    a5,a0,16
>          vle16.v v1,0(a1)
>          vfneg.v v1,v1
>          vse16.v v1,0(a0)
>          addi    a2,a1,32
>          addi    a3,a0,32
>          vle16.v v1,0(a4)
>          vfneg.v v1,v1
>          vse16.v v1,0(a5)
>          addi    a4,a1,48
>          addi    a5,a0,48
>          vle16.v v1,0(a2)
>          vfneg.v v1,v1
>          vse16.v v1,0(a3)
>          addi    a2,a1,64
>          addi    a3,a0,64
>          vle16.v v1,0(a4)
>          vfneg.v v1,v1
>          vse16.v v1,0(a5)
>          addi    a4,a1,80
>          addi    a5,a0,80
>          vle16.v v1,0(a2)
>          vfneg.v v1,v1
>          vse16.v v1,0(a3)
> ....
> 
> 
> This is what we expected, TARGET_ZVFH enable auto-vectorization wheras 
> no auto-vectorization when TARGET_ZVFHMIN since
> vfneg.v is not allowed in TARGET_ZVFHMIN.
> 
> However, I think adding !(GET_MODE_INNER (<MODE>mode) == HFmode && 
> !TARGET_ZVFH)
> is an ugly implementation and not easy to maintain since we will need 
> add this condition to each floating-point patterns.
> 
> So, give me some time to figure out an elegant way to support 
> auto-vectorization.
Sigh.  There are days when I look at how the ISA is managed and I don't 
know whether to cry or scream.

Thanks for the detailed explanation, you're absolutely correct that we 
need to be cognizant of the pitfalls of how the iterators interact ZVFH 
and ZVFHMIN.

jeff
  

Patch

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 1c6d793cae0..72154400f1f 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -498,7 +498,7 @@  (define_expand "<optab><mode>2"
 })
 
 ;; -------------------------------------------------------------------------------
-;; - ABS expansion to vmslt and vneg
+;; - [INT] ABS expansion to vmslt and vneg.
 ;; -------------------------------------------------------------------------------
 
 (define_expand "abs<mode>2"
@@ -517,6 +517,40 @@  (define_expand "abs<mode>2"
   DONE;
 })
 
+;; -------------------------------------------------------------------------------
+;; ---- [FP] Unary operations
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - vfneg.v/vfabs.v
+;; -------------------------------------------------------------------------------
+(define_expand "<optab><mode>2"
+  [(set (match_operand:VF 0 "register_operand")
+    (any_float_unop_nofrm:VF
+     (match_operand:VF 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
+;; -------------------------------------------------------------------------------
+;; - [FP] Square root
+;; -------------------------------------------------------------------------------
+;; Includes:
+;; - vfsqrt.v
+;; -------------------------------------------------------------------------------
+(define_expand "<optab><mode>2"
+  [(set (match_operand:VF 0 "register_operand")
+    (any_float_unop:VF
+     (match_operand:VF 1 "register_operand")))]
+  "TARGET_VECTOR"
+{
+  insn_code icode = code_for_pred (<CODE>, <MODE>mode);
+  riscv_vector::emit_vlmax_fp_insn (icode, riscv_vector::RVV_UNOP, operands);
+  DONE;
+})
+
 ;; =========================================================================
 ;; == Ternary arithmetic
 ;; =========================================================================
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
index d93a7c768d2..18c7a55e23d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-run.c
@@ -1,5 +1,5 @@ 
 /* { dg-do run { target { riscv_vector_hw } } } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
 
 #include "abs-template.h"
 
@@ -30,7 +30,9 @@ 
  RUN(int8_t)	                                \
  RUN(int16_t)	                                \
  RUN(int32_t)	                                \
- RUN(int64_t)
+ RUN(int64_t)					\
+ RUN(float)	                                \
+ RUN(double)					\
 
 int main ()
 {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
index a8b92c9450f..dea790ccc2d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c
@@ -1,8 +1,9 @@ 
 /* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
 
 #include "abs-template.h"
 
 /* { dg-final { scan-assembler-times {\tvseti?vli\s+[a-z0-9,]+,ta,mu} 4 } } */
 /* { dg-final { scan-assembler-times {\tvmslt\.vi} 4 } } */
 /* { dg-final { scan-assembler-times {\tvneg.v\sv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
index 2e7f0864ee7..b58f1aa3496 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c
@@ -1,8 +1,9 @@ 
 /* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
 
 #include "abs-template.h"
 
 /* { dg-final { scan-assembler-times {\tvseti?vli\s+[a-z0-9,]+,ta,mu} 4 } } */
 /* { dg-final { scan-assembler-times {\tvmslt\.vi} 4 } } */
 /* { dg-final { scan-assembler-times {\tvneg.v\sv[0-9]+,v[0-9]+,v0\.t} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfabs.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
index 882de9f4efb..b86d04bfbc8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-template.h
@@ -1,5 +1,6 @@ 
 #include <stdlib.h>
 #include <stdint-gcc.h>
+#include <math.h>
 
 #define TEST_TYPE(TYPE) 				\
   __attribute__((noipa))				\
@@ -17,10 +18,21 @@ 
       dst[i] = llabs (a[i]);				\
   }
 
+#define TEST_TYPE3(TYPE) 				\
+  __attribute__((noipa))				\
+  void vabs_##TYPE (TYPE *dst, TYPE *a, int n)		\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = fabs (a[i]);				\
+  }
+
 #define TEST_ALL()	\
  TEST_TYPE(int8_t)	\
  TEST_TYPE(int16_t)	\
  TEST_TYPE(int32_t)	\
- TEST_TYPE2(int64_t)
+ TEST_TYPE2(int64_t)	\
+ TEST_TYPE3(_Float16)	\
+ TEST_TYPE3(float)	\
+ TEST_TYPE3(double)	\
 
 TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
new file mode 100644
index 00000000000..9b1c26381d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/abs-zvfh-run.c
@@ -0,0 +1,35 @@ 
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "abs-template.h"
+
+#include <assert.h>
+
+#define SZ 128
+
+#define RUN(TYPE)				\
+  TYPE a##TYPE[SZ];				\
+  for (int i = 0; i < SZ; i++)			\
+    {                             		\
+      if (i & 1)				\
+	a##TYPE[i] = i - 64;            	\
+      else					\
+	a##TYPE[i] = i;            		\
+    }                             		\
+  vabs_##TYPE (a##TYPE, a##TYPE, SZ);	        \
+  for (int i = 0; i < SZ; i++)			\
+    {						\
+      if (i & 1)				\
+	assert (a##TYPE[i] == abs (i - 64));    \
+      else					\
+	assert (a##TYPE[i] == i);		\
+    }
+
+
+#define RUN_ALL()	                        \
+ RUN(_Float16)	                                \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
new file mode 100644
index 00000000000..8038a87bdc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c
@@ -0,0 +1,29 @@ 
+/* { dg-do run { target { riscv_vector_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+#include <assert.h>
+
+#define SZ 255
+
+#define EPS 1e-5
+
+#define RUN(TYPE)				\
+  TYPE a##TYPE[SZ];				\
+  for (int i = 0; i < SZ; i++)			\
+  {                             		\
+    a##TYPE[i] = (TYPE)i;             		\
+  }                             		\
+  vsqrt_##TYPE (a##TYPE, a##TYPE, SZ);	        \
+  for (int i = 0; i < SZ; i++)			\
+    assert (__builtin_fabs (a##TYPE[i] - __builtin_sqrtf((TYPE)i)) < EPS);
+
+#define RUN_ALL()	                        \
+ RUN(float)	                                \
+ RUN(double)	                                \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
new file mode 100644
index 00000000000..96c6f959925
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv32gcv.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+   it here instead of in the template directly.  */
+TEST_TYPE3(_Float16)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
new file mode 100644
index 00000000000..ea724e9548f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-rv64gcv.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+   it here instead of in the template directly.  */
+TEST_TYPE3(_Float16)
+
+/* { dg-final { scan-assembler-times {\tvfsqrt\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
new file mode 100644
index 00000000000..314ea646bec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-template.h
@@ -0,0 +1,31 @@ 
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE) 				\
+  __attribute__((noipa))				\
+  void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n)		\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = __builtin_sqrtf (a[i]);			\
+  }
+
+#define TEST_TYPE2(TYPE) 				\
+  __attribute__((noipa))				\
+  void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n)		\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = __builtin_sqrt (a[i]);			\
+  }
+
+#define TEST_TYPE3(TYPE) 				\
+  __attribute__((noipa))				\
+  void vsqrt_##TYPE (TYPE *dst, TYPE *a, int n)		\
+  {							\
+    for (int i = 0; i < n; i++)				\
+      dst[i] = __builtin_sqrtf16 (a[i]);		\
+  }
+
+#define TEST_ALL()					\
+ TEST_TYPE(float)					\
+ TEST_TYPE2(double)					\
+
+TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
new file mode 100644
index 00000000000..655bc1c42dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vfsqrt-zvfh-run.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vfsqrt-template.h"
+
+/* We cannot link this without the _zfh extension so define
+   it here instead of in the template directly.  */
+TEST_TYPE3(_Float16)
+
+#include <assert.h>
+
+#define SZ 255
+
+#define EPS 1e-5
+
+#define RUN(TYPE)				\
+  TYPE a##TYPE[SZ];				\
+  for (int i = 0; i < SZ; i++)			\
+  {                             		\
+    a##TYPE[i] = (TYPE)i;             		\
+  }                             		\
+  vsqrt_##TYPE (a##TYPE, a##TYPE, SZ);	        \
+  for (int i = 0; i < SZ; i++)			\
+    assert (__builtin_fabs (a##TYPE[i] - __builtin_sqrtf((TYPE)i)) < EPS);
+
+#define RUN_ALL()	                        \
+ RUN(_Float16)	                                \
+
+int main ()
+{
+  RUN_ALL()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
index 98c7f30ec56..4805538f252 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-run.c
@@ -1,5 +1,5 @@ 
 /* { dg-do run { target { riscv_vector_hw } } } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
 
 #include "vneg-template.h"
 
@@ -21,7 +21,9 @@ 
  RUN(int8_t)	                                \
  RUN(int16_t)	                                \
  RUN(int32_t)	                                \
- RUN(int64_t)
+ RUN(int64_t)					\
+ RUN(float)	                                \
+ RUN(double)					\
 
 int main ()
 {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
index 69d9ebb0953..4a9ceb5faf2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv32gcv.c
@@ -1,6 +1,7 @@ 
 /* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
 
 #include "vneg-template.h"
 
 /* { dg-final { scan-assembler-times {\tvneg\.v} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
index d2c2e17c13e..2c5e2bd2a0b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-rv64gcv.c
@@ -1,6 +1,7 @@ 
 /* { dg-do compile } */
-/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax" } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model -march=rv64gcv_zvfh -mabi=lp64d --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
 
 #include "vneg-template.h"
 
 /* { dg-final { scan-assembler-times {\tvneg\.v} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfneg\.v} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
index 93e690f3cec..892d9d72c38 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-template.h
@@ -13,6 +13,9 @@ 
  TEST_TYPE(int8_t)					\
  TEST_TYPE(int16_t)					\
  TEST_TYPE(int32_t)					\
- TEST_TYPE(int64_t)
+ TEST_TYPE(int64_t)					\
+ TEST_TYPE(_Float16)					\
+ TEST_TYPE(float)					\
+ TEST_TYPE(double)					\
 
 TEST_ALL()
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
new file mode 100644
index 00000000000..e9de7a003c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vneg-zvfh-run.c
@@ -0,0 +1,26 @@ 
+/* { dg-do run { target { riscv_zvfh_hw } } } */
+/* { dg-additional-options "-std=c99 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax -ffast-math" } */
+
+#include "vneg-template.h"
+
+#include <assert.h>
+
+#define SZ 255
+
+#define RUN(TYPE)				\
+  TYPE a##TYPE[SZ];				\
+  for (int i = 0; i < SZ; i++)			\
+  {                             		\
+    a##TYPE[i] = i - 127;             		\
+  }                             		\
+  vneg_##TYPE (a##TYPE, a##TYPE, SZ);	\
+  for (int i = 0; i < SZ; i++)			\
+    assert (a##TYPE[i] == -(i - 127));
+
+#define RUN_ALL()	                        \
+ RUN(_Float16)	                                \
+
+int main ()
+{
+  RUN_ALL()
+}