RISC-V: Fix double mode under RV32 not utilize vf

Message ID 20240719085508.2641039-1-demin.han@starfivetech.com
State Committed
Delegated to: Jeff Law
Series RISC-V: Fix double mode under RV32 not utilize vf

Checks

Context Check Description
rivoscibot/toolchain-ci-rivos-apply-patch success Patch applied
rivoscibot/toolchain-ci-rivos-lint success Lint passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gcv-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gcv-lp64d-multilib success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv32imc_zba_zbb_zbc_zbs-ilp32d-non-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gc-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv32gc_zba_zbb_zbc_zbs-ilp32d-non-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc_zba_zbb_zbc_zbs-lp64d-non-multilib success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Test passed
rivoscibot/toolchain-ci-rivos-test fail Testing failed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed

Commit Message

Demin Han July 19, 2024, 8:55 a.m. UTC
Currently, some binary operations between a vector and a double scalar under
RV32 cannot be translated to the .vf form and are emitted as vfmv + vxx.vv
instead.

The cause is that vec_duplicate is also expanded early to a broadcast for
double mode under RV32, and late-combine cannot process the expanded broadcast.
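
As an illustration of the kind of loop this affects, here is a minimal sketch.
The function name is hypothetical, and the build flags and instruction names
in the comment are assumptions based on the description above, not output
taken from a real compiler run.

```c
#include <stddef.h>

/* Adds a loop-invariant double scalar to every element of an array.

   Assumption: built for RV32 with the vector extension (for example
   -march=rv32gcv -mabi=ilp32d -O3), the loop is auto-vectorized.
   Without the fix, the scalar is expected to be broadcast into a
   vector register and combined with vfadd.vv; with the fix, a single
   vfadd.vf taking the scalar from an FP register is the intended
   result.  */
void
add_scalar_f64 (double *restrict dst, const double *restrict src,
                double scalar, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] + scalar;
}
```

The arithmetic result is the same either way; only the instruction selection
for the scalar operand is expected to differ.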

gcc/ChangeLog:

	* config/riscv/vector.md: Add !FLOAT_MODE_P constraint.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c: Fix test
	* gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Ditto
	* gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c: Ditto
	* gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c: Ditto
	* gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c: Ditto

Signed-off-by: demin.han <demin.han@starfivetech.com>
---
 gcc/config/riscv/vector.md                                | 3 ++-
 .../riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c           | 4 ++--
 .../riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c           | 4 ++--
 .../riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c           | 4 ++--
 .../riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c           | 4 ++--
 .../riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c        | 8 ++++----
 .../gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c   | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c       | 4 ++--
 .../gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c       | 4 ++--
 33 files changed, 68 insertions(+), 67 deletions(-)
  

Comments

Jeff Law July 19, 2024, 6:07 p.m. UTC | #1
On 7/19/24 2:55 AM, demin.han wrote:
> Currently, some binary operations between a vector and a double scalar under
> RV32 cannot be translated to the .vf form and are emitted as vfmv + vxx.vv
> instead.
> 
> The cause is that vec_duplicate is also expanded early to a broadcast for
> double mode under RV32, and late-combine cannot process the expanded broadcast.
> 
It looks like vadd-rv32gcv-nofm still isn't quite right according to the 
pre-commit testing:

https://github.com/ewlu/gcc-precommit-ci/issues/1931#issuecomment-2238752679


OK once that's fixed.  No need to wait for another review cycle.

And a note.  We need to be careful as some uarchs may pay a penalty when 
the vector unit needs to get an operand from the GP or FP register 
files.  So there could well be cases where using .vf or .vx forms is 
slower.  Consider these two scenarios.

First, we broadcast from the GP/FP register file across a vector register
outside a loop, then use a .vv form in the loop.

Second, we use a .vf or .vx form in the loop instead, without any broadcast.

In the former case we only pay the penalty for crossing register files 
once.  In the second case we'd pay it for every iteration of the loop.

Given this is going to be uarch sensitive, I don't mind biasing towards 
the .vx/.vf forms right now, but we may need to add some costing models 
to this in the future as we can test on a wider variety of uarchs.
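
The two scenarios can be put into rough numbers with a toy cost model.  This
is only a sketch: the struct, the function names and the unit costs are
hypothetical and do not come from GCC or from any measured uarch.

```c
/* Hypothetical per-operation costs in arbitrary units.  */
struct xfer_costs
{
  unsigned long vector_op;   /* one vector arithmetic or broadcast op */
  unsigned long cross_file;  /* penalty for reading a GP/FP register
                                from the vector unit */
};

/* Scenario 1: broadcast once outside the loop, .vv form inside it.
   The register-file crossing is paid a single time.  */
unsigned long
cost_hoisted_broadcast (struct xfer_costs c, unsigned long iterations)
{
  unsigned long broadcast = c.vector_op + c.cross_file;
  return broadcast + iterations * c.vector_op;
}

/* Scenario 2: .vf or .vx form inside the loop, no broadcast.
   The register-file crossing is paid on every iteration.  */
unsigned long
cost_vf_in_loop (struct xfer_costs c, unsigned long iterations)
{
  return iterations * (c.vector_op + c.cross_file);
}
```

With vector_op = 1 and cross_file = 2, 100 iterations cost 103 units in the
hoisted form against 300 with the .vf form in the loop.  With cross_file = 0
the hoisted form is always one operation worse, which is why the choice ends
up being uarch sensitive.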

jeff
  
Jeff Law Aug. 25, 2024, 9:58 p.m. UTC | #2
On Fri, Jul 19, 2024 at 12:07 PM Jeff Law <jeffreyalaw@gmail.com> wrote:

>
>
> It looks like vadd-rv32gcv-nofm still isn't quite right according to the
> pre-commit testing:
>
> https://github.com/ewlu/gcc-precommit-ci/issues/1931#issuecomment-2238752679
>
>
> OK once that's fixed.  No need to wait for another review cycle.
>
There's a reasonable chance late-combine was catching more cases that could
be turned into .vf forms.  That was pretty common when I first looked at
the late-combine changes.

Regardless,  I adjusted the vadd/vsub tests and pushed this to the trunk.

Thanks,
jeff
  
Andrew Waterman Aug. 27, 2024, 12:34 a.m. UTC | #3
On Fri, Jul 19, 2024 at 11:08 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> It looks like vadd-rv32gcv-nofm still isn't quite right according to the
> pre-commit testing:
>
> https://github.com/ewlu/gcc-precommit-ci/issues/1931#issuecomment-2238752679
>
>
> OK once that's fixed.  No need to wait for another review cycle.
>
> And a note.  We need to be careful as some uarchs may pay a penalty when
> the vector unit needs to get an operand from the GP or FP register
> files.  So there could well be cases where using .vf or .vx forms is
> slower.  Consider these two scenarios.
>
> First, we broadcast from the GP/FP register file across a vector register
> outside a loop, then use a .vv form in the loop.
>
> Second, we use a .vf or .vx form in the loop instead, without any broadcast.
>
> In the former case we only pay the penalty for crossing register files
> once.  In the second case we'd pay it for every iteration of the loop.
>
> Given this is going to be uarch sensitive, I don't mind biasing towards
> the .vx/.vf forms right now, but we may need to add some costing models
> to this in the future as we can test on a wider variety of uarchs.

Just wanted to chime in to say that this should indeed be a tuning
decision, but our mental model should bias us in favor of the .vf/.vx
forms when we don't have any additional information.

It's a safe assumption that, for all uarches, it's better to use a
.vf/.vx form if the scalar operand is used only once.  If the scalar
is loop-invariant, then it's definitely uarch-dependent as to whether
a hoisted splat is preferable to repeated use of .vf/.vx.  (For
SiFive's in-order vector units, the splat is pure overhead; the
.vf/.vx forms are preferred.  I know the same is not true of other
uarches, though.)
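
A minimal sketch of that decision rule follows.  The enum, the tuning struct
and the function are hypothetical names made up for illustration; they are not
part of GCC's RISC-V tuning infrastructure.

```c
#include <stdbool.h>
#include <stddef.h>

enum scalar_form
{
  FORM_VF_VX,          /* use the .vf/.vx instruction directly */
  FORM_HOISTED_SPLAT   /* splat outside the loop, .vv inside */
};

/* Hypothetical per-uarch tuning knob.  */
struct uarch_tune
{
  bool prefers_hoisted_splat;
};

/* TUNE may be NULL when nothing is known about the target.  */
enum scalar_form
choose_scalar_form (const struct uarch_tune *tune, bool loop_invariant)
{
  /* A scalar that is used only once: .vf/.vx is assumed to be at
     least as good on every uarch.  */
  if (!loop_invariant)
    return FORM_VF_VX;

  /* No additional information: bias towards .vf/.vx.  */
  if (tune == NULL)
    return FORM_VF_VX;

  /* Loop-invariant scalar: leave the choice to the tuning data.  */
  return tune->prefers_hoisted_splat ? FORM_HOISTED_SPLAT : FORM_VF_VX;
}
```

A boolean knob is the simplest possible encoding; a real cost hook would
presumably weigh trip counts and register pressure as well.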

There's the additional complicating factor: when the scalar operand
comes from memory, some uarches will prefer to use a strided load with
rs2=x0, rather than a scalar load followed by .vf/.vx, or a scalar
load followed by a splat.  (For SiFive's in-order vector units, this
optimization is profitable when the load is a cache miss, and it's a
de-optimization otherwise.  It isn't a case that's easy to tune for,
so thus far we've relegated it to hand-written code.)


>
>
> jeff
>
  
Demin Han Aug. 27, 2024, 12:09 p.m. UTC | #4
Hi Jeff,

Yes, some tests have been failing since the late-combine pass was introduced.
As I recall, those tests still contain .vv forms that late-combine does not turn into .vf forms.

I'll update the testcases based on my local branch after your push.

Regards,
Demin

  

Patch

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index bcedf3d79e2..d1518f3e623 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -1486,7 +1486,8 @@  (define_expand "vec_duplicate<mode>"
   {
     /* Early expand DImode broadcast in RV32 system to avoid RA reload
        generate (set (reg) (vec_duplicate:DI)).  */
-    if (maybe_gt (GET_MODE_SIZE (<VEL>mode), GET_MODE_SIZE (Pmode)))
+    bool gt_p = maybe_gt (GET_MODE_SIZE (<VEL>mode), GET_MODE_SIZE (Pmode));
+    if (!FLOAT_MODE_P (<VEL>mode) && gt_p)
       {
         riscv_vector::emit_vlmax_insn (code_for_pred_broadcast (<MODE>mode),
 				       riscv_vector::UNARY_OP, operands);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c
index db8c653b179..b9a040f2f78 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vadd-rv32gcv-nofm.c
@@ -5,7 +5,7 @@ 
 
 /* { dg-final { scan-assembler-times {\tvadd\.vv} 16 } } */
 /* { dg-final { scan-assembler-times {\tvadd\.vi} 8 } } */
-/* { dg-final { scan-assembler-times {\tvfadd\.vv} 5 } } */
-/* { dg-final { scan-assembler-times {\tvfadd\.vf} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfadd\.vv} 4 } } */
+/* { dg-final { scan-assembler-times {\tvfadd\.vf} 5 } } */
 
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_ADD" 9 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c
index d7a2d259495..0750d8efc3a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c
@@ -8,8 +8,8 @@ 
 /* { dg-final { scan-assembler-times {\tvdivu\.vv} 5 } } */
 /* { dg-final { scan-assembler-times {\tvdivu\.vx} 3 } } */
 
-/* { dg-final { scan-assembler-times {\tvfdiv\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvfdiv\.vf} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfdiv\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfdiv\.vf} 3 } } */
 
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_DIV" 16 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_RDIV" 6 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c
index 58310135ea6..7197bf2a385 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vmul-rv32gcv-nofm.c
@@ -4,6 +4,6 @@ 
 #include "vmul-template.h"
 
 /* { dg-final { scan-assembler-times {\tvmul\.vv} 16 } } */
-/* { dg-final { scan-assembler-times {\tvfmul\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvfmul\.vf} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfmul\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfmul\.vf} 3 } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_MUL" 6 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c
index aa20a90583f..c2afbde8368 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vsub-rv32gcv-nofm.c
@@ -6,8 +6,8 @@ 
 /* { dg-final { scan-assembler-times {\tvsub\.vv} 16 } } */
 /* { dg-final { scan-assembler-times {\tvrsub\.vi} 16 } } */
 
-/* { dg-final { scan-assembler-times {\tvfsub\.vv} 6 } } */
-/* { dg-final { scan-assembler-times {\tvfsub\.vf} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfsub\.vv} 7 } } */
+/* { dg-final { scan-assembler-times {\tvfsub\.vf} 3 } } */
 /* { dg-final { scan-assembler-times {\tvfrsub\.vf} 4 } } */
 /* { dg-final { scan-tree-dump-times "\.COND_LEN_SUB" 12 "optimized" } } */
 
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c
index f633d40df10..b9cfc238c73 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_copysign-rv32gcv.c
@@ -3,13 +3,13 @@ 
 
 #include "cond_copysign-template.h"
 
-/* { dg-final { scan-assembler-times {\tvfsgnj\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvfsgnj\.vf} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfsgnj\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfsgnj\.vf} 3 } } */
 /* 1. The vectorizer wraps scalar variants of copysign into vector constants which
       expand cannot handle currently.
    2. match.pd convert .COPYSIGN (1, b) + COND_MUL to AND + XOR currently.  */
 /* { dg-final { scan-assembler-times {\tvfsgnjx\.vv} 6 { xfail riscv*-*-* } } } */
-/* { dg-final { scan-assembler-times {\tvfsgnjn\.vv} 4 } } */
-/* { dg-final { scan-assembler-times {\tvfsgnjn\.vf} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfsgnjn\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {\tvfsgnjn\.vf} 3 } } */
 /* { dg-final { scan-assembler-not {\tvmerge\.vvm} } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c
index 1cdcbf2c36d..1aac30659f2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-1.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 6 } } */
-/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 12 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 18 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c
index 87ba39164a2..947e43ccde2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-2.c
@@ -28,6 +28,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 6 } } */
-/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 6 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 12 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c
index 728e4470216..8a8d7d03a42 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-3.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 6 } } */
-/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 12 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 18 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c
index 7f6cb24a3a8..e282d2c2edc 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fadd-4.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 6 } } */
-/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 12 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfadd\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 18 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c
index 4a8523d13da..ef8631dd2ed 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-1.c
@@ -33,7 +33,7 @@  TEST_ALL (DEF_LOOP)
 /* { dg-final { scan-assembler-times {vmadd\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vnmsub\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
 /* { dg-final { scan-assembler-times {vnmsub\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmadd\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmadd\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmadd\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vfnmsub\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c
index d49cdbe5715..e3aaba2c921 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-3.c
@@ -33,7 +33,7 @@  TEST_ALL (DEF_LOOP)
 /* { dg-final { scan-assembler-times {vmacc\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vfnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c
index 6f37968a222..f91bec12eac 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-4.c
@@ -33,7 +33,7 @@  TEST_ALL (DEF_LOOP)
 /* { dg-final { scan-assembler-times {vmacc\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vfnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {\tvf?merge\.v[vxi]m\t} 14 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c
index 3a3841ff7ca..381d40532e6 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-5.c
@@ -32,8 +32,8 @@  TEST_ALL (DEF_LOOP)
 /* { dg-final { scan-assembler-times {vmacc\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vfnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
 /* NOTE: 14 vmerge is need for other purpose.  */
 /* { dg-final { scan-assembler-times {\tvf?merge\.v[vxi]m\t} 14 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c
index 9d084ff0e24..cb878167619 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fma_fnma-6.c
@@ -33,8 +33,8 @@  TEST_ALL (DEF_LOOP)
 /* { dg-final { scan-assembler-times {vmacc\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
 /* { dg-final { scan-assembler-times {vnmsac\.vx\s+v[0-9]+,[a-x][0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmacc\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-times {vfnmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
 /* NOTE: 14 vmerge is need for other purpose.  */
 /* { dg-final { scan-assembler-times {\tvf?merge\.v[vxi]m\t} 14 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c
index 1ec67c37f20..95368ad38d1 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-1.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c
index d59f7db2406..c07b331d169 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-2.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c
index 6d8b93db4fc..a01ba8db5b2 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-3.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c
index eb567af346f..9aabfb51d72 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmax-4.c
@@ -29,6 +29,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmax\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c
index d53ffcacb9e..116131b009e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-1.c
@@ -6,6 +6,6 @@ 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmax-1.c"
 
-/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c
index 2cb90512983..6ac47cb0ab9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-2.c
@@ -6,6 +6,6 @@ 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmax-2.c"
 
-/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c
index 44e9be24afe..2d445a9224d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-3.c
@@ -6,6 +6,6 @@ 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmax-3.c"
 
-/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c
index 7ce291d6a40..ae642061c38 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmin-4.c
@@ -6,6 +6,6 @@ 
 #define FN(X) __builtin_fmin##X
 #include "cond_fmax-4.c"
 
-/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 4 } } */
-/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmin\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 4 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c
index 187641f4eaf..1e367b324da 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-1.c
@@ -26,6 +26,6 @@ 
 TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-times {vfnmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 { xfail riscv*-*-* } } } */
-/* { dg-final { scan-assembler-times {vfmsub\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmsub\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsub\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmsub\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c
index e99545e5dfb..3af559dd7ef 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-3.c
@@ -26,6 +26,6 @@ 
 TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-times {vfnmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 { xfail riscv*-*-* } } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c
index 456f67db38d..e777c8c4755 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-4.c
@@ -26,7 +26,7 @@ 
 TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-times {vfnmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 { xfail riscv*-*-* } } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* NOTE: 3 vmerge is need for other purpose.  */
 /* { dg-final { scan-assembler-times {\tvf?merge\.v[vxi]m\t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c
index 456f67db38d..e777c8c4755 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-5.c
@@ -26,7 +26,7 @@ 
 TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-times {vfnmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 { xfail riscv*-*-* } } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* NOTE: 3 vmerge is need for other purpose.  */
 /* { dg-final { scan-assembler-times {\tvf?merge\.v[vxi]m\t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c
index ed9897f86bb..46f2b5ff264 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fms_fnms-6.c
@@ -26,7 +26,7 @@ 
 TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-times {vfnmacc\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 { xfail riscv*-*-* } } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 1 } } */
-/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 2 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmsac\.vf\s+v[0-9]+,fa[0-9],v[0-9]+,v0.t} 3 } } */
 /* NOTE: 3 vmerge is need for other purpose.  */
 /* { dg-final { scan-assembler-times {\tvf?merge\.v[vxi]m\t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c
index 97b0c37dab8..0f85dfc4fdc 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-1.c
@@ -26,6 +26,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 6 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 9 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c
index 9ffe3ea6733..6cdb2c40d85 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-2.c
@@ -25,6 +25,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 3 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 6 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c
index a1dd46295e9..5a921cb614a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-3.c
@@ -26,6 +26,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 6 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 9 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c
index 2f59e98f062..939e6bd8f7f 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-4.c
@@ -26,6 +26,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 6 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 9 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c
index 20d230898e5..608fbef7ba9 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_fmul-5.c
@@ -25,6 +25,6 @@ 
 
 TEST_ALL (DEF_LOOP)
 
-/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 3 } } */
-/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 6 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vv\s+v[0-9]+,v[0-9]+,v[0-9]+,v0.t} 0 } } */
+/* { dg-final { scan-assembler-times {vfmul\.vf\s+v[0-9]+,v[0-9]+,fa[0-9],v0.t} 9 } } */
 /* { dg-final { scan-assembler-not {\tvf?merge\.v[vxi]m\t} } } */