[v2,0/4] RISC-V: Combine vec_duplicate + v{widen}u.vv to v{widen}u.vx on GR2VR cost

Message ID 20250916115428.1176402-1-pan2.li@intel.com

Message

Li, Pan2 Sept. 16, 2025, 11:52 a.m. UTC
  From: Pan Li <pan2.li@intel.com>

This patch series introduces the combine of vec_dup + v{widen}u.vv
into v{widen}u.vx, based on the cost value of GR2VR.  Whether
late-combine performs the combine depends on the GR2VR cost; the tests
cover several cost values, like 1, 2, 15.

The following insns, widening from uint32_t to uint64_t, are included:
* vwaddu.vx
* vwsubu.vx
* vwmulu.vx

From:
 |   ...
 |   vmv.v.x
 | L1:
 |   v{widen}u.vv
 |   J L1
 |   ...

To:
 |   ...
 | L1:
 |   v{widen}u.vx
 |   J L1
 |   ...
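
For illustration only, the kind of scalar loops this combine targets can be
sketched in C as below.  The function names are hypothetical and are not
taken from the test headers in this series; whether the vectorizer actually
emits the .vx forms depends on the compiler options and the GR2VR cost in
effect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernels: each widens a uint32_t element and a uint32_t
   scalar to uint64_t before operating, which is the shape that maps to
   vwaddu / vwsubu / vwmulu.  */

void
widen_add_vx (uint64_t *out, const uint32_t *in, uint32_t scalar, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (uint64_t) in[i] + (uint64_t) scalar;
}

void
widen_sub_vx (uint64_t *out, const uint32_t *in, uint32_t scalar, size_t n)
{
  /* Unsigned 64-bit subtraction wraps modulo 2^64 when in[i] < scalar.  */
  for (size_t i = 0; i < n; i++)
    out[i] = (uint64_t) in[i] - (uint64_t) scalar;
}

void
widen_mul_vx (uint64_t *out, const uint32_t *in, uint32_t scalar, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (uint64_t) in[i] * (uint64_t) scalar;
}
```

In the "From" form the scalar is broadcast once with vmv.v.x outside the
loop and the loop body uses the .vv insn; in the "To" form the loop body
takes the scalar register directly through the .vx insn.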

The following test suites pass for this patch series.
* The rv64gcv full regression test.

Pan Li (4):
  RISC-V: Combine vec_duplicate + vwaddu.vv to vwaddu.vx on GR2VR cost
  RISC-V: Add test for vec_duplicate + vwaddu.vv signed combine with GR2VR cost 0, 1 and 15
  RISC-V: Add test for vec_duplicate + vwsubu.vv signed combine with GR2VR cost 0, 1 and 15
  RISC-V: Add test for vec_duplicate + vwmulu.vv signed combine with GR2VR cost 0, 1 and 15

 gcc/config/riscv/autovec-opt.md               |  42 +++++
 gcc/config/riscv/iterators.md                 |   3 +
 gcc/config/riscv/vector-iterators.md          |  16 ++
 .../riscv/rvv/autovec/vx_vf/vx-1-u16.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u32.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u64.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u16.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u32.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-2-u64.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u16.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u32.c        |   6 +
 .../riscv/rvv/autovec/vx_vf/vx-3-u64.c        |   6 +
 .../rvv/autovec/vx_vf/vx_vwaddu-run-1-u64.c   |  18 ++
 .../rvv/autovec/vx_vf/vx_vwmulu-run-1-u64.c   |  18 ++
 .../rvv/autovec/vx_vf/vx_vwsubu-run-1-u64.c   |  18 ++
 .../riscv/rvv/autovec/vx_vf/vx_widen.h        |  36 ++++
 .../riscv/rvv/autovec/vx_vf/vx_widen_data.h   | 159 ++++++++++++++++++
 .../riscv/rvv/autovec/vx_vf/vx_widen_vx_run.h |  27 +++
 18 files changed, 391 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vwaddu-run-1-u64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vwmulu-run-1-u64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vwsubu-run-1-u64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_widen.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_widen_data.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_widen_vx_run.h
  

Comments

Robin Dapp Sept. 16, 2025, 7:06 p.m. UTC | #1
> This patch would like to introduce the combine of vec_dup + v{widen}u.vv
> into v{widen}u.vx on the cost value of GR2VR.  The late-combine will take
> place if the cost of GR2VRlike 1, 2, 15 in test.

This series LGTM, thanks.