[v2,00/14] ARM/MVE use vectors of boolean for predicates

Message ID 20211013101554.2732342-1-christophe.lyon@foss.st.com

Message

Christophe LYON Oct. 13, 2021, 10:15 a.m. UTC
This is v2 of this patch series, addressing the comments I received.
The changes v1 -> v2 are:

- Patch 3: added an executable test, and updated
  check_effective_target_arm_mve_hw
- Patch 4: split into patch 4 and patch 14 (to keep numbering the same
  for the other patches)
- Patch 5: updated arm_class_likely_spilled_p as suggested.
- Patch 7: updated test_vector_ops_duplicate in simplify-rtx.c as
  suggested.
- Patch 8: added V2DI -> HI/hi mapping in MVE_VPRED/MVE_vpred
  iterators, removed now useless mve_vpselq_<supf>v2di, and fixed
  mov<mode> expander.
- Patch 9: arm_mode_to_pred_mode now returns opt_machine_mode, removed
  useless floating-point checks in vec_cmpu.
- Patch 12: replaced hi with v8bi in v2di load/store instructions

I'll squash patch 2 with patch 9 and patch 3 with patch 8.

Original text:

This patch series addresses PR 100757 and 101325 by representing
vectors of predicates (MVE VPR.P0 register) as vectors of booleans
rather than using HImode.
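To make the representation change concrete, here is a minimal sketch in plain C, not taken from the series. It assumes the usual MVE layout in which the 16-bit VPR.P0 value holds one bit per byte of a 128-bit vector, so a lane of 32-bit elements owns 4 predicate bits and a lane of 16-bit elements owns 2. The helper name pack_predicate is hypothetical and only models the idea of "one boolean per lane" versus "one 16-bit integer"; it says nothing about how GCC lays out its boolean vector modes internally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only (hypothetical helper, not GCC code):
   pack one boolean per lane into a 16-bit predicate that has one
   bit per byte of a 128-bit vector.  nlanes is expected to be
   4, 8 or 16; any other value yields 0.  */
uint16_t pack_predicate(const bool *lanes, size_t nlanes)
{
    if (nlanes == 0 || nlanes > 16 || 16 % nlanes != 0)
        return 0;

    size_t bits_per_lane = 16 / nlanes;              /* 4, 2 or 1 */
    unsigned lane_mask = (1u << bits_per_lane) - 1u; /* 0xF, 0x3 or 0x1 */
    unsigned pred = 0;

    for (size_t i = 0; i < nlanes; i++)
        if (lanes[i])
            pred |= lane_mask << (i * bits_per_lane);

    return (uint16_t)pred;
}
```

With four 32-bit lanes set to {true, false, true, false}, the model sets bits 0-3 and 8-11. Seen as a plain 16-bit integer, that value carries no information about how many lanes it describes, which is the kind of ambiguity a per-lane boolean vector avoids.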

As this implies a lot of mostly mechanical changes, I have tried to
split the patches in a way that should help reviewers, but the split
is a bit artificial.

Patches 1-3 add new tests.

Patches 4-6 are small independent improvements.

Patch 7 implements the predicate qualifier, but does not change any
builtin yet.

Patch 8 is the first of the two main patches, and uses the new
qualifier to describe the vcmp and vpsel builtins that are useful for
auto-vectorization of comparisons.
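For readers unfamiliar with the pattern, the loop below is a sketch of the kind of source that auto-vectorization of comparisons targets: an element-wise compare whose result selects between two values. It is not one of the tests in this series, the function name select_gt is hypothetical, and whether a given compiler actually emits a vector compare followed by a predicated select for it depends on the target and options, which is not checked here.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only (hypothetical function, not from the series):
   for each element, compare a[i] with b[i] and use the result to
   choose between x[i] and y[i].  A vectorizer can treat the
   comparison as producing a per-lane predicate and the ternary
   as a select controlled by that predicate.  */
void select_gt(int32_t *restrict dst,
               const int32_t *restrict a, const int32_t *restrict b,
               const int32_t *restrict x, const int32_t *restrict y,
               size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (a[i] > b[i]) ? x[i] : y[i];
}
```

The restrict qualifiers tell the compiler that the arrays do not overlap, which removes one common obstacle to vectorizing such a loop.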

Patch 9 is the second main patch, which fixes the vcond_mask expander.

Patches 10-13 convert almost all the remaining builtins with HI
operands to use the predicate qualifier.  After these, there are still
a few builtins with HI operands left, about which I am not sure: vctp,
vpnot, load-gather and store-scatter with v2di operands.  In fact,
patches 11/12 update some STR/LDR qualifiers in a way that breaks
these v2di builtins although existing tests still pass.

Christophe Lyon (14):
  arm: Add new tests for comparison vectorization with Neon and MVE
  arm: Add tests for PR target/100757
  arm: Add tests for PR target/101325
  arm: Add GENERAL_AND_VPR_REGS regclass
  arm: Add support for VPR_REG in arm_class_likely_spilled_p
  arm: Fix mve_vmvnq_n_<supf><mode> argument mode
  arm: Implement MVE predicates as vectors of booleans
  arm: Implement auto-vectorized MVE comparisons with vectors of boolean
    predicates
  arm: Fix vcond_mask expander for MVE (PR target/100757)
  arm: Convert remaining MVE vcmp builtins to predicate qualifiers
  arm: Convert more MVE builtins to predicate qualifiers
  arm: Convert more load/store MVE builtins to predicate qualifiers
  arm: Convert more MVE/CDE builtins to predicate qualifiers
  arm: Add VPR_REG to ALL_REGS

 gcc/config/arm/arm-builtins.c                 | 228 +++--
 gcc/config/arm/arm-modes.def                  |   5 +
 gcc/config/arm/arm-protos.h                   |   3 +-
 gcc/config/arm/arm-simd-builtin-types.def     |   4 +
 gcc/config/arm/arm.c                          | 130 ++-
 gcc/config/arm/arm.h                          |   5 +-
 gcc/config/arm/arm_mve_builtins.def           | 746 ++++++++--------
 gcc/config/arm/iterators.md                   |   5 +
 gcc/config/arm/mve.md                         | 832 ++++++++++--------
 gcc/config/arm/neon.md                        |  39 +
 gcc/config/arm/vec-common.md                  |  52 --
 gcc/simplify-rtx.c                            |  26 +-
 .../arm/acle/cde-mve-full-assembly.c          | 264 +++---
 .../gcc.target/arm/simd/mve-vcmp-f32-2.c      |  32 +
 .../gcc.target/arm/simd/neon-compare-1.c      |  78 ++
 .../gcc.target/arm/simd/neon-compare-2.c      |  13 +
 .../gcc.target/arm/simd/neon-compare-3.c      |  14 +
 .../arm/simd/neon-compare-scalar-1.c          |  57 ++
 .../gcc.target/arm/simd/neon-vcmp-f16.c       |  12 +
 .../gcc.target/arm/simd/neon-vcmp-f32-2.c     |  15 +
 .../gcc.target/arm/simd/neon-vcmp-f32-3.c     |  12 +
 .../gcc.target/arm/simd/neon-vcmp-f32.c       |  12 +
 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
 .../gcc.target/arm/simd/pr100757-2.c          |  20 +
 .../gcc.target/arm/simd/pr100757-3.c          |  20 +
 .../gcc.target/arm/simd/pr100757-4.c          |  19 +
 gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
 .../gcc.target/arm/simd/pr101325-2.c          |  19 +
 gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
 gcc/testsuite/lib/target-supports.exp         |   3 +-
 30 files changed, 1611 insertions(+), 1109 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c

Comments

Richard Sandiford Oct. 15, 2021, 3:10 p.m. UTC | #1
Christophe Lyon via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> This is v2 of this patch series, addressing the comments I received.
> The changes v1 -> v2 are:
>
> - Patch 3: added an executable test, and updated
>   check_effective_target_arm_mve_hw
> - Patch 4: split into patch 4 and patch 14 (to keep numbering the same
>   for the other patches)
> - Patch 5: updated arm_class_likely_spilled_p as suggested.
> - Patch 7: updated test_vector_ops_duplicate in simplify-rtx.c as
>   suggested.
> - Patch 8: added V2DI -> HI/hi mapping in MVE_VPRED/MVE_vpred
>   iterators, removed now useless mve_vpselq_<supf>v2di, and fixed
>   mov<mode> expander.
> - Patch 9: arm_mode_to_pred_mode now returns opt_machine_mode, removed
>   useless floating-point checks in vec_cmpu.
> - Patch 12: replaced hi with v8bi in v2di load/store instructions
>
> I'll squash patch 2 with patch 9 and patch 3 with patch 8.

This looks good to me apart from the question in 12/14 and a couple
of other (very) minor nits.

Thanks,
Richard
