arm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]
Checks
Context | Check | Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm | success | Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 | success | Testing passed
Commit Message
In this PR, we have to handle a case where MVE predicates are supplied
as a const_int in which individual predicate elements have invalid
boolean values (such as 0xc for a 4-bit boolean element). To avoid the
ICE, we canonicalize them, replacing each non-zero element with -1
(all bits set).
2024-04-26  Christophe Lyon  <christophe.lyon@linaro.org>
            Jakub Jelinek  <jakub@redhat.com>

        PR target/114801
        gcc/
        * config/arm/arm-mve-builtins.cc
        (function_expander::add_input_operand): Handle CONST_INT
        predicates.
        gcc/testsuite/
        * gcc.target/arm/mve/pr114801.c: New test.
---
gcc/config/arm/arm-mve-builtins.cc | 21 +++++++++++-
gcc/testsuite/gcc.target/arm/mve/pr114801.c | 36 +++++++++++++++++++++
2 files changed, 56 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/arm/mve/pr114801.c
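
The canonicalization described in the commit message can be tried outside
the compiler. The following is only a sketch in plain C: canonicalize_pred
and its bits_per_lane parameter are hypothetical names that do not appear
in the patch, and uint16_t stands in for the HImode constant. The two
mask-and-shift steps are the ones the patch applies for V8BImode (2 bits
per element) and V4BImode (4 bits per element).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone helper (not part of GCC).  Spread any set bit
   across its whole predicate element, so that each element ends up
   either all zeros or all ones.  bits_per_lane is 1, 2 or 4.  */
static uint16_t
canonicalize_pred (uint16_t pred, int bits_per_lane)
{
  assert (bits_per_lane == 1 || bits_per_lane == 2 || bits_per_lane == 4);

  uint32_t xi = pred;

  if (bits_per_lane >= 2)
    /* Make both bits of every 2-bit pair equal to their OR.  */
    xi |= ((xi & 0x5555u) << 1) | ((xi & 0xaaaau) >> 1);

  if (bits_per_lane == 4)
    /* Pairs are now uniform; make both pairs of every nibble equal.  */
    xi |= ((xi & 0x3333u) << 2) | ((xi & 0xccccu) >> 2);

  return (uint16_t) xi;
}
```

With this helper, 0xcccc becomes 0xffff (65535) for 4-bit elements and
stays 0xcccc (52428) for 2-bit and 1-bit elements, which matches the
immediates the new test expects.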
Comments
On Fri, Apr 26, 2024 at 11:10:12PM +0000, Christophe Lyon wrote:
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/pr114801.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include <arm_mve.h>
> +
> +/*
> +** test_32:
> +**...
> +** mov r[0-9]+, #65535 @ movhi
> +**...
> +*/
> +uint32x4_t test_32() {
> + return vdupq_m_n_u32(vdupq_n_u32(0), 0, 0xcccc);
Just a testcase nit. I think testing 0xcccc isn't that useful,
it tests the same 4 bits 4 times.
Might be more interesting to test 4 different 4 bit elements,
one of them 0 (to verify it doesn't turn that into all ones),
one all 1s (that is the other valid case) and then 2 random
other values in between.
> +}
> +
> +/*
> +** test_16:
> +**...
> +** mov r[0-9]+, #52428 @ movhi
> +**...
> +*/
> +uint16x8_t test_16() {
> + return vdupq_m_n_u16(vdupq_n_u16(0), 0, 0xcccc);
And for these it can actually test all 4 possible 2 bit elements,
so say 0x3021
> +}
> +
> +/*
> +** test_8:
> +**...
> +** mov r[0-9]+, #52428 @ movhi
> +**...
> +*/
> +uint8x16_t test_8() {
> + return vdupq_m_n_u8(vdupq_n_u8(0), 0, 0xcccc);
and here use some random pattern.
BTW, the patch is ok for 14.1 if it is approved and committed today
(so that it can be cherry-picked tomorrow morning at latest to the branch).
Jakub
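
The coverage argument in the review above can be checked mechanically. The
following is only a sketch in plain C: distinct_lane_patterns is a
hypothetical helper, not something from the patch or the testsuite. It
splits a 16-bit predicate constant into elements of the given width and
counts how many different element values occur.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC).  Count the distinct element
   values found in PRED when it is split into elements of BITS_PER_LANE
   bits each.  BITS_PER_LANE is 1, 2 or 4.  */
static int
distinct_lane_patterns (uint16_t pred, int bits_per_lane)
{
  assert (bits_per_lane == 1 || bits_per_lane == 2 || bits_per_lane == 4);

  unsigned lane_mask = (1u << bits_per_lane) - 1;
  unsigned seen = 0;	/* Bit V is set if some element has value V.  */

  for (int shift = 0; shift < 16; shift += bits_per_lane)
    seen |= 1u << ((pred >> shift) & lane_mask);

  /* Count the bits set in SEEN.  */
  int count = 0;
  for (; seen != 0; seen &= seen - 1)
    count++;
  return count;
}
```

For 4-bit elements 0xcccc yields a single distinct value, since every
element is 0xc. For 2-bit elements 0x3021 yields all four possible values,
while 0xcccc yields only two (0 and 3).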
On Mon, 29 Apr 2024 at 15:29, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Apr 26, 2024 at 11:10:12PM +0000, Christophe Lyon wrote:
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/arm/mve/pr114801.c
> > @@ -0,0 +1,36 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> > +/* { dg-add-options arm_v8_1m_mve } */
> > +/* { dg-final { check-function-bodies "**" "" "" } } */
> > +
> > +#include <arm_mve.h>
> > +
> > +/*
> > +** test_32:
> > +**...
> > +** mov r[0-9]+, #65535 @ movhi
> > +**...
> > +*/
> > +uint32x4_t test_32() {
> > + return vdupq_m_n_u32(vdupq_n_u32(0), 0, 0xcccc);
>
> Just a testcase nit. I think testing 0xcccc isn't that useful,
> it tests the same 4 bits 4 times.
> Might be more interesting to test 4 different 4 bit elements,
> one of them 0 (to verify it doesn't turn that into all ones),
> one all 1s (that is the other valid case) and then 2 random
> other values in between.
>
> > +}
> > +
> > +/*
> > +** test_16:
> > +**...
> > +** mov r[0-9]+, #52428 @ movhi
> > +**...
> > +*/
> > +uint16x8_t test_16() {
> > + return vdupq_m_n_u16(vdupq_n_u16(0), 0, 0xcccc);
>
> And for these it can actually test all 4 possible 2 bit elements,
> so say 0x3021
>
> > +}
> > +
> > +/*
> > +** test_8:
> > +**...
> > +** mov r[0-9]+, #52428 @ movhi
> > +**...
> > +*/
> > +uint8x16_t test_8() {
> > + return vdupq_m_n_u8(vdupq_n_u8(0), 0, 0xcccc);
>
> and here use some random pattern.
>
> BTW, the patch is ok for 14.1 if it is approved and committed today
> (so that it can be cherry-picked tomorrow morning at latest to the branch).
Thanks for your comments, I'll update the testcase, but Andre provided
additional info in the PR:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114801#c17
I tried just removing the call to gcc_unreachable in
rtx_vector_builder::find_cached_value and that does the trick, but I'm
worried by such a change.
Christophe
>
> Jakub
>
@@ -43,6 +43,7 @@
#include "stringpool.h"
#include "attribs.h"
#include "diagnostic.h"
+#include "rtx-vector-builder.h"
#include "arm-protos.h"
#include "arm-builtins.h"
#include "arm-mve-builtins.h"
@@ -2205,7 +2206,25 @@ function_expander::add_input_operand (insn_code icode, rtx x)
       mode = GET_MODE (x);
     }
   else if (VALID_MVE_PRED_MODE (mode))
-    x = gen_lowpart (mode, x);
+    {
+      if (CONST_INT_P (x) && (mode == V8BImode || mode == V4BImode))
+        {
+          /* In V8BI or V4BI each element has 2 or 4 bits, if those
+             bits aren't all the same, it is UB and gen_lowpart might
+             ICE.  Canonicalize all the 2 or 4 bits to all ones if any
+             of them is non-zero.  */
+          unsigned HOST_WIDE_INT xi = UINTVAL (x);
+          xi |= ((xi & 0x5555) << 1) | ((xi & 0xaaaa) >> 1);
+          if (mode == V4BImode)
+            xi |= ((xi & 0x3333) << 2) | ((xi & 0xcccc) >> 2);
+          x = gen_int_mode (xi, HImode);
+        }
+      else if (SUBREG_P (x))
+        /* gen_lowpart on a SUBREG can ICE.  */
+        x = force_reg (GET_MODE (x), x);
+
+      x = gen_lowpart (mode, x);
+    }
   m_ops.safe_grow (m_ops.length () + 1, true);
   create_input_operand (&m_ops.last (), x, mode);
new file mode 100644
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <arm_mve.h>
+
+/*
+** test_32:
+**...
+** mov r[0-9]+, #65535 @ movhi
+**...
+*/
+uint32x4_t test_32() {
+ return vdupq_m_n_u32(vdupq_n_u32(0), 0, 0xcccc);
+}
+
+/*
+** test_16:
+**...
+** mov r[0-9]+, #52428 @ movhi
+**...
+*/
+uint16x8_t test_16() {
+ return vdupq_m_n_u16(vdupq_n_u16(0), 0, 0xcccc);
+}
+
+/*
+** test_8:
+**...
+** mov r[0-9]+, #52428 @ movhi
+**...
+*/
+uint8x16_t test_8() {
+ return vdupq_m_n_u8(vdupq_n_u8(0), 0, 0xcccc);
+}