vect: Tighten check for SLP memory groups [PR103517]

Message ID mpt35ncy55v.fsf@arm.com
State Committed
Commit 1e625a44f6f3001cea31e0f7c563943ecba92b68
Series vect: Tighten check for SLP memory groups [PR103517]

Commit Message

Richard Sandiford Dec. 1, 2021, 10:45 a.m. UTC
  When checking for compatible stmts, vect_build_slp_tree_1 did:

	       && !(STMT_VINFO_GROUPED_ACCESS (stmt_info)
		    && (first_stmt_code == ARRAY_REF
			|| first_stmt_code == BIT_FIELD_REF
			|| first_stmt_code == INDIRECT_REF
			|| first_stmt_code == COMPONENT_REF
			|| first_stmt_code == MEM_REF)))

That is, it allowed any rhs_code as long as the first_stmt_code
looked valid.  This had the effect of allowing IFN_MASK_LOAD
to be paired with an earlier non-call code (but didn't allow
the reverse).

This patch makes the check symmetrical.
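
To make the asymmetry concrete, here is a minimal standalone model in C. It is a sketch, not the GCC code: the enum values and the functions memory_ref_code_p, old_exempt_p, new_exempt_p and check_asymmetry are hypothetical names, and CODE_CALL stands in for a call such as IFN_MASK_LOAD on the assumption that a call's code is never one of the five reference codes. Each *_exempt_p function models only the clause that tolerates two differing codes; the rest of the real condition is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes named in the check.
   CODE_CALL models a call such as IFN_MASK_LOAD (an assumption of
   this sketch); CODE_PLUS models any other non-memory code.  */
enum code
{
  CODE_ARRAY_REF,
  CODE_BIT_FIELD_REF,
  CODE_INDIRECT_REF,
  CODE_COMPONENT_REF,
  CODE_MEM_REF,
  CODE_CALL,
  CODE_PLUS
};

/* True for the five reference codes listed in the check.  */
bool
memory_ref_code_p (enum code c)
{
  return (c == CODE_ARRAY_REF
          || c == CODE_BIT_FIELD_REF
          || c == CODE_INDIRECT_REF
          || c == CODE_COMPONENT_REF
          || c == CODE_MEM_REF);
}

/* Old behaviour: a differing code is tolerated whenever the access is
   grouped and the FIRST statement's code is a reference code.  The
   later statement's code is ignored.  */
bool
old_exempt_p (bool grouped, enum code first, enum code rhs)
{
  (void) rhs;
  return grouped && memory_ref_code_p (first);
}

/* Patched behaviour: both codes must be reference codes.  */
bool
new_exempt_p (bool grouped, enum code first, enum code rhs)
{
  return grouped && memory_ref_code_p (first) && memory_ref_code_p (rhs);
}

/* The old check accepts "reference first, call second" but rejects the
   reverse order; the new check rejects both orders.  */
void
check_asymmetry (void)
{
  assert (old_exempt_p (true, CODE_ARRAY_REF, CODE_CALL));
  assert (!old_exempt_p (true, CODE_CALL, CODE_ARRAY_REF));
  assert (!new_exempt_p (true, CODE_ARRAY_REF, CODE_CALL));
  assert (!new_exempt_p (true, CODE_CALL, CODE_ARRAY_REF));
}
```

In this model, two different reference codes (for example CODE_ARRAY_REF followed by CODE_MEM_REF) are still tolerated by new_exempt_p when the access is grouped, which is the case the exemption exists for.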

Still testing on x86_64-linux-gnu.  OK if testing passes, or doesn't
this seem like the right approach?

Richard


gcc/
	PR tree-optimization/103517
	* tree-vect-slp.c (vect_build_slp_tree_1): When allowing two
	different component references, check the codes of both of them,
	rather than just the first.

gcc/testsuite/
	PR tree-optimization/103517
	* gcc.dg/vect/pr103517.c: New test.
---
 gcc/testsuite/gcc.dg/vect/pr103517.c | 13 +++++++++++++
 gcc/tree-vect-slp.c                  |  7 ++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c
  

Comments

Richard Biener Dec. 1, 2021, 1:14 p.m. UTC | #1
On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> When checking for compatible stmts, vect_build_slp_tree_1 did:
>
>                && !(STMT_VINFO_GROUPED_ACCESS (stmt_info)
>                     && (first_stmt_code == ARRAY_REF
>                         || first_stmt_code == BIT_FIELD_REF
>                         || first_stmt_code == INDIRECT_REF
>                         || first_stmt_code == COMPONENT_REF
>                         || first_stmt_code == MEM_REF)))
>
> That is, it allowed any rhs_code as long as the first_stmt_code
> looked valid.  This had the effect of allowing IFN_MASK_LOAD
> to be paired with an earlier non-call code (but didn't allow
> the reverse).
>
> This patch makes the check symmetrical.
>
> Still testing on x86_64-linux-gnu.  OK if testing passes, or doesn't
> this seem like the right approach?

It is indeed too weak a comparison of the classification of the first
and the followup operands; some larger refactoring is probably
needed to improve things here (note how we compare STMT_VINFO_GROUPED_ACCESS
of the followup against the tree codes of the first stmt but also later
compare first_stmt_load_p against load_p).

The proposed patch looks reasonable (but then we could drop
the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?),
so OK.

Thanks,
Richard.

  
Richard Sandiford Dec. 1, 2021, 1:25 p.m. UTC | #2
Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>
> It's indeed a too weak comparison of the classification of the first
> and the followup operands, some larger refactoring is probably
> needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS
> of the followup against the tree codes of the first stmt but also later
> compare first_stmt_load_p against load_p).
>
> The proposed patch looks reasonable (but then we could drop
> the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?),

Yeah, was wondering about that.  Seemed safer to keep it, since without
it we might pair non-memory BIT_FIELD_REFs with other things.  I guess
the same goes for the first stmt though, and the mismatch ought to be
caught later anyway.

> so OK.

Thanks,
Richard

  
Richard Biener Dec. 1, 2021, 1:28 p.m. UTC | #3
On Wed, Dec 1, 2021 at 2:25 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >
> > It's indeed a too weak comparison of the classification of the first
> > and the followup operands, some larger refactoring is probably
> > needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS
> > of the followup against the tree codes of the first stmt but also later
> > compare first_stmt_load_p against load_p).
> >
> > The proposed patch looks reasonable (but then we could drop
> > the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?),
>
> Yeah, was wondering about that.  Seemed safer to keep it, since without
> it we might pair non-memory BIT_FIELD_REFs with other things.  I guess
> the same goes for the first stmt though, and the mismatch ought to be
> caught later anyway.

Hmm, yeah.  I guess some classify_stmt () returning a custom enum
and comparing that might clean up this whole thing.  But nothing for stage3.

Richard.
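
The classify_stmt () idea above is only named in the thread, not specified. Purely as an illustration, one possible shape is sketched below in standalone C. Everything in it is an assumption: struct stmt_summary, enum stmt_kind, the KIND_* values and same_kind_p are hypothetical and do not correspond to anything in tree-vect-slp.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the properties the existing check looks at.
   In GCC these would come from the stmt_vec_info and the gimple stmt.  */
struct stmt_summary
{
  bool grouped_access;    /* models STMT_VINFO_GROUPED_ACCESS  */
  bool memory_ref_code;   /* code is ARRAY_REF, MEM_REF, etc.  */
  bool masked_load_call;  /* models a call to IFN_MASK_LOAD  */
  bool phi;               /* models phi_p  */
};

/* Hypothetical classification; the real set of kinds would need to
   follow whatever the vectorizer actually has to tell apart.  */
enum stmt_kind
{
  KIND_PHI,
  KIND_MASKED_LOAD,
  KIND_GROUPED_MEMORY_REF,
  KIND_OTHER
};

/* Map a statement to exactly one kind, so that the same rule is
   applied to the first statement and to every later one.  */
enum stmt_kind
classify_stmt (const struct stmt_summary *s)
{
  if (s->phi)
    return KIND_PHI;
  if (s->masked_load_call)
    return KIND_MASKED_LOAD;
  if (s->grouped_access && s->memory_ref_code)
    return KIND_GROUPED_MEMORY_REF;
  return KIND_OTHER;
}

/* Symmetric by construction: swapping the arguments cannot change the
   result.  Two KIND_OTHER statements would still need their codes
   compared separately; that part is not modelled here.  */
bool
same_kind_p (const struct stmt_summary *first, const struct stmt_summary *later)
{
  return classify_stmt (first) == classify_stmt (later);
}
```

Because both statements go through the same function, a comparison of this form cannot accept a pair in one order and reject it in the other, which is the property the patch restores by hand.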

  

Patch

diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c b/gcc/testsuite/gcc.dg/vect/pr103517.c
new file mode 100644
index 00000000000..de87fc48f84
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr103517.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* i?86-*-* } } */
+
+int a;
+short b, c;
+extern short d[];
+void e() {
+  for (short f = 1; f < (short)a; f += 2)
+    if (d[f + 1]) {
+      b = d[f];
+      c = d[f + 1];
+    }
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 7bff5118bd0..bc22ffeed82 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1121,7 +1121,12 @@  vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
 			|| first_stmt_code == BIT_FIELD_REF
 			|| first_stmt_code == INDIRECT_REF
 			|| first_stmt_code == COMPONENT_REF
-			|| first_stmt_code == MEM_REF)))
+			|| first_stmt_code == MEM_REF)
+		    && (rhs_code == ARRAY_REF
+			|| rhs_code == BIT_FIELD_REF
+			|| rhs_code == INDIRECT_REF
+			|| rhs_code == COMPONENT_REF
+			|| rhs_code == MEM_REF)))
 	      || first_stmt_load_p != load_p
 	      || first_stmt_phi_p != phi_p)
 	    {