test regression fix: Remove xfail for variable length targets of bb-slp-subgroups-3.c

Message ID 20240116065216.3834327-1-juzhe.zhong@rivai.ai
State New
Series test regression fix: Remove xfail for variable length targets of bb-slp-subgroups-3.c

Checks

Context                                         Check    Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64  success  Testing passed

Commit Message

juzhe.zhong@rivai.ai Jan. 16, 2024, 6:52 a.m. UTC
I noticed a recent regression:
XPASS: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 2
XPASS: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optimized: basic block" 2

Checked on both ARM SVE and RVV:

https://godbolt.org/z/jz4cYbqc8

"optimized: basic block" appears twice.

I guess ARM SVE shows the same XPASS as RVV.

Hi Andrew, could you confirm this?

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/bb-slp-subgroups-3.c: Remove XFAIL of variable length.

---
 gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Richard Biener Jan. 16, 2024, 7:43 a.m. UTC | #1
On Tue, 16 Jan 2024, Juzhe-Zhong wrote:

> I noticed a recent regression:
> XPASS: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects  scan-tree-dump-times slp2 "optimized: basic block" 2
> XPASS: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optimized: basic block" 2
> 
> Checked on both ARM SVE and RVV:
> 
> https://godbolt.org/z/jz4cYbqc8
> 
> "optimized: basic block" appears twice.
> 
> I guess ARM SVE shows the same XPASS as RVV.
> 
> Hi Andrew, could you confirm this?

How does it vectorize it?  See the comments in the testcase.  The
intent was to check that we can split the store and vectorize the
addition and the multiplication separately even when both are fed
from the same load group.  So ideally we'd add something similar to
what bb-slp-43.c does, checking that "vector operands from scalars"
does not appear in the dump.
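
A check of that kind could look roughly like the directive below.
This is only a sketch of the idea: the exact selector, and whether
bb-slp-43.c words its directive the same way, are assumptions that
would need to be verified against the testsuite.

```c
/* Sketch: require that no SLP vector operand was built from scalars.  */
/* { dg-final { scan-tree-dump-not "vector operands from scalars" "slp2" } } */
```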

> gcc/testsuite/ChangeLog:
> 
> 	* gcc.dg/vect/bb-slp-subgroups-3.c: Remove XFAIL of variable length.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> index fb719915db7..3f0d45ce4a1 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
> @@ -42,7 +42,7 @@ main (int argc, char **argv)
>  /* Because we disable the cost model, targets with variable-length
>     vectors can end up vectorizing the store to a[0..7] on its own.
>     With the cost model we do something sensible.  */
> -/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { ! amdgcn-*-* } xfail vect_variable_length } } } */
> +/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { ! amdgcn-*-* } } } } */
>  
>  /* amdgcn can do this in one vector.  */
>  /* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp2" { target amdgcn-*-* } } } */
>
  
juzhe.zhong@rivai.ai Jan. 16, 2024, 8:06 a.m. UTC | #2
I think it is vectorized with 128-bit vectors too.

  vector(4) int vect__9.9;
  vector(4) int vect__2.6;
  vector(4) int vect__1.5;
  int _1;
  int _5;
  int _11;
  int _13;
  vector(4) int _27;

  <bb 2> [local count: 1073741824]:
  vect__1.5_24 = MEM <vector(4) int> [(int *)&b];
  vect__2.6_25 = vect__1.5_24 + { 1, 2, 3, 4 };
  _1 = b[0];
  _5 = b[2];
  MEM <vector(4) int> [(int *)&a] = vect__2.6_25;
  _11 = b[4];
  _13 = b[6];
  _27 = {_1, _5, _11, _13};
  vect__9.9_28 = _27 * { 3, 4, 5, 7 };
  MEM <vector(4) int> [(int *)&a + 16B] = vect__9.9_28;


We can confirm it here: https://godbolt.org/z/6jGrEoz9s
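
For readers without the testcase at hand, the dump above is consistent
with scalar code of the following shape.  This is a sketch
reconstructed from the GIMPLE, not the verbatim contents of
bb-slp-subgroups-3.c; the function names kernel and check_kernel are
assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical reconstruction from the slp2 dump; not the verbatim
   contents of bb-slp-subgroups-3.c.  */
int a[8], b[8];

/* One 8-element store group on a[], fed by loads from b[]:
   a[0..3] are additions, a[4..7] are multiplications.  */
void
kernel (void)
{
  a[0] = b[0] + 1;
  a[1] = b[1] + 2;
  a[2] = b[2] + 3;
  a[3] = b[3] + 4;
  a[4] = b[0] * 3;
  a[5] = b[2] * 4;
  a[6] = b[4] * 5;
  a[7] = b[6] * 7;
}

/* Fill b with 0..7, run the kernel and verify all eight results.  */
void
check_kernel (void)
{
  static const int expected[8] = { 1, 3, 5, 7, 0, 8, 20, 42 };

  for (int i = 0; i < 8; i++)
    b[i] = i;
  kernel ();
  for (int i = 0; i < 8; i++)
    assert (a[i] == expected[i]);
}
```

In the dump, the additions use a real vector load of b[0..3], while the
multiplication operand _27 is assembled from four scalar loads, which
appears to be the "vector operands from scalars" case mentioned earlier
in the thread.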



  
Richard Biener Jan. 16, 2024, 8:44 a.m. UTC | #3
On Tue, 16 Jan 2024, juzhe.zhong@rivai.ai wrote:

> I think it is vectorized with 128-bit vectors too.
> 
>   vector(4) int vect__9.9;
>   vector(4) int vect__2.6;
>   vector(4) int vect__1.5;
>   int _1;
>   int _5;
>   int _11;
>   int _13;
>   vector(4) int _27;
> 
>   <bb 2> [local count: 1073741824]:
>   vect__1.5_24 = MEM <vector(4) int> [(int *)&b];
>   vect__2.6_25 = vect__1.5_24 + { 1, 2, 3, 4 };
>   _1 = b[0];
>   _5 = b[2];
>   MEM <vector(4) int> [(int *)&a] = vect__2.6_25;
>   _11 = b[4];
>   _13 = b[6];
>   _27 = {_1, _5, _11, _13};
>   vect__9.9_28 = _27 * { 3, 4, 5, 7 };
>   MEM <vector(4) int> [(int *)&a + 16B] = vect__9.9_28;
> 
> 
> We can confirm it here: https://godbolt.org/z/6jGrEoz9s

So same thing, add && { ! vect128 }?
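
One possible reading of this suggestion, sketched as a directive: keep
the xfail, but restrict it to variable-length targets that do not also
provide 128-bit vectors.  This is an assumption about what is meant
here, not the committed change, and the nested selector is
illustrative only.

```c
/* Sketch only: xfail just where 128-bit vectors are unavailable.  */
/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { ! amdgcn-*-* } xfail { vect_variable_length && { ! vect128 } } } } } */
```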

  

Patch

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
index fb719915db7..3f0d45ce4a1 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
@@ -42,7 +42,7 @@  main (int argc, char **argv)
 /* Because we disable the cost model, targets with variable-length
    vectors can end up vectorizing the store to a[0..7] on its own.
    With the cost model we do something sensible.  */
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { ! amdgcn-*-* } xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { ! amdgcn-*-* } } } } */
 
 /* amdgcn can do this in one vector.  */
 /* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp2" { target amdgcn-*-* } } } */