s390: testsuite: Fix forwprop-4{0,1}.c

Message ID 20240422062306.2696720-1-stefansf@linux.ibm.com
State New
Series s390: testsuite: Fix forwprop-4{0,1}.c

Checks

Context                                          Check    Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm       success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm       success  Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64   success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64   success  Testing passed

Commit Message

Stefan Schulze Frielinghaus April 22, 2024, 6:23 a.m. UTC
The tests fail on s390 since can_vec_perm_const_p fails and therefore
the bit insert/ref, which r14-3381-g27de9aa152141e aims to eliminate,
survive.  Strictly speaking, the tests only fail if the target supports
vectors; for targets prior to z13, or with -mesa, the emulated vector
operations are optimized out.

The easiest fix would be to skip the entire test for s390.  Another
would be to xfail when vector support is available, hoping that we
eventually get an xpass for a future machine generation or as gcc
advances.  That is what this patch implements.  To do so I added a new
effective-target test s390_mvx which checks whether vector support is
available.  Maybe this is already over-engineered for a simple test?
Any thoughts?
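
For illustration, the probe behind s390_mvx boils down to a preprocessor
check on __VX__, which the patch relies on being defined when the vector
facility is enabled.  A minimal sketch in plain C, turned into a value
instead of a compile error; HAVE_S390_VX and have_s390_vx are names made
up here and are not part of the patch:

```c
/* Sketch only: mirrors the __VX__ check used by the s390_mvx probe,
   but yields a value instead of a compile error.  HAVE_S390_VX and
   have_s390_vx are hypothetical names used for illustration.  */
#if defined __VX__
# define HAVE_S390_VX 1
#else
# define HAVE_S390_VX 0
#endif

/* Return 1 if this translation unit was compiled with __VX__ defined,
   0 otherwise.  */
int
have_s390_vx (void)
{
  return HAVE_S390_VX;
}
```

The effective-target proc uses #error instead, so that a failed compile
is what signals "no vector facility" to the testsuite.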
---
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c |  4 ++--
 gcc/testsuite/lib/target-supports.exp       | 14 ++++++++++++++
 3 files changed, 18 insertions(+), 4 deletions(-)
  

Comments

Andreas Krebbel April 22, 2024, 8:46 a.m. UTC | #1
Hi Stefan,

due to that missed optimization we currently generate silly code for these two tests and should
really fix this (once gcc enters stage1). So just skipping them on s390x would definitely be the
wrong choice, I think.

I think our vectorize_vec_perm_const correctly rejects this permute pattern, since it would require
a load from the literal pool. The question is why we have to rely on this being turned into a permute
first to get rid of the obviously redundant assignments. Shouldn't fwprop be able to handle this
without it?
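
To make "obviously redundant" concrete, here is a rough sketch of the kind of code in question. The
test bodies are not shown in full in this thread, so the functions below are only written in the
spirit of forwprop-40.c and forwprop-41.c; the names v4si, reinsert_same_lane and reinsert_and_set
are hypothetical, and GCC's generic vector extension is assumed.

```c
/* Sketch only: functions in the spirit of forwprop-40.c/forwprop-41.c,
   written with GCC's generic vector extension.  The names v4si,
   reinsert_same_lane and reinsert_and_set are hypothetical.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Extracts lane 0 and writes it straight back: the result equals A,
   so neither the extract nor the insert is needed.  */
v4si
reinsert_same_lane (v4si a)
{
  int b = a[0];
  a[0] = b;
  return a;
}

/* Lane 2 is rewritten with its own value (redundant); only the write
   of C into lane 1 changes the vector.  */
v4si
reinsert_and_set (v4si a, int c)
{
  int b = a[2];
  a[2] = b;
  a[1] = c;
  return a;
}
```

Compiling such functions with -O2 -fdump-tree-optimized and looking at the dump shows whether any
BIT_INSERT_EXPR or BIT_FIELD_REF survives.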

I'm ok with your patch, but please also open a BZ for it and perhaps mention it in a comment close
to the xfail.

Thanks!

Andreas

  
Richard Biener April 22, 2024, 7:06 p.m. UTC | #2
On Mon, Apr 22, 2024 at 10:47 AM Andreas Krebbel <krebbel@linux.ibm.com> wrote:
>
> Hi Stefan,
>
> due to that missed optimization we currently generate silly code for these two tests and should
> really fix this (after gcc entering stage1). So just skipping it on s390x would definitely be the
> wrong choice I think.
>
> I think our vectorize_vec_perm_const correctly rejects this permute pattern, since it would require
> a load from literal pool. Question is why we do have to rely on this being turned into a permute
> first to get rid of the obviously redundant assignments. Shouldn't fwprop be able to handle this
> without it?

We do not detect "redundant" BIT_INSERT, but the match.pd pattern could eventually detect
this case (ISTR we have one doing that but I may be mistaken).

> I'm ok with your patch, but please also open a BZ for it and perhaps mention it in the comment close
> to the xfail.
>
> Thanks!
>
> Andreas
  

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c
index 7513497f552..b67e3e93a7f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-40.c
@@ -10,5 +10,5 @@  vector int g(vector int a)
   return a;
 }
 
-/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 0 "optimized" { xfail s390_mvx } } } */
+/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail s390_mvx } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c
index b1e75797a90..0f119675207 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-41.c
@@ -11,6 +11,6 @@  vector int g(vector int a, int c)
   return a;
 }
 
-/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "BIT_INSERT_EXPR" 1 "optimized" { xfail s390_mvx } } } */
+/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 0 "optimized" { xfail s390_mvx } } } */
 /* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "optimized" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index edce672c0e2..5a692baa8ef 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12380,6 +12380,20 @@  proc check_effective_target_profile_update_atomic {} {
     } "-fprofile-update=atomic -fprofile-generate"]
 }
 
+# Return 1 if the target has a vector facility.
+proc check_effective_target_s390_mvx { } {
+    if ![istarget s390*-*-*] then {
+	return 0;
+    }
+
+    return [check_no_compiler_messages_nocache s390_mvx assembly {
+	#if !defined __VX__
+	#error no vector facility.
+	#endif
+	int dummy;
+    } [current_compiler_flags]]
+}
+
 # Return 1 if vector (va - vector add) instructions are understood by
 # the assembler and can be executed.  This also covers checking for
 # the VX kernel feature.  A kernel without that feature does not