Adjust testcase for O2 vectorization enabling

Message ID 20211011021047.66652-1-hongtao.liu@intel.com
State New
Headers
Series Adjust testcase for O2 vectorization enabling |

Commit Message

Liu, Hongtao Oct. 11, 2021, 2:10 a.m. UTC
  libgomp/ChangeLog:

	* testsuite/libgomp.graphite/force-parallel-8.c: Add -fno-tree-vectorize.
---
 libgomp/testsuite/libgomp.graphite/force-parallel-8.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Kewen.Lin Oct. 11, 2021, 2:22 a.m. UTC | #1
Hi Hongtao,

on 2021/10/11 上午10:10, liuhongt via Gcc-patches wrote:
> libgomp/ChangeLog:
> 
> 	* testsuite/libgomp.graphite/force-parallel-8.c: Add -fno-tree-vectorize.
> ---
>  libgomp/testsuite/libgomp.graphite/force-parallel-8.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
> index f9e07039172..0f3d138cebd 100644
> --- a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
> +++ b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
> @@ -1,4 +1,4 @@
> -/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1" } */
> +/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1 -fno-tree-vectorize" } */
>  
>  #define N 1500
>  
> 

Thanks for doing this!

This issue was observed in rs6000 specific PR102658 as well.

I've looked into it a bit, it's caused by the "conditional store replacement" which
is originally disabled without vectorization as below code.

  /* If either vectorization or if-conversion is disabled then do
     not sink any stores.  */
  if (param_max_stores_to_sink == 0
      || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
      || !flag_tree_loop_if_convert)
    return false;

The new change makes the innermost loop look like

for (int c1 = 0; c1 <= 1499; c1 += 1) {
  if (c1 <= 500) {
     S_10(c0, c1);
  } else {
      S_9(c0, c1);
  }
  S_11(c0, c1);
} 

and can not be splitted as:

for (int c1 = 0; c1 <= 500; c1 += 1)
  S_10(c0, c1);

for (int c1 = 501; c1 <= 1499; c1 += 1)
  S_9(c0, c1);

So instead of disabling vectorization, could we just disable this cs replacement
with parameter "--param max-stores-to-sink=0"?

I tested this proposal on ppc64le, it should work as well.

What do you think of it?

BR,
Kewen
  
Hongtao Liu Oct. 11, 2021, 2:39 a.m. UTC | #2
On Mon, Oct 11, 2021 at 10:23 AM Kewen.Lin via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi Hongtao,
>
> on 2021/10/11 上午10:10, liuhongt via Gcc-patches wrote:
> > libgomp/ChangeLog:
> >
> >       * testsuite/libgomp.graphite/force-parallel-8.c: Add -fno-tree-vectorize.
> > ---
> >  libgomp/testsuite/libgomp.graphite/force-parallel-8.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
> > index f9e07039172..0f3d138cebd 100644
> > --- a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
> > +++ b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
> > @@ -1,4 +1,4 @@
> > -/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1" } */
> > +/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1 -fno-tree-vectorize" } */
> >
> >  #define N 1500
> >
> >
>
> Thanks for doing this!
>
> This issue was observed in rs6000 specific PR102658 as well.
>
> I've looked into it a bit, it's caused by the "conditional store replacement" which
> is originally disabled without vectorization as below code.
>
>   /* If either vectorization or if-conversion is disabled then do
>      not sink any stores.  */
>   if (param_max_stores_to_sink == 0
>       || (!flag_tree_loop_vectorize && !flag_tree_slp_vectorize)
>       || !flag_tree_loop_if_convert)
>     return false;
>
> The new change makes the innermost loop look like
>
> for (int c1 = 0; c1 <= 1499; c1 += 1) {
>   if (c1 <= 500) {
>      S_10(c0, c1);
>   } else {
>       S_9(c0, c1);
>   }
>   S_11(c0, c1);
> }
>
> and can not be splitted as:
>
> for (int c1 = 0; c1 <= 500; c1 += 1)
>   S_10(c0, c1);
>
> for (int c1 = 501; c1 <= 1499; c1 += 1)
>   S_9(c0, c1);
>
> So instead of disabling vectorization, could we just disable this cs replacement
> with parameter "--param max-stores-to-sink=0"?
>
> I tested this proposal on ppc64le, it should work as well.
>
> What do you think of it?
I can confirm it also works for x86, I'll adjust the patch.
Thank you for the analysis.
>
> BR,
> Kewen
  

Patch

diff --git a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
index f9e07039172..0f3d138cebd 100644
--- a/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
+++ b/libgomp/testsuite/libgomp.graphite/force-parallel-8.c
@@ -1,4 +1,4 @@ 
-/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1" } */
+/* { dg-additional-options "-fdisable-tree-thread1 -fdisable-tree-vrp-thread1 -fno-tree-vectorize" } */
 
 #define N 1500