rtl: Try to remove EH edges after {pro,epi}logue generation [PR90259]

Message ID 454e83e7-e7a2-6e10-e051-b33c2d1b580d@linux.ibm.com
State New
Headers
Series rtl: Try to remove EH edges after {pro,epi}logue generation [PR90259] |

Commit Message

Kewen.Lin Nov. 8, 2022, 2:48 a.m. UTC
  Hi,

After prologue and epilogue generation, the judgement on whether
one memory access onto stack frame may trap or not could change,
since we get more exact stack information by now.

As PR90259 shows, some memory access becomes impossible to trap
any more after prologue and epilogue generation, it can make
subsequent optimization be able to remove it if safe, but it
results in unexpected control flow status due to REG_EH_REGION
note missing.

This patch proposes to try to remove EH edges with function
purge_all_dead_edges after prologue and epilogue generation,
it simplifies CFG as early as we can and don't need any fixup
in downstream passes.

CFG simplification result with PR90259's case as example:

*before*

   18: %1:TF=call [`__gcc_qdiv'] argc:0
      REG_EH_REGION 0x2
   77: NOTE_INSN_BASIC_BLOCK 3
   19: NOTE_INSN_DELETED
   20: NOTE_INSN_DELETED
  110: [%31:SI+0x20]=%1:DF
      REG_EH_REGION 0x2
  116: NOTE_INSN_BASIC_BLOCK 4
  111: [%31:SI+0x28]=%2:DF
      REG_EH_REGION 0x2
   22: NOTE_INSN_BASIC_BLOCK 5
  108: %0:DF=[%31:SI+0x20]
      REG_EH_REGION 0x2
  117: NOTE_INSN_BASIC_BLOCK 6
  109: %1:DF=[%31:SI+0x28]
      REG_EH_REGION 0x2
   79: NOTE_INSN_BASIC_BLOCK 7
   26: [%31:SI+0x18]=%0:DF
  104: pc=L69
  105: barrier

*after*

   18: %1:TF=call [`__gcc_qdiv'] argc:0
      REG_EH_REGION 0x2
   77: NOTE_INSN_BASIC_BLOCK 3
   19: NOTE_INSN_DELETED
   20: NOTE_INSN_DELETED
  110: [%31:SI+0x20]=%1:DF
  111: [%31:SI+0x28]=%2:DF
  108: %0:DF=[%31:SI+0x20]
  109: %1:DF=[%31:SI+0x28]
   26: [%31:SI+0x18]=%0:DF
  104: pc=L69
  105: barrier

Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

BR,
Kewen

-----
	PR rtl-optimization/90259

gcc/ChangeLog:

	* function.cc (rest_of_handle_thread_prologue_and_epilogue): Add
	parameter fun, and call function purge_all_dead_edges.
	(pass_thread_prologue_and_epilogue::execute): Name unamed parameter
	as fun, and use it for rest_of_handle_thread_prologue_and_epilogue.

gcc/testsuite/ChangeLog:

	* g++.target/powerpc/pr90259.C: New.
---
 gcc/function.cc                            |  13 ++-
 gcc/testsuite/g++.target/powerpc/pr90259.C | 103 +++++++++++++++++++++
 2 files changed, 113 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/pr90259.C

--
2.35.4
  

Comments

Richard Biener Nov. 8, 2022, 10:09 a.m. UTC | #1
On Tue, Nov 8, 2022 at 3:49 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi,
>
> After prologue and epilogue generation, the judgement on whether
> one memory access onto stack frame may trap or not could change,
> since we get more exact stack information by now.
>
> As PR90259 shows, some memory access becomes impossible to trap
> any more after prologue and epilogue generation, it can make
> subsequent optimization be able to remove it if safe, but it
> results in unexpected control flow status due to REG_EH_REGION
> note missing.
>
> This patch proposes to try to remove EH edges with function
> purge_all_dead_edges after prologue and epilogue generation,
> it simplifies CFG as early as we can and don't need any fixup
> in downstream passes.
>
> CFG simplification result with PR90259's case as example:
>
> *before*
>
>    18: %1:TF=call [`__gcc_qdiv'] argc:0
>       REG_EH_REGION 0x2
>    77: NOTE_INSN_BASIC_BLOCK 3
>    19: NOTE_INSN_DELETED
>    20: NOTE_INSN_DELETED
>   110: [%31:SI+0x20]=%1:DF
>       REG_EH_REGION 0x2
>   116: NOTE_INSN_BASIC_BLOCK 4
>   111: [%31:SI+0x28]=%2:DF
>       REG_EH_REGION 0x2
>    22: NOTE_INSN_BASIC_BLOCK 5
>   108: %0:DF=[%31:SI+0x20]
>       REG_EH_REGION 0x2
>   117: NOTE_INSN_BASIC_BLOCK 6
>   109: %1:DF=[%31:SI+0x28]
>       REG_EH_REGION 0x2
>    79: NOTE_INSN_BASIC_BLOCK 7
>    26: [%31:SI+0x18]=%0:DF
>   104: pc=L69
>   105: barrier
>
> *after*
>
>    18: %1:TF=call [`__gcc_qdiv'] argc:0
>       REG_EH_REGION 0x2
>    77: NOTE_INSN_BASIC_BLOCK 3
>    19: NOTE_INSN_DELETED
>    20: NOTE_INSN_DELETED
>   110: [%31:SI+0x20]=%1:DF
>   111: [%31:SI+0x28]=%2:DF
>   108: %0:DF=[%31:SI+0x20]
>   109: %1:DF=[%31:SI+0x28]
>    26: [%31:SI+0x18]=%0:DF
>   104: pc=L69
>   105: barrier
>
> Bootstrapped and regtested on x86_64-redhat-linux,
> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
>
> Is it ok for trunk?

It looks reasonable - OK if the others CCed have no comments.

Thanks,
Richard.

> BR,
> Kewen
>
> -----
>         PR rtl-optimization/90259
>
> gcc/ChangeLog:
>
>         * function.cc (rest_of_handle_thread_prologue_and_epilogue): Add
>         parameter fun, and call function purge_all_dead_edges.
>         (pass_thread_prologue_and_epilogue::execute): Name unamed parameter
>         as fun, and use it for rest_of_handle_thread_prologue_and_epilogue.
>
> gcc/testsuite/ChangeLog:
>
>         * g++.target/powerpc/pr90259.C: New.
> ---
>  gcc/function.cc                            |  13 ++-
>  gcc/testsuite/g++.target/powerpc/pr90259.C | 103 +++++++++++++++++++++
>  2 files changed, 113 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/pr90259.C
>
> diff --git a/gcc/function.cc b/gcc/function.cc
> index 6474a663b30..3757ded547d 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -6540,7 +6540,7 @@ make_pass_leaf_regs (gcc::context *ctxt)
>  }
>
>  static unsigned int
> -rest_of_handle_thread_prologue_and_epilogue (void)
> +rest_of_handle_thread_prologue_and_epilogue (function *fun)
>  {
>    /* prepare_shrink_wrap is sensitive to the block structure of the control
>       flow graph, so clean it up first.  */
> @@ -6557,6 +6557,13 @@ rest_of_handle_thread_prologue_and_epilogue (void)
>       Fix that up.  */
>    fixup_partitions ();
>
> +  /* After prologue and epilogue generation, the judgement on whether
> +     one memory access onto stack frame may trap or not could change,
> +     since we get more exact stack information by now.  So try to
> +     remove any EH edges here, see PR90259.  */
> +  if (fun->can_throw_non_call_exceptions)
> +    purge_all_dead_edges ();
> +
>    /* Shrink-wrapping can result in unreachable edges in the epilogue,
>       see PR57320.  */
>    cleanup_cfg (optimize ? CLEANUP_EXPENSIVE : 0);
> @@ -6625,9 +6632,9 @@ public:
>    {}
>
>    /* opt_pass methods: */
> -  unsigned int execute (function *) final override
> +  unsigned int execute (function * fun) final override
>      {
> -      return rest_of_handle_thread_prologue_and_epilogue ();
> +      return rest_of_handle_thread_prologue_and_epilogue (fun);
>      }
>
>  }; // class pass_thread_prologue_and_epilogue
> diff --git a/gcc/testsuite/g++.target/powerpc/pr90259.C b/gcc/testsuite/g++.target/powerpc/pr90259.C
> new file mode 100644
> index 00000000000..db75ac7fe02
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/powerpc/pr90259.C
> @@ -0,0 +1,103 @@
> +/* { dg-require-effective-target long_double_ibm128 } */
> +/* { dg-options "-O2 -ffloat-store -fgcse -fnon-call-exceptions -fno-forward-propagate -fno-omit-frame-pointer -fstack-protector-all" } */
> +/* { dg-add-options long_double_ibm128 } */
> +
> +/* Verify there is no ICE.  */
> +
> +template <int a> struct b
> +{
> +  static constexpr int c = a;
> +};
> +template <bool a> using d = b<a>;
> +struct e
> +{
> +  int f;
> +  int
> +  g ()
> +  {
> +    return __builtin_ceil (f / (long double) h);
> +  }
> +  float h;
> +};
> +template <typename, typename> using k = d<!bool ()>;
> +template <typename> class n
> +{
> +public:
> +  e ae;
> +  void af ();
> +};
> +template <typename l>
> +void
> +n<l>::af ()
> +{
> +  ae.g ();
> +}
> +template <bool> using m = int;
> +template <typename ag, typename ah, typename ai = m<k<ag, ah>::c>>
> +using aj = n<ai>;
> +struct o
> +{
> +  void
> +  af ()
> +  {
> +    al.af ();
> +  }
> +  aj<int, int> al;
> +};
> +template <typename> class am;
> +template <typename i> class ao
> +{
> +protected:
> +  static i *ap (int);
> +};
> +template <typename, typename> class p;
> +template <typename ar, typename i, typename... j> class p<ar (j...), i> : ao<i>
> +{
> +public:
> +  static ar
> +  as (const int &p1, j...)
> +  {
> +    (*ao<i>::ap (p1)) (j ()...);
> +  }
> +};
> +template <typename ar, typename... j> class am<ar (j...)>
> +{
> +  template <typename, typename> using av = int;
> +
> +public:
> +  template <typename i, typename = av<d<!bool ()>, void>,
> +           typename = av<i, void>>
> +  am (i);
> +  using aw = ar (*) (const int &, j...);
> +  aw ax;
> +};
> +template <typename ar, typename... j>
> +template <typename i, typename, typename>
> +am<ar (j...)>::am (i)
> +{
> +  ax = p<ar (j...), i>::as;
> +}
> +struct G
> +{
> +  void ba (am<void (o)>);
> +};
> +struct q
> +{
> +  q ()
> +  {
> +    G a;
> +    a.ba (r ());
> +  }
> +  struct r
> +  {
> +    void
> +    operator() (o p1)
> +    try
> +      {
> +       p1.af ();
> +      }
> +    catch (int)
> +      {
> +      }
> +  };
> +} s;
> --
> 2.35.4
  
Eric Botcazou Nov. 8, 2022, 10:37 a.m. UTC | #2
> It looks reasonable - OK if the others CCed have no comments.

My only comment is that it needs to be tested with languages enabling -fnon-
call-exceptions by default (Ada & Go), if not already done.
  
Kewen.Lin Nov. 9, 2022, 7:05 a.m. UTC | #3
Hi Richi and Eric,

Thanks for your reviews and comments!

on 2022/11/8 18:37, Eric Botcazou wrote:
>> It looks reasonable - OK if the others CCed have no comments.
> 
> My only comment is that it needs to be tested with languages enabling -fnon-
> call-exceptions by default (Ada & Go), if not already done.
> 

The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
not Ada.  I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
on powerpc64le-linux-gnu, the result looked good.  Both x86 and aarch64
cfarm machines which I used for testing don't have gnat installed, do you
think testing Ada on ppc64le is enough?

BR,
Kewen
  
Eric Botcazou Nov. 9, 2022, 7:56 a.m. UTC | #4
> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
> not Ada.  I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
> on powerpc64le-linux-gnu, the result looked good.  Both x86 and aarch64
> cfarm machines which I used for testing don't have gnat installed, do you
> think testing Ada on ppc64le is enough?

Sure, thanks for having done it!
  
Kewen.Lin Nov. 10, 2022, 3:30 a.m. UTC | #5
on 2022/11/9 15:56, Eric Botcazou wrote:
>> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
>> not Ada.  I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
>> on powerpc64le-linux-gnu, the result looked good.  Both x86 and aarch64
>> cfarm machines which I used for testing don't have gnat installed, do you
>> think testing Ada on ppc64le is enough?
> 
> Sure, thanks for having done it!
> 

Thanks for confirming.

I'm going to push this next Monday if there is no objection or further
comments in this week.

BR,
Kewen
  
Kewen.Lin Nov. 16, 2022, 2:33 a.m. UTC | #6
on 2022/11/10 11:30, Kewen.Lin wrote:
> on 2022/11/9 15:56, Eric Botcazou wrote:
>>> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
>>> not Ada.  I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
>>> on powerpc64le-linux-gnu, the result looked good.  Both x86 and aarch64
>>> cfarm machines which I used for testing don't have gnat installed, do you
>>> think testing Ada on ppc64le is enough?
>>
>> Sure, thanks for having done it!
>>
> 
> Thanks for confirming.
> 
> I'm going to push this next Monday if there is no objection or further
> comments in this week.
> 

Committed in r13-4079.  Do we want this to be backported a week later?

BR,
Kewen
  
Richard Biener Nov. 16, 2022, 3:35 p.m. UTC | #7
On Wed, Nov 16, 2022 at 3:33 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2022/11/10 11:30, Kewen.Lin wrote:
> > on 2022/11/9 15:56, Eric Botcazou wrote:
> >>> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
> >>> not Ada.  I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
> >>> on powerpc64le-linux-gnu, the result looked good.  Both x86 and aarch64
> >>> cfarm machines which I used for testing don't have gnat installed, do you
> >>> think testing Ada on ppc64le is enough?
> >>
> >> Sure, thanks for having done it!
> >>
> >
> > Thanks for confirming.
> >
> > I'm going to push this next Monday if there is no objection or further
> > comments in this week.
> >
>
> Committed in r13-4079.  Do we want this to be backported a week later?

It doesn't seem to be a regression and the report was from a fuzzed testcase
so I think no.

Richard.

> BR,
> Kewen
  

Patch

diff --git a/gcc/function.cc b/gcc/function.cc
index 6474a663b30..3757ded547d 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -6540,7 +6540,7 @@  make_pass_leaf_regs (gcc::context *ctxt)
 }

 static unsigned int
-rest_of_handle_thread_prologue_and_epilogue (void)
+rest_of_handle_thread_prologue_and_epilogue (function *fun)
 {
   /* prepare_shrink_wrap is sensitive to the block structure of the control
      flow graph, so clean it up first.  */
@@ -6557,6 +6557,13 @@  rest_of_handle_thread_prologue_and_epilogue (void)
      Fix that up.  */
   fixup_partitions ();

+  /* After prologue and epilogue generation, the judgement on whether
+     one memory access onto stack frame may trap or not could change,
+     since we get more exact stack information by now.  So try to
+     remove any EH edges here, see PR90259.  */
+  if (fun->can_throw_non_call_exceptions)
+    purge_all_dead_edges ();
+
   /* Shrink-wrapping can result in unreachable edges in the epilogue,
      see PR57320.  */
   cleanup_cfg (optimize ? CLEANUP_EXPENSIVE : 0);
@@ -6625,9 +6632,9 @@  public:
   {}

   /* opt_pass methods: */
-  unsigned int execute (function *) final override
+  unsigned int execute (function * fun) final override
     {
-      return rest_of_handle_thread_prologue_and_epilogue ();
+      return rest_of_handle_thread_prologue_and_epilogue (fun);
     }

 }; // class pass_thread_prologue_and_epilogue
diff --git a/gcc/testsuite/g++.target/powerpc/pr90259.C b/gcc/testsuite/g++.target/powerpc/pr90259.C
new file mode 100644
index 00000000000..db75ac7fe02
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr90259.C
@@ -0,0 +1,103 @@ 
+/* { dg-require-effective-target long_double_ibm128 } */
+/* { dg-options "-O2 -ffloat-store -fgcse -fnon-call-exceptions -fno-forward-propagate -fno-omit-frame-pointer -fstack-protector-all" } */
+/* { dg-add-options long_double_ibm128 } */
+
+/* Verify there is no ICE.  */
+
+template <int a> struct b
+{
+  static constexpr int c = a;
+};
+template <bool a> using d = b<a>;
+struct e
+{
+  int f;
+  int
+  g ()
+  {
+    return __builtin_ceil (f / (long double) h);
+  }
+  float h;
+};
+template <typename, typename> using k = d<!bool ()>;
+template <typename> class n
+{
+public:
+  e ae;
+  void af ();
+};
+template <typename l>
+void
+n<l>::af ()
+{
+  ae.g ();
+}
+template <bool> using m = int;
+template <typename ag, typename ah, typename ai = m<k<ag, ah>::c>>
+using aj = n<ai>;
+struct o
+{
+  void
+  af ()
+  {
+    al.af ();
+  }
+  aj<int, int> al;
+};
+template <typename> class am;
+template <typename i> class ao
+{
+protected:
+  static i *ap (int);
+};
+template <typename, typename> class p;
+template <typename ar, typename i, typename... j> class p<ar (j...), i> : ao<i>
+{
+public:
+  static ar
+  as (const int &p1, j...)
+  {
+    (*ao<i>::ap (p1)) (j ()...);
+  }
+};
+template <typename ar, typename... j> class am<ar (j...)>
+{
+  template <typename, typename> using av = int;
+
+public:
+  template <typename i, typename = av<d<!bool ()>, void>,
+	    typename = av<i, void>>
+  am (i);
+  using aw = ar (*) (const int &, j...);
+  aw ax;
+};
+template <typename ar, typename... j>
+template <typename i, typename, typename>
+am<ar (j...)>::am (i)
+{
+  ax = p<ar (j...), i>::as;
+}
+struct G
+{
+  void ba (am<void (o)>);
+};
+struct q
+{
+  q ()
+  {
+    G a;
+    a.ba (r ());
+  }
+  struct r
+  {
+    void
+    operator() (o p1)
+    try
+      {
+	p1.af ();
+      }
+    catch (int)
+      {
+      }
+  };
+} s;