[RFC] tree-ssa-sink: do not sink to in front of setjmp

Message ID d6ba28f2-3d88-ed2c-e81-ba16e84f31ca@ispras.ru
State New
Headers
Series [RFC] tree-ssa-sink: do not sink to in front of setjmp |

Commit Message

Alexander Monakov Dec. 13, 2021, 2:25 p.m. UTC
  Greetings!

While testing our patch that reimplements -Wclobbered on GIMPLE we found
a case where tree-ssa-sink moves a statement to a basic block in front
of a setjmp call.

I am confident that this is unintended and should be considered invalid
GIMPLE. One of the edges incoming to a setjmp BB will be an edge from
the ABNORMAL_DISPATCHER block, corresponding to transfer of control flow
from a longjmp-like function and resulting in a "second return" from
setjmp. When that happens, it is not possible for GIMPLE statements in
front of setjmp to be somehow re-executed after longjmp.

I am unsure how this could be prevented "once and for all" so the
following patch just attacks one place (out of three) in
tree-ssa-sink by checking 'bb_has_abnormal_pred (sinkbb)'. Alexey
(Cc'ed) bootstrapped and regtested the patch on trunk.

The testcase is just

struct __jmp_buf_tag { };
typedef struct __jmp_buf_tag jmp_buf[1];
struct globals { jmp_buf listingbuf; };
extern struct globals *const ptr_to_globals;
void foo()
{
  if ( _setjmp ( ((*ptr_to_globals).listingbuf )))
    ;
}

Before tree-ssa-sink we have

void foo ()
{
   struct globals * ptr_to_globals.0_1;
   struct __jmp_buf_tag[1] * _2;

   <bb 2> :
   ptr_to_globals.0_1 = ptr_to_globals;
   _2 = &ptr_to_globals.0_1->listingbuf;

   <bb 3> :
   _setjmp (_2);
   goto <bb 5>; [INV]

   <bb 4> :
   .ABNORMAL_DISPATCHER (0);

   <bb 5> :
   return;

}

And tree-ssa-sink yields (see BB 3)

Sinking _2 = &ptr_to_globals.0_1->listingbuf;
  from bb 2 to bb 3
void foo ()
{
   struct globals * ptr_to_globals.0_1;
   struct __jmp_buf_tag[1] * _2;

   <bb 2> :
   ptr_to_globals.0_1 = ptr_to_globals;

   <bb 3> :
   _2 = &ptr_to_globals.0_1->listingbuf;
   _setjmp (_2);
   goto <bb 5>; [INV]

   <bb 4> :
   .ABNORMAL_DISPATCHER (0);

   <bb 5> :
   return;

}

The patch:
  

Comments

Richard Biener Dec. 13, 2021, 2:45 p.m. UTC | #1
On December 13, 2021 3:25:47 PM GMT+01:00, Alexander Monakov <amonakov@ispras.ru> wrote:
>Greetings!
>
>While testing our patch that reimplements -Wclobbered on GIMPLE we found
>a case where tree-ssa-sink moves a statement to a basic block in front
>of a setjmp call.
>
>I am confident that this is unintended and should be considered invalid
>GIMPLE. 

Does CFG validation not catch this? That is, doesn't setjmp force the start of a new BB?

One of the edges incoming to a setjmp BB will be an edge from
>the ABNORMAL_DISPATCHER block, corresponding to transfer of control flow
>from a longjmp-like function and resulting in a "second return" from
>setjmp. When that happens, it is not possible for GIMPLE statements in
>front of setjmp to be somehow re-executed after longjmp.
>
>I am unsure how this could be prevented "once and for all" so the
>following patch just attacks one place (out of three) in
>tree-ssa-sink by checking 'bb_has_abnormal_pred (sinkbb)'. Alexey
>(Cc'ed) bootstrapped and regtested the patch on trunk.

I think sinking relies on dominance and post dominance here but post dominance may be too fragile with the abnormal cycles which are likely not backwards reachable from exit. 

That said, checking for abnormal preds is OK, I just want to make sure we detect the invalid CFG - do we? 

>The testcase is just
>
>struct __jmp_buf_tag { };
>typedef struct __jmp_buf_tag jmp_buf[1];
>struct globals { jmp_buf listingbuf; };
>extern struct globals *const ptr_to_globals;
>void foo()
>{
>  if ( _setjmp ( ((*ptr_to_globals).listingbuf )))
>    ;
>}
>
>Before tree-ssa-sink we have
>
>void foo ()
>{
>   struct globals * ptr_to_globals.0_1;
>   struct __jmp_buf_tag[1] * _2;
>
>   <bb 2> :
>   ptr_to_globals.0_1 = ptr_to_globals;
>   _2 = &ptr_to_globals.0_1->listingbuf;
>
>   <bb 3> :
>   _setjmp (_2);
>   goto <bb 5>; [INV]
>
>   <bb 4> :
>   .ABNORMAL_DISPATCHER (0);
>
>   <bb 5> :
>   return;
>
>}
>
>And tree-ssa-sink yields (see BB 3)
>
>Sinking _2 = &ptr_to_globals.0_1->listingbuf;
>  from bb 2 to bb 3
>void foo ()
>{
>   struct globals * ptr_to_globals.0_1;
>   struct __jmp_buf_tag[1] * _2;
>
>   <bb 2> :
>   ptr_to_globals.0_1 = ptr_to_globals;
>
>   <bb 3> :
>   _2 = &ptr_to_globals.0_1->listingbuf;
>   _setjmp (_2);
>   goto <bb 5>; [INV]
>
>   <bb 4> :
>   .ABNORMAL_DISPATCHER (0);
>
>   <bb 5> :
>   return;
>
>}
>
>The patch:
>
>diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
>index c5d67840be3..317e2563114 100644
>--- a/gcc/tree-ssa-sink.c
>+++ b/gcc/tree-ssa-sink.c
>@@ -461,6 +461,9 @@ statement_sink_location (gimple *stmt, basic_block frombb,
> 	  else
> 	    *togsi = gsi_after_labels (sinkbb);
> 
>+	  if (bb_has_abnormal_pred (sinkbb))
>+	    return false;
>+
> 	  return true;
> 	}
>     }
  
Alexander Monakov Dec. 13, 2021, 3:20 p.m. UTC | #2
On Mon, 13 Dec 2021, Richard Biener wrote:

> On December 13, 2021 3:25:47 PM GMT+01:00, Alexander Monakov <amonakov@ispras.ru> wrote:
> >Greetings!
> >
> >While testing our patch that reimplements -Wclobbered on GIMPLE we found
> >a case where tree-ssa-sink moves a statement to a basic block in front
> >of a setjmp call.
> >
> >I am confident that this is unintended and should be considered invalid
> >GIMPLE. 
> 
> Does CFG validation not catch this? That is, doesn't setjmp force the start of
> a new BB?

Oh, good point. There's stmt_start_bb_p which returns true for setjmp, but
gimple_verify_flow_info doesn't check it. I guess we can try adding that
and collect the fallout on bootstrap/regtest.

> I think sinking relies on dominance and post dominance here but post dominance
> may be too fragile with the abnormal cycles which are likely not backwards
> reachable from exit. 
> 
> That said, checking for abnormal preds is OK, I just want to make sure we
> detect the invalid CFG - do we? 

As above, no, otherwise it would have been caught much earlier than ICE'ing
our -Wclobbered patch :)

Thank you.
Alexander
  
Алексей Нурмухаметов Dec. 14, 2021, 11:10 a.m. UTC | #3
On 13.12.2021 18:20, Alexander Monakov wrote:
> On Mon, 13 Dec 2021, Richard Biener wrote:
>
>> On December 13, 2021 3:25:47 PM GMT+01:00, Alexander Monakov <amonakov@ispras.ru> wrote:
>>> Greetings!
>>>
>>> While testing our patch that reimplements -Wclobbered on GIMPLE we found
>>> a case where tree-ssa-sink moves a statement to a basic block in front
>>> of a setjmp call.
>>>
>>> I am confident that this is unintended and should be considered invalid
>>> GIMPLE.
>> Does CFG validation not catch this? That is, doesn't setjmp force the start of
>> a new BB?
> Oh, good point. There's stmt_start_bb_p which returns true for setjmp, but
> gimple_verify_flow_info doesn't check it. I guess we can try adding that
> and collect the fallout on bootstrap/regtest.

Bootstrap looks good, but testsuite has some regression (the applied 
patch is below).

The overall number of unexpected failures and unresolved testcases is 
around 100. The diff is in attachment.
>> I think sinking relies on dominance and post dominance here but post dominance
>> may be too fragile with the abnormal cycles which are likely not backwards
>> reachable from exit.
>>
>> That said, checking for abnormal preds is OK, I just want to make sure we
>> detect the invalid CFG - do we?
> As above, no, otherwise it would have been caught much earlier than ICE'ing
> our -Wclobbered patch :)
>
> Thank you.
> Alexander

The patch for CFG validation:

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index ebbd894ae03..92b08d1d6d8 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -5663,6 +5663,7 @@ gimple_verify_flow_info (void)
         }

        /* Verify that body of basic block BB is free of control flow.  */
+      gimple *prev_stmt = NULL;
        for (; !gsi_end_p (gsi); gsi_next (&gsi))
         {
           gimple *stmt = gsi_stmt (gsi);
@@ -5674,6 +5675,14 @@ gimple_verify_flow_info (void)
               err = 1;
             }

+         if (prev_stmt && stmt_starts_bb_p (stmt, prev_stmt))
+           {
+             error ("setjmp in the middle of basic block %d", bb->index);
+             err = 1;
+           }
+         if (!is_gimple_debug (stmt))
+           prev_stmt = stmt;
+
           if (stmt_ends_bb_p (stmt))
             found_ctrl_stmt = true;
  
Richard Biener Jan. 3, 2022, 1:41 p.m. UTC | #4
On Tue, Dec 14, 2021 at 12:10 PM Алексей Нурмухаметов
<nurmukhametov@ispras.ru> wrote:
>
> On 13.12.2021 18:20, Alexander Monakov wrote:
> > On Mon, 13 Dec 2021, Richard Biener wrote:
> >
> >> On December 13, 2021 3:25:47 PM GMT+01:00, Alexander Monakov <amonakov@ispras.ru> wrote:
> >>> Greetings!
> >>>
> >>> While testing our patch that reimplements -Wclobbered on GIMPLE we found
> >>> a case where tree-ssa-sink moves a statement to a basic block in front
> >>> of a setjmp call.
> >>>
> >>> I am confident that this is unintended and should be considered invalid
> >>> GIMPLE.
> >> Does CFG validation not catch this? That is, doesn't setjmp force the start of
> >> a new BB?
> > Oh, good point. There's stmt_start_bb_p which returns true for setjmp, but
> > gimple_verify_flow_info doesn't check it. I guess we can try adding that
> > and collect the fallout on bootstrap/regtest.
>
> Bootstrap looks good, but testsuite has some regression (the applied
> patch is below).
>
> The overall number of unexpected failures and unresolved testcases is
> around 100. The diff is in attachment.
> >> I think sinking relies on dominance and post dominance here but post dominance
> >> may be too fragile with the abnormal cycles which are likely not backwards
> >> reachable from exit.
> >>
> >> That said, checking for abnormal preds is OK, I just want to make sure we
> >> detect the invalid CFG - do we?
> > As above, no, otherwise it would have been caught much earlier than ICE'ing
> > our -Wclobbered patch :)
> >
> > Thank you.
> > Alexander
>
> The patch for CFG validation:
>
> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index ebbd894ae03..92b08d1d6d8 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -5663,6 +5663,7 @@ gimple_verify_flow_info (void)
>          }
>
>         /* Verify that body of basic block BB is free of control flow.  */
> +      gimple *prev_stmt = NULL;
>         for (; !gsi_end_p (gsi); gsi_next (&gsi))
>          {
>            gimple *stmt = gsi_stmt (gsi);
> @@ -5674,6 +5675,14 @@ gimple_verify_flow_info (void)
>                err = 1;
>              }
>
> +         if (prev_stmt && stmt_starts_bb_p (stmt, prev_stmt))

stmt_starts_bb_p is really a helper used during CFG build, I'd rather
test explicitely for a GIMPLE call with ECF_RETURNS_TWICE, or maybe,
verify that iff a block has abnormal predecessors it starts with such
a call (because IIRC we in some cases elide abnormal edges and then
it's OK to move "down" the calls).  So yes, if a block has abnormal preds
it should start with a ECF_RETURNS_TWICE call, I think we cannot
verify the reverse reliably - abnormal edges cannot easily be re-built
in late stages (it's a bug that we do so during RTL expansion).


> +           {
> +             error ("setjmp in the middle of basic block %d", bb->index);
> +             err = 1;
> +           }
> +         if (!is_gimple_debug (stmt))
> +           prev_stmt = stmt;
> +
>            if (stmt_ends_bb_p (stmt))
>              found_ctrl_stmt = true;
>
  
Alexander Monakov Jan. 3, 2022, 4:35 p.m. UTC | #5
On Mon, 3 Jan 2022, Richard Biener wrote:

> > @@ -5674,6 +5675,14 @@ gimple_verify_flow_info (void)
> >                err = 1;
> >              }
> >
> > +         if (prev_stmt && stmt_starts_bb_p (stmt, prev_stmt))
> 
> stmt_starts_bb_p is really a helper used during CFG build, I'd rather
> test explicitely for a GIMPLE call with ECF_RETURNS_TWICE, or maybe,
> verify that iff a block has abnormal predecessors it starts with such
> a call (because IIRC we in some cases elide abnormal edges and then
> it's OK to move "down" the calls).  So yes, if a block has abnormal preds
> it should start with a ECF_RETURNS_TWICE call, I think we cannot
> verify the reverse reliably - abnormal edges cannot easily be re-built
> in late stages (it's a bug that we do so during RTL expansion).

Thanks, I could rewrite the patch along those lines, but I'm not sure where
this is going: the ~100 extra FAILs will still be there. What would the next
steps be for this patch and the initial tree-ssa-sink patch?

Alexander
  
Richard Biener Jan. 4, 2022, 7:25 a.m. UTC | #6
On Mon, Jan 3, 2022 at 5:35 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Mon, 3 Jan 2022, Richard Biener wrote:
>
> > > @@ -5674,6 +5675,14 @@ gimple_verify_flow_info (void)
> > >                err = 1;
> > >              }
> > >
> > > +         if (prev_stmt && stmt_starts_bb_p (stmt, prev_stmt))
> >
> > stmt_starts_bb_p is really a helper used during CFG build, I'd rather
> > test explicitely for a GIMPLE call with ECF_RETURNS_TWICE, or maybe,
> > verify that iff a block has abnormal predecessors it starts with such
> > a call (because IIRC we in some cases elide abnormal edges and then
> > it's OK to move "down" the calls).  So yes, if a block has abnormal preds
> > it should start with a ECF_RETURNS_TWICE call, I think we cannot
> > verify the reverse reliably - abnormal edges cannot easily be re-built
> > in late stages (it's a bug that we do so during RTL expansion).
>
> Thanks, I could rewrite the patch along those lines, but I'm not sure where
> this is going: the ~100 extra FAILs will still be there. What would the next
> steps be for this patch and the initial tree-ssa-sink patch?

I approved the initial sink patch (maybe not clearly enough).  Can you open
a bugreport about the missing CFG verification and list the set of FAILs
(all errors in some passes similar to the one you fixed in sinking I guess)?
It indeed sounds like something to tackle during next stage1 (unless you
already narrowed down the culprit to a single offender...)

Richard.

>
> Alexander
  
Alexander Monakov Jan. 14, 2022, 6:20 p.m. UTC | #7
> I approved the initial sink patch (maybe not clearly enough).

I wasn't entirely happy with that patch. The new version solves this better.

> Can you open
> a bugreport about the missing CFG verification and list the set of FAILs
> (all errors in some passes similar to the one you fixed in sinking I guess)?
> It indeed sounds like something to tackle during next stage1 (unless you
> already narrowed down the culprit to a single offender...)

Most of the failures were related to transactional memory, and the rest are
seemingly solved by forbidding duplication of returns_twice calls. In reply
to this email I'm sending three patches, the first is a revised patch for
tree-ssa-sink, the second forbids duplication of setjmp-like calls, and the
third implements the checks in verify_flow_info:

  tree-ssa-sink: do not sink to in front of setjmp
  tree-cfg: do not duplicate returns_twice calls
  tree-cfg: check placement of returns_twice calls

 gcc/testsuite/gcc.dg/setjmp-7.c | 13 +++++++++++
 gcc/tree-cfg.c                  | 40 +++++++++++++++++++++++++++++++--
 gcc/tree-ssa-sink.c             |  6 +++++
 3 files changed, 57 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/setjmp-7.c
  

Patch

diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index c5d67840be3..317e2563114 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -461,6 +461,9 @@  statement_sink_location (gimple *stmt, basic_block frombb,
 	  else
 	    *togsi = gsi_after_labels (sinkbb);
 
+	  if (bb_has_abnormal_pred (sinkbb))
+	    return false;
+
 	  return true;
 	}
     }