[PATCH/RFC] On the use of -funreachable-traps to deal with PR 109627

Message ID 56A9A5FB-8294-47CB-A6C4-22FD5561C71A@googlemail.com
State New
Headers
Series [PATCH/RFC] On the use of -funreachable-traps to deal with PR 109627 |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm fail Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 fail Patch failed to apply

Commit Message

Iain Sandoe April 8, 2024, 11:04 p.m. UTC
  Hi

PR 109627 is about functions that have had their bodies completely elided, but still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).

These are causing issues for some linkers because such functions result in FDEs with a 0 code extent.

The simplest representation of this is (from PR109527)

void foo () { __builtin_unreachable (); }

The solution (so far) is to detect this case during final lowering and replace the unreachable (which is expanded to nothing, at least for the targets I’ve dealt with) by a trap; this results in two positive improvements (1) the FDE is now finite-sized so the linker consumes it and (2) actually the trap is considerably more user-friendly UB than falling through to some other arbitrary place.

I was looking into using -funreachable-traps to do this for aarch64 Darwin - because the ad-hoc solutions that were applied to X86 and PPC are not easily usable for aarch64.

-funreachabe-traps was added for similar reasons (helping make missing returns less unexpected) in r13-1204-gd68d3664253696 by Jason (and then there have been further improvements resulting in the use of __builtin_unreachable trap () from Jakub)

As I read the commit message for r13-1204, I would expect -funreachable-traps to work for the simple case above, but it does not.  I think that is because the incremental patch below is needed.  however, I am not sure if there was some reason this was not done at the time?

PR 109627 is currently a show-stopper for the aarch64-darwin branch since libgomp and libgm2 fail to bootstrap - and other workarounds (e.g. -D__builtin_unreachable=__builtin_trap) do not work got m2 (since it does not use the C preprocessor by default).

Setting -funreachable-traps either per affected file, or globally for a target resolves the issue in a neater manner.

Any guidance / comments would be most welcome - if the direction seems sane, I can repost this patch formally.

(I have tested quite widely on Darwin and on a small number of Linux cases too)

thanks
Iain

* I will note that applying this does result in some regressions in several contracts test cases - but they also regress for -fsanitize=undefined -fsanitise-traps (not yet clear if that’s expected or we’ve uncovered a bug in the contracts impl.).

----------


====
  

Comments

Andrew Pinski April 8, 2024, 11:11 p.m. UTC | #1
On Mon, Apr 8, 2024 at 4:04 PM Iain Sandoe <idsandoe@googlemail.com> wrote:
>
> Hi
>
> PR 109627 is about functions that have had their bodies completely elided, but still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).

I was thinking about how to fix this once and for all. The easiest
method I could think of was if __builtin_unreachable is the only thing
in the CFG expand it as __builtin_trap.
And then it should just work.

It should not to hard to add that check in expand_gimple_basic_block
and handle it that way.

What do you think of that? I can code this up for GCC 15 if you want.

Thanks,
Andrew Pinski

>
> These are causing issues for some linkers because such functions result in FDEs with a 0 code extent.
>
> The simplest representation of this is (from PR109527)
>
> void foo () { __builtin_unreachable (); }
>
> The solution (so far) is to detect this case during final lowering and replace the unreachable (which is expanded to nothing, at least for the targets I’ve dealt with) by a trap; this results in two positive improvements (1) the FDE is now finite-sized so the linker consumes it and (2) actually the trap is considerably more user-friendly UB than falling through to some other arbitrary place.
>
> I was looking into using -funreachable-traps to do this for aarch64 Darwin - because the ad-hoc solutions that were applied to X86 and PPC are not easily usable for aarch64.
>
> -funreachabe-traps was added for similar reasons (helping make missing returns less unexpected) in r13-1204-gd68d3664253696 by Jason (and then there have been further improvements resulting in the use of __builtin_unreachable trap () from Jakub)
>
> As I read the commit message for r13-1204, I would expect -funreachable-traps to work for the simple case above, but it does not.  I think that is because the incremental patch below is needed.  however, I am not sure if there was some reason this was not done at the time?
>
> PR 109627 is currently a show-stopper for the aarch64-darwin branch since libgomp and libgm2 fail to bootstrap - and other workarounds (e.g. -D__builtin_unreachable=__builtin_trap) do not work got m2 (since it does not use the C preprocessor by default).
>
> Setting -funreachable-traps either per affected file, or globally for a target resolves the issue in a neater manner.
>
> Any guidance / comments would be most welcome - if the direction seems sane, I can repost this patch formally.
>
> (I have tested quite widely on Darwin and on a small number of Linux cases too)
>
> thanks
> Iain
>
> * I will note that applying this does result in some regressions in several contracts test cases - but they also regress for -fsanitize=undefined -fsanitise-traps (not yet clear if that’s expected or we’ve uncovered a bug in the contracts impl.).
>
> ----------
>
>
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index f8d94c4b435..e2d26e45744 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -5931,7 +5931,8 @@ expand_builtin_unreachable (void)
>  {
>    /* Use gimple_build_builtin_unreachable or builtin_decl_unreachable
>       to avoid this.  */
> -  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE));
> +  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE)
> +                      && !flag_unreachable_traps);
>    emit_barrier ();
>  }
>
> @@ -10442,7 +10443,7 @@ fold_builtin_0 (location_t loc, tree fndecl)
>
>      case BUILT_IN_UNREACHABLE:
>        /* Rewrite any explicit calls to __builtin_unreachable.  */
> -      if (sanitize_flags_p (SANITIZE_UNREACHABLE))
> +      if (sanitize_flags_p (SANITIZE_UNREACHABLE) || flag_unreachable_traps)
>         return build_builtin_unreachable (loc);
>        break;
>
> ====
  
Jeff Law April 9, 2024, 4:03 a.m. UTC | #2
On 4/8/24 5:04 PM, Iain Sandoe wrote:
> Hi
> 
> PR 109627 is about functions that have had their bodies completely elided, but still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).
> 
> These are causing issues for some linkers because such functions result in FDEs with a 0 code extent.
> 
> The simplest representation of this is (from PR109527)
> 
> void foo () { __builtin_unreachable (); }
With the possibility of sounding like a broken record, I think 
__builtin_unreachable is fundamentally flawed.   It generates no code 
and just lets the program continue if ever "reached".  This is a 
security risk and (IMHO) just plain silly.  We're in a situation that is 
never supposed to happen, so continuing to execute code is just asking 
for problems.

If it were up to me, I'd have __builtin_unreachable emit a trap or 
similar construct that should (in general) halt execution.

Jeff
  
Richard Biener April 9, 2024, 7:03 a.m. UTC | #3
On Tue, Apr 9, 2024 at 6:03 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 4/8/24 5:04 PM, Iain Sandoe wrote:
> > Hi
> >
> > PR 109627 is about functions that have had their bodies completely elided, but still have the wrappers for EH frames (either .cfi_xxx or LFSxx/LFExx).
> >
> > These are causing issues for some linkers because such functions result in FDEs with a 0 code extent.
> >
> > The simplest representation of this is (from PR109527)
> >
> > void foo () { __builtin_unreachable (); }
> With the possibility of sounding like a broken record, I think
> __builtin_unreachable is fundamentally flawed.   It generates no code
> and just lets the program continue if ever "reached".  This is a
> security risk and (IMHO) just plain silly.  We're in a situation that is
> never supposed to happen, so continuing to execute code is just asking
> for problems.
>
> If it were up to me, I'd have __builtin_unreachable emit a trap or
> similar construct that should (in general) halt execution.

__builtin_unreachable tells the compiler it's OK to omit a path to it
while __builtin_trap doesn't.  So once we replace the former with the
latter we have to keep the path.  Maybe that's OK.  I do agree that
the RTL representation of expanding __builtin_unreachable () to
"nothing" is bad.  Expanding to a trap always would be OK with me.

Richard.

> Jeff
>
  
Jakub Jelinek April 9, 2024, 7:11 a.m. UTC | #4
On Tue, Apr 09, 2024 at 09:03:59AM +0200, Richard Biener wrote:
> > With the possibility of sounding like a broken record, I think
> > __builtin_unreachable is fundamentally flawed.   It generates no code
> > and just lets the program continue if ever "reached".  This is a
> > security risk and (IMHO) just plain silly.  We're in a situation that is
> > never supposed to happen, so continuing to execute code is just asking
> > for problems.
> >
> > If it were up to me, I'd have __builtin_unreachable emit a trap or
> > similar construct that should (in general) halt execution.
> 
> __builtin_unreachable tells the compiler it's OK to omit a path to it
> while __builtin_trap doesn't.  So once we replace the former with the
> latter we have to keep the path.  Maybe that's OK.  I do agree that
> the RTL representation of expanding __builtin_unreachable () to
> "nothing" is bad.  Expanding to a trap always would be OK with me.

Even that would prevent tons of needed optimizations, especially the
reason why __builtin_unreachable () has been added in the first place
- for asm goto which always branches and so the kernel can put
__builtin_unreachable () after it to say that it won't fall through.
I think the kernel folks would be upset if we change that.

So, can't we instead just emit a trap when in the last cfglayout -> cfgrtl
switch we see that the last bb in the function doesn't have any successors?

	Jakub
  
Richard Biener April 9, 2024, 7:44 a.m. UTC | #5
On Tue, Apr 9, 2024 at 9:11 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Tue, Apr 09, 2024 at 09:03:59AM +0200, Richard Biener wrote:
> > > With the possibility of sounding like a broken record, I think
> > > __builtin_unreachable is fundamentally flawed.   It generates no code
> > > and just lets the program continue if ever "reached".  This is a
> > > security risk and (IMHO) just plain silly.  We're in a situation that is
> > > never supposed to happen, so continuing to execute code is just asking
> > > for problems.
> > >
> > > If it were up to me, I'd have __builtin_unreachable emit a trap or
> > > similar construct that should (in general) halt execution.
> >
> > __builtin_unreachable tells the compiler it's OK to omit a path to it
> > while __builtin_trap doesn't.  So once we replace the former with the
> > latter we have to keep the path.  Maybe that's OK.  I do agree that
> > the RTL representation of expanding __builtin_unreachable () to
> > "nothing" is bad.  Expanding to a trap always would be OK with me.
>
> Even that would prevent tons of needed optimizations, especially the
> reason why __builtin_unreachable () has been added in the first place
> - for asm goto which always branches and so the kernel can put
> __builtin_unreachable () after it to say that it won't fall through.
> I think the kernel folks would be upset if we change that.
>
> So, can't we instead just emit a trap when in the last cfglayout -> cfgrtl
> switch we see that the last bb in the function doesn't have any successors?

That's probably a good middle-ground if we can identify that "last" switch
easily (why not do it at each such switch?)

Richard.

>         Jakub
>
  
Jakub Jelinek April 9, 2024, 7:48 a.m. UTC | #6
On Tue, Apr 09, 2024 at 09:44:01AM +0200, Richard Biener wrote:
> (why not do it at each such switch?)

Because the traps would then be added even to the bbs which later
end up in the middle of the function.

	Jakub
  
Iain Sandoe April 9, 2024, 7:53 a.m. UTC | #7
> On 9 Apr 2024, at 08:48, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> On Tue, Apr 09, 2024 at 09:44:01AM +0200, Richard Biener wrote:
>> (why not do it at each such switch?)
> 
> Because the traps would then be added even to the bbs which later
> end up in the middle of the function.

If we defer the unreachable => trap change until expand, then it would
not affect any of the current decisions made by the middle end.

Since the default expansion of unreachable is to a barrier - would this
actually make material difference to RTL optimizations?

Iain

> 
> 	Jakub
>
  
Iain Sandoe April 9, 2024, 1:59 p.m. UTC | #8
> On 9 Apr 2024, at 08:53, Iain Sandoe <idsandoe@googlemail.com> wrote:
> 
> 
> 
>> On 9 Apr 2024, at 08:48, Jakub Jelinek <jakub@redhat.com> wrote:
>> 
>> On Tue, Apr 09, 2024 at 09:44:01AM +0200, Richard Biener wrote:
>>> (why not do it at each such switch?)
>> 
>> Because the traps would then be added even to the bbs which later
>> end up in the middle of the function.
> 
> If we defer the unreachable => trap change until expand, then it would
> not affect any of the current decisions made by the middle end.
> 
> Since the default expansion of unreachable is to a barrier - would this
> actually make material difference to RTL optimizations?

Here is an implementation of this:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649074.html

Taking a solution to PR109267 out of the equation - it would still be good
to get an answer to the original question “is -funreachable-traps behaving
as expected”? (since it does not substitute in the TU we’ve been discussing)

thanks
Iain
  

Patch

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index f8d94c4b435..e2d26e45744 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -5931,7 +5931,8 @@  expand_builtin_unreachable (void)
 {
   /* Use gimple_build_builtin_unreachable or builtin_decl_unreachable
      to avoid this.  */
-  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE));
+  gcc_checking_assert (!sanitize_flags_p (SANITIZE_UNREACHABLE)
+                      && !flag_unreachable_traps);
   emit_barrier ();
 }
 
@@ -10442,7 +10443,7 @@  fold_builtin_0 (location_t loc, tree fndecl)
 
     case BUILT_IN_UNREACHABLE:
       /* Rewrite any explicit calls to __builtin_unreachable.  */
-      if (sanitize_flags_p (SANITIZE_UNREACHABLE))
+      if (sanitize_flags_p (SANITIZE_UNREACHABLE) || flag_unreachable_traps)
        return build_builtin_unreachable (loc);
       break;