tst-setcontext2: avoid bug from compiler optimization

Message ID 6c06a49b-8f4d-ad5f-53c1-984bd90a687b@mellanox.com
State New, archived
Headers

Commit Message

Chris Metcalf Jan. 25, 2017, 5:23 p.m. UTC
  On 1/25/2017 6:23 AM, Torvald Riegel wrote:
> On Tue, 2017-01-24 at 19:35 -0500, Chris Metcalf wrote:
>> Ping!  I will plan to commit this later this week if no one objects; it seems
>> like a straightforward bug avoidance.
>>
>> On 1/13/2017 1:01 PM, Chris Metcalf wrote:
>>> With an uninitialized oldctx, the compiler is free to observe that
>>> the only path that sets up a value in oldctx is through the
>>> "if (global == 2)" arm, in which arm we apparently return 0 without
>>> referencing oldctx again.
>>>
>>> Then, after the "if" cascade, the compiler can inline the "check"
>>> function and then observe that the sigset_t "set" variable there
>>> is only used locally, before any apparent uses of oldctx, and as a
>>> result it can decide to use the same stack region for both variables.
>>> Unfortunately this has the effect of clobbering oldctx when we call
>>> sigprocmask, and results in the test failing.
>>>
>>> By initializing oldctx at the top, we let the compiler know that it
>>> has a value that has to be preserved down to the part of the code
>>> after the "if" cascade, and it won't try to place another variable
>>> in that same part of the stack.
> The compiler would also know what the initial value is, which it could
> store somewhere else, which then would still allow for reuse of a stack
> slot.

Good point.

> I agree with Florian that the compiler needs to be made aware that
> getcontext can return twice, or something to that effect.  This would
> tell it that it has to reason about the lifetimes of variables
> differently.

The problem is that "returns_twice" doesn't offer the semantics we want.
It ensures that register-allocated variables are handled properly, i.e.
everything is saved to the stack frame prior to calling the function.  But
here the issue is that the stack frame itself isn't being set up in a way that
actually works.  And in practice, tagging getcontext and swapcontext with
attribute((returns_twice)) does not fix the bug.  (It does seem like doing so
isn't a bad idea, but it is beyond the scope of fixing this one test bug.)

Another way to fix the problem is to make the context variables function static,
which should forbid the compiler from doing anything funky with them.
(Although do_test itself is static, it is called from main, and the compiler
has to assume main could get called again and expect to find the updated
context variables still updated, so it can't trickily ignore the static modifier
or anything like that, I think.)

commit 5a054bf335e350e96e2a38b5d2573f4f26a2185a
Author: Chris Metcalf <cmetcalf@mellanox.com>
Date:   Fri Jan 13 12:50:50 2017 -0500

     tst-setcontext2: avoid bug from compiler optimization

     With an uninitialized oldctx, the compiler is free to observe that
     the only path that sets up a value in oldctx is through the
     "if (global == 2)" arm, in which arm we apparently return 0 without
     referencing oldctx again.

     Then, after the "if" cascade, the compiler can inline the "check"
     function and then observe that the sigset_t "set" variable there
     is only used locally, before any apparent uses of oldctx, and as a
     result it can decide to use the same stack region for both variables.
     Unfortunately this has the effect of clobbering oldctx when we call
     sigprocmask, and results in the test failing.

     By making oldctx (and ctx) have static scope, we forbid the compiler
     from performing this optimization.  The compiler will be required
     to allocate them to separate memory so that they would have their
     updated values if the do_test function were to be invoked a second
     time.  (We know that doesn't happen, but the compiler can't prove
     it solely by examination of this compilation unit.)

     Seen on tilegx with gcc 4.8 at -O3.
  

Comments

Andreas Schwab Jan. 25, 2017, 5:54 p.m. UTC | #1
On Jan 25 2017, Chris Metcalf <cmetcalf@mellanox.com> wrote:

>     By making oldctx (and ctx) have static scope, we forbid the compiler

There's no such thing as static scope, what you are meaning is the
storage duration.

Andreas.
  
Torvald Riegel Jan. 25, 2017, 6:05 p.m. UTC | #2
On Wed, 2017-01-25 at 12:23 -0500, Chris Metcalf wrote:
> On 1/25/2017 6:23 AM, Torvald Riegel wrote:
> > On Tue, 2017-01-24 at 19:35 -0500, Chris Metcalf wrote:
> >> Ping!  I will plan to commit this later this week if no one objects; it seems
> >> like a straightforward bug avoidance.
> >>
> >> On 1/13/2017 1:01 PM, Chris Metcalf wrote:
> >>> With an uninitialized oldctx, the compiler is free to observe that
> >>> the only path that sets up a value in oldctx is through the
> >>> "if (global == 2)" arm, in which arm we apparently return 0 without
> >>> referencing oldctx again.
> >>>
> >>> Then, after the "if" cascade, the compiler can inline the "check"
> >>> function and then observe that the sigset_t "set" variable there
> >>> is only used locally, before any apparent uses of oldctx, and as a
> >>> result it can decide to use the same stack region for both variables.
> >>> Unfortunately this has the effect of clobbering oldctx when we call
> >>> sigprocmask, and results in the test failing.
> >>>
> >>> By initializing oldctx at the top, we let the compiler know that it
> >>> has a value that has to be preserved down to the part of the code
> >>> after the "if" cascade, and it won't try to place another variable
> >>> in that same part of the stack.
> > The compiler would also know what the initial value is, which it could
> > store somewhere else, which then would still allow for reuse of a stack
> > slot.
> 
> Good point.
> 
> > I agree with Florian that the compiler needs to be made aware that
> > getcontext can return twice, or something to that effect.  This would
> > tell it that it has to reason about the lifetimes of variables
> > differently.
> 
> The problem is that "returns_twice" doesn't offer the semantics we want.

There are similarities to setjmp/longjmp, I think.

From C11 7.13.2.1:
All accessible objects have values, and all other components of the
abstract machine have state, as of the time the longjmp function was
called, except that the values of objects of automatic storage duration
that are local to the function containing the invocation of the
corresponding setjmp macro that do not have volatile-qualified type
and have been changed between the setjmp invocation and longjmp call are
indeterminate.

oldctx is modified between the getcontext (like setjmp) and effective
longjmp part of swapcontext.

> It ensures that register-allocated variables are handled properly, i.e.
> everything is saved to the stack frame prior to calling the function.  But
> here the issue is that the stack frame itself isn't being set up in a way that
> actually works.  And in practice, tagging getcontext and swapcontext with
> attribute((returns_twice)) does not fix the bug.  (It does seem like doing so
> isn't a bad idea, but it is beyond the scope of fixing this one test bug.)
> 
> Another way to fix the problem is to make the context variables function static,
> which should forbid the compiler from doing anything funky with them.
> (Although do_test itself is static, it is called from main, and the compiler
> has to assume main could get called again and expect to find the updated
> context variables still updated, so it can't trickily ignore the static modifier
> or anything like that, I think.)

I think that we need the returns_twice attribute, but we also shouldn't
put oldctx on the stack (unless it's marked volatile).
  
Chris Metcalf Jan. 25, 2017, 8:16 p.m. UTC | #3
On 1/25/2017 12:54 PM, Andreas Schwab wrote:
> On Jan 25 2017, Chris Metcalf <cmetcalf@mellanox.com> wrote:
>
>>      By making oldctx (and ctx) have static scope, we forbid the compiler
> There's no such thing as static scope, what you are meaning is the
> storage duration.

Thanks, fixed in my tree.  I know the difference but apparently
my brain was an autopilot when I was typing. :-)
  
Chris Metcalf Jan. 25, 2017, 9:26 p.m. UTC | #4
On 1/25/2017 1:05 PM, Torvald Riegel wrote:
> On Wed, 2017-01-25 at 12:23 -0500, Chris Metcalf wrote:
>> On 1/25/2017 6:23 AM, Torvald Riegel wrote:
>>> I agree with Florian that the compiler needs to be made aware that
>>> getcontext can return twice, or something to that effect.  This would
>>> tell it that it has to reason about the lifetimes of variables
>>> differently.
>> The problem is that "returns_twice" doesn't offer the semantics we want.
> There are similarities to setjmp/longjmp, I think.
>
>  From C11 7.13.2.1:
> All accessible objects have values, and all other components of the
> abstract machine have state, as of the time the longjmp function was
> called, except that the values of objects of automatic storage duration
> that are local to the function containing the invocation of the
> corresponding setjmp macro that do not have volatile-qualified type
> and have been changed between the setjmp invocation and longjmp call are
> indeterminate.
>
> oldctx is modified between the getcontext (like setjmp) and effective
> longjmp part of swapcontext.

Yes, there are certainly similarities.  Note that swapcontext also acts like
getcontext as a returns_twice function.

>> It ensures that register-allocated variables are handled properly, i.e.
>> everything is saved to the stack frame prior to calling the function.  But
>> here the issue is that the stack frame itself isn't being set up in a way that
>> actually works.  And in practice, tagging getcontext and swapcontext with
>> attribute((returns_twice)) does not fix the bug.  (It does seem like doing so
>> isn't a bad idea, but it is beyond the scope of fixing this one test bug.)
>>
>> Another way to fix the problem is to make the context variables function static,
>> which should forbid the compiler from doing anything funky with them.
>> (Although do_test itself is static, it is called from main, and the compiler
>> has to assume main could get called again and expect to find the updated
>> context variables still updated, so it can't trickily ignore the static modifier
>> or anything like that, I think.)
> I think that we need the returns_twice attribute, but we also shouldn't
> put oldctx on the stack (unless it's marked volatile).

I'm not sure we need or want the returns_twice attribute, given that
gcc already has special code for recognizing setjmp, vfork, etc. Currently
glibc does not use the returns_twice attribute anywhere in its source tree.
If getcontext or swapcontext is missing from gcc (and I don't know that
it is) then presumably it's a gcc bug.
  
Joseph Myers Jan. 25, 2017, 9:48 p.m. UTC | #5
On Wed, 25 Jan 2017, Chris Metcalf wrote:

> I'm not sure we need or want the returns_twice attribute, given that
> gcc already has special code for recognizing setjmp, vfork, etc. Currently
> glibc does not use the returns_twice attribute anywhere in its source tree.
> If getcontext or swapcontext is missing from gcc (and I don't know that
> it is) then presumably it's a gcc bug.

Recognizing based on function names and setting ECF_* is definitely an 
obsolescent approach in GCC, compared to the approach of GCC defining 
built-in functions using the appropriate attributes (and defining built-in 
functions is problematic for some of these functions because of the 
header-defined types involved in the function prototypes).  Defining 
built-in functions has the limitation of not providing the attribute with 
-fno-builtin.

Where an attribute is needed for correctness, as here, it should be 
included in the glibc headers.  GCC could then in future eliminate the 
special_function_p special-casing, replacing it either by built-in 
functions or by making fixincludes add the attribute to headers that don't 
already have it.

(Should clone have this attribute, just like vfork?  If so, that's a case 
missing from GCC.)
  
Chris Metcalf Jan. 27, 2017, 8:34 p.m. UTC | #6
I'd like to commit the one-line patch below that just declares the context
variables for test-setcontext2 as static.  I think the consensus is that
this is necessary, and orthogonal to the question of whether and how the
returns_twice attribute is associated with various functions in glibc;
it fixes a test failure for tilegx.  (The patch is quoted below; I've updated
the commit message to avoid the think-o observed by Andreas.)

I'll do this early next week unless someone raises an objection.

On 1/25/2017 12:23 PM, Chris Metcalf wrote:
> On 1/25/2017 6:23 AM, Torvald Riegel wrote:
>> I agree with Florian that the compiler needs to be made aware that
>> getcontext can return twice, or something to that effect.  This would
>> tell it that it has to reason about the lifetimes of variables
>> differently.
>
> The problem is that "returns_twice" doesn't offer the semantics we want.
> It ensures that register-allocated variables are handled properly, i.e.
> everything is saved to the stack frame prior to calling the function.  But
> here the issue is that the stack frame itself isn't being set up in a way that
> actually works.  And in practice, tagging getcontext and swapcontext with
> attribute((returns_twice)) does not fix the bug.  (It does seem like doing so
> isn't a bad idea, but it is beyond the scope of fixing this one test bug.)
>
> Another way to fix the problem is to make the context variables function static,
> which should forbid the compiler from doing anything funky with them.
> (Although do_test itself is static, it is called from main, and the compiler
> has to assume main could get called again and expect to find the updated
> context variables still updated, so it can't trickily ignore the static modifier
> or anything like that, I think.)
>
> commit 5a054bf335e350e96e2a38b5d2573f4f26a2185a
> Author: Chris Metcalf <cmetcalf@mellanox.com>
> Date:   Fri Jan 13 12:50:50 2017 -0500
>
>     tst-setcontext2: avoid bug from compiler optimization
>
>     With an uninitialized oldctx, the compiler is free to observe that
>     the only path that sets up a value in oldctx is through the
>     "if (global == 2)" arm, in which arm we apparently return 0 without
>     referencing oldctx again.
>
>     Then, after the "if" cascade, the compiler can inline the "check"
>     function and then observe that the sigset_t "set" variable there
>     is only used locally, before any apparent uses of oldctx, and as a
>     result it can decide to use the same stack region for both variables.
>     Unfortunately this has the effect of clobbering oldctx when we call
>     sigprocmask, and results in the test failing.
>
>     By making oldctx (and ctx) have static scope, we forbid the compiler
>     from performing this optimization.  The compiler will be required
>     to allocate them to separate memory so that they would have their
>     updated values if the do_test function were to be invoked a second
>     time.  (We know that doesn't happen, but the compiler can't prove
>     it solely by examination of this compilation unit.)
>
>     Seen on tilegx with gcc 4.8 at -O3.
>
> diff --git a/stdlib/tst-setcontext2.c b/stdlib/tst-setcontext2.c
> index 07fb974c4684..c937f6c396db 100644
> --- a/stdlib/tst-setcontext2.c
> +++ b/stdlib/tst-setcontext2.c
> @@ -87,7 +87,7 @@ handler (int __attribute__ ((unused)) signum)
>  static int
>  do_test (void)
>  {
> -  ucontext_t ctx, oldctx;
> +  static ucontext_t ctx, oldctx;
>    struct sigaction action;
>    pid_t pid;
>
>
  
Florian Weimer Jan. 27, 2017, 9:30 p.m. UTC | #7
On 01/27/2017 09:34 PM, Chris Metcalf wrote:
> I'd like to commit the one-line patch below that just declares the context
> variables for test-setcontext2 as static.  I think the consensus is that
> this is necessary,

No, I think the test is supposed to work as-is; it is valid.

So if this test is broken, it is a bug in the implementation of the 
context switching functions (as they are visible to applications, i.e. 
the fix might not be in the functions themselves, but adjusting their 
calling conventions, inhibiting GCC optimizations which make incorrect 
assumptions, etc.).

If we do not have a fix right now, it is better to let the test keep 
failing (perhaps XFAIL it), rather than tweaking it so that it tests 
something else.

Thanks,
Florian
  
Chris Metcalf Jan. 27, 2017, 9:53 p.m. UTC | #8
On 1/27/2017 4:30 PM, Florian Weimer wrote:
> On 01/27/2017 09:34 PM, Chris Metcalf wrote:
>> I'd like to commit the one-line patch below that just declares the context
>> variables for test-setcontext2 as static.  I think the consensus is that
>> this is necessary,
>
> No, I think the test is supposed to work as-is; it is valid.
>
> So if this test is broken, it is a bug in the implementation of the context switching functions (as they are visible to applications, i.e. the fix might not be in the functions themselves, but adjusting their calling conventions, inhibiting GCC optimizations which make incorrect assumptions, etc.).
>
> If we do not have a fix right now, it is better to let the test keep failing (perhaps XFAIL it), rather than tweaking it so that it tests something else.

We would be waiting on functionality from gcc that doesn't currently
exist, correct?  The current returns_twice semantics is insufficient to
describe the problem that we currently see with shared stack frames.
It's possible that returns_twice could be extended to also have the
semantics of "don't share my arguments' stack space with any other
variables in this function" pretty reasonably, but I don't want to try to
guess what will make most sense to the gcc folks.

I can certainly let the test keep failing and just open a bug to reference
it and capture some of this discussion.
  
Florian Weimer Jan. 30, 2017, 11:57 a.m. UTC | #9
On 01/27/2017 10:53 PM, Chris Metcalf wrote:
> On 1/27/2017 4:30 PM, Florian Weimer wrote:
>> On 01/27/2017 09:34 PM, Chris Metcalf wrote:
>>> I'd like to commit the one-line patch below that just declares the
>>> context
>>> variables for test-setcontext2 as static.  I think the consensus is that
>>> this is necessary,
>>
>> No, I think the test is supposed to work as-is; it is valid.
>>
>> So if this test is broken, it is a bug in the implementation of the
>> context switching functions (as they are visible to applications, i.e.
>> the fix might not be in the functions themselves, but adjusting their
>> calling conventions, inhibiting GCC optimizations which make incorrect
>> assumptions, etc.).
>>
>> If we do not have a fix right now, it is better to let the test keep
>> failing (perhaps XFAIL it), rather than tweaking it so that it tests
>> something else.
>
> We would be waiting on functionality from gcc that doesn't currently
> exist, correct?  The current returns_twice semantics is insufficient to
> describe the problem that we currently see with shared stack frames.

I think this could still be a GCC regression in the handling of the 
returns_twice attribute, introduced when GCC started sharing stack slots 
more aggressively.  When returns_twice was initially implemented and 
documented, the question of stack slot reuse did not arise because GCC 
probably did not do it at all.

> It's possible that returns_twice could be extended to also have the
> semantics of "don't share my arguments' stack space with any other
> variables in this function" pretty reasonably, but I don't want to try to
> guess what will make most sense to the gcc folks.

GCC already has this behavior on x86_64.  This could be an accident.

> I can certainly let the test keep failing and just open a bug to reference
> it and capture some of this discussion.

Yes, would you please file a GCC bug with a preprocessed input file?

Thanks,
Florian
  
Torvald Riegel Jan. 30, 2017, 1:40 p.m. UTC | #10
On Mon, 2017-01-30 at 12:57 +0100, Florian Weimer wrote:
> On 01/27/2017 10:53 PM, Chris Metcalf wrote:
> > On 1/27/2017 4:30 PM, Florian Weimer wrote:
> >> On 01/27/2017 09:34 PM, Chris Metcalf wrote:
> >>> I'd like to commit the one-line patch below that just declares the
> >>> context
> >>> variables for test-setcontext2 as static.  I think the consensus is that
> >>> this is necessary,
> >>
> >> No, I think the test is supposed to work as-is; it is valid.
> >>
> >> So if this test is broken, it is a bug in the implementation of the
> >> context switching functions (as they are visible to applications, i.e.
> >> the fix might not be in the functions themselves, but adjusting their
> >> calling conventions, inhibiting GCC optimizations which make incorrect
> >> assumptions, etc.).
> >>
> >> If we do not have a fix right now, it is better to let the test keep
> >> failing (perhaps XFAIL it), rather than tweaking it so that it tests
> >> something else.
> >
> > We would be waiting on functionality from gcc that doesn't currently
> > exist, correct?  The current returns_twice semantics is insufficient to
> > describe the problem that we currently see with shared stack frames.
> 
> I think this could still be a GCC regression in the handling of the 
> returns_twice attribute, introduced when GCC started sharing stack slots 
> more aggressively.  When returns_twice was initially implemented and 
> documented, the question of stack slot reuse did not arise because GCC 
> probably did not do it at all.

I may be misremembering, but I think I had to add a returns_twice
attribute to the function call that starts a transaction (for GCC's
support for transactional memory).  This was motivated by seeing sharing
of stack slots, and adding the returns_twice was supposed to fix this
back then.  Which is to say that this sounds like a regression on GCC's
side to me, too.
  
Chris Metcalf Jan. 31, 2017, 4:04 p.m. UTC | #11
On 1/30/2017 6:57 AM, Florian Weimer wrote:
> On 01/27/2017 10:53 PM, Chris Metcalf wrote:
>> On 1/27/2017 4:30 PM, Florian Weimer wrote:
>>> On 01/27/2017 09:34 PM, Chris Metcalf wrote:
>>>> I'd like to commit the one-line patch below that just declares the
>>>> context
>>>> variables for test-setcontext2 as static.  I think the consensus is that
>>>> this is necessary,
>>>
>>> No, I think the test is supposed to work as-is; it is valid.
>>>
>>> So if this test is broken, it is a bug in the implementation of the
>>> context switching functions (as they are visible to applications, i.e.
>>> the fix might not be in the functions themselves, but adjusting their
>>> calling conventions, inhibiting GCC optimizations which make incorrect
>>> assumptions, etc.).
>>>
>>> If we do not have a fix right now, it is better to let the test keep
>>> failing (perhaps XFAIL it), rather than tweaking it so that it tests
>>> something else.
>>
>> We would be waiting on functionality from gcc that doesn't currently
>> exist, correct?  The current returns_twice semantics is insufficient to
>> describe the problem that we currently see with shared stack frames.
>
> I think this could still be a GCC regression in the handling of the returns_twice attribute, introduced when GCC started sharing stack slots more aggressively.  When returns_twice was initially implemented and documented, the question of stack slot reuse did not arise because GCC probably did not do it at all.
>
>> It's possible that returns_twice could be extended to also have the
>> semantics of "don't share my arguments' stack space with any other
>> variables in this function" pretty reasonably, but I don't want to try to
>> guess what will make most sense to the gcc folks.
>
> GCC already has this behavior on x86_64.  This could be an accident.
>
>> I can certainly let the test keep failing and just open a bug to reference
>> it and capture some of this discussion.
>
> Yes, would you please file a GCC bug with a preprocessed input file?

OK, so upgrading to a newer gcc seems to have fixed the problem for
me.  I was using 4.8.2, which has the problem, as does 4.8.5 (the
current 4.8 release).  However, I don't see it in 5.4, 6.3, or a gcc
tip build from earlier this month, so I think I will just upgrade my
gcc to 6.3 and not worry about it.

I'm not sure how aggressively the gcc folks backport these kinds of
things to earlier releases; if it seems like it would be helpful, I
suppose I could file a bug against 4.8 on tilegx specifically?
  
Florian Weimer Jan. 31, 2017, 4:11 p.m. UTC | #12
On 01/31/2017 05:04 PM, Chris Metcalf wrote:
> I'm not sure how aggressively the gcc folks backport these kinds of
> things to earlier releases; if it seems like it would be helpful, I
> suppose I could file a bug against 4.8 on tilegx specifically?

The 4.8 and 4.9 branches are closed upstream.  It looks like all active 
branches have already been fixed, so no further action is needed.

Thanks,
Florian
  
Andreas Schwab Jan. 31, 2017, 4:17 p.m. UTC | #13
On Jan 31 2017, Chris Metcalf <cmetcalf@mellanox.com> wrote:

> I'm not sure how aggressively the gcc folks backport these kinds of
> things to earlier releases; if it seems like it would be helpful, I
> suppose I could file a bug against 4.8 on tilegx specifically?

GCC 4.8 is already out of support.

Andreas.
  

Patch

diff --git a/stdlib/tst-setcontext2.c b/stdlib/tst-setcontext2.c
index 07fb974c4684..c937f6c396db 100644
--- a/stdlib/tst-setcontext2.c
+++ b/stdlib/tst-setcontext2.c
@@ -87,7 +87,7 @@  handler (int __attribute__ ((unused)) signum)
  static int
  do_test (void)
  {
-  ucontext_t ctx, oldctx;
+  static ucontext_t ctx, oldctx;
    struct sigaction action;
    pid_t pid;