pthread_once hangs when init routine throws an exception [BZ #18435]

Message ID 20210303125227.GO3748@tucnak
State Superseded
Headers
Series pthread_once hangs when init routine throws an exception [BZ #18435] |

Commit Message

Jakub Jelinek March 3, 2021, 12:52 p.m. UTC
  This is another attempt at making pthread_once handle throwing exceptions
from the init routine callback.  As the new testcases show, just switching
to the cleanup attribute based cleanup does fix the tst-once5 test, but
breaks the new tst-oncey3 test.  That is because when throwing exceptions,
only the unwind info registered cleanups (i.e. C++ destructors or cleanup
attribute), when cancelling threads and there has been unwind info from the
cancellation point up to whatever needs cleanup both unwind info registered
cleanups and THREAD_SETMEM (self, cleanup, ...) registered cleanups are
invoked, but once we hit some frame with no unwind info, only the
THREAD_SETMEM (self, cleanup, ...) registered cleanups are invoked.
So, to stay fully backwards compatible (allow init routines without
unwind info which encounter cancellation points) and handle exception throwing
we actually need to register the pthread_once cleanups in both unwind info
and in the THREAD_SETMEM (self, cleanup, ...) way.
If an exception is thrown, only the former will happen and we in that case
need to also unregister the THREAD_SETMEM (self, cleanup, ...) registered
handler, because otherwise after catching the exception the user code could
call deeper into the stack some cancellation point, get cancelled and then
a stale cleanup handler would clobber stack and probably crash.
If a thread calling init routine is cancelled and unwind info ends before
the pthread_once frame, it will be cleaned up through self->cleanup as
before.  And if unwind info is present, unwind_stop first calls the
self->cleanup registered handler for the frame, then it will call the
unwind info registered handler but that will already see __do_it == 0
and do nothing.


	Jakub
  

Comments

Florian Weimer March 4, 2021, 11:50 a.m. UTC | #1
* Jakub Jelinek via Libc-alpha:

> This is another attempt at making pthread_once handle throwing exceptions
> from the init routine callback.  As the new testcases show, just switching
> to the cleanup attribute based cleanup does fix the tst-once5 test, but
> breaks the new tst-oncey3 test.  That is because when throwing exceptions,
> only the unwind info registered cleanups (i.e. C++ destructors or cleanup
> attribute), when cancelling threads and there has been unwind info from the
> cancellation point up to whatever needs cleanup both unwind info registered
> cleanups and THREAD_SETMEM (self, cleanup, ...) registered cleanups are
> invoked, but once we hit some frame with no unwind info, only the
> THREAD_SETMEM (self, cleanup, ...) registered cleanups are invoked.
> So, to stay fully backwards compatible (allow init routines without
> unwind info which encounter cancellation points) and handle exception throwing
> we actually need to register the pthread_once cleanups in both unwind info
> and in the THREAD_SETMEM (self, cleanup, ...) way.
> If an exception is thrown, only the former will happen and we in that case
> need to also unregister the THREAD_SETMEM (self, cleanup, ...) registered
> handler, because otherwise after catching the exception the user code could
> call deeper into the stack some cancellation point, get cancelled and then
> a stale cleanup handler would clobber stack and probably crash.
> If a thread calling init routine is cancelled and unwind info ends before
> the pthread_once frame, it will be cleaned up through self->cleanup as
> before.  And if unwind info is present, unwind_stop first calls the
> self->cleanup registered handler for the frame, then it will call the
> unwind info registered handler but that will already see __do_it == 0
> and do nothing.

I agree that this solves the issue.

I initially thought that we had to use setjmp-based registration
approach, but that does not seem to be the case.

Any idea why we have the setjmp-based approach (for -fno-exceptions) for
external use, and not use the legacy approach internal to glibc?  One
difference is that it restores the value of callee-saved registers,
which could matter if we switch between two kinds of unwinding (callback
registration and DWARF).

Do you think this approach should used for all callback-based functions,
not just pthread_once?

> diff --git a/nptl/cleanup_compat.c b/nptl/cleanup_compat.c
> index fec88c2f86..b5b848338a 100644
> --- a/nptl/cleanup_compat.c
> +++ b/nptl/cleanup_compat.c
> @@ -48,3 +48,11 @@ _pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer, int execute)
>      buffer->__routine (buffer->__arg);
>  }
>  strong_alias (_pthread_cleanup_pop, __pthread_cleanup_pop)
> +
> +void
> +__pthread_cleanup_cleanup (struct _pthread_cleanup_buffer *buffer)
> +{
> +  struct pthread *self = THREAD_SELF;
> +  if (THREAD_GETMEM (self, cleanup) == buffer)
> +    THREAD_SETMEM (self, cleanup, buffer->__prev);
> +}

It's surprising that the cleanup field update is conditional.  How does
that happen?  Is it really safe?

> diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
> index d2fd0826fe..297426be1e 100644
> --- a/nptl/pthreadP.h
> +++ b/nptl/pthreadP.h
> @@ -604,6 +604,67 @@ extern void __pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer,

> +# if defined __GNUC__ && defined __EXCEPTIONS && !defined __cplusplus
> +/* Structure to hold the cleanup handler information.  */
> +struct __pthread_cleanup_combined_frame
> +{
> +  void (*__cancel_routine) (void *);
> +  void *__cancel_arg;
> +  int __do_it;
> +  struct _pthread_cleanup_buffer __buffer;
> +};

No __GNUC__ is necessary in internal headers.

> +/* Special cleanup macros which register cleanup both using
> +   __pthread_cleanup_{push,pop} and using cleanup attribute.  This is needed
> +   for pthread_once, so that it supports both throwing exceptions from the
> +   pthread_once callback (only cleanup attribute works there) and cancellation
> +   of the thread running the callback if the callback or some routines it
> +   calls don't have unwind information.  */
> +
> +static inline void
> +__pthread_cleanup_combined_routine (struct __pthread_cleanup_combined_frame
> +				    *__frame)
> +{
> +  if (__frame->__do_it)
> +    {
> +      __frame->__cancel_routine (__frame->__cancel_arg);
> +      __frame->__do_it = 0;
> +      __pthread_cleanup_cleanup (&__frame->__buffer);
> +    }
> +}

Is there a way to turn the call to __frame->__cancel_routine into a
direct call?  For more general use (outside pthread_once) that might be
desirable from a security hardening perspective.  I expect it will
happen if __pthread_cleanup_combined_routine_voidptr and
__pthread_cleanup_combined_routine do not use the sam variable.

I guess we can fix this later.  For pthread_once, the indirect call only
happens on the unwinding path.

> +static inline void
> +__pthread_cleanup_combined_routine_voidptr (void *__arg)
> +{
> +  struct __pthread_cleanup_combined_frame *__frame
> +    = (struct __pthread_cleanup_combined_frame *) __arg;
> +  if (__frame->__do_it)
> +    {
> +      __frame->__cancel_routine (__frame->__cancel_arg);
> +      __frame->__do_it = 0;
> +    }
> +}

I think you can make this an out-of-line routine because it can never be
inlined.  But given that this is called only once, it probably does not
matter.

> diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
> index eeb64f9fb0..53b65ef349 100644
> --- a/sysdeps/pthread/Makefile
> +++ b/sysdeps/pthread/Makefile
> @@ -231,7 +231,7 @@ generated += $(objpfx)tst-atfork2.mtrace \
>  
>  tests-internal += tst-cancel25 tst-robust8
>  
> -tests += tst-oncex3 tst-oncex4
> +tests += tst-oncex3 tst-oncex4 tst-oncey3 tst-oncey4

tst-oncey3.c, tst-oncey4.c are missing, but I guessed their contents.

Thanks,
Florian
  
Jakub Jelinek March 4, 2021, 1 p.m. UTC | #2
On Thu, Mar 04, 2021 at 12:50:54PM +0100, Florian Weimer wrote:
> I initially thought that we had to use setjmp-based registration
> approach, but that does not seem to be the case.
> 
> Any idea why we have the setjmp-based approach (for -fno-exceptions) for
> external use, and not use the legacy approach internal to glibc?  One
> difference is that it restores the value of callee-saved registers,
> which could matter if we switch between two kinds of unwinding (callback
> registration and DWARF).

I'm afraid I don't know, I thought the public pthread_cleanup_push/pop
for -fno-exceptions is the same as the pthreadP.h one except that it uses
_pthread_cleanup_* instead of __pthread_cleanup_*.
All I remember that the normal unwinding does a longjmp at the end after
processing all the self->cleanup registered cleanups, but I thought it is
just to the pthread_create start.

> Do you think this approach should used for all callback-based functions,
> not just pthread_once?

If we want to support throw from the callbacks that would do cleanups and
continue throwing outside of the libc routines and at the same time
support no unwind callbacks with cancellation points.
I guess it would need separate analysis of:
argp/Makefile:CFLAGS-argp-help.c += $(uses-callbacks) -fexceptions
argp/Makefile:CFLAGS-argp-parse.c += $(uses-callbacks)
dirent/Makefile:CFLAGS-scandir.c += $(uses-callbacks)
dirent/Makefile:CFLAGS-scandir64.c += $(uses-callbacks)
dirent/Makefile:CFLAGS-scandir-tail.c += $(uses-callbacks)
dirent/Makefile:CFLAGS-scandir64-tail.c += $(uses-callbacks)
elf/Makefile:CFLAGS-dl-iteratephdr.c += $(uses-callbacks)
io/Makefile:CFLAGS-fts.c += -Wno-uninitialized $(uses-callbacks) -fexceptions
io/Makefile:CFLAGS-fts64.c += -Wno-uninitialized $(uses-callbacks) -fexceptions
io/Makefile:CFLAGS-ftw.c += $(uses-callbacks) -fexceptions
io/Makefile:CFLAGS-ftw64.c += $(uses-callbacks) -fexceptions
malloc/Makefile:CFLAGS-obstack.c += $(uses-callbacks)
misc/Makefile:CFLAGS-tsearch.c += $(uses-callbacks)
misc/Makefile:CFLAGS-lsearch.c += $(uses-callbacks)
nptl/Makefile:CFLAGS-pthread_once.c += $(uses-callbacks) -fexceptions \
posix/Makefile:CFLAGS-glob.c += $(uses-callbacks) -fexceptions
posix/Makefile:CFLAGS-glob64.c += $(uses-callbacks) -fexceptions
stdlib/Makefile:CFLAGS-bsearch.c += $(uses-callbacks)
stdlib/Makefile:CFLAGS-msort.c += $(uses-callbacks)
stdlib/Makefile:CFLAGS-qsort.c += $(uses-callbacks)
find out what we support currently and what we need to support.

> > --- a/nptl/cleanup_compat.c
> > +++ b/nptl/cleanup_compat.c
> > @@ -48,3 +48,11 @@ _pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer, int execute)
> >      buffer->__routine (buffer->__arg);
> >  }
> >  strong_alias (_pthread_cleanup_pop, __pthread_cleanup_pop)
> > +
> > +void
> > +__pthread_cleanup_cleanup (struct _pthread_cleanup_buffer *buffer)
> > +{
> > +  struct pthread *self = THREAD_SELF;
> > +  if (THREAD_GETMEM (self, cleanup) == buffer)
> > +    THREAD_SETMEM (self, cleanup, buffer->__prev);
> > +}
> 
> It's surprising that the cleanup field update is conditional.  How does
> that happen?  Is it really safe?

As I've tried to explain, there are several cases.
One is throw in the callback, that invokes __pthread_cleanup_combined_routine
only and not the other routine and we need to cleanup.  In that case we need
to do the cleanup and self->cleanup ought to be equal to the buffer.

Another is cancellation with unwind info in callback, from debugging it
seems that the self->cleanup registered callbacks are invoked first and
those unregister the cleanup.  In earlier versions of the patch
__pthread_cleanup_cleanup has been called unconditionally from
__pthread_cleanup_combined_routine and in that case __pthread_cleanup_cleanup
would need to be conditional, but in the current version that is not needed.

Another is cancellation without unwind info in callback, that does
just self->cleanup registered callbacks and no problematic cases.

Another is normal return from the callback, in that case whether the
cleanups are run depends on the execute argument (which is 0 for
pthread_once).  The self->cleanup cleanup is unregistered with hardcoded 0
so callback isn't invoked, and then the cleanup attribute based one uses
the execute passed to the macro.  If it would be non-zero, we need
conditional cleanup because __pthread_cleanup_pop has already unregistered
it, if we only support 0, then __pthread_cleanup_cleanup can be
unconditional.

> > +# if defined __GNUC__ && defined __EXCEPTIONS && !defined __cplusplus
> > +/* Structure to hold the cleanup handler information.  */
> > +struct __pthread_cleanup_combined_frame
> > +{
> > +  void (*__cancel_routine) (void *);
> > +  void *__cancel_arg;
> > +  int __do_it;
> > +  struct _pthread_cleanup_buffer __buffer;
> > +};
> 
> No __GNUC__ is necessary in internal headers.

Ok, will change.

> > +/* Special cleanup macros which register cleanup both using
> > +   __pthread_cleanup_{push,pop} and using cleanup attribute.  This is needed
> > +   for pthread_once, so that it supports both throwing exceptions from the
> > +   pthread_once callback (only cleanup attribute works there) and cancellation
> > +   of the thread running the callback if the callback or some routines it
> > +   calls don't have unwind information.  */
> > +
> > +static inline void
> > +__pthread_cleanup_combined_routine (struct __pthread_cleanup_combined_frame
> > +				    *__frame)
> > +{
> > +  if (__frame->__do_it)
> > +    {
> > +      __frame->__cancel_routine (__frame->__cancel_arg);
> > +      __frame->__do_it = 0;
> > +      __pthread_cleanup_cleanup (&__frame->__buffer);
> > +    }
> > +}
> 
> Is there a way to turn the call to __frame->__cancel_routine into a
> direct call?  For more general use (outside pthread_once) that might be
> desirable from a security hardening perspective.  I expect it will
> happen if __pthread_cleanup_combined_routine_voidptr and
> __pthread_cleanup_combined_routine do not use the sam variable.

Looking at GCC tree dumps, __pthread_cleanup_combined_routine is inlined
(but because it is considered expensive, it is inlined only partially (i.e.
the __frame->__do_it test).
__attribute__((__always_inline__)) can force it to be inlined fully.
The other thing is the __cancel_routine call, and that is sadly indirect
even if I change it so that while __clframe.__do_it = 1; is done before
__pthread_cleanup_push, __clframe.__cancel_{routine,arg} is initialized
after it returns.  For the compiler, the address of __clframe escaped
and so it doesn't know if init_routine will not modify it.

For __pthread_cleanup_push registered callback, I'm afraid there is no way
out of that.
For the unwind info registered callback, maybe having one structure that
wouldn't be really address taken (containing __cancel_routine, __cancel_arg
and pointer to another structure containing the pthread cleanup buffer
and __do_it var) could do the job.

> I guess we can fix this later.  For pthread_once, the indirect call only
> happens on the unwinding path.

Yes.
> 
> > +static inline void
> > +__pthread_cleanup_combined_routine_voidptr (void *__arg)
> > +{
> > +  struct __pthread_cleanup_combined_frame *__frame
> > +    = (struct __pthread_cleanup_combined_frame *) __arg;
> > +  if (__frame->__do_it)
> > +    {
> > +      __frame->__cancel_routine (__frame->__cancel_arg);
> > +      __frame->__do_it = 0;
> > +    }
> > +}
> 
> I think you can make this an out-of-line routine because it can never be
> inlined.  But given that this is called only once, it probably does not
> matter.

This one indeed could be out of line and the other could have out of line
copy and extern inline version for inlining (or static inline
always_inline).  I didn't bother when it is for a single spot, but if
it would be for multiple ones, sure.
> 
> > diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
> > index eeb64f9fb0..53b65ef349 100644
> > --- a/sysdeps/pthread/Makefile
> > +++ b/sysdeps/pthread/Makefile
> > @@ -231,7 +231,7 @@ generated += $(objpfx)tst-atfork2.mtrace \
> >  
> >  tests-internal += tst-cancel25 tst-robust8
> >  
> > -tests += tst-oncex3 tst-oncex4
> > +tests += tst-oncex3 tst-oncex4 tst-oncey3 tst-oncey4
> 
> tst-oncey3.c, tst-oncey4.c are missing, but I guessed their contents.

Oops, sorry.  Their content is the same as of tst-oncex3.c and tst-oncex4.c.

	Jakub
  
Florian Weimer March 4, 2021, 1:26 p.m. UTC | #3
* Jakub Jelinek:

> On Thu, Mar 04, 2021 at 12:50:54PM +0100, Florian Weimer wrote:
>> I initially thought that we had to use setjmp-based registration
>> approach, but that does not seem to be the case.
>> 
>> Any idea why we have the setjmp-based approach (for -fno-exceptions) for
>> external use, and not use the legacy approach internal to glibc?  One
>> difference is that it restores the value of callee-saved registers,
>> which could matter if we switch between two kinds of unwinding (callback
>> registration and DWARF).
>
> I'm afraid I don't know, I thought the public pthread_cleanup_push/pop
> for -fno-exceptions is the same as the pthreadP.h one except that it uses
> _pthread_cleanup_* instead of __pthread_cleanup_*.
> All I remember that the normal unwinding does a longjmp at the end after
> processing all the self->cleanup registered cleanups, but I thought it is
> just to the pthread_create start.

I'm going to assume it's the callee saved registers issue.

>> Do you think this approach should used for all callback-based functions,
>> not just pthread_once?
>
> If we want to support throw from the callbacks that would do cleanups and
> continue throwing outside of the libc routines and at the same time
> support no unwind callbacks with cancellation points.
> I guess it would need separate analysis of:
> argp/Makefile:CFLAGS-argp-help.c += $(uses-callbacks) -fexceptions
> argp/Makefile:CFLAGS-argp-parse.c += $(uses-callbacks)
> dirent/Makefile:CFLAGS-scandir.c += $(uses-callbacks)
> dirent/Makefile:CFLAGS-scandir64.c += $(uses-callbacks)
> dirent/Makefile:CFLAGS-scandir-tail.c += $(uses-callbacks)
> dirent/Makefile:CFLAGS-scandir64-tail.c += $(uses-callbacks)
> elf/Makefile:CFLAGS-dl-iteratephdr.c += $(uses-callbacks)
> io/Makefile:CFLAGS-fts.c += -Wno-uninitialized $(uses-callbacks) -fexceptions
> io/Makefile:CFLAGS-fts64.c += -Wno-uninitialized $(uses-callbacks) -fexceptions
> io/Makefile:CFLAGS-ftw.c += $(uses-callbacks) -fexceptions
> io/Makefile:CFLAGS-ftw64.c += $(uses-callbacks) -fexceptions
> malloc/Makefile:CFLAGS-obstack.c += $(uses-callbacks)
> misc/Makefile:CFLAGS-tsearch.c += $(uses-callbacks)
> misc/Makefile:CFLAGS-lsearch.c += $(uses-callbacks)
> nptl/Makefile:CFLAGS-pthread_once.c += $(uses-callbacks) -fexceptions \
> posix/Makefile:CFLAGS-glob.c += $(uses-callbacks) -fexceptions
> posix/Makefile:CFLAGS-glob64.c += $(uses-callbacks) -fexceptions
> stdlib/Makefile:CFLAGS-bsearch.c += $(uses-callbacks)
> stdlib/Makefile:CFLAGS-msort.c += $(uses-callbacks)
> stdlib/Makefile:CFLAGS-qsort.c += $(uses-callbacks)
> find out what we support currently and what we need to support.

Or rename the macros and fix the stuff that breaks, yes.

For the internal uses, we could use a centralized cleanup function which
uses a switch statement over the cleanup type.  This would likely be
easier to use, too, and the indirect calls go away.

>> > --- a/nptl/cleanup_compat.c
>> > +++ b/nptl/cleanup_compat.c
>> > @@ -48,3 +48,11 @@ _pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer, int execute)
>> >      buffer->__routine (buffer->__arg);
>> >  }
>> >  strong_alias (_pthread_cleanup_pop, __pthread_cleanup_pop)
>> > +
>> > +void
>> > +__pthread_cleanup_cleanup (struct _pthread_cleanup_buffer *buffer)
>> > +{
>> > +  struct pthread *self = THREAD_SELF;
>> > +  if (THREAD_GETMEM (self, cleanup) == buffer)
>> > +    THREAD_SETMEM (self, cleanup, buffer->__prev);
>> > +}
>> 
>> It's surprising that the cleanup field update is conditional.  How does
>> that happen?  Is it really safe?
>
> As I've tried to explain, there are several cases.
> One is throw in the callback, that invokes __pthread_cleanup_combined_routine
> only and not the other routine and we need to cleanup.  In that case we need
> to do the cleanup and self->cleanup ought to be equal to the buffer.
>
> Another is cancellation with unwind info in callback, from debugging it
> seems that the self->cleanup registered callbacks are invoked first and
> those unregister the cleanup.  In earlier versions of the patch
> __pthread_cleanup_cleanup has been called unconditionally from
> __pthread_cleanup_combined_routine and in that case __pthread_cleanup_cleanup
> would need to be conditional, but in the current version that is not needed.
>
> Another is cancellation without unwind info in callback, that does
> just self->cleanup registered callbacks and no problematic cases.
>
> Another is normal return from the callback, in that case whether the
> cleanups are run depends on the execute argument (which is 0 for
> pthread_once).  The self->cleanup cleanup is unregistered with hardcoded 0
> so callback isn't invoked, and then the cleanup attribute based one uses
> the execute passed to the macro.  If it would be non-zero, we need
> conditional cleanup because __pthread_cleanup_pop has already unregistered
> it, if we only support 0, then __pthread_cleanup_cleanup can be
> unconditional.

Okay, we'll have to revisit this if we ever change the interfaces and do
the consolidation.

>> Is there a way to turn the call to __frame->__cancel_routine into a
>> direct call?  For more general use (outside pthread_once) that might be
>> desirable from a security hardening perspective.  I expect it will
>> happen if __pthread_cleanup_combined_routine_voidptr and
>> __pthread_cleanup_combined_routine do not use the sam variable.
>
> Looking at GCC tree dumps, __pthread_cleanup_combined_routine is inlined
> (but because it is considered expensive, it is inlined only partially (i.e.
> the __frame->__do_it test).
> __attribute__((__always_inline__)) can force it to be inlined fully.
> The other thing is the __cancel_routine call, and that is sadly indirect
> even if I change it so that while __clframe.__do_it = 1; is done before
> __pthread_cleanup_push, __clframe.__cancel_{routine,arg} is initialized
> after it returns.  For the compiler, the address of __clframe escaped
> and so it doesn't know if init_routine will not modify it.
>
> For __pthread_cleanup_push registered callback, I'm afraid there is no way
> out of that.
> For the unwind info registered callback, maybe having one structure that
> wouldn't be really address taken (containing __cancel_routine, __cancel_arg
> and pointer to another structure containing the pthread cleanup buffer
> and __do_it var) could do the job.

For __pthread_cleanup_push, we can avoid the indirect call on the
non-exception path if we move it out of the function, like I did here:

  [PATCH 3/8] nptl: Move legacy unwinding implementation into libc
  <https://sourceware.org/pipermail/libc-alpha/2021-March/123205.html>

Again, for pthread_once this does not matter.

>> > +static inline void
>> > +__pthread_cleanup_combined_routine_voidptr (void *__arg)
>> > +{
>> > +  struct __pthread_cleanup_combined_frame *__frame
>> > +    = (struct __pthread_cleanup_combined_frame *) __arg;
>> > +  if (__frame->__do_it)
>> > +    {
>> > +      __frame->__cancel_routine (__frame->__cancel_arg);
>> > +      __frame->__do_it = 0;
>> > +    }
>> > +}
>> 
>> I think you can make this an out-of-line routine because it can never be
>> inlined.  But given that this is called only once, it probably does not
>> matter.
>
> This one indeed could be out of line and the other could have out of line
> copy and extern inline version for inlining (or static inline
> always_inline).  I didn't bother when it is for a single spot, but if
> it would be for multiple ones, sure.

Understood, thanks.

I think the patch is okay, except for the nits I pointed out.

Thanks,
Florian
  
Jakub Jelinek March 4, 2021, 2:14 p.m. UTC | #4
On Thu, Mar 04, 2021 at 02:26:32PM +0100, Florian Weimer wrote:
> I think the patch is okay, except for the nits I pointed out.

Here is an updated patch.  Changes from the first one are
dropped __GNUC__ check, added tst-oncey*.c tests, avoiding indirect
function call in the theoretical execute != 0 case and dropping
__pthread_cleanup_cleanup because of that as it is now called
only in case the unwind cleanup does the cleaning up which means
neither unwind_stop nor __pthread_cleanup_pop did it.

[PATCH] pthread_once hangs when init routine throws an exception [BZ #18435]

This is another attempt at making pthread_once handle throwing exceptions
from the init routine callback.  As the new testcases show, just switching
to the cleanup attribute based cleanup does fix the tst-once5 test, but
breaks the new tst-oncey3 test.  That is because when throwing exceptions,
only the unwind info registered cleanups (i.e. C++ destructors or cleanup
attribute), when cancelling threads and there has been unwind info from the
cancellation point up to whatever needs cleanup both unwind info registered
cleanups and THREAD_SETMEM (self, cleanup, ...) registered cleanups are
invoked, but once we hit some frame with no unwind info, only the
THREAD_SETMEM (self, cleanup, ...) registered cleanups are invoked.
So, to stay fully backwards compatible (allow init routines without
unwind info which encounter cancellation points) and handle exception throwing
we actually need to register the pthread_once cleanups in both unwind info
and in the THREAD_SETMEM (self, cleanup, ...) way.
If an exception is thrown, only the former will happen and we in that case
need to also unregister the THREAD_SETMEM (self, cleanup, ...) registered
handler, because otherwise after catching the exception the user code could
call deeper into the stack some cancellation point, get cancelled and then
a stale cleanup handler would clobber stack and probably crash.
If a thread calling init routine is cancelled and unwind info ends before
the pthread_once frame, it will be cleaned up through self->cleanup as
before.  And if unwind info is present, unwind_stop first calls the
self->cleanup registered handler for the frame, then it will call the
unwind info registered handler but that will already see __do_it == 0
and do nothing.

diff --git a/nptl/Makefile b/nptl/Makefile
index 5f85dd7854..33766eaf7a 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -386,10 +386,6 @@ xtests += tst-eintr1
 
 test-srcs = tst-oddstacklimit
 
-# Test expected to fail on most targets (except x86_64) due to bug
-# 18435 - pthread_once hangs when init routine throws an exception.
-test-xfail-tst-once5 = yes
-
 gen-as-const-headers = unwindbuf.sym \
 		       pthread-pi-defines.sym
 
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index d2fd0826fe..93f3cef00f 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -604,6 +604,67 @@ extern void __pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer,
 # undef pthread_cleanup_pop
 # define pthread_cleanup_pop(execute)                   \
   __pthread_cleanup_pop (&_buffer, (execute)); }
+
+# if defined __EXCEPTIONS && !defined __cplusplus
+/* Structure to hold the cleanup handler information.  */
+struct __pthread_cleanup_combined_frame
+{
+  void (*__cancel_routine) (void *);
+  void *__cancel_arg;
+  int __do_it;
+  struct _pthread_cleanup_buffer __buffer;
+};
+
+/* Special cleanup macros which register cleanup both using
+   __pthread_cleanup_{push,pop} and using cleanup attribute.  This is needed
+   for pthread_once, so that it supports both throwing exceptions from the
+   pthread_once callback (only cleanup attribute works there) and cancellation
+   of the thread running the callback if the callback or some routines it
+   calls don't have unwind information.  */
+
+static __always_inline void
+__pthread_cleanup_combined_routine (struct __pthread_cleanup_combined_frame
+				    *__frame)
+{
+  if (__frame->__do_it)
+    {
+      __frame->__cancel_routine (__frame->__cancel_arg);
+      __frame->__do_it = 0;
+      __pthread_cleanup_pop (&__frame->__buffer, 0);
+    }
+}
+
+static inline void
+__pthread_cleanup_combined_routine_voidptr (void *__arg)
+{
+  struct __pthread_cleanup_combined_frame *__frame
+    = (struct __pthread_cleanup_combined_frame *) __arg;
+  if (__frame->__do_it)
+    {
+      __frame->__cancel_routine (__frame->__cancel_arg);
+      __frame->__do_it = 0;
+    }
+}
+
+#  define pthread_cleanup_combined_push(routine, arg) \
+  do {									      \
+    void (*__cancel_routine) (void *) = (routine);			      \
+    struct __pthread_cleanup_combined_frame __clframe			      \
+      __attribute__ ((__cleanup__ (__pthread_cleanup_combined_routine)))      \
+      = { .__cancel_routine = __cancel_routine, .__cancel_arg = (arg),	      \
+	  .__do_it = 1 };						      \
+    __pthread_cleanup_push (&__clframe.__buffer,			      \
+			    __pthread_cleanup_combined_routine_voidptr,	      \
+			    &__clframe);
+
+#  define pthread_cleanup_combined_pop(execute) \
+    __pthread_cleanup_pop (&__clframe.__buffer, 0);			      \
+    __clframe.__do_it = 0;						      \
+    if (execute)							      \
+      __cancel_routine (__clframe.__cancel_arg);			      \
+  } while (0)
+
+# endif
 #endif
 
 extern void __pthread_cleanup_push_defer (struct _pthread_cleanup_buffer *buffer,
diff --git a/nptl/pthread_once.c b/nptl/pthread_once.c
index 28d97097c7..7645da222a 100644
--- a/nptl/pthread_once.c
+++ b/nptl/pthread_once.c
@@ -111,11 +111,11 @@ __pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
       /* This thread is the first here.  Do the initialization.
 	 Register a cleanup handler so that in case the thread gets
 	 interrupted the initialization can be restarted.  */
-      pthread_cleanup_push (clear_once_control, once_control);
+      pthread_cleanup_combined_push (clear_once_control, once_control);
 
       init_routine ();
 
-      pthread_cleanup_pop (0);
+      pthread_cleanup_combined_pop (0);
 
 
       /* Mark *once_control as having finished the initialization.  We need
diff --git a/nptl/tst-once5.cc b/nptl/tst-once5.cc
index b797ab3562..60fe1ef820 100644
--- a/nptl/tst-once5.cc
+++ b/nptl/tst-once5.cc
@@ -59,7 +59,7 @@ do_test (void)
                " throwing an exception", stderr);
     }
     catch (OnceException) {
-      if (1 < niter)
+      if (niter > 1)
         fputs ("pthread_once unexpectedly threw", stderr);
       result = 0;
     }
@@ -75,7 +75,5 @@ do_test (void)
   return result;
 }
 
-// The test currently hangs and is XFAILed.  Reduce the timeout.
-#define TIMEOUT 1
 #define TEST_FUNCTION do_test ()
 #include "../test-skeleton.c"
diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
index eeb64f9fb0..53b65ef349 100644
--- a/sysdeps/pthread/Makefile
+++ b/sysdeps/pthread/Makefile
@@ -231,7 +231,7 @@ generated += $(objpfx)tst-atfork2.mtrace \
 
 tests-internal += tst-cancel25 tst-robust8
 
-tests += tst-oncex3 tst-oncex4
+tests += tst-oncex3 tst-oncex4 tst-oncey3 tst-oncey4
 
 modules-names += tst-join7mod
 
@@ -242,6 +242,8 @@ endif
 
 CFLAGS-tst-oncex3.c += -fexceptions
 CFLAGS-tst-oncex4.c += -fexceptions
+CFLAGS-tst-oncey3.c += -fno-exceptions -fno-asynchronous-unwind-tables
+CFLAGS-tst-oncey4.c += -fno-exceptions -fno-asynchronous-unwind-tables
 
 $(objpfx)tst-join7: $(libdl) $(shared-thread-library)
 $(objpfx)tst-join7.out: $(objpfx)tst-join7mod.so
diff --git a/sysdeps/pthread/tst-oncey3.c b/sysdeps/pthread/tst-oncey3.c
new file mode 100644
index 0000000000..08225b88dc
--- /dev/null
+++ b/sysdeps/pthread/tst-oncey3.c
@@ -0,0 +1 @@
+#include "tst-once3.c"
diff --git a/sysdeps/pthread/tst-oncey4.c b/sysdeps/pthread/tst-oncey4.c
new file mode 100644
index 0000000000..9b4d98f3f1
--- /dev/null
+++ b/sysdeps/pthread/tst-oncey4.c
@@ -0,0 +1 @@
+#include "tst-once4.c"

	Jakub
  

Patch

diff --git a/nptl/Makefile b/nptl/Makefile
index 5f85dd7854..33766eaf7a 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -386,10 +386,6 @@  xtests += tst-eintr1
 
 test-srcs = tst-oddstacklimit
 
-# Test expected to fail on most targets (except x86_64) due to bug
-# 18435 - pthread_once hangs when init routine throws an exception.
-test-xfail-tst-once5 = yes
-
 gen-as-const-headers = unwindbuf.sym \
 		       pthread-pi-defines.sym
 
diff --git a/nptl/cleanup_compat.c b/nptl/cleanup_compat.c
index fec88c2f86..b5b848338a 100644
--- a/nptl/cleanup_compat.c
+++ b/nptl/cleanup_compat.c
@@ -48,3 +48,11 @@  _pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer, int execute)
     buffer->__routine (buffer->__arg);
 }
 strong_alias (_pthread_cleanup_pop, __pthread_cleanup_pop)
+
+void
+__pthread_cleanup_cleanup (struct _pthread_cleanup_buffer *buffer)
+{
+  struct pthread *self = THREAD_SELF;
+  if (THREAD_GETMEM (self, cleanup) == buffer)
+    THREAD_SETMEM (self, cleanup, buffer->__prev);
+}
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index d2fd0826fe..297426be1e 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -604,6 +604,67 @@  extern void __pthread_cleanup_pop (struct _pthread_cleanup_buffer *buffer,
 # undef pthread_cleanup_pop
 # define pthread_cleanup_pop(execute)                   \
   __pthread_cleanup_pop (&_buffer, (execute)); }
+
+extern void __pthread_cleanup_cleanup (struct _pthread_cleanup_buffer *buffer)
+     attribute_hidden;
+
+# if defined __GNUC__ && defined __EXCEPTIONS && !defined __cplusplus
+/* Structure to hold the cleanup handler information.  */
+struct __pthread_cleanup_combined_frame
+{
+  void (*__cancel_routine) (void *);
+  void *__cancel_arg;
+  int __do_it;
+  struct _pthread_cleanup_buffer __buffer;
+};
+
+/* Special cleanup macros which register cleanup both using
+   __pthread_cleanup_{push,pop} and using cleanup attribute.  This is needed
+   for pthread_once, so that it supports both throwing exceptions from the
+   pthread_once callback (only cleanup attribute works there) and cancellation
+   of the thread running the callback if the callback or some routines it
+   calls don't have unwind information.  */
+
+static inline void
+__pthread_cleanup_combined_routine (struct __pthread_cleanup_combined_frame
+				    *__frame)
+{
+  if (__frame->__do_it)
+    {
+      __frame->__cancel_routine (__frame->__cancel_arg);
+      __frame->__do_it = 0;
+      __pthread_cleanup_cleanup (&__frame->__buffer);
+    }
+}
+
+static inline void
+__pthread_cleanup_combined_routine_voidptr (void *__arg)
+{
+  struct __pthread_cleanup_combined_frame *__frame
+    = (struct __pthread_cleanup_combined_frame *) __arg;
+  if (__frame->__do_it)
+    {
+      __frame->__cancel_routine (__frame->__cancel_arg);
+      __frame->__do_it = 0;
+    }
+}
+
+#  define pthread_cleanup_combined_push(routine, arg) \
+  do {									      \
+    struct __pthread_cleanup_combined_frame __clframe			      \
+      __attribute__ ((__cleanup__ (__pthread_cleanup_combined_routine)))      \
+      = { .__cancel_routine = (routine), .__cancel_arg = (arg),	 	      \
+	  .__do_it = 1 };						      \
+    __pthread_cleanup_push (&__clframe.__buffer,			      \
+			    __pthread_cleanup_combined_routine_voidptr,	      \
+			    &__clframe);
+
+#  define pthread_cleanup_combined_pop(execute) \
+    __pthread_cleanup_pop (&__clframe.__buffer, 0);			      \
+    __clframe.__do_it = (execute);					      \
+  } while (0)
+
+# endif
 #endif
 
 extern void __pthread_cleanup_push_defer (struct _pthread_cleanup_buffer *buffer,
diff --git a/nptl/pthread_once.c b/nptl/pthread_once.c
index 28d97097c7..7645da222a 100644
--- a/nptl/pthread_once.c
+++ b/nptl/pthread_once.c
@@ -111,11 +111,11 @@  __pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
       /* This thread is the first here.  Do the initialization.
 	 Register a cleanup handler so that in case the thread gets
 	 interrupted the initialization can be restarted.  */
-      pthread_cleanup_push (clear_once_control, once_control);
+      pthread_cleanup_combined_push (clear_once_control, once_control);
 
       init_routine ();
 
-      pthread_cleanup_pop (0);
+      pthread_cleanup_combined_pop (0);
 
 
       /* Mark *once_control as having finished the initialization.  We need
diff --git a/nptl/tst-once5.cc b/nptl/tst-once5.cc
index b797ab3562..60fe1ef820 100644
--- a/nptl/tst-once5.cc
+++ b/nptl/tst-once5.cc
@@ -59,7 +59,7 @@  do_test (void)
                " throwing an exception", stderr);
     }
     catch (OnceException) {
-      if (1 < niter)
+      if (niter > 1)
         fputs ("pthread_once unexpectedly threw", stderr);
       result = 0;
     }
@@ -75,7 +75,5 @@  do_test (void)
   return result;
 }
 
-// The test currently hangs and is XFAILed.  Reduce the timeout.
-#define TIMEOUT 1
 #define TEST_FUNCTION do_test ()
 #include "../test-skeleton.c"
diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
index eeb64f9fb0..53b65ef349 100644
--- a/sysdeps/pthread/Makefile
+++ b/sysdeps/pthread/Makefile
@@ -231,7 +231,7 @@  generated += $(objpfx)tst-atfork2.mtrace \
 
 tests-internal += tst-cancel25 tst-robust8
 
-tests += tst-oncex3 tst-oncex4
+tests += tst-oncex3 tst-oncex4 tst-oncey3 tst-oncey4
 
 modules-names += tst-join7mod
 
@@ -242,6 +242,8 @@  endif
 
 CFLAGS-tst-oncex3.c += -fexceptions
 CFLAGS-tst-oncex4.c += -fexceptions
+CFLAGS-tst-oncey3.c += -fno-exceptions -fno-asynchronous-unwind-tables
+CFLAGS-tst-oncey4.c += -fno-exceptions -fno-asynchronous-unwind-tables
 
 $(objpfx)tst-join7: $(libdl) $(shared-thread-library)
 $(objpfx)tst-join7.out: $(objpfx)tst-join7mod.so