Fix for bz14333 -- race between atexit() and exit()
Commit Message
On Tue, Sep 19, 2017 at 6:42 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 09/19/2017 04:33 PM, Paul Pluzhnikov wrote:
>> I'll add your test to the set of tests and send updated patch.
>
> My test has an implicit timing dependency. We wait 10 seconds to allow
> the other thread to make progress against the exit() call, we should
> change that to use another semaphore so it proceeds in a tick-tock
> fashion without any timing requirement.
I decided to keep the fix for this newly-discovered problem and the
new test to a separate patch, which I'll mail after this one is
committed.
Attached is the current patch with comments updated per above discussion.
Thanks,
Comments
On 09/20/2017 09:52 AM, Paul Pluzhnikov wrote:
> On Tue, Sep 19, 2017 at 6:42 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 09/19/2017 04:33 PM, Paul Pluzhnikov wrote:
>
>>> I'll add your test to the set of tests and send updated patch.
>>
>> My test has an implicit timing dependency. We wait 10 seconds to allow
>> the other thread to make progress against the exit() call, we should
>> change that to use another semaphore so it proceeds in a tick-tock
>> fashion without any timing requirement.
>
> I decided to keep the fix for this newly-discovered problem and the
> new test to a separate patch, which I'll mail after this one is
> committed.
>
> Attached is the current patch with comments updated per above discussion.
Looks great.
Please commit.
I look forward to reviewing the follow-on patch too :-)
I note the commit message just said "Fix BZ 14333". Can people please
include the more detailed descriptions of changes made, as in their
mailing list postings, in the commit message? There are certainly some
simple changes for which just a summary line is sufficient explanation,
but I don't think this was such a change. Descriptive commit messages are
especially important if we stop using ChangeLog entries in future.
(This does mean it's good practice, when making a series of revisions of a
patch, for each successive revision to have both the full self-contained
explanation, updated for that patch revision, such as would go in the
commit message, and a description of what changed from the previous patch
version, which won't go in the commit message.)
On 09/20/2017 10:50 AM, Joseph Myers wrote:
> I note the commit message just said "Fix BZ 14333". Can people please
> include the more detailed descriptions of changes made, as in their
> mailing list postings, in the commit message? There are certainly some
> simple changes for which just a summary line is sufficient explanation,
> but I don't think this was such a change. Descriptive commit messages are
> especially important if we stop using ChangeLog entries in future.
>
> (This does mean it's good practice, when making a series of revisions of a
> patch, for each successive revision to have both the full self-contained
> explanation, updated for that patch revision, such as would go in the
> commit message, and a description of what changed from the previous patch
> version, which won't go in the commit message.)
I agree.
I expect committers to follow what we ask for here:
https://sourceware.org/glibc/wiki/Contribution%20checklist#Detailed_Explanation_of_the_Patch
~~~
5. Detailed Explanation of the Patch
The detailed explanation will become the body of the commit message
for your patch. Please keep this in mind and format accordingly or
indicate to the reviewer which part of the email should be the
body of the commit message.
~~~
I will be more diligent in asking the submitter to clarify exactly
what will be in the commit message and to repost with a clear message
for subsequent versions.
On Wed, Sep 20, 2017 at 9:57 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 09/20/2017 10:50 AM, Joseph Myers wrote:
>> I note the commit message just said "Fix BZ 14333". Can people please
>> include the more detailed descriptions of changes made, as in their
>> mailing list postings, in the commit message?
Sorry about that.
(I've been using the ChangeLog entry as the commit message, but I
noticed that few others do.)
On 9/20/17, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> On Wed, Sep 20, 2017 at 9:57 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 09/20/2017 10:50 AM, Joseph Myers wrote:
>>> I note the commit message just said "Fix BZ 14333". Can people please
>>> include the more detailed descriptions of changes made, as in their
>>> mailing list postings, in the commit message?
>
> Sorry about that.
>
> (I've been using the ChangeLog entry as the commit message, but I
> noticed that few others do.)
>
New tests fail at random on i686:
https://sourceware.org/bugzilla/show_bug.cgi?id=22207
On Mon, Sep 25, 2017 at 3:53 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On 9/20/17, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> > On Wed, Sep 20, 2017 at 9:57 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> >> On 09/20/2017 10:50 AM, Joseph Myers wrote:
> >>> I note the commit message just said "Fix BZ 14333". Can people please
> >>> include the more detailed descriptions of changes made, as in their
> >>> mailing list postings, in the commit message?
> >
> > Sorry about that.
> >
> > (I've been using the ChangeLog entry as the commit message, but I
> > noticed that few others do.)
> >
>
> New tests fail at random on i686:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=22207
With default "ulimit -s" of 8192, the test can try to create 1024
threads with a total of 8GiB of RAM usage, which is a bit too much for
a 32-bit system.
I'll send a patch.
Thanks,
@@ -81,7 +81,9 @@ tests := tst-strtol tst-strtod testmb testrand testsort testdiv \
tst-quick_exit tst-thread-quick_exit tst-width \
tst-width-stdint tst-strfrom tst-strfrom-locale \
tst-getrandom tst-atexit tst-at_quick_exit \
- tst-cxa_atexit tst-on_exit
+ tst-cxa_atexit tst-on_exit test-atexit-race \
+ test-at_quick_exit-race test-cxa_atexit-race \
+ test-on_exit-race
tests-internal := tst-strtod1i tst-strtod3 tst-strtod4 tst-strtod5i \
tst-tls-atexit tst-tls-atexit-nodelete
@@ -91,6 +93,11 @@ ifeq ($(build-hardcoded-path-in-tests),yes)
tests += tst-empty-env
endif
+LDLIBS-test-atexit-race = $(shared-thread-library)
+LDLIBS-test-at_quick_exit-race = $(shared-thread-library)
+LDLIBS-test-cxa_atexit-race = $(shared-thread-library)
+LDLIBS-test-on_exit-race = $(shared-thread-library)
+
ifeq ($(have-cxx-thread_local),yes)
CFLAGS-tst-quick_exit.o = -std=c++11
LDLIBS-tst-quick_exit = -lstdc++
@@ -21,21 +21,29 @@
#include <libc-lock.h>
#include "exit.h"
-#include <atomic.h>
#include <sysdep.h>
#undef __cxa_atexit
+/* See concurrency notes in stdlib/exit.h where this lock is declared. */
+__libc_lock_define_initialized (, __exit_funcs_lock)
+
int
attribute_hidden
__internal_atexit (void (*func) (void *), void *arg, void *d,
struct exit_function_list **listp)
{
- struct exit_function *new = __new_exitfn (listp);
+ struct exit_function *new;
+
+ __libc_lock_lock (__exit_funcs_lock);
+ new = __new_exitfn (listp);
if (new == NULL)
- return -1;
+ {
+ __libc_lock_unlock (__exit_funcs_lock);
+ return -1;
+ }
#ifdef PTR_MANGLE
PTR_MANGLE (func);
@@ -43,8 +51,8 @@ __internal_atexit (void (*func) (void *), void *arg, void *d,
new->func.cxa.fn = (void (*) (void *, int)) func;
new->func.cxa.arg = arg;
new->func.cxa.dso_handle = d;
- atomic_write_barrier ();
new->flavor = ef_cxa;
+ __libc_lock_unlock (__exit_funcs_lock);
return 0;
}
@@ -60,14 +68,11 @@ __cxa_atexit (void (*func) (void *), void *arg, void *d)
libc_hidden_def (__cxa_atexit)
-/* We change global data, so we need locking. */
-__libc_lock_define_initialized (static, lock)
-
-
static struct exit_function_list initial;
struct exit_function_list *__exit_funcs = &initial;
uint64_t __new_exitfn_called;
+/* Must be called with __exit_funcs_lock held. */
struct exit_function *
__new_exitfn (struct exit_function_list **listp)
{
@@ -76,7 +81,10 @@ __new_exitfn (struct exit_function_list **listp)
struct exit_function *r = NULL;
size_t i = 0;
- __libc_lock_lock (lock);
+ if (__exit_funcs_done)
+ /* Exit code is finished processing all registered exit functions,
+ therefore we fail this registration. */
+ return NULL;
for (l = *listp; l != NULL; p = l, l = l->next)
{
@@ -127,7 +135,5 @@ __new_exitfn (struct exit_function_list **listp)
++__new_exitfn_called;
}
- __libc_lock_unlock (lock);
-
return r;
}
@@ -17,7 +17,6 @@
#include <assert.h>
#include <stdlib.h>
-#include <atomic.h>
#include "exit.h"
#include <fork.h>
#include <sysdep.h>
@@ -31,36 +30,64 @@ __cxa_finalize (void *d)
{
struct exit_function_list *funcs;
+ __libc_lock_lock (__exit_funcs_lock);
+
restart:
for (funcs = __exit_funcs; funcs; funcs = funcs->next)
{
struct exit_function *f;
for (f = &funcs->fns[funcs->idx - 1]; f >= &funcs->fns[0]; --f)
- {
- void (*cxafn) (void *arg, int status);
- void *cxaarg;
+ if ((d == NULL || d == f->func.cxa.dso_handle) && f->flavor == ef_cxa)
+ {
+ const uint64_t check = __new_exitfn_called;
+ void (*cxafn) (void *arg, int status) = f->func.cxa.fn;
+ void *cxaarg = f->func.cxa.arg;
+
+ /* We don't want to run this cleanup more than once. The Itanium
+ C++ ABI requires that multiple calls to __cxa_finalize not
+ result in calling termination functions more than once. One
+ potential scenario where that could happen is with a concurrent
+ dlclose and exit, where the running dlclose must at some point
+ release the list lock, an exiting thread may acquire it, and
+ without setting flavor to ef_free, might re-run this destructor
+ which could result in undefined behaviour. Therefore we must
+ set flavor to ef_free to avoid calling this destructor again.
+ Note that the concurrent exit must also take the dynamic loader
+ lock (for library finalizer processing) and therefore will
+ block while dlclose completes the processing of any in-progress
+ exit functions. Lastly, once we release the list lock for the
+ entry marked ef_free, we must not read from that entry again
+ since it may have been reused by the time we take the list lock
+ again. Lastly the detection of new registered exit functions is
+ based on a monotonically incrementing counter, and there is an
+ ABA if between the unlock to run the exit function and the
+ re-lock after completion the user registers 2^64 exit functions,
+ the implementation will not detect this and continue without
+ executing any more functions.
- if ((d == NULL || d == f->func.cxa.dso_handle)
- /* We don't want to run this cleanup more than once. */
- && (cxafn = f->func.cxa.fn,
- cxaarg = f->func.cxa.arg,
- ! catomic_compare_and_exchange_bool_acq (&f->flavor, ef_free,
- ef_cxa)))
- {
- uint64_t check = __new_exitfn_called;
+ One minor issue remains: A registered exit function that is in
+ progress by a call to dlclose() may not completely finish before
+ the next registered exit function is run. This may, according to
+ some readings of POSIX violate the requirement that functions
+ run in effective LIFO order. This should probably be fixed in a
+ future implementation to ensure the functions do not run in
+ parallel. */
+ f->flavor = ef_free;
#ifdef PTR_DEMANGLE
- PTR_DEMANGLE (cxafn);
+ PTR_DEMANGLE (cxafn);
#endif
- cxafn (cxaarg, 0);
+ /* Unlock the list while we call a foreign function. */
+ __libc_lock_unlock (__exit_funcs_lock);
+ cxafn (cxaarg, 0);
+ __libc_lock_lock (__exit_funcs_lock);
- /* It is possible that that last exit function registered
- more exit functions. Start the loop over. */
- if (__glibc_unlikely (check != __new_exitfn_called))
- goto restart;
- }
- }
+ /* It is possible that that last exit function registered
+ more exit functions. Start the loop over. */
+ if (__glibc_unlikely (check != __new_exitfn_called))
+ goto restart;
+ }
}
/* Also remove the quick_exit handlers, but do not call them. */
@@ -79,4 +106,5 @@ __cxa_finalize (void *d)
if (d != NULL)
UNREGISTER_ATFORK (d);
#endif
+ __libc_lock_unlock (__exit_funcs_lock);
}
@@ -19,11 +19,16 @@
#include <stdlib.h>
#include <unistd.h>
#include <sysdep.h>
+#include <libc-lock.h>
#include "exit.h"
#include "set-hooks.h"
DEFINE_HOOK (__libc_atexit, (void))
+/* Initialize the flag that indicates exit function processing
+ is complete. See concurrency notes in stdlib/exit.h where
+ __exit_funcs_lock is declared. */
+bool __exit_funcs_done = false;
/* Call all functions registered with `atexit' and `on_exit',
in the reverse of the order in which they were registered
@@ -44,14 +49,32 @@ __run_exit_handlers (int status, struct exit_function_list **listp,
the functions registered with `atexit' and `on_exit'. We call
everyone on the list and use the status value in the last
exit (). */
- while (*listp != NULL)
+ while (true)
{
- struct exit_function_list *cur = *listp;
+ struct exit_function_list *cur;
+
+ __libc_lock_lock (__exit_funcs_lock);
+
+ restart:
+ cur = *listp;
+
+ if (cur == NULL)
+ {
+ /* Exit processing complete. We will not allow any more
+ atexit/on_exit registrations. */
+ __exit_funcs_done = true;
+ __libc_lock_unlock (__exit_funcs_lock);
+ break;
+ }
while (cur->idx > 0)
{
const struct exit_function *const f =
&cur->fns[--cur->idx];
+ const uint64_t new_exitfn_called = __new_exitfn_called;
+
+ /* Unlock the list while we call a foreign function. */
+ __libc_lock_unlock (__exit_funcs_lock);
switch (f->flavor)
{
void (*atfct) (void);
@@ -83,6 +106,13 @@ __run_exit_handlers (int status, struct exit_function_list **listp,
cxafct (f->func.cxa.arg, status);
break;
}
+ /* Re-lock again before looking at global state. */
+ __libc_lock_lock (__exit_funcs_lock);
+
+ if (__glibc_unlikely (new_exitfn_called != __new_exitfn_called))
+ /* The last exit function, or another thread, has registered
+ more exit functions. Start the loop over. */
+ goto restart;
}
*listp = cur->next;
@@ -90,6 +120,8 @@ __run_exit_handlers (int status, struct exit_function_list **listp,
/* Don't free the last element in the chain, this is the statically
allocate element. */
free (cur);
+
+ __libc_lock_unlock (__exit_funcs_lock);
}
if (run_list_atexit)
@@ -20,6 +20,7 @@
#include <stdbool.h>
#include <stdint.h>
+#include <libc-lock.h>
enum
{
@@ -57,11 +58,27 @@ struct exit_function_list
size_t idx;
struct exit_function fns[32];
};
+
extern struct exit_function_list *__exit_funcs attribute_hidden;
extern struct exit_function_list *__quick_exit_funcs attribute_hidden;
+extern uint64_t __new_exitfn_called attribute_hidden;
+
+/* True once all registered atexit/at_quick_exit/onexit handlers have been
+ called */
+extern bool __exit_funcs_done attribute_hidden;
+
+/* This lock protects __exit_funcs, __quick_exit_funcs, __exit_funcs_done
+ and __new_exitfn_called globals against simultaneous access from
+ atexit/on_exit/at_quick_exit in multiple threads, and also from
+ simultaneous access while another thread is in the middle of calling
+ exit handlers. See BZ#14333. Note: for lists, the entire list, and
+ each associated entry in the list, is protected for all access by this
+ lock. */
+__libc_lock_define (extern, __exit_funcs_lock);
+
extern struct exit_function *__new_exitfn (struct exit_function_list **listp);
-extern uint64_t __new_exitfn_called attribute_hidden;
+
extern void __run_exit_handlers (int status,
struct exit_function_list **listp,
@@ -17,25 +17,30 @@
#include <stdlib.h>
#include "exit.h"
-#include <atomic.h>
#include <sysdep.h>
/* Register a function to be called by exit. */
int
__on_exit (void (*func) (int status, void *arg), void *arg)
{
- struct exit_function *new = __new_exitfn (&__exit_funcs);
+ struct exit_function *new;
+
+ __libc_lock_lock (__exit_funcs_lock);
+ new = __new_exitfn (&__exit_funcs);
if (new == NULL)
- return -1;
+ {
+ __libc_lock_unlock (__exit_funcs_lock);
+ return -1;
+ }
#ifdef PTR_MANGLE
PTR_MANGLE (func);
#endif
new->func.on.fn = func;
new->func.on.arg = arg;
- atomic_write_barrier ();
new->flavor = ef_on;
+ __libc_lock_unlock (__exit_funcs_lock);
return 0;
}
weak_alias (__on_exit, on_exit)
new file mode 100644
@@ -0,0 +1,32 @@
+/* Bug 14333: a test for at_quick_exit/quick_exit race.
+
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* This file must be run from within a directory called "stdlib". */
+
+/* See stdlib/test-atexit-race-common.c for details on this test. */
+
+#define CALL_ATEXIT at_quick_exit (&no_op)
+#define CALL_EXIT quick_exit (0)
+
+static void
+no_op (void)
+{
+}
+
+#include <stdlib/test-atexit-race-common.c>
new file mode 100644
@@ -0,0 +1,70 @@
+/* Bug 14333: Support file for atexit/exit, etc. race tests.
+
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* This file must be run from within a directory called "stdlib". */
+
+/* The atexit/exit, at_quick_exit/quick_exit, __cxa_atexit/exit, etc.
+ exhibited data race while accessing destructor function list (Bug 14333).
+
+ This test spawns large number of threads, which all race to register
+ large number of destructors.
+
+ Before the fix, running this test resulted in a SIGSEGV.
+ After the fix, we expect clean process termination. */
+
+#if !defined(CALL_EXIT) || !defined(CALL_ATEXIT)
+#error Must define CALL_EXIT and CALL_ATEXIT before using this file.
+#endif
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <support/xthread.h>
+
+const size_t kNumThreads = 1024;
+const size_t kNumHandlers = 1024;
+
+static void *
+threadfunc (void *unused)
+{
+ size_t i;
+ for (i = 0; i < kNumHandlers; ++i) {
+ CALL_ATEXIT;
+ }
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ size_t i;
+ pthread_attr_t attr;
+
+ xpthread_attr_init (&attr);
+ xpthread_attr_setdetachstate (&attr, 1);
+
+ for (i = 0; i < kNumThreads; ++i) {
+ xpthread_create (&attr, threadfunc, NULL);
+ }
+ xpthread_attr_destroy (&attr);
+
+ CALL_EXIT;
+}
+
+#define TEST_FUNCTION do_test
+#include <support/test-driver.c>
new file mode 100644
@@ -0,0 +1,32 @@
+/* Bug 14333: a test for atexit/exit race.
+
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* This file must be run from within a directory called "stdlib". */
+
+/* See stdlib/test-atexit-race-common.c for details on this test. */
+
+#define CALL_ATEXIT atexit (&no_op)
+#define CALL_EXIT exit (0)
+
+static void
+no_op (void)
+{
+}
+
+#include <stdlib/test-atexit-race-common.c>
new file mode 100644
@@ -0,0 +1,36 @@
+/* Bug 14333: a test for __cxa_atexit/exit race.
+
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* This file must be run from within a directory called "stdlib". */
+
+/* See stdlib/test-atexit-race-common.c for details on this test. */
+
+#include <stdio.h>
+
+#define CALL_ATEXIT __cxa_atexit (&no_op, NULL, NULL)
+#define CALL_EXIT exit (0)
+
+int __cxa_atexit (void (*func) (void *), void *arg, void *d);
+
+static void
+no_op (void *ignored)
+{
+}
+
+#include <stdlib/test-atexit-race-common.c>
new file mode 100644
@@ -0,0 +1,32 @@
+/* Bug 14333: a test for on_exit/exit race.
+
+ Copyright (C) 2017 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* This file must be run from within a directory called "stdlib". */
+
+/* See stdlib/test-atexit-race-common.c for details on this test. */
+
+#define CALL_ATEXIT on_exit (&no_op, (void *) 0)
+#define CALL_EXIT exit (0)
+
+static void
+no_op (int exit_code, void *ignored)
+{
+}
+
+#include <stdlib/test-atexit-race-common.c>