From patchwork Mon Feb 11 10:23:05 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Stefan Liebler X-Patchwork-Id: 31395 Received: (qmail 25544 invoked by alias); 11 Feb 2019 10:23:16 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 25258 invoked by uid 89); 11 Feb 2019 10:23:16 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-27.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy=nobody, up, cloned, Bye X-HELO: mx0a-001b2d01.pphosted.com Subject: Re: [PATCH] misc/tst-clone3: Fix waiting for exited thread. To: libc-alpha@sourceware.org References: <0bffc1e1-9cde-181b-cf10-4cc3530fa9cb@linux.ibm.com> <3a9bc23d-c5c6-2a6f-890e-efb9f476e104@linaro.org> From: Stefan Liebler Date: Mon, 11 Feb 2019 11:23:05 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: <3a9bc23d-c5c6-2a6f-890e-efb9f476e104@linaro.org> x-cbid: 19021110-4275-0000-0000-0000030E0114 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 19021110-4276-0000-0000-0000381C11AE Message-Id: Attached the updated patch. See also notes below. On 02/08/2019 07:37 PM, Adhemerval Zanella wrote: > > > On 08/02/2019 13:12, Stefan Liebler wrote: >> Hi, >> >> from time to time the test misc/tst-clone3 fails with an timeout.  Then futex_wait is blocking.  Usually ctid should be set to zero due to CLONE_CHILD_CLEARTID and the futex should be waken up.  But the fail occures if the thread has already exited before ctid is set to the return value of clone().  Then futex_wait() will block as there will be nobody who wakes the futex up again. >> >> This patch initializes ctid to a known value before calling clone and the kernel is the only one who updates the value to zero after clone. If futex_wait is called then it is either waked up due to the exited thread or the futex syscall fails as *ctid_ptr is already zero instead of the specified value 1. >> >> Okay to commit? >> >> Bye >> Stefan >> >> ChangeLog: >> >>     * sysdeps/unix/sysv/linux/tst-clone3.c (do_test): >>     Initialize ctid with a known value and remove update of ctid >>     after clone. >>     (wait_tid): Adjust arguments and call futex_wait with ctid_val >>     as assumed current value of ctid_ptr. >> >> 20190208_misc_tst-clone3.patch >> >> commit 5a8f80973dbbf06a0eebc2064d2a14521c6a6131 >> Author: Stefan Liebler >> Date: Fri Feb 8 12:42:52 2019 +0100 >> >> misc/tst-clone3: Fix waiting for exited thread. >> >> From time to time the test misc/tst-clone3 fails with an timeout. >> Then futex_wait is blocking. Usually ctid should be set to zero >> due to CLONE_CHILD_CLEARTID and the futex should be waken up. >> But the fail occures if the thread has already exited before >> ctid is set to the return value of clone(). Then futex_wait() will >> block as there will be nobody who wakes the futex up again. >> >> This patch initializes ctid to a known value before calling clone >> and the kernel is the only one who updates the value to zero after clone. >> If futex_wait is called then it is either waked up due to the exited thread >> or the futex syscall fails as *ctid_ptr is already zero instead of the >> specified value 1. >> >> ChangeLog: >> >> * sysdeps/unix/sysv/linux/tst-clone3.c (do_test): >> Initialize ctid with a known value and remove update of ctid >> after clone. >> (wait_tid): Adjust arguments and call futex_wait with ctid_val >> as assumed current value of ctid_ptr. > > Thanks for catching it. > >> >> diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c >> index aa8e718afe..ffa2056eb6 100644 >> --- a/sysdeps/unix/sysv/linux/tst-clone3.c >> +++ b/sysdeps/unix/sysv/linux/tst-clone3.c >> @@ -42,11 +42,11 @@ f (void *a) >> >> /* Futex wait for TID argument, similar to pthread_join internal >> implementation. */ >> -#define wait_tid(tid) \ >> +#define wait_tid(ctid_ptr, ctid_val) \ >> do { \ >> - __typeof (tid) __tid; \ >> - while ((__tid = (tid)) != 0) \ >> - futex_wait (&(tid), __tid); \ >> + __typeof (*(ctid_ptr)) __tid; \ >> + while ((__tid = *(ctid_ptr)) != 0) \ >> + futex_wait (ctid_ptr, ctid_val); \ >> } while (0) > > lll_wait_tid uses an atomic load with acquire semantic, I think we should > use it as well. We can either include the atomic header or use c11 atomic. > I've first tried to include atomic.h, but it failed building on x86_64. Thus I'm using the c11 atomic load in the updated patch. Okay to commit? As information, I've observed those gcc errors on x86_64: In file included from ../sysdeps/unix/sysv/linux/x86_64/sysdep.h:30, from ../sysdeps/x86_64/nptl/tls.h:28, from ../sysdeps/x86/atomic-machine.h:23, from ../include/atomic.h:50, from ../sysdeps/unix/sysv/linux/tst-clone3.c:29: ../sysdeps/unix/sysv/linux/dl-sysdep.h: In function ‘_dl_discover_osversion’: ../sysdeps/unix/sysv/linux/dl-sysdep.h:31:42: error: expected declaration specifiers before ‘attribute_hidden’ extern int _dl_discover_osversion (void) attribute_hidden; ^~~~~~~~~~~~~~~~ In file included from ../sysdeps/x86_64/nptl/tls.h:31, from ../sysdeps/x86/atomic-machine.h:23, from ../include/atomic.h:50, from ../sysdeps/unix/sysv/linux/tst-clone3.c:29: ../sysdeps/generic/dl-dtv.h:22:1: error: empty declaration [-Werror] struct dtv_pointer ^~~~~~ ../sysdeps/generic/dl-dtv.h:33:3: error: storage class specified for parameter ‘dtv_t’ } dtv_t; ^~~~~ ... many many more errors ... I've also tried to add tst-clone3 to tests_internal instead of tests in the Makefile, but then I've got: In file included from ../sysdeps/x86_64/nptl/tls.h:130, from ../sysdeps/x86/atomic-machine.h:23, from ../include/atomic.h:50, from ../sysdeps/unix/sysv/linux/tst-clone3.c:29: ../nptl/descr.h:104:8: error: redefinition of ‘struct robust_list_head’ struct robust_list_head ^~~~~~~~~~~~~~~~ In file included from ../sysdeps/unix/sysv/linux/tst-clone3.c:26: /usr/include/linux/futex.h:70:8: note: originally defined here struct robust_list_head { ^~~~~~~~~~~~~~~~ >> >> static inline int >> @@ -64,7 +64,11 @@ do_test (void) >> clone_flags |= CLONE_VM | CLONE_SIGHAND; >> /* We will used ctid to call on futex to wait for thread exit. */ >> clone_flags |= CLONE_CHILD_CLEARTID; >> - pid_t ctid, tid; >> + /* Initialize with a known value. ctid is set to zero by the kernel after the >> + cloned thread has exited. */ >> +#define CTID_INIT_VAL 1 >> + pid_t ctid = CTID_INIT_VAL; >> + pid_t tid; >> >> #ifdef __ia64__ >> extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base, >> @@ -86,8 +90,7 @@ do_test (void) >> if (tid == -1) >> FAIL_EXIT1 ("clone failed: %m"); >> >> - ctid = tid; >> - wait_tid (ctid); >> + wait_tid (&ctid, CTID_INIT_VAL); >> >> return 2; >> } >> > commit 3672d05bd3c5c51831ef8b91ee2b0ff491fd2986 Author: Stefan Liebler Date: Fri Feb 8 12:42:52 2019 +0100 misc/tst-clone3: Fix waiting for exited thread. From time to time the test misc/tst-clone3 fails with a timeout. Then futex_wait is blocking. Usually ctid should be set to zero due to CLONE_CHILD_CLEARTID and the futex should be waken up. But the fail occures if the thread has already exited before ctid is set to the return value of clone(). Then futex_wait() will block as there will be nobody who wakes the futex up again. This patch initializes ctid to a known value before calling clone and the kernel is the only one who updates the value to zero after clone. If futex_wait is called then it is either waked up due to the exited thread or the futex syscall fails as *ctid_ptr is already zero instead of the specified value 1. ChangeLog: * sysdeps/unix/sysv/linux/tst-clone3.c (do_test): Initialize ctid with a known value and remove update of ctid after clone. (wait_tid): Adjust arguments and call futex_wait with ctid_val as assumed current value of ctid_ptr. diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c index aa8e718afe..b345d04b4d 100644 --- a/sysdeps/unix/sysv/linux/tst-clone3.c +++ b/sysdeps/unix/sysv/linux/tst-clone3.c @@ -42,11 +42,13 @@ f (void *a) /* Futex wait for TID argument, similar to pthread_join internal implementation. */ -#define wait_tid(tid) \ - do { \ - __typeof (tid) __tid; \ - while ((__tid = (tid)) != 0) \ - futex_wait (&(tid), __tid); \ +#define wait_tid(ctid_ptr, ctid_val) \ + do { \ + __typeof (*(ctid_ptr)) __tid; \ + /* We need acquire MO here so that we synchronize with the \ + kernel's store to 0 when the clone terminates. */ \ + while ((__tid = __atomic_load_n (ctid_ptr, __ATOMIC_ACQUIRE)) != 0) \ + futex_wait (ctid_ptr, ctid_val); \ } while (0) static inline int @@ -64,7 +66,11 @@ do_test (void) clone_flags |= CLONE_VM | CLONE_SIGHAND; /* We will used ctid to call on futex to wait for thread exit. */ clone_flags |= CLONE_CHILD_CLEARTID; - pid_t ctid, tid; + /* Initialize with a known value. ctid is set to zero by the kernel after the + cloned thread has exited. */ +#define CTID_INIT_VAL 1 + pid_t ctid = CTID_INIT_VAL; + pid_t tid; #ifdef __ia64__ extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base, @@ -86,8 +92,7 @@ do_test (void) if (tid == -1) FAIL_EXIT1 ("clone failed: %m"); - ctid = tid; - wait_tid (ctid); + wait_tid (&ctid, CTID_INIT_VAL); return 2; }