misc/tst-clone3: Fix waiting for exited thread.

Message ID 0bffc1e1-9cde-181b-cf10-4cc3530fa9cb@linux.ibm.com
State Superseded
Headers

Commit Message

Stefan Liebler Feb. 8, 2019, 3:12 p.m. UTC
  Hi,

from time to time the test misc/tst-clone3 fails with an timeout.  Then 
futex_wait is blocking.  Usually ctid should be set to zero due to 
CLONE_CHILD_CLEARTID and the futex should be waken up.  But the fail 
occures if the thread has already exited before ctid is set to the 
return value of clone().  Then futex_wait() will block as there will be 
nobody who wakes the futex up again.

This patch initializes ctid to a known value before calling clone and 
the kernel is the only one who updates the value to zero after clone. 
If futex_wait is called then it is either waked up due to the exited 
thread or the futex syscall fails as *ctid_ptr is already zero instead 
of the specified value 1.

Okay to commit?

Bye
Stefan

ChangeLog:

	* sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
	Initialize ctid with a known value and remove update of ctid
	after clone.
	(wait_tid): Adjust arguments and call futex_wait with ctid_val
	as assumed current value of ctid_ptr.
  

Comments

Adhemerval Zanella Feb. 8, 2019, 6:37 p.m. UTC | #1
On 08/02/2019 13:12, Stefan Liebler wrote:
> Hi,
> 
> from time to time the test misc/tst-clone3 fails with an timeout.  Then futex_wait is blocking.  Usually ctid should be set to zero due to CLONE_CHILD_CLEARTID and the futex should be waken up.  But the fail occures if the thread has already exited before ctid is set to the return value of clone().  Then futex_wait() will block as there will be nobody who wakes the futex up again.
> 
> This patch initializes ctid to a known value before calling clone and the kernel is the only one who updates the value to zero after clone. If futex_wait is called then it is either waked up due to the exited thread or the futex syscall fails as *ctid_ptr is already zero instead of the specified value 1.
> 
> Okay to commit?
> 
> Bye
> Stefan
> 
> ChangeLog:
> 
>     * sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
>     Initialize ctid with a known value and remove update of ctid
>     after clone.
>     (wait_tid): Adjust arguments and call futex_wait with ctid_val
>     as assumed current value of ctid_ptr.
> 
> 20190208_misc_tst-clone3.patch
> 
> commit 5a8f80973dbbf06a0eebc2064d2a14521c6a6131
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date:   Fri Feb 8 12:42:52 2019 +0100
> 
>     misc/tst-clone3: Fix waiting for exited thread.
>     
>     From time to time the test misc/tst-clone3 fails with an timeout.
>     Then futex_wait is blocking.  Usually ctid should be set to zero
>     due to CLONE_CHILD_CLEARTID and the futex should be waken up.
>     But the fail occures if the thread has already exited before
>     ctid is set to the return value of clone().  Then futex_wait() will
>     block as there will be nobody who wakes the futex up again.
>     
>     This patch initializes ctid to a known value before calling clone
>     and the kernel is the only one who updates the value to zero after clone.
>     If futex_wait is called then it is either waked up due to the exited thread
>     or the futex syscall fails as *ctid_ptr is already zero instead of the
>     specified value 1.
>     
>     ChangeLog:
>     
>             * sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
>             Initialize ctid with a known value and remove update of ctid
>             after clone.
>             (wait_tid): Adjust arguments and call futex_wait with ctid_val
>             as assumed current value of ctid_ptr.

Thanks for catching it.

> 
> diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
> index aa8e718afe..ffa2056eb6 100644
> --- a/sysdeps/unix/sysv/linux/tst-clone3.c
> +++ b/sysdeps/unix/sysv/linux/tst-clone3.c
> @@ -42,11 +42,11 @@ f (void *a)
>  
>  /* Futex wait for TID argument, similar to pthread_join internal
>     implementation.  */
> -#define wait_tid(tid) \
> +#define wait_tid(ctid_ptr, ctid_val)	\
>    do {					\
> -    __typeof (tid) __tid;		\
> -    while ((__tid = (tid)) != 0)	\
> -      futex_wait (&(tid), __tid);	\
> +    __typeof (*(ctid_ptr)) __tid;	\
> +    while ((__tid = *(ctid_ptr)) != 0)	\
> +      futex_wait (ctid_ptr, ctid_val);	\
>    } while (0)

lll_wait_tid uses an atomic load with acquire semantic, I think we should
use it as well. We can either include the atomic header or use c11 atomic.

>  
>  static inline int
> @@ -64,7 +64,11 @@ do_test (void)
>    clone_flags |= CLONE_VM | CLONE_SIGHAND;
>    /* We will used ctid to call on futex to wait for thread exit.  */
>    clone_flags |= CLONE_CHILD_CLEARTID;
> -  pid_t ctid, tid;
> +  /* Initialize with a known value.  ctid is set to zero by the kernel after the
> +     cloned thread has exited.  */
> +#define CTID_INIT_VAL 1
> +  pid_t ctid = CTID_INIT_VAL;
> +  pid_t tid;
>  
>  #ifdef __ia64__
>    extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
> @@ -86,8 +90,7 @@ do_test (void)
>    if (tid == -1)
>      FAIL_EXIT1 ("clone failed: %m");
>  
> -  ctid = tid;
> -  wait_tid (ctid);
> +  wait_tid (&ctid, CTID_INIT_VAL);
>  
>    return 2;
>  }
>
  

Patch

commit 5a8f80973dbbf06a0eebc2064d2a14521c6a6131
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Fri Feb 8 12:42:52 2019 +0100

    misc/tst-clone3: Fix waiting for exited thread.
    
    From time to time the test misc/tst-clone3 fails with an timeout.
    Then futex_wait is blocking.  Usually ctid should be set to zero
    due to CLONE_CHILD_CLEARTID and the futex should be waken up.
    But the fail occures if the thread has already exited before
    ctid is set to the return value of clone().  Then futex_wait() will
    block as there will be nobody who wakes the futex up again.
    
    This patch initializes ctid to a known value before calling clone
    and the kernel is the only one who updates the value to zero after clone.
    If futex_wait is called then it is either waked up due to the exited thread
    or the futex syscall fails as *ctid_ptr is already zero instead of the
    specified value 1.
    
    ChangeLog:
    
            * sysdeps/unix/sysv/linux/tst-clone3.c (do_test):
            Initialize ctid with a known value and remove update of ctid
            after clone.
            (wait_tid): Adjust arguments and call futex_wait with ctid_val
            as assumed current value of ctid_ptr.

diff --git a/sysdeps/unix/sysv/linux/tst-clone3.c b/sysdeps/unix/sysv/linux/tst-clone3.c
index aa8e718afe..ffa2056eb6 100644
--- a/sysdeps/unix/sysv/linux/tst-clone3.c
+++ b/sysdeps/unix/sysv/linux/tst-clone3.c
@@ -42,11 +42,11 @@  f (void *a)
 
 /* Futex wait for TID argument, similar to pthread_join internal
    implementation.  */
-#define wait_tid(tid) \
+#define wait_tid(ctid_ptr, ctid_val)	\
   do {					\
-    __typeof (tid) __tid;		\
-    while ((__tid = (tid)) != 0)	\
-      futex_wait (&(tid), __tid);	\
+    __typeof (*(ctid_ptr)) __tid;	\
+    while ((__tid = *(ctid_ptr)) != 0)	\
+      futex_wait (ctid_ptr, ctid_val);	\
   } while (0)
 
 static inline int
@@ -64,7 +64,11 @@  do_test (void)
   clone_flags |= CLONE_VM | CLONE_SIGHAND;
   /* We will used ctid to call on futex to wait for thread exit.  */
   clone_flags |= CLONE_CHILD_CLEARTID;
-  pid_t ctid, tid;
+  /* Initialize with a known value.  ctid is set to zero by the kernel after the
+     cloned thread has exited.  */
+#define CTID_INIT_VAL 1
+  pid_t ctid = CTID_INIT_VAL;
+  pid_t tid;
 
 #ifdef __ia64__
   extern int __clone2 (int (*__fn) (void *__arg), void *__child_stack_base,
@@ -86,8 +90,7 @@  do_test (void)
   if (tid == -1)
     FAIL_EXIT1 ("clone failed: %m");
 
-  ctid = tid;
-  wait_tid (ctid);
+  wait_tid (&ctid, CTID_INIT_VAL);
 
   return 2;
 }