[v2,05/19] nptl: Move setxid flag out of cancelhandling

Message ID 20210823195047.543237-6-adhemerval.zanella@linaro.org
State Superseded
Headers
Series Fix various NPTL synchronization issues |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Adhemerval Zanella Aug. 23, 2021, 7:50 p.m. UTC
  Now that the thread state is tracked by 'joinstate' field, there is no
need to keep the setxid flag within the 'cancelhandling'.  It simplifies
the atomic code to set and reset it (since there is no need to handle
the thread state concurrent update).

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
---
 nptl/allocatestack.c  |  3 +++
 nptl/descr.h          |  5 ++---
 nptl/nptl_setxid.c    | 49 ++++++++++---------------------------------
 nptl/pthread_create.c |  4 ++--
 4 files changed, 18 insertions(+), 43 deletions(-)
  

Comments

Florian Weimer Aug. 26, 2021, 11:34 a.m. UTC | #1
* Adhemerval Zanella:

> Now that the thread state is tracked by 'joinstate' field, there is no
> need to keep the setxid flag within the 'cancelhandling'.  It simplifies
> the atomic code to set and reset it (since there is no need to handle
> the thread state concurrent update).

Some of this functionality reimplements the tidlock.

There is some functionality around thread creation, but I do not know if
the synchronization is entirely correct.  This makes it rather difficult
to review changes.

It's also not clear to me if we need to do anything to prevent early
setxid calls:

>  	  /* Don't allow setxid until cloned.  */

I suspect this should just look at the TID instead.  We create the
thread with all signals blocked, so once there is a TID (and the thread
has not exited), it is safe to send the signal.

> diff --git a/nptl/descr.h b/nptl/descr.h
> index 4b2db6edab..563b152bff 100644
> --- a/nptl/descr.h
> +++ b/nptl/descr.h

> +  /* Indicate whether thread is supposed to change XID.  */
> +  int setxid_flag;

This should document the possible values.

Thanks,
Florian
  
Adhemerval Zanella Aug. 26, 2021, 3:11 p.m. UTC | #2
On 26/08/2021 08:34, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> Now that the thread state is tracked by 'joinstate' field, there is no
>> need to keep the setxid flag within the 'cancelhandling'.  It simplifies
>> the atomic code to set and reset it (since there is no need to handle
>> the thread state concurrent update).
> 
> Some of this functionality reimplements the tidlock.
> 
> There is some functionality around thread creation, but I do not know if
> the synchronization is entirely correct.  This makes it rather difficult
> to review changes.
> 
> It's also not clear to me if we need to do anything to prevent early
> setxid calls:
> 
>>  	  /* Don't allow setxid until cloned.  */
> 
> I suspect this should just look at the TID instead.  We create the
> thread with all signals blocked, so once there is a TID (and the thread
> has not exited), it is safe to send the signal.

I agree that we might improve the setxid code, however what I intended
with this changes is to keep its semantic as-is and just decouple it
from cancelhandling.  I might revise it on a later patch to follow your
suggestions.

> 
>> diff --git a/nptl/descr.h b/nptl/descr.h
>> index 4b2db6edab..563b152bff 100644
>> --- a/nptl/descr.h
>> +++ b/nptl/descr.h
> 
>> +  /* Indicate whether thread is supposed to change XID.  */
>> +  int setxid_flag;
> 
> This should document the possible values.

I changed to:

  /* Indicate whether thread is supposed to change XID, it acts a boolean     
     variable (0 means no operation pending).  */
  
Florian Weimer Aug. 26, 2021, 3:21 p.m. UTC | #3
* Adhemerval Zanella:

> On 26/08/2021 08:34, Florian Weimer wrote:
>> * Adhemerval Zanella:
>> 
>>> Now that the thread state is tracked by 'joinstate' field, there is no
>>> need to keep the setxid flag within the 'cancelhandling'.  It simplifies
>>> the atomic code to set and reset it (since there is no need to handle
>>> the thread state concurrent update).
>> 
>> Some of this functionality reimplements the tidlock.
>> 
>> There is some functionality around thread creation, but I do not know if
>> the synchronization is entirely correct.  This makes it rather difficult
>> to review changes.
>> 
>> It's also not clear to me if we need to do anything to prevent early
>> setxid calls:
>> 
>>>  	  /* Don't allow setxid until cloned.  */
>> 
>> I suspect this should just look at the TID instead.  We create the
>> thread with all signals blocked, so once there is a TID (and the thread
>> has not exited), it is safe to send the signal.
>
> I agree that we might improve the setxid code, however what I intended
> with this changes is to keep its semantic as-is and just decouple it
> from cancelhandling.  I might revise it on a later patch to follow your
> suggestions.

I wonder if there is a new race here:

@@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
   /* Don't let the thread exit before the setxid handler runs.  */
   t->setxid_futex = 0;
 
+  /* If thread is exiting right now, ignore it.  */
+  if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
+    return;
 
+  /* Release the futex if there is no other setxid in progress.  */
+  if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
+    {
+      t->setxid_futex = 1;
+      futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
     }

That is, if the setxid-sending thread can now wait for a futex wake that
never comes.

There's also a use of atomic_exchange_release whose result is not used:

+  atomic_exchange_release (&t->setxid_flag, 0);

That should probably be store.

Thanks,
Florian
  
Adhemerval Zanella Aug. 26, 2021, 4:39 p.m. UTC | #4
On 26/08/2021 12:21, Florian Weimer wrote:
> * Adhemerval Zanella:
> 
>> On 26/08/2021 08:34, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>> Now that the thread state is tracked by 'joinstate' field, there is no
>>>> need to keep the setxid flag within the 'cancelhandling'.  It simplifies
>>>> the atomic code to set and reset it (since there is no need to handle
>>>> the thread state concurrent update).
>>>
>>> Some of this functionality reimplements the tidlock.
>>>
>>> There is some functionality around thread creation, but I do not know if
>>> the synchronization is entirely correct.  This makes it rather difficult
>>> to review changes.
>>>
>>> It's also not clear to me if we need to do anything to prevent early
>>> setxid calls:
>>>
>>>>  	  /* Don't allow setxid until cloned.  */
>>>
>>> I suspect this should just look at the TID instead.  We create the
>>> thread with all signals blocked, so once there is a TID (and the thread
>>> has not exited), it is safe to send the signal.
>>
>> I agree that we might improve the setxid code, however what I intended
>> with this changes is to keep its semantic as-is and just decouple it
>> from cancelhandling.  I might revise it on a later patch to follow your
>> suggestions.
> 
> I wonder if there is a new race here:
> 
> @@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
>    /* Don't let the thread exit before the setxid handler runs.  */
>    t->setxid_futex = 0;
>  
> +  /* If thread is exiting right now, ignore it.  */
> +  if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
> +    return;
>  
> +  /* Release the futex if there is no other setxid in progress.  */
> +  if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
> +    {
> +      t->setxid_futex = 1;
> +      futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
>      }
> 
> That is, if the setxid-sending thread can now wait for a futex wake that
> never comes.

Wouldn't be handled by setxid_unmark_thread, which wakes the futex
unconditionally? 

> 
> There's also a use of atomic_exchange_release whose result is not used:
> 
> +  atomic_exchange_release (&t->setxid_flag, 0);
> 
> That should probably be store.

Ack.
  
Florian Weimer Aug. 30, 2021, 11:27 a.m. UTC | #5
* Adhemerval Zanella:

> On 26/08/2021 12:21, Florian Weimer wrote:
>> * Adhemerval Zanella:
>> 
>>> On 26/08/2021 08:34, Florian Weimer wrote:
>>>> * Adhemerval Zanella:
>>>>
>>>>> Now that the thread state is tracked by 'joinstate' field, there is no
>>>>> need to keep the setxid flag within the 'cancelhandling'.  It simplifies
>>>>> the atomic code to set and reset it (since there is no need to handle
>>>>> the thread state concurrent update).
>>>>
>>>> Some of this functionality reimplements the tidlock.
>>>>
>>>> There is some functionality around thread creation, but I do not know if
>>>> the synchronization is entirely correct.  This makes it rather difficult
>>>> to review changes.
>>>>
>>>> It's also not clear to me if we need to do anything to prevent early
>>>> setxid calls:
>>>>
>>>>>  	  /* Don't allow setxid until cloned.  */
>>>>
>>>> I suspect this should just look at the TID instead.  We create the
>>>> thread with all signals blocked, so once there is a TID (and the thread
>>>> has not exited), it is safe to send the signal.
>>>
>>> I agree that we might improve the setxid code, however what I intended
>>> with this changes is to keep its semantic as-is and just decouple it
>>> from cancelhandling.  I might revise it on a later patch to follow your
>>> suggestions.
>> 
>> I wonder if there is a new race here:
>> 
>> @@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
>>    /* Don't let the thread exit before the setxid handler runs.  */
>>    t->setxid_futex = 0;
>>  
>> +  /* If thread is exiting right now, ignore it.  */
>> +  if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
>> +    return;
>>  
>> +  /* Release the futex if there is no other setxid in progress.  */
>> +  if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
>> +    {
>> +      t->setxid_futex = 1;
>> +      futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
>>      }
>> 
>> That is, if the setxid-sending thread can now wait for a futex wake that
>> never comes.
>
> Wouldn't be handled by setxid_unmark_thread, which wakes the futex
> unconditionally?

But that code only runs after the loop around signalled has concluded.
That's why I'm worried.

Thanks,
Florian
  

Patch

diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index cfe37a3443..ccb1aa21c1 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -99,6 +99,7 @@  get_cached_stack (size_t *sizep, void **memp)
     }
 
   /* Don't allow setxid until cloned.  */
+  result->setxid_flag = 0;
   result->setxid_futex = -1;
 
   /* Dequeue the entry.  */
@@ -301,6 +302,7 @@  allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
 #endif
 
       /* Don't allow setxid until cloned.  */
+      pd->setxid_flag = 0;
       pd->setxid_futex = -1;
 
       /* Allocate the DTV for this thread.  */
@@ -422,6 +424,7 @@  allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
 #endif
 
 	  /* Don't allow setxid until cloned.  */
+	  pd->setxid_flag = 0;
 	  pd->setxid_futex = -1;
 
 	  /* Allocate the DTV for this thread.  */
diff --git a/nptl/descr.h b/nptl/descr.h
index 4b2db6edab..563b152bff 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -297,9 +297,6 @@  struct pthread
   /* Bit set if thread terminated and TCB is freed.  */
 #define TERMINATED_BIT		5
 #define TERMINATED_BITMASK	(0x01 << TERMINATED_BIT)
-  /* Bit set if thread is supposed to change XID.  */
-#define SETXID_BIT		6
-#define SETXID_BITMASK		(0x01 << SETXID_BIT)
 
   /* Flags.  Including those copied from the thread attribute.  */
   int flags;
@@ -339,6 +336,8 @@  struct pthread
   /* Lock to synchronize access to the descriptor.  */
   int lock;
 
+  /* Indicate whether thread is supposed to change XID.  */
+  int setxid_flag;
   /* Lock for synchronizing setxid calls.  */
   unsigned int setxid_futex;
 
diff --git a/nptl/nptl_setxid.c b/nptl/nptl_setxid.c
index 2f35772411..c337fa8577 100644
--- a/nptl/nptl_setxid.c
+++ b/nptl/nptl_setxid.c
@@ -76,14 +76,7 @@  __nptl_setxid_sighandler (int sig, siginfo_t *si, void *ctx)
 
   /* Reset the SETXID flag.  */
   struct pthread *self = THREAD_SELF;
-  int flags, newval;
-  do
-    {
-      flags = THREAD_GETMEM (self, cancelhandling);
-      newval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling,
-					  flags & ~SETXID_BITMASK, flags);
-    }
-  while (flags != newval);
+  atomic_store_release (&self->setxid_flag, 0);
 
   /* And release the futex.  */
   self->setxid_futex = 1;
@@ -97,8 +90,6 @@  libc_hidden_def (__nptl_setxid_sighandler)
 static void
 setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
 {
-  int ch;
-
   /* Wait until this thread is cloned.  */
   if (t->setxid_futex == -1
       && ! atomic_compare_and_exchange_bool_acq (&t->setxid_futex, -2, -1))
@@ -109,41 +100,23 @@  setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
   /* Don't let the thread exit before the setxid handler runs.  */
   t->setxid_futex = 0;
 
-  do
-    {
-      ch = t->cancelhandling;
+  /* If thread is exiting right now, ignore it.  */
+  if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
+    return;
 
-      /* If the thread is exiting right now, ignore it.  */
-      if ((ch & EXITING_BITMASK) != 0)
-        {
-          /* Release the futex if there is no other setxid in
-             progress.  */
-          if ((ch & SETXID_BITMASK) == 0)
-            {
-              t->setxid_futex = 1;
-              futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
-            }
-          return;
-        }
+  /* Release the futex if there is no other setxid in progress.  */
+  if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
+    {
+      t->setxid_futex = 1;
+      futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
     }
-  while (atomic_compare_and_exchange_bool_acq (&t->cancelhandling,
-                                               ch | SETXID_BITMASK, ch));
 }
 
 
 static void
 setxid_unmark_thread (struct xid_command *cmdp, struct pthread *t)
 {
-  int ch;
-
-  do
-    {
-      ch = t->cancelhandling;
-      if ((ch & SETXID_BITMASK) == 0)
-        return;
-    }
-  while (atomic_compare_and_exchange_bool_acq (&t->cancelhandling,
-                                               ch & ~SETXID_BITMASK, ch));
+  atomic_exchange_release (&t->setxid_flag, 0);
 
   /* Release the futex just in case.  */
   t->setxid_futex = 1;
@@ -154,7 +127,7 @@  setxid_unmark_thread (struct xid_command *cmdp, struct pthread *t)
 static int
 setxid_signal_thread (struct xid_command *cmdp, struct pthread *t)
 {
-  if ((t->cancelhandling & SETXID_BITMASK) == 0)
+  if (atomic_load_relaxed (&t->setxid_flag) == 0)
     return 0;
 
   int val;
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 763e32bc3e..c00d6ae00c 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -541,7 +541,7 @@  start_thread (void *arg)
     advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
 			pd->guardsize);
 
-  if (__glibc_unlikely (pd->cancelhandling & SETXID_BITMASK))
+  if (__glibc_unlikely (atomic_load_relaxed (&pd->setxid_flag) == 1))
     {
       /* Some other thread might call any of the setXid functions and expect
 	 us to reply.  In this case wait until we did that.  */
@@ -551,7 +551,7 @@  start_thread (void *arg)
 	   condition used in the surrounding loop (cancelhandling).  We need
 	   to check and document why this is correct.  */
 	futex_wait_simple (&pd->setxid_futex, 0, FUTEX_PRIVATE);
-      while (pd->cancelhandling & SETXID_BITMASK);
+      while (atomic_load_relaxed (&pd->setxid_flag) == 1);
 
       /* Reset the value so that the stack can be reused.  */
       pd->setxid_futex = 0;