[v2,4/9] nptl: Remove CANCELING_BITMASK

Message ID 20210527172823.3461314-5-adhemerval.zanella@linaro.org
State Committed
Delegated to: Florian Weimer
Headers
Series nptl: pthread cancellation refactor |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Adhemerval Zanella Netto May 27, 2021, 5:28 p.m. UTC
  The CANCELING_BITMASK is used as an optimization to avoid sending
the signal when pthread_cancel is called in a concurrent manner.

This requires then to put both the cancellation state and type on
a shared state (cancelhandling), since 'pthread_cancel' checks
whether cancellation is enabled and asynchrnous to either cancel
itself of sending the signal.

It also requires handle the CANCELING_BITMASK on
__pthread_disable_asynccancel, however this is incurs in the same
issues described on BZ#12683: the cancellation is acting even *after*
the syscalls returns with user visible side-effects.

This patch removes this optimization and simplifies the pthread
cancellation implementation: pthread_cancel now first check if
cancellation is already pending and if not always send a signal
if the target is not itself.  The SIGCANCEL handler is also simpified
since there is not need to setup a CAS loop.

It also alows to move both the cancellation state and mode out of
'cancelhadling' (it is done in subsequent patches).

Checked on x86_64-linux-gnu and aarch64-linux-gnu.
---
 nptl/cancellation.c        |  12 ----
 nptl/descr.h               |   3 -
 nptl/pthread_cancel.c      | 124 ++++++++++---------------------------
 nptl/pthread_join_common.c |   2 +-
 4 files changed, 35 insertions(+), 106 deletions(-)
  

Comments

Florian Weimer June 1, 2021, 9:03 a.m. UTC | #1
* Adhemerval Zanella via Libc-alpha:

> The CANCELING_BITMASK is used as an optimization to avoid sending
> the signal when pthread_cancel is called in a concurrent manner.
>
> This requires then to put both the cancellation state and type on
> a shared state (cancelhandling), since 'pthread_cancel' checks
> whether cancellation is enabled and asynchrnous to either cancel
> itself of sending the signal.
>
> It also requires handle the CANCELING_BITMASK on
> __pthread_disable_asynccancel, however this is incurs in the same

typo: “this is incurs”

> issues described on BZ#12683: the cancellation is acting even *after*

“is acted upon even *after*”?

> the syscalls returns with user visible side-effects.
>
> This patch removes this optimization and simplifies the pthread
> cancellation implementation: pthread_cancel now first check if

typo: check → checks

> cancellation is already pending and if not always send a signal

typo: send → sends

Maybe add a comma before “always”.

> if the target is not itself.  The SIGCANCEL handler is also simpified
> since there is not need to setup a CAS loop.
>
> It also alows to move both the cancellation state and mode out of
> 'cancelhadling' (it is done in subsequent patches).

typo: alows

> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
> index deb404600c..8dfbcff8c3 100644
> --- a/nptl/pthread_cancel.c
> +++ b/nptl/pthread_cancel.c

> @@ -104,72 +87,33 @@ __pthread_cancel (pthread_t th)
>  		    " must be installed for pthread_cancel to work\n");
>    }
> +
> +  int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
> +  if ((oldch & CANCELED_BITMASK) != 0)
> +    return 0;
> +
> +  if (pd == THREAD_SELF)
>      {
> +      /* A single-threaded process should be able to kill itself, since there
> +	 is nothing in the POSIX specification that says that it cannot.  So
> +	 we set multiple_threads to true so that cancellation points get
> +	 executed.  */
> +      THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
> -	__libc_multiple_threads = 1;
> +      __libc_multiple_threads = 1;
>  #endif
> +
> +      THREAD_SETMEM (pd, result, PTHREAD_CANCELED);
> +      if ((oldch & CANCELTYPE_BITMASK) != 0)
> +	__do_cancel ();
> +      return 0;
>      }

The last part is for asynchronous self-cancel, right?  Don't we have to
check that cancellation is enabled, too?

Thanks,
Florian
  
Adhemerval Zanella Netto June 1, 2021, 1:50 p.m. UTC | #2
On 01/06/2021 06:03, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> The CANCELING_BITMASK is used as an optimization to avoid sending
>> the signal when pthread_cancel is called in a concurrent manner.
>>
>> This requires then to put both the cancellation state and type on
>> a shared state (cancelhandling), since 'pthread_cancel' checks
>> whether cancellation is enabled and asynchrnous to either cancel
>> itself of sending the signal.
>>
>> It also requires handle the CANCELING_BITMASK on
>> __pthread_disable_asynccancel, however this is incurs in the same
> 
> typo: “this is incurs”

Ack.

> 
>> issues described on BZ#12683: the cancellation is acting even *after*
> 
> “is acted upon even *after*”?

Ack.

> 
>> the syscalls returns with user visible side-effects.
>>
>> This patch removes this optimization and simplifies the pthread
>> cancellation implementation: pthread_cancel now first check if
> 
> typo: check → checks

Ack.

> 
>> cancellation is already pending and if not always send a signal
> 
> typo: send → sends

Ack.

> 
> Maybe add a comma before “always”.

Ack.

> 
>> if the target is not itself.  The SIGCANCEL handler is also simpified
>> since there is not need to setup a CAS loop.
>>
>> It also alows to move both the cancellation state and mode out of
>> 'cancelhadling' (it is done in subsequent patches).
> 
> typo: alows

Ack.

> 
>> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
>> index deb404600c..8dfbcff8c3 100644
>> --- a/nptl/pthread_cancel.c
>> +++ b/nptl/pthread_cancel.c
> 
>> @@ -104,72 +87,33 @@ __pthread_cancel (pthread_t th)
>>  		    " must be installed for pthread_cancel to work\n");
>>    }
>> +
>> +  int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
>> +  if ((oldch & CANCELED_BITMASK) != 0)
>> +    return 0;
>> +
>> +  if (pd == THREAD_SELF)
>>      {
>> +      /* A single-threaded process should be able to kill itself, since there
>> +	 is nothing in the POSIX specification that says that it cannot.  So
>> +	 we set multiple_threads to true so that cancellation points get
>> +	 executed.  */
>> +      THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
>>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>> -	__libc_multiple_threads = 1;
>> +      __libc_multiple_threads = 1;
>>  #endif
>> +
>> +      THREAD_SETMEM (pd, result, PTHREAD_CANCELED);
>> +      if ((oldch & CANCELTYPE_BITMASK) != 0)
>> +	__do_cancel ();
>> +      return 0;
>>      }
> 
> The last part is for asynchronous self-cancel, right?  Don't we have to
> check that cancellation is enabled, too?

Indeed it should be:

      if ((oldch & CANCELSTATE_BIT)
          && (oldch & CANCELTYPE_BITMASK) != 0)          

I ended up fixing it later in the serie with the cancel type and state 
refactor.  I have fixed it locally.
  
Adhemerval Zanella Netto June 1, 2021, 4:36 p.m. UTC | #3
On 01/06/2021 10:50, Adhemerval Zanella wrote:
>>> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
>>> index deb404600c..8dfbcff8c3 100644
>>> --- a/nptl/pthread_cancel.c
>>> +++ b/nptl/pthread_cancel.c
>>
>>> @@ -104,72 +87,33 @@ __pthread_cancel (pthread_t th)
>>>  		    " must be installed for pthread_cancel to work\n");
>>>    }
>>> +
>>> +  int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
>>> +  if ((oldch & CANCELED_BITMASK) != 0)
>>> +    return 0;
>>> +
>>> +  if (pd == THREAD_SELF)
>>>      {
>>> +      /* A single-threaded process should be able to kill itself, since there
>>> +	 is nothing in the POSIX specification that says that it cannot.  So
>>> +	 we set multiple_threads to true so that cancellation points get
>>> +	 executed.  */
>>> +      THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
>>>  #ifndef TLS_MULTIPLE_THREADS_IN_TCB
>>> -	__libc_multiple_threads = 1;
>>> +      __libc_multiple_threads = 1;
>>>  #endif
>>> +
>>> +      THREAD_SETMEM (pd, result, PTHREAD_CANCELED);
>>> +      if ((oldch & CANCELTYPE_BITMASK) != 0)
>>> +	__do_cancel ();
>>> +      return 0;
>>>      }
>>
>> The last part is for asynchronous self-cancel, right?  Don't we have to
>> check that cancellation is enabled, too?

Ok, so I recall why I did this originally and checking if cancel is enable
is not strictly necessary since SIGCANCEL signal handler will bail early
if cancellation is not enabled.  But it does seems the correct way to do
it.

> 
> Indeed it should be:
> 
>       if ((oldch & CANCELSTATE_BIT)

This should be '(oldch & CANCELSTATE_BIT) == 0'.

>           && (oldch & CANCELTYPE_BITMASK) != 0)          
> 
> I ended up fixing it later in the serie with the cancel type and state 
> refactor.  I have fixed it locally.
>
  

Patch

diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index c20845adc0..b15f25d8f6 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -88,17 +88,5 @@  __pthread_disable_asynccancel (int oldtype)
       /* Prepare the next round.  */
       oldval = curval;
     }
-
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  while (__builtin_expect ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
-			   == CANCELING_BITMASK, 0))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, newval,
-			 FUTEX_PRIVATE);
-      newval = THREAD_GETMEM (self, cancelhandling);
-    }
 }
 libc_hidden_def (__pthread_disable_asynccancel)
diff --git a/nptl/descr.h b/nptl/descr.h
index 9d8297b45f..a120365f88 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -283,9 +283,6 @@  struct pthread
   /* Bit set if asynchronous cancellation mode is selected.  */
 #define CANCELTYPE_BIT		1
 #define CANCELTYPE_BITMASK	(0x01 << CANCELTYPE_BIT)
-  /* Bit set if canceling has been initiated.  */
-#define CANCELING_BIT		2
-#define CANCELING_BITMASK	(0x01 << CANCELING_BIT)
   /* Bit set if canceled.  */
 #define CANCELED_BIT		3
 #define CANCELED_BITMASK	(0x01 << CANCELED_BIT)
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index deb404600c..8dfbcff8c3 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -43,35 +43,18 @@  sigcancel_handler (int sig, siginfo_t *si, void *ctx)
 
   struct pthread *self = THREAD_SELF;
 
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (curval == oldval)
-	{
-	  /* Set the return value.  */
-	  THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if ((newval & CANCELTYPE_BITMASK) != 0)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-
-	  break;
-	}
-
-      oldval = curval;
-    }
+  int ch = atomic_load_relaxed (&self->cancelhandling);
+  /* Cancelation not enabled, not cancelled, or already exitting.  */
+  if ((ch & CANCELSTATE_BITMASK) != 0
+      || (ch & CANCELED_BITMASK) == 0
+      || (ch & EXITING_BITMASK) != 0)
+    return;
+
+  /* Set the return value.  */
+  THREAD_SETMEM (self, result, PTHREAD_CANCELED);
+  /* Make sure asynchronous cancellation is still enabled.  */
+  if ((ch & CANCELTYPE_BITMASK) != 0)
+    __do_cancel ();
 }
 
 int
@@ -104,72 +87,33 @@  __pthread_cancel (pthread_t th)
 		    " must be installed for pthread_cancel to work\n");
   }
 #endif
-  int result = 0;
-  int oldval;
-  int newval;
-  do
+
+  int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
+  if ((oldch & CANCELED_BITMASK) != 0)
+    return 0;
+
+  if (pd == THREAD_SELF)
     {
-    again:
-      oldval = pd->cancelhandling;
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      /* Avoid doing unnecessary work.  The atomic operation can
-	 potentially be expensive if the bug has to be locked and
-	 remote cache lines have to be invalidated.  */
-      if (oldval == newval)
-	break;
-
-      /* If the cancellation is handled asynchronously just send a
-	 signal.  We avoid this if possible since it's more
-	 expensive.  */
-      if (CANCEL_ENABLED_AND_CANCELED_AND_ASYNCHRONOUS (newval))
-	{
-	  /* Mark the cancellation as "in progress".  */
-	  if (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling,
-						    oldval | CANCELING_BITMASK,
-						    oldval))
-	    goto again;
-
-	  if (pd == THREAD_SELF)
-	    /* This is not merely an optimization: An application may
-	       call pthread_cancel (pthread_self ()) without calling
-	       pthread_create, so the signal handler may not have been
-	       set up for a self-cancel.  */
-	    {
-	      THREAD_SETMEM (pd, result, PTHREAD_CANCELED);
-	      if ((newval & CANCELTYPE_BITMASK) != 0)
-		__do_cancel ();
-	    }
-	  else
-	    {
-	      /* The cancellation handler will take care of marking the
-		 thread as canceled.  */
-	      pid_t pid = __getpid ();
-
-	      int val = INTERNAL_SYSCALL_CALL (tgkill, pid, pd->tid,
-					       SIGCANCEL);
-	      if (INTERNAL_SYSCALL_ERROR_P (val))
-		result = INTERNAL_SYSCALL_ERRNO (val);
-	    }
-
-	  break;
-	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
+      /* A single-threaded process should be able to kill itself, since there
+	 is nothing in the POSIX specification that says that it cannot.  So
+	 we set multiple_threads to true so that cancellation points get
+	 executed.  */
+      THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__libc_multiple_threads = 1;
+      __libc_multiple_threads = 1;
 #endif
+
+      THREAD_SETMEM (pd, result, PTHREAD_CANCELED);
+      if ((oldch & CANCELTYPE_BITMASK) != 0)
+	__do_cancel ();
+      return 0;
     }
-  /* Mark the thread as canceled.  This has to be done
-     atomically since other bits could be modified as well.  */
-  while (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling, newval,
-					       oldval));
 
-  return result;
+  pid_t pid = __getpid ();
+  int val = INTERNAL_SYSCALL_CALL (tgkill, pid, pd->tid, SIGCANCEL);
+  return INTERNAL_SYSCALL_ERROR_P (val)
+	 ? INTERNAL_SYSCALL_ERRNO (val)
+	 : 0;
 }
 versioned_symbol (libc, __pthread_cancel, pthread_cancel, GLIBC_2_34);
 
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index e87801b5a3..f842c91a08 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -57,7 +57,7 @@  __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
   if ((pd == self
        || (self->joinid == pd
 	   && (pd->cancelhandling
-	       & (CANCELING_BITMASK | CANCELED_BITMASK | EXITING_BITMASK
+	       & (CANCELED_BITMASK | EXITING_BITMASK
 		  | TERMINATED_BITMASK)) == 0))
       && !CANCEL_ENABLED_AND_CANCELED (self->cancelhandling))
     /* This is a deadlock situation.  The threads are waiting for each