Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.

Message ID c60e8de1-704a-9ff1-6ec9-474e47cc4686@linux.ibm.com
State Committed

Commit Message

Stefan Liebler Feb. 6, 2019, 11:25 a.m. UTC
  Hi Carlos,
I've updated the patch with three additional comments and I've mentioned 
the filed bug.
Please review it once again before I commit it to master and cherry-pick 
it to the release branches.

On 02/05/2019 10:00 PM, Carlos O'Donell wrote:
> On 2/5/19 11:21 AM, Stefan Liebler wrote:
>> while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>> instructions:
>> 140:   a5 1b 00 01             oill    %r1,1
>> 144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> 14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>>
>> vs (with compiler barriers):
>> 140:   a5 1b 00 01             oill    %r1,1
>> 144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>> 14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>>
>> Please have a look at the discussion:
>> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
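To see the reordering in isolation: the two stores target the same thread
descriptor and are independent as far as the compiler can tell, so nothing in
plain C forbids swapping them. A minimal stand-alone sketch (simplified field
and function names, not the actual glibc code):

    /* Model of the last store of ENQUEUE_MUTEX_PI followed by the clearing
       of list_op_pending.  Without the barrier, the compiler is free to emit
       the NULL store first -- exactly the misordering in the s390x
       disassembly above.  */
    struct robust_head { void *list; void *list_op_pending; };

    void enqueue_then_clear (struct robust_head *rh, void *node)
    {
      rh->list = node;              /* last store of the enqueue */
      __asm ("" ::: "memory");      /* compiler barrier: keep program order */
      rh->list_op_pending = NULL;   /* must become visible after the enqueue */
    }

If the thread dies between the two misordered stores, the kernel's robust-list
walk sees list_op_pending already cleared while the mutex is not yet on the
list, so the held lock is never handed over -- the scenario behind the
WARN_ON_ONCE in the linked discussion.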
> 
> What a great read! Thank you to everyone who had to spend time tracking
> this down. Thank you for the patch, Stefan.
> 
>> This patch is introducing the same compiler barriers and comments
>> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>> "Add compiler barriers around modifications of the robust mutex list."
>>
>> Okay to commit?
> 
> OK for master with 3 additional comments about ordering.
> 
> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> 
>> The original commit was first available with glibc release 2.25.
>> Once this patch is committed, we should at least backport it to
>> glibc release branches 2.25 - 2.28?
> 
> ... and 2.29.
Yes, of course.

> 
> Yes, absolutely, once you commit to master you are free to use
> git cherry-pick -x to put the fix on all the branches you need.
> Please post your work to libc-stable@sourceware.org.
> 
>> Does anybody know if and where the original commit was backported to?
>> I've found at least "Bug 1401665 - Fix process shared robust mutex defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)
> 
> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
> for all arches.
Sounds great.
That's the same fix for pthread_mutex_trylock as previously done for 
pthread_mutex_lock and pthread_mutex_timedlock.

> 
>>      * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>>      Add compiler barriers and comments.
>>
>> 20190205_pthread_mutex_trylock_barriers.patch
>>
>> commit 9efa39ef04961397e39e7a9d3c11a33937755aec
>> Author: Stefan Liebler <stli@linux.ibm.com>
>> Date:   Tue Feb 5 12:37:42 2019 +0100
>>
>>      Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
> 
> OK.
> 
>>      While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>>      Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>>      instructions:
>>      140:   a5 1b 00 01             oill    %r1,1
>>      144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>      14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>>      
>>      vs (with compiler barriers):
>>      140:   a5 1b 00 01             oill    %r1,1
>>      144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>>      14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>>      
>>      Please have a look at the discussion:
>>      "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>>      (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
> 
> OK.
> 
>>      This patch is introducing the same compiler barriers and comments
>>      for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>>      pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>>      "Add compiler barriers around modifications of the robust mutex list."
>>      
>>      ChangeLog:
>>      
>>              * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>>              Add compiler barriers and comments.
> 
> OK.
> 
>>
> 
> I reviewed:
> 
> nptl/descr.h - OK
> nptl/pthread_mutex_unlock.c - OK
> nptl/pthread_mutex_timedlock.c - OK
> nptl/allocatestack.c - OK
> nptl/pthread_create.c - OK
> nptl/pthread_mutex_lock.c - OK
> nptl/nptl-init.c - OK
> nptl/pthread_mutex_trylock.c <--- I count 10 missing barriers, and 9 missing ordering comments.
> sysdeps/nptl/fork.c - OK
> 
>> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
>> index 8fe43b8f0f..ff1d7282ab 100644
>> --- a/nptl/pthread_mutex_trylock.c
>> +++ b/nptl/pthread_mutex_trylock.c
>> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>       case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
>>         THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>>   		     &mutex->__data.__list.__next);
>> +      /* We need to set op_pending before starting the operation.  Also
>> +	 see comments at ENQUEUE_MUTEX.  */
>> +      __asm ("" ::: "memory");
> 
> OK. 1/10 barriers.
> 
>>   
>>         oldval = mutex->__data.__lock;
>>         do
>> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	      /* But it is inconsistent unless marked otherwise.  */
>>   	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>>   
>> +	      /* We must not enqueue the mutex before we have acquired it.
>> +		 Also see comments at ENQUEUE_MUTEX.  */
>> +	      __asm ("" ::: "memory");
> 
> OK. 2/10 barriers.
> 
>>   	      ENQUEUE_MUTEX (mutex);
>> +	      /* We need to clear op_pending after we enqueue the mutex.  */
>> +	      __asm ("" ::: "memory");
> 
> OK. 3/10 barriers.
> 
>>   	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>   	      /* Note that we deliberately exist here.  If we fall
>> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	      int kind = PTHREAD_MUTEX_TYPE (mutex);
>>   	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
>>   		{
>> +		  /* We do not need to ensure ordering wrt another memory
>> +		     access.  Also see comments at ENQUEUE_MUTEX. */
> 
> OK. 1/9 comments.
> 
>>   		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>>   				 NULL);
>>   		  return EDEADLK;
>> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   
>>   	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
>>   		{
>> +		  /* We do not need to ensure ordering wrt another memory
>> +		     access.  */
> 
> OK. 2/9 comments.
> 
>>   		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>>   				 NULL);
>>   
> 
> Missing comment.
> 
> 159           oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
> 
> Did not acquire the lock.
> 
> 160                                                         id, 0);
> 161           if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
> 162             {
> 163               THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
> 
Added a comment. See new patch.
> 164
> 165               return EBUSY;
> 166             }
> 
> 
>> @@ -173,13 +185,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	      if (oldval == id)
>>   		lll_unlock (mutex->__data.__lock,
>>   			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
>> +	      /* FIXME This violates the mutex destruction requirements.  See
>> +		 __pthread_mutex_unlock_full.  */
> 
> OK. 3/9 comments.
> 
>>   	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   	      return ENOTRECOVERABLE;
>>   	    }
>>   	}
>>         while ((oldval & FUTEX_OWNER_DIED) != 0);
>>   
>> +      /* We must not enqueue the mutex before we have acquired it.
>> +	 Also see comments at ENQUEUE_MUTEX.  */
>> +      __asm ("" ::: "memory");
> 
> OK. 4/10 barriers.
> 
>>         ENQUEUE_MUTEX (mutex);
>> +      /* We need to clear op_pending after we enqueue the mutex.  */
>> +      __asm ("" ::: "memory");
> 
> OK. 5/10 barriers.
> 
>>         THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>         mutex->__data.__owner = id;
>> @@ -211,10 +230,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	}
>>   
>>   	if (robust)
>> -	  /* Note: robust PI futexes are signaled by setting bit 0.  */
>> -	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> -			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
>> -				   | 1));
>> +	  {
>> +	    /* Note: robust PI futexes are signaled by setting bit 0.  */
>> +	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> +			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
>> +				     | 1));
>> +	    /* We need to set op_pending before starting the operation.  Also
>> +	       see comments at ENQUEUE_MUTEX.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 6/10 barriers.
> 
>> +	  }
>>   
>>   	oldval = mutex->__data.__lock;
>>   
>> @@ -223,12 +247,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	  {
>>   	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
>>   	      {
>> +		/* We do not need to ensure ordering wrt another memory
>> +		   access.  */
> 
> OK. 4/9 comments.
> 
>>   		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   		return EDEADLK;
>>   	      }
>>   
>>   	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
>>   	      {
>> +		/* We do not need to ensure ordering wrt another memory
>> +		   access.  */
> 
> OK. 5/9 comments.
> 
>>   		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>   		/* Just bump the counter.  */
> 
> Missing comment?
> 
> 249         if (oldval != 0)
> 
> Failed to get the lock.
> 
> 250           {
> 251             if ((oldval & FUTEX_OWNER_DIED) == 0)
> 252               {
> 253                 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
Added a comment. See new patch.
> 
> 254
> 255                 return EBUSY;
> 256               }
> ...
> 270             if (INTERNAL_SYSCALL_ERROR_P (e, __err)
> 271                 && INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
> 272               {
> 273                 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
Added a comment. See new patch.
> 
> 274
> 275                 return EBUSY;
> 276               }
> 
> 
>> @@ -287,7 +315,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   	    /* But it is inconsistent unless marked otherwise.  */
>>   	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>>   
>> +	    /* We must not enqueue the mutex before we have acquired it.
>> +	       Also see comments at ENQUEUE_MUTEX.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 7/10 barriers.
> 
>>   	    ENQUEUE_MUTEX (mutex);
>> +	    /* We need to clear op_pending after we enqueue the mutex.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 8/10 barriers.
> 
>>   	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   
>>   	    /* Note that we deliberately exit here.  If we fall
>> @@ -310,13 +343,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>   						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
>>   			      0, 0);
>>   
>> +	    /* To the kernel, this will be visible after the kernel has
>> +	       acquired the mutex in the syscall.  */
> 
> OK. 6/9 comments. 3 missing comments noted.
> 
>>   	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   	    return ENOTRECOVERABLE;
>>   	  }
>>   
>>   	if (robust)
>>   	  {
>> +	    /* We must not enqueue the mutex before we have acquired it.
>> +	       Also see comments at ENQUEUE_MUTEX.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 9/10 barriers.
> 
>>   	    ENQUEUE_MUTEX_PI (mutex);
>> +	    /* We need to clear op_pending after we enqueue the mutex.  */
>> +	    __asm ("" ::: "memory");
> 
> OK. 10/10 barriers.
> 
>>   	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>   	  }
>>   
> 
>
  

Comments

Carlos O'Donell Feb. 6, 2019, 3:59 p.m. UTC | #1
On 2/6/19 6:25 AM, Stefan Liebler wrote:
> Hi Carlos,
> I've updated the patch with three additional comments and I've mentioned the filed bug.
> Please review it once again before I commit it to master and cherry-pick it to the release branches.

Thank you! Reviewed.

>> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
>> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
>> for all arches.
> Sounds great.
> That's the same fix for pthread_mutex_trylock as previously done for pthread_mutex_lock and pthread_mutex_timedlock.

I've filed these:
https://bugzilla.redhat.com/show_bug.cgi?id=1672771
https://bugzilla.redhat.com/show_bug.cgi?id=1672773

Feel free to comment or verify if they are going to be needed in RHEL7.7
or RHEL8. I haven't done the analysis of the disassembly yet. If you
could have a look that would help.

> 20190206_pthread_mutex_trylock_barriers.patch
> 
> commit b4c6ee19e804b0e90c117ec353ce67d321f0319b
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date:   Wed Feb 6 11:27:03 2019 +0100
> 
>     Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
>     
>     While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>     Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>     instructions:
>     140:   a5 1b 00 01             oill    %r1,1
>     144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>     14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>     
>     vs (with compiler barriers):
>     140:   a5 1b 00 01             oill    %r1,1
>     144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
>     14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
>     
>     Please have a look at the discussion:
>     "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>     (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
>     
>     This patch is introducing the same compiler barriers and comments
>     for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>     pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>     "Add compiler barriers around modifications of the robust mutex list."
>     
>     ChangeLog:
>     
>             [BZ #24180]
>             * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>             Add compiler barriers and comments.

OK for master.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
> index 8fe43b8f0f..bf2869eca2 100644
> --- a/nptl/pthread_mutex_trylock.c
> +++ b/nptl/pthread_mutex_trylock.c
> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>      case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  		     &mutex->__data.__list.__next);
> +      /* We need to set op_pending before starting the operation.  Also
> +	 see comments at ENQUEUE_MUTEX.  */
> +      __asm ("" ::: "memory");

OK. 1/10 barriers.

>  
>        oldval = mutex->__data.__lock;
>        do
> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      /* But it is inconsistent unless marked otherwise.  */
>  	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>  
> +	      /* We must not enqueue the mutex before we have acquired it.
> +		 Also see comments at ENQUEUE_MUTEX.  */
> +	      __asm ("" ::: "memory");

OK. 2/10 barriers.

>  	      ENQUEUE_MUTEX (mutex);
> +	      /* We need to clear op_pending after we enqueue the mutex.  */
> +	      __asm ("" ::: "memory");

OK. 3/10 barriers.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	      /* Note that we deliberately exist here.  If we fall
> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      int kind = PTHREAD_MUTEX_TYPE (mutex);
>  	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
>  		{
> +		  /* We do not need to ensure ordering wrt another memory
> +		     access.  Also see comments at ENQUEUE_MUTEX. */

OK. 1/9 comments.

>  		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  				 NULL);
>  		  return EDEADLK;
> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  
>  	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
>  		{
> +		  /* We do not need to ensure ordering wrt another memory
> +		     access.  */

OK. 2/9 comments.

>  		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>  				 NULL);
>  
> @@ -160,6 +172,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  							id, 0);
>  	  if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
>  	    {
> +	      /* We haven't acquired the lock as it is already acquired by
> +		 another owner.  We do not need to ensure ordering wrt another
> +		 memory access.  */

OK. 3/9 comments.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	      return EBUSY;
> @@ -173,13 +188,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	      if (oldval == id)
>  		lll_unlock (mutex->__data.__lock,
>  			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
> +	      /* FIXME This violates the mutex destruction requirements.  See
> +		 __pthread_mutex_unlock_full.  */

OK. 4/9 comments.

>  	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	      return ENOTRECOVERABLE;
>  	    }
>  	}
>        while ((oldval & FUTEX_OWNER_DIED) != 0);
>  
> +      /* We must not enqueue the mutex before we have acquired it.
> +	 Also see comments at ENQUEUE_MUTEX.  */
> +      __asm ("" ::: "memory");

OK. 4/10 barriers.

>        ENQUEUE_MUTEX (mutex);
> +      /* We need to clear op_pending after we enqueue the mutex.  */
> +      __asm ("" ::: "memory");

OK. 5/10 barriers.

>        THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>        mutex->__data.__owner = id;
> @@ -211,10 +233,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	}
>  
>  	if (robust)
> -	  /* Note: robust PI futexes are signaled by setting bit 0.  */
> -	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> -			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
> -				   | 1));
> +	  {
> +	    /* Note: robust PI futexes are signaled by setting bit 0.  */
> +	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> +			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
> +				     | 1));
> +	    /* We need to set op_pending before starting the operation.  Also
> +	       see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 6/10 barriers.

> +	  }
>  
>  	oldval = mutex->__data.__lock;
>  
> @@ -223,12 +250,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	  {
>  	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
>  	      {
> +		/* We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 5/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  		return EDEADLK;
>  	      }
>  
>  	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
>  	      {
> +		/* We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 6/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		/* Just bump the counter.  */
> @@ -250,6 +281,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	  {
>  	    if ((oldval & FUTEX_OWNER_DIED) == 0)
>  	      {
> +		/* We haven't acquired the lock as it is already acquired by
> +		   another owner.  We do not need to ensure ordering wrt another
> +		   memory access.  */

OK. 7/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		return EBUSY;
> @@ -270,6 +304,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	    if (INTERNAL_SYSCALL_ERROR_P (e, __err)
>  		&& INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
>  	      {
> +		/* The kernel has not yet finished the mutex owner death.
> +		   We do not need to ensure ordering wrt another memory
> +		   access.  */

OK. 8/9 comments.

>  		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  		return EBUSY;
> @@ -287,7 +324,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  	    /* But it is inconsistent unless marked otherwise.  */
>  	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>  
> +	    /* We must not enqueue the mutex before we have acquired it.
> +	       Also see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 7/10 barriers.

>  	    ENQUEUE_MUTEX (mutex);
> +	    /* We need to clear op_pending after we enqueue the mutex.  */
> +	    __asm ("" ::: "memory");

OK. 8/10 barriers.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  
>  	    /* Note that we deliberately exit here.  If we fall
> @@ -310,13 +352,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>  						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
>  			      0, 0);
>  
> +	    /* To the kernel, this will be visible after the kernel has
> +	       acquired the mutex in the syscall.  */

OK. 9/9 comments.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	    return ENOTRECOVERABLE;
>  	  }
>  
>  	if (robust)
>  	  {
> +	    /* We must not enqueue the mutex before we have acquired it.
> +	       Also see comments at ENQUEUE_MUTEX.  */
> +	    __asm ("" ::: "memory");

OK. 9/10 barriers.

>  	    ENQUEUE_MUTEX_PI (mutex);
> +	    /* We need to clear op_pending after we enqueue the mutex.  */
> +	    __asm ("" ::: "memory");


OK. 10/10 barriers.

>  	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>  	  }
>
  
Stefan Liebler Feb. 7, 2019, 3:05 p.m. UTC | #2
On 02/06/2019 04:59 PM, Carlos O'Donell wrote:
> On 2/6/19 6:25 AM, Stefan Liebler wrote:
>> Hi Carlos,
>> I've updated the patch with three additional comments and I've mentioned the filed bug.
>> Please review it once again before I commit it to master and cherry-pick it to the release branches.
> 
> Thank you! Reviewed.
Committed to master (2.29.9000):
"Add compiler barriers around modifications of the robust mutex list for 
pthread_mutex_trylock. [BZ #24180]"
(https://sourceware.org/git/?p=glibc.git;a=commit;h=823624bdc47f1f80109c9c52dee7939b9386d708)

and backported it to glibc 2.25 ... 2.29 release branches.

Thanks.
Stefan

> 
>>> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
>>> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
>>> for all arches.
>> Sounds great.
>> That's the same fix for pthread_mutex_trylock as previously done for pthread_mutex_lock and pthread_mutex_timedlock.
> 
> I've filed these:
> https://bugzilla.redhat.com/show_bug.cgi?id=1672771
> https://bugzilla.redhat.com/show_bug.cgi?id=1672773
> 
> Feel free to comment or verify if they are going to be needed in RHEL7.7
> or RHEL8. I haven't done the analysis of the disassembly yet. If you
> could have a look that would help.
>
  
Carlos O'Donell Feb. 7, 2019, 8:32 p.m. UTC | #3
On 2/7/19 10:05 AM, Stefan Liebler wrote:
> On 02/06/2019 04:59 PM, Carlos O'Donell wrote:
>> On 2/6/19 6:25 AM, Stefan Liebler wrote:
>>> Hi Carlos,
>>> I've updated the patch with three additional comments and I've mentioned the filed bug.
>>> Please review it once again before I commit it to master and cherry-pick it to the release branches.
>>
>> Thank you! Reviewed.
> Committed to master (2.29.9000):
> "Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]"
> (https://sourceware.org/git/?p=glibc.git;a=commit;h=823624bdc47f1f80109c9c52dee7939b9386d708)
> 
> and backported it to glibc 2.25 ... 2.29 release branches.

Thank you! Now if only we could fix the final robust mutex
design flaw[1] ;-)
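
Taken together, the ten barriers enforce one protocol around every robust-list
update: announce the operation in list_op_pending, acquire, enqueue, then
retract the announcement, with a compiler barrier between each pair of steps
that must stay ordered. A condensed, self-contained sketch of that protocol
with stand-in types (illustrative names, not glibc's real internals):

    #include <errno.h>
    #include <stdatomic.h>
    #include <stddef.h>

    struct mutex  { atomic_int lock; void *next; };
    struct thread { void *list; void *list_op_pending; };

    int trylock_protocol (struct thread *self, struct mutex *m, int id)
    {
      /* 1. Announce the pending operation before touching the lock.  */
      self->list_op_pending = &m->next;
      __asm ("" ::: "memory");

      int expected = 0;
      if (!atomic_compare_exchange_strong (&m->lock, &expected, id))
        {
          /* Nothing was enqueued, so clearing op_pending needs no
             ordering wrt other memory accesses.  */
          self->list_op_pending = NULL;
          return EBUSY;
        }

      /* 2. Enqueue only after the lock is really held ...  */
      __asm ("" ::: "memory");
      m->next = self->list;                /* simplified ENQUEUE_MUTEX */
      self->list = &m->next;
      /* 3. ... and clear op_pending only after the enqueue.  */
      __asm ("" ::: "memory");
      self->list_op_pending = NULL;
      return 0;
    }

The invariant this preserves: at any instant the kernel's robust-list walk,
run when a thread dies, can reach a mutex the thread holds (or is operating
on) either through list_op_pending or through the list proper, so ownership
can always be recovered.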
  

Patch

commit b4c6ee19e804b0e90c117ec353ce67d321f0319b
Author: Stefan Liebler <stli@linux.ibm.com>
Date:   Wed Feb 6 11:27:03 2019 +0100

    Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
    
    While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
    Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
    instructions:
    140:   a5 1b 00 01             oill    %r1,1
    144:   e5 48 a0 f0 00 00       mvghi   240(%r10),0   <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
    14a:   e3 10 a0 e0 00 24       stg     %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
    
    vs (with compiler barriers):
    140:   a5 1b 00 01             oill    %r1,1
    144:   e3 10 a0 e0 00 24       stg     %r1,224(%r10)
    14a:   e5 48 a0 f0 00 00       mvghi   240(%r10),0
    
    Please have a look at the discussion:
    "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
    (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
    
    This patch is introducing the same compiler barriers and comments
    for pthread_mutex_trylock as introduced for pthread_mutex_lock and
    pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
    "Add compiler barriers around modifications of the robust mutex list."
    
    ChangeLog:
    
            [BZ #24180]
            * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
            Add compiler barriers and comments.

diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
index 8fe43b8f0f..bf2869eca2 100644
--- a/nptl/pthread_mutex_trylock.c
+++ b/nptl/pthread_mutex_trylock.c
@@ -94,6 +94,9 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
     case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 		     &mutex->__data.__list.__next);
+      /* We need to set op_pending before starting the operation.  Also
+	 see comments at ENQUEUE_MUTEX.  */
+      __asm ("" ::: "memory");
 
       oldval = mutex->__data.__lock;
       do
@@ -119,7 +122,12 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      /* But it is inconsistent unless marked otherwise.  */
 	      mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
 
+	      /* We must not enqueue the mutex before we have acquired it.
+		 Also see comments at ENQUEUE_MUTEX.  */
+	      __asm ("" ::: "memory");
 	      ENQUEUE_MUTEX (mutex);
+	      /* We need to clear op_pending after we enqueue the mutex.  */
+	      __asm ("" ::: "memory");
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	      /* Note that we deliberately exist here.  If we fall
@@ -135,6 +143,8 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      int kind = PTHREAD_MUTEX_TYPE (mutex);
 	      if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
 		{
+		  /* We do not need to ensure ordering wrt another memory
+		     access.  Also see comments at ENQUEUE_MUTEX. */
 		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 				 NULL);
 		  return EDEADLK;
@@ -142,6 +152,8 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 
 	      if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
 		{
+		  /* We do not need to ensure ordering wrt another memory
+		     access.  */
 		  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
 				 NULL);
 
@@ -160,6 +172,9 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 							id, 0);
 	  if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
 	    {
+	      /* We haven't acquired the lock as it is already acquired by
+		 another owner.  We do not need to ensure ordering wrt another
+		 memory access.  */
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	      return EBUSY;
@@ -173,13 +188,20 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	      if (oldval == id)
 		lll_unlock (mutex->__data.__lock,
 			    PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
+	      /* FIXME This violates the mutex destruction requirements.  See
+		 __pthread_mutex_unlock_full.  */
 	      THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	      return ENOTRECOVERABLE;
 	    }
 	}
       while ((oldval & FUTEX_OWNER_DIED) != 0);
 
+      /* We must not enqueue the mutex before we have acquired it.
+	 Also see comments at ENQUEUE_MUTEX.  */
+      __asm ("" ::: "memory");
       ENQUEUE_MUTEX (mutex);
+      /* We need to clear op_pending after we enqueue the mutex.  */
+      __asm ("" ::: "memory");
       THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
       mutex->__data.__owner = id;
@@ -211,10 +233,15 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	}
 
 	if (robust)
-	  /* Note: robust PI futexes are signaled by setting bit 0.  */
-	  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
-			 (void *) (((uintptr_t) &mutex->__data.__list.__next)
-				   | 1));
+	  {
+	    /* Note: robust PI futexes are signaled by setting bit 0.  */
+	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
+			   (void *) (((uintptr_t) &mutex->__data.__list.__next)
+				     | 1));
+	    /* We need to set op_pending before starting the operation.  Also
+	       see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
+	  }
 
 	oldval = mutex->__data.__lock;
 
@@ -223,12 +250,16 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	  {
 	    if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
 	      {
+		/* We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 		return EDEADLK;
 	      }
 
 	    if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
 	      {
+		/* We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		/* Just bump the counter.  */
@@ -250,6 +281,9 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	  {
 	    if ((oldval & FUTEX_OWNER_DIED) == 0)
 	      {
+		/* We haven't acquired the lock as it is already acquired by
+		   another owner.  We do not need to ensure ordering wrt another
+		   memory access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		return EBUSY;
@@ -270,6 +304,9 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	    if (INTERNAL_SYSCALL_ERROR_P (e, __err)
 		&& INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
 	      {
+		/* The kernel has not yet finished the mutex owner death.
+		   We do not need to ensure ordering wrt another memory
+		   access.  */
 		THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 		return EBUSY;
@@ -287,7 +324,12 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 	    /* But it is inconsistent unless marked otherwise.  */
 	    mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
 
+	    /* We must not enqueue the mutex before we have acquired it.
+	       Also see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
 	    ENQUEUE_MUTEX (mutex);
+	    /* We need to clear op_pending after we enqueue the mutex.  */
+	    __asm ("" ::: "memory");
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 
 	    /* Note that we deliberately exit here.  If we fall
@@ -310,13 +352,20 @@  __pthread_mutex_trylock (pthread_mutex_t *mutex)
 						  PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
 			      0, 0);
 
+	    /* To the kernel, this will be visible after the kernel has
+	       acquired the mutex in the syscall.  */
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	    return ENOTRECOVERABLE;
 	  }
 
 	if (robust)
 	  {
+	    /* We must not enqueue the mutex before we have acquired it.
+	       Also see comments at ENQUEUE_MUTEX.  */
+	    __asm ("" ::: "memory");
 	    ENQUEUE_MUTEX_PI (mutex);
+	    /* We need to clear op_pending after we enqueue the mutex.  */
+	    __asm ("" ::: "memory");
 	    THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
 	  }