Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
Commit Message
Hi Carlos,
I've updated the patch with three additional comments and I've mentioned
the filed bug.
Please review it once again before I commit it to master and cherry pick
it to the release branches.
On 02/05/2019 10:00 PM, Carlos O'Donell wrote:
> On 2/5/19 11:21 AM, Stefan Liebler wrote:
>> while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>> instructions:
>> 140: a5 1b 00 01 oill %r1,1
>> 144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> 14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>>
>> vs (with compiler barriers):
>> 140: a5 1b 00 01 oill %r1,1
>> 144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
>> 14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
>>
>> Please have a look at the discussion:
>> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
>
> What a great read! Thank you to everyone who had to spend time tracking
> this down. Thank you for the patch, Stefan.
>
>> This patch is introducing the same compiler barriers and comments
>> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>> "Add compiler barriers around modifications of the robust mutex list."
>>
>> Okay to commit?
>
> OK for master with 3 additional comments about ordering.
>
> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
>
>> The original commit was first available with glibc release 2.25.
>> Once this patch is committed, we should at least backport it to
>> glibc release branches 2.25 - 2.28?
>
> ... and 2.29.
Yes, of course.
>
> Yes, absolutely, once you commit to master you are free to use
> git cherry-pick -x to put the fix on all the branches you need.
> Please post your work to libc-stable@sourceware.org.
>
>> Does anybody know if and where the original commit was backported to?
>> I've found at least "Bug 1401665 - Fix process shared robust mutex defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)
>
> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
> for all arches.
Sounds great.
That's the same fix for pthread_mutex_trylock as previously done for
pthread_mutex_lock and pthread_mutex_timedlock.
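For reference, the ordering that the barriers enforce is the same in all
three functions. Roughly (a sketch pieced together from the patch hunks
below, not verbatim source; error handling omitted):

  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
                 &mutex->__data.__list.__next);
  /* op_pending must be set before the lock operation starts.  */
  __asm ("" ::: "memory");
  /* ... acquire the lock ...  */
  /* The mutex must not be enqueued before it has been acquired.  */
  __asm ("" ::: "memory");
  ENQUEUE_MUTEX (mutex);
  /* op_pending must only be cleared after the mutex is enqueued.  */
  __asm ("" ::: "memory");
  THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);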
>
>> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>> Add compiler barriers and comments.
>>
>> 20190205_pthread_mutex_trylock_barriers.patch
>>
>> commit 9efa39ef04961397e39e7a9d3c11a33937755aec
>> Author: Stefan Liebler <stli@linux.ibm.com>
>> Date: Tue Feb 5 12:37:42 2019 +0100
>>
>> Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
>
> OK.
>
>> While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
>> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
>> instructions:
>> 140: a5 1b 00 01 oill %r1,1
>> 144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> 14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>>
>> vs (with compiler barriers):
>> 140: a5 1b 00 01 oill %r1,1
>> 144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
>> 14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
>>
>> Please have a look at the discussion:
>> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
>> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
>
> OK.
>
>> This patch is introducing the same compiler barriers and comments
>> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
>> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
>> "Add compiler barriers around modifications of the robust mutex list."
>>
>> ChangeLog:
>>
>> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>> Add compiler barriers and comments.
>
> OK.
>
>>
>
> I reviewed:
>
> nptl/descr.h - OK
> nptl/pthread_mutex_unlock.c - OK
> nptl/pthread_mutex_timedlock.c - OK
> nptl/allocatestack.c - OK
> nptl/pthread_create.c - OK
> nptl/pthread_mutex_lock.c - OK
> nptl/nptl-init.c - OK
> nptl/pthread_mutex_trylock.c <--- I count 10 missing barriers, and 9 missing ordering comments.
> sysdeps/nptl/fork.c - OK
>
>> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
>> index 8fe43b8f0f..ff1d7282ab 100644
>> --- a/nptl/pthread_mutex_trylock.c
>> +++ b/nptl/pthread_mutex_trylock.c
>> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> &mutex->__data.__list.__next);
>> + /* We need to set op_pending before starting the operation. Also
>> + see comments at ENQUEUE_MUTEX. */
>> + __asm ("" ::: "memory");
>
> OK. 1/10 barriers.
>
>>
>> oldval = mutex->__data.__lock;
>> do
>> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> /* But it is inconsistent unless marked otherwise. */
>> mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>>
>> + /* We must not enqueue the mutex before we have acquired it.
>> + Also see comments at ENQUEUE_MUTEX. */
>> + __asm ("" ::: "memory");
>
> OK. 2/10 barriers.
>
>> ENQUEUE_MUTEX (mutex);
>> + /* We need to clear op_pending after we enqueue the mutex. */
>> + __asm ("" ::: "memory");
>
> OK. 3/10 barriers.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>
>> /* Note that we deliberately exist here. If we fall
>> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> int kind = PTHREAD_MUTEX_TYPE (mutex);
>> if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
>> {
>> + /* We do not need to ensure ordering wrt another memory
>> + access. Also see comments at ENQUEUE_MUTEX. */
>
> OK. 1/9 comments.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> NULL);
>> return EDEADLK;
>> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>>
>> if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
>> {
>> + /* We do not need to ensure ordering wrt another memory
>> + access. */
>
> OK. 2/9 comments.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> NULL);
>>
>
> Missing comment.
>
> 159 oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
>
> Did not acquire the lock.
>
> 160 id, 0);
> 161 if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
> 162 {
> 163 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
>
Added a comment. See new patch.
> 164
> 165 return EBUSY;
> 166 }
>
>
>> @@ -173,13 +185,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> if (oldval == id)
>> lll_unlock (mutex->__data.__lock,
>> PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
>> + /* FIXME This violates the mutex destruction requirements. See
>> + __pthread_mutex_unlock_full. */
>
> OK. 3/9 comments.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> return ENOTRECOVERABLE;
>> }
>> }
>> while ((oldval & FUTEX_OWNER_DIED) != 0);
>>
>> + /* We must not enqueue the mutex before we have acquired it.
>> + Also see comments at ENQUEUE_MUTEX. */
>> + __asm ("" ::: "memory");
>
> OK. 4/10 barriers.
>
>> ENQUEUE_MUTEX (mutex);
>> + /* We need to clear op_pending after we enqueue the mutex. */
>> + __asm ("" ::: "memory");
>
> OK. 5/10 barriers.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>
>> mutex->__data.__owner = id;
>> @@ -211,10 +230,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> }
>>
>> if (robust)
>> - /* Note: robust PI futexes are signaled by setting bit 0. */
>> - THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> - (void *) (((uintptr_t) &mutex->__data.__list.__next)
>> - | 1));
>> + {
>> + /* Note: robust PI futexes are signaled by setting bit 0. */
>> + THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
>> + (void *) (((uintptr_t) &mutex->__data.__list.__next)
>> + | 1));
>> + /* We need to set op_pending before starting the operation. Also
>> + see comments at ENQUEUE_MUTEX. */
>> + __asm ("" ::: "memory");
>
> OK. 6/10 barriers.
>
>> + }
>>
>> oldval = mutex->__data.__lock;
>>
>> @@ -223,12 +247,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> {
>> if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
>> {
>> + /* We do not need to ensure ordering wrt another memory
>> + access. */
>
> OK. 4/9 comments.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> return EDEADLK;
>> }
>>
>> if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
>> {
>> + /* We do not need to ensure ordering wrt another memory
>> + access. */
>
> OK. 5/9 comments.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>
>> /* Just bump the counter. */
>
> Missing comment?
>
> 249 if (oldval != 0)
>
> Failed to get the lock.
>
> 250 {
> 251 if ((oldval & FUTEX_OWNER_DIED) == 0)
> 252 {
> 253 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
Added a comment. See new patch.
>
> 254
> 255 return EBUSY;
> 256 }
> ...
> 270 if (INTERNAL_SYSCALL_ERROR_P (e, __err)
> 271 && INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
> 272 {
> 273 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> Clearing list_op_pending has no ordering requirement, and so could use a comment?
Added a comment. See new patch.
>
> 274
> 275 return EBUSY;
> 276 }
>
>
>> @@ -287,7 +315,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> /* But it is inconsistent unless marked otherwise. */
>> mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>>
>> + /* We must not enqueue the mutex before we have acquired it.
>> + Also see comments at ENQUEUE_MUTEX. */
>> + __asm ("" ::: "memory");
>
> OK. 7/10 barriers.
>
>> ENQUEUE_MUTEX (mutex);
>> + /* We need to clear op_pending after we enqueue the mutex. */
>> + __asm ("" ::: "memory");
>
> OK. 8/10 barriers.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>>
>> /* Note that we deliberately exit here. If we fall
>> @@ -310,13 +343,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>> PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
>> 0, 0);
>>
>> + /* To the kernel, this will be visible after the kernel has
>> + acquired the mutex in the syscall. */
>
> OK. 6/9 comments. 3 missing comments noted.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> return ENOTRECOVERABLE;
>> }
>>
>> if (robust)
>> {
>> + /* We must not enqueue the mutex before we have acquired it.
>> + Also see comments at ENQUEUE_MUTEX. */
>> + __asm ("" ::: "memory");
>
> OK. 9/10 barriers.
>
>> ENQUEUE_MUTEX_PI (mutex);
>> + /* We need to clear op_pending after we enqueue the mutex. */
>> + __asm ("" ::: "memory");
>
> OK. 10/10 barriers.
>
>> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>> }
>>
>
>
Comments
On 2/6/19 6:25 AM, Stefan Liebler wrote:
> Hi Carlos,
> I've updated the patch with three additional comments and I've mentioned the filed bug.
> Please review it once again before I commit it to master and cherry pick it to the release branches.
Thank you! Reviewed.
>> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
>> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
>> for all arches.
> Sounds great.
> That's the same fix for pthread_mutex_trylock as previously done for pthread_mutex_lock and pthread_mutex_timedlock.
I've filed these:
https://bugzilla.redhat.com/show_bug.cgi?id=1672771
https://bugzilla.redhat.com/show_bug.cgi?id=1672773
Feel free to comment or verify if they are going to be needed in RHEL7.7
or RHEL8. I haven't done the analysis of the disassembly yet. If you
could have a look that would help.
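The kind of reordering to look for boils down to the sketch below
(hypothetical standalone code, not the glibc sources; the struct and
function names are made up). Even if a particular build happens to emit
the stores in the safe order, the barriers are still wanted: without
them the correct order is merely an accident of that compiler version.

  struct robust_fields
  {
    void *list_op_pending;
    void *list_tail;
  };

  void
  enqueue_and_clear (struct robust_fields *t, void *node)
  {
    /* Stands in for the last THREAD_SETMEM of ENQUEUE_MUTEX_PI.  */
    t->list_tail = node;
    /* Without this compiler barrier the compiler may emit the store
       clearing list_op_pending first, which is the reordering seen in
       the s390x disassembly above.  */
    __asm ("" ::: "memory");
    /* Stands in for THREAD_SETMEM (THREAD_SELF,
       robust_head.list_op_pending, NULL).  */
    t->list_op_pending = 0;
  }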
> 20190206_pthread_mutex_trylock_barriers.patch
>
> commit b4c6ee19e804b0e90c117ec353ce67d321f0319b
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date: Wed Feb 6 11:27:03 2019 +0100
>
> Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]
>
> While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
> instructions:
> 140: a5 1b 00 01 oill %r1,1
> 144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>
> vs (with compiler barriers):
> 140: a5 1b 00 01 oill %r1,1
> 144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
> 14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
>
> Please have a look at the discussion:
> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
>
> This patch is introducing the same compiler barriers and comments
> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
> "Add compiler barriers around modifications of the robust mutex list."
>
> ChangeLog:
>
> [BZ #24180]
> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
> Add compiler barriers and comments.
OK for master.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
> index 8fe43b8f0f..bf2869eca2 100644
> --- a/nptl/pthread_mutex_trylock.c
> +++ b/nptl/pthread_mutex_trylock.c
> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> &mutex->__data.__list.__next);
> + /* We need to set op_pending before starting the operation. Also
> + see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 1/10 barriers.
>
> oldval = mutex->__data.__lock;
> do
> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> /* But it is inconsistent unless marked otherwise. */
> mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 2/10 barriers.
> ENQUEUE_MUTEX (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 3/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> /* Note that we deliberately exist here. If we fall
> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> int kind = PTHREAD_MUTEX_TYPE (mutex);
> if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. Also see comments at ENQUEUE_MUTEX. */
OK. 1/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> NULL);
> return EDEADLK;
> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>
> if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. */
OK. 2/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> NULL);
>
> @@ -160,6 +172,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> id, 0);
> if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
> {
> + /* We haven't acquired the lock as it is already acquired by
> + another owner. We do not need to ensure ordering wrt another
> + memory access. */
OK. 3/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> return EBUSY;
> @@ -173,13 +188,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> if (oldval == id)
> lll_unlock (mutex->__data.__lock,
> PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
> + /* FIXME This violates the mutex destruction requirements. See
> + __pthread_mutex_unlock_full. */
OK. 4/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> return ENOTRECOVERABLE;
> }
> }
> while ((oldval & FUTEX_OWNER_DIED) != 0);
>
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 4/10 barriers.
> ENQUEUE_MUTEX (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 5/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> mutex->__data.__owner = id;
> @@ -211,10 +233,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> }
>
> if (robust)
> - /* Note: robust PI futexes are signaled by setting bit 0. */
> - THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> - (void *) (((uintptr_t) &mutex->__data.__list.__next)
> - | 1));
> + {
> + /* Note: robust PI futexes are signaled by setting bit 0. */
> + THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> + (void *) (((uintptr_t) &mutex->__data.__list.__next)
> + | 1));
> + /* We need to set op_pending before starting the operation. Also
> + see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 6/10 barriers.
> + }
>
> oldval = mutex->__data.__lock;
>
> @@ -223,12 +250,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> {
> if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. */
OK. 5/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> return EDEADLK;
> }
>
> if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. */
OK. 6/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> /* Just bump the counter. */
> @@ -250,6 +281,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> {
> if ((oldval & FUTEX_OWNER_DIED) == 0)
> {
> + /* We haven't acquired the lock as it is already acquired by
> + another owner. We do not need to ensure ordering wrt another
> + memory access. */
OK. 7/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> return EBUSY;
> @@ -270,6 +304,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> if (INTERNAL_SYSCALL_ERROR_P (e, __err)
> && INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
> {
> + /* The kernel has not yet finished the mutex owner death.
> + We do not need to ensure ordering wrt another memory
> + access. */
OK. 8/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> return EBUSY;
> @@ -287,7 +324,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> /* But it is inconsistent unless marked otherwise. */
> mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 7/10 barriers.
> ENQUEUE_MUTEX (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 8/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> /* Note that we deliberately exit here. If we fall
> @@ -310,13 +352,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
> 0, 0);
>
> + /* To the kernel, this will be visible after the kernel has
> + acquired the mutex in the syscall. */
OK. 9/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> return ENOTRECOVERABLE;
> }
>
> if (robust)
> {
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 9/10 barriers.
> ENQUEUE_MUTEX_PI (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 10/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> }
>
On 02/06/2019 04:59 PM, Carlos O'Donell wrote:
> On 2/6/19 6:25 AM, Stefan Liebler wrote:
>> Hi Carlos,
>> I've updated the patch with three additional comments and I've mentioned the filed bug.
>> Please review it once again before I commit it to master and cherry pick it to the release branches.
>
> Thank you! Reviewed.
Committed to master (2.29.9000):
"Add compiler barriers around modifications of the robust mutex list for
pthread_mutex_trylock. [BZ #24180]"
(https://sourceware.org/git/?p=glibc.git;a=commit;h=823624bdc47f1f80109c9c52dee7939b9386d708)
and backported it to glibc 2.25 ... 2.29 release branches.
Thanks.
Stefan
>
>>> Yes, I did that backport to RHEL 7.6. These fixes are just "further"
>>> fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
>>> for all arches.
>> Sounds great.
>> That's the same fix for pthread_mutex_trylock as previously done for pthread_mutex_lock and pthread_mutex_timedlock.
>
> I've filed these:
> https://bugzilla.redhat.com/show_bug.cgi?id=1672771
> https://bugzilla.redhat.com/show_bug.cgi?id=1672773
>
> Feel free to comment or verify if they are going to be needed in RHEL7.7
> or RHEL8. I haven't done the analysis of the disassembly yet. If you
> could have a look that would help.
>
On 2/7/19 10:05 AM, Stefan Liebler wrote:
> On 02/06/2019 04:59 PM, Carlos O'Donell wrote:
>> On 2/6/19 6:25 AM, Stefan Liebler wrote:
>>> Hi Carlos,
>>> I've updated the patch with three additional comments and I've mentioned the filed bug.
>>> Please review it once again before I commit it to master and cherry pick it to the release branches.
>>
>> Thank you! Reviewed.
> Committed to master (2.29.9000):
> "Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock. [BZ #24180]"
> (https://sourceware.org/git/?p=glibc.git;a=commit;h=823624bdc47f1f80109c9c52dee7939b9386d708)
>
> and backported it to glibc 2.25 ... 2.29 release branches.
Thank you! Now if only we could fix the final robust mutex
design flaw[1] ;-)