Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
Commit Message
Hi,
while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
instructions:
140: a5 1b 00 01 oill %r1,1
144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM
(THREAD_SELF, robust_head.list_op_pending, NULL);
14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last
THREAD_SETMEM of ENQUEUE_MUTEX_PI
vs (with compiler barriers):
140: a5 1b 00 01 oill %r1,1
144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
Please have a look at the discussion:
"Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
(https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
This patch is introducing the same compiler barriers and comments
for pthread_mutex_trylock as introduced for pthread_mutex_lock and
pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
"Add compiler barriers around modifications of the robust mutex list."
Okay to commit?
The original commit was first available with glibc release 2.25.
Once this patch is committed, we should at least backport it to
glibc release branches 2.25 - 2.28?
Does anybody know if and where the original commit was backported to?
I've found at least "Bug 1401665 - Fix process shared robust mutex
defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)
Bye
Stefan
ChangeLog:
* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
Add compiler barriers and comments.
Comments
On 2/5/19 11:21 AM, Stefan Liebler wrote:
> while debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
> instructions:
> 140: a5 1b 00 01 oill %r1,1
> 144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>
> vs (with compiler barriers):
> 140: a5 1b 00 01 oill %r1,1
> 144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
> 14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
>
> Please have a look at the discussion:
> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
What a great read! Thank you to everyone you had to spend time tracking
this down. Thank you for the patch Stefan.
> This patch is introducing the same compiler barriers and comments
> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
> "Add compiler barriers around modifications of the robust mutex list."
>
> Okay to commit?
OK for master with 3 additional comments about ordering.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> The original commit was first available with glibc release 2.25.
> Once this patch is committed, we should at least backport it to
> glibc release branches 2.25 - 2.28?
... and 2.29.
Yes, absolutely, once you commit to master you are free to use
git cherry-pick -x to put the fix on all the branches you need.
Please post your work to libc-stable@sourceware.org.
> Does anybody know if and where the original commit was backported to?
> I've found at least "Bug 1401665 - Fix process shared robust mutex defects." (https://bugzilla.redhat.com/show_bug.cgi?id=1401665#c34)
Yes, I did that backport to RHEL 7.6. These fixes are just "further"
fixes right? I'll work on getting this fixed in RHEL 7.7, and RHEL 8
for all arches.
> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
> Add compiler barriers and comments.
>
> 20190205_pthread_mutex_trylock_barriers.patch
>
> commit 9efa39ef04961397e39e7a9d3c11a33937755aec
> Author: Stefan Liebler <stli@linux.ibm.com>
> Date: Tue Feb 5 12:37:42 2019 +0100
>
> Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
OK.
> While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
> Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
> instructions:
> 140: a5 1b 00 01 oill %r1,1
> 144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> 14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
>
> vs (with compiler barriers):
> 140: a5 1b 00 01 oill %r1,1
> 144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
> 14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
>
> Please have a look at the discussion:
> "Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
> (https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
OK.
> This patch is introducing the same compiler barriers and comments
> for pthread_mutex_trylock as introduced for pthread_mutex_lock and
> pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
> "Add compiler barriers around modifications of the robust mutex list."
>
> ChangeLog:
>
> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
> Add compiler barriers and comments.
OK.
>
I reviewed:
nptl/descr.h - OK
nptl/pthread_mutex_unlock.c - OK
nptl/pthread_mutex_timedlock.c - OK
nptl/allocatestack.c - OK
nptl/pthread_create.c - OK
nptl/pthread_mutex_lock.c - OK
nptl/nptl-init.c - OK
nptl/pthread_mutex_trylock.c <--- I count 10 missing barriers, and 9 missing ordering comments.
sysdeps/nptl/fork.c - OK
> diff --git a/nptl/pthread_mutex_trylock.c b/nptl/pthread_mutex_trylock.c
> index 8fe43b8f0f..ff1d7282ab 100644
> --- a/nptl/pthread_mutex_trylock.c
> +++ b/nptl/pthread_mutex_trylock.c
> @@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> &mutex->__data.__list.__next);
> + /* We need to set op_pending before starting the operation. Also
> + see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 1/10 barriers.
>
> oldval = mutex->__data.__lock;
> do
> @@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> /* But it is inconsistent unless marked otherwise. */
> mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 2/10 barriers.
> ENQUEUE_MUTEX (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 3/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> /* Note that we deliberately exist here. If we fall
> @@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> int kind = PTHREAD_MUTEX_TYPE (mutex);
> if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. Also see comments at ENQUEUE_MUTEX. */
OK. 1/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> NULL);
> return EDEADLK;
> @@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
>
> if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. */
OK. 2/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> NULL);
>
Missing comment.
159 oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
Did not acquire the lock.
160 id, 0);
161 if (oldval != 0 && (oldval & FUTEX_OWNER_DIED) == 0)
162 {
163 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
Clearing list_op_pending has no ordering requirement, and so could use a comment?
164
165 return EBUSY;
166 }
> @@ -173,13 +185,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> if (oldval == id)
> lll_unlock (mutex->__data.__lock,
> PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
> + /* FIXME This violates the mutex destruction requirements. See
> + __pthread_mutex_unlock_full. */
OK. 3/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> return ENOTRECOVERABLE;
> }
> }
> while ((oldval & FUTEX_OWNER_DIED) != 0);
>
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 4/10 barriers.
> ENQUEUE_MUTEX (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 5/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> mutex->__data.__owner = id;
> @@ -211,10 +230,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> }
>
> if (robust)
> - /* Note: robust PI futexes are signaled by setting bit 0. */
> - THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> - (void *) (((uintptr_t) &mutex->__data.__list.__next)
> - | 1));
> + {
> + /* Note: robust PI futexes are signaled by setting bit 0. */
> + THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
> + (void *) (((uintptr_t) &mutex->__data.__list.__next)
> + | 1));
> + /* We need to set op_pending before starting the operation. Also
> + see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 6/10 barriers.
> + }
>
> oldval = mutex->__data.__lock;
>
> @@ -223,12 +247,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> {
> if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. */
OK. 4/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> return EDEADLK;
> }
>
> if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
> {
> + /* We do not need to ensure ordering wrt another memory
> + access. */
OK. 5/9 comments.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> /* Just bump the counter. */
Missing comment?
249 if (oldval != 0)
Filaed to get the lock.
250 {
251 if ((oldval & FUTEX_OWNER_DIED) == 0)
252 {
253 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
Clearing list_op_pending has no ordering requirement, and so could use a comment?
254
255 return EBUSY;
256 }
...
270 if (INTERNAL_SYSCALL_ERROR_P (e, __err)
271 && INTERNAL_SYSCALL_ERRNO (e, __err) == EWOULDBLOCK)
272 {
273 THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
Clearing list_op_pending has no ordering requirement, and so could use a comment?
274
275 return EBUSY;
276 }
> @@ -287,7 +315,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> /* But it is inconsistent unless marked otherwise. */
> mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
>
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 7/8 barriers.
> ENQUEUE_MUTEX (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 8/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
>
> /* Note that we deliberately exit here. If we fall
> @@ -310,13 +343,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
> PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
> 0, 0);
>
> + /* To the kernel, this will be visible after the kernel has
> + acquired the mutex in the syscall. */
OK. 6/9 comments. 3 missing comments noted.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> return ENOTRECOVERABLE;
> }
>
> if (robust)
> {
> + /* We must not enqueue the mutex before we have acquired it.
> + Also see comments at ENQUEUE_MUTEX. */
> + __asm ("" ::: "memory");
OK. 9/10 barriers.
> ENQUEUE_MUTEX_PI (mutex);
> + /* We need to clear op_pending after we enqueue the mutex. */
> + __asm ("" ::: "memory");
OK. 10/10 barriers.
> THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
> }
>
* Stefan Liebler:
> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
> Add compiler barriers and comments.
Please file a bug on sourceware.org and reference it in the commit.
Thanks.
Florian
On 02/06/2019 08:02 AM, Florian Weimer wrote:
> * Stefan Liebler:
>
>> * nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
>> Add compiler barriers and comments.
>
> Please file a bug on sourceware.org and reference it in the commit.
> Thanks.
>
> Florian
>
Okay. I've filed the following bug:
"Bug 24180 - pthread_mutex_trylock does not use the correct order of
instructions while maintaining the robust mutex list due to missing
compiler barriers."
https://sourceware.org/bugzilla/show_bug.cgi?id=24180
commit 9efa39ef04961397e39e7a9d3c11a33937755aec
Author: Stefan Liebler <stli@linux.ibm.com>
Date: Tue Feb 5 12:37:42 2019 +0100
Add compiler barriers around modifications of the robust mutex list for pthread_mutex_trylock.
While debugging a kernel warning, Thomas Gleixner, Sebastian Sewior and
Heiko Carstens found a bug in pthread_mutex_trylock due to misordered
instructions:
140: a5 1b 00 01 oill %r1,1
144: e5 48 a0 f0 00 00 mvghi 240(%r10),0 <--- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
14a: e3 10 a0 e0 00 24 stg %r1,224(%r10) <--- last THREAD_SETMEM of ENQUEUE_MUTEX_PI
vs (with compiler barriers):
140: a5 1b 00 01 oill %r1,1
144: e3 10 a0 e0 00 24 stg %r1,224(%r10)
14a: e5 48 a0 f0 00 00 mvghi 240(%r10),0
Please have a look at the discussion:
"Re: WARN_ON_ONCE(!new_owner) within wake_futex_pi() triggerede"
(https://lore.kernel.org/lkml/20190202112006.GB3381@osiris/)
This patch is introducing the same compiler barriers and comments
for pthread_mutex_trylock as introduced for pthread_mutex_lock and
pthread_mutex_timedlock by commit 8f9450a0b7a9e78267e8ae1ab1000ebca08e473e
"Add compiler barriers around modifications of the robust mutex list."
ChangeLog:
* nptl/pthread_mutex_trylock.c (__pthread_mutex_trylock):
Add compiler barriers and comments.
@@ -94,6 +94,9 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
case PTHREAD_MUTEX_ROBUST_ADAPTIVE_NP:
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
&mutex->__data.__list.__next);
+ /* We need to set op_pending before starting the operation. Also
+ see comments at ENQUEUE_MUTEX. */
+ __asm ("" ::: "memory");
oldval = mutex->__data.__lock;
do
@@ -119,7 +122,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
/* But it is inconsistent unless marked otherwise. */
mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
+ /* We must not enqueue the mutex before we have acquired it.
+ Also see comments at ENQUEUE_MUTEX. */
+ __asm ("" ::: "memory");
ENQUEUE_MUTEX (mutex);
+ /* We need to clear op_pending after we enqueue the mutex. */
+ __asm ("" ::: "memory");
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
/* Note that we deliberately exist here. If we fall
@@ -135,6 +143,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
int kind = PTHREAD_MUTEX_TYPE (mutex);
if (kind == PTHREAD_MUTEX_ROBUST_ERRORCHECK_NP)
{
+ /* We do not need to ensure ordering wrt another memory
+ access. Also see comments at ENQUEUE_MUTEX. */
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
NULL);
return EDEADLK;
@@ -142,6 +152,8 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
if (kind == PTHREAD_MUTEX_ROBUST_RECURSIVE_NP)
{
+ /* We do not need to ensure ordering wrt another memory
+ access. */
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
NULL);
@@ -173,13 +185,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
if (oldval == id)
lll_unlock (mutex->__data.__lock,
PTHREAD_ROBUST_MUTEX_PSHARED (mutex));
+ /* FIXME This violates the mutex destruction requirements. See
+ __pthread_mutex_unlock_full. */
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
return ENOTRECOVERABLE;
}
}
while ((oldval & FUTEX_OWNER_DIED) != 0);
+ /* We must not enqueue the mutex before we have acquired it.
+ Also see comments at ENQUEUE_MUTEX. */
+ __asm ("" ::: "memory");
ENQUEUE_MUTEX (mutex);
+ /* We need to clear op_pending after we enqueue the mutex. */
+ __asm ("" ::: "memory");
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
mutex->__data.__owner = id;
@@ -211,10 +230,15 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
}
if (robust)
- /* Note: robust PI futexes are signaled by setting bit 0. */
- THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
- (void *) (((uintptr_t) &mutex->__data.__list.__next)
- | 1));
+ {
+ /* Note: robust PI futexes are signaled by setting bit 0. */
+ THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending,
+ (void *) (((uintptr_t) &mutex->__data.__list.__next)
+ | 1));
+ /* We need to set op_pending before starting the operation. Also
+ see comments at ENQUEUE_MUTEX. */
+ __asm ("" ::: "memory");
+ }
oldval = mutex->__data.__lock;
@@ -223,12 +247,16 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
{
if (kind == PTHREAD_MUTEX_ERRORCHECK_NP)
{
+ /* We do not need to ensure ordering wrt another memory
+ access. */
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
return EDEADLK;
}
if (kind == PTHREAD_MUTEX_RECURSIVE_NP)
{
+ /* We do not need to ensure ordering wrt another memory
+ access. */
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
/* Just bump the counter. */
@@ -287,7 +315,12 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
/* But it is inconsistent unless marked otherwise. */
mutex->__data.__owner = PTHREAD_MUTEX_INCONSISTENT;
+ /* We must not enqueue the mutex before we have acquired it.
+ Also see comments at ENQUEUE_MUTEX. */
+ __asm ("" ::: "memory");
ENQUEUE_MUTEX (mutex);
+ /* We need to clear op_pending after we enqueue the mutex. */
+ __asm ("" ::: "memory");
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
/* Note that we deliberately exit here. If we fall
@@ -310,13 +343,20 @@ __pthread_mutex_trylock (pthread_mutex_t *mutex)
PTHREAD_ROBUST_MUTEX_PSHARED (mutex)),
0, 0);
+ /* To the kernel, this will be visible after the kernel has
+ acquired the mutex in the syscall. */
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
return ENOTRECOVERABLE;
}
if (robust)
{
+ /* We must not enqueue the mutex before we have acquired it.
+ Also see comments at ENQUEUE_MUTEX. */
+ __asm ("" ::: "memory");
ENQUEUE_MUTEX_PI (mutex);
+ /* We need to clear op_pending after we enqueue the mutex. */
+ __asm ("" ::: "memory");
THREAD_SETMEM (THREAD_SELF, robust_head.list_op_pending, NULL);
}