Message ID | 1437396845-19469-1-git-send-email-mkl@pengutronix.de |
---|---|
State | Changes Requested, archived |
Headers |
Received: (qmail 73462 invoked by alias); 20 Jul 2015 12:54:13 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 73448 invoked by uid 89); 20 Jul 2015 12:54:12 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=BAYES_00, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_NONE, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: metis.ext.pengutronix.de From: Marc Kleine-Budde <mkl@pengutronix.de> To: libc-alpha@sourceware.org Cc: kernel@pengutronix.de, Marc Kleine-Budde <mkl@pengutronix.de> Subject: [PATCH] [BZ 18463] ARM - fix PI mutex breakge Date: Mon, 20 Jul 2015 14:54:05 +0200 Message-Id: <1437396845-19469-1-git-send-email-mkl@pengutronix.de> X-SA-Exim-Connect-IP: 2001:67c:670:100:1d::7 X-SA-Exim-Mail-From: mkl@pengutronix.de X-SA-Exim-Scanned: No (on metis.ext.pengutronix.de); SAEximRunCond expanded to false X-PTX-Original-Recipient: libc-alpha@sourceware.org |
Commit Message
Marc Kleine-Budde
July 20, 2015, 12:54 p.m. UTC
Since commit glibc-2.20~667: 47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894). PI mutexes are broken on ARM systems. In program with more than two threads waiting on a condition variable (which is guarded by a mutex with PTHREAD_PRIO_INHERIT set), the pthread_mutex_unlock() after pthread_cond_broadcast() may fail with EPERM. The above mentioned patch undefines __ASSUME_FUTEX_LOCK_PI, __ASSUME_REQUEUE_PI and __ASSUME_SET_ROBUST_LIST. Which means switching from assuming kernel support for these features to autodetection. For PI futex (__ASSUME_FUTEX_LOCK_PI) and robust mutexes (__ASSUME_SET_ROBUST_LIST) there is runtime detection but no fallback code. As a fallback is not possible without kernel support. Instead a proper error (ENOTSUP) code is returned by the glibc's pthread_mutex_init(). But there is no runtime detection for the REQUEUE_PI feature. The way PI futexes are implemented, REQUEUE_PI must be used, quoting the Kernel's ./Documentation/futex-requeue-pi.txt: > In order to ensure the rt_mutex has an owner if it has waiters, it > is necessary for both the requeue code, as well as the waiting code, > to be able to acquire the rt_mutex before returning to user space. > The requeue code cannot simply wake the waiter and leave it to > acquire the rt_mutex as it would open a race window between the ^^^^^^^^^^^ > requeue call returning to user space and the waiter waking and > starting to run. This is especially true in the uncontended case. > > The solution involves two new rt_mutex helper routines, > rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which > allow the requeue code to acquire an uncontended rt_mutex on behalf > of the waiter and to enqueue the waiter on a contended rt_mutex. > Two new system calls provide the kernel<->user interface to > requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI. ^^^^^^^^^^^^^^^^^^^^^ > > FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait() ^^^^^^^^^^^^^^^^^^^^^ > and pthread_cond_timedwait()) to block on the initial futex and wait > to be requeued to a PI-aware futex. The implementation is the > result of a high-speed collision between futex_wait() and > futex_lock_pi(), with some extra logic to check for the additional > wake-up scenarios. Looking at pthread_cond_wait: | int | __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex) | { [...] | #if (defined lll_futex_wait_requeue_pi \ | && defined __ASSUME_REQUEUE_PI) ^^^^^^^^^^^^^^^^^^^ | /* If pi_flag remained 1 then it means that we had the lock and the mutex | but a spurious waker raced ahead of us. Give back the mutex before | going into wait again. */ | if (pi_flag) | { | __pthread_mutex_cond_lock_adjust (mutex); | __pthread_mutex_unlock_usercnt (mutex, 0); | } | pi_flag = USE_REQUEUE_PI (mutex); | | if (pi_flag) | { | err = lll_futex_wait_requeue_pi (&cond->__data.__futex, | futex_val, &mutex->__data.__lock, | pshared); Here's the requeue_pi syscall, but no runtime check for it. See __ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not present this code is not compiled in, resulting in the breakage. | | pi_flag = (err == 0); | } | else | #endif | /* Wait until woken by signal or broadcast. */ | lll_futex_wait (&cond->__data.__futex, futex_val, pshared); Looking a the kernel's kernel/futex.c: | long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, | u32 __user *uaddr2, u32 val2, u32 val3) | { ... | switch (cmd) { | case FUTEX_LOCK_PI: | case FUTEX_UNLOCK_PI: | case FUTEX_TRYLOCK_PI: | case FUTEX_WAIT_REQUEUE_PI: | case FUTEX_CMP_REQUEUE_PI: | if (!futex_cmpxchg_enabled) | return -ENOSYS; | } This is the corresponding code for the runtime checks in the glibc. The runtime check for: | case FUTEX_WAIT_REQUEUE_PI: | case FUTEX_CMP_REQUEUE_PI: has been added in v3.4-rc1~195^2~1. The glibc however only uses lll_futex_wait_requeue_pi on PI mutexes, which are runtime checked during pthread_mutex_init(). This patch removes the "undef __ASSUME_REQUEUE_PI" to fix the issue. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> --- ChangeLog | 5 +++++ sysdeps/unix/sysv/linux/arm/kernel-features.h | 1 - 2 files changed, 5 insertions(+), 1 deletion(-)
Comments
On Mon, 20 Jul 2015, Marc Kleine-Budde wrote:
> This patch removes the "undef __ASSUME_REQUEUE_PI" to fix the issue.
Nothing in your explanation says why this should be ARM-specific.
You seem to be asserting that the semantics of __ASSUME_REQUEUE_PI are
something other than "PI futexes work with FUTEX_*_REQUEUE_PI support in
the kernel" - or at least, something other than (given the 2.6.32 minimum
kernel version) "futex_atomic_cmpxchg_inatomic is supported".
What are the semantics of __ASSUME_REQUEUE_PI? Your analysis only refers
to one use of that macro; you need to examine all such uses. Then, if the
reference to futex_atomic_cmpxchg_inatomic in
sysdeps/unix/sysv/linux/kernel-features.h is incorrect, you need to update
the comment on the definition there of __ASSUME_REQUEUE_PI so that it
accurately reflects the circumstances under which it is correct, or
incorrect, for that macro to be defined on particular architectures.
(That might mean adding an explanation of the semantics of the macro to
that comment that makes clear why it's desirable to define it even if PI
futexes might not be supported at runtime, for example.) Then, each
architecture that may undefine __ASSUME_REQUEUE_PI needs to be updated to
be consistent with the clarified (and documented) understanding of the
semantics of that macro - rather than just changing one architecture in
isolation without justification for the undefining being correct for the
other architectures.
diff --git a/ChangeLog b/ChangeLog index 126501718892..564155935b82 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,8 @@ +2015-07d20 Marc Kleine-Budde <mkl@pengutronix.de> + + * sysdeps/unix/sysv/linux/arm/kernel-features.h: fix PI futex + breakge: remove __ASSUME_REQUEUE_PI + 2015-07-16 David S. Miller <davem@davemloft.net> * sysdeps/sparc/fpu/libm-test-ulps: Regenerated. diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h b/sysdeps/unix/sysv/linux/arm/kernel-features.h index cb407dbc300c..c27bc383a839 100644 --- a/sysdeps/unix/sysv/linux/arm/kernel-features.h +++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h @@ -39,6 +39,5 @@ configuration. */ #if __LINUX_KERNEL_VERSION < 0x030E03 # undef __ASSUME_FUTEX_LOCK_PI -# undef __ASSUME_REQUEUE_PI # undef __ASSUME_SET_ROBUST_LIST #endif