From patchwork Mon Jul 20 12:54:05 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Kleine-Budde X-Patchwork-Id: 7755 Received: (qmail 73462 invoked by alias); 20 Jul 2015 12:54:13 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 73448 invoked by uid 89); 20 Jul 2015 12:54:12 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=BAYES_00, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_NONE, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: metis.ext.pengutronix.de From: Marc Kleine-Budde To: libc-alpha@sourceware.org Cc: kernel@pengutronix.de, Marc Kleine-Budde Subject: [PATCH] [BZ 18463] ARM - fix PI mutex breakge Date: Mon, 20 Jul 2015 14:54:05 +0200 Message-Id: <1437396845-19469-1-git-send-email-mkl@pengutronix.de> X-SA-Exim-Connect-IP: 2001:67c:670:100:1d::7 X-SA-Exim-Mail-From: mkl@pengutronix.de X-SA-Exim-Scanned: No (on metis.ext.pengutronix.de); SAEximRunCond expanded to false X-PTX-Original-Recipient: libc-alpha@sourceware.org Since commit glibc-2.20~667: 47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894). PI mutexes are broken on ARM systems. In program with more than two threads waiting on a condition variable (which is guarded by a mutex with PTHREAD_PRIO_INHERIT set), the pthread_mutex_unlock() after pthread_cond_broadcast() may fail with EPERM. The above mentioned patch undefines __ASSUME_FUTEX_LOCK_PI, __ASSUME_REQUEUE_PI and __ASSUME_SET_ROBUST_LIST. Which means switching from assuming kernel support for these features to autodetection. For PI futex (__ASSUME_FUTEX_LOCK_PI) and robust mutexes (__ASSUME_SET_ROBUST_LIST) there is runtime detection but no fallback code. As a fallback is not possible without kernel support. Instead a proper error (ENOTSUP) code is returned by the glibc's pthread_mutex_init(). But there is no runtime detection for the REQUEUE_PI feature. The way PI futexes are implemented, REQUEUE_PI must be used, quoting the Kernel's ./Documentation/futex-requeue-pi.txt: > In order to ensure the rt_mutex has an owner if it has waiters, it > is necessary for both the requeue code, as well as the waiting code, > to be able to acquire the rt_mutex before returning to user space. > The requeue code cannot simply wake the waiter and leave it to > acquire the rt_mutex as it would open a race window between the ^^^^^^^^^^^ > requeue call returning to user space and the waiter waking and > starting to run. This is especially true in the uncontended case. > > The solution involves two new rt_mutex helper routines, > rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which > allow the requeue code to acquire an uncontended rt_mutex on behalf > of the waiter and to enqueue the waiter on a contended rt_mutex. > Two new system calls provide the kernel<->user interface to > requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI. ^^^^^^^^^^^^^^^^^^^^^ > > FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait() ^^^^^^^^^^^^^^^^^^^^^ > and pthread_cond_timedwait()) to block on the initial futex and wait > to be requeued to a PI-aware futex. The implementation is the > result of a high-speed collision between futex_wait() and > futex_lock_pi(), with some extra logic to check for the additional > wake-up scenarios. Looking at pthread_cond_wait: | int | __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex) | { [...] | #if (defined lll_futex_wait_requeue_pi \ | && defined __ASSUME_REQUEUE_PI) ^^^^^^^^^^^^^^^^^^^ | /* If pi_flag remained 1 then it means that we had the lock and the mutex | but a spurious waker raced ahead of us. Give back the mutex before | going into wait again. */ | if (pi_flag) | { | __pthread_mutex_cond_lock_adjust (mutex); | __pthread_mutex_unlock_usercnt (mutex, 0); | } | pi_flag = USE_REQUEUE_PI (mutex); | | if (pi_flag) | { | err = lll_futex_wait_requeue_pi (&cond->__data.__futex, | futex_val, &mutex->__data.__lock, | pshared); Here's the requeue_pi syscall, but no runtime check for it. See __ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not present this code is not compiled in, resulting in the breakage. | | pi_flag = (err == 0); | } | else | #endif | /* Wait until woken by signal or broadcast. */ | lll_futex_wait (&cond->__data.__futex, futex_val, pshared); Looking a the kernel's kernel/futex.c: | long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, | u32 __user *uaddr2, u32 val2, u32 val3) | { ... | switch (cmd) { | case FUTEX_LOCK_PI: | case FUTEX_UNLOCK_PI: | case FUTEX_TRYLOCK_PI: | case FUTEX_WAIT_REQUEUE_PI: | case FUTEX_CMP_REQUEUE_PI: | if (!futex_cmpxchg_enabled) | return -ENOSYS; | } This is the corresponding code for the runtime checks in the glibc. The runtime check for: | case FUTEX_WAIT_REQUEUE_PI: | case FUTEX_CMP_REQUEUE_PI: has been added in v3.4-rc1~195^2~1. The glibc however only uses lll_futex_wait_requeue_pi on PI mutexes, which are runtime checked during pthread_mutex_init(). This patch removes the "undef __ASSUME_REQUEUE_PI" to fix the issue. Signed-off-by: Marc Kleine-Budde --- ChangeLog | 5 +++++ sysdeps/unix/sysv/linux/arm/kernel-features.h | 1 - 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/ChangeLog b/ChangeLog index 126501718892..564155935b82 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,8 @@ +2015-07d20 Marc Kleine-Budde + + * sysdeps/unix/sysv/linux/arm/kernel-features.h: fix PI futex + breakge: remove __ASSUME_REQUEUE_PI + 2015-07-16 David S. Miller * sysdeps/sparc/fpu/libm-test-ulps: Regenerated. diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h b/sysdeps/unix/sysv/linux/arm/kernel-features.h index cb407dbc300c..c27bc383a839 100644 --- a/sysdeps/unix/sysv/linux/arm/kernel-features.h +++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h @@ -39,6 +39,5 @@ configuration. */ #if __LINUX_KERNEL_VERSION < 0x030E03 # undef __ASSUME_FUTEX_LOCK_PI -# undef __ASSUME_REQUEUE_PI # undef __ASSUME_SET_ROBUST_LIST #endif