Message ID | 20210621111650.1164689-1-kurt@linutronix.de |
---|---|
Headers |
Return-Path: <libc-alpha-bounces+patchwork=sourceware.org@sourceware.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6CFF73896C2D for <patchwork@sourceware.org>; Mon, 21 Jun 2021 11:20:20 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6CFF73896C2D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1624274420; bh=4ADEWPna9MUyn0ubV1mrC4wDQTiV/uKjVr5SpHwlen4=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=bgaVuS96McRUTl1cUhygJ5IYIQKMnvo7/E4TSL4woXAEIQN8RyJM0OouApuI8zIpc AZm6ldoRGT3otcZtjR1WJDIZ6t0WMFfFp2JuStJ2E0yOxYCJZE3ayKW8J/TZyAbMU4 +Ex3fdvMvBkLchw/bxsAwHxSMd8weCkKB19D5Rgc= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by sourceware.org (Postfix) with ESMTPS id 55F09393A425 for <libc-alpha@sourceware.org>; Mon, 21 Jun 2021 11:17:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 55F09393A425 To: libc-alpha@sourceware.org Subject: [PATCH RFC 0/3] nptl: Introduce and use FUTEX_LOCK_PI2 Date: Mon, 21 Jun 2021 13:16:47 +0200 Message-Id: <20210621111650.1164689-1-kurt@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_NUMSUBJECT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> List-Unsubscribe: <https://sourceware.org/mailman/options/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe> List-Archive: <https://sourceware.org/pipermail/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help> List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=subscribe> From: Kurt Kanzenbach via Libc-alpha <libc-alpha@sourceware.org> Reply-To: Kurt Kanzenbach <kurt@linutronix.de> Cc: Florian Weimer <fweimer@redhat.com>, Sebastian Andrzej Siewior <bigeasy@linutronix.de>, Kurt Kanzenbach <kurt@linutronix.de>, Thomas Gleixner <tglx@linutronix.de> Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" <libc-alpha-bounces+patchwork=sourceware.org@sourceware.org> |
Series |
nptl: Introduce and use FUTEX_LOCK_PI2
|
|
Message
Kurt Kanzenbach
June 21, 2021, 11:16 a.m. UTC
Hi, Linux real time application often make use of mutexes with priority inheritance enabled due to problems such as unbounded priority inversion. In addition, some applications make use of timeouts e.g., by utilizing pthread_mutex_clocklock(). However, the combination of priority inheritance enabled mutexes with timeouts based on CLOCK_MONOTONIC doesn't work. That is because the underlying Linux kernel futex interface doesn't support it, yet. Using CLOCK_REALTIME is possible, but it is subject to adjustments (NTP, PTP, ...). Therefore, Thomas proposed to add a new futex operation called FUTEX_LOCK_PI2 with support for selectable clocks [1]. That patch series is not merged, yet. Here is a proof of concept patch set adding FUTEX_LOCK_PI2 in order to support pthread_mutex_clocklock(MONOTONIC)/PI. The idea is to use futex_lock_pi2() by default, and fallback to futex_lock_pi() in case the kernel is too old. I think, you get the idea. There's also a bugreport for glibc with a test case: https://sourceware.org/bugzilla/show_bug.cgi?id=26801 Thoughts? Thanks, Kurt [1] - https://lkml.kernel.org/lkml/20210422194417.866740847@linutronix.de/ Kurt Kanzenbach (3): nptl: Introduce futex_lock_pi2() nptl: Use futex_lock_pi2() nptl: Include CLOCK_MONOTONIC in mutex tests nptl/pthread_mutex_timedlock.c | 24 ++++-- nptl/tst-mutexpi10.c | 8 ++ sysdeps/nptl/futex-internal.h | 94 +++++++++++++++++++++++ sysdeps/nptl/lowlevellock-futex.h | 1 + sysdeps/pthread/tst-mutex5.c | 3 +- sysdeps/pthread/tst-mutex9.c | 3 +- sysdeps/unix/sysv/linux/kernel-features.h | 8 ++ 7 files changed, 133 insertions(+), 8 deletions(-)
Comments
On 21/06/2021 08:16, Kurt Kanzenbach wrote: > Hi, > > Linux real time application often make use of mutexes with priority inheritance > enabled due to problems such as unbounded priority inversion. In addition, some > applications make use of timeouts e.g., by utilizing pthread_mutex_clocklock(). > > However, the combination of priority inheritance enabled mutexes with timeouts > based on CLOCK_MONOTONIC doesn't work. That is because the underlying Linux > kernel futex interface doesn't support it, yet. Using CLOCK_REALTIME is > possible, but it is subject to adjustments (NTP, PTP, ...). Therefore, Thomas > proposed to add a new futex operation called FUTEX_LOCK_PI2 with support for > selectable clocks [1]. That patch series is not merged, yet. > > Here is a proof of concept patch set adding FUTEX_LOCK_PI2 in order to support > pthread_mutex_clocklock(MONOTONIC)/PI. The idea is to use futex_lock_pi2() by > default, and fallback to futex_lock_pi() in case the kernel is too old. I think, > you get the idea. > > There's also a bugreport for glibc with a test case: > > https://sourceware.org/bugzilla/show_bug.cgi?id=26801 > > Thoughts? > > Thanks, > Kurt Currently we check for PI mutex support on pthread_mutex_init which basically check for futex_cmpxchg_enabled within kernel (so it fails only on a handful configurations). For FUTEX_LOCK_PI2 I think we will need to rework it, since we are moving the futex PI failure from pthread_mutex_init to pthread_mutex_{timed}lock. It means that we will need to remove the prio_inherit_missing() test on pthread_mutex_init and make the pthread_mutex_{timed}lock fail instead (not sure if we should use ENOTSUP or keep with current EINVAL). The proposed futex_lockpi2_supported() incurs in an extra syscall on every mutex slow path, we should avoid it. I would like also to avoid the CRIU issue as well, so I think it would be better to avoid any caching (as done by prio_inherit_missing()), and optimize the FUTEX_LOCK_PI2 to be used only when the timeout for clock different than CLOCK_REALTIME is required. So instead it would be better to move the logic solely on futex_lock_pi() (I am assuming FUTEX_LOCK_PI2 would be only added for futex_time64): static __always_inline int futex_lock_pi2_64 (int *futex_word, const struct __timespec64 *abtime, int private) { # if __ASSUME_FUTEX_LOCK_PI2 return INTERNAL_SYSCALL_CALL (futex_time64, futex_word, __lll_private_flag (FUTEX_LOCK_PI2, private), 0, abstime); # else if (abstime != NULL && clockid != CLOCK_REALTIME) return INTERNAL_SYSCALL_CALL (futex_time64, futex_word, __lll_private_flag (FUTEX_LOCK_PI2, private), 0, abstime); else { int err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, __lll_private_flag (FUTEX_LOCK_PI, private), 0, abstime); if (err == -ENOSYS) err = -EOVERFLOW; } # endif /* __ASSUME_FUTEX_LOCK_PI2 */ } static __always_inline int futex_lock_pi64 (int *futex_word, const struct __timespec64 *abstime, int private) { int err; #ifdef __ASSUME_TIME64_SYSCALLS err = futex_lock_pi2_64 (futex_word, abstime, private); #else /* __ASSUME_TIME64_SYSCALLS */ bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec) if (need_time64) { err = futex_lock_pi2_64 (futex_word, abstime, private); } else { struct timespec ts32; if (abstime != NULL) ts32 = valid_timespec64_to_timespec (*abstime); err = INTERNAL_SYSCALL_CALL (futex, futex_word, __lll_private_flag (FUTEX_LOCK_PI, private), 0, abstime != NULL ? &ts32 : NULL); } #endif /* __ASSUME_TIME64_SYSCALLS */ [...] } It would make the changes on pthread_mutex code minimal, it would be only to remove the extra check for clockid and adjust the comment. Also, as Joseph has noted the __ASSUME_FUTEX_LOCK_PI2 should be the first patch. It also does not make sense to add the __ASSUME_FUTEX_LOCK_PI2 on tests, they need to be kernel agnostic so you will need to handle a possible EINVAL/ENOSUP failure instead. > > [1] - https://lkml.kernel.org/lkml/20210422194417.866740847@linutronix.de/ > > Kurt Kanzenbach (3): > nptl: Introduce futex_lock_pi2() > nptl: Use futex_lock_pi2() > nptl: Include CLOCK_MONOTONIC in mutex tests > > nptl/pthread_mutex_timedlock.c | 24 ++++-- > nptl/tst-mutexpi10.c | 8 ++ > sysdeps/nptl/futex-internal.h | 94 +++++++++++++++++++++++ > sysdeps/nptl/lowlevellock-futex.h | 1 + > sysdeps/pthread/tst-mutex5.c | 3 +- > sysdeps/pthread/tst-mutex9.c | 3 +- > sysdeps/unix/sysv/linux/kernel-features.h | 8 ++ > 7 files changed, 133 insertions(+), 8 deletions(-) >
On Mon Jun 21 2021, Adhemerval Zanella wrote: > Currently we check for PI mutex support on pthread_mutex_init which basically > check for futex_cmpxchg_enabled within kernel (so it fails only on a handful > configurations). > > For FUTEX_LOCK_PI2 I think we will need to rework it, since we are moving > the futex PI failure from pthread_mutex_init to pthread_mutex_{timed}lock. > It means that we will need to remove the prio_inherit_missing() test on > pthread_mutex_init and make the pthread_mutex_{timed}lock fail instead > (not sure if we should use ENOTSUP or keep with current EINVAL). > > The proposed futex_lockpi2_supported() incurs in an extra syscall on every > mutex slow path, we should avoid it. Yes, sure. > I would like also to avoid the CRIU issue as well, so I think it would > be better to avoid any caching (as done by prio_inherit_missing()), > and optimize the FUTEX_LOCK_PI2 to be used only when the timeout for > clock different than CLOCK_REALTIME is required. OK. > > So instead it would be better to move the logic solely on futex_lock_pi() > (I am assuming FUTEX_LOCK_PI2 would be only added for futex_time64): The kernel also adds FUTEX_LOCK_PI2 for the old system call interface. > > static __always_inline int > futex_lock_pi2_64 (int *futex_word, const struct __timespec64 *abtime, > int private) > { > # if __ASSUME_FUTEX_LOCK_PI2 > return INTERNAL_SYSCALL_CALL (futex_time64, futex_word, > __lll_private_flag (FUTEX_LOCK_PI2, private), 0, > abstime); > # else > if (abstime != NULL && clockid != CLOCK_REALTIME) > return INTERNAL_SYSCALL_CALL (futex_time64, futex_word, > __lll_private_flag (FUTEX_LOCK_PI2, private), 0, > abstime); At this point __ASSUME_FUTEX_LOCK_PI2 is false meaning the kernel does not have FUTEX_LOCK_PI2 support. But, it calls FUTEX_LOCK_PI2 for clockid monotonic. This will result in ENOSYS unless it's an old kernel which is patched. Is that intended? > else > { > int err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, > __lll_private_flag (FUTEX_LOCK_PI, private), 0, > abstime); > if (err == -ENOSYS) > err = -EOVERFLOW; > } > # endif /* __ASSUME_FUTEX_LOCK_PI2 */ > } > > static __always_inline int > futex_lock_pi64 (int *futex_word, const struct __timespec64 *abstime, > int private) > { > int err; > #ifdef __ASSUME_TIME64_SYSCALLS > err = futex_lock_pi2_64 (futex_word, abstime, private); Makes sense. > #else /* __ASSUME_TIME64_SYSCALLS */ > bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec) > if (need_time64) > { > err = futex_lock_pi2_64 (futex_word, abstime, private); > } > else > { > struct timespec ts32; > if (abstime != NULL) > ts32 = valid_timespec64_to_timespec (*abstime); > > err = INTERNAL_SYSCALL_CALL (futex, futex_word, __lll_private_flag > (FUTEX_LOCK_PI, private), 0, > abstime != NULL ? &ts32 : NULL); > } > #endif /* __ASSUME_TIME64_SYSCALLS */ > [...] > } > > It would make the changes on pthread_mutex code minimal, it would be only to > remove the extra check for clockid and adjust the comment. Well, that's an interesting point. I think the current check has to stay, because there are two locking paths. Only the slow path calls futex_lock_pi_64() which may result in ENOSYS for clock monotonic. But, the fast path which doesn't call futex_lock_pi64() would succeed if the check is removed. Maybe something like this: sysdeps/nptl/futex-internal.h: |static __always_inline bool |futex_lockpi2_supported (void) |{ | return __ASSUME_FUTEX_LOCK_PI2; |} nptl/pthread_mutex_timedlock.c: | if (__glibc_unlikely (! futex_lockpi2_supported () && | clockid != CLOCK_REALTIME)) | return EINVAL; Or did I get something wrong? > > Also, as Joseph has noted the __ASSUME_FUTEX_LOCK_PI2 should be the first > patch. OK. > It also does not make sense to add the __ASSUME_FUTEX_LOCK_PI2 on > tests, they need to be kernel agnostic so you will need to handle a > possible EINVAL/ENOSUP failure instead. Agreed. Thanks for your feedback. I'll rework it accordingly. Thanks, Kurt
* Kurt Kanzenbach: > At this point __ASSUME_FUTEX_LOCK_PI2 is false meaning the kernel does > not have FUTEX_LOCK_PI2 support. But, it calls FUTEX_LOCK_PI2 for > clockid monotonic. This will result in ENOSYS unless it's an old kernel > which is patched. Is that intended? That's not quite how the __ASSUME_* macros work. If not defined, it just means that glibc won't assume that the kernel feature is there. It can still use it (the run-time kernel might have it after all), but glibc has to check for the feature and has to compile in some sort of fallback code. Thanks, Florian
On Tue Jun 22 2021, Florian Weimer wrote: > * Kurt Kanzenbach: > >> At this point __ASSUME_FUTEX_LOCK_PI2 is false meaning the kernel does >> not have FUTEX_LOCK_PI2 support. But, it calls FUTEX_LOCK_PI2 for >> clockid monotonic. This will result in ENOSYS unless it's an old kernel >> which is patched. Is that intended? > > That's not quite how the __ASSUME_* macros work. If not defined, it > just means that glibc won't assume that the kernel feature is there. It > can still use it (the run-time kernel might have it after all), but > glibc has to check for the feature and has to compile in some sort of > fallback code. OK, makes sense. Thanks for the explanation. Thanks, Kurt
On 22/06/2021 04:26, Kurt Kanzenbach wrote: >> #else /* __ASSUME_TIME64_SYSCALLS */ >> bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec) >> if (need_time64) >> { >> err = futex_lock_pi2_64 (futex_word, abstime, private); >> } >> else >> { >> struct timespec ts32; >> if (abstime != NULL) >> ts32 = valid_timespec64_to_timespec (*abstime); >> >> err = INTERNAL_SYSCALL_CALL (futex, futex_word, __lll_private_flag >> (FUTEX_LOCK_PI, private), 0, >> abstime != NULL ? &ts32 : NULL); >> } >> #endif /* __ASSUME_TIME64_SYSCALLS */ >> [...] >> } >> >> It would make the changes on pthread_mutex code minimal, it would be only to >> remove the extra check for clockid and adjust the comment. > > Well, that's an interesting point. I think the current check has to > stay, because there are two locking paths. Only the slow path calls > futex_lock_pi_64() which may result in ENOSYS for clock monotonic. But, > the fast path which doesn't call futex_lock_pi64() would succeed if the > check is removed. > > Maybe something like this: > > sysdeps/nptl/futex-internal.h: > |static __always_inline bool > |futex_lockpi2_supported (void) > |{ > | return __ASSUME_FUTEX_LOCK_PI2; > |} > > nptl/pthread_mutex_timedlock.c: > | if (__glibc_unlikely (! futex_lockpi2_supported () && > | clockid != CLOCK_REALTIME)) > | return EINVAL; > > Or did I get something wrong? Besides what Florian has explained about __ASSUME_* macros, another issue we have static initialization for pthread_mutex. So the syscall probe only works for dynamic initialization (where caller issue a pthread_mutex_init). I view this as an inconsistent behavior and I don't have a straightforward solution that won't result in a performance penalization for common cases (it would require to probe for FUTEX_LOCK_PI2 *and* FUTEX_LOCK_PI). Instead I think we should move the possible error on the slow path and let the kernel advertise it any missing support.
On Tue Jun 22 2021, Adhemerval Zanella wrote: > Besides what Florian has explained about __ASSUME_* macros, another issue we > have static initialization for pthread_mutex. So the syscall probe only > works for dynamic initialization (where caller issue a pthread_mutex_init). > > I view this as an inconsistent behavior and I don't have a straightforward > solution that won't result in a performance penalization for common cases > (it would require to probe for FUTEX_LOCK_PI2 *and* FUTEX_LOCK_PI). > > Instead I think we should move the possible error on the slow path and let > the kernel advertise it any missing support. OK, good. That makes the timedlock implementation straight forward. I implemented it and started testing now with different Linux kernels. A new version of patch set should be ready next week or so. Thanks, Kurt