Message ID | 20200612185122.327860-1-andrealmeid@collabora.com |
---|---|
Headers |
Return-Path: <libc-alpha-bounces@sourceware.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1BD59395B435; Fri, 12 Jun 2020 18:51:26 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1BD59395B435 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1591987886; bh=FeeEG+nZmQdyGHkMDJj0v/CLc/IPMaPh/lSElOWzBV0=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=j1ROkhalv4PdDrMEHyJzTboW+gSBOg/uJZPiRRxm8IG++3QSAAOfSP84Uf0B0sTzi ViDyDusCreuV7I3h+/TOTozWGk29Wwd88tyNDPHuqTyv9MyHBiuISOG4vNa67rnz1T RiGMxVXqRXQl8qsqdrqYQh/K6xYXUlFCNM8Qargs= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from bhuna.collabora.co.uk (bhuna.collabora.co.uk [46.235.227.227]) by sourceware.org (Postfix) with ESMTPS id C375E3851C36 for <libc-alpha@sourceware.org>; Fri, 12 Jun 2020 18:51:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org C375E3851C36 Received: from [127.0.0.1] (localhost [127.0.0.1]) (Authenticated sender: tonyk) with ESMTPSA id 52F862A5160 To: linux-kernel@vger.kernel.org, tglx@linutronix.de, peterz@infradead.org Subject: [RFC 0/4] futex2: Add new futex interface Date: Fri, 12 Jun 2020 15:51:18 -0300 Message-Id: <20200612185122.327860-1-andrealmeid@collabora.com> X-Mailer: git-send-email 2.27.0 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> List-Unsubscribe: <http://sourceware.org/mailman/options/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe> List-Archive: <https://sourceware.org/pipermail/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help> List-Subscribe: <http://sourceware.org/mailman/listinfo/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=subscribe> From: =?utf-8?q?Andr=C3=A9_Almeida_via_Libc-alpha?= <libc-alpha@sourceware.org> Reply-To: =?utf-8?q?Andr=C3=A9_Almeida?= <andrealmeid@collabora.com> Cc: fweimer@redhat.com, malteskarupke@web.de, libc-alpha@sourceware.org, linux-api@vger.kernel.org, mingo@redhat.com, andrealmeid@collabora.com, dvhart@infradead.org, kernel@collabora.com, krisman@collabora.com, pgriffais@valvesoftware.com Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" <libc-alpha-bounces@sourceware.org> |
Series | futex2: Add new futex interface | |
Message
André Almeida
June 12, 2020, 6:51 p.m. UTC
Hello, This RFC is a followup to the previous discussion initiated from my last patch "futex: Implement mechanism to wait on any of several futexes"[1]. As stated in the thread, the correct approach to move forward with the wait multiple operation would be to create a new syscall that would have all new cool features. The first patch adds the new interface and just translate the call for the old interface, without implementing new features. The goal here is to establish the interface and to check if everyone is happy with this API. The rest of patches are selftests to show the interface in action. I have the following questions: - Has anyone stared worked on a implementation of this interface? If yes, it would be nice to share the progress so we don't have duplicated work. - What suggestions do you have to implement this? Start from scratch or reuse the most code possible? - The interface seems correct and implements the requirements asked by you? - The proposed interface uses ktime_t type for absolute timeout, and I assumed that it should use values in a nsec resolution. If this is true, we have some problems with i386 ABI, please check out the COMPAT_32BIT_TIME implementation in patch 1 for more details. I haven't added a time64 implementation yet, until this is clarified. - Is expected to have a x32 ABI implementation as well? In the case of wait and wake, we could use the same as x86_64 ABI. However, for the waitv (aka wait on multiple futexes) we would need a proper x32 entry since we are dealing with 32bit pointers. Those are the cool new features that this syscall should address some day: - Operate with variable bit size futexes, not restricted to 32: 8, 16 and 64 - Wait on multiple futexes, using the following semantics: struct futex_wait { void *uaddr; unsigned long val; unsigned long flags; }; sys_futex_waitv(struct futex_wait *waiters, unsigned int nr_waiters, unsigned long flags, ktime_t *timo); - Have NUMA optimizations: if FUTEX_NUMA_FLAG is present, the `void *uaddr` argument won't be a u{8, 16, 32, 64} value anymore, but a struct containing a NUMA node hint: struct futex32_numa { u32 value __attribute__ ((aligned (8))); u32 hint; }; struct futex64_numa { u64 value __attribute__ ((aligned (16))); u64 hint; }; Thanks, André André Almeida (4): futex2: Add new futex interface selftests: futex: Add futex2 wake/wait test selftests: futex: Add futex2 timeout test selftests: futex: Add futex2 wouldblock test MAINTAINERS | 2 +- arch/x86/entry/syscalls/syscall_32.tbl | 2 + arch/x86/entry/syscalls/syscall_64.tbl | 2 + include/linux/syscalls.h | 9 ++ include/uapi/asm-generic/unistd.h | 7 +- include/uapi/linux/futex.h | 10 ++ init/Kconfig | 7 ++ kernel/Makefile | 1 + kernel/futex2.c | 97 ++++++++++++++++ kernel/sys_ni.c | 5 + tools/include/uapi/asm-generic/unistd.h | 7 +- .../selftests/futex/functional/.gitignore | 1 + .../selftests/futex/functional/Makefile | 4 +- .../selftests/futex/functional/futex2_wait.c | 106 ++++++++++++++++++ .../futex/functional/futex_wait_timeout.c | 27 ++++- .../futex/functional/futex_wait_wouldblock.c | 34 +++++- .../testing/selftests/futex/functional/run.sh | 3 + .../selftests/futex/include/futex2test.h | 50 +++++++++ 18 files changed, 361 insertions(+), 13 deletions(-) create mode 100644 kernel/futex2.c create mode 100644 tools/testing/selftests/futex/functional/futex2_wait.c create mode 100644 tools/testing/selftests/futex/include/futex2test.h
Comments
On Fri, Jun 12, 2020 at 11:53 AM André Almeida via Libc-alpha <libc-alpha@sourceware.org> wrote: > > Hello, > > This RFC is a followup to the previous discussion initiated from my last > patch "futex: Implement mechanism to wait on any of several futexes"[1]. > As stated in the thread, the correct approach to move forward with the > wait multiple operation would be to create a new syscall that would have > all new cool features. > > The first patch adds the new interface and just translate the call for > the old interface, without implementing new features. The goal here is > to establish the interface and to check if everyone is happy with this > API. The rest of patches are selftests to show the interface in action. > I have the following questions: > > - Has anyone stared worked on a implementation of this interface? If > yes, it would be nice to share the progress so we don't have duplicated > work. > > - What suggestions do you have to implement this? Start from scratch or > reuse the most code possible? > > - The interface seems correct and implements the requirements asked by you? > > - The proposed interface uses ktime_t type for absolute timeout, and I > assumed that it should use values in a nsec resolution. If this is true, > we have some problems with i386 ABI, please check out the > COMPAT_32BIT_TIME implementation in patch 1 for more details. I > haven't added a time64 implementation yet, until this is clarified. > > - Is expected to have a x32 ABI implementation as well? In the case of > wait and wake, we could use the same as x86_64 ABI. However, for the > waitv (aka wait on multiple futexes) we would need a proper x32 entry > since we are dealing with 32bit pointers. x32 should be able to use the same i386 compat systcall entry. Will it be problem? > Those are the cool new features that this syscall should address some > day: > > - Operate with variable bit size futexes, not restricted to 32: > 8, 16 and 64 > > - Wait on multiple futexes, using the following semantics: > > struct futex_wait { > void *uaddr; > unsigned long val; > unsigned long flags; > }; > > sys_futex_waitv(struct futex_wait *waiters, unsigned int nr_waiters, > unsigned long flags, ktime_t *timo); > > - Have NUMA optimizations: if FUTEX_NUMA_FLAG is present, the `void *uaddr` > argument won't be a u{8, 16, 32, 64} value anymore, but a struct > containing a NUMA node hint: > > struct futex32_numa { > u32 value __attribute__ ((aligned (8))); > u32 hint; > }; > > struct futex64_numa { > u64 value __attribute__ ((aligned (16))); > u64 hint; > }; > H.J.
Hello H.J., On 6/12/20 4:35 PM, H.J. Lu wrote: > On Fri, Jun 12, 2020 at 11:53 AM André Almeida via Libc-alpha > <libc-alpha@sourceware.org> wrote: >> - Is expected to have a x32 ABI implementation as well? In the case of >> wait and wake, we could use the same as x86_64 ABI. However, for the >> waitv (aka wait on multiple futexes) we would need a proper x32 entry >> since we are dealing with 32bit pointers. > > x32 should be able to use the same i386 compat systcall entry. Will it be > problem? > Indeed, you are right. In the last iteration of this work, I had some problems dealing with x32 ABI, but this new interface doesn't have the same problem anymore. We can use the same sys_waitv_time64 interface for both i368 and x32. >> > > H.J. > Thanks, André
On Fri, Jun 12, 2020 at 8:51 PM André Almeida <andrealmeid@collabora.com> wrote: > - The proposed interface uses ktime_t type for absolute timeout, and I > assumed that it should use values in a nsec resolution. If this is true, > we have some problems with i386 ABI, please check out the > COMPAT_32BIT_TIME implementation in patch 1 for more details. I > haven't added a time64 implementation yet, until this is clarified. ktime_t is not part of the uapi headers, and has always been considered an implementation detail of the kernel so far. I would argue it should stay that way. The most sensible alternatives would be to either use a "__u64 *timeout" argument for a relative timeout, or a "struct __kernel_timespec *timeout" for an absolute timeout. old_time32_t also makes no sense for multiple reasons: - It's another kernel internal type and not part of the uapi headers - your time32 call has different calling conventions from your time64 version, not just a different type. - there should be no need to add syscalls that are known to be buggy when there is a replacement type that does not have that bug. > - Is expected to have a x32 ABI implementation as well? In the case of > wait and wake, we could use the same as x86_64 ABI. However, for the > waitv (aka wait on multiple futexes) we would need a proper x32 entry > since we are dealing with 32bit pointers. For new syscalls, I'd actually recommend not having a separate entry point, but just checking 'if (in_compat_syscall())' inside of the implementation to pick one behavior vs the other when accessing the user pointers. This keeps the implementation simpler and avoids assigning a new x32 syscall number that would be different from all the other architectures. Arnd
Hello Arnd, On 6/25/20 3:48 AM, Arnd Bergmann wrote: > On Fri, Jun 12, 2020 at 8:51 PM André Almeida <andrealmeid@collabora.com> wrote: > >> - The proposed interface uses ktime_t type for absolute timeout, and I >> assumed that it should use values in a nsec resolution. If this is true, >> we have some problems with i386 ABI, please check out the >> COMPAT_32BIT_TIME implementation in patch 1 for more details. I >> haven't added a time64 implementation yet, until this is clarified. > > ktime_t is not part of the uapi headers, and has always been considered > an implementation detail of the kernel so far. I would argue it should > stay that way. The most sensible alternatives would be to either use > a "__u64 *timeout" argument for a relative timeout, or a > "struct __kernel_timespec *timeout" for an absolute timeout. > > old_time32_t also makes no sense for multiple reasons: > > - It's another kernel internal type and not part of the uapi headers > - your time32 call has different calling conventions from your time64 > version, not just a different type. > - there should be no need to add syscalls that are known to be buggy > when there is a replacement type that does not have that bug. > Thanks for the input. As stated by tglx at [1], "supporting relative timeouts is wrong to begin with", my next patch will use "struct __kernel_timespec *timeout" for an absolute timeout. >> - Is expected to have a x32 ABI implementation as well? In the case of >> wait and wake, we could use the same as x86_64 ABI. However, for the >> waitv (aka wait on multiple futexes) we would need a proper x32 entry >> since we are dealing with 32bit pointers. > > For new syscalls, I'd actually recommend not having a separate > entry point, but just checking 'if (in_compat_syscall())' inside of the > implementation to pick one behavior vs the other when accessing > the user pointers. This keeps the implementation simpler and > avoids assigning a new x32 syscall number that would be different > from all the other architectures. > Cool, this will make the code cleaner. > Arnd > Thanks, André [1] https://lkml.org/lkml/2019/7/31/1499