From: André Almeida
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, linux-kernel@vger.kernel.org, Steven Rostedt, Sebastian Andrzej Siewior
Cc: fweimer@redhat.com, shuah@kernel.org, Davidlohr Bueso, libc-alpha@sourceware.org, corbet@lwn.net, linux-api@vger.kernel.org, z.figura12@gmail.com, André Almeida, Nicholas Piggin, malteskarupke@fastmail.fm, acme@kernel.org, linux-kselftest@vger.kernel.org, Andrey Semashev, joel@joelfernandes.org, Peter Oskolkov, kernel@collabora.com, krisman@collabora.com, pgriffais@valvesoftware.com
Subject: [PATCH v5 00/11] Add futex2 syscalls
Date: Thu, 8 Jul 2021 21:13:17 -0300
Message-Id: <20210709001328.329716-1-andrealmeid@collabora.com>

This patchset is an implementation of the futex2 interface on top of the
existing futex.c code.

* What happened to the current futex()?

futex() is implemented using a multiplexed interface that doesn't scale
well and gives headaches to people. We don't want to add more features
there.

* New features at futex2()

** NUMA-awareness

In the current implementation, all futex kernel-side infrastructure is
stored on a single node. Given that, every futex() call issued by a
processor that isn't located on that node pays a memory access penalty.

** Variable sized futexes

Futexes are used to implement atomic operations in userspace.
Supporting 8, 16, 32 and 64 bit sized futexes allows user libraries to
implement all those sizes in a performant way. Thanks Boost devs for
feedback: https://lists.boost.org/Archives/boost/2021/05/251508.php

Embedded systems or anything with memory constraints could benefit from
using smaller sizes for the futex userspace integer.

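As a point of reference for the sections above, here is a minimal sketch of
how userspace reaches the existing futex() today. It is not part of this
patchset. The wrapper names legacy_futex(), legacy_futex_wait() and
legacy_futex_wake() are hypothetical, and the sketch assumes a Linux target
where SYS_futex is defined (x86_64, for instance). It shows the single
multiplexed entry point, where the operation is picked by an opcode, and the
fact that the futex word is always a 32-bit integer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: one entry point for every operation. `op` selects
 * what the call does, and the meaning of `val` depends on `op`. */
static long legacy_futex(uint32_t *uaddr, int op, uint32_t val)
{
    /* No timeout, no second address, no val3 in this sketch. */
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Sleep while *uaddr == expected. Returns 0 when woken, or -1 with errno
 * set; errno is EAGAIN when *uaddr already differs from `expected`. */
static long legacy_futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return legacy_futex(uaddr, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, expected);
}

/* Wake up to `nr` waiters on uaddr. Returns the number of waiters woken. */
static long legacy_futex_wake(uint32_t *uaddr, int nr)
{
    assert(nr >= 0); /* the kernel receives the count through `val` */
    return legacy_futex(uaddr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG, (uint32_t)nr);
}
```

With a word holding 1, legacy_futex_wait(&word, 0) returns -1 with errno set
to EAGAIN instead of sleeping, because the value does not match, and
legacy_futex_wake(&word, 1) returns 0 when nobody is waiting. A library that
wants to wait on an 8, 16 or 64 bit atomic cannot pass it to these calls
directly, which is the gap the variable sized futexes aim to close.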
** Wait on multiple futexes

Proton's (a set of compatibility tools to run Windows games) fork of Wine
benefits from this feature to implement WaitForMultipleObjects from
Win32 in a performant way. Native game engines will benefit from this as
well, given that this is a common wait pattern for games.

* The interface

The new interface has one syscall per operation, as opposed to the
current multiplexing one. The details can be found in the following
patches, but this is a high level summary of what the interface can do:

 - Supports wake/wait semantics, as in futex()
 - Supports requeue operations, similar to FUTEX_CMP_REQUEUE, but with
   individual flags for each address
 - Supports waiting for a vector of futexes, using a new syscall named
   futex_waitv()
 - The following features will be implemented in next patchset versions:
   - Supports variable sized futexes (8bits, 16bits, 32bits and 64bits)
   - Supports NUMA-awareness operations, where the user can specify on
     which memory node they would like to operate

* The patchset

Given that futex2 reuses futex code, the patches make futex.c functions
public and modify them as needed.

This patchset can also be found at my git tree:

https://gitlab.collabora.com/tonyk/linux/-/tree/futex2-dev

 - Patch 1: Implements 32bit wait/wake

 - Patches 2-3: Implement waitv and requeue.

 - Patch 4: Adds a documentation file which details the interface and
   the internal implementation.

 - Patches 5-10: Selftests for all operations along with perf
   support for futex2.

 - Patch 11: Proof of concept of waking threads at waitpid(), not to be
   merged as it is.

* Testing

** Stability

 - glibc[1]: nptl's low level locking was modified to use the futex2 API
   (except for PI). All nptl/ tests passed.

 - Proton's Wine: Proton/Wine was modified in order to use futex2() for
   the emulation of Windows NT sync mechanisms based on futex, called
   "fsync". Triple-A games with huge CPU loads and tons of parallel jobs
   worked as expected when compared with the previous FUTEX_WAIT_MULTIPLE
   implementation at futex(). Some games issue 42k futex2() calls per
   second.

 - perf: The perf benchmark tests can also be used to stress the
   interface, and they can be found in this patchset.

[1] https://gitlab.collabora.com/tonyk/glibc/-/tree/futex2-dev

** Performance

 - Using perf, no significant difference was measured when comparing
   futex() and futex2() for the following benchmarks: hash, wake and
   wake-parallel.

 - I measured a 15% overhead for perf's requeue benchmark, comparing
   futex2() to futex(). The requeue patch provides more details about why
   this happens and how to overcome it.

* Changelog

Changes from v4:
 - Use existing futex.c code when possible
 - Cleaned up cover letter, check v4 for a more verbose version

v4: https://lore.kernel.org/lkml/20210603195924.361327-1-andrealmeid@collabora.com/

André Almeida (11):
  futex2: Implement wait and wake functions
  futex2: Implement vectorized wait
  futex2: Implement requeue operation
  docs: locking: futex2: Add documentation
  selftests: futex2: Add wake/wait test
  selftests: futex2: Add timeout test
  selftests: futex2: Add wouldblock test
  selftests: futex2: Add waitv test
  selftests: futex2: Add requeue test
  perf bench: Add futex2 benchmark tests
  kernel: Enable waitpid() for futex2

 Documentation/locking/futex2.rst | 185 ++++++
 Documentation/locking/index.rst | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl | 4 +
 arch/x86/entry/syscalls/syscall_64.tbl | 4 +
 include/linux/compat.h | 23 +
 include/linux/futex.h | 103 ++++
 include/linux/syscalls.h | 8 +
 include/uapi/asm-generic/unistd.h | 11 +-
 include/uapi/linux/futex.h | 27 +
 init/Kconfig | 7 +
 kernel/Makefile | 1 +
 kernel/fork.c | 2 +
 kernel/futex.c | 111 +---
 kernel/futex2.c | 566 ++++++++++++++++++
 kernel/sys_ni.c | 9 +
 tools/arch/x86/include/asm/unistd_64.h | 12 +
 tools/perf/bench/bench.h | 4 +
 tools/perf/bench/futex-hash.c | 24 +-
 tools/perf/bench/futex-requeue.c | 57 +-
 tools/perf/bench/futex-wake-parallel.c | 41 +-
 tools/perf/bench/futex-wake.c | 37 +-
 tools/perf/bench/futex.h | 47 ++
 tools/perf/builtin-bench.c | 18 +-
 .../selftests/futex/functional/.gitignore | 3 +
 .../selftests/futex/functional/Makefile | 6 +-
 .../futex/functional/futex2_requeue.c | 164 +++++
 .../selftests/futex/functional/futex2_wait.c | 195 ++++++
 .../selftests/futex/functional/futex2_waitv.c | 154 +++++
 .../futex/functional/futex_wait_timeout.c | 24 +-
 .../futex/functional/futex_wait_wouldblock.c | 33 +-
 .../testing/selftests/futex/functional/run.sh | 6 +
 .../selftests/futex/include/futex2test.h | 112 ++++
 32 files changed, 1865 insertions(+), 134 deletions(-)
 create mode 100644 Documentation/locking/futex2.rst
 create mode 100644 kernel/futex2.c
 create mode 100644 tools/testing/selftests/futex/functional/futex2_requeue.c
 create mode 100644 tools/testing/selftests/futex/functional/futex2_wait.c
 create mode 100644 tools/testing/selftests/futex/functional/futex2_waitv.c
 create mode 100644 tools/testing/selftests/futex/include/futex2test.h

--
2.32.0