[v2,00/22] futex: splitup and waitv syscall

Message ID 20210923171111.300673-1-andrealmeid@collabora.com


André Almeida Sept. 23, 2021, 5:10 p.m. UTC
This v2 is a follow-up to Peter's patchset that addresses feedback from
the Linux Plumbers Conference session about new futex syscalls.

v1: https://lore.kernel.org/lkml/20210915140710.596174479@infradead.org/
- Added a clockid argument in sys_futex_waitv()
	- This required some changes in the timeout init
- Added test for wouldblock
- Added documentation file
- Fixed error path order for futex_wait_multiple()
- Return error if FUTEX_32 is not set for a waiter
	- Extended futex_waitv() selftest to cover error paths like this

Original cover letter for this syscall (extracted from

This patchset introduces the futex_waitv syscall, which enables userspace
to wait on an array of futexes and wake on any of them.

* Use case

The use case of this syscall is to allow low-level locking libraries to
wait for multiple locks at the same time. This is especially useful for
emulating Windows' WaitForMultipleObjects. A futex_waitv()-based solution
has been used for some time in Proton's Wine (a compatibility layer to
run Windows games on Linux). Compared to a solution that uses eventfd(),
futex was able to reduce CPU utilization for games, and even increase
frames per second for some games. This happens because eventfd doesn't
scale very well for a huge number of read, write and poll calls compared
to futex. Native game engines will benefit from this as well, given that
this wait pattern is common for games.

* The interface

This is what the interface looks like:

  futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes,
              unsigned int flags, struct timespec *timo, clockid_t clockid)

  struct futex_waitv {
          __u64 val;
          __u64 uaddr;
          __u32 flags;
          __u32 __reserved;
  };

struct futex_waitv uses explicit padding, so we can use it on all
architectures. The __reserved field is used for the padding and should
always be 0, but it may be repurposed in the future for some extension. If
userspace has 32-bit pointers, it should do an explicit cast to make sure
the upper bits are zeroed. uintptr_t does the trick and works for both
32-bit and 64-bit pointers. The documentation patch provides more detailed
information about the interface.
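
As a rough illustration of the pointer cast and the padding rule described
above, here is a minimal userspace sketch. The names struct waitv_entry and
waitv_entry_init() are hypothetical and not part of this series; the struct
only mirrors the layout shown above so the example builds without the new
uapi header. The flags value is left to the caller (FUTEX_32 in practice).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of struct futex_waitv from this cover letter. */
struct waitv_entry {
	uint64_t val;       /* expected value of the futex word */
	uint64_t uaddr;     /* address of the futex word, zero-extended */
	uint32_t flags;     /* per-waiter flags, supplied by the caller */
	uint32_t reserved;  /* explicit padding, must be 0 */
};

/* Explicit padding: the layout has no implicit holes on 32- or 64-bit ABIs. */
static_assert(sizeof(struct waitv_entry) == 24, "unexpected struct padding");

/*
 * Fill one waiter. Casting through uintptr_t first zero-extends a 32-bit
 * pointer, so the upper bits of uaddr are always zero.
 */
static inline void waitv_entry_init(struct waitv_entry *w,
				    const uint32_t *futex_word,
				    uint32_t expected, uint32_t flags)
{
	memset(w, 0, sizeof(*w));   /* clears the reserved field as well */
	w->val = expected;
	w->uaddr = (uint64_t)(uintptr_t)futex_word;
	w->flags = flags;
}
```

An array of entries filled this way is what would be handed to
futex_waitv() as the waiters argument, with nr_futexes set to the array
length.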

André Almeida (7):
  futex: Implement sys_futex_waitv()
  futex,x86: Wire up sys_futex_waitv()
  futex,arm: Wire up sys_futex_waitv()
  selftests: futex: Add sys_futex_waitv() test
  selftests: futex: Test sys_futex_waitv() timeout
  selftests: futex: Test sys_futex_waitv() wouldblock
  futex2: Documentation: Document futex_waitv() uAPI

Peter Zijlstra (15):
  futex: Move to kernel/futex/
  futex: Split out syscalls
  futex: Rename {,__}{,un}queue_me()
  futex: Rename futex_wait_queue_me()
  futex: Rename: queue_{,un}lock()
  futex: Rename __unqueue_futex()
  futex: Rename hash_futex()
  futex: Rename: {get,cmpxchg}_futex_value_locked()
  futex: Split out PI futex
  futex: Rename: hb_waiter_{inc,dec,pending}()
  futex: Rename: match_futex()
  futex: Rename mark_wake_futex()
  futex: Split out requeue
  futex: Split out wait/wake
  futex: Simplify double_lock_hb()

 Documentation/userspace-api/futex2.rst        |   81 +
 Documentation/userspace-api/index.rst         |    1 +
 MAINTAINERS                                   |    3 +-
 arch/arm/tools/syscall.tbl                    |    1 +
 arch/arm64/include/asm/unistd.h               |    2 +-
 arch/arm64/include/asm/unistd32.h             |    2 +
 arch/x86/entry/syscalls/syscall_32.tbl        |    1 +
 arch/x86/entry/syscalls/syscall_64.tbl        |    1 +
 include/linux/syscalls.h                      |    6 +
 include/uapi/asm-generic/unistd.h             |    5 +-
 include/uapi/linux/futex.h                    |   25 +
 kernel/Makefile                               |    2 +-
 kernel/futex.c                                | 4272 -----------------
 kernel/futex/Makefile                         |    3 +
 kernel/futex/core.c                           | 1176 +++++
 kernel/futex/futex.h                          |  295 ++
 kernel/futex/pi.c                             | 1233 +++++
 kernel/futex/requeue.c                        |  897 ++++
 kernel/futex/syscalls.c                       |  387 ++
 kernel/futex/waitwake.c                       |  708 +++
 kernel/sys_ni.c                               |    3 +
 .../selftests/futex/functional/.gitignore     |    1 +
 .../selftests/futex/functional/Makefile       |    3 +-
 .../futex/functional/futex_wait_timeout.c     |   21 +-
 .../futex/functional/futex_wait_wouldblock.c  |   41 +-
 .../selftests/futex/functional/futex_waitv.c  |  158 +
 .../testing/selftests/futex/functional/run.sh |    3 +
 .../selftests/futex/include/futex2test.h      |   31 +
 28 files changed, 5080 insertions(+), 4282 deletions(-)
 create mode 100644 Documentation/userspace-api/futex2.rst
 delete mode 100644 kernel/futex.c
 create mode 100644 kernel/futex/Makefile
 create mode 100644 kernel/futex/core.c
 create mode 100644 kernel/futex/futex.h
 create mode 100644 kernel/futex/pi.c
 create mode 100644 kernel/futex/requeue.c
 create mode 100644 kernel/futex/syscalls.c
 create mode 100644 kernel/futex/waitwake.c
 create mode 100644 tools/testing/selftests/futex/functional/futex_waitv.c
 create mode 100644 tools/testing/selftests/futex/include/futex2test.h