[0/4] Simplify internal single-threaded usage

Message ID 20220608164941.3325089-1-adhemerval.zanella@linaro.org

Adhemerval Zanella June 8, 2022, 4:49 p.m. UTC
  Glibc currently has three different internal ways to check if a process
is single-threaded: the exported global variable __libc_single_threaded,
the internal-only __libc_multiple_threads, and the variant used by some
architectures and allocated in the TCB, multiple_threads.  Each port
can also define SINGLE_THREAD_BY_GLOBAL to select between
__libc_multiple_threads and multiple_threads.

__libc_single_threaded and __libc_multiple_threads have essentially
the same semantics: both are global variables whose value is not reset
if/when the process becomes single-threaded again.  The issue with using
__libc_single_threaded internally is that, since it is accessed through
a copy relocation, both values must be updated.  This is fixed in the
first patch.

The second patch replaces __libc_multiple_threads with
__libc_single_threaded, while also fixing a bug where architectures that
define SINGLE_THREAD_BY_GLOBAL did not actually enable the optimization.

The third patch replaces multiple_threads with __libc_single_threaded,
to simplify a possible single-thread lock optimization.  On most
architectures, accessing an internal global variable should be as fast
as going through the TCB (using the TCB seems to be faster only on legacy
ABIs that require an extra code sequence to materialize a global access,
such as i686 and sparc).

i686 seems to be the only architecture that optimizes the lock access
directly by reimplementing the atomic operations.  In this case some
are rewritten using compiler builtins along with the SINGLE_THREAD_P
macro, while unused macros are simply removed (for instance
atomic_add_zero).  The idea is to just phase out this specific atomic
implementation in favor of compiler builtins and move the single-thread
optimization to be arch-neutral.

The last patch removes the single-thread.h header and moves the
definition to the internal sys/single_threaded.h, so there is now only
one place to add such an optimization.

Adhemerval Zanella (4):
  misc: Optimize internal usage of __libc_single_threaded
  Replace __libc_multiple_threads with __libc_single_threaded
  Remove usage of TLS_MULTIPLE_THREADS_IN_TCB
  Remove single-thread.h

 dlfcn/dlsym.c                               |   1 +
 elf/libc_early_init.c                       |   9 +
 include/dlfcn.h                             |   4 +
 include/sys/single_threaded.h               |  20 +-
 misc/single_threaded.c                      |   2 +
 misc/tst-atomic.c                           |   1 +
 nptl/Makefile                               |   1 -
 nptl/allocatestack.c                        |  12 -
 nptl/descr.h                                |  17 +-
 nptl/libc_multiple_threads.c                |  28 --
 nptl/pthread_cancel.c                       |   9 +-
 nptl/pthread_create.c                       |  11 +-
 sysdeps/generic/single-thread.h             |  25 --
 sysdeps/i386/htl/tcb-offsets.sym            |   1 -
 sysdeps/i386/nptl/tcb-offsets.sym           |   1 -
 sysdeps/i386/nptl/tls.h                     |   4 +-
 sysdeps/ia64/nptl/tcb-offsets.sym           |   1 -
 sysdeps/ia64/nptl/tls.h                     |   2 -
 sysdeps/mach/hurd/i386/tls.h                |   4 +-
 sysdeps/mach/hurd/sysdep-cancel.h           |   5 -
 sysdeps/nios2/nptl/tcb-offsets.sym          |   1 -
 sysdeps/or1k/nptl/tls.h                     |   2 -
 sysdeps/powerpc/nptl/tcb-offsets.sym        |   3 -
 sysdeps/powerpc/nptl/tls.h                  |   3 -
 sysdeps/s390/nptl/tcb-offsets.sym           |   1 -
 sysdeps/s390/nptl/tls.h                     |   6 +-
 sysdeps/sh/nptl/tcb-offsets.sym             |   1 -
 sysdeps/sh/nptl/tls.h                       |   2 -
 sysdeps/sparc/nptl/tcb-offsets.sym          |   1 -
 sysdeps/sparc/nptl/tls.h                    |   2 +-
 sysdeps/unix/sysdep.h                       |   2 +-
 sysdeps/unix/sysv/linux/aarch64/sysdep.h    |   2 -
 sysdeps/unix/sysv/linux/alpha/sysdep.h      |   2 -
 sysdeps/unix/sysv/linux/arc/sysdep.h        |   2 -
 sysdeps/unix/sysv/linux/arm/sysdep.h        |   2 -
 sysdeps/unix/sysv/linux/hppa/sysdep.h       |   2 -
 sysdeps/unix/sysv/linux/microblaze/sysdep.h |   2 -
 sysdeps/unix/sysv/linux/s390/sysdep.h       |   3 -
 sysdeps/unix/sysv/linux/single-thread.h     |  44 ---
 sysdeps/unix/sysv/linux/x86_64/sysdep.h     |   2 -
 sysdeps/x86/atomic-machine.h                | 327 +++-----------------
 sysdeps/x86_64/nptl/tcb-offsets.sym         |   1 -
 42 files changed, 96 insertions(+), 475 deletions(-)
 delete mode 100644 nptl/libc_multiple_threads.c
 delete mode 100644 sysdeps/generic/single-thread.h
 delete mode 100644 sysdeps/unix/sysv/linux/single-thread.h