From patchwork Thu Feb 1 19:36:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Jeanson
X-Patchwork-Id: 56663
From: Michael Jeanson
To: libc-alpha@sourceware.org
Cc: Michael Jeanson, Florian Weimer, Carlos O'Donell, Mathieu Desnoyers
Subject: [PATCH v7 0/8] Extend rseq support
Date: Thu, 1 Feb 2024 14:36:40 -0500
Message-Id: <20240201193648.584917-1-mjeanson@efficios.com>
X-Mailer: git-send-email 2.34.1
List-Id: Libc-alpha mailing list

This series rebases the standalone "Add rseq extensible ABI" patch on
current master and adds an accelerated getcpu() implementation that uses
the rseq extensible ABI, with initial support for aarch64 and x86_64.

On an aarch64 system (Snapdragon 8cx Gen 3), which lacks a vDSO for
getcpu(), we measured an improvement from 130 ns to 1 ns. On x86_64
(i7-8550U), which has a vDSO, we measured a more modest improvement
from 10 ns to 2 ns.

Tested on i386, aarch64 and x86_64.
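The cover letter does not say how the getcpu() timings were taken. As a rough sketch only, a per-call figure of that kind could be obtained by timing a tight loop of sched_getcpu() calls with clock_gettime(). The helper name measure_getcpu_ns and the iteration count are hypothetical and not part of this series, and the result includes loop overhead, so it is only indicative.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <time.h>

/* Hypothetical helper, not part of the series: return the average cost
   in nanoseconds of one sched_getcpu() call, measured over `iterations`
   back-to-back calls.  Loop overhead is included in the result.  */
static double
measure_getcpu_ns (long iterations)
{
  struct timespec start, end;
  volatile int sink = 0;  /* Keeps the calls from being optimized away.  */

  assert (iterations > 0);

  clock_gettime (CLOCK_MONOTONIC, &start);
  for (long i = 0; i < iterations; i++)
    sink += sched_getcpu ();
  clock_gettime (CLOCK_MONOTONIC, &end);

  /* Elapsed wall-clock time between the two samples, in nanoseconds.  */
  double elapsed_ns = (double) (end.tv_sec - start.tv_sec) * 1e9
                      + (double) (end.tv_nsec - start.tv_nsec);
  return elapsed_ns / (double) iterations;
}
```

Calling measure_getcpu_ns with a large iteration count (for example ten million) against a glibc built with and without this series would give a before/after comparison. Pinning the thread to one CPU and repeating the run should reduce noise.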
Cc: Florian Weimer
Cc: Carlos O'Donell
Cc: Mathieu Desnoyers

Mathieu Desnoyers (2):
  x86-64: Add rseq_load32_load32_relaxed
  aarch64: Add rseq_load32_load32_relaxed

Michael Jeanson (6):
  nptl: fix potential merge of __rseq_* relro symbols
  Add rseq extensible ABI support
  nptl: Add public __rseq_feature_size symbol
  nptl: Add features to internal 'struct rseq_area'
  nptl: Add rseq internal utils
  Linux: Use rseq to accelerate getcpu

 csu/Makefile                                  |   2 +-
 csu/libc-tls.c                                |  66 ++++++-
 csu/rseq-sizes.sym                            |  11 ++
 elf/Makefile                                  |   1 +
 elf/dl-rseq-symbols.S                         |  72 ++++++++
 elf/dl-tls.c                                  |  62 +++++++
 elf/rtld_static_init.c                        |  12 ++
 manual/threads.texi                           |   8 +
 nptl/descr.h                                  |  20 +-
 nptl/pthread_create.c                         |   2 +-
 sysdeps/generic/dl-rseq.h                     |  26 +++
 sysdeps/generic/ldsodefs.h                    |  12 ++
 sysdeps/i386/nptl/tcb-access.h                |  56 ++++++
 sysdeps/nptl/dl-tls_init_tp.c                 |  16 +-
 sysdeps/nptl/tcb-access.h                     |   5 +
 sysdeps/unix/sysv/linux/Makefile              |  10 +
 sysdeps/unix/sysv/linux/Versions              |   3 +
 sysdeps/unix/sysv/linux/aarch64/ld.abilist    |   1 +
 .../unix/sysv/linux/aarch64/rseq-internal.h   | 173 ++++++++++++++++++
 sysdeps/unix/sysv/linux/alpha/ld.abilist      |   1 +
 sysdeps/unix/sysv/linux/arc/ld.abilist        |   1 +
 sysdeps/unix/sysv/linux/arm/be/ld.abilist     |   1 +
 sysdeps/unix/sysv/linux/arm/le/ld.abilist     |   1 +
 sysdeps/unix/sysv/linux/csky/ld.abilist       |   1 +
 sysdeps/unix/sysv/linux/dl-parse_auxv.h       |   6 +
 sysdeps/unix/sysv/linux/getcpu.c              |  32 +++-
 sysdeps/unix/sysv/linux/hppa/ld.abilist       |   1 +
 sysdeps/unix/sysv/linux/i386/ld.abilist       |   1 +
 .../unix/sysv/linux/loongarch/lp64/ld.abilist |   1 +
 .../unix/sysv/linux/m68k/coldfire/ld.abilist  |   1 +
 .../unix/sysv/linux/m68k/m680x0/ld.abilist    |   1 +
 sysdeps/unix/sysv/linux/microblaze/ld.abilist |   1 +
 .../unix/sysv/linux/mips/mips32/ld.abilist    |   1 +
 .../sysv/linux/mips/mips64/n32/ld.abilist     |   1 +
 .../sysv/linux/mips/mips64/n64/ld.abilist     |   1 +
 sysdeps/unix/sysv/linux/nios2/ld.abilist      |   1 +
 sysdeps/unix/sysv/linux/or1k/ld.abilist       |   1 +
 .../sysv/linux/powerpc/powerpc32/ld.abilist   |   1 +
 .../linux/powerpc/powerpc64/be/ld.abilist     |   1 +
 .../linux/powerpc/powerpc64/le/ld.abilist     |   1 +
 sysdeps/unix/sysv/linux/riscv/rv32/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/rseq-internal.h       |  89 ++++++++-
 .../unix/sysv/linux/s390/s390-32/ld.abilist   |   1 +
 .../unix/sysv/linux/s390/s390-64/ld.abilist   |   1 +
 sysdeps/unix/sysv/linux/sched_getcpu.c        |   3 +-
 sysdeps/unix/sysv/linux/sh/be/ld.abilist      |   1 +
 sysdeps/unix/sysv/linux/sh/le/ld.abilist      |   1 +
 .../unix/sysv/linux/sparc/sparc32/ld.abilist  |   1 +
 .../unix/sysv/linux/sparc/sparc64/ld.abilist  |   1 +
 sysdeps/unix/sysv/linux/sys/rseq.h            |   4 +
 .../unix/sysv/linux/tst-rseq-disable-static.c |   1 +
 sysdeps/unix/sysv/linux/tst-rseq-disable.c    |  20 +-
 .../unix/sysv/linux/tst-rseq-nptl-static.c    |   1 +
 sysdeps/unix/sysv/linux/tst-rseq-static.c     |   1 +
 sysdeps/unix/sysv/linux/tst-rseq.c            |  24 ++-
 sysdeps/unix/sysv/linux/tst-rseq.h            |   9 +-
 sysdeps/unix/sysv/linux/x86_64/64/ld.abilist  |   1 +
 .../unix/sysv/linux/x86_64/rseq-internal.h    | 109 +++++++++++
 sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist |   1 +
 sysdeps/x86_64/nptl/tcb-access.h              |  56 ++++++
 61 files changed, 886 insertions(+), 56 deletions(-)
 create mode 100644 csu/rseq-sizes.sym
 create mode 100644 elf/dl-rseq-symbols.S
 create mode 100644 sysdeps/generic/dl-rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-disable-static.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-nptl-static.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-static.c
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/rseq-internal.h