Message ID | 20240201193648.584917-1-mjeanson@efficios.com |
---|---|
Headers | From: Michael Jeanson <mjeanson@efficios.com>; To: libc-alpha@sourceware.org; Cc: Michael Jeanson <mjeanson@efficios.com>, Florian Weimer <fweimer@redhat.com>, Carlos O'Donell <carlos@redhat.com>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>; Subject: [PATCH v7 0/8] Extend rseq support; Date: Thu, 1 Feb 2024 14:36:40 -0500 |
Series | Extend rseq support |
Message
Michael Jeanson
Feb. 1, 2024, 7:36 p.m. UTC
This series rebases the standalone "Add rseq extensible ABI" patch on
current master and adds an accelerated getcpu() implementation using the
rseq extensible ABI with initial support for aarch64 and x86_64.

On an aarch64 system (Snapdragon 8cx Gen 3) which lacks a vDSO for
getcpu() we measured an improvement from 130 ns to 1 ns while on x86_64
(i7-8550U) which has a vDSO we measured a more modest improvement from
10 ns to 2 ns.

Tested on i386, aarch64 and x86_64.

Cc: Florian Weimer <fweimer@redhat.com>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Mathieu Desnoyers (2):
  x86-64: Add rseq_load32_load32_relaxed
  aarch64: Add rseq_load32_load32_relaxed

Michael Jeanson (6):
  nptl: fix potential merge of __rseq_* relro symbols
  Add rseq extensible ABI support
  nptl: Add public __rseq_feature_size symbol
  nptl: Add features to internal 'struct rseq_area'
  nptl: Add rseq internal utils
  Linux: Use rseq to accelerate getcpu

 csu/Makefile | 2 +-
 csu/libc-tls.c | 66 ++++++-
 csu/rseq-sizes.sym | 11 ++
 elf/Makefile | 1 +
 elf/dl-rseq-symbols.S | 72 ++++++++
 elf/dl-tls.c | 62 +++++++
 elf/rtld_static_init.c | 12 ++
 manual/threads.texi | 8 +
 nptl/descr.h | 20 +-
 nptl/pthread_create.c | 2 +-
 sysdeps/generic/dl-rseq.h | 26 +++
 sysdeps/generic/ldsodefs.h | 12 ++
 sysdeps/i386/nptl/tcb-access.h | 56 ++++++
 sysdeps/nptl/dl-tls_init_tp.c | 16 +-
 sysdeps/nptl/tcb-access.h | 5 +
 sysdeps/unix/sysv/linux/Makefile | 10 +
 sysdeps/unix/sysv/linux/Versions | 3 +
 sysdeps/unix/sysv/linux/aarch64/ld.abilist | 1 +
 .../unix/sysv/linux/aarch64/rseq-internal.h | 173 ++++++++++++++++++
 sysdeps/unix/sysv/linux/alpha/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/arc/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/arm/be/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/arm/le/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/csky/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/dl-parse_auxv.h | 6 +
 sysdeps/unix/sysv/linux/getcpu.c | 32 +++-
 sysdeps/unix/sysv/linux/hppa/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/i386/ld.abilist | 1 +
 .../unix/sysv/linux/loongarch/lp64/ld.abilist | 1 +
 .../unix/sysv/linux/m68k/coldfire/ld.abilist | 1 +
 .../unix/sysv/linux/m68k/m680x0/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/microblaze/ld.abilist | 1 +
 .../unix/sysv/linux/mips/mips32/ld.abilist | 1 +
 .../sysv/linux/mips/mips64/n32/ld.abilist | 1 +
 .../sysv/linux/mips/mips64/n64/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/nios2/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/or1k/ld.abilist | 1 +
 .../sysv/linux/powerpc/powerpc32/ld.abilist | 1 +
 .../linux/powerpc/powerpc64/be/ld.abilist | 1 +
 .../linux/powerpc/powerpc64/le/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/riscv/rv32/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/rseq-internal.h | 89 ++++++++-
 .../unix/sysv/linux/s390/s390-32/ld.abilist | 1 +
 .../unix/sysv/linux/s390/s390-64/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/sched_getcpu.c | 3 +-
 sysdeps/unix/sysv/linux/sh/be/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/sh/le/ld.abilist | 1 +
 .../unix/sysv/linux/sparc/sparc32/ld.abilist | 1 +
 .../unix/sysv/linux/sparc/sparc64/ld.abilist | 1 +
 sysdeps/unix/sysv/linux/sys/rseq.h | 4 +
 .../unix/sysv/linux/tst-rseq-disable-static.c | 1 +
 sysdeps/unix/sysv/linux/tst-rseq-disable.c | 20 +-
 .../unix/sysv/linux/tst-rseq-nptl-static.c | 1 +
 sysdeps/unix/sysv/linux/tst-rseq-static.c | 1 +
 sysdeps/unix/sysv/linux/tst-rseq.c | 24 ++-
 sysdeps/unix/sysv/linux/tst-rseq.h | 9 +-
 sysdeps/unix/sysv/linux/x86_64/64/ld.abilist | 1 +
 .../unix/sysv/linux/x86_64/rseq-internal.h | 109 +++++++++++
 sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist | 1 +
 sysdeps/x86_64/nptl/tcb-access.h | 56 ++++++
 61 files changed, 886 insertions(+), 56 deletions(-)
 create mode 100644 csu/rseq-sizes.sym
 create mode 100644 elf/dl-rseq-symbols.S
 create mode 100644 sysdeps/generic/dl-rseq.h
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-disable-static.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-nptl-static.c
 create mode 100644 sysdeps/unix/sysv/linux/tst-rseq-static.c
 create mode 100644 sysdeps/unix/sysv/linux/x86_64/rseq-internal.h
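
For readers unfamiliar with why an rseq-based lookup can be cheaper than a system call or vDSO call, here is a minimal user-space sketch. It is not code from this series: it does not use the internal rseq_load32_load32_relaxed helpers, and `current_cpu` is a hypothetical name chosen for illustration. It assumes a glibc that ships <sys/rseq.h> with the public `__rseq_offset` and `__rseq_size` symbols (glibc 2.35 or later) and a compiler that provides `__builtin_thread_pointer()`; when either is missing, or when rseq registration did not happen, it falls back to sched_getcpu(). It reads only the CPU number; getcpu() also reports a NUMA node, which this sketch does not attempt.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Only use the rseq area when the libc header is available. */
#if defined(__has_include)
# if __has_include(<sys/rseq.h>)
#  include <sys/rseq.h>
#  define HAVE_SYS_RSEQ_H 1
# endif
#endif

/* ... and when the compiler can hand us the thread pointer. */
#if defined(HAVE_SYS_RSEQ_H) && defined(__has_builtin)
# if __has_builtin(__builtin_thread_pointer)
#  define CAN_READ_RSEQ 1
# endif
#endif

/* Hypothetical helper, for illustration only: return the CPU the calling
   thread is running on, preferring a plain load from the rseq area. */
int current_cpu(void)
{
#ifdef CAN_READ_RSEQ
    /* __rseq_size is 0 when glibc did not register rseq for this thread. */
    if (__rseq_size > 0) {
        const volatile struct rseq *area = (const volatile struct rseq *)
            ((const char *) __builtin_thread_pointer() + __rseq_offset);
        /* The kernel keeps cpu_id up to date; negative values mean the
           field is uninitialized or registration failed. */
        int32_t cpu = (int32_t) area->cpu_id;
        if (cpu >= 0)
            return cpu;
    }
#endif
    /* Fallback path: let the C library answer. */
    return sched_getcpu();
}
```

A caller would simply invoke `current_cpu()` from any thread. The value is only a hint, since the thread can migrate to another CPU immediately after the load.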
Comments
On 2024-02-01 14:36, Michael Jeanson wrote:
> This series rebases the standalone "Add rseq extensible ABI" patch on
> current master and adds an accelerated getcpu() implementation using the
> rseq extensible ABI with initial support for aarch64 and x86_64.
>
> On an aarch64 system (Snapdragon 8cx Gen 3) which lacks a vDSO for
> getcpu() we measured an improvement from 130 ns to 1 ns while on x86_64
> (i7-8550U) which has a vDSO we measured a more modest improvement from
> 10 ns to 2 ns.
>
> Tested on i386, aarch64 and x86_64.

The failures reported by the Linaro-TCWG-CI on arm seem to be only an
issue of symbol sorting in the abilist files. Is there tooling to
regenerate those for all architectures?

Thanks,

Michael
On 02/02/24 12:40, Michael Jeanson wrote:
> On 2024-02-01 14:36, Michael Jeanson wrote:
>> This series rebases the standalone "Add rseq extensible ABI" patch on
>> current master and adds an accelerated getcpu() implementation using the
>> rseq extensible ABI with initial support for aarch64 and x86_64.
>>
>> On an aarch64 system (Snapdragon 8cx Gen 3) which lacks a vDSO for
>> getcpu() we measured an improvement from 130 ns to 1 ns while on x86_64
>> (i7-8550U) which has a vDSO we measured a more modest improvement from
>> 10 ns to 2 ns.
>>
>> Tested on i386, aarch64 and x86_64.
>
> The failures reported by the Linaro-TCWG-CI on arm seem to be only an
> issue of symbol sorting in the abilist files. Is there tooling to
> regenerate those for all architectures?

The 'make update-abi' on the build folder will sort this out.