From patchwork Fri Jun 10 16:35:48 2022
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
To: libc-alpha@sourceware.org, Wilco Dijkstra
Subject: [PATCH v2 0/4] Simplify internal single-threaded usage
Date: Fri, 10 Jun 2022 13:35:48 -0300
Message-Id: <20220610163552.3587064-1-adhemerval.zanella@linaro.org>
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 55022

glibc currently has three different internal ways to check whether a
process is single-threaded: the exported global variable
__libc_single_threaded, the internal-only __libc_multiple_threads, and
multiple_threads, a variant allocated in the TCB and used by some
architectures.
Also, each port can define SINGLE_THREAD_BY_GLOBAL to select between
__libc_multiple_threads and multiple_threads.

__libc_single_threaded and __libc_multiple_threads have essentially the
same semantics: both are global variables whose value is not reset once
the process becomes multi-threaded.

The issue with using __libc_single_threaded internally is that, because
it is accessed through a copy relocation, both values must be updated.
This is fixed in the first patch.

The second patch replaces __libc_multiple_threads with
__libc_single_threaded, and also fixes a bug where architectures that
define SINGLE_THREAD_BY_GLOBAL do not enable the optimization.

The third patch replaces multiple_threads with __libc_single_threaded,
to simplify a possible single-thread lock optimization. On most
architectures, accessing an internal global variable should be as fast
as going through the TCB (it seems that using the TCB would be faster
only on legacy ABIs that require an extra code sequence to materialize
a global access, such as i686 and sparc; however, this is mitigated
when SINGLE_THREAD_P is accessed in large code blocks).

x86 is the only architecture that optimizes the lock access directly by
reimplementing atomic operations. In this case, the affected
implementations are rewritten to use the SINGLE_THREAD_P macro, while
some unused macros are simply removed (for instance atomic_add_zero).
The idea is to phase out this specific atomic implementation in favor
of compiler builtins and make the single-thread optimization
arch-neutral.

The last patch removes the single-thread.h header and moves the
definition to the internal sys/single_threaded.h, so there is now only
one place to add such optimizations.

v2:
* Add RTLD_DEFAULT support for __libc_dlsym and use it instead of
  ___dlsym.
* Simplify the x86 atomic macros.
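
For readers unfamiliar with the pattern, here is a rough sketch of the
kind of single-thread fast path this series centralizes. It is not code
from the patches: the example_* names and the EXAMPLE_SINGLE_THREAD_P
macro are hypothetical, the memory orders are only illustrative, and it
uses the public __libc_single_threaded variable from
<sys/single_threaded.h> (falling back to "always multi-threaded" where
that header is missing) rather than the internal SINGLE_THREAD_P macro.

```c
#include <assert.h>
#include <stdatomic.h>

/* Use glibc's public flag when the header exists; otherwise
   conservatively assume other threads may be running.  */
#if defined __has_include
# if __has_include (<sys/single_threaded.h>)
#  include <sys/single_threaded.h>
#  define EXAMPLE_SINGLE_THREAD_P() (__libc_single_threaded != 0)
# endif
#endif
#ifndef EXAMPLE_SINGLE_THREAD_P
# define EXAMPLE_SINGLE_THREAD_P() 0
#endif

/* Hypothetical reference counter; not a glibc type.  */
struct example_ref
{
  atomic_uint count;
};

/* Take a reference, skipping the atomic read-modify-write while the
   process is still single-threaded.  */
static inline void
example_ref_acquire (struct example_ref *r)
{
  if (EXAMPLE_SINGLE_THREAD_P ())
    {
      unsigned int old
        = atomic_load_explicit (&r->count, memory_order_relaxed);
      atomic_store_explicit (&r->count, old + 1, memory_order_relaxed);
    }
  else
    atomic_fetch_add_explicit (&r->count, 1, memory_order_acq_rel);
}

/* Drop a reference; return 1 when the last reference is gone.  */
static inline int
example_ref_release (struct example_ref *r)
{
  unsigned int old;

  if (EXAMPLE_SINGLE_THREAD_P ())
    {
      old = atomic_load_explicit (&r->count, memory_order_relaxed);
      assert (old > 0);
      atomic_store_explicit (&r->count, old - 1, memory_order_relaxed);
    }
  else
    {
      old = atomic_fetch_sub_explicit (&r->count, 1,
                                       memory_order_acq_rel);
      assert (old > 0);
    }
  return old == 1;
}
```

Both branches leave the counter in the same state; the single-threaded
branch only avoids the locked read-modify-write instruction, which is
the saving the x86 macros currently obtain by hand.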

Adhemerval Zanella (4):
  misc: Optimize internal usage of __libc_single_threaded
  Replace __libc_multiple_threads with __libc_single_threaded
  Remove usage of TLS_MULTIPLE_THREADS_IN_TCB
  Remove single-thread.h

 elf/dl-libc.c                               |  20 +-
 elf/libc_early_init.c                       |   9 +
 include/sys/single_threaded.h               |  20 +-
 misc/single_threaded.c                      |   2 +
 misc/tst-atomic.c                           |   1 +
 nptl/Makefile                               |   1 -
 nptl/allocatestack.c                        |  12 -
 nptl/descr.h                                |  17 +-
 nptl/libc_multiple_threads.c                |  28 --
 nptl/pthread_cancel.c                       |   9 +-
 nptl/pthread_create.c                       |  11 +-
 sysdeps/generic/single-thread.h             |  25 -
 sysdeps/i386/htl/tcb-offsets.sym            |   1 -
 sysdeps/i386/nptl/tcb-offsets.sym           |   1 -
 sysdeps/i386/nptl/tls.h                     |   4 +-
 sysdeps/ia64/nptl/tcb-offsets.sym           |   1 -
 sysdeps/ia64/nptl/tls.h                     |   2 -
 sysdeps/mach/hurd/i386/tls.h                |   4 +-
 sysdeps/mach/hurd/sysdep-cancel.h           |   5 -
 sysdeps/nios2/nptl/tcb-offsets.sym          |   1 -
 sysdeps/or1k/nptl/tls.h                     |   2 -
 sysdeps/powerpc/nptl/tcb-offsets.sym        |   3 -
 sysdeps/powerpc/nptl/tls.h                  |   3 -
 sysdeps/s390/nptl/tcb-offsets.sym           |   1 -
 sysdeps/s390/nptl/tls.h                     |   6 +-
 sysdeps/sh/nptl/tcb-offsets.sym             |   1 -
 sysdeps/sh/nptl/tls.h                       |   2 -
 sysdeps/sparc/nptl/tcb-offsets.sym          |   1 -
 sysdeps/sparc/nptl/tls.h                    |   2 +-
 sysdeps/unix/sysdep.h                       |   2 +-
 sysdeps/unix/sysv/linux/aarch64/sysdep.h    |   2 -
 sysdeps/unix/sysv/linux/alpha/sysdep.h      |   2 -
 sysdeps/unix/sysv/linux/arc/sysdep.h        |   2 -
 sysdeps/unix/sysv/linux/arm/sysdep.h        |   2 -
 sysdeps/unix/sysv/linux/hppa/sysdep.h       |   2 -
 sysdeps/unix/sysv/linux/microblaze/sysdep.h |   2 -
 sysdeps/unix/sysv/linux/s390/sysdep.h       |   3 -
 sysdeps/unix/sysv/linux/single-thread.h     |  44 --
 sysdeps/unix/sysv/linux/x86_64/sysdep.h     |   2 -
 sysdeps/x86/atomic-machine.h                | 484 ++++++--------------
 sysdeps/x86_64/nptl/tcb-offsets.sym         |   1 -
 41 files changed, 199 insertions(+), 544 deletions(-)
 delete mode 100644 nptl/libc_multiple_threads.c
 delete mode 100644 sysdeps/generic/single-thread.h
 delete mode 100644 sysdeps/unix/sysv/linux/single-thread.h