From patchwork Sun Feb 25 17:25:55 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 56727
Received: from gnu-cfl-3..
 (localhost [IPv6:::1]) by gnu-cfl-3.localdomain (Postfix) with ESMTP id 2ED4F7402BB;
 Sun, 25 Feb 2024 09:25:56 -0800 (PST)
From: "H.J. Lu"
To: libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org, goldstein.w.n@gmail.com, fweimer@redhat.com
Subject: [PATCH v9 0/1] x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers
Date: Sun, 25 Feb 2024 09:25:55 -0800
Message-ID: <20240225172556.3330952-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.43.2
MIME-Version: 1.0
List-Id: Libc-alpha mailing list

Changes in v9:

1. Drop the first patch, which has been merged into the master branch.
2. Correct BEFORE_TLSDESC_CALL placement in test.
3. Define MOD only if undefined in test.

Changes in v8:

1. Remove malloc-for-test.c and move malloc to tst-gnu2-tls2.c.
2. Add malloc_counter to verify that malloc in tst-gnu2-tls2.c is called
for the TLSDESC call.
3. Add BEFORE_TLSDESC_CALL and AFTER_TLSDESC_CALL.
4. Use /* ... */ in assembly code comments.

Changes in v7:

1. Generate malloc-for-test.map at build time to get the correct version
map for malloc.

Changes in v6:

1. Drop Tile registers.

Changes in v5:

1. Also preserve Tile registers.
2. Add an error check in i386 dl-tlsdesc-dynamic.h.

Changes in v4:

1. Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
the ld.so trampoline.
2. Also save x87 FPU stack registers for TLSDESC_CALL and TLS_DESC_CALL.
3. Change i386 _dl_tlsdesc_dynamic to IFUNC.
4. Rename GLRO(dl_x86_64_tlsdesc_dynamic) to GLRO(dl_x86_tlsdesc_dynamic)
for both i386 and x86-64.
5. Update the testcase for i386 with a simple malloc interceptor.

Changes in v3:

1. Don't add GLRO(dl_x86_64_tlsdesc_dynamic) to libc.a.

Changes in v2:

1. Add GLRO(dl_x86_64_runtime_resolve) to optimize
elf_machine_runtime_setup.

---
Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
the ld.so trampoline.  This fixes BZ #31371.

The compiler generates the following instruction sequence for GNU2 dynamic
TLS access:

	leaq	tls_var@TLSDESC(%rip), %rax
	call	*tls_var@TLSCALL(%rax)

or

	leal	tls_var@TLSDESC(%ebx), %eax
	call	*tls_var@TLSCALL(%eax)

The CALL instruction is transparent to the compiler, which assumes that
all registers, except for EFLAGS and RAX/EAX, are unchanged after the CALL.
When _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow
path.  __tls_get_addr is a normal function which doesn't preserve any
caller-saved registers.  _dl_tlsdesc_dynamic saved and restored the integer
caller-saved registers, but didn't preserve any other caller-saved
registers.  Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE,
XSAVE and XSAVEC to save and restore all caller-saved registers.  This
fixes BZ #31372.

Add GLRO(dl_x86_64_runtime_resolve) and GLRO(dl_x86_tlsdesc_dynamic) to
optimize elf_machine_runtime_setup.

H.J. Lu (1):
  x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers

 elf/Makefile                                 |  14 ++
 elf/tst-gnu2-tls2.c                          | 122 ++++++++++++
 elf/tst-gnu2-tls2.h                          |  36 ++++
 elf/tst-gnu2-tls2mod0.c                      |  31 +++
 elf/tst-gnu2-tls2mod1.c                      |  31 +++
 elf/tst-gnu2-tls2mod2.c                      |  31 +++
 sysdeps/i386/dl-machine.h                    |   2 +-
 sysdeps/i386/dl-tlsdesc-dynamic.h            | 190 +++++++++++++++++++
 sysdeps/i386/dl-tlsdesc.S                    | 115 +++++------
 sysdeps/x86/Makefile                         |   7 +-
 sysdeps/x86/cpu-features.c                   |  56 +++++-
 sysdeps/x86/dl-procinfo.c                    |  16 ++
 sysdeps/{x86_64 => x86}/features-offsets.sym |   2 +
 sysdeps/x86/sysdep.h                         |   6 +
 sysdeps/x86/tst-gnu2-tls2.c                  |  20 ++
 sysdeps/x86_64/Makefile                      |   2 +-
 sysdeps/x86_64/dl-machine.h                  |  19 +-
 sysdeps/x86_64/dl-procinfo.c                 |  16 ++
 sysdeps/x86_64/dl-tlsdesc-dynamic.h          | 166 ++++++++++++++++
 sysdeps/x86_64/dl-tlsdesc.S                  | 108 ++++-------
 sysdeps/x86_64/dl-trampoline-save.h          |  34 ++++
 sysdeps/x86_64/dl-trampoline-state.h         |  51 +++++
 sysdeps/x86_64/dl-trampoline.S               |  20 +-
 sysdeps/x86_64/dl-trampoline.h               |  34 +---
 24 files changed, 916 insertions(+), 213 deletions(-)
 create mode 100644 elf/tst-gnu2-tls2.c
 create mode 100644 elf/tst-gnu2-tls2.h
 create mode 100644 elf/tst-gnu2-tls2mod0.c
 create mode 100644 elf/tst-gnu2-tls2mod1.c
 create mode 100644 elf/tst-gnu2-tls2mod2.c
 create mode 100644 sysdeps/i386/dl-tlsdesc-dynamic.h
 rename sysdeps/{x86_64 => x86}/features-offsets.sym (89%)
 create mode 100644 sysdeps/x86/tst-gnu2-tls2.c
 create mode 100644 sysdeps/x86_64/dl-tlsdesc-dynamic.h
 create mode 100644 sysdeps/x86_64/dl-trampoline-save.h
 create mode 100644 sysdeps/x86_64/dl-trampoline-state.h
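
Illustration (not part of the patch): the cover letter says that
_dl_tlsdesc_dynamic gets IFUNC variants for FNSAVE, FXSAVE, XSAVE and
XSAVEC.  The sketch below shows one plausible way such a variant could be
chosen from CPU capabilities.  The feature bits, the enum and both function
names are hypothetical, and the preference order (XSAVEC, then XSAVE, then
FXSAVE, with FNSAVE as the x87-only fallback) is an assumption for
illustration.  The actual selection lives in sysdeps/x86/cpu-features.c and
may differ.

```c
/* Hypothetical feature bits for illustration only.  glibc tests its own
   cpu_features structure; these flags are not glibc's API.  */
enum
{
  FEAT_FXSR = 1 << 0,
  FEAT_XSAVE = 1 << 1,
  FEAT_XSAVEC = 1 << 2
};

/* The four state-save instructions named in the cover letter.  */
enum save_variant
{
  SAVE_FNSAVE,
  SAVE_FXSAVE,
  SAVE_XSAVE,
  SAVE_XSAVEC
};

/* Return the most capable save instruction that FEATURES allows.
   The priority order is an assumption, not taken from the patch.  */
enum save_variant
pick_save_variant (unsigned int features)
{
  if (features & FEAT_XSAVEC)
    return SAVE_XSAVEC;
  if (features & FEAT_XSAVE)
    return SAVE_XSAVE;
  if (features & FEAT_FXSR)
    return SAVE_FXSAVE;
  /* Fallback when only the x87 state can be saved.  */
  return SAVE_FNSAVE;
}

/* Map a variant to a printable name, e.g. for logging.  */
const char *
save_variant_name (enum save_variant v)
{
  static const char *const names[] =
    { "fnsave", "fxsave", "xsave", "xsavec" };
  return names[v];
}
```

For example, pick_save_variant (FEAT_FXSR | FEAT_XSAVE) yields SAVE_XSAVE
under these assumptions, and pick_save_variant (0) falls back to
SAVE_FNSAVE.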