From patchwork Mon Feb 26 14:37:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 56729
Received: from gnu-cfl-3..
	(localhost [IPv6:::1]) by gnu-cfl-3.localdomain (Postfix) with ESMTP id C0574740257;
	Mon, 26 Feb 2024 06:37:03 -0800 (PST)
From: "H.J. Lu"
To: libc-alpha@sourceware.org
Cc: goldstein.w.n@gmail.com, adhemerval.zanella@linaro.org
Subject: [PATCH v10 0/1] x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers
Date: Mon, 26 Feb 2024 06:37:02 -0800
Message-ID: <20240226143703.4037100-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.43.2
MIME-Version: 1.0
List-Id: Libc-alpha mailing list

Changes in v10:

1. Change the test to xfail.

Changes in v9:

1. Drop the first patch, which has been merged into the master branch.
2. Correct BEFORE_TLSDESC_CALL placement in test.
3. Define MOD only if undefined in test.

Changes in v8:

1. Remove malloc-for-test.c and move malloc to tst-gnu2-tls2.c.
2. Add malloc_counter to verify malloc in tst-gnu2-tls2.c is called for
   TLSDESC call.
3. Add BEFORE_TLSDESC_CALL and AFTER_TLSDESC_CALL.
4. Use /* ... */ in assembly code comments.

Changes in v7:

1. Generate malloc-for-test.map at build time to get the correct version
   map for malloc.

Changes in v6:

1. Drop Tile registers.

Changes in v5:

1. Also preserve Tile registers.
2. Add an error check in i386 dl-tlsdesc-dynamic.h.

Changes in v4:

1. Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
   the ld.so trampoline.
2. Also save x87 FPU stack registers for TLSDESC_CALL and TLS_DESC_CALL.
3. Change i386 _dl_tlsdesc_dynamic to IFUNC.
4. Rename GLRO(dl_x86_64_tlsdesc_dynamic) to GLRO(dl_x86_tlsdesc_dynamic)
   for both i386 and x86-64.
5. Update the testcase for i386 with a simple malloc interceptor.

Changes in v3:

1. Don't add GLRO(dl_x86_64_tlsdesc_dynamic) to libc.a.

Changes in v2:

1. Add GLRO(dl_x86_64_runtime_resolve) to optimize
   elf_machine_runtime_setup.

---
Add APX registers to STATE_SAVE_MASK so that APX registers are saved in
the ld.so trampoline.  This fixes BZ #31371.

The compiler generates the following instruction sequence for GNU2
dynamic TLS access:

	leaq	tls_var@TLSDESC(%rip), %rax
	call	*tls_var@TLSCALL(%rax)

or

	leal	tls_var@TLSDESC(%ebx), %eax
	call	*tls_var@TLSCALL(%eax)

The CALL instruction is transparent to the compiler, which assumes that
all registers, except for EFLAGS and RAX/EAX, are unchanged after CALL.
When _dl_tlsdesc_dynamic is called, it calls __tls_get_addr on the slow
path.  __tls_get_addr is a normal function which doesn't preserve any
caller-saved registers.  _dl_tlsdesc_dynamic saved and restored integer
caller-saved registers, but didn't preserve any other caller-saved
registers.  Add _dl_tlsdesc_dynamic IFUNC functions for FNSAVE, FXSAVE,
XSAVE and XSAVEC to save and restore all caller-saved registers.  This
fixes BZ #31372.

Add GLRO(dl_x86_64_runtime_resolve) with GLRO(dl_x86_tlsdesc_dynamic)
to optimize elf_machine_runtime_setup.

H.J. Lu (1):
  x86: Update _dl_tlsdesc_dynamic to preserve caller-saved registers

 elf/Makefile                                 |  18 ++
 elf/tst-gnu2-tls2.c                          | 122 ++++++++++++
 elf/tst-gnu2-tls2.h                          |  36 ++++
 elf/tst-gnu2-tls2mod0.c                      |  31 +++
 elf/tst-gnu2-tls2mod1.c                      |  31 +++
 elf/tst-gnu2-tls2mod2.c                      |  31 +++
 sysdeps/i386/dl-machine.h                    |   2 +-
 sysdeps/i386/dl-tlsdesc-dynamic.h            | 190 +++++++++++++++++++
 sysdeps/i386/dl-tlsdesc.S                    | 115 +++++------
 sysdeps/x86/Makefile                         |   7 +-
 sysdeps/x86/cpu-features.c                   |  56 +++++-
 sysdeps/x86/dl-procinfo.c                    |  16 ++
 sysdeps/{x86_64 => x86}/features-offsets.sym |   2 +
 sysdeps/x86/sysdep.h                         |   6 +
 sysdeps/x86/tst-gnu2-tls2.c                  |  20 ++
 sysdeps/x86_64/Makefile                      |   2 +-
 sysdeps/x86_64/dl-machine.h                  |  19 +-
 sysdeps/x86_64/dl-procinfo.c                 |  16 ++
 sysdeps/x86_64/dl-tlsdesc-dynamic.h          | 166 ++++++++++++++++
 sysdeps/x86_64/dl-tlsdesc.S                  | 108 ++++-------
 sysdeps/x86_64/dl-trampoline-save.h          |  34 ++++
 sysdeps/x86_64/dl-trampoline-state.h         |  51 +++++
 sysdeps/x86_64/dl-trampoline.S               |  20 +-
 sysdeps/x86_64/dl-trampoline.h               |  34 +---
 24 files changed, 920 insertions(+), 213 deletions(-)
 create mode 100644 elf/tst-gnu2-tls2.c
 create mode 100644 elf/tst-gnu2-tls2.h
 create mode 100644 elf/tst-gnu2-tls2mod0.c
 create mode 100644 elf/tst-gnu2-tls2mod1.c
 create mode 100644 elf/tst-gnu2-tls2mod2.c
 create mode 100644 sysdeps/i386/dl-tlsdesc-dynamic.h
 rename sysdeps/{x86_64 => x86}/features-offsets.sym (89%)
 create mode 100644 sysdeps/x86/tst-gnu2-tls2.c
 create mode 100644 sysdeps/x86_64/dl-tlsdesc-dynamic.h
 create mode 100644 sysdeps/x86_64/dl-trampoline-save.h
 create mode 100644 sysdeps/x86_64/dl-trampoline-state.h
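
For readers unfamiliar with the IFUNC approach mentioned above, the
choice among the FNSAVE, FXSAVE, XSAVE and XSAVEC variants of
_dl_tlsdesc_dynamic might be modelled roughly as below.  This is only an
illustrative sketch, not code from the patch: the feature bits, the enum
names and select_tlsdesc_variant are hypothetical, and the preference
order (most capable save instruction first, FNSAVE as the last resort) is
an assumption.  The real selection is done in sysdeps/x86/cpu-features.c
and may differ.

```c
/* Hypothetical feature bits for illustration only; glibc uses its own
   cpu_features machinery rather than a plain bit mask.  */
enum
{
  FEATURE_FXSR = 1 << 0,
  FEATURE_XSAVE = 1 << 1,
  FEATURE_XSAVEC = 1 << 2
};

/* Hypothetical names for the four save/restore strategies listed in
   the cover letter.  */
enum tlsdesc_variant
{
  VARIANT_FNSAVE,
  VARIANT_FXSAVE,
  VARIANT_XSAVE,
  VARIANT_XSAVEC
};

/* Pick a variant from a feature mask.  Assumed order: XSAVEC, then
   XSAVE, then FXSAVE, and FNSAVE when none of them is available.  */
enum tlsdesc_variant
select_tlsdesc_variant (unsigned int features)
{
  if (features & FEATURE_XSAVEC)
    return VARIANT_XSAVEC;
  if (features & FEATURE_XSAVE)
    return VARIANT_XSAVE;
  if (features & FEATURE_FXSR)
    return VARIANT_FXSAVE;
  return VARIANT_FNSAVE;
}
```

With this model, select_tlsdesc_variant (FEATURE_FXSR | FEATURE_XSAVE)
yields VARIANT_XSAVE, and a mask of 0 falls back to VARIANT_FNSAVE.  The
point of resolving the choice once is that every later TLSDESC call goes
straight to a routine that saves all caller-saved state the CPU has.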