From patchwork Sat Oct 12 02:32:45 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Buettner X-Patchwork-Id: 98775 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E64E73856DC8 for ; Sat, 12 Oct 2024 02:44:29 +0000 (GMT) X-Original-To: gdb-patches@sourceware.org Delivered-To: gdb-patches@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTP id 3EF5C3857B84 for ; Sat, 12 Oct 2024 02:43:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3EF5C3857B84 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3EF5C3857B84 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1728700992; cv=none; b=AWDya6fvvYgZ5E62Kt5re2NDFnTs2PbG60GJCq35rXoejvkSCxSqhyoOs0aqprDki7eASMI9eqAw70vGgOI9ocYucHLvws32tbYyRLQ28k6WMNv0LE8OvczYAhw9fSjkmKu50L/mUyOFORIqTbjBFM7BAB/uuKKTvIht9CKTNq8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1728700992; c=relaxed/simple; bh=9uO0xAwQ5a4r5eKLMJeHFxFKyvVOMQZ6Yn658QIy5GQ=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=EtbsxH4PLAGCVInY0fPceGHJRRF1GgmoZIdPr0yzb5qumOEIPVrEqLTbTdNiYIH7UMRPrZMyDJZwwft8QRbk1cbIU9BtBBmHQypQThXaSkfOfa2OIMXwu4VWlk5Wl8US3aXuORXnk2JX92JSr6fGB0gvmM9ZAriNrK1gy5gCvA0= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1728700988; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=At2m1qq4PNvk5xGxpMVULIQMwdBdK8knw1ihdRzlKgA=; b=hcBe2bTcC/zlTZY8tF/jRHTayRaz7UjKp8OghtsDSFcupoaIUH9RJSPvJ0Ujy48hyhKjoW Gd0bG5Sh8oWEkf/6O7F84bynp9h1NuXgOHemgtsTobhd8QZZ4wfWf5YSUt8bYW/NQtJ7wd 1othldcU88SepFnn0U4YXEhGlD/BrzY= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-617-df4kWOaiN4iWyxBZiHDcJA-1; Fri, 11 Oct 2024 22:43:07 -0400 X-MC-Unique: df4kWOaiN4iWyxBZiHDcJA-1 Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id DDC791956080 for ; Sat, 12 Oct 2024 02:43:06 +0000 (UTC) Received: from f41-1.lan (unknown [10.22.64.14]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 11C5319560AA; Sat, 12 Oct 2024 02:43:05 +0000 (UTC) From: Kevin Buettner To: gdb-patches@sourceware.org Cc: Kevin Buettner Subject: [PATCH v2 04/11] Implement internal TLS address lookup for Linux targets Date: Fri, 11 Oct 2024 19:32:45 -0700 Message-ID: <20241012024220.101084-5-kevinb@redhat.com> In-Reply-To: <20241012024220.101084-1-kevinb@redhat.com> References: <20241012024220.101084-1-kevinb@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gdb-patches-bounces~patchwork=sourceware.org@sourceware.org This commit adds non-architecture-specific support for internal TLS address lookup for Linux targets. By "internal", I mean support which does not rely on libthread_db. Knowledge of how to traverse TLS data structures is contained in this commit along with the next commit containing architecture specific knowledge regarding TLS offsets, registers, and such. The new function 'linux_get_thread_local_address' is a gdbarch method. It should be passed as an argument to set_gdbarch_get_thread_local_address in architecture specific -linux-tdep.c files which wish to offer internal TLS support. The architecture specific tdep files need to define a get_tls_dtv_addr method - as the name suggests, it needs to return the address of the DTV (dynamic thread vector) via architecture specific means. This usually entails fetching the thread pointer via a register or registers assigned to this purpose, and then using that value to locate the address of the DTV from within the TCB (thread control block). Additionally, some architectures also need to provide a DTP offset, which is used by the MUSL C library to adjust the value obtained from a DTV entry to that of the start of the TLS block for a particular thread. This is provided, when necessary, by a get_tls_dtp_offset method. Both methods, get_tls_dtv_addr and get_tls_dtp_offset, are registered with data structures maintained by linux-tdep.c via the new function linux_register_tls_methods(). Thus, for example, on RISC-V, riscv_linux_init_abi() will make the following two calls, the first for registering the internal get_thread_local_address gdbarch method and the second for registering riscv-specific methods for obtaining the DTV address and DTP offset: set_gdbarch_get_thread_local_address (gdbarch, linux_get_thread_local_address); linux_register_tls_methods (info, gdbarch, riscv_linux_get_tls_dtv_addr, riscv_linux_get_tls_dtp_offset); Internal TLS support is provided for two C libraries, GLIBC, and MUSL. Details for accessing the various TLS data structures differ between these libraries. As a consequence, linux-tdep.h defines a new enum, linux_libc, with values linux_libc_unknown, linux_libc_musl, and linux_libc_glibc. A new static function libc_tls_sniffer uses heuristics to (try to) decide whether a program was linked against GLIBC or MUSL. Working out what the heuristics should be, especially for statically linked binaries, turned out to be harder than I thought it would be. A new maintenance setting, force-internal-tls-address-lookup, has been added, which, when set to 'on', will (as the name suggests) force the internal TLS lookup mechanisms to be used. Otherwise, if thread_db support is available (via the thread stratum), that will be preferred since it should be more accurate. I expect that this setting will be mostly used by test cases in the GDB test suite. The new test cases that are part of this series all use it, with iterations using both 'on' and 'off' for all of the tests therein. Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=24548 Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=31563 --- gdb/linux-tdep.c | 211 +++++++++++++++++++++++++++++++++++++++++++++++ gdb/linux-tdep.h | 36 ++++++++ 2 files changed, 247 insertions(+) diff --git a/gdb/linux-tdep.c b/gdb/linux-tdep.c index 65ec221ef48..42ff1b3f2a1 100644 --- a/gdb/linux-tdep.c +++ b/gdb/linux-tdep.c @@ -48,6 +48,11 @@ #include #include +/* When true, force internal TLS address lookup instead of lookup via + the thread stratum. */ + +static bool force_internal_tls_address_lookup = false; + /* This enum represents the values that the user can choose when informing the Linux kernel about which memory mappings will be dumped in a corefile. They are described in the file @@ -198,6 +203,15 @@ struct linux_gdbarch_data { struct type *siginfo_type = nullptr; int num_disp_step_buffers = 0; + + /* Method for looking up TLS DTV. */ + get_tls_dtv_addr_ftype *get_tls_dtv_addr = nullptr; + + /* Method for looking up the TLS DTP offset. */ + get_tls_dtp_offset_ftype *get_tls_dtp_offset = nullptr; + + /* Cached libc value for TLS lookup purposes. */ + enum linux_libc libc = linux_libc_unknown; }; static const registry::key @@ -2744,6 +2758,172 @@ show_dump_excluded_mappings (struct ui_file *file, int from_tty, " flag is %s.\n"), value); } +/* For TLS lookup purposes, use heuristics to decide whether program + was linked against MUSL or GLIBC. */ + +static enum linux_libc +libc_tls_sniffer (struct gdbarch *gdbarch) +{ + /* Check for cached libc value. */ + linux_gdbarch_data *gdbarch_data = get_linux_gdbarch_data (gdbarch); + if (gdbarch_data->libc != linux_libc_unknown) + return gdbarch_data->libc; + + linux_libc libc = linux_libc_unknown; + + /* Fetch the program interpreter. */ + std::optional interp_name_holder + = svr4_find_program_interpreter (); + if (interp_name_holder) + { + /* A dynamically linked program linked against MUSL will have a + "ld-musl-" in its interpreter name. (Two examples of MUSL + interpreter names are "/lib/ld-musl-x86_64.so.1" and + "lib/ld-musl-aarch64.so.1".) If it's not found, assume GLIBC. */ + const char *interp_name = (const char *) interp_name_holder->data (); + if (strstr (interp_name, "/ld-musl-") != nullptr) + libc = linux_libc_musl; + else + libc = linux_libc_glibc; + gdbarch_data->libc = libc; + return libc; + } + + /* If there is no interpreter name, it's statically linked. For + programs with TLS data, a program statically linked against MUSL + will have the symbols 'main_tls' and 'builtin_tls'. If both of + these are present, assume that it was statically linked against + MUSL, otherwise assume GLIBC. */ + if (lookup_minimal_symbol (current_program_space, "main_tls").minsym + != nullptr + && lookup_minimal_symbol (current_program_space, "builtin_tls").minsym + != nullptr) + libc = linux_libc_musl; + else + libc = linux_libc_glibc; + gdbarch_data->libc = libc; + return libc; +} + +/* Implement gdbarch method, get_thread_local_address, for architectures + which provide a method for determining the DTV and possibly the DTP + offset. */ + +CORE_ADDR +linux_get_thread_local_address (struct gdbarch *gdbarch, ptid_t ptid, + CORE_ADDR lm_addr, CORE_ADDR offset) +{ + linux_gdbarch_data *gdbarch_data = get_linux_gdbarch_data (gdbarch); + + /* Use the target's get_thread_local_address method when: + + - No method has been provided for finding the TLS DTV. + + or + + - The thread stratum has been pushed (at some point) onto the + target stack, except when 'force_internal_tls_address_lookup' + has been set. + + The idea here is to prefer use of of the target's thread_stratum + method since it should be more accurate. */ + if (gdbarch_data->get_tls_dtv_addr == nullptr + || (find_target_at (thread_stratum) != nullptr + && !force_internal_tls_address_lookup)) + { + struct target_ops *target = current_inferior ()->top_target (); + return target->get_thread_local_address (ptid, lm_addr, offset); + } + else + { + /* Details, found below, regarding TLS layout is for the GNU C + library (glibc) and the MUSL C library (musl), circa 2024. + While some of this layout is defined by the TLS ABI, some of + it, such as how/where to find the DTV pointer in the TCB, is + not. A good source of ABI info for some architectures can be + found in "ELF Handling For Thread-Local Storage" by Ulrich + Drepper. That document is worth consulting even for + architectures not described there, since the general approach + and terminology is used regardless. + + Some architectures, such as aarch64, are not described in + that document, so some details had to ferreted out using the + glibc source code. Likewise, the MUSL source code was + consulted for details which differ from GLIBC. */ + enum linux_libc libc = libc_tls_sniffer (gdbarch); + int mod_id; + if (libc == linux_libc_glibc) + mod_id = glibc_link_map_to_tls_module_id (lm_addr); + else /* Assume MUSL. */ + mod_id = musl_link_map_to_tls_module_id (lm_addr); + if (mod_id == 0) + throw_error (TLS_GENERIC_ERROR, _("Unable to determine TLS module id")); + + /* Use the architecture specific DTV fetcher to obtain the DTV. */ + CORE_ADDR dtv_addr = gdbarch_data->get_tls_dtv_addr (gdbarch, ptid, libc); + + /* In GLIBC, The DTV (dynamic thread vector) is an array of + structs consisting of two fields, the first of which is a + pointer to the TLS block of interest. (The second field is a + pointer that assists with memory management, but that's not + of interest here.) Also, the 0th entry is the generation + number, but although it's a single scalar, the 0th entry is + padded to be the same size as all the rest. Thus each + element of the DTV array is two pointers in size. + + In MUSL, the DTV is simply an array of pointers. The 0th + entry is still the generation number, but contains no padding + aside from that which is needed to make it pointer sized. */ + int m; /* Multiplier, for size of DTV entry. */ + switch (libc) + { + case linux_libc_glibc: + m = 2; + break; + default: + m = 1; + break; + } + + /* Obtain TLS block address. Module ids start at 1, so there's + no need to adjust it to skip over the 0th entry of the DTV, + which is the generation number. */ + CORE_ADDR dtv_elem_addr + = dtv_addr + mod_id * m * (gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT); + gdb::byte_vector buf (gdbarch_ptr_bit (gdbarch) / TARGET_CHAR_BIT); + if (target_read_memory (dtv_elem_addr, buf.data (), buf.size ()) != 0) + throw_error (TLS_GENERIC_ERROR, _("Unable to fetch TLS block address")); + const struct builtin_type *builtin = builtin_type (gdbarch); + CORE_ADDR tls_block_addr = gdbarch_pointer_to_address + (gdbarch, builtin->builtin_data_ptr, + buf.data ()); + + /* When the TLS block addr is 0 or -1, this usually indicates that + the TLS storage hasn't been allocated yet. (In GLIBC, some + architectures use 0 while others use -1.) */ + if (tls_block_addr == 0 || tls_block_addr == (CORE_ADDR) -1) + throw_error (TLS_NOT_ALLOCATED_YET_ERROR, _("TLS not allocated yet")); + + /* MUSL (and perhaps other C libraries, though not GLIBC) have + TLS implementations for some architectures which, for some + reason, have DTV entries which must be negatively offset by + DTP_OFFSET in order to obtain the TLS block address. + DTP_OFFSET is a constant in the MUSL sources - these offsets, + when they're non-zero, seem to be either 0x800 or 0x8000, + and are present for riscv[32/64], powerpc[32/64], m68k, and + mips. + + Use the architecture specific get_tls_dtp_offset method, if + present, to obtain this offset. */ + ULONGEST dtp_offset + = gdbarch_data->get_tls_dtp_offset == nullptr + ? 0 + : gdbarch_data->get_tls_dtp_offset (gdbarch, ptid, libc); + + return tls_block_addr - dtp_offset + offset; + } +} + /* To be called from the various GDB_OSABI_LINUX handlers for the various GNU/Linux architectures and machine types. @@ -2787,6 +2967,20 @@ linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch, set_gdbarch_get_siginfo_type (gdbarch, linux_get_siginfo_type); } +/* See linux-tdep.h. */ + +void +linux_register_tls_methods (struct gdbarch_info info, struct gdbarch *gdbarch, + get_tls_dtv_addr_ftype *get_tls_dtv_addr, + get_tls_dtp_offset_ftype *get_tls_dtp_offset) +{ + gdb_assert (get_tls_dtv_addr != nullptr); + + linux_gdbarch_data *gdbarch_data = get_linux_gdbarch_data (gdbarch); + gdbarch_data->get_tls_dtv_addr = get_tls_dtv_addr; + gdbarch_data->get_tls_dtp_offset = get_tls_dtp_offset; +} + void _initialize_linux_tdep (); void _initialize_linux_tdep () @@ -2822,6 +3016,23 @@ VM_DONTDUMP flag (\"dd\" in /proc/PID/smaps) when generating the corefile. For\ more information about this file, refer to the manpage of proc(5) and core(5)."), NULL, show_dump_excluded_mappings, &setlist, &showlist); + + add_setshow_boolean_cmd ("force-internal-tls-address-lookup", class_obscure, + &force_internal_tls_address_lookup, _("\ +Set to force internal TLS address lookup."), _("\ +Show whether GDB is forced to use internal TLS address lookup."), _("\ +When resolving addresses for TLS (Thread Local Storage) variables,\n\ +GDB will attempt to use facilities provided by the thread library (i.e.\n\ +libthread_db). If those facilities aren't available, GDB will fall\n\ +back to using some internal (to GDB), but possibly less accurate\n\ +mechanisms to resolve the addresses for TLS variables. When this flag\n\ +is set, GDB will force use of the fall-back TLS resolution mechanisms.\n\ +This flag is used by some GDB tests to ensure that the internal fallback\n\ +code is exercised and working as expected. The default is to not force\n\ +the internal fall-back mechanisms to be used."), + NULL, NULL, + &maintenance_set_cmdlist, + &maintenance_show_cmdlist); } /* Fetch (and possibly build) an appropriate `link_map_offsets' for diff --git a/gdb/linux-tdep.h b/gdb/linux-tdep.h index 66b03cabaa6..83f1538f033 100644 --- a/gdb/linux-tdep.h +++ b/gdb/linux-tdep.h @@ -88,6 +88,35 @@ extern void linux_displaced_step_restore_all_in_ptid (inferior *parent_inf, extern void linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch, int num_disp_step_buffers); +/* C library variants for TLS lookup. */ + +enum linux_libc +{ + linux_libc_unknown, + linux_libc_musl, + linux_libc_glibc +}; + +/* Function type for "get_tls_dtv_addr" method. */ + +typedef CORE_ADDR (get_tls_dtv_addr_ftype) (struct gdbarch *gdbarch, + ptid_t ptid, + enum linux_libc libc); + +/* Function type for "get_tls_dtp_offset" method. */ + +typedef CORE_ADDR (get_tls_dtp_offset_ftype) (struct gdbarch *gdbarch, + ptid_t ptid, + enum linux_libc libc); + +/* Register architecture specific methods for fetching the TLS DTV + and TLS DTP, used by linux_get_thread_local_address. */ + +extern void linux_register_tls_methods + (struct gdbarch_info info, struct gdbarch *gdbarch, + get_tls_dtv_addr_ftype *get_tls_dtv_addr, + get_tls_dtp_offset_ftype *get_tls_dtp_offset = nullptr); + extern int linux_is_uclinux (void); /* Fetch the AT_HWCAP entry from auxv data AUXV. Use TARGET and GDBARCH to @@ -117,4 +146,11 @@ extern CORE_ADDR linux_get_hwcap2 (); extern struct link_map_offsets *linux_ilp32_fetch_link_map_offsets (); extern struct link_map_offsets *linux_lp64_fetch_link_map_offsets (); +/* Used as a gdbarch method for get_thread_local_address when the tdep + file also defines a suitable method for obtaining the TLS DTV. + See linux_init_abi(), above. */ +CORE_ADDR +linux_get_thread_local_address (struct gdbarch *gdbarch, ptid_t ptid, + CORE_ADDR lm_addr, CORE_ADDR offset); + #endif /* linux-tdep.h */