From patchwork Thu Mar 7 23:50:12 2019
X-Patchwork-Submitter: John Baldwin
X-Patchwork-Id: 31782
Subject: Re: [PATCH v2 04/11] Add a new gdbarch method to resolve the address of TLS variables.
To: Simon Marchi, gdb-patches@sourceware.org
From: John Baldwin
Date: Thu, 7 Mar 2019 15:50:12 -0800
Message-ID: <2c282f52-0269-d6a8-8533-4c00b1a4ee8d@FreeBSD.org>
In-Reply-To: <02c8a44b-b1d2-0f0f-9b6f-72a0fb673f83@simark.ca>
References: <4db33aead3f31532b7d4e165d9786df792a4d925.1549672588.git.jhb@FreeBSD.org> <02c8a44b-b1d2-0f0f-9b6f-72a0fb673f83@simark.ca>

On 3/7/19 8:08 AM, Simon Marchi wrote:
> On 2019-02-08 7:40 p.m., John Baldwin wrote:
>> Permit TLS variable addresses to be resolved purely by an ABI rather
>> than requiring a target method. This doesn't try the target method if
>> the ABI function is present (even if the ABI function fails) to
>> simplify error handling.
>
> I don't see anything wrong with the patch (and the comment you removed
> in target_translate_tls_address hints it is right), but again I am not
> very familiar with how TLS works, so I wouldn't spot if anything was
> conceptually wrong with this approach. I would appreciate if another
> maintainer could take a look and give their opinion.

Ok. FWIW, the reason for target vs gdbarch has to do with the different ways
one can resolve a TLS variable.
Some background:

In ELF relocations, a TLS variable is identified by an offset in a special
TLS section of an ELF file, similar to global symbols being an offset
relative to .data or .bss. However, TLS variables are duplicated for each
thread. To support this, the runtime linker allocates an array of pointers
for each thread called the DTV array. The runtime linker also assigns an
array index to each ELF object, so the executable is assigned array index 1,
and other shared libraries that use TLS are assigned indices as they are
mapped by the runtime linker. The pointers in each thread's array point to
the per-thread blocks of TLS variables for a given ELF object.

Thus, if index 1 is for my program and index 2 was assigned to libc, then
DTV[1] contains a pointer to all of the TLS variables in my main program and
DTV[2] contains a pointer to all of the TLS variables in libc. If libc has
two TLS integers 'foo' and 'bar', they might be assigned offsets of 0 and 4.
To read the value of 'foo' one uses the expression '*(int *)(DTV[2])'. To
read 'bar' you would use '*(int *)(DTV[2] + 4)'.

There are some extra optimizations in the compiler-generated code. There's
something called static TLS that can be at a fixed offset from the
per-thread TCB pointer IIRC, but there are also valid DTV[] pointers that
can get to the same variables, just via more indirection. Compiled code is
also allowed to fetch the 'base' of a TLS block for a given shared object,
save that 'base', and use offsets from it to access different variables.
Put another way, the compiler can assume that &bar - &foo is always '4' and
just add the relative offset to '&foo' to compute '&bar' without going
through the DTV array every time.

In target_translate_tls_address() we are given the 'struct objfile' of the
ELF object and the offset of the TLS variable we are trying to find. The
gdbarch_fetch_tls_load_module_address function fetches the pointer to the
runtime linker's data structure describing that ELF object.

The target version (target::get_thread_local_address) expects to use some
target-specific method to turn a (thread, linker_module_addr, offset) tuple
into the address of a TLS variable. On Linux and other systems using
libthread_db for this, it calls a libthread_db function. Internally that
libthread_db function looks at the runtime linker's structure to extract the
TLS index of the ELF object. It then looks in the thread library's
per-thread data structure to find a pointer to the DTV array. It then uses
'DTV[index] + offset' to compute the final address. Note that this is all
done in libthread_db rather than in gdb itself.

The gdbarch method I'm using for FreeBSD doesn't use libthread_db. Instead,
it more closely mimics what the compiler-generated code does. Many
architectures use some sort of register to point to a per-thread Thread
Control Block (TCB), and they store a pointer to the DTV array either in the
TCB or at a fixed offset relative to the TCB. For example, 64-bit x86 uses
the %fs segment prefix to access the TCB, and the %fs_base register is thus
a pointer to the TCB. 32-bit x86 uses %gs and %gs_base instead. RISC-V has
a 'tp' register for this purpose, etc.

The approach I'm using for FreeBSD is to provide an architecture-specific
function that uses the relevant register to locate the pointer to the DTV
array. It then calls a shared function (patch 7) that extracts the TLS
index from the runtime linker's data structure and computes the final
address via 'DTV[index] + offset'.

Mostly I did this because I don't like libthread_db, but using a gdbarch
method should also be a bit more cross-debugger friendly (you don't have to
have a libthread_db on a FreeBSD host that understands the Linux runtime
linker or thread library or vice versa, and similar concerns with 32-bit vs
64-bit and x86 vs ARM, etc.).
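
To make the 'DTV[index] + offset' arithmetic concrete, here is a minimal
stand-alone C sketch. It is not gdb or libthread_db code: the names
struct module_info, struct thread_tls, resolve_tls_address() and tls_demo()
are made up for illustration, the DTV is modelled as a plain in-process
array, and slot 0 is simply left unused. It assumes libc was given index 2
and has 'foo' at offset 0 and 'bar' at offset 4 (sizeof(int) here).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the runtime linker's per-object record.
   Only the TLS index matters here; real layouts are OS-specific. */
struct module_info {
    size_t tls_index;
};

/* Hypothetical stand-in for per-thread state: the DTV and its length. */
struct thread_tls {
    const uintptr_t *dtv;
    size_t dtv_len;
};

/* Return DTV[index] + offset, or 0 if that TLS block has not been
   allocated for this thread yet. */
static uintptr_t
resolve_tls_address(const struct thread_tls *thr,
                    const struct module_info *mod, uintptr_t offset)
{
    assert(mod->tls_index < thr->dtv_len);
    uintptr_t base = thr->dtv[mod->tls_index];
    return base == 0 ? 0 : base + offset;
}

/* Build one thread's DTV with the main program at index 1 and libc at
   index 2, then read libc's 'foo' and 'bar'.  Returns the value of 'bar'. */
static int
tls_demo(void)
{
    int prog_block[1] = { 7 };
    int libc_block[2] = { 10, 20 };   /* foo = 10, bar = 20 */
    uintptr_t dtv[3] = { 0, (uintptr_t) prog_block, (uintptr_t) libc_block };
    struct thread_tls thr = { dtv, 3 };
    struct module_info libc = { 2 };

    const int *foo = (const int *) resolve_tls_address(&thr, &libc, 0);
    const int *bar = (const int *) resolve_tls_address(&thr, &libc, sizeof(int));
    printf("foo=%d bar=%d\n", *foo, *bar);
    return *bar;
}
```

Calling tls_demo() from a main() prints "foo=10 bar=20". In a debugger the
same arithmetic would be done on addresses read from the inferior's memory
rather than on local pointers.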
>> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
>> index afc4da7cdd..09097bcbaf 100755
>> --- a/gdb/gdbarch.sh
>> +++ b/gdb/gdbarch.sh
>> @@ -602,6 +602,7 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>>
>>  # Fetch the target specific address used to represent a load module.
>>  F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
>> +M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
>
> Could you document the method, especially the meaning of the parameters?

Sure. I used a variant of the comment from the target method:

diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
index 48fcebd19a..d15b6aa794 100755
--- a/gdb/gdbarch.sh
+++ b/gdb/gdbarch.sh
@@ -602,6 +602,14 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
 
 # Fetch the target specific address used to represent a load module.
 F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
+
+# Return the thread-local address at OFFSET in the thread-local
+# storage for the thread PTID and the shared library or executable
+# file given by LM_ADDR. If that block of thread-local storage hasn't
+# been allocated yet, this function may return an error. LM_ADDR may
+# be zero for statically linked multithreaded inferiors.
+
+M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
 #
 v;CORE_ADDR;frame_args_skip;;;0;;;0
 m;CORE_ADDR;unwind_pc;struct frame_info *next_frame;next_frame;;default_unwind_pc;;0
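
For readers skimming the thread, the dispatch order from the commit message
(use the gdbarch method when it is present, and do not fall back to the
target method even if the gdbarch method fails) can be sketched in plain C
as below. This is only an illustration under assumed names: tls_resolver,
enum tls_result, translate_tls_address(), failing_resolver() and
simple_resolver() are hypothetical, a thread is reduced to an int id, and
errors are return codes here rather than whatever gdb itself does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resolver: stores the address in *addr and returns true on
   success.  Parameters mirror (thread, lm_addr, offset) from the patch. */
typedef bool (*tls_resolver)(int thread_id, uintptr_t lm_addr,
                             uintptr_t offset, uintptr_t *addr);

enum tls_result { TLS_OK, TLS_FAILED, TLS_UNSUPPORTED };

/* Prefer the architecture (ABI) method.  If it exists, its result is
   final; the target method is only consulted when no ABI method exists. */
static enum tls_result
translate_tls_address(tls_resolver arch_method, tls_resolver target_method,
                      int thread_id, uintptr_t lm_addr, uintptr_t offset,
                      uintptr_t *addr)
{
    assert(addr != NULL);
    if (arch_method != NULL)
        return arch_method(thread_id, lm_addr, offset, addr)
               ? TLS_OK : TLS_FAILED;          /* no fallback on failure */
    if (target_method != NULL)
        return target_method(thread_id, lm_addr, offset, addr)
               ? TLS_OK : TLS_FAILED;
    return TLS_UNSUPPORTED;
}

/* Example resolver that always fails, e.g. the block is not allocated. */
static bool
failing_resolver(int thread_id, uintptr_t lm_addr, uintptr_t offset,
                 uintptr_t *addr)
{
    (void) thread_id; (void) lm_addr; (void) offset; (void) addr;
    return false;
}

/* Example resolver that treats lm_addr as the block base (toy model). */
static bool
simple_resolver(int thread_id, uintptr_t lm_addr, uintptr_t offset,
                uintptr_t *addr)
{
    (void) thread_id;
    *addr = lm_addr + offset;
    return true;
}
```

With these toy resolvers, passing failing_resolver as the arch method and
simple_resolver as the target method yields TLS_FAILED, which is the
"simplified error handling" trade-off: only one method is ever responsible
for the error the user sees.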