[v2,04/11] Add a new gdbarch method to resolve the address of TLS variables.

On 3/7/19 8:08 AM, Simon Marchi wrote:
> On 2019-02-08 7:40 p.m., John Baldwin wrote:
>> Permit TLS variable addresses to be resolved purely by an ABI rather
>> than requiring a target method.  This doesn't try the target method if
>> the ABI function is present (even if the ABI function fails) to
>> simplify error handling.
> 
> I don't see anything wrong with the patch (and the comment you removed 
> in target_translate_tls_address hints it is right), but again I am not 
> very familiar with how TLS works, so I wouldn't spot if anything was 
> conceptually wrong with this approach.  I would appreciate if another 
> maintainer could take a look and give their opinion.

Ok.  FWIW, the reason for target vs gdbarch has to do with the different ways one can
resolve a TLS variable.  Some background:

In ELF relocations, a TLS variable is identified by an offset in a special TLS section
of an ELF file, similar to global symbols being an offset relative to .data or .bss.
However, TLS variables are duplicated for each thread.  To support this, the runtime
linker allocates an array of pointers for each thread called the DTV array.  The runtime
linker also assigns an array index to each ELF object, so the executable is assigned array
index 1, and other shared libraries that use TLS are assigned indices as they are mapped
by the runtime linker.  The pointers in each thread's array point to the per-thread blocks
of TLS variables for a given ELF object.  Thus, if index 1 is for my program and index 2
was assigned to libc, then DTV[1] contains a pointer to all of the TLS variables in my
main program and DTV[2] contains a pointer to all of the TLS variables in libc.

Thus, if libc has two TLS integers 'foo' and 'bar', they might be assigned offsets of
0 and 4.  Since libc was assigned index 2 above, to read the value of 'foo' one uses
the expression '*(int *)(DTV[2])'.  To read 'bar' you would use '*(int *)(DTV[2] + 4)'.
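
As a rough illustration of the lookup described above, here is a minimal C sketch that
models one thread's DTV as an array of pointers to per-module TLS blocks.  The names
'struct thread_dtv', 'tls_address' and 'read_tls_int' are hypothetical and exist only
for this sketch; a real runtime linker's DTV layout differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one thread's DTV: blocks[i] plays the role of
   DTV[i].  Index 0 is unused here because module indices start at 1.  */
struct thread_dtv {
    size_t nmodules;          /* highest valid module index */
    unsigned char **blocks;   /* per-module TLS block for this thread */
};

/* Compute 'DTV[module] + offset' for this thread.  */
static void *tls_address(const struct thread_dtv *dtv, size_t module,
                         size_t offset)
{
    assert(module >= 1 && module <= dtv->nmodules);
    return dtv->blocks[module] + offset;
}

/* Read an int TLS variable, i.e. '*(int *)(DTV[module] + offset)'.
   memcpy avoids any alignment assumptions about the block.  */
static int read_tls_int(const struct thread_dtv *dtv, size_t module,
                        size_t offset)
{
    int value;
    memcpy(&value, tls_address(dtv, module, offset), sizeof value);
    return value;
}
```

With libc at index 2, reading 'foo' would be 'read_tls_int(&dtv, 2, 0)' and reading
'bar' would be 'read_tls_int(&dtv, 2, 4)'.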

There are some extra optimizations in the compiler-generated code.  There's something
called static TLS that can be at a fixed offset from the per-thread TCB pointer IIRC,
but there are also valid DTV[] pointers that can get to the same variables, just via
more indirection.  Compiled code is also allowed to fetch the 'base' of a TLS block
for a given shared object, save that 'base', and use offsets from it to access
different variables.  Put another way, the compiler can assume that &bar - &foo is
always '4' and just add the relative offset to '&foo' to compute '&bar' without going
through the DTV array every time.

In target_translate_tls_address() we are given the 'struct objfile' of the ELF object
and the offset of the TLS variable we are trying to find.  The
gdbarch_fetch_tls_load_module_address function fetches the pointer to the runtime
linker's data structure describing that ELF object.

The target version (target::get_thread_local_address) expects to use some
target-specific method to turn a (thread, linker_module_addr, offset) tuple into the
address of a TLS variable.  On Linux and other systems using libthread_db for this,
it calls a libthread_db function.  Internally that libthread_db function looks at
the runtime linker's structure to extract the TLS index of the ELF object.  It
then looks in the thread library's per-thread data structure to find a pointer
to the DTV array.  It then uses 'DTV[index] + offset' to compute the final address.
Note that this is all done in libthread_db rather than in gdb itself.

The gdbarch method I'm using for FreeBSD doesn't use libthread_db.  Instead, it
more closely mimics what the compiler-generated code does.  Many architectures use
some sort of register to point to a per-thread Thread Control Block (TCB), and they
store a pointer to the DTV array either in the TCB or at a fixed offset relative to
the TCB.  For example, 64-bit x86 uses the %fs segment prefix to access the TCB,
and the %fs_base register is thus a pointer to the TCB.  32-bit x86 uses %gs and
%gs_base instead.  RISC-V has a 'tp' register for this purpose, etc.  The approach
I'm using for FreeBSD is to provide an architecture-specific function that uses the
relevant register to locate the pointer to the DTV array.  It then calls a shared
function (patch 7) that extracts the TLS index from the runtime linker's data
structure and computes the final address via 'DTV[index] + offset'.
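
The steps above can be sketched in C roughly as follows.  This is not the code from the
patch series: 'struct toy_memory', 'read_pointer' and 'resolve_tls' are hypothetical
names, debuggee memory is modelled as an array of 8-byte words, and the layout (a DTV
pointer stored at 'tcb + dtv_offset', one pointer-sized slot per DTV entry) is an
assumption made for illustration.  The TLS index is passed in directly rather than
being extracted from the runtime linker's data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for debuggee memory: 8-byte words, addressed by byte
   address starting at 0.  */
struct toy_memory {
    const uint64_t *words;
    size_t nwords;
};

/* Read one pointer-sized value from the toy memory.  */
static uint64_t read_pointer(const struct toy_memory *mem, uint64_t addr)
{
    assert(addr % 8 == 0 && addr / 8 < mem->nwords);
    return mem->words[addr / 8];
}

/* Assumed layout: the DTV pointer lives at 'tcb + dtv_offset', and each
   DTV entry is a single pointer.  'tcb' is the value of the per-thread
   register (e.g. %fs_base or 'tp').  Returns 'DTV[tls_index] + offset'.  */
static uint64_t resolve_tls(const struct toy_memory *mem, uint64_t tcb,
                            uint64_t dtv_offset, uint64_t tls_index,
                            uint64_t offset)
{
    uint64_t dtv = read_pointer(mem, tcb + dtv_offset);
    uint64_t block = read_pointer(mem, dtv + tls_index * 8);
    return block + offset;
}
```

Only the first step (finding the TCB from a register and the DTV pointer from the TCB)
is architecture-specific; the last two lines of 'resolve_tls' correspond to the part
that can be shared.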

Mostly I did this because I don't like libthread_db, but using a gdbarch method
should also be a bit more cross-debugger friendly (you don't have to have a libthread_db
on a FreeBSD host that understands the Linux runtime linker or thread library or
vice versa, and similar concerns with 32-bit vs 64-bit and x86 vs ARM, etc.).

>> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
>> index afc4da7cdd..09097bcbaf 100755
>> --- a/gdb/gdbarch.sh
>> +++ b/gdb/gdbarch.sh
>> @@ -602,6 +602,7 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>>   
>>   # Fetch the target specific address used to represent a load module.
>>   F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
>> +M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
> 
> Could you document the method, especially the meaning of the parameters?

Sure.  I used a variant of the comment from the target method:
