Patchwork [v2,04/11] Add a new gdbarch method to resolve the address of TLS variables.

Submitter John Baldwin
Date March 7, 2019, 11:50 p.m.
Message ID <2c282f52-0269-d6a8-8533-4c00b1a4ee8d@FreeBSD.org>
Permalink /patch/31782/
State New

Comments

John Baldwin - March 7, 2019, 11:50 p.m.
On 3/7/19 8:08 AM, Simon Marchi wrote:
> On 2019-02-08 7:40 p.m., John Baldwin wrote:
>> Permit TLS variable addresses to be resolved purely by an ABI rather
>> than requiring a target method.  This doesn't try the target method if
>> the ABI function is present (even if the ABI function fails) to
>> simplify error handling.
> 
> I don't see anything wrong with the patch (and the comment you removed 
> in target_translate_tls_address hints it is right), but again I am not 
> very familiar with how TLS works, so I wouldn't spot if anything was 
> conceptually wrong with this approach.  I would appreciate if another 
> maintainer could take a look and give their opinion.

Ok.  FWIW, the reason for target vs gdbarch has to do with the different ways one can
resolve a TLS variable.  Some background:

In ELF relocations, a TLS variable is identified by an offset in a special TLS section
of an ELF file, similar to global symbols being an offset relative to .data or .bss.
However, TLS variables are duplicated for each thread.  To support this, the runtime
linker allocates an array of pointers for each thread called the DTV array.  The runtime
linker also assigns an array index to each ELF object, so the executable is assigned array
index 1, and other shared libraries that use TLS are assigned indices as they are mapped
by the runtime linker.  The pointers in each thread's array point to the per-thread blocks
of TLS variables for a given ELF object.  Thus, if index 1 is for my program and index 2
was assigned to libc, then DTV[1] contains a pointer to all of the TLS variables in my
main program and DTV[2] contains a pointer to all of the TLS variables in libc.

Thus, if libc has two TLS integers 'foo' and 'bar', they might be assigned offsets of
0 and 4.  To read the value of 'foo' one uses the expression '*(int *)(DTV[2])'.  To
read 'bar' you would use '*(int *)(DTV[2] + 4)'.
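
As a rough illustration of the 'DTV[index] + offset' rule, here is a toy model in Python.
It is only a sketch: the block addresses are invented, each thread's DTV is modelled as a
dict, and libc is given index 2 as in the example above.

```python
# Toy model: each thread has its own DTV mapping a module's TLS index to the
# base address of that module's TLS block for that thread.
# All addresses below are invented for illustration.

MAIN_PROGRAM, LIBC = 1, 2        # TLS indices, as in the example above
FOO_OFFSET, BAR_OFFSET = 0, 4    # offsets of 'foo' and 'bar' in libc's TLS block


def tls_address(dtv, module_index, offset):
    """Return DTV[module_index] + offset for one thread's DTV."""
    return dtv[module_index] + offset


# Two threads, each with its own copy of every module's TLS block.
thread_a_dtv = {MAIN_PROGRAM: 0x1000, LIBC: 0x2000}
thread_b_dtv = {MAIN_PROGRAM: 0x5000, LIBC: 0x6000}

print(hex(tls_address(thread_a_dtv, LIBC, BAR_OFFSET)))
print(hex(tls_address(thread_b_dtv, LIBC, BAR_OFFSET)))
```

The same (index, offset) pair gives a different address in each thread, which is the
whole point of the per-thread array.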

There are some extra optimizations in the compiler-generated code.  There's something
called static TLS that can be at a fixed offset from the per-thread TCB pointer IIRC,
but there are also valid DTV[] pointers that can get to the same variables just via
more indirection.  Compiled code is also allowed to fetch the 'base' of a TLS block
for a given shared object and then save that 'base' and use offsets from it to access
different variables.  Put another way, the compiler can assume that &bar - &foo is
always '4' and just add the relative offset to '&foo' to compute '&bar' without going
through the DTV array every time.
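
The 'save the base' optimization can be sketched the same way.  This is an illustrative
Python model only, with invented addresses; tls_block_base is a hypothetical helper name,
not anything from a real runtime linker.

```python
# Toy model of caching a module's TLS block base for one thread.
# Addresses and the helper name are invented for illustration.

def tls_block_base(dtv, module_index):
    """One trip through the DTV: return the base of a module's TLS block."""
    return dtv[module_index]


dtv = {1: 0x1000, 2: 0x2000}     # index 1: main program, index 2: libc

base = tls_block_base(dtv, 2)    # single DTV lookup, result saved
addr_foo = base + 0              # 'foo' at offset 0
addr_bar = addr_foo + 4          # 'bar' derived from &foo, no second lookup

print(hex(addr_foo), hex(addr_bar))
```

Because both variables live in the same per-thread block, the difference between their
addresses is the same in every thread.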

In target_translate_tls_address() we are given the 'struct objfile' of the ELF object
and the offset of the TLS variable we are trying to find.  The
gdbarch_fetch_tls_load_module_address function fetches the pointer to the runtime
linker's data structure describing that ELF object.

The target version (target::get_thread_local_address) expects to use some
target-specific method to turn a (thread, linker_module_addr, offset) tuple into the
address of a TLS variable.  On Linux and other systems using libthread_db for this,
it calls a libthread_db function.  Internally that libthread_db function looks at
the runtime linker's structure to extract the TLS index of the ELF object.  It
then looks in the thread library's per-thread data structure to find a pointer
to the DTV array.  It then uses 'DTV[index] + offset' to compute the final address.
Note that this is all done in libthread_db rather than in gdb itself.

The gdbarch method I'm using for FreeBSD doesn't use libthread_db.  Instead, it
more closely mimics what the compiler-generated code does.  Many architectures use
some sort of register to point to a per-thread Thread Control Block (TCB), and they
store a pointer to the DTV array either in the TCB or at a fixed offset relative to
the TCB.  For example, 64-bit x86 uses the %fs segment prefix to access the TCB,
and the %fs_base register is thus a pointer to the TCB.  32-bit x86 uses %gs and
%gs_base instead.  RISC-V has a 'tp' register for this purpose, etc.  The approach
I'm using for FreeBSD is to provide an architecture-specific function that uses the
relevant register to locate the pointer to the DTV array.  It then calls a shared
function (patch 7) that extracts the TLS index from the runtime linker's data
structure and computes the final address via 'DTV[index] + offset'.
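
To make the chain of lookups concrete, here is a small Python simulation of it.  This is
a sketch under loud assumptions: the memory contents, the field offsets
(TCB_DTV_OFFSET, LM_TLSINDEX_OFFSET), the pointer size, and the function names are all
made up, and real DTV and TCB layouts differ between systems and architectures.

```python
# Simulated inferior memory: address -> pointer-sized value.
# Every address and field offset here is hypothetical.
PTR_SIZE = 8
TCB_DTV_OFFSET = 0x8         # assumed location of the DTV pointer inside the TCB
LM_TLSINDEX_OFFSET = 0x10    # assumed location of the TLS index in the linker's struct

memory = {
    0x7000 + TCB_DTV_OFFSET: 0x9000,    # TCB at 0x7000 points at the DTV array
    0x9000 + 1 * PTR_SIZE: 0x1000,      # DTV[1]: main program's TLS block
    0x9000 + 2 * PTR_SIZE: 0x2000,      # DTV[2]: libc's TLS block
    0x4000 + LM_TLSINDEX_OFFSET: 2,     # libc's linker struct says TLS index 2
}


def read_pointer(addr):
    """Read one pointer-sized value from the simulated memory."""
    return memory[addr]


def dtv_from_thread_pointer(tp):
    """Architecture-specific step: thread pointer register value -> DTV address."""
    return read_pointer(tp + TCB_DTV_OFFSET)


def resolve_tls_address(dtv_addr, lm_addr, offset):
    """Shared step: look up the module's TLS index, then DTV[index] + offset."""
    index = read_pointer(lm_addr + LM_TLSINDEX_OFFSET)
    block = read_pointer(dtv_addr + index * PTR_SIZE)
    return block + offset


thread_pointer = 0x7000      # stands in for %fs_base, %gs_base, tp, ...
dtv_addr = dtv_from_thread_pointer(thread_pointer)
print(hex(resolve_tls_address(dtv_addr, 0x4000, 4)))
```

Only dtv_from_thread_pointer would need to vary per architecture in this model; the
index lookup and the final addition are shared.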

Mostly I did this because I don't like libthread_db, but using a gdbarch method
should also be a bit friendlier to cross debugging (you don't need a libthread_db
on a FreeBSD host that understands the Linux runtime linker or thread library, or
vice versa, and similar concerns apply to 32-bit vs 64-bit and x86 vs ARM, etc.).

>> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
>> index afc4da7cdd..09097bcbaf 100755
>> --- a/gdb/gdbarch.sh
>> +++ b/gdb/gdbarch.sh
>> @@ -602,6 +602,7 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>>   
>>   # Fetch the target specific address used to represent a load module.
>>   F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
>> +M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
> 
> Could you document the method, especially the meaning of the parameters?

Sure.  I used a variant of the comment from the target method:

Simon Marchi - March 8, 2019, 2:55 a.m.
On 2019-03-07 18:50, John Baldwin wrote:
> On 3/7/19 8:08 AM, Simon Marchi wrote:
>> On 2019-02-08 7:40 p.m., John Baldwin wrote:
>>> Permit TLS variable addresses to be resolved purely by an ABI rather
>>> than requiring a target method.  This doesn't try the target method if
>>> the ABI function is present (even if the ABI function fails) to
>>> simplify error handling.
>> 
>> I don't see anything wrong with the patch (and the comment you removed
>> in target_translate_tls_address hints it is right), but again I am not
>> very familiar with how TLS works, so I wouldn't spot if anything was
>> conceptually wrong with this approach.  I would appreciate if another
>> maintainer could take a look and give their opinion.
> 
> Ok.  FWIW, the reason for target vs gdbarch has to do with the different ways one can
> resolve a TLS variable.  Some background:
>
> In ELF relocations, a TLS variable is identified by an offset in a special TLS section
> of an ELF file, similar to global symbols being an offset relative to .data or .bss.
> However, TLS variables are duplicated for each thread.  To support this, the runtime
> linker allocates an array of pointers for each thread called the DTV array.  The runtime
> linker also assigns an array index to each ELF object, so the executable is assigned array
> index 1, and other shared libraries that use TLS are assigned indices as they are mapped
> by the runtime linker.  The pointers in each thread's array point to the per-thread blocks
> of TLS variables for a given ELF object.  Thus, if index 1 is for my program and index 2
> was assigned to libc, then DTV[1] contains a pointer to all of the TLS variables in my
> main program and DTV[2] contains a pointer to all of the TLS variables in libc.
> 
> Thus, if libc has two TLS integers 'foo' and 'bar', they might be assigned offsets of
> 0 and 4.  To read the value of 'foo' one uses the expression '*(int *)(DTV[2])'.  To
> read 'bar' you would use '*(int *)(DTV[2] + 4)'.
> 
> There are some extra optimizations in the compiler-generated code (there's something
> called static TLS that can be at a fixed offset from the per-thread TCB pointer IIRC,
> but there are also valid DTV[] pointers that can get to the same variables just via
> more indirection.  Compiled code is also allowed to fetch the 'base' of a TLS block
> for a given shared object and then save that 'base' and use offsets from it to access
> different variables.  Put another way, the compiler can assume that &bar - &foo is
> always '4' and just add the relative offset to '&foo' to compute '&bar' without going
> through the DTV array every time).
>
> In target_translate_tls_address() we are given the 'struct objfile' of the ELF object
> and the offset of the TLS variable we are trying to find.  The
> gdbarch_fetch_tls_load_module_address function fetches the pointer to the runtime
> linker's data structure describing that ELF object.
>
> The target version (target::get_thread_local_address) expects to use some target
> specific method to turn a (thread, linker_module_addr, offset) tuple into the
> address of a TLS variable.  On Linux and other systems using libthread_db for this,
> it calls a libthread_db function.  Internally that libthread_db function looks at
> the runtime linker's structure to extract the TLS index of the ELF object.  It
> then looks in the thread library's per-thread data structure to find a pointer
> to the DTV array.  It then uses 'DTV[index] + offset' to compute the final address.
> Note that this is all done in libthread_db rather than in gdb itself.
>
> The gdbarch method I'm using for FreeBSD doesn't use libthread_db.  Instead, it
> more closely mimics what the compiler-generated code does.  Many architectures use
> some sort of register to point to a per-thread Thread Control Block (TCB), and they
> store a pointer to the DTV array either in the TCB or at a fixed offset relative to
> the TCB.  For example, 64-bit x86 uses the %fs segment prefix to access the TCB,
> and the %fs_base register is thus a pointer to the TCB.  32-bit x86 uses %gs and
> %gs_base instead.  RISC-V has a 'tp' register for this purpose, etc.  The approach
> I'm using for FreeBSD is to provide an architecture-specific function that uses the
> relevant register to locate the pointer to the DTV array.  It then calls a shared
> function (patch 7) that extracts the TLS index from the runtime linker's data
> structure and computes the final address via 'DTV[index] + offset'.
>
> Mostly I did this because I don't like libthread_db, but using a gdbarch method
> should also be a bit more cross-debugger friendly (you don't have to have a libthread_db
> on a FreeBSD host that understands the Linux runtime linker or thread library or
> vice versa, and similar concerns with 32-bit vs 64-bit and x86 vs ARM, etc.).

Ok, thanks to your explanation I think I understand better the need to 
have an arch-specific way of doing it.

>>> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
>>> index afc4da7cdd..09097bcbaf 100755
>>> --- a/gdb/gdbarch.sh
>>> +++ b/gdb/gdbarch.sh
>>> @@ -602,6 +602,7 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>>>
>>>   # Fetch the target specific address used to represent a load module.
>>>   F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
>>> +M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
>> 
>> Could you document the method, especially the meaning of the parameters?
> 
> Sure.  I used a variant of the comment from the target method:
> 
> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
> index 48fcebd19a..d15b6aa794 100755
> --- a/gdb/gdbarch.sh
> +++ b/gdb/gdbarch.sh
> @@ -602,6 +602,14 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>
>  # Fetch the target specific address used to represent a load module.
>  F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
> +
> +# Return the thread-local address at OFFSET in the thread-local
> +# storage for the thread PTID and the shared library or executable
> +# file given by LM_ADDR.  If that block of thread-local storage hasn't
> +# been allocated yet, this function may return an error.  LM_ADDR may
> +# be zero for statically linked multithreaded inferiors.

What does "may return an error" mean?  A special CORE_ADDR value, or it 
throws an error?

Simon

John Baldwin - March 8, 2019, 6:38 p.m.
On 3/7/19 6:55 PM, Simon Marchi wrote:
> On 2019-03-07 18:50, John Baldwin wrote:
>> Sure.  I used a variant of the comment from the target method:
>>
>> diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
>> index 48fcebd19a..d15b6aa794 100755
>> --- a/gdb/gdbarch.sh
>> +++ b/gdb/gdbarch.sh
>> @@ -602,6 +602,14 @@ m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
>>
>>  # Fetch the target specific address used to represent a load module.
>>  F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
>> +
>> +# Return the thread-local address at OFFSET in the thread-local
>> +# storage for the thread PTID and the shared library or executable
>> +# file given by LM_ADDR.  If that block of thread-local storage hasn't
>> +# been allocated yet, this function may return an error.  LM_ADDR may
>> +# be zero for statically linked multithreaded inferiors.
> 
> What does "may return an error" mean?  A special CORE_ADDR value, or it 
> throws an error?

Hmm, it throws an error, so maybe "this function may throw an error."?

I'll also add a little patch to update the comment for target::get_thread_local_address
to match.
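
For what "throws an error" combined with "doesn't try the target method if the ABI
function is present" amounts to, here is a toy Python model.  It is only a sketch of the
behaviour described in this thread, not gdb's code: TLSNotAllocated, arch_method,
target_method and translate_tls_address are all hypothetical stand-ins.

```python
# Toy model of "throw, don't return a sentinel" plus "no fallback to the
# target method when an arch method exists".  All names are hypothetical.

class TLSNotAllocated(Exception):
    """Raised when a thread has no TLS block for the requested module."""


def arch_method(dtv, index, offset):
    """Stand-in for an arch-provided resolver: raises on failure."""
    block = dtv.get(index)
    if block is None:
        raise TLSNotAllocated(f"no TLS block for module index {index}")
    return block + offset


def target_method(dtv, index, offset):
    """Stand-in for a target-provided resolver; it just tags its result."""
    return 0xF0000000 + offset


def translate_tls_address(arch, target, dtv, index, offset):
    """Use the arch method when present; its errors propagate to the caller."""
    if arch is not None:
        return arch(dtv, index, offset)    # the target method is not tried
    return target(dtv, index, offset)


try:
    translate_tls_address(arch_method, target_method, {1: 0x1000}, 2, 4)
except TLSNotAllocated as exc:
    print("error:", exc)
```

With an exception there is no need to reserve a special CORE_ADDR value to mean failure,
and the caller has a single error path to handle.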

Patch

diff --git a/gdb/gdbarch.sh b/gdb/gdbarch.sh
index 48fcebd19a..d15b6aa794 100755
--- a/gdb/gdbarch.sh
+++ b/gdb/gdbarch.sh
@@ -602,6 +602,14 @@  m;int;remote_register_number;int regno;regno;;default_remote_register_number;;0
 
 # Fetch the target specific address used to represent a load module.
 F;CORE_ADDR;fetch_tls_load_module_address;struct objfile *objfile;objfile
+
+# Return the thread-local address at OFFSET in the thread-local
+# storage for the thread PTID and the shared library or executable
+# file given by LM_ADDR.  If that block of thread-local storage hasn't
+# been allocated yet, this function may return an error.  LM_ADDR may
+# be zero for statically linked multithreaded inferiors.
+
+M;CORE_ADDR;get_thread_local_address;ptid_t ptid, CORE_ADDR lm_addr, CORE_ADDR offset;ptid, lm_addr, offset
 #
 v;CORE_ADDR;frame_args_skip;;;0;;;0
 m;CORE_ADDR;unwind_pc;struct frame_info *next_frame;next_frame;;default_unwind_pc;;0