Using libthread_db.so with single-threaded programs, for TLS access (was: Re: [RFC PATCH 0/3] Pretty-printing for errno)
Commit Message
On 09/07/2017 12:34 PM, Pedro Alves wrote:
> On 09/06/2017 10:03 PM, Zack Weinberg wrote:
>
>> So, changes to both gdb and libthread_db seem to be required here. I
>> do think that _in principle_ it ought to be possible to use
>> libthread_db to retrieve the address of thread-local data even if the
>> inferior is not linked with libpthread; glibc has quite a few
>> thread-specific variables (errno most prominent, of course, but also
>> h_errno, _res, etc), and so might any library which can be used from
>> both single- and multithreaded programs.
>>
>> This is really not code I feel comfortable hacking up, though, and
>> it's probably more of a project than I have time for, in any case.
>
> Sounds like a promising approach though. I'd like to see this path
> explored a bit more. I'll keep this in my TODO, even though it's
> not likely to bubble up very soon. Thanks for the discussion/ideas!
So I played with this a bit more on the plane back from Cauldron,
to try to see if we'd hit some major roadblock. I also chatted
with Carlos a bit about this back at the Cauldron, and seemingly
there's no major reason this can't be made to work,
TLS-internals-wise.
Seems like it's mainly a case of moving the libthread_db.so-related
symbols out of libpthread.so to somewhere else. More below.
I hacked libthread_db.so to disable the nptl_version check, so that
it always successfully loads with non-threaded programs. And then
I tweaked GDB enough to make it actually reach libthread_db.so's
td_thr_tls_get_addr in that scenario too. That's when I hit
another snag: the symbols that libthread_db.so needs which describe
the necessary offsets of internal data structures for getting at the
TLS blocks are also in libpthread.so... In particular, the first
one we stumble on is "_thread_db_link_map_l_tls_modid". I made GDB
print the symbol lookups to make it easier to debug. Viz:
(gdb) p errno
ps_pglobal_lookup: name="__stack_user" => PS_NOSYM
warning: Cannot find user-level thread for LWP 31772: generic error
ps_pglobal_lookup: name="_thread_db_link_map_l_tls_modid" => PS_NOSYM
Cannot find thread-local storage for process 31772, shared library /lib64/libc.so.6:
operation not applicable to
The lookup is coming from here:
(top-gdb) bt
#0 ps_pglobal_lookup (ph=0x1f65fe0, obj=0x7fffe58f93ae "libpthread.so.0", name=0x7fffe58f9e48 "_thread_db_link_map_l_tls_modid", sym_addr=0x7fffffffc428) at src/gdb/proc-service.c:115
#1 0x00007fffe58f88a8 in td_mod_lookup (ps=<optimized out>, mod=mod@entry=0x7fffe58f93ae "libpthread.so.0", idx=<optimized out>, sym_addr=sym_addr@entry=0x7fffffffc428)
at td_symbol_list.c:48
#2 0x00007fffe58f8f45 in _td_locate_field (ta=ta@entry=0x1f84df0, desc=desc@entry=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0,
address=address@entry=0x7fffffffc458) at fetch-value.c:54
#3 0x00007fffe58f8ff0 in _td_fetch_value (ta=0x1f84df0, desc=0x1f84fbc, descriptor_name=descriptor_name@entry=43, idx=idx@entry=0x0, address=0x7ffff7ff7658,
result=result@entry=0x7fffffffc498) at fetch-value.c:94
#4 0x00007fffe58f8ddf in td_thr_tls_get_addr (th=0x7fffffffc4e0, map_address=<optimized out>, offset=16, address=0x7fffffffc4f8) at td_thr_tls_get_addr.c:31
...
So we'd need to move that symbol (and maybe others) to one of ld.so/libc.so
instead. AFAICT, those magic symbols are described in nptl_db/structs.def.
I haven't looked enough to figure out what ends up expanding those macros
in libpthread.so. This is where I stopped.
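For readers unfamiliar with how such "magic" descriptor symbols can work, here is a minimal sketch of the general technique: the inferior exports a small constant array describing one field of a private structure, and the debugger-side library reads it to locate that field. Everything named demo_* is hypothetical, and the three-word layout (size in bits, element count, byte offset) is an assumption made for illustration; the real macros in nptl_db/structs.def and whatever expands them may well differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a private runtime structure.  The real
   struct link_map in glibc has many more fields.  */
struct demo_link_map
{
  void *l_addr;
  size_t l_tls_offset;
  size_t l_tls_modid;
};

/* Assumed descriptor layout for this sketch:
   { field size in bits, element count, byte offset in the struct }.  */
#define DEMO_FIELD_DESC(type, field)                          \
  const uint32_t demo_db_##type##_##field[3] = {              \
    (uint32_t) (8 * sizeof (((struct type *) 0)->field)),     \
    1,                                                        \
    (uint32_t) offsetof (struct type, field)                  \
  }

/* Defines the global symbol demo_db_demo_link_map_l_tls_modid.  */
DEMO_FIELD_DESC (demo_link_map, l_tls_modid);

/* Debugger-side helper: given a descriptor already read from the
   inferior and the address of the structure, compute the address
   of the described field.  */
static uintptr_t
demo_locate_field (const uint32_t desc[3], uintptr_t struct_addr)
{
  return struct_addr + desc[2];
}
```

With a scheme like this, the debugger only needs to find the descriptor symbol by name, which is why it matters which shared object the symbol lives in: if it is only defined in libpthread.so, a lookup in a non-threaded inferior has nothing to find.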
I'm attaching the gdb and libthread_db.so patches I used, both against current
master in their respective projects. See comments within the patches.
I've also pushed the gdb patch to a "users/palves/tls-nonthreaded" branch.
(I don't think I have write access to glibc's git.)
Thanks,
Pedro Alves
Comments
On Wed, 2017-09-13 at 12:22 +0100, Pedro Alves wrote:
> On 09/07/2017 12:34 PM, Pedro Alves wrote:
> > On 09/06/2017 10:03 PM, Zack Weinberg wrote:
> >
> > > So, changes to both gdb and libthread_db seem to be required
> > > here. I
> > > do think that _in principle_ it ought to be possible to use
> > > libthread_db to retrieve the address of thread-local data even if
> > > the
> > > inferior is not linked with libpthread; glibc has quite a few
> > > thread-specific variables (errno most prominent, of course, but
> > > also
> > > h_errno, _res, etc), and so might any library which can be used
> > > from
> > > both single- and multithreaded programs.
> > >
> > > This is really not code I feel comfortable hacking up, though,
> > > and
> > > it's probably more of a project than I have time for, in any
> > > case.
> >
> > Sounds like a promising approach though. I'd like to see this path
> > explored a bit more. I'll keep this in my TODO, even though it's
> > not likely to bubble up very soon. Thanks for the
> > discussion/ideas!
>
> So I played with this a bit more on the plane back from Cauldron,
> to try to see if we'd hit some major roadblock. I also chatted
> with Carlos a bit about this back at the Cauldron, and seemingly
> there's no major reason this can't be made to work,
> TLS-internals-wise.
>
> Seems like that it's mainly a case of moving libthread_db.so-related
> symbols from libpthread.so elsewhere. More below.
Note that in the valgrind gdbserver, I had to handle the same problem,
i.e. find the address of a TLS variable without access to any
library (valgrind cannot make use of any library, including glibc).
So I finally ended up implementing the minimum logic for that.
It is based on some really ugly hacks, e.g. to get the offset of
lm_modid in struct link_map.
There are also one or two lines of arch-dependent code to get the dtv.
This is all somewhat fragile; it was done in 2014 and has held up so far.
But some more recent changes might have broken the hack,
as I have a test failing after upgrading to Debian 9.
See the handling of qGetTLSAddr in valgrind's coregrind/m_gdbserver/server.c
for the gory/hacky details.
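To make the idea concrete, the core of such a lookup can be sketched as below: take the thread's dtv, index it by the module id read from the link_map, and add the variable's offset within that module's TLS block. This is a simplified model, not the valgrind code: the demo_* types are hypothetical, the dtv is modelled as a plain array of block pointers with module ids starting at 1, and the arch-dependent step of finding the dtv from the thread pointer is left out.

```c
#include <stddef.h>

/* Hypothetical, simplified dtv entry: one pointer per module's
   TLS block.  NULL means the block is not allocated yet.  */
struct demo_dtv_entry
{
  void *block;
};

/* Hypothetical per-thread state.  Finding the real dtv is
   arch-dependent and is not modelled here.  */
struct demo_thread
{
  struct demo_dtv_entry *dtv;   /* indexed by module id */
  size_t dtv_len;               /* number of entries in dtv */
};

/* Return the address of the TLS variable at OFFSET within the TLS
   block of module MODID for thread T, or NULL if the module id is
   out of range or the block has not been allocated.  */
static void *
demo_tls_addr (const struct demo_thread *t, size_t modid, size_t offset)
{
  if (modid == 0 || modid >= t->dtv_len)
    return NULL;
  if (t->dtv[modid].block == NULL)
    return NULL;
  return (char *) t->dtv[modid].block + offset;
}
```

The fragile parts are exactly the inputs this sketch takes for granted: where the module id sits inside struct link_map, and how to get from the thread pointer to the dtv.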
Better (even partial) support for such things without the need of a
library would significantly improve my life :)
Philippe
From 386b8dc8ef16197b3efa38f4bbbc98833ce7c2c6 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Mon, 11 Sep 2017 13:48:04 +0100
Subject: [PATCH] remove version checks hack
---
nptl_db/td_ta_map_lwp2thr.c | 10 ++++++++--
nptl_db/td_ta_new.c | 9 ++++++++-
2 files changed, 16 insertions(+), 3 deletions(-)
@@ -185,11 +185,17 @@ td_ta_map_lwp2thr (const td_thragent_t *ta_arg,
sometimes contain garbage that would confuse us, left by the kernel
at exec. So if it looks like initialization is incomplete, we only
fake a special descriptor for the initial thread. */
-
psaddr_t list;
td_err_e err = DB_GET_SYMBOL (list, ta, __stack_user);
if (err != TD_OK)
- return err;
+ {
+ /* '__stack_user' is in libpthread.so, so this always fails with
+ non-threaded programs. GDB hardcodes/assumes th_unique==0
+ for the main thread - maybe we should instead return the fake
+ special descriptor for the initial thread here too. See
+ below. */
+ return err;
+ }
err = DB_GET_FIELD (list, ta, list, list_t, next, 0);
if (err != TD_OK)
@@ -33,12 +33,18 @@ LIST_HEAD (__td_agent_list);
td_err_e
td_ta_new (struct ps_prochandle *ps, td_thragent_t **ta)
{
+#if 0
psaddr_t versaddr;
char versbuf[sizeof (VERSION)];
+#endif
LOG ("td_ta_new");
- /* Check whether the versions match. */
+ /* Check whether the versions match.
+
+ XXX: Disabled because "nptl_version" currently lives in
+ libpthread.so. */
+#if 0
if (td_lookup (ps, SYM_nptl_version, &versaddr) != PS_OK)
return TD_NOLIBTHREAD;
if (ps_pdread (ps, versaddr, versbuf, sizeof (versbuf)) != PS_OK)
@@ -47,6 +53,7 @@ td_ta_new (struct ps_prochandle *ps, td_thragent_t **ta)
if (memcmp (versbuf, VERSION, sizeof VERSION) != 0)
/* Not the right version. */
return TD_VERSION;
+#endif
/* Fill in the appropriate information. */
*ta = (td_thragent_t *) calloc (1, sizeof (td_thragent_t));
--
2.5.5