Message ID | 3a7946e9-d178-f878-9774-64ff44bcf5df@redhat.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 68784 invoked by alias); 29 Jun 2017 16:53:31 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 67743 invoked by uid 89); 29 Jun 2017 16:53:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=phil, Phil, HTo:U*pmuldoon, pai X-Spam-User: qpsmtpd, 2 recipients X-HELO: mx1.redhat.com DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 51E4850F50 Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx06.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=palves@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 51E4850F50 Subject: Re: [RFC PATCH 0/3] Pretty-printing for errno To: Phil Muldoon <pmuldoon@redhat.com>, Zack Weinberg <zackw@panix.com>, libc-alpha@sourceware.org, gdb@sourceware.org References: <20170622224456.1358-1-zackw@panix.com> <b2e7bc3b-d914-37ec-0215-2937949a848c@redhat.com> Cc: joseph@codesourcery.com, fweimer@redhat.com, tom@tromey.com, siddhesh@gotplt.org From: Pedro Alves <palves@redhat.com> Message-ID: <3a7946e9-d178-f878-9774-64ff44bcf5df@redhat.com> Date: Thu, 29 Jun 2017 17:53:11 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0 MIME-Version: 1.0 In-Reply-To: <b2e7bc3b-d914-37ec-0215-2937949a848c@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit |
Commit Message
Pedro Alves
June 29, 2017, 4:53 p.m. UTC
On 06/29/2017 04:48 PM, Phil Muldoon wrote: > > (gdb) p (test_i) i > $2 = ('typename: ', 'int') > > Which is most puzzling. Indeed. No need for printers to see this, actually. With this code: typedef int zzz; zzz z; gdb: (gdb) whatis z type = zzz (gdb) whatis (zzz) z type = int Eh. I suspect this is the result of a bogus implementation of only stripping one level of typedef if the argument to whatis is a type name. From the manual: If @var{arg} is a variable or an expression, @code{whatis} prints its literal type as it is used in the source code. If the type was defined using a @code{typedef}, @code{whatis} will @emph{not} print the data type underlying the @code{typedef}. (...) If @var{arg} is a type name that was defined using @code{typedef}, @code{whatis} @dfn{unrolls} only one level of that @code{typedef}. That one-level stripping comes from here, in gdb/eval.c:evaluate_subexp_standard, handling OP_TYPE: ... else if (noside == EVAL_AVOID_SIDE_EFFECTS) { struct type *type = exp->elts[pc + 1].type; /* If this is a typedef, then find its immediate target. We use check_typedef to resolve stubs, but we ignore its result because we do not want to dig past all typedefs. */ check_typedef (type); if (TYPE_CODE (type) == TYPE_CODE_TYPEDEF) type = TYPE_TARGET_TYPE (type); return allocate_value (type); } However, this is reachable in both: #1 - (gdb) whatis (zzz)0 #2 - (gdb) whatis zzz While only case #2 should strip the typedef. Removing that "find immediate target" code above help with fixing #1. We then run into the fact that value_cast also drops typedefs. Fixing that (see patch below) makes whatis (zzz)0 work as expected: (gdb) whatis (zzz)0 type = zzz however, we'd need to still make "whatis" strip one level of typedefs when the argument is a type, somehow, because we now get this: (gdb) whatis zzz type = zzz From 21f3a88611fad22f73aff2217c93d99971dd076f Mon Sep 17 00:00:00 2001 From: Pedro Alves <palves@redhat.com> Date: Thu, 29 Jun 2017 17:18:32 +0100 Subject: [PATCH] whatis --- gdb/eval.c | 2 ++ gdb/valops.c | 28 +++++++++++++++------------- 2 files changed, 17 insertions(+), 13 deletions(-)
Comments
On 06/29/2017 05:53 PM, Pedro Alves wrote: > however, we'd need to still make "whatis" strip one level of > typedefs when the argument is a type, somehow, because we now > get this: > > (gdb) whatis zzz > type = zzz I suspect we can do that by making "whatis" look at the top of the expression tree, see if it's an OP_TYPE: (gdb) set debug expression 1 (gdb) whatis (zzz)0 ... Dump of expression @ 0x1bc6af0, after conversion to prefix form: Expression: `(zzz) 0' Language c, 8 elements, 16 bytes each. 0 UNOP_CAST_TYPE ( 1 OP_TYPE Type @0x18e7920 (zzz)) 4 OP_LONG Type @0x19a6650 (int), value 0 (0x0) type = zzz (gdb) whatis zzz ... Dump of expression @ 0x1bc87b0, after conversion to prefix form: Expression: `zzz' Language c, 3 elements, 16 bytes each. 0 OP_TYPE Type @0x18e7920 (zzz) type = zzz (gdb) whatis z ... Dump of expression @ 0x1bc6af0, after conversion to prefix form: Expression: `z' Language c, 4 elements, 16 bytes each. 0 OP_VAR_VALUE Block @0x19e8ee0, symbol @0x19e8e10 (z) type = zzz (gdb) There may be a helper for this already somewhere. I've never dug that much deeply into this area of the code. Thanks, Pedro Alves
On 06/29/2017 06:02 PM, Pedro Alves wrote: > I suspect we can do that by making "whatis" look at the top of > the expression tree, see if it's an OP_TYPE: That works: With this code: typedef int zzz; zzz z; gdb: (gdb) whatis z type = zzz (gdb) whatis (zzz)0 type = zzz (gdb) whatis zzz type = int and at least make check TESTS="*/whatis*.exp */*ptype*.exp" passes, which is promising. Haven't tried this against a printer, but I think it should start working. We may need OP_SCOPE too. I'll try running this against the full gdb testsuite, and write some test. Thanks, Pedro Alves
On Thu, Jun 29, 2017 at 1:28 PM, Pedro Alves <palves@redhat.com> wrote: > With this code: > > typedef int zzz; > zzz z; > > gdb: > > (gdb) whatis z > type = zzz > (gdb) whatis (zzz)0 > type = zzz > (gdb) whatis zzz > type = int For the glibc patch to go through, this case also needs to work: typedef int error_t; error_t var; extern error_t *errloc(void); int main(void) { return *errloc(); } (compiled to .o file) (gdb) ptype errloc No symbol "errloc" in current context. This might be a problem with inadequate debugging information generated by the compiler -- it does work correctly if a function *definition* is visible. But we need it to work given only the above, possibly with no debug symbols in the shared library defining "errloc". zw
On 06/30/2017 01:28 AM, Zack Weinberg wrote: > On Thu, Jun 29, 2017 at 1:28 PM, Pedro Alves <palves@redhat.com> wrote: >> With this code: >> >> typedef int zzz; >> zzz z; >> >> gdb: >> >> (gdb) whatis z >> type = zzz >> (gdb) whatis (zzz)0 >> type = zzz >> (gdb) whatis zzz >> type = int > I've now sent the patch to gdb-patches: https://sourceware.org/ml/gdb-patches/2017-06/msg00827.html Could one of you check whether the type printer kicks in with that, in the problematic cast case? Phil, might it be a good idea to convert what you were using to a real gdb testsuite test? > For the glibc patch to go through, this case also needs to work: > > typedef int error_t; > error_t var; > extern error_t *errloc(void); > > int main(void) > { > return *errloc(); > } > > (compiled to .o file) > > (gdb) ptype errloc > No symbol "errloc" in current context. > > This might be a problem with inadequate debugging information > generated by the compiler Debug info is generated for function definitions, not declarations. I.e., that declaration for "errloc" alone produces no debug info at all. Try 'readelf -dw' on the .o file. But why is this a valid scenario? Even if you have no debug info, GDB should be able to find the function's address in the elf dynamic symbol table? I'm not sure what you mean by "glibc patch to go through". If you mean acceptance upstream, then I'd consider this a separate issue, because now you're talking about being able to print the "errno" variable in the first place, even as a plain int with no printer? That is related, but orthogonal from the error_t type printer? Otherwise, can you clarify? -- it does work correctly if a function > *definition* is visible. But we need it to work given only the above, > possibly with no debug symbols in the shared library defining > "errloc". Won't __errno_location be always present in the dynamic symbol table? If you strip everything from libc.so.6 / libpthread.so.0 including dynamic symbols, leaving gdb with no clue about "__errno_location", let alone its address, there's no way that gdb can call it. (Unless you call it by address manually.) The next problem is that without debug info for __errno_location, gdb has no clue of its prototype, only that its a function, and so it assumes that it has type "int()", i.e., that it returns int, while in reality it returns int * / __error_t *. (Falling back to assuming "int" is IMO not the best idea, but I don't know the history behind it.) I.e., with your example .o + a separate .o file defining errloc (with no debug info), we now get: (gdb) ptype errloc type = int () and then calling "p *errloc()" (the equivalent of '#define errno (*__errno_location ())' silently subtly does the wrong thing. One dirty way around it would be for the printer to re-define the errno macro (using to cast __errno_location to the correct type before calling it, I guess: (gdb) macro define errno *(*(__error_t *(*) (void)) __errno_location) () That's make "errno" available when you compile with levels lower than -g3, too. Fedora gdb has a morally equivalent hack in the source code directly. Ideally, we'd find a clean way to make this all Just Work. Thanks, Pedro Alves
On 06/30/2017 05:37 PM, Pedro Alves wrote: > I've now sent the patch to gdb-patches: > > https://sourceware.org/ml/gdb-patches/2017-06/msg00827.html > > Could one of you check whether the type printer kicks in with that, > in the problematic cast case? I put it on the users/palves/whatis branch as well, btw. Thanks, Pedro Alves
On Fri, Jun 30, 2017 at 12:37 PM, Pedro Alves <palves@redhat.com> wrote: > On 06/30/2017 01:28 AM, Zack Weinberg wrote: >> For the glibc patch to go through, this case also needs to work: >> >> typedef int error_t; >> error_t var; >> extern error_t *errloc(void); >> >> int main(void) >> { >> return *errloc(); >> } >> >> (compiled to .o file) >> >> (gdb) ptype errloc >> No symbol "errloc" in current context. >> >> This might be a problem with inadequate debugging information >> generated by the compiler > > Debug info is generated for function definitions, not declarations. > I.e., that declaration for "errloc" alone produces no debug info > at all. Try 'readelf -dw' on the .o file. > > But why is this a valid scenario? Even if you have no debug info, > GDB should be able to find the function's address in the elf dynamic > symbol table? > > I'm not sure what you mean by "glibc patch to go through". > If you mean acceptance upstream, then I'd consider this a separate > issue, because now you're talking about being able to print the > "errno" variable in the first place, even as a plain int with no > printer? That is related, but orthogonal from the error_t type printer? > Otherwise, can you clarify? The _primary_ goal of the glibc patches is to trigger a pretty-printer when someone does (gdb) p errno with no further adornment. Since pretty-printers are keyed by type (and since people do store errno codes in other places than errno), this involves a type 'error_t' (and its implementation-namespace alias '__error_t'), but if we can't get 'p errno' to work, we're probably not going to bother. In all currently-supported configurations, this is glibc's definition of errno: extern int *__errno_location (void) __THROW __attribute__ ((__const__)); #define errno (*__errno_location ()) (__THROW is a macro expanding to 'throw ()' when compiling C++.) The patches change that to extern __error_t *__errno_location (void) __THROW __attribute__ ((__const__)); which should be sufficient to get the pretty-printing happening. But obviously that's only going to work if GDB actually knows that the return type of __errno_location is __error_t*, and it doesn't seem to, *even when debug information for the definition is available*: $ gdb /lib/x86_64-linux-gnu/libc.so.6 GNU gdb (Debian 7.12-6) 7.12.0.20161007-git ... Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/.build-id/cc/80584889db7a969292959a46c718a2b1500702.debug...done. done. (gdb) ptype __errno_location type = int () (gdb) p __errno_location $1 = {<text variable, no debug info>} 0x20590 <__GI___errno_location> ... a suspicion occurs... (gdb) ptype __GI___errno_location type = int *(void) (gdb) p __GI___errno_location $2 = {int *(void)} 0x20590 <__GI___errno_location> ... so I guess this is a problem with the __GI_ indirection, which *may* be a thing we can resolve on our end. I don't fully understand that stuff. Still, It Would Be Nice™ if you _didn't_ have to have the debug symbols for libc.so installed for this to work. Probably a lot of people think you only need those if you are debugging libc itself. > The next problem is that without debug info for __errno_location, > gdb has no clue of its prototype, only that its a function, and so > it assumes that it has type "int()", i.e., that it returns int, > while in reality it returns int * / __error_t *. (Falling back > to assuming "int" is IMO not the best idea, but I don't know > the history behind it.) Probably because that's the pre-C89 legacy default function signature? > One dirty way around it would be for the printer to > re-define the errno macro (using to cast __errno_location to > the correct type before calling it, I guess: > > (gdb) macro define errno *(*(__error_t *(*) (void)) __errno_location) () > > That's make "errno" available when you compile with levels > lower than -g3, too. Hmm. How would one do that from inside Python? zw
On 06/30/2017 06:27 PM, Zack Weinberg wrote: > Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols > from /usr/lib/debug/.build-id/cc/80584889db7a969292959a46c718a2b1500702.debug...done. > done. > (gdb) ptype __errno_location > type = int () > (gdb) p __errno_location > $1 = {<text variable, no debug info>} 0x20590 <__GI___errno_location> > > ... a suspicion occurs... > > (gdb) ptype __GI___errno_location > type = int *(void) > (gdb) p __GI___errno_location > $2 = {int *(void)} 0x20590 <__GI___errno_location> > > ... so I guess this is a problem with the __GI_ indirection, which > *may* be a thing we can resolve on our end. I don't fully understand > that stuff. OK, thanks, that's clear. I don't know much about __GI_ indirection either. Redefining errno to call the __GI_ version would probably be too easy. :-) > Still, It Would Be Nice™ if you _didn't_ have to have the debug > symbols for libc.so installed for this to work. Probably a lot of > people think you only need those if you are debugging libc itself. I guess that that could also be seen as a packaging issue. May be it'd be possible to (always) have some kind of minimal debug info available for libc, with only the definitions of public API functions, describing their prototypes, but not their internals? See: https://fedoraproject.org/wiki/Features/MiniDebugInfo > >> The next problem is that without debug info for __errno_location, >> gdb has no clue of its prototype, only that its a function, and so >> it assumes that it has type "int()", i.e., that it returns int, >> while in reality it returns int * / __error_t *. (Falling back >> to assuming "int" is IMO not the best idea, but I don't know >> the history behind it.) > > Probably because that's the pre-C89 legacy default function signature? Yes, most likely. >> One dirty way around it would be for the printer to >> re-define the errno macro (using to cast __errno_location to >> the correct type before calling it, I guess: >> >> (gdb) macro define errno *(*(__error_t *(*) (void)) __errno_location) () >> >> That's make "errno" available when you compile with levels >> lower than -g3, too. > > Hmm. How would one do that from inside Python? There's no direct Python API, I believe. You'd just call the CLI command directly, with gdb.execute. xmethods sounds like something that maybe might be useful here: https://sourceware.org/gdb/onlinedocs/gdb/Xmethods-In-Python.html though from the docs it sounds like you can only replace class methods, not free functions, currently. Not sure, have never written any. Thanks, Pedro Alves
On 06/30/2017 07:11 PM, Pedro Alves wrote: > On 06/30/2017 06:27 PM, Zack Weinberg wrote: > >>> One dirty way around it would be for the printer to >>> re-define the errno macro (using to cast __errno_location to >>> the correct type before calling it, I guess: >>> >>> (gdb) macro define errno *(*(__error_t *(*) (void)) __errno_location) () >>> >>> That's make "errno" available when you compile with levels >>> lower than -g3, too. >> >> Hmm. How would one do that from inside Python? > > There's no direct Python API, I believe. You'd just call the CLI > command directly, with gdb.execute. > > xmethods sounds like something that maybe might be useful here: > https://sourceware.org/gdb/onlinedocs/gdb/Xmethods-In-Python.html > though from the docs it sounds like you can only replace class > methods, not free functions, currently. Not sure, have never written any. BTW, it'd be very nice if the printer could replace the "#define errno *__errno_location()" with an alternative implementation that would avoid the function call, so that "print errno": #1 - would also work when debugging core dumps #2 - is just plain safer. Having gdb call functions in the inferior always carries the risk of corrupting an already corrupt inferior even more. Thanks, Pedro Alves
On Friday 30 June 2017 10:57 PM, Zack Weinberg wrote: > The _primary_ goal of the glibc patches is to trigger a pretty-printer > when someone does > > (gdb) p errno > > with no further adornment. Since pretty-printers are keyed by type > (and since people do store errno codes in other places than errno), > this involves a type 'error_t' (and its implementation-namespace alias > '__error_t'), but if we can't get 'p errno' to work, we're probably > not going to bother. > > In all currently-supported configurations, this is glibc's definition of errno: > > extern int *__errno_location (void) __THROW __attribute__ ((__const__)); > #define errno (*__errno_location ()) > > (__THROW is a macro expanding to 'throw ()' when compiling C++.) > > The patches change that to > > extern __error_t *__errno_location (void) __THROW __attribute__ ((__const__)); > > which should be sufficient to get the pretty-printing happening. But > obviously that's only going to work if GDB actually knows that the > return type of __errno_location is __error_t*, and it doesn't seem to, > *even when debug information for the definition is available*: > > $ gdb /lib/x86_64-linux-gnu/libc.so.6 > GNU gdb (Debian 7.12-6) 7.12.0.20161007-git > ... > > Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols > from /usr/lib/debug/.build-id/cc/80584889db7a969292959a46c718a2b1500702.debug...done. > done. > (gdb) ptype __errno_location > type = int () > (gdb) p __errno_location > $1 = {<text variable, no debug info>} 0x20590 <__GI___errno_location> > > ... a suspicion occurs... > > (gdb) ptype __GI___errno_location > type = int *(void) > (gdb) p __GI___errno_location > $2 = {int *(void)} 0x20590 <__GI___errno_location> > > ... so I guess this is a problem with the __GI_ indirection, which > *may* be a thing we can resolve on our end. I don't fully understand > that stuff. > > Still, It Would Be Nice™ if you _didn't_ have to have the debug > symbols for libc.so installed for this to work. Probably a lot of > people think you only need those if you are debugging libc itself. The __GI_* alias is an internal alias of __errno_location and I've seen this before with other symbols, where a function address resolves to the internal alias instead of the public one in gdb as well as other places like objdump. It might make sense to turn this around, but I suspect there may be a reason for it that I am unaware of. Siddhesh
On 07/01/2017 03:35 PM, Siddhesh Poyarekar wrote: > On Friday 30 June 2017 10:57 PM, Zack Weinberg wrote: >> Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols >> from /usr/lib/debug/.build-id/cc/80584889db7a969292959a46c718a2b1500702.debug...done. >> done. >> (gdb) ptype __errno_location >> type = int () >> (gdb) p __errno_location >> $1 = {<text variable, no debug info>} 0x20590 <__GI___errno_location> >> >> ... a suspicion occurs... >> >> (gdb) ptype __GI___errno_location >> type = int *(void) >> (gdb) p __GI___errno_location >> $2 = {int *(void)} 0x20590 <__GI___errno_location> >> >> ... so I guess this is a problem with the __GI_ indirection, which >> *may* be a thing we can resolve on our end. I don't fully understand >> that stuff. > > The __GI_* alias is an internal alias of __errno_location and I've seen > this before with other symbols, where a function address resolves to the > internal alias instead of the public one in gdb as well as other places > like objdump. It might make sense to turn this around, but I suspect > there may be a reason for it that I am unaware of. > I look at this a bit today, and sent a gdb patch to handle this better now: https://sourceware.org/ml/gdb-patches/2017-07/msg00018.html I pushed it to the same branch as the other one: users/palves/whatis. Thanks, Pedro Alves
On 06/30/2017 07:11 PM, Pedro Alves wrote: > On 06/30/2017 06:27 PM, Zack Weinberg wrote: >> Pedro Alves wrote: >>> The next problem is that without debug info for __errno_location, >>> gdb has no clue of its prototype, only that its a function, and so >>> it assumes that it has type "int()", i.e., that it returns int, >>> while in reality it returns int * / __error_t *. (Falling back >>> to assuming "int" is IMO not the best idea, but I don't know >>> the history behind it.) >> >> Probably because that's the pre-C89 legacy default function signature? > > Yes, most likely. FYI, shortly after this discussion, yet another user showed up on #gdb confused by "p getenv("PATH") returning weird negative integer numbers [because he had no debug info for getenv...], so I decided to do something about it. I've now sent a patch series that stops GDB from assuming no-debug-info symbols have type int: [PATCH 00/13] No-debug-info debugging improvements https://sourceware.org/ml/gdb-patches/2017-07/msg00112.html Comments welcome. Thanks, Pedro Alves
On 07/13/2017 03:30 AM, Pedro Alves wrote: > On 06/30/2017 07:11 PM, Pedro Alves wrote: >> On 06/30/2017 06:27 PM, Zack Weinberg wrote: >>> Pedro Alves wrote: >>>> The next problem is that without debug info for __errno_location, >>>> gdb has no clue of its prototype, only that its a function, and so >>>> it assumes that it has type "int()", i.e., that it returns int, >>>> while in reality it returns int * / __error_t *. (Falling back >>>> to assuming "int" is IMO not the best idea, but I don't know >>>> the history behind it.) >>> >>> Probably because that's the pre-C89 legacy default function signature? >> >> Yes, most likely. > > FYI, shortly after this discussion, yet another user showed up > on #gdb confused by "p getenv("PATH") returning weird negative > integer numbers [because he had no debug info for getenv...], so I > decided to do something about it. I've now sent a patch series > that stops GDB from assuming no-debug-info symbols have > type int: > > [PATCH 00/13] No-debug-info debugging improvements > https://sourceware.org/ml/gdb-patches/2017-07/msg00112.html > > Comments welcome. FYI, this is now all in gdb master. I believe all the gdb issues uncovered by the errno printer have been addressed. Let me know if you're aware of something still not working properly. Thanks, Pedro Alves
On Mon, Sep 4, 2017 at 5:25 PM, Pedro Alves <palves@redhat.com> wrote: > > FYI, this is now all in gdb master. I believe all the gdb issues > uncovered by the errno printer have been addressed. Let me know > if you're aware of something still not working properly. I'm sorry I never got around to experimenting with your patches before now. With gdb master as of earlier today, and my patched glibc that tries to pretty-print errno, I can confirm that nearly everything works as desired. `p (error_t) 0` invokes the pretty-printer, and when preprocessor macro bodies are available to the debugger (-ggdb3) so does `p errno`. Unfortunately I am still getting this error message when I try to print the underlying thread-local errno variable (which is what `p errno` does if macro bodies are not available): Cannot find thread-local storage for process 4367, executable file /home/zack/projects/glibc/glibc-build/stdlib/test-errno-printer: Cannot find thread-local variables on this target I don't understand why thread-local variables are inaccessible on my perfectly ordinary x86_64-unknown-linux-gnu workstation (the base OS is Debian 'stretch'). Do you have any idea what might be wrong? zw
On 09/05/2017 10:15 PM, Zack Weinberg wrote: > On Mon, Sep 4, 2017 at 5:25 PM, Pedro Alves <palves@redhat.com> wrote: >> >> FYI, this is now all in gdb master. I believe all the gdb issues >> uncovered by the errno printer have been addressed. Let me know >> if you're aware of something still not working properly. > > I'm sorry I never got around to experimenting with your patches before now. Really no worries. > With gdb master as of earlier today, and my patched glibc that tries > to pretty-print errno, I can confirm that nearly everything works as > desired. `p (error_t) 0` invokes the pretty-printer, and when > preprocessor macro bodies are available to the debugger (-ggdb3) so > does `p errno`. Unfortunately I am still getting this error message > when I try to print the underlying thread-local errno variable (which > is what `p errno` does if macro bodies are not available): > > Cannot find thread-local storage for process 4367, executable file > /home/zack/projects/glibc/glibc-build/stdlib/test-errno-printer: > Cannot find thread-local variables on this target > > I don't understand why thread-local variables are inaccessible on my > perfectly ordinary x86_64-unknown-linux-gnu workstation (the base OS > is Debian 'stretch'). Do you have any idea what might be wrong? I assume your test program isn't actually linked with -pthread? When you do "print errno" in this situation, because there's no "#define errno ..." in sight, gdb ends up finding the real "errno" symbol, which, even though the program isn't threaded, is a TLS symbol, and as such has a DWARF location expression describing its location as an offset into the thread-local block for the current thread. GDB needs to resolve that address, and for threaded programs that is normally done with assistance from libthread_db.so. The problem is then that libthread_db.so only works with programs that link with libpthread.so, and if your test program is actually non-threaded, it doesn't link with libpthread.so, So without libthread_db.so's assistance, gdb cannot "find [the address of] thread-local variables on this target". The error message is coming from generic GDB "get address of tls var" code several layers above GNU/Linux-specific libthread_db.so-interaction code, but still it could probably be made clearer, maybe by adding "the address of". A workaround specifically for errno, and only for live-process debugging [*] is the "macro define" trick I had suggested before: (gdb) macro define errno (*__errno_location ()) After that, "p errno" ends up calling __errno_location just like when you compile the test program with -g3. [*] doesn't work with core file / postmortem debugging. I don't know whether GDB could be able to resolve the location of the errno variable for the main thread (for single-threaded programs): - without libthread_db.so assistance and - without baking in awareness of glibc internal data structures If there is I'd absolutely love to learn about it. Thanks, Pedro Alves
On Tue, Sep 5, 2017 at 6:31 PM, Pedro Alves <palves@redhat.com> wrote: > On 09/05/2017 10:15 PM, Zack Weinberg wrote: >> >> I don't understand why thread-local variables are inaccessible on my >> perfectly ordinary x86_64-unknown-linux-gnu workstation (the base OS >> is Debian 'stretch'). Do you have any idea what might be wrong? > > I assume your test program isn't actually linked with -pthread? That is correct. > When you do "print errno" in this situation, because there's no > "#define errno ..." in sight, gdb ends up finding the real "errno" symbol, > which, even though the program isn't threaded, is a TLS symbol, and as such has > a DWARF location expression describing its location as an offset into the > thread-local block for the current thread. GDB needs to resolve that address, and > for threaded programs that is normally done with assistance from libthread_db.so. > The problem is then that libthread_db.so only works with programs that > link with libpthread.so, and if your test program is actually non-threaded, > it doesn't link with libpthread.so. I am not familiar with the glibc-side TLS implementation, nor with libthread_db.so, nor the code in GDB that uses libthread_db.so. However, reading the implementation of td_thr_tls_get_addr leads me to believe that that function is *supposed* to work even if libpthread.so has not been loaded into the 'inferior'. If it doesn't, perhaps that is a bug on our side. Do you know if GDB even tries? It's not obvious to me looking at linux-thread-db.c. > A workaround specifically for errno, and only for live-process debugging [*] > is the "macro define" trick I had suggested before: > > (gdb) macro define errno (*__errno_location ()) > > After that, "p errno" ends up calling __errno_location just > like when you compile the test program with -g3. Again, is it possible to do (the equivalent of) this from the initialization code of a pretty-printer module? Specifically, there already exists a Python function that's doing this: def register(objfile): """Register pretty printers for the current objfile.""" printer = gdb.printing.RegexpCollectionPrettyPrinter("glibc-errno") printer.add_printer('error_t', r'^(?:__)?error_t', ErrnoPrinter) if objfile == None: objfile = gdb gdb.printing.register_pretty_printer(objfile, printer) called when the module is loaded; what would I need to add to that so that the macro is defined (if it isn't already)? zw
diff --git a/gdb/eval.c b/gdb/eval.c index 2a39774..6aa2499 100644 --- a/gdb/eval.c +++ b/gdb/eval.c @@ -2735,8 +2735,10 @@ evaluate_subexp_standard (struct type *expect_type, result because we do not want to dig past all typedefs. */ check_typedef (type); +#if 0 if (TYPE_CODE (type) == TYPE_CODE_TYPEDEF) type = TYPE_TARGET_TYPE (type); +#endif return allocate_value (type); } else diff --git a/gdb/valops.c b/gdb/valops.c index 8675e6c..058fe1e 100644 --- a/gdb/valops.c +++ b/gdb/valops.c @@ -378,6 +378,8 @@ value_cast (struct type *type, struct value *arg2) /* We deref the value and then do the cast. */ return value_cast (type, coerce_ref (arg2)); + struct type *org_type = type; + type = check_typedef (type); code1 = TYPE_CODE (type); arg2 = coerce_ref (arg2); @@ -452,14 +454,14 @@ value_cast (struct type *type, struct value *arg2) && (code2 == TYPE_CODE_STRUCT || code2 == TYPE_CODE_UNION) && TYPE_NAME (type) != 0) { - struct value *v = value_cast_structs (type, arg2); + struct value *v = value_cast_structs (org_type, arg2); if (v) return v; } if (code1 == TYPE_CODE_FLT && scalar) - return value_from_double (type, value_as_double (arg2)); + return value_from_double (org_type, value_as_double (arg2)); else if (code1 == TYPE_CODE_DECFLOAT && scalar) { enum bfd_endian byte_order = gdbarch_byte_order (get_type_arch (type)); @@ -475,7 +477,7 @@ value_cast (struct type *type, struct value *arg2) /* The only option left is an integral type. */ decimal_from_integral (arg2, dec, dec_len, byte_order); - return value_from_decfloat (type, dec); + return value_from_decfloat (org_type, dec); } else if ((code1 == TYPE_CODE_INT || code1 == TYPE_CODE_ENUM || code1 == TYPE_CODE_RANGE) @@ -496,7 +498,7 @@ value_cast (struct type *type, struct value *arg2) gdbarch_byte_order (get_type_arch (type2))); else longest = value_as_long (arg2); - return value_from_longest (type, convert_to_boolean ? + return value_from_longest (org_type, convert_to_boolean ? (LONGEST) (longest ? 1 : 0) : longest); } else if (code1 == TYPE_CODE_PTR && (code2 == TYPE_CODE_INT @@ -522,14 +524,14 @@ value_cast (struct type *type, struct value *arg2) || longest <= -((LONGEST) 1 << addr_bit)) warning (_("value truncated")); } - return value_from_longest (type, longest); + return value_from_longest (org_type, longest); } else if (code1 == TYPE_CODE_METHODPTR && code2 == TYPE_CODE_INT && value_as_long (arg2) == 0) { - struct value *result = allocate_value (type); + struct value *result = allocate_value (org_type); - cplus_make_method_ptr (type, value_contents_writeable (result), 0, 0); + cplus_make_method_ptr (org_type, value_contents_writeable (result), 0, 0); return result; } else if (code1 == TYPE_CODE_MEMBERPTR && code2 == TYPE_CODE_INT @@ -537,7 +539,7 @@ value_cast (struct type *type, struct value *arg2) { /* The Itanium C++ ABI represents NULL pointers to members as minus one, instead of biasing the normal case. */ - return value_from_longest (type, -1); + return value_from_longest (org_type, -1); } else if (code1 == TYPE_CODE_ARRAY && TYPE_VECTOR (type) && code2 == TYPE_CODE_ARRAY && TYPE_VECTOR (type2) @@ -548,21 +550,21 @@ value_cast (struct type *type, struct value *arg2) error (_("can only cast scalar to vector of same size")); else if (code1 == TYPE_CODE_VOID) { - return value_zero (type, not_lval); + return value_zero (org_type, not_lval); } else if (TYPE_LENGTH (type) == TYPE_LENGTH (type2)) { if (code1 == TYPE_CODE_PTR && code2 == TYPE_CODE_PTR) - return value_cast_pointers (type, arg2, 0); + return value_cast_pointers (org_type, arg2, 0); arg2 = value_copy (arg2); - deprecated_set_value_type (arg2, type); - set_value_enclosing_type (arg2, type); + deprecated_set_value_type (arg2, org_type); + set_value_enclosing_type (arg2, org_type); set_value_pointed_to_offset (arg2, 0); /* pai: chk_val */ return arg2; } else if (VALUE_LVAL (arg2) == lval_memory) - return value_at_lazy (type, value_address (arg2)); + return value_at_lazy (org_type, value_address (arg2)); else { error (_("Invalid cast."));