Message ID | CADPb22Sb3r5RdXc_879iVF8uqMmjhoo9uAiyfOeKKTep6OL0Xw@mail.gmail.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 30302 invoked by alias); 26 Nov 2014 03:22:38 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 30280 invoked by uid 89); 26 Nov 2014 03:22:36 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.8 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mail-vc0-f174.google.com Received: from mail-vc0-f174.google.com (HELO mail-vc0-f174.google.com) (209.85.220.174) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Wed, 26 Nov 2014 03:22:34 +0000 Received: by mail-vc0-f174.google.com with SMTP id id10so911210vcb.19 for <gdb-patches@sourceware.org>; Tue, 25 Nov 2014 19:22:32 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=2lIjE++VNArZSklvJMUgYU0Fwj/WzKPQrpeGyaEftG8=; b=UKt8N6FBz4r8mnMX08NrthgM5zOon9EY9wI3qgC0pXxhXS3URx+VC6ZotrW+VshUY7 NJeuGw+MxjPsxusjBI/lb3+gFWulN+unzBT+/5nf7JPvWXkuX/g790+o6pER0Vpyfbuo WFfhA3m/z3+BF1cxs1OXDqmrU7OEaRohs39yJzToFDW5CaAw5/9TVkdrILG/KkowKRWE jt/SrhZn0eDfhsyiQ8ahgm7C/RzxHVtlNJZ8gC9qqeeZ6TgtCA1LCL4Gtn6jF4MsN04E FvoBZnUhYwK+OXdc8sZsynVE5lsIM1NcrE6YmD4wtEESjRNsrHQfHdAvXVWFSpM5vNLC 1QxA== X-Gm-Message-State: ALoCoQkCNjqb0SfAPUQYnZhCzIpO95WHFbmVA+7C6cLnsrz3ZaTdSJZyL7MOqiw3yQE50U7ppAUh MIME-Version: 1.0 X-Received: by 10.52.29.172 with SMTP id l12mr5792445vdh.33.1416972151968; Tue, 25 Nov 2014 19:22:31 -0800 (PST) Received: by 10.52.114.101 with HTTP; Tue, 25 Nov 2014 19:22:31 -0800 (PST) In-Reply-To: <CAGyQ6gzOr86ORPhUjvBZ1YpQdsFDnUqz8gSJsy399+OKVufqzA@mail.gmail.com> References: <CAGyQ6gxO+u9hc_6Qo8Z=-MsNgrBPruDSULYOQY-iQ+xv9d0xfw@mail.gmail.com> <CADPb22T800NSiXg4XjOwTgwQSZfs-uSBoSku0b7tkF=CAGkzwQ@mail.gmail.com> <CAGyQ6gwZhLkVwcay=r=n-V7W+eVn7iWrn_MRzksPegSym7A40g@mail.gmail.com> <CADPb22QmGovY394MbkwqZ34SVGEHkk7Z+6V5OqOGvbsxv_WgVg@mail.gmail.com> <CAGyQ6gzOr86ORPhUjvBZ1YpQdsFDnUqz8gSJsy399+OKVufqzA@mail.gmail.com> Date: Tue, 25 Nov 2014 19:22:31 -0800 Message-ID: <CADPb22Sb3r5RdXc_879iVF8uqMmjhoo9uAiyfOeKKTep6OL0Xw@mail.gmail.com> Subject: Re: [RFC] While processing a struct die, store the method's address in its fn_field From: Doug Evans <dje@google.com> To: Siva Chandra <sivachandra@google.com> Cc: gdb-patches <gdb-patches@sourceware.org> Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes |
Commit Message
Doug Evans
Nov. 26, 2014, 3:22 a.m. UTC
On Tue, Nov 25, 2014 at 3:37 PM, Siva Chandra <sivachandra@google.com> wrote: > On Tue, Nov 25, 2014 at 2:09 PM, Doug Evans <dje@google.com> wrote: >> For functions/methods gdb already uses lowpc as the start address. > > May be I am reading wrong somewhere, but if I am correct, GDB > currently gets to a low pc (the address of the method) of a method via > its symbol. So, I agree that GDB uses low pc anyway. GDB's symbol handling is pretty confusing. A better way to phrase this: "GDB currently gets to a low pc (the address of the method) of a method via its symbol" is to write it as: "One way GDB currently gets to a low pc (the address of the method) of a method via its ELF symbol." [Not only is "symtab" ambiguous in GDB, so is "symbol" ...] GDB tries a myriad of different ways to look up a symbol. If one way doesn't work it tries another. If that doesn't work it tries another. And on and on. Typically the last way GDB tries is to look the symbol up in the minsym symtab (the ELF symbol table). Note that this works for demangled c++ symbols not because we mangle them and then look up the linkage name (note: even "linkage name" is ambiguous in GDB!), but rather because we pre-demangle ELF symbols (*1) and match them that way. So while gdb uses minsyms for lookup, it does that as a last resort (as always there are exceptions, e.g., LOC_UNRESOLVED). (*1): At a cost of 10 of the 13 seconds of GDB startup time and a cost of 480MB of memory, for one of my benchmarks. Besides the performance cost, looking up minsyms this way hides bugs: PRs 17602, 17603. IOW, what if one of the preceding lookups *should* have worked? One useful experiment that I do from time to time is hack out minsyms, and see what happens. E.g., What still works? What broke? How much faster is gdb? later be used by elfstab_offset_sections. */ @@ -1213,6 +1215,8 @@ elf_read_minimal_symbols (struct objfile *objfile, int symfile_flags, synth_symbol_table, 1); } + } + /* Install any minimal symbols that have been collected as the current minimal symbols for this objfile. The debug readers below this point should not generate new minimal symbols; if they do it's their [N.B. That patch may not apply properly due to cut-n-paste errors.] btw, One interesting thing this experiment shows is that (currently) "start" != "tb main; run". IIRC, "start" needs the minsym for main, "tb main; run" does not. >> [IOW it doesn't use the linkage name, though even recently >> I'd forgotten this and thought otherwise. >> Also, GDB doesn't take a demangled name, mangle it, >> and then try to look that up in the ELF symbol table.] > > May be I messed up some terminology somewhere, symbol handling side of > GDB is new to me. But, looking at value_fn_field, it appears to me > that GDB gets a method's address via its linkage name (first looking > up its symbol/minsym). Also, value_fn_field uses lookup_symbol which > uses lookup_symbol_in_language which demangles the linkage name. Based > on this, I concluded that GDB cannot get the address of a method if > the linkage name is missing. This is my understanding now, which could > be wrong ofcourse! fwiw, GDB is bad enough that I often can't trust just reading the code. I often punt on that and actually single step through the relevant code to REALLY see what is happening. While lookup_symbol_in_language will demangle a mangled name for lookup, a good first question to answer is what name is being passed? Typically it is not a mangled name. [In another email I mentioned that the handling of lookup of mangled vs demangled symbols is a cleanup topic in itself. :-)] >>> [Clang >>> actually does not put out the low pc value for the subprogram die >>> under a structure die, but it puts out a separate (not under a >>> structure die) subprogram die for the operator() method with the low >>> pc value.] >> >> DWARF can be hard to read at times. >> Note that GCC can do this too. E.g., >> It will output a declaration in the proper context >> (e.g., inside a class definition for a method) without DW_AT_low_pc >> and then it will output another DIE at the top level with DW_AT_low_pc >> and with a DW_AT_specification referring back to previous DIE in its context. > > Yes, Clang takes this exact route and things work fine with a Clang > generated binary. GCC does not generate the top level DIE at all, but > stuffs in DW_AT_low_pc into the DIE under the class definition and > does not specify the linkage name. So, there is no DIE specifying the > linkage name of operator(). IMO however, what GCC is emitting is > logical sufficient. In which case, GDB needs to store the low pc value > somewhere, and my patch puts it in the type struct. GDB already has a place to store the start address for functions. Why is that not working here? Note that in my previous message, where I dumped the symbols, there is a symbol for operator() and it has the correct start address. IOW, GDB already has the start address. I still don't see the need for the new field. It seems to me that what's wrong is the lookup. Why isn't GDB finding that symbol, and how do we fix that? btw, when I try your patch I get one fail: p lambda(10)^M Invalid data type for function to be called.^M (gdb) FAIL: gdb.dwarf2/dw2-member-function-addr.exp: p lambda() Do you see this? I see this with clang too. I think the problem we need to solve is connecting the object to its operator().
Comments
I will go over your comments in detail and respond. One clarification until then ... On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote: > btw, when I try your patch I get one fail: > > p lambda(10)^M > Invalid data type for function to be called.^M > (gdb) FAIL: gdb.dwarf2/dw2-member-function-addr.exp: p lambda() As mentioned in my very first email, the tests depend on https://sourceware.org/ml/gdb-patches/2014-11/msg00479.html. We can remove this dependency by invoking the operator() explicitly "lambda.operator()(10)".
On Tue, Nov 25, 2014 at 7:58 PM, Siva Chandra <sivachandra@google.com> wrote: > I will go over your comments in detail and respond. One clarification > until then ... > > On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote: >> btw, when I try your patch I get one fail: >> >> p lambda(10)^M >> Invalid data type for function to be called.^M >> (gdb) FAIL: gdb.dwarf2/dw2-member-function-addr.exp: p lambda() > > As mentioned in my very first email, the tests depend on > https://sourceware.org/ml/gdb-patches/2014-11/msg00479.html. We can > remove this dependency by invoking the operator() explicitly > "lambda.operator()(10)". Ah righto. Digging a bit deeper, I see the lookup_symbol call in value_fn_field gets passed NULL for block. That means that lookup_local_symbol won't do anything, but what we're looking for (as can be seen in the symbol dump) is not in STATIC_BLOCK (block #001) nor GLOBAL_BLOCK (block #000). If I manually hack the value passed to lookup_symbol to be a block with __lambda0::operator()(int) const, and disable your patch, then "p lambda(5)" works. So I think we need to figure out how to pass a usable value for block to lookup_symbol in value_fn_field. Also, I wouldn't preclude a gcc bug here and/or a bug in dwarf2read.c where the lambda should have been in a different block than the one it was put in.
Phew... On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote: > fwiw, GDB is bad enough that I often can't trust just reading the code. > I often punt on that and actually single step through > the relevant code to REALLY see what is happening. I single stepped through the code and now see everything that you said here! I believed this in gdbtypes.h: 889 /* * If is_stub is clear, this is the mangled name which 890 we can look up to find the address of the method 891 (FIXME: it would be cleaner to have a pointer to the 892 struct symbol here instead). 893 894 If is_stub is set, this is the portion of the mangled 895 name which specifies the arguments. For example, "ii", 896 if there are two int arguments, or "" if there are no 897 arguments. See gdb_mangle_name for the conversion from 898 this format to the one used if is_stub is clear. */ 899 900 const char *physname; Rest of what I thought I understood fell out from this belief! And the names look so enticing. Resetting my mind ...
On Tue, Nov 25, 2014 at 11:31 PM, Siva Chandra <sivachandra@google.com> wrote: > Phew... > > On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote: >> fwiw, GDB is bad enough that I often can't trust just reading the code. >> I often punt on that and actually single step through >> the relevant code to REALLY see what is happening. > > I single stepped through the code and now see everything that you said here! > > I believed this in gdbtypes.h: > > 889 /* * If is_stub is clear, this is the mangled name which > 890 we can look up to find the address of the method > 891 (FIXME: it would be cleaner to have a pointer to the > 892 struct symbol here instead). > 893 > 894 If is_stub is set, this is the portion of the mangled > 895 name which specifies the arguments. For example, "ii", > 896 if there are two int arguments, or "" if there are no > 897 arguments. See gdb_mangle_name for the conversion from > 898 this format to the one used if is_stub is clear. */ > 899 > 900 const char *physname; > > Rest of what I thought I understood fell out from this belief! And the > names look so enticing. > > Resetting my mind ... Zoinks! Yeah, that comment needs some TLC. [btw, the term "physname" is in a transition mode. In some places it means "mangled name". In other places it means "demangled name". It's not clear to me what we want to transition to though.]
diff --git a/gdb/elfread.c b/gdb/elfread.c index 19aaed3..655872f 100644 --- a/gdb/elfread.c +++ b/gdb/elfread.c @@ -1126,6 +1126,8 @@ elf_read_minimal_symbols (struct objfile *objfile, int symfile_flags, set_objfile_data (objfile, dbx_objfile_data_key, dbx); make_cleanup (free_elfinfo, (void *) objfile); + if (0) { + /* Process the normal ELF symbol table first. This may write some chain of info into the dbx_symfile_info of the objfile, which can