[RFC] While processing a struct die, store the method's address in its fn_field

Message ID CADPb22Sb3r5RdXc_879iVF8uqMmjhoo9uAiyfOeKKTep6OL0Xw@mail.gmail.com
State New, archived
Headers

Commit Message

Doug Evans Nov. 26, 2014, 3:22 a.m. UTC
  On Tue, Nov 25, 2014 at 3:37 PM, Siva Chandra <sivachandra@google.com> wrote:
> On Tue, Nov 25, 2014 at 2:09 PM, Doug Evans <dje@google.com> wrote:
>> For functions/methods gdb already uses lowpc as the start address.
>
> May be I am reading wrong somewhere, but if I am correct, GDB
> currently gets to a low pc (the address of the method) of a method via
> its symbol. So, I agree that GDB uses low pc anyway.

GDB's symbol handling is pretty confusing.

A better way to phrase this:

"GDB currently gets to a low pc (the address of the method) of a method via
its symbol"

is to write it as:

"One way GDB currently gets to a low pc (the address of the method)
of a method via its ELF symbol."

[Not only is "symtab" ambiguous in GDB, so is "symbol" ...]

GDB tries a myriad of different ways to look up a symbol.
If one way doesn't work it tries another.
If that doesn't work it tries another.
And on and on.

Typically the last way GDB tries is to look the symbol up
in the minsym symtab (the ELF symbol table).
Note that this works for demangled c++ symbols
not because we mangle them and then look up the
linkage name (note: even "linkage name"
is ambiguous in GDB!), but rather because we pre-demangle
ELF symbols (*1) and match them that way.
So while gdb uses minsyms for lookup, it does that
as a last resort (as always there are exceptions,
e.g., LOC_UNRESOLVED).

(*1): At a cost of 10 of the 13 seconds of GDB startup time and
a cost of 480MB of memory, for one of my benchmarks.
Besides the performance cost, looking up minsyms
this way hides bugs: PRs 17602, 17603.
IOW, what if one of the preceding lookups *should* have worked?

One useful experiment that I do from time to time is
hack out minsyms, and see what happens.
E.g., What still works? What broke? How much faster is gdb?

      later be used by elfstab_offset_sections.  */
@@ -1213,6 +1215,8 @@ elf_read_minimal_symbols (struct objfile
*objfile, int symfile_flags,
                       synth_symbol_table, 1);
     }

+  }
+
   /* Install any minimal symbols that have been collected as the current
      minimal symbols for this objfile.  The debug readers below this point
      should not generate new minimal symbols; if they do it's their

[N.B. That patch may not apply properly due to cut-n-paste errors.]

btw, One interesting thing this experiment shows is that (currently)
"start" != "tb main; run".
IIRC, "start" needs the minsym for main, "tb main; run" does not.


>> [IOW it doesn't use the linkage name, though even recently
>> I'd forgotten this and thought otherwise.
>> Also, GDB doesn't take a demangled name, mangle it,
>> and then try to look that up in the ELF symbol table.]
>
> May be I messed up some terminology somewhere, symbol handling side of
> GDB is new to me. But, looking at value_fn_field, it appears to me
> that GDB gets a method's address via its linkage name (first looking
> up its symbol/minsym). Also, value_fn_field uses lookup_symbol which
> uses lookup_symbol_in_language which demangles the linkage name. Based
> on this, I concluded that GDB cannot get the address of a method if
> the linkage name is missing. This is my understanding now, which could
> be wrong ofcourse!

fwiw, GDB is bad enough that I often can't trust just reading the code.
I often punt on that and actually single step through
the relevant code to REALLY see what is happening.

While lookup_symbol_in_language will demangle a mangled
name for lookup, a good first question to answer is what name
is being passed?  Typically it is not a mangled name.
[In another email I mentioned that the handling of lookup
of mangled vs demangled symbols is a cleanup topic in itself. :-)]

>>>  [Clang
>>> actually does not put out the low pc value for the subprogram die
>>> under a structure die, but it puts out a separate (not under a
>>> structure die) subprogram die for the operator() method with the low
>>> pc value.]
>>
>> DWARF can be hard to read at times.
>> Note that GCC can do this too.  E.g.,
>> It will output a declaration in the proper context
>> (e.g., inside a class definition for a method) without DW_AT_low_pc
>> and then it will output another DIE at the top level with DW_AT_low_pc
>> and with a DW_AT_specification referring back to previous DIE in its context.
>
> Yes, Clang takes this exact route and things work fine with a Clang
> generated binary. GCC does not generate the top level DIE at all, but
> stuffs in DW_AT_low_pc into the DIE under the class definition and
> does not specify the linkage name. So, there is no DIE specifying the
> linkage name of operator(). IMO however, what GCC is emitting is
> logical sufficient. In which case, GDB needs to store the low pc value
> somewhere, and my patch puts it in the type struct.

GDB already has a place to store the start address for functions.
Why is that not working here?

Note that in my previous message, where I dumped the symbols,
there is a symbol for operator() and it has the correct start address.
IOW, GDB already has the start address.
I still don't see the need for the new field.
It seems to me that what's wrong is the lookup.
Why isn't GDB finding that symbol, and how do we fix that?

btw, when I try your patch I get one fail:

p lambda(10)^M
Invalid data type for function to be called.^M
(gdb) FAIL: gdb.dwarf2/dw2-member-function-addr.exp: p lambda()

Do you see this?
I see this with clang too.

I think the problem we need to solve is connecting the object
to its operator().
  

Comments

Siva Chandra Reddy Nov. 26, 2014, 3:58 a.m. UTC | #1
I will go over your comments in detail and respond. One clarification
until then ...

On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote:
> btw, when I try your patch I get one fail:
>
> p lambda(10)^M
> Invalid data type for function to be called.^M
> (gdb) FAIL: gdb.dwarf2/dw2-member-function-addr.exp: p lambda()

As mentioned in my very first email, the tests depend on
https://sourceware.org/ml/gdb-patches/2014-11/msg00479.html. We can
remove this dependency by invoking the operator() explicitly
"lambda.operator()(10)".
  
Doug Evans Nov. 26, 2014, 5:19 a.m. UTC | #2
On Tue, Nov 25, 2014 at 7:58 PM, Siva Chandra <sivachandra@google.com> wrote:
> I will go over your comments in detail and respond. One clarification
> until then ...
>
> On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote:
>> btw, when I try your patch I get one fail:
>>
>> p lambda(10)^M
>> Invalid data type for function to be called.^M
>> (gdb) FAIL: gdb.dwarf2/dw2-member-function-addr.exp: p lambda()
>
> As mentioned in my very first email, the tests depend on
> https://sourceware.org/ml/gdb-patches/2014-11/msg00479.html. We can
> remove this dependency by invoking the operator() explicitly
> "lambda.operator()(10)".

Ah righto.

Digging a bit deeper, I see the lookup_symbol call in value_fn_field
gets passed NULL for block.
That means that lookup_local_symbol won't do anything, but what we're
looking for (as can be seen in the symbol dump) is not in STATIC_BLOCK
(block #001) nor GLOBAL_BLOCK (block #000).

If I manually hack the value passed to lookup_symbol to be a block with
__lambda0::operator()(int) const,
and disable your patch, then "p lambda(5)" works.

So I think we need to figure out how to pass a usable value for block
to lookup_symbol in value_fn_field.  Also, I wouldn't preclude a gcc
bug here and/or a bug in dwarf2read.c where the lambda should have
been in a different block than the one it was put in.
  
Siva Chandra Reddy Nov. 26, 2014, 7:31 a.m. UTC | #3
Phew...

On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote:
> fwiw, GDB is bad enough that I often can't trust just reading the code.
> I often punt on that and actually single step through
> the relevant code to REALLY see what is happening.

I single stepped through the code and now see everything that you said here!

I believed this in gdbtypes.h:

 889             /* * If is_stub is clear, this is the mangled name which
 890                we can look up to find the address of the method
 891                (FIXME: it would be cleaner to have a pointer to the
 892                struct symbol here instead).
 893
 894                If is_stub is set, this is the portion of the mangled
 895                name which specifies the arguments.  For example, "ii",
 896                if there are two int arguments, or "" if there are no
 897                arguments.  See gdb_mangle_name for the conversion from
 898                this format to the one used if is_stub is clear.  */
 899
 900             const char *physname;

Rest of what I thought I understood fell out from this belief! And the
names look so enticing.

Resetting my mind ...
  
Doug Evans Nov. 26, 2014, 9:11 p.m. UTC | #4
On Tue, Nov 25, 2014 at 11:31 PM, Siva Chandra <sivachandra@google.com> wrote:
> Phew...
>
> On Tue, Nov 25, 2014 at 7:22 PM, Doug Evans <dje@google.com> wrote:
>> fwiw, GDB is bad enough that I often can't trust just reading the code.
>> I often punt on that and actually single step through
>> the relevant code to REALLY see what is happening.
>
> I single stepped through the code and now see everything that you said here!
>
> I believed this in gdbtypes.h:
>
>  889             /* * If is_stub is clear, this is the mangled name which
>  890                we can look up to find the address of the method
>  891                (FIXME: it would be cleaner to have a pointer to the
>  892                struct symbol here instead).
>  893
>  894                If is_stub is set, this is the portion of the mangled
>  895                name which specifies the arguments.  For example, "ii",
>  896                if there are two int arguments, or "" if there are no
>  897                arguments.  See gdb_mangle_name for the conversion from
>  898                this format to the one used if is_stub is clear.  */
>  899
>  900             const char *physname;
>
> Rest of what I thought I understood fell out from this belief! And the
> names look so enticing.
>
> Resetting my mind ...

Zoinks!  Yeah, that comment needs some TLC.

[btw, the term "physname" is in a transition mode.
In some places it means "mangled name".
In other places it means "demangled name".
It's not clear to me what we want to transition to though.]
  

Patch

diff --git a/gdb/elfread.c b/gdb/elfread.c
index 19aaed3..655872f 100644
--- a/gdb/elfread.c
+++ b/gdb/elfread.c
@@ -1126,6 +1126,8 @@  elf_read_minimal_symbols (struct objfile
*objfile, int symfile_flags,
   set_objfile_data (objfile, dbx_objfile_data_key, dbx);
   make_cleanup (free_elfinfo, (void *) objfile);

+  if (0) {
+
   /* Process the normal ELF symbol table first.  This may write some
      chain of info into the dbx_symfile_info of the objfile, which can