[2/2] btrace: avoid symbol lookup

Message ID 20140310214252.GA3105@host2.jankratochvil.net
State Not applicable
Headers

Commit Message

Jan Kratochvil March 10, 2014, 9:42 p.m. UTC
  On Fri, 07 Mar 2014 09:57:45 +0100, Markus Metzger wrote:
> --- a/gdb/btrace.c
> +++ b/gdb/btrace.c
> @@ -451,13 +451,17 @@ ftrace_update_function (struct gdbarch *gdbarch,
>    struct symbol *fun;
>    struct btrace_insn *last;
>  
> -  /* Try to determine the function we're in.  We use both types of symbols
> -     to avoid surprises when we sometimes get a full symbol and sometimes
> -     only a minimal symbol.  */
> -  fun = find_pc_function (pc);
> +  /* Try to determine the function we're in.  */
>    bmfun = lookup_minimal_symbol_by_pc (pc);
>    mfun = bmfun.minsym;
>  
> +  /* We only lookup the symbol in the debug information if we have not found
> +     a minimal symbol.  This avoids the expensive lookup for globally visible
> +     symbols.  */
> +  fun = NULL;
> +  if (mfun == NULL)
> +    fun = find_pc_function (pc);
> +
>    if (fun == NULL && mfun == NULL)
>      DEBUG_FTRACE ("no symbol at %s", core_addr_to_string_nz (pc));
>  
[...]

Behavior after the change is not the same.  DWARF functions symbols can span
over discontiguous ranges with DW_AT_ranges but their corresponding ELF
function symbols cannot, therefore those are just some approximation.

Testcase gdb.dwarf2/dw2-objfile-overlap.exp tests such a case, from:
	https://sourceware.org/ml/gdb-patches/2011-11/msg00166.html

For address of symbol "inner":
 * find_pc_function finds DWARF symbol "inner"
 * lookup_minimal_symbol_by_pc finds ELF symbol "outer_inner"

In few Fedora 20 x86_64 -O2 -g built libraries I have not found any
DW_TAG_subprogram using DW_AT_ranges, only DW_TAG_inlined_subroutine use
DW_AT_ranges.  But that is more a limitation of gcc, other compilers may
produce DW_TAG_subprogram using DW_AT_ranges with overlapping function parts.

Inlined functions are found by neither find_pc_function nor
lookup_minimal_symbol_by_pc (block_containing_function (block_for_pc ()) finds
inlined function).  Not sure if it is not a bug of find_pc_function but that
is off-topic here.

I do not think it will be hit in real world cases.  GDB would need some better
abstraction of the symbol kinds to be more systematic in what it outputs.

It would be good to know how much it helps your usecase as it is not a fully
clean/transparent change.


Thanks,
Jan


set confirm no
set trace-commands
file gdb.dwarf2/dw2-objfile-overlap-outer.x
add-symbol-file gdb.dwarf2/dw2-objfile-overlap-inner.x outer_inner
info sym inner
->
find_pc_function "inner".
lookup_minimal_symbol_by_pc "outer_inner".
  

Comments

Metzger, Markus T March 11, 2014, 10:08 a.m. UTC | #1
> -----Original Message-----
> From: gdb-patches-owner@sourceware.org [mailto:gdb-patches-
> owner@sourceware.org] On Behalf Of Jan Kratochvil
> Sent: Monday, March 10, 2014 10:43 PM
> To: Metzger, Markus T
> Cc: gdb-patches@sourceware.org
> Subject: Re: [PATCH 2/2] btrace: avoid symbol lookup
> 
> On Fri, 07 Mar 2014 09:57:45 +0100, Markus Metzger wrote:
> > --- a/gdb/btrace.c
> > +++ b/gdb/btrace.c
> > @@ -451,13 +451,17 @@ ftrace_update_function (struct gdbarch
> *gdbarch,
> >    struct symbol *fun;
> >    struct btrace_insn *last;
> >
> > -  /* Try to determine the function we're in.  We use both types of symbols
> > -     to avoid surprises when we sometimes get a full symbol and sometimes
> > -     only a minimal symbol.  */
> > -  fun = find_pc_function (pc);
> > +  /* Try to determine the function we're in.  */
> >    bmfun = lookup_minimal_symbol_by_pc (pc);
> >    mfun = bmfun.minsym;
> >
> > +  /* We only lookup the symbol in the debug information if we have not
> found
> > +     a minimal symbol.  This avoids the expensive lookup for globally visible
> > +     symbols.  */
> > +  fun = NULL;
> > +  if (mfun == NULL)
> > +    fun = find_pc_function (pc);
> > +
> >    if (fun == NULL && mfun == NULL)
> >      DEBUG_FTRACE ("no symbol at %s", core_addr_to_string_nz (pc));
> >
> [...]
> 
> Behavior after the change is not the same.  DWARF functions symbols can
> span
> over discontiguous ranges with DW_AT_ranges but their corresponding ELF
> function symbols cannot, therefore those are just some approximation.
> 
> Testcase gdb.dwarf2/dw2-objfile-overlap.exp tests such a case, from:
> 	https://sourceware.org/ml/gdb-patches/2011-11/msg00166.html
> 
> For address of symbol "inner":
>  * find_pc_function finds DWARF symbol "inner"
>  * lookup_minimal_symbol_by_pc finds ELF symbol "outer_inner"
> 
> In few Fedora 20 x86_64 -O2 -g built libraries I have not found any
> DW_TAG_subprogram using DW_AT_ranges, only
> DW_TAG_inlined_subroutine use
> DW_AT_ranges.  But that is more a limitation of gcc, other compilers may
> produce DW_TAG_subprogram using DW_AT_ranges with overlapping
> function parts.

I have not seen DW_AT_ranges used for split functions.  OpenMP is a good
candidate for this so I checked what gcc 4.8.2 and a recent icc are doing today.

Both generate separate functions for parallel regions and describe them as
artificial functions in DWARF.  Whereas GCC generates a separate local function
in ELF, ICC puts the parallel region function after the function it was extracted
from and describes both as a single function in ELF.

For GCC we get a function with a funny name in both cases.
For ICC we get a function with a funny name in the DWARF case and the name
of the function from which the parallel region has been extracted in the ELF case.

So the result is different in some cases.  I'm not sure whether it matters, though.

Inline functions are a different topic.  I would like to support them one day.
As you point out below, the existing version does not support inline functions
either, so some work is necessary.
We might need some more work in symbol handling to allow a fast lookup
covering only inline functions.  We could combine this with the minimal symbol
lookup to get both inline functions support and reasonable performance.


> Inlined functions are found by neither find_pc_function nor
> lookup_minimal_symbol_by_pc (block_containing_function (block_for_pc
> ()) finds
> inlined function).  Not sure if it is not a bug of find_pc_function but that
> is off-topic here.
> 
> I do not think it will be hit in real world cases.  GDB would need some better
> abstraction of the symbol kinds to be more systematic in what it outputs.
> 
> It would be good to know how much it helps your usecase as it is not a fully
> clean/transparent change.

As benchmark, I traced "gdb a.out -ex quit" with a breakpoint on quit_force
for debug versions of both gdb and a.out.  The tracing GDB was compiled
without any additional options; the traced GDB was compiled with "-g -O0".
I used a relatively big buffer that contained ~1.5 million instructions.
I used "maintenance time 1" to measure the execution time of "info record".

The first patch improves performance by ~2x.
This patch improves performance by ~3x on top of the first patch.

I first tried to cache the returned symbol and check whether the next PC
belongs to the same function.  My cache hit in ~50% of the cases but showed
no benefit.  For the majority of the other ~50% no symbol was found.  I can't
cache those.

What's missing is a "fast fail", i.e. quickly determine that we won't find any
symbol for this PC.  I won't be able to do this in a reasonable amount of time,
though, so I thought this patch is a compromise between functionality and
performance.


Regards,
Markus.

Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen, Deutschland
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
Registergericht: Muenchen HRB 47456
Ust.-IdNr./VAT Registration No.: DE129385895
Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052
  
Jan Kratochvil March 21, 2014, 5:22 p.m. UTC | #2
On Tue, 11 Mar 2014 11:08:43 +0100, Metzger, Markus T wrote:
> What's missing is a "fast fail", i.e. quickly determine that we won't find any
> symbol for this PC.  I won't be able to do this in a reasonable amount of time,
> though, so I thought this patch is a compromise between functionality and
> performance.

I do not think providing incorrect behavior for performance reasons is a valid
tradeoff.  The right way would be to fix the DWARF lookups to be fast enough.

But I no longer approve GDB patches so this is just my personal opinion.


Jan
  
Metzger, Markus T March 24, 2014, 7:55 a.m. UTC | #3
> -----Original Message-----
> From: Jan Kratochvil [mailto:jan.kratochvil@redhat.com]
> Sent: Friday, March 21, 2014 6:22 PM
> To: Metzger, Markus T
> Cc: gdb-patches@sourceware.org; Pedro Alves (palves@redhat.com)
> Subject: Re: [PATCH 2/2] btrace: avoid symbol lookup
> 
> On Tue, 11 Mar 2014 11:08:43 +0100, Metzger, Markus T wrote:
> > What's missing is a "fast fail", i.e. quickly determine that we won't find any
> > symbol for this PC.  I won't be able to do this in a reasonable amount of
> time,
> > though, so I thought this patch is a compromise between functionality and
> > performance.
> 
> I do not think providing incorrect behavior for performance reasons is a valid
> tradeoff.  The right way would be to fix the DWARF lookups to be fast
> enough.

I realized after I sent this that the word 'functionality' was not chosen well.

The only actual change in functionality I was able to observe was missing
parens for the main function, i.e. it had been printed "main()" and is now
printed "main".  That's because its language is 'auto' and not 'c' or 'cpp'.

I believe that this could and should be fixed by fixing up the language of
the main mini-symbol to align with the language of other symbols in the
same object file.

I think the compromise is rather between a nice, general solution that
benefits everybody and a local solution that only benefits btrace and that
might make supporting inline functions more difficult in the future.


Regards,
Markus.

Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen, Deutschland
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
Registergericht: Muenchen HRB 47456
Ust.-IdNr./VAT Registration No.: DE129385895
Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052
  
Jan Kratochvil March 24, 2014, 8:37 a.m. UTC | #4
On Mon, 24 Mar 2014 08:55:33 +0100, Metzger, Markus T wrote:
> > I do not think providing incorrect behavior for performance reasons is a valid
> > tradeoff.  The right way would be to fix the DWARF lookups to be fast
> > enough.
[...]
> The only actual change in functionality I was able to observe was missing
> parens for the main function,

I agree in 99.9% of usecases it will work the same.  This still does not prove
it is correct.

I believe I can create a reproducer with two overlapping functions:
  0..1: a()
  2..3: b()
  4..6: a()
  7..8: b()
properly describe by DW_AT_ranges which will work with former GDB but which
will no longer work with patched GDB.

This may definitely happen for some user .S code with hand-coded DWARF.
I do not say it necessarilly happens with any real world compiler output.

This may happen for gdb.dwarf2/dw2-objfile-overlap.exp which comes from a real
world case of Linux kernel modules mapping.

But maybe I miss something and I would fail to create the reproducer, if you
do not agree I can create a .S with hand coded DWARF I can try to create one.

Corner cases are the ones most difficult to debug and it is a pity when
debugger provides incorrect output in such corner cases.


As I said maybe this compromise is acceptable as it may not be hit in real
world usage cases but I do not want to make this decision.


> I think the compromise is rather between a nice, general solution that
> benefits everybody and a local solution that only benefits btrace

This is the second reason why I did not agree with the patch.  GDB needs to be
faster and if this PC->functionname mapping can be accelerated such way then
it should be done globally.


Regards,
Jan
  
Metzger, Markus T March 24, 2014, 9:21 a.m. UTC | #5
> -----Original Message-----
> From: Jan Kratochvil [mailto:jan.kratochvil@redhat.com]
> Sent: Monday, March 24, 2014 9:37 AM


> On Mon, 24 Mar 2014 08:55:33 +0100, Metzger, Markus T wrote:
> > > I do not think providing incorrect behavior for performance reasons is a
> valid
> > > tradeoff.  The right way would be to fix the DWARF lookups to be fast
> > > enough.
> [...]
> > The only actual change in functionality I was able to observe was missing
> > parens for the main function,
> 
> I agree in 99.9% of usecases it will work the same.  This still does not prove
> it is correct.
> 
> I believe I can create a reproducer with two overlapping functions:
>   0..1: a()
>   2..3: b()
>   4..6: a()
>   7..8: b()
> properly describe by DW_AT_ranges which will work with former GDB but
> which
> will no longer work with patched GDB.
> 
> This may definitely happen for some user .S code with hand-coded DWARF.
> I do not say it necessarilly happens with any real world compiler output.
> 
> This may happen for gdb.dwarf2/dw2-objfile-overlap.exp which comes from
> a real
> world case of Linux kernel modules mapping.
> 
> But maybe I miss something and I would fail to create the reproducer, if you
> do not agree I can create a .S with hand coded DWARF I can try to create one.

No need.  I certainly agree that one can write such an assembly file and that
one can describe this in DWARF but not in ELF.

To describe this in ELF one would split a and b and thus end up with 4 functions -
that's what compilers seem to be doing today.  And they also seem to emit
DWARF that describes it that way.


> Corner cases are the ones most difficult to debug and it is a pity when
> debugger provides incorrect output in such corner cases.

I agree.


> As I said maybe this compromise is acceptable as it may not be hit in real
> world usage cases but I do not want to make this decision.

If you just look hard enough, I guess you will find everything somewhere.

Those who use compilers, on the other hand, may not even be able to
tell the difference - except that it's faster which allows them to use
bigger buffers and thus record longer traces.


> > I think the compromise is rather between a nice, general solution that
> > benefits everybody and a local solution that only benefits btrace
> 
> This is the second reason why I did not agree with the patch.  GDB needs to
> be
> faster and if this PC->functionname mapping can be accelerated such way
> then
> it should be done globally.

The specific problem of btrace is that it needs to do a huge number of symbol
lookups within a single GDB command.  I do not know any other GDB command
that gets anywhere near to that.

Regards,
Markus.
Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen, Deutschland
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Christian Lamprechter, Hannes Schwaderer, Douglas Lusk
Registergericht: Muenchen HRB 47456
Ust.-IdNr./VAT Registration No.: DE129385895
Citibank Frankfurt a.M. (BLZ 502 109 00) 600119052
  

Patch

diff --git a/gdb/printcmd.c b/gdb/printcmd.c
index 87b1448..f1517d2 100644
--- a/gdb/printcmd.c
+++ b/gdb/printcmd.c
@@ -1107,6 +1107,27 @@  sym_info (char *arg, int from_tty)
     error_no_arg (_("address"));
 
   addr = parse_and_eval_address (arg);
+
+{
+struct symbol *sym=find_pc_function (addr);
+if (sym) {
+  printf_filtered ("find_pc_function \"");
+  fprintf_symbol_filtered (gdb_stdout, SYMBOL_PRINT_NAME (sym),
+			   current_language->la_language, DMGL_ANSI);
+  printf_filtered ("\".\n");
+}
+}
+{
+struct bound_minimal_symbol bmfun=lookup_minimal_symbol_by_pc(addr);
+if (bmfun.minsym) {
+  printf_filtered ("lookup_minimal_symbol_by_pc \"");
+  fprintf_symbol_filtered (gdb_stdout, MSYMBOL_PRINT_NAME (bmfun.minsym),
+			   current_language->la_language, DMGL_ANSI);
+  printf_filtered ("\".\n");
+}
+}
+
+
   ALL_OBJSECTIONS (objfile, osect)
   {
     /* Only process each object file once, even if there's a separate