[patchv2,2/2] Accelerate lookup_symbol_aux_objfile 14.5x [Re: [patch 0/2] Accelerate symbol lookups 15x]

Message ID 20141023182434.GA31412@host2.jankratochvil.net
State New, archived
Headers

Commit Message

Jan Kratochvil Oct. 23, 2014, 6:24 p.m. UTC
  On Wed, 22 Oct 2014 10:55:18 +0200, Doug Evans wrote:
> For example, the count of calls to dict_hash before/after goes from 13.8M to 31.
> I'm guessing one t hing we're doing here is coping with an artifact of
> dwz:

During my simple test on non-DWZ file (./gdb itself) it went 3684->3484.

The problem is that dict_cash(val) is called for the same val for each block
(== symtab).

On DWZ the saving is probably much larger as there are many more symtabs due
to DW_TAG_partial_unit ones.


> what was once one global block to represent the entire objfile is
> now N.

Without DWZ there are X global blocks for X primary symtabs for X CUs of
objfile.  With DWZ there are X+Y global blocks for X+Y primary symtabs for
X+Y CUs where Y are 'DW_TAG_partial_unit's.

For 'DW_TAG_partial_unit's (Ys) their blockvector is usually empty.  But not
always, I have found there typedef symbols, there can IMO be optimized-out
static variables etc.


> [I'm sure the patches help in the non-dwz case, but I suspect it's less.
> Which isn't to say the patches aren't useful.
> I just need play with a few more examples.]

I agree.

[patch 2/2] could needlessly performance-regress non-DWZ cases, therefore
I have put back original ALL_OBJFILE_PRIMARY_SYMTABS (instead of my
ALL_OBJFILE_SYMTABS) as it is perfectly sufficient.  For the performance
testcase of mine:

Benchmark on non-trivial application with    'p <tab><tab>':
Command execution time:   4.215000 (cpu),   4.241466 (wall) --- both fixes with new [patch 2/2]
Command execution time:   7.373000 (cpu),   7.395095 (wall) --- both fixes
Command execution time:  13.572000 (cpu),  13.592689 (wall) --- just lookup_symbol_aux_objfile fix
Command execution time: 113.036000 (cpu), 113.067995 (wall) --- FSF GDB HEAD

That is additional 1.75x improvement, making the total improvement 26.8x.


No regressions on {x86_64,x86_64-m32,i686}-fedora21pre-linux-gnu in standard
and .gdb_index-enabled runs.  Neither of the patches should cause any visible
behavior change.


Thanks,
Jan
gdb/
2014-10-23  Jan Kratochvil  <jan.kratochvil@redhat.com>

	* symtab.c (lookup_symbol_aux_objfile): Use ALL_OBJFILE_SYMTABS, inline
	lookup_block_symbol.
  

Comments

Doug Evans Oct. 24, 2014, 7:16 a.m. UTC | #1
On Thu, Oct 23, 2014 at 11:24 AM, Jan Kratochvil
<jan.kratochvil@redhat.com> wrote:
> On Wed, 22 Oct 2014 10:55:18 +0200, Doug Evans wrote:
>> For example, the count of calls to dict_hash before/after goes from 13.8M to 31.
>> I'm guessing one t hing we're doing here is coping with an artifact of
>> dwz:
>
> During my simple test on non-DWZ file (./gdb itself) it went 3684->3484.
>
> The problem is that dict_cash(val) is called for the same val for each block
> (== symtab).
>
> On DWZ the saving is probably much larger as there are many more symtabs due
> to DW_TAG_partial_unit ones.

Much larger indeed.

One worry I have is that while this helps dwz does it harm something else.
Seems unlikely, but some simple measurements I took make me want to take more.

One thought I have is that significant changes at a higher level will
minimize the impact of this patch.  One change I'm thinking of making
is replacing iterating over every symbol table and then if that fails
going to the index with instead just going straight to the index: the
index knows where the symbols are (you mentioned this as well).
Perhaps not what we want to do for partial syms (though maybe partial
syms could work similarly, haven't gotten that far).  But with an
index it seems clumsy to iterate over all symtabs and then go to the
index.

It should be easy enough to do a quick hack to do an experiment to
collect some data.
I'll try to get to it this weekend.

>> what was once one global block to represent the entire objfile is
>> now N.
>
> Without DWZ there are X global blocks for X primary symtabs for X CUs of
> objfile.  With DWZ there are X+Y global blocks for X+Y primary symtabs for
> X+Y CUs where Y are 'DW_TAG_partial_unit's.

Yep.

> For 'DW_TAG_partial_unit's (Ys) their blockvector is usually empty.  But not
> always, I have found there typedef symbols, there can IMO be optimized-out
> static variables etc.
>
>
>> [I'm sure the patches help in the non-dwz case, but I suspect it's less.
>> Which isn't to say the patches aren't useful.
>> I just need play with a few more examples.]
>
> I agree.
>
> [patch 2/2] could needlessly performance-regress non-DWZ cases, therefore
> I have put back original ALL_OBJFILE_PRIMARY_SYMTABS (instead of my
> ALL_OBJFILE_SYMTABS) as it is perfectly sufficient.  For the performance
> testcase of mine:
>
> Benchmark on non-trivial application with    'p <tab><tab>':
> Command execution time:   4.215000 (cpu),   4.241466 (wall) --- both fixes with new [patch 2/2]
> Command execution time:   7.373000 (cpu),   7.395095 (wall) --- both fixes
> Command execution time:  13.572000 (cpu),  13.592689 (wall) --- just lookup_symbol_aux_objfile fix
> Command execution time: 113.036000 (cpu), 113.067995 (wall) --- FSF GDB HEAD
>
> That is additional 1.75x improvement, making the total improvement 26.8x.
>
>
> No regressions on {x86_64,x86_64-m32,i686}-fedora21pre-linux-gnu in standard
> and .gdb_index-enabled runs.  Neither of the patches should cause any visible
> behavior change.
>
>
> Thanks,
> Jan
>
> gdb/
> 2014-10-23  Jan Kratochvil  <jan.kratochvil@redhat.com>
>
>         * symtab.c (lookup_symbol_aux_objfile): Use ALL_OBJFILE_SYMTABS, inline
>         lookup_block_symbol.
>
> diff --git a/gdb/symtab.c b/gdb/symtab.c
> index c530d50..da13861 100644
> --- a/gdb/symtab.c
> +++ b/gdb/symtab.c
> @@ -1657,15 +1657,25 @@ lookup_symbol_aux_objfile (struct objfile *objfile, int block_index,
>    const struct block *block;
>    struct symtab *s;
>
> +  gdb_assert (block_index == GLOBAL_BLOCK || block_index == STATIC_BLOCK);
> +
>    ALL_OBJFILE_PRIMARY_SYMTABS (objfile, s)
>      {
> +      struct dict_iterator dict_iter;
> +
>        bv = BLOCKVECTOR (s);
>        block = BLOCKVECTOR_BLOCK (bv, block_index);
> -      sym = lookup_block_symbol (block, name, domain);
> -      if (sym)
> +
> +      for (sym = dict_iter_name_first (block->dict, name, &dict_iter);
> +          sym != NULL;
> +          sym = dict_iter_name_next (name, &dict_iter))
>         {
> -         block_found = block;
> -         return fixup_symbol_section (sym, objfile);
> +         if (symbol_matches_domain (SYMBOL_LANGUAGE (sym),
> +                                    SYMBOL_DOMAIN (sym), domain))
> +           {
> +             block_found = block;
> +             return fixup_symbol_section (sym, objfile);
> +           }
>         }
>      }
>
>

This breaks an abstraction boundary, IWBN to preserve it.
[IOW, I look at dict_* as being an implementation detail of blocks.]

If we were to go this route (and apologies for the delay), can you
write a routine like lookup_block_symbol which does the above and call
that here instead?

lookup_block_symbol should live in block.c, not symtab.c.
That's where this new routine should go too.
  
Jan Kratochvil Oct. 24, 2014, 7:33 a.m. UTC | #2
On Fri, 24 Oct 2014 09:16:01 +0200, Doug Evans wrote:
> One thought I have is that significant changes at a higher level will
> minimize the impact of this patch.  One change I'm thinking of making
> is replacing iterating over every symbol table and then if that fails
> going to the index with instead just going straight to the index: the
> index knows where the symbols are (you mentioned this as well).

Yes, I tried that first.

For the performance testcase I provided the issue is in
lookup_symbol_global_iterator_cb():

  data->result = lookup_symbol_aux_objfile (objfile, GLOBAL_BLOCK,
                                            data->name, data->domain);
  if (data->result == NULL)
    data->result = lookup_symbol_aux_quick (objfile, GLOBAL_BLOCK,
                                            data->name, data->domain);

Changing their order does not fix the performance as lookup_symbol_aux_quick()
(that is quick_symbol_functions::lookup_symbol) can return NULL
 * either if the symbol is really not present in the index.
 * or if the symbol's symtab is already expanded.

For the latter case (commonly happening) quick_symbol_functions::lookup_symbol
finds the right symtab but then it drops that information.

Changing the quick_symbol_functions::lookup_symbol semantics may fix it.
But then it will get fixed only for .gdb_index files while my two patches also
improve performance of non-.gdb_index files.

For each objfile .gdb_index presence is orthogonal to DWZ application.

Sure a question remains if we should care about performance of non-.gdb_index
files at all.  Even for continuous edit-build-debug cycles it is worth to run
gdb-add-index during each build.


> If we were to go this route (and apologies for the delay), can you
> write a routine like lookup_block_symbol which does the above and call
> that here instead?

OK.


Jan
  
Doug Evans Oct. 24, 2014, 4:07 p.m. UTC | #3
On Fri, Oct 24, 2014 at 12:33 AM, Jan Kratochvil
<jan.kratochvil@redhat.com> wrote:
> On Fri, 24 Oct 2014 09:16:01 +0200, Doug Evans wrote:
>> One thought I have is that significant changes at a higher level will
>> minimize the impact of this patch.  One change I'm thinking of making
>> is replacing iterating over every symbol table and then if that fails
>> going to the index with instead just going straight to the index: the
>> index knows where the symbols are (you mentioned this as well).
>
> Yes, I tried that first.
>
> For the performance testcase I provided the issue is in
> lookup_symbol_global_iterator_cb():
>
>   data->result = lookup_symbol_aux_objfile (objfile, GLOBAL_BLOCK,
>                                             data->name, data->domain);
>   if (data->result == NULL)
>     data->result = lookup_symbol_aux_quick (objfile, GLOBAL_BLOCK,
>                                             data->name, data->domain);
>
> Changing their order does not fix the performance as lookup_symbol_aux_quick()
> (that is quick_symbol_functions::lookup_symbol) can return NULL
>  * either if the symbol is really not present in the index.
>  * or if the symbol's symtab is already expanded.
>
> For the latter case (commonly happening) quick_symbol_functions::lookup_symbol
> finds the right symtab but then it drops that information.
>
> Changing the quick_symbol_functions::lookup_symbol semantics may fix it.

Yeah, the experiment I want to try is a bit more invasive such that,
for example, we only go to the index (and expand any symtabs found if
necessary and only search those ones). E.g., If the index returns NULL
there's no point in searching the full set of symtabs.

> But then it will get fixed only for .gdb_index files while my two patches also
> improve performance of non-.gdb_index files.

That's still an open issue, sure.

> For each objfile .gdb_index presence is orthogonal to DWZ application.
>
> Sure a question remains if we should care about performance of non-.gdb_index
> files at all.  Even for continuous edit-build-debug cycles it is worth to run
> gdb-add-index during each build.

Or have the compiler/linker build the index.
  

Patch

diff --git a/gdb/symtab.c b/gdb/symtab.c
index c530d50..da13861 100644
--- a/gdb/symtab.c
+++ b/gdb/symtab.c
@@ -1657,15 +1657,25 @@  lookup_symbol_aux_objfile (struct objfile *objfile, int block_index,
   const struct block *block;
   struct symtab *s;
 
+  gdb_assert (block_index == GLOBAL_BLOCK || block_index == STATIC_BLOCK);
+
   ALL_OBJFILE_PRIMARY_SYMTABS (objfile, s)
     {
+      struct dict_iterator dict_iter;
+
       bv = BLOCKVECTOR (s);
       block = BLOCKVECTOR_BLOCK (bv, block_index);
-      sym = lookup_block_symbol (block, name, domain);
-      if (sym)
+
+      for (sym = dict_iter_name_first (block->dict, name, &dict_iter);
+	   sym != NULL;
+	   sym = dict_iter_name_next (name, &dict_iter))
 	{
-	  block_found = block;
-	  return fixup_symbol_section (sym, objfile);
+	  if (symbol_matches_domain (SYMBOL_LANGUAGE (sym),
+				     SYMBOL_DOMAIN (sym), domain))
+	    {
+	      block_found = block;
+	      return fixup_symbol_section (sym, objfile);
+	    }
 	}
     }