[patchv2,2/2] Accelerate lookup_symbol_aux_objfile 14.5x [Re: [patch 0/2] Accelerate symbol lookups 15x]
Message ID | 20141023182434.GA31412@host2.jankratochvil.net |
---|---|
State | New, archived |
Headers |
Received: (qmail 6102 invoked by alias); 23 Oct 2014 18:24:42 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 6012 invoked by uid 89); 23 Oct 2014 18:24:41 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.9 required=5.0 tests=AWL, BAYES_00, KAM_STOCKGEN, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 23 Oct 2014 18:24:40 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s9NIOcsa004828 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 23 Oct 2014 14:24:38 -0400 Received: from host2.jankratochvil.net (ovpn-116-79.ams2.redhat.com [10.36.116.79]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id s9NIOY0U024147 (version=TLSv1/SSLv3 cipher=AES128-GCM-SHA256 bits=128 verify=NO); Thu, 23 Oct 2014 14:24:36 -0400 Date: Thu, 23 Oct 2014 20:24:34 +0200 From: Jan Kratochvil <jan.kratochvil@redhat.com> To: Doug Evans <xdje42@gmail.com> Cc: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org> Subject: [patchv2 2/2] Accelerate lookup_symbol_aux_objfile 14.5x [Re: [patch 0/2] Accelerate symbol lookups 15x] Message-ID: <20141023182434.GA31412@host2.jankratochvil.net> References: <20141020214410.GA22011@host2.jankratochvil.net> <CAP9bCMQ7EXyXJiqK4j2UA9YgxkQiNFFqJPOpbPXH8-YZMRLh2w@mail.gmail.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="vtzGhvizbBRQ85DL" Content-Disposition: inline In-Reply-To: <CAP9bCMQ7EXyXJiqK4j2UA9YgxkQiNFFqJPOpbPXH8-YZMRLh2w@mail.gmail.com> User-Agent: Mutt/1.5.23 (2014-03-12) X-IsSubscribed: yes |
Commit Message
Jan Kratochvil
Oct. 23, 2014, 6:24 p.m. UTC
On Wed, 22 Oct 2014 10:55:18 +0200, Doug Evans wrote: > For example, the count of calls to dict_hash before/after goes from 13.8M to 31. > I'm guessing one t hing we're doing here is coping with an artifact of > dwz: During my simple test on non-DWZ file (./gdb itself) it went 3684->3484. The problem is that dict_cash(val) is called for the same val for each block (== symtab). On DWZ the saving is probably much larger as there are many more symtabs due to DW_TAG_partial_unit ones. > what was once one global block to represent the entire objfile is > now N. Without DWZ there are X global blocks for X primary symtabs for X CUs of objfile. With DWZ there are X+Y global blocks for X+Y primary symtabs for X+Y CUs where Y are 'DW_TAG_partial_unit's. For 'DW_TAG_partial_unit's (Ys) their blockvector is usually empty. But not always, I have found there typedef symbols, there can IMO be optimized-out static variables etc. > [I'm sure the patches help in the non-dwz case, but I suspect it's less. > Which isn't to say the patches aren't useful. > I just need play with a few more examples.] I agree. [patch 2/2] could needlessly performance-regress non-DWZ cases, therefore I have put back original ALL_OBJFILE_PRIMARY_SYMTABS (instead of my ALL_OBJFILE_SYMTABS) as it is perfectly sufficient. For the performance testcase of mine: Benchmark on non-trivial application with 'p <tab><tab>': Command execution time: 4.215000 (cpu), 4.241466 (wall) --- both fixes with new [patch 2/2] Command execution time: 7.373000 (cpu), 7.395095 (wall) --- both fixes Command execution time: 13.572000 (cpu), 13.592689 (wall) --- just lookup_symbol_aux_objfile fix Command execution time: 113.036000 (cpu), 113.067995 (wall) --- FSF GDB HEAD That is additional 1.75x improvement, making the total improvement 26.8x. No regressions on {x86_64,x86_64-m32,i686}-fedora21pre-linux-gnu in standard and .gdb_index-enabled runs. Neither of the patches should cause any visible behavior change. Thanks, Jan gdb/ 2014-10-23 Jan Kratochvil <jan.kratochvil@redhat.com> * symtab.c (lookup_symbol_aux_objfile): Use ALL_OBJFILE_SYMTABS, inline lookup_block_symbol.
Comments
On Thu, Oct 23, 2014 at 11:24 AM, Jan Kratochvil <jan.kratochvil@redhat.com> wrote: > On Wed, 22 Oct 2014 10:55:18 +0200, Doug Evans wrote: >> For example, the count of calls to dict_hash before/after goes from 13.8M to 31. >> I'm guessing one t hing we're doing here is coping with an artifact of >> dwz: > > During my simple test on non-DWZ file (./gdb itself) it went 3684->3484. > > The problem is that dict_cash(val) is called for the same val for each block > (== symtab). > > On DWZ the saving is probably much larger as there are many more symtabs due > to DW_TAG_partial_unit ones. Much larger indeed. One worry I have is that while this helps dwz does it harm something else. Seems unlikely, but some simple measurements I took make me want to take more. One thought I have is that significant changes at a higher level will minimize the impact of this patch. One change I'm thinking of making is replacing iterating over every symbol table and then if that fails going to the index with instead just going straight to the index: the index knows where the symbols are (you mentioned this as well). Perhaps not what we want to do for partial syms (though maybe partial syms could work similarly, haven't gotten that far). But with an index it seems clumsy to iterate over all symtabs and then go to the index. It should be easy enough to do a quick hack to do an experiment to collect some data. I'll try to get to it this weekend. >> what was once one global block to represent the entire objfile is >> now N. > > Without DWZ there are X global blocks for X primary symtabs for X CUs of > objfile. With DWZ there are X+Y global blocks for X+Y primary symtabs for > X+Y CUs where Y are 'DW_TAG_partial_unit's. Yep. > For 'DW_TAG_partial_unit's (Ys) their blockvector is usually empty. But not > always, I have found there typedef symbols, there can IMO be optimized-out > static variables etc. > > >> [I'm sure the patches help in the non-dwz case, but I suspect it's less. >> Which isn't to say the patches aren't useful. >> I just need play with a few more examples.] > > I agree. > > [patch 2/2] could needlessly performance-regress non-DWZ cases, therefore > I have put back original ALL_OBJFILE_PRIMARY_SYMTABS (instead of my > ALL_OBJFILE_SYMTABS) as it is perfectly sufficient. For the performance > testcase of mine: > > Benchmark on non-trivial application with 'p <tab><tab>': > Command execution time: 4.215000 (cpu), 4.241466 (wall) --- both fixes with new [patch 2/2] > Command execution time: 7.373000 (cpu), 7.395095 (wall) --- both fixes > Command execution time: 13.572000 (cpu), 13.592689 (wall) --- just lookup_symbol_aux_objfile fix > Command execution time: 113.036000 (cpu), 113.067995 (wall) --- FSF GDB HEAD > > That is additional 1.75x improvement, making the total improvement 26.8x. > > > No regressions on {x86_64,x86_64-m32,i686}-fedora21pre-linux-gnu in standard > and .gdb_index-enabled runs. Neither of the patches should cause any visible > behavior change. > > > Thanks, > Jan > > gdb/ > 2014-10-23 Jan Kratochvil <jan.kratochvil@redhat.com> > > * symtab.c (lookup_symbol_aux_objfile): Use ALL_OBJFILE_SYMTABS, inline > lookup_block_symbol. > > diff --git a/gdb/symtab.c b/gdb/symtab.c > index c530d50..da13861 100644 > --- a/gdb/symtab.c > +++ b/gdb/symtab.c > @@ -1657,15 +1657,25 @@ lookup_symbol_aux_objfile (struct objfile *objfile, int block_index, > const struct block *block; > struct symtab *s; > > + gdb_assert (block_index == GLOBAL_BLOCK || block_index == STATIC_BLOCK); > + > ALL_OBJFILE_PRIMARY_SYMTABS (objfile, s) > { > + struct dict_iterator dict_iter; > + > bv = BLOCKVECTOR (s); > block = BLOCKVECTOR_BLOCK (bv, block_index); > - sym = lookup_block_symbol (block, name, domain); > - if (sym) > + > + for (sym = dict_iter_name_first (block->dict, name, &dict_iter); > + sym != NULL; > + sym = dict_iter_name_next (name, &dict_iter)) > { > - block_found = block; > - return fixup_symbol_section (sym, objfile); > + if (symbol_matches_domain (SYMBOL_LANGUAGE (sym), > + SYMBOL_DOMAIN (sym), domain)) > + { > + block_found = block; > + return fixup_symbol_section (sym, objfile); > + } > } > } > > This breaks an abstraction boundary, IWBN to preserve it. [IOW, I look at dict_* as being an implementation detail of blocks.] If we were to go this route (and apologies for the delay), can you write a routine like lookup_block_symbol which does the above and call that here instead? lookup_block_symbol should live in block.c, not symtab.c. That's where this new routine should go too.
On Fri, 24 Oct 2014 09:16:01 +0200, Doug Evans wrote: > One thought I have is that significant changes at a higher level will > minimize the impact of this patch. One change I'm thinking of making > is replacing iterating over every symbol table and then if that fails > going to the index with instead just going straight to the index: the > index knows where the symbols are (you mentioned this as well). Yes, I tried that first. For the performance testcase I provided the issue is in lookup_symbol_global_iterator_cb(): data->result = lookup_symbol_aux_objfile (objfile, GLOBAL_BLOCK, data->name, data->domain); if (data->result == NULL) data->result = lookup_symbol_aux_quick (objfile, GLOBAL_BLOCK, data->name, data->domain); Changing their order does not fix the performance as lookup_symbol_aux_quick() (that is quick_symbol_functions::lookup_symbol) can return NULL * either if the symbol is really not present in the index. * or if the symbol's symtab is already expanded. For the latter case (commonly happening) quick_symbol_functions::lookup_symbol finds the right symtab but then it drops that information. Changing the quick_symbol_functions::lookup_symbol semantics may fix it. But then it will get fixed only for .gdb_index files while my two patches also improve performance of non-.gdb_index files. For each objfile .gdb_index presence is orthogonal to DWZ application. Sure a question remains if we should care about performance of non-.gdb_index files at all. Even for continuous edit-build-debug cycles it is worth to run gdb-add-index during each build. > If we were to go this route (and apologies for the delay), can you > write a routine like lookup_block_symbol which does the above and call > that here instead? OK. Jan
On Fri, Oct 24, 2014 at 12:33 AM, Jan Kratochvil <jan.kratochvil@redhat.com> wrote: > On Fri, 24 Oct 2014 09:16:01 +0200, Doug Evans wrote: >> One thought I have is that significant changes at a higher level will >> minimize the impact of this patch. One change I'm thinking of making >> is replacing iterating over every symbol table and then if that fails >> going to the index with instead just going straight to the index: the >> index knows where the symbols are (you mentioned this as well). > > Yes, I tried that first. > > For the performance testcase I provided the issue is in > lookup_symbol_global_iterator_cb(): > > data->result = lookup_symbol_aux_objfile (objfile, GLOBAL_BLOCK, > data->name, data->domain); > if (data->result == NULL) > data->result = lookup_symbol_aux_quick (objfile, GLOBAL_BLOCK, > data->name, data->domain); > > Changing their order does not fix the performance as lookup_symbol_aux_quick() > (that is quick_symbol_functions::lookup_symbol) can return NULL > * either if the symbol is really not present in the index. > * or if the symbol's symtab is already expanded. > > For the latter case (commonly happening) quick_symbol_functions::lookup_symbol > finds the right symtab but then it drops that information. > > Changing the quick_symbol_functions::lookup_symbol semantics may fix it. Yeah, the experiment I want to try is a bit more invasive such that, for example, we only go to the index (and expand any symtabs found if necessary and only search those ones). E.g., If the index returns NULL there's no point in searching the full set of symtabs. > But then it will get fixed only for .gdb_index files while my two patches also > improve performance of non-.gdb_index files. That's still an open issue, sure. > For each objfile .gdb_index presence is orthogonal to DWZ application. > > Sure a question remains if we should care about performance of non-.gdb_index > files at all. Even for continuous edit-build-debug cycles it is worth to run > gdb-add-index during each build. Or have the compiler/linker build the index.
diff --git a/gdb/symtab.c b/gdb/symtab.c index c530d50..da13861 100644 --- a/gdb/symtab.c +++ b/gdb/symtab.c @@ -1657,15 +1657,25 @@ lookup_symbol_aux_objfile (struct objfile *objfile, int block_index, const struct block *block; struct symtab *s; + gdb_assert (block_index == GLOBAL_BLOCK || block_index == STATIC_BLOCK); + ALL_OBJFILE_PRIMARY_SYMTABS (objfile, s) { + struct dict_iterator dict_iter; + bv = BLOCKVECTOR (s); block = BLOCKVECTOR_BLOCK (bv, block_index); - sym = lookup_block_symbol (block, name, domain); - if (sym) + + for (sym = dict_iter_name_first (block->dict, name, &dict_iter); + sym != NULL; + sym = dict_iter_name_next (name, &dict_iter)) { - block_found = block; - return fixup_symbol_section (sym, objfile); + if (symbol_matches_domain (SYMBOL_LANGUAGE (sym), + SYMBOL_DOMAIN (sym), domain)) + { + block_found = block; + return fixup_symbol_section (sym, objfile); + } } }