From patchwork Mon Jun 12 16:14:10 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pedro Alves
X-Patchwork-Id: 20961
From: Pedro Alves
To: gdb-patches@sourceware.org
Subject: [PATCH 5/6] .gdb_index prod perf regression: Estimate size of
 psyms_seen
Date: Mon, 12 Jun 2017 17:14:10 +0100
Message-Id: <1497284051-13795-5-git-send-email-palves@redhat.com>
In-Reply-To: <1497284051-13795-1-git-send-email-palves@redhat.com>
References: <1497284051-13795-1-git-send-email-palves@redhat.com>
In-Reply-To: <8efc0742-1014-4fe0-6948-f40a9c5c4975@redhat.com>
References: <8efc0742-1014-4fe0-6948-f40a9c5c4975@redhat.com>

Using the same test as the previous patch, perf shows GDB spending
over 7% of its time in "free".  A substantial number of those calls
come from insertions into the psyms_seen unordered_set, which cause
repeated rehashing and bucket reallocation.  Fix this by computing an
estimate of the size of the set up front and passing it to the set's
constructor.

Using the same test as in the previous patch, against the same gdb
inferior, timing improves ~8% further:

  ~6.5s => ~6.0s (average of 5 runs).

gdb/ChangeLog:
2017-06-12  Pedro Alves

        * dwarf2read.c (recursively_count_psymbols): New function.
        (write_psymtabs_to_index): Call it to compute number of psyms and
        pass estimated size of psyms_seen to unordered_set's ctor.
---
 gdb/ChangeLog    |  6 ++++++
 gdb/dwarf2read.c | 36 +++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 01b66a1..9dbc059 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,5 +1,11 @@
 2017-06-12  Pedro Alves
 
+        * dwarf2read.c (recursively_count_psymbols): New function.
+        (write_psymtabs_to_index): Call it to compute number of psyms and
+        pass estimated size of psyms_seen to unordered_set's ctor.
+
+2017-06-12  Pedro Alves
+
         * dwarf2read.c (write_hash_table): Check if key already exists
         before emplacing.
 
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index 93fd275..bff2fcb 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -23691,6 +23691,22 @@ write_one_signatured_type (void **slot, void *d)
   return 1;
 }
 
+/* Recurse into all "included" dependencies and count their symbols as
+   if they appeared in this psymtab.  */
+
+static void
+recursively_count_psymbols (struct partial_symtab *psymtab,
+                            size_t &psyms_seen)
+{
+  for (int i = 0; i < psymtab->number_of_dependencies; ++i)
+    if (psymtab->dependencies[i]->user != NULL)
+      recursively_count_psymbols (psymtab->dependencies[i],
+                                  psyms_seen);
+
+  psyms_seen += psymtab->n_global_syms;
+  psyms_seen += psymtab->n_static_syms;
+}
+
 /* Recurse into all "included" dependencies and write their symbols as
    if they appeared in this psymtab.  */
 
@@ -23764,7 +23780,6 @@ write_psymtabs_to_index (struct objfile *objfile, const char *dir)
 
   mapped_symtab symtab;
   data_buf cu_list;
-  std::unordered_set psyms_seen;
 
   /* While we're scanning CU's create a table that maps a psymtab pointer
      (which is what addrmap records) to its index (which is what is recorded
@@ -23776,6 +23791,25 @@ write_psymtabs_to_index (struct objfile *objfile, const char *dir)
   /* The CU list is already sorted, so we don't need to do additional
      work here.  Also, the debug_types entries do not appear in
      all_comp_units, but only in their own hash table.  */
+
+  /* The psyms_seen set is potentially going to be largish (~40k
+     elements when indexing a -g3 build of GDB itself).  Estimate the
+     number of elements in order to avoid too many rehashes, which
+     require rebuilding buckets and thus many trips to
+     malloc/free.  */
+  size_t psyms_count = 0;
+  for (int i = 0; i < dwarf2_per_objfile->n_comp_units; ++i)
+    {
+      struct dwarf2_per_cu_data *per_cu
+        = dwarf2_per_objfile->all_comp_units[i];
+      struct partial_symtab *psymtab = per_cu->v.psymtab;
+
+      if (psymtab != NULL && psymtab->user == NULL)
+        recursively_count_psymbols (psymtab, psyms_count);
+    }
+  /* Generating an index for gdb itself shows a ratio of
+     TOTAL_SEEN_SYMS/UNIQUE_SYMS of ~5.  4 seems like a good bet.  */
+  std::unordered_set psyms_seen (psyms_count / 4);
   for (int i = 0; i < dwarf2_per_objfile->n_comp_units; ++i)
     {
       struct dwarf2_per_cu_data *per_cu