Message ID: 20231212173239.16793-1-tdevries@suse.de
Headers:
From: Tom de Vries <tdevries@suse.de>
To: gdb-patches@sourceware.org
Subject: [PATCH v2 00/13] [gdb/symtab, cc-with-dwz] Fix gdb.cp/breakpoint-locs.exp
Date: Tue, 12 Dec 2023 18:32:26 +0100
List-Id: Gdb-patches mailing list <gdb-patches.sourceware.org>
Series: Fix gdb.cp/breakpoint-locs.exp
Message
Tom de Vries
Dec. 12, 2023, 5:32 p.m. UTC
I. INTRODUCTION

With target board unix, test-case gdb.cp/breakpoint-locs.exp passes reliably.

But when running with target board cc-with-dwz, we sometimes have 2 breakpoint
locations:
...
(gdb) break N1::C1::baz^M
Breakpoint 1 at 0x4004db: N1::C1::baz. (2 locations)^M
(gdb) PASS: gdb.cp/breakpoint-locs.exp: break N1::C1::baz
...
and sometimes only 1:
...
(gdb) break N1::C1::baz^M
Breakpoint 1 at 0x4004db: file breakpoint-locs.h, line 23.^M
(gdb) FAIL: gdb.cp/breakpoint-locs.exp: break N1::C1::baz
...

By using "maint set worker-threads 0" we can make the test-case fail reliably:
...
$ gdb -q -batch \
    -iex "maint set worker-threads 0" \
    outputs/gdb.cp/breakpoint-locs/breakpoint-locs \
    -ex "break N1::C1::baz"
...
And by adding -readnow, we can make it pass reliably.

This is PR symtab/30728, a gdb 12 -> 13 regression.  This series fixes it.

II. ANALYSIS

In order for the breakpoint to have two breakpoint locations, the symtabs for
both CUs breakpoint-locs.cc and breakpoint-locs-2.cc need to be expanded
(which explains why adding -readnow fixes it).  Let's call these CU1 and CU2.

CU1 is always expanded during the call:
...
cp_canonicalize_string_no_typedefs (string=string@entry=0x2fbbf90 "N1::C1::baz")
...
due to being the first CU to contain N1.

Then CU2 is expanded or not, depending on which per_cu (CU1 or CU2) this
cooked_index entry refers to (not shown in the dump atm):
...
[12] ((cooked_index_entry *) 0x7f1200002f00)
     name: baz
     canonical: baz
     qualified: N1::C1::baz
     DWARF tag: DW_TAG_subprogram
     flags: 0x0 []
     DIE offset: 0x2e
     parent: ((cooked_index_entry *) 0x7f1200002ed0) [C1]
...

The DIE offset 0x2e corresponds to this DIE in a PU, imported both from CU1
and CU2:
...
<0><b>: Abbrev Number: 34 (DW_TAG_partial_unit)
<1><14>: Abbrev Number: 37 (DW_TAG_namespace)
   <15> DW_AT_name        : N1
<2><19>: Abbrev Number: 36 (DW_TAG_class_type)
   <1a> DW_AT_name        : C1
<3><20>: Abbrev Number: 32 (DW_TAG_subprogram)
   <21> DW_AT_external    : 1
   <21> DW_AT_name        : baz
   <27> DW_AT_linkage_name: _ZN2N12C13bazEv
   <2b> DW_AT_accessibility: 1 (public)
   <2c> DW_AT_declaration : 1
<1><2e>: Abbrev Number: 35 (DW_TAG_subprogram)
   <2f> DW_AT_specification: <0x20>
   <30> DW_AT_inline      : 3 (declared as inline and inlined)
...

[ In the target board unix case, there's no PU; both CU1 and CU2 contain a
copy of 0x2e, and consequently both CUs are expanded. ]

Which per_cu the cooked_index entry refers to is decided by this import race
in cooked_indexer::ensure_cu_exists:
...
  /* When scanning, we only want to visit a given CU a single time.
     Doing this check here avoids self-imports as well.  */
  if (for_scanning)
    {
      bool nope = false;
      if (!per_cu->scanned.compare_exchange_strong (nope, true))
	return nullptr;
    }
...
[ Note that in the "maint set worker-threads 0" case, CU1 is guaranteed to win
the import race. ]

With gdb 12 (the partial symtab case) there was a similar problem, and we
relied on the DW_TAG_inlined_subroutine DIEs for the necessary expansion, see
commit f9b5d5ea18a ("[gdb/symtab] Fix missing breakpoint location for inlined
function").

The CU1 DW_TAG_inlined_subroutine DIE is at 0x148:
...
<0><ee>: Abbrev Number: 16 (DW_TAG_compile_unit)
   <f3> DW_AT_language    : 4 (C++)
   <f4> DW_AT_name        : gdb.cp/breakpoint-locs.cc
<1><13b>: Abbrev Number: 25 (DW_TAG_subprogram)
   <13c> DW_AT_specification: <0x115>
   <13d> DW_AT_low_pc     : 0x4004d7
   <145> DW_AT_high_pc    : 16
   <146> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa)
   <148> DW_AT_GNU_all_call_sites: 1
<2><148>: Abbrev Number: 24 (DW_TAG_inlined_subroutine)
   <149> DW_AT_abstract_origin: <0x2e>
   <14d> DW_AT_low_pc     : 0x4004db
   <155> DW_AT_high_pc    : 9
   <156> DW_AT_call_file  : 1
   <157> DW_AT_call_line  : 22
...
and for CU2 there's a similar one at 0x1cf.

There are no corresponding DW_TAG_inlined_subroutine cooked_index entries,
because those DIEs are skipped during indexing.  So, we can fix this by adding
cooked_index entries for DW_TAG_inlined_subroutine DIEs.

III. FIX

It's easy to stop skipping the DW_TAG_inlined_subroutine DIEs during indexing.
However, that gives us cooked_index entries with incorrect parents:
...
[17] ((cooked_index_entry *) 0x2e68e40)
     name: baz
     canonical: baz
     qualified: N1::foo::baz
     DWARF tag: DW_TAG_inlined_subroutine
     flags: 0x0 []
     DIE offset: 0x148
     parent: ((cooked_index_entry *) 0x2e68de0) [foo]
[19] ((cooked_index_entry *) 0x2e68f90)
     name: baz
     canonical: baz
     qualified: N1::bar::baz
     DWARF tag: DW_TAG_inlined_subroutine
     flags: 0x0 []
     DIE offset: 0x1cf
     parent: ((cooked_index_entry *) 0x2e68f30) [bar]
...
That is, the qualified name should be N1::C1::baz in both cases.  The C1 can
be found via the DW_AT_abstract_origin, but instead the DIE parents N1::foo
and N1::bar are used.

Again, we can fix this easily, by forcing use of the DW_AT_abstract_origin,
but then we have a missing parent for the DIE at 0x1cf:
...
[17] ((cooked_index_entry *) 0x4854140)
     name: baz
     canonical: baz
     qualified: N1::C1::baz
     DWARF tag: DW_TAG_inlined_subroutine
     flags: 0x0 []
     DIE offset: 0x148
     parent: ((cooked_index_entry *) 0x4853f60) [C1]
[19] ((cooked_index_entry *) 0x4854290)
     name: baz
     canonical: baz
     qualified: baz
     DWARF tag: DW_TAG_inlined_subroutine
     flags: 0x0 []
     DIE offset: 0x1cf
     parent: ((cooked_index_entry *) 0)
...

The data structures that matter for tracking parent relations in cooked_index
entries are:
- m_die_range_map (currently known parent relations), and
- m_deferred_entries (cooked_index entries deferred due to an unknown parent).
These only have the scope of parsing a single CU (including PUs for which the
CU won the import race).  Since CU1 wins the import race, it has the
information available to get the right parent.  But that's not the case for
CU2.
So in order to get the correct parent for DIE 0x1cf, tracking parent relations
in the cooked index needs to be done at inter-CU scope, implying also
inter-shard scope.

I've filed PR symtab/30846 ("[gdb/symtab] incorrect parent for forward spec
(inter-cu case)") for part of the problem, and the series includes two
test-cases that specifically check for this type of problem:
- gdb.dwarf2/backward-spec-inter-cu.exp
- gdb.dwarf2/forward-spec-inter-cu.exp

IV. ALTERNATIVE APPROACH

An alternative way of approaching the problem is to observe that in the target
board unix case we have two cooked_index entries, one with per_cu CU1 and one
with per_cu CU2:
...
[18] ((cooked_index_entry *) 0x3631250)
     name: baz
     canonical: baz
     qualified: N1::C1::baz
     DWARF tag: DW_TAG_subprogram
     flags: 0x0 []
     DIE offset: 0x173
     parent: ((cooked_index_entry *) 0x3631130) [C1]
[20] ((cooked_index_entry *) 0x3631490)
     name: baz
     canonical: baz
     qualified: N1::C1::baz
     DWARF tag: DW_TAG_subprogram
     flags: 0x0 []
     DIE offset: 0x25a
     parent: ((cooked_index_entry *) 0x3631370) [C1]
...
but in the dwz case we have only a single entry:
...
[12] ((cooked_index_entry *) 0x7f1200002f00)
     name: baz
     canonical: baz
     qualified: N1::C1::baz
     DWARF tag: DW_TAG_subprogram
     flags: 0x0 []
     DIE offset: 0x2e
     parent: ((cooked_index_entry *) 0x7f1200002ed0) [C1]
...
with per_cu either CU1 or CU2.

If in the dwz case we had two of those, one with per_cu CU1 and one with
per_cu CU2, then the problem would also be solved.

Put differently: in the target board unix case we have two cooked index
entries, one with per_cu CU1 and one with per_cu CU2, because that's a
one-to-one reflection of what the dwarf looks like.  But in the dwz case,
factoring out the DIE into a PU and importing it from both CUs does not alter
the logical contents of the dwarf, so it would make complete sense to have two
entries in this case as well.
Of course, we'd want to exploit the dwz compression inside gdb as well, so
some scheme with a single cooked_index entry whose per_cu is the actual PU,
mapped to the complete set of importing CUs, would be more optimal.  [ Note
that in such a scheme, we'd have to be able to differentiate between
expand-one-CU and expand-all-CUs, to prevent expanding all importing CUs in
the expand-one-CU case. ]

But the current internal representation in fact corresponds to dwz dropping
the import for all but one CU (well, for the cooked index, not for the
expanded symtabs).  I think the applied rationale is that if you're looking
for a type in the symtabs, the first one will do, so it doesn't make sense to
keep track of more than one.  This however does not work if you're looking for
all the occurrences of a type in the symtabs, as is the case here.

I suppose it could be a matter of taste whether the current internal
representation is:
- a smart solution, or
- an incorrect representation of the dwarf.
The fact is that:
- if we choose the former, then correct dwz handling requires extra care (as
  set out in FIX), and
- if we choose the latter and fix it, then the dwz and non-dwz cases are by
  design handled in the same way.

I'm not pursuing this alternative approach in this patch series for now,
because:
- this solution was not chosen for gdb 12 either (though that patch went in
  without comments),
- I'm hoping the current approach is a bit easier and safer to backport to
  gdb 13, and
- I'm not sure other maintainers are supportive of this approach.

V. PATCH SERIES STRUCTURE

The series consists of several parts.
- patches that do some refactoring:
  - [gdb/symtab] Refactor condition in scan_attributes
  - [gdb/symtab] Factor out m_die_range_map usage
  - [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  - [gdb/symtab] Factor out m_deferred_entries usage
- patches that fix resolving deferred entries and add corresponding
  test-cases:
  - [gdb/symtab] Resolve deferred entries, inter-shard case
  - [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp
  - [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp
- patches that add two optimizations to the parent calculation (one
  preparation patch, and two optimization patches):
  - [gdb/symtab] Keep track of processed DIEs in shard
  - [gdb/symtab] Resolve deferred entries, intra-shard case
  - [gdb/symtab] Don't defer backward refs, inter-cu intra-shard case
- patches that fix PR symtab/30728 (logically one patch, but split up to be
  able to assess performance impact more precisely):
  - [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
  - [gdb/symtab] Keep track of all parents for cooked index
  - [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index

The fix for PR symtab/30728 consists of adding entries to the cooked index
for DW_TAG_inlined_subroutine DIEs, with correct parent entries.

Tested on x86_64-linux, with target boards unix, cc-with-dwz and
cc-with-dwz-m.

I've split out the two added test-cases into separate patches, for better
patch readability.  Unfortunately the patch "[gdb/symtab] Resolve deferred
entries, inter-shard case" is still somewhat large, due to moving code about.

VI. PERFORMANCE

I've also done a performance experiment.  The command being run is:
...
$ gdb -q -batch /usr/lib/debug/usr/bin/gdb-12.1-150400.15.9.1.x86_64.debug
...
The .debug file doesn't contain any partial units, so this exercises the
non-dwz scenario.

The command is run 10 times, and the mean value is used.  I also added a
5 second sleep before each run, to prevent thermal throttling from messing up
the measurements.  Furthermore, the processor on which I run the experiment is
an i7-1250U, with 4 performance cores and 8 efficiency cores, so I run the
experiment using "taskset -c 0-3" to make sure only performance cores are
used.

The script I use compares one gdb version to another, so the first column is
100% by definition, using the same base version, though a different run each
time.

This is comparing the base version with itself, which gives an idea about
accuracy:
...
real: mean: 687.00 (100%)     mean: 691.00 (100.58%)
user: mean: 2196.00 (100%)    mean: 2196.00 (100.00%)
sys : mean: 108.00 (100%)     mean: 106.00 (98.15%)
mem : mean: 345220.40 (100%)  mean: 346112.40 (100.26%)
...
The mem measure is the "Maximum resident set size of the process during its
lifetime, in Kbytes".  The real/user/sys times are in milliseconds.

And this is comparing the patch series with the base version:
...
real: mean: 687.00 (100%)     mean: 931.00 (135.52%)
user: mean: 2204.00 (100%)    mean: 2938.00 (133.30%)
sys : mean: 102.00 (100%)     mean: 137.00 (134.31%)
mem : mean: 345479.60 (100%)  mean: 418396.00 (121.11%)
...
In summary, the overall result is ~36% more real time and ~21% more memory.

The cumulative results of individual patches (leaving out the test-case
patches) are:
...
[gdb/symtab] Refactor condition in scan_attributes
real: mean: 690.00 (100%)     mean: 691.00 (100.14%)
user: mean: 2201.00 (100%)    mean: 2218.00 (100.77%)
sys : mean: 102.00 (100%)     mean: 93.00 (91.18%)
mem : mean: 344846.80 (100%)  mean: 344196.40 (99.81%)

[gdb/symtab] Factor out m_die_range_map usage
real: mean: 689.00 (100%)     mean: 696.00 (101.02%)
user: mean: 2214.00 (100%)    mean: 2212.00 (99.91%)
sys : mean: 98.00 (100%)      mean: 91.00 (92.86%)
mem : mean: 345784.00 (100%)  mean: 346644.00 (100.25%)

[gdb/symtab] Handle nullptr parent in parent_map::set_parent
real: mean: 688.00 (100%)     mean: 690.00 (100.29%)
user: mean: 2215.00 (100%)    mean: 2223.00 (100.36%)
sys : mean: 99.00 (100%)      mean: 94.00 (94.95%)
mem : mean: 344220.40 (100%)  mean: 345640.80 (100.41%)

[gdb/symtab] Factor out m_deferred_entries usage
real: mean: 693.00 (100%)     mean: 689.00 (99.42%)
user: mean: 2191.00 (100%)    mean: 2201.00 (100.46%)
sys : mean: 103.00 (100%)     mean: 101.00 (98.06%)
mem : mean: 346022.00 (100%)  mean: 345167.60 (99.75%)

[gdb/symtab] Resolve deferred entries, inter-shard case
real: mean: 688.00 (100%)     mean: 711.00 (103.34%)
user: mean: 2204.00 (100%)    mean: 2265.00 (102.77%)
sys : mean: 103.00 (100%)     mean: 119.00 (115.53%)
mem : mean: 344649.60 (100%)  mean: 375080.00 (108.83%)

[gdb/symtab] Keep track of processed DIEs in shard
real: mean: 684.00 (100%)     mean: 728.00 (106.43%)
user: mean: 2208.00 (100%)    mean: 2301.00 (104.21%)
sys : mean: 96.00 (100%)      mean: 124.00 (129.17%)
mem : mean: 344005.60 (100%)  mean: 374647.20 (108.91%)

[gdb/symtab] Resolve deferred entries, intra-shard case
real: mean: 693.00 (100%)     mean: 724.00 (104.47%)
user: mean: 2213.00 (100%)    mean: 2301.00 (103.98%)
sys : mean: 103.00 (100%)     mean: 114.00 (110.68%)
mem : mean: 344722.80 (100%)  mean: 375021.20 (108.79%)

[gdb/symtab] Don't defer backward refs, inter-cu intra-shard case
real: mean: 686.00 (100%)     mean: 725.00 (105.69%)
user: mean: 2199.00 (100%)    mean: 2299.00 (104.55%)
sys : mean: 106.00 (100%)     mean: 119.00 (112.26%)
mem : mean: 344597.20 (100%)  mean: 374937.20 (108.80%)

[gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
real: mean: 690.00 (100%)     mean: 835.00 (121.01%)
user: mean: 2186.00 (100%)    mean: 2693.00 (123.19%)
sys : mean: 106.00 (100%)     mean: 130.00 (122.64%)
mem : mean: 345059.60 (100%)  mean: 409780.00 (118.76%)

[gdb/symtab] Keep track of all parents for cooked index
real: mean: 684.00 (100%)     mean: 903.00 (132.02%)
user: mean: 2240.00 (100%)    mean: 2921.00 (130.40%)
sys : mean: 93.00 (100%)      mean: 141.00 (151.61%)
mem : mean: 346353.20 (100%)  mean: 411096.00 (118.69%)

[gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index
real: mean: 690.00 (100%)     mean: 930.00 (134.78%)
user: mean: 2227.00 (100%)    mean: 2960.00 (132.91%)
sys : mean: 96.00 (100%)      mean: 138.00 (143.75%)
mem : mean: 343766.00 (100%)  mean: 417973.60 (121.59%)
...

Ideally, the impact of a patch series implementing dwz support on a non-dwz
test-case is none.  But fixing dwz support requires tracking more data, and
there's no way of knowing in advance whether the additional data will be used
or not.

Of course this can be accommodated by optimistically assuming that the data is
unnecessary, and when it turns out it was actually needed, partially or
completely restarting indexing.  My suspicion is that this approach itself is
going to be complex, so I think it's best avoided.

An optimization that can be added on top of the current approach is to
opportunistically use data from other shards if already available.  An
extension of this would be to wait until data from another shard is available,
which would avoid the need to handle inter-shard dependencies after the
parallel_for_each, but I think there's a risk of deadlock there.

The imports as generated by dwz follow a fairly predictable pattern, but there
are other producers of imports, for instance GCC LTO.

VII. TAGS

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30728
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30846

VIII. SUBMISSION HISTORY

v2:
- added refactoring patch at the start
- dropped:
  - [gdb/symtab] Check effect in parent_map::set_parent (review comments,
    concern about speed)
  - [gdb/symtab] Add debug_handle_deferred_entries (doesn't produce readable
    output for worker-threads > 0)
  - [gdb/symtab] Add parent_map::dump (only used by dropped patch)
- fixed performance regression in "[gdb/symtab] Keep track of processed DIEs
  in shard"
- split up last patch of v1 into 3 patches, to be able to assess performance
  impact more precisely
- use std::unique_ptr for m_die_range_map, m_deferred_entries and
  m_die_range_map_valid
- rewrote the last patch to work for any DIE, not just
  DW_TAG_inlined_subroutine DIEs
- dropped the why narrative from the individual log messages, which proved
  difficult to maintain, focusing on the what instead and leaving the why for
  the cover letter

v1:
- https://sourceware.org/pipermail/gdb-patches/2023-October/202882.html

rfc:
- https://sourceware.org/pipermail/gdb-patches/2023-September/202443.html
Tom de Vries (13):
  [gdb/symtab] Refactor condition in scan_attributes
  [gdb/symtab] Factor out m_die_range_map usage
  [gdb/symtab] Handle nullptr parent in parent_map::set_parent
  [gdb/symtab] Factor out m_deferred_entries usage
  [gdb/symtab] Resolve deferred entries, inter-shard case
  [gdb/testsuite] Add gdb.dwarf2/forward-spec-inter-cu.exp
  [gdb/testsuite] Add gdb.dwarf2/backward-spec-inter-cu.exp
  [gdb/symtab] Keep track of processed DIEs in shard
  [gdb/symtab] Resolve deferred entries, intra-shard case
  [gdb/symtab] Don't defer backward refs, inter-cu intra-shard case
  [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
  [gdb/symtab] Keep track of all parents for cooked index
  [gdb/symtab] Fix DW_TAG_inlined_subroutine entries in the cooked index

 gdb/dwarf2/cooked-index.c                   | 113 +++++++++
 gdb/dwarf2/cooked-index.h                   | 157 ++++++++++++-
 gdb/dwarf2/read.c                           | 222 +++++++++++++-----
 .../gdb.dwarf2/backward-spec-inter-cu.exp   | 103 ++++++++
 .../gdb.dwarf2/forward-spec-inter-cu.exp    | 103 ++++++++
 5 files changed, 642 insertions(+), 56 deletions(-)
 create mode 100644 gdb/testsuite/gdb.dwarf2/backward-spec-inter-cu.exp
 create mode 100644 gdb/testsuite/gdb.dwarf2/forward-spec-inter-cu.exp

base-commit: 14f2724f80b156928b1a0b0f9733350558e35e63
Comments
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:
[...]
Thank you for this series and the write-up.
You've done a great job on this.
Tom> IV. ALTERNATIVE APPROACH
Tom> I suppose it could be a matter of taste whether the current internal
Tom> representation is:
Tom> - a smart solution, or
Tom> - incorrect representation of the dwarf.
I'm not sure I totally understood this section, but the basic idea
behind this code -- and I think these decisions predate the cooked-index
rewrite -- was to try to exploit dwz compression by also using less
memory inside gdb.
Technically, I think, in DWARF an import can appear at any level and gdb
should probably just be re-reading a bunch of DIEs over and over.
However, this seemed super expensive back when I wrote the dwz support,
and since dwz is the only known compressor and it doesn't do some of the
weird stuff, it looked better to try to implement this optimization.
Shrug, maybe that was a mistake. If every import caused a re-scan of
the PU, it would mean that only dwz users would pay for this feature.
Just maybe the price would be very high.
Tom> - I'm not sure if other maintainers are supportive of this approach.
I probably just don't understand.
Tom> And this is comparing the patch series with the base version:
Tom> ...
Tom> real: mean: 687.00 (100%) mean: 931.00 (135.52%)
Tom> user: mean: 2204.00 (100%) mean: 2938.00 (133.30%)
Tom> sys : mean: 102.00 (100%) mean: 137.00 (134.31%)
Tom> mem : mean: 345479.60 (100%) mean: 418396.00 (121.11%)
Tom> ...
Tom> In summary, the overall result is ~36% more real time and ~21% more memory.
Pretty heavy. However I was curious about something:
Tom> The cumulative results of individual patches (leaving out the test-case
Tom> patches) are:
Tom> [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index
Tom> real: mean: 690.00 (100%) mean: 835.00 (121.01%)
Tom> user: mean: 2186.00 (100%) mean: 2693.00 (123.19%)
Tom> sys : mean: 106.00 (100%) mean: 130.00 (122.64%)
Tom> mem : mean: 345059.60 (100%) mean: 409780.00 (118.76%)
... the big performance jump appears here -- but is it really needed?
IIUC this is done because we want to detect inlined functions in C++.
However, in theory those should also be emitted at the top level with
DW_AT_inline, with a value of either DW_INL_inlined or
DW_INL_declared_inlined.
Would having entries for these be enough to provoke the desired CU
expansion? If so it may be just a matter of marking some more abbrevs
as "interesting".
Actually, for the simple test case I tried, I already see it in the
index. So I wonder what's going on there.
Anyway if we could drop this patch then it seems like the overall cost
would be alright.
Tom> Ideally, the impact of a patch series implementing dwz support on a non-dwz
Tom> test-case is none.
Tom> But fixing dwz support requires tracking more data, and there's no way of
Tom> knowing in advance whether the additional data will be used or not.
Yeah. I loathe this part of DWARF but I have come around to accept that
we just have to handle it.
Tom> Of course this can be accommodated by optimistically assuming that the data is
Tom> unnecessary, and when it turns out it was actually needed, partially or
Tom> completely restart indexing. My suspicion is that this approach itself is
Tom> going to be complex, so I think it's best avoided.
I agree.
Tom
On 12/12/23 20:05, Tom Tromey wrote: >>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes: > > [...] > > Thank you for this series and the write-up. > You've done a great job on this. > > Tom> IV. ALTERNATIVE APPROACH > > Tom> I suppose it could be a matter of taste whether the current internal > Tom> representation is: > Tom> - a smart solution, or > Tom> - incorrect representation of the dwarf. > > I'm not sure I totally understood this section, but the basic idea > behind this code -- and I think these decisions predate the cooked-index > rewrite -- was to try to exploit dwz compression by also using less > memory inside gdb. > Makes sense to me. > Technically, I think, in DWARF an import can appear at any level That's also my understanding. > and gdb > should probably just be re-reading a bunch of DIEs over and over. Yes, though we could make a distinction between top-level imports (as produced by dwz and gcc lto) and otherwise, and handle the top-level imports efficiently. > However, this seemed super expensive back when I wrote the dwz support, > and since dwz is the only known compressor and it doesn't do some of the > weird stuff, it looked better to try to implement this optimization. > > Shrug, maybe that was a mistake. If every import caused a re-scan of > the PU, it would mean that only dwz users would pay for this feature. > Just maybe the price would be very high. > The solution I tried to propose here is a middle ground: - it fixes the PR we're trying to fix in this series, and - it exploits the dwz compression inside gdb, to avoid the high price you mention. The basic idea is: - let the first import trigger a read of a PU, - in the cooked_index entries generated for the PU, set the per_cu to the PU (instead of to the importing CU as we do now), - keep track of importers, allowing to list all importers CUs of a given PU, - when doing expansion for a cooked_index entry, if the per_cu is a PU, expand all importer CUs. 
There are two notes here: - there is the risk that this gets us too many expanded symtabs, so when using the cooked index we should know whether we're interested in expanding all matching symtabs or just one. - imports can happen from CUs with different languages, so we'll have to handle that somehow. I'm not sure btw whether we're getting that correct either atm. > Tom> - I'm not sure if other maintainers are supportive of this approach. > > I probably just don't understand. > Ok, just let me know if there's anything I can do to clarify things. > Tom> And this is comparing the patch series with the base version: > Tom> ... > Tom> real: mean: 687.00 (100%) mean: 931.00 (135.52%) > Tom> user: mean: 2204.00 (100%) mean: 2938.00 (133.30%) > Tom> sys : mean: 102.00 (100%) mean: 137.00 (134.31%) > Tom> mem : mean: 345479.60 (100%) mean: 418396.00 (121.11%) > Tom> ... > > Tom> In summary, the overall result is ~36% more real time and ~21% more memory. > > Pretty heavy. However I was curious about something: > > Tom> The cumulative results of individual patches (leaving out the test-case > Tom> patches) are: > > Tom> [gdb/symtab] Recurse into c++ DW_TAG_subprogram DIEs for cooked index > > Tom> real: mean: 690.00 (100%) mean: 835.00 (121.01%) > Tom> user: mean: 2186.00 (100%) mean: 2693.00 (123.19%) > Tom> sys : mean: 106.00 (100%) mean: 130.00 (122.64%) > Tom> mem : mean: 345059.60 (100%) mean: 409780.00 (118.76%) > > ... the big performance jump appears here -- but is it really needed? > > IIUC this is done because we want to detect inlined functions in C++. > However, in theory those should also be emitted at the top level with > DW_AT_inline, with a value of either DW_INL_inlined or > DW_INL_declared_inlined. > In the gdb.cp/breakpoint-locs.exp cc-with-dwz case we have: ... 
 <1><14>: Abbrev Number: 34 (DW_TAG_namespace)
    <15>   DW_AT_name        : N1
    <18>   DW_AT_sibling     : <0x2e>
 <2><19>: Abbrev Number: 32 (DW_TAG_class_type)
    <1a>   DW_AT_name        : C1
    <1d>   DW_AT_byte_size   : 1
    <1e>   DW_AT_decl_file   : 2
    <1f>   DW_AT_decl_line   : 20
 <3><20>: Abbrev Number: 35 (DW_TAG_subprogram)
    <21>   DW_AT_external    : 1
    <21>   DW_AT_name        : baz
    <25>   DW_AT_decl_file   : 2
    <26>   DW_AT_decl_line   : 23
    <27>   DW_AT_linkage_name: (indirect string, offset: 0x1d3): _ZN2N12C13bazEv
    <2b>   DW_AT_accessibility: 1       (public)
    <2c>   DW_AT_declaration : 1
 <3><2c>: Abbrev Number: 0
 <2><2d>: Abbrev Number: 0
 <1><2e>: Abbrev Number: 37 (DW_TAG_subprogram)
    <2f>   DW_AT_specification: <0x20>
    <30>   DW_AT_inline      : 3       (declared as inline and inlined)
    <31>   DW_AT_sibling     : <0x39>
...
and I think what you're describing matches most closely DIE 0x2e
(though I haven't found DW_INL_inlined or DW_INL_declared_inlined in
the dwarf produced for the test-case, also not with gcc 12 and
-gdwarf-5).

> Would having entries for these be enough to provoke the desired CU
> expansion?  If so it may be just a matter of marking some more abbrevs
> as "interesting".
>

The problem is that DIE 0x2e has been moved to a PU, and as a
consequence, in the current implementation, it will cause expansion of
only a single CU.

> Actually, for the simple test case I tried, I already see it in the
> index.  So I wonder what's going on there.
>

I wonder if you can reproduce the FAIL I'm seeing?

[ Btw, it's not 100% reproducible, so I usually run it in a
$(seq 1 100) loop. ]

> Anyway if we could drop this patch then it seems like the overall cost
> would be alright.
>
> Tom> Ideally, the impact of a patch series implementing dwz support on a non-dwz
> Tom> test-case is none.
>
> Tom> But fixing dwz support requires tracking more data, and there's no way of
> Tom> knowing in advance whether the additional data will be used or not.
>
> Yeah.  I loathe this part of DWARF but I have come around to accept that
> we just have to handle it.
>
> Tom> Of course this can be accommodated by optimistically assuming that the data is
> Tom> unnecessary, and when it turns out it was actually needed, partially or
> Tom> completely restart indexing.  My suspicion is that this approach itself is
> Tom> going to be complex, so I think it's best avoided.
>
> I agree.

Thanks for the review.

- Tom
>>>>> "Tom" == Tom de Vries <tdevries@suse.de> writes:

>> However, in theory those should also be emitted at the top level with
>> DW_AT_inline, with a value of either DW_INL_inlined or
>> DW_INL_declared_inlined.

Tom> In the gdb.cp/breakpoint-locs.exp cc-with-dwz case we have:

Tom>     <30>   DW_AT_inline      : 3       (declared as inline and inlined)

Tom> and I think what you're describing matches most closely DIE 0x2e
Tom> (though I haven't found DW_INL_inlined or DW_INL_declared_inlined in
Tom> the dwarf produced for the test-case, also not with gcc 12 and
Tom> -gdwarf-5).

readelf / objdump don't print the symbolic name here, but that's
actually what you're seeing.  From dwarf2.h:

/* Inline attribute.  */
enum dwarf_inline_attribute
  {
    DW_INL_not_inlined = 0,
    DW_INL_inlined = 1,
    DW_INL_declared_not_inlined = 2,
    DW_INL_declared_inlined = 3
  };

>> Would having entries for these be enough to provoke the desired CU
>> expansion?  If so it may be just a matter of marking some more abbrevs
>> as "interesting".

Tom> The problem is that the DIE 0x2e has been moved to a PU, and as
Tom> consequence in the current implementation will cause expansion of only
Tom> a single CU.

Ok, I see.  Won't this fail with .gdb_index as well then?  Because IIRC
that attributes symbols from a PU to the canonical includer.

Anyway it seems better to me to record inclusions more precisely and
then expand more often (depending on the lookup) than to spend a lot of
time recursing into C++ subroutines.

Tom