Message ID | 20181130172148.18830-1-sergiodj@redhat.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 88179 invoked by alias); 30 Nov 2018 17:22:05 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 88162 invoked by uid 89); 30 Nov 2018 17:22:04 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=mid, helped, sake X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 30 Nov 2018 17:22:01 +0000 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 76EFC3082B07; Fri, 30 Nov 2018 17:22:00 +0000 (UTC) Received: from psique.yyz.redhat.com (unused-10-15-17-196.yyz.redhat.com [10.15.17.196]) by smtp.corp.redhat.com (Postfix) with ESMTP id 08329194A9; Fri, 30 Nov 2018 17:21:54 +0000 (UTC) From: Sergio Durigan Junior <sergiodj@redhat.com> To: GDB Patches <gdb-patches@sourceware.org> Cc: Keith Seitz <keiths@redhat.com>, Tom Tromey <tom@tromey.com>, Jan Kratochvil <jan.kratochvil@redhat.com>, Sergio Durigan Junior <sergiodj@redhat.com> Subject: [RFC/PATCH] Fix dwarf2read.c:dwarf2_find_containing_comp_unit's binary search Date: Fri, 30 Nov 2018 12:21:48 -0500 Message-Id: <20181130172148.18830-1-sergiodj@redhat.com> X-IsSubscribed: yes |
Commit Message
Sergio Durigan Junior
Nov. 30, 2018, 5:21 p.m. UTC
First of all, I would like to express my gratitude to Keith Seitz, Jan Kratochvil and Tom Tromey, who were really kind and helped a lot with this bug. The patch itself was authored by Jan. This all began with: https://bugzilla.redhat.com/show_bug.cgi?id=1639242 py-bt is broken, results in exception In summary, the error reported by the bug above is: $ gdb -args python3 GNU gdb (GDB) Fedora 8.1.1-3.fc28 (...) Reading symbols from python3...Reading symbols from /usr/lib/debug/usr/bin/python3.6-3.6.6-1.fc28.x86_64.debug...done. done. Dwarf Error: could not find partial DIE containing offset 0x316 [in module /usr/lib/debug/usr/bin/python3.6-3.6.6-1.fc28.x86_64.debug] After a long investigation, and after thinking that the problem might actually be on DWZ's side, we were able to determine that there's something wrong going on when dwarf2read.c:dwarf2_find_containing_comp_unit performs a binary search over all of the CUs belonging to an objfile in order to find the CU which contains a DIE at an specific offset. The current algorithm is: static struct dwarf2_per_cu_data * dwarf2_find_containing_comp_unit (sect_offset sect_off, unsigned int offset_in_dwz, struct dwarf2_per_objfile *dwarf2_per_objfile) { struct dwarf2_per_cu_data *this_cu; int low, high; const sect_offset *cu_off; low = 0; high = dwarf2_per_objfile->all_comp_units.size () - 1; while (high > low) { struct dwarf2_per_cu_data *mid_cu; int mid = low + (high - low) / 2; mid_cu = dwarf2_per_objfile->all_comp_units[mid]; cu_off = &mid_cu->sect_off; if (mid_cu->is_dwz > offset_in_dwz || (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off)) high = mid; else low = mid + 1; } For the sake of this example, let's consider that "sect_off = 0x7d". There are a few important things going on here. First, "dwarf2_per_objfile->all_comp_units ()" will be sorted first by whether the CU is a DWZ CU, and then by cu->sect_off. In this specific bug, "offset_in_dwz" is false, which means that, for the most part of the loop, we're going to do "high = mid" (i.e, we'll work with the lower part of the vector). In our particular case, when we reach the part where "mid_cu->is_dwz == offset_in_dwz" (i.e, both are false), we end up with "high = 2" and "mid = 1". I.e., there are only 2 elements in the vector who are not DWZ. The vector looks like this: #0: cu->sect_off = 0; length = 114; is_dwz = false <-- low #1: cu->sect_off = 114; length = 7796; is_dwz = false <-- mid #2: cu->sect_off = 0; length = 28; is_dwz = true <-- high ... The CU we want is #1, which is exactly where "mid" is. Also, #1 is not DWZ, which is also exactly what we want. So we perform the second comparison: (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off) ^^^^^^^^^^^^^^^^^^^ Because "*cu_off = 114" and "sect_off = 0x7d", this evaluates to false, so we end up with "low = mid + 1 = 2", which actually gives us the wrong CU (i.e., a CU that is DWZ). Next in the code, GDB does: gdb_assert (low == high); this_cu = dwarf2_per_objfile->all_comp_units[low]; cu_off = &this_cu->sect_off; if (this_cu->is_dwz != offset_in_dwz || *cu_off > sect_off) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ { if (low == 0 || this_cu->is_dwz != offset_in_dwz) error (_("Dwarf Error: could not find partial DIE containing " "offset %s [in module %s]"), sect_offset_str (sect_off), bfd_get_filename (dwarf2_per_objfile->objfile->obfd)); ... Triggering the error we saw in the original bug report. It's important to notice that we see the error message because the selected CU is a DWZ one, but we're looking for a non-DWZ CU here. However, even when the selected CU is *not* a DWZ (and we don't see any error message), we still end up with the wrong CU. For example, suppose that the vector had: #0: cu->sect_off = 0; length = 114; is_dwz = false #1: cu->sect_off = 114; length = 7796; is_dwz = false #2: cu->sect_off = 7910; length = 28; is_dwz = false ... I.e., #2's "is_dwz" is false instead of true. In this case, we still want #1, because that's where the DIE is located. After the loop ends up in #2, we have "is_dwz" as false, which is what we wanted, so we compare offsets. In this case, "7910 >= 0x7d", so we set "mid = high = 2". Next iteration, we have "mid = 0 + (2 - 0) / 2 = 1", and thus we examining #1. "is_dwz" is still false, but "114 >= 0x7d" also evaluates to false, so "low = mid + 1 = 2", which makes the loop stop. Therefore, we end up choosing #2 as our CU, even though #1 is the right one. The problem here is happening because we're comparing "sect_off" directly against "*cu_off", while we should actually be comparing against "*cu_off + mid_cu->length" (i.e., the end offset): ... || (mid_cu->is_dwz == offset_in_dwz && *cu_off + mid_cu->length >= sect_off)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ... And this is what the patch does. The idea is that if GDB is searching for an offset that falls above the *end* of the CU being analyzed (i.e., "mid"), then the next iteration should try a higher-offset CU next. The previous algorithm was using the *beginning* of the CU. Unfortunately, I could not devise a testcase for this problem, so I am proposing a fix with this huge explanation attached to it in the hope that it is sufficient. After talking a bit to Keith (our testcase guru), it seems that one would have to create an objfile with both DWZ and non-DWZ sections, which may prove very hard to do, I think. I ran this patch on our BuildBot, and no regressions were detected. gdb/ChangeLog: 2018-11-30 Jan Kratochvil <jan.kratochvil@redhat.com> Keith Seitz <keiths@redhat.com> Tom Tromey <tom@tromey.com> Sergio Durigan Junior <sergiodj@redhat.com> https://bugzilla.redhat.com/show_bug.cgi?id=1613614 * dwarf2read.c (dwarf2_find_containing_comp_unit): Add 'mid_cu->length' to '*cu_off' when checking if 'sect_off' is inside the CU. --- gdb/dwarf2read.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
Comments
>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes:
Sergio> https://bugzilla.redhat.com/show_bug.cgi?id=1613614
Sergio> * dwarf2read.c (dwarf2_find_containing_comp_unit): Add
Sergio> 'mid_cu->length' to '*cu_off' when checking if 'sect_off' is
Sergio> inside the CU.
Thanks for the extensive analysis.
I agree this is correct; please check this in.
It's weird that cu_off is a pointer in this code.
It seems like that local could be removed entirely.
But, no need to do that in this patch.
Tom
On Friday, November 30 2018, Tom Tromey wrote: >>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes: > > Sergio> https://bugzilla.redhat.com/show_bug.cgi?id=1613614 > Sergio> * dwarf2read.c (dwarf2_find_containing_comp_unit): Add > Sergio> 'mid_cu->length' to '*cu_off' when checking if 'sect_off' is > Sergio> inside the CU. > > Thanks for the extensive analysis. > I agree this is correct; please check this in. Thanks for the quick review. This has been pushed: 81fbbaf96216ed88973a069e4ed25379d7421ec8 > It's weird that cu_off is a pointer in this code. > It seems like that local could be removed entirely. > But, no need to do that in this patch. Yeah, as I told you on IRC, I also noticed this. I'll see about sending a subsequent patch later. Thanks,
On 11/30/18 10:05 AM, Sergio Durigan Junior wrote: > On Friday, November 30 2018, Tom Tromey wrote: > >>>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes: >> >> Sergio> https://bugzilla.redhat.com/show_bug.cgi?id=1613614 >> Sergio> * dwarf2read.c (dwarf2_find_containing_comp_unit): Add >> Sergio> 'mid_cu->length' to '*cu_off' when checking if 'sect_off' is >> Sergio> inside the CU. >> >> Thanks for the extensive analysis. >> I agree this is correct; please check this in. > > Thanks for the quick review. This has been pushed: > > 81fbbaf96216ed88973a069e4ed25379d7421ec8 > Is this needed for 8.2 branch? Keith
On Tuesday, December 04 2018, Keith Seitz wrote: > On 11/30/18 10:05 AM, Sergio Durigan Junior wrote: >> On Friday, November 30 2018, Tom Tromey wrote: >> >>>>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes: >>> >>> Sergio> https://bugzilla.redhat.com/show_bug.cgi?id=1613614 >>> Sergio> * dwarf2read.c (dwarf2_find_containing_comp_unit): Add >>> Sergio> 'mid_cu->length' to '*cu_off' when checking if 'sect_off' is >>> Sergio> inside the CU. >>> >>> Thanks for the extensive analysis. >>> I agree this is correct; please check this in. >> >> Thanks for the quick review. This has been pushed: >> >> 81fbbaf96216ed88973a069e4ed25379d7421ec8 >> > > Is this needed for 8.2 branch? Yep, I'd say so. This seems like a simple patch to be backported to 8.2, but I'll wait for a global maintainer to give the green light so that I can create a proper bug. Thanks,
>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes: >> Is this needed for 8.2 branch? Sergio> Yep, I'd say so. This seems like a simple patch to be backported to Sergio> 8.2, but I'll wait for a global maintainer to give the green light so Sergio> that I can create a proper bug. Did anyone ever reply to this? It seems like a good fix to backport to me. Tom
On Sunday, December 16 2018, Tom Tromey wrote: >>>>>> "Sergio" == Sergio Durigan Junior <sergiodj@redhat.com> writes: > >>> Is this needed for 8.2 branch? > > Sergio> Yep, I'd say so. This seems like a simple patch to be backported to > Sergio> 8.2, but I'll wait for a global maintainer to give the green light so > Sergio> that I can create a proper bug. > > Did anyone ever reply to this? Nope :-). > It seems like a good fix to backport to me. Cool. I opened <https://sourceware.org/bugzilla/show_bug.cgi?id=24003>, and I'll backport the patch to 8.2.1 now. Thanks,
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c index d2a8cd44f9..8bfbd69394 100644 --- a/gdb/dwarf2read.c +++ b/gdb/dwarf2read.c @@ -25126,7 +25126,8 @@ dwarf2_find_containing_comp_unit (sect_offset sect_off, mid_cu = dwarf2_per_objfile->all_comp_units[mid]; cu_off = &mid_cu->sect_off; if (mid_cu->is_dwz > offset_in_dwz - || (mid_cu->is_dwz == offset_in_dwz && *cu_off >= sect_off)) + || (mid_cu->is_dwz == offset_in_dwz + && *cu_off + mid_cu->length >= sect_off)) high = mid; else low = mid + 1;