gdb/dwarf: fix order of operations when reading .debug_names
Checks
| Context |
Check |
Description |
| linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Build passed
|
| linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Build passed
|
| linaro-tcwg-bot/tcwg_gdb_check--master-arm |
success
|
Test passed
|
| linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
success
|
Test passed
|
Commit Message
From: Simon Marchi <simon.marchi@polymtl.ca>
PR 34149 shows a hard to reproduce bug. The problem sometimes shows up
when reading in `.debug_names` indexes, manifesting like this:
warning: Section .debug_names has incorrect entry in CU table, ignoring .debug_names.
This is shown even if the CU table in the `.debug_names` index
accurately describes the CUs in `.debug_info`.
The bug was introduced by commit b301725d35c1 ("gdb/dwarf: read foreign
type units").
Before b301725d35c1, the steps when reading a `.debug_names` index were:
- call to create_all_units(), which filled the all_units vector by
walking `.debug_info` and `.debug_types` and then sorted it.
- calls to the build_and_check_* functions, which do some lookups in
the all_units vector, requiring it to be sorted.
All good. Post-b301725d35c1, we have:
- call to create_all_units(), which still fills the all_units vector by
walking `.debug_info` and `.debug_types`, but does not sort it
anymore.
- calls to the build_and_check_* functions, which do some lookups in
the all_units vector, requiring it to be sorted (oops)
- call to create_foreign_type_units_from_debug_names(), which may add
more items to the all_units vector.
- call to finalize_all_units(), to sort the all_units vector.
The sorting of all_units was taken out of create_all_units() on purpose.
Because create_foreign_type_units_from_debug_names() may add more units,
we don't want to sort the vector in create_all_units() and then sort it
again after create_foreign_type_units_from_debug_names().
The build_and_check_* functions do some lookups in the all_units vector
with dwarf2_find_unit(), which uses std::lower_bound() internally. This
happens while the all_units vector is not sorted according to the
sorting key, all_units_less_than(), which makes the search is bogus.
dwarf2_find_unit() might return "no such unit", when in fact, the given
unit exists. This leads build_and_check_cu_list_from_debug_names() to
believe that the list of CUs in the `.debug_names` index does not match
what was built from reading `.debug_info`, we get the warning shown
above, and the index gets wrongfully rejected.
For this bug to show up, it is necessary to have type units in a
`.debug_types`. The sorting key for units in the all_units vector is:
section pointer (`dwarf2_section_info *`), then the unit offset into the
section. If all units come from the same section (`.debug_info`), then
this bug does not show up because the all_units vector happens to be
sorted after create_all_units() runs. This happens for example in test
gdb.dwarf2/debug-names-bad-cu-index.exp, where we create some DWARF 4
units and then a `.debug_names` index out of them.
It must also happen for the `dwarf2_section_info *` for `.debug_types`
to be "less than" the `dwarf2_section_info *` for `.debug_info`,
otherwise the all_units also happens to be sorted after
create_all_units() runs. This entirely depends on the whims of the
memory allocator.
I was not able to reproduce it in my normal GDB dev build, but I was
able in a "thread sanitizer" build, for some reason. Not because there
is an actual threading issue, but I think because it shuffled the memory
allocations in just the right way to make the bug happen. And then, it
only happened when doing "make check", I could not get it to reproduce
when running interactively. Using -D_GLIBCXX_DEBUG was immensely
useful, because it pointed out that the "range must be sorted"
precondition for std::lower_bound() was not met, which made the problem
immediately obvious:
(gdb) file /home/simark/build/binutils-gdb-tsan/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index
Reading symbols from /home/simark/build/binutils-gdb-tsan/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index...
/usr/include/c++/16.1.1/bits/stl_algo.h:1981:
In function:
constexpr _FIter std::lower_bound(_FIter, _FIter, const _Tp&, _Compare)
[with _FIter = gnu_debug::_Safe_iterator<gnu_cxx::
normal_iterator<unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter>*,
vector<unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter>,
allocator<unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter> > > >,
debug::vector<unique_ptr<dwarf2_per_cu, dwarf2_per_cu_deleter> >,
random_access_iterator_tag>; _Tp = section_and_offset; _Compare =
dwarf2_find_unit(const section_and_offset&,
dwarf2_per_bfd*)::<lambda(const dwarf2_per_cu_up&, const
section_and_offset&)>]
Error: elements in iterator range [first, last) are not partitioned by the
predicate __comp and value __val.
Knowing all this, is it easy to artificially reproduce the problem by
making create_all_units() reverse the all_units vector, like so:
// Reverse the all_units vector
std::reverse (per_objfile->per_bfd->all_units.begin (),
per_objfile->per_bfd->all_units.end ());
The fix is to reorder the operations in this way
- call create_all_units(), to fill the all_units vector by walking
`.debug_info` and `.debug_types`
- call create_foreign_type_units_from_debug_names(), which may add more
items to the all_units vector.
- call finalize_all_units(), to sort the all_units vector.
- call the build_and_check_* functions, which is fine now that
all_units is sorted.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=34149
Change-Id: I315f391605af549b55341a167683d2dc6203bcea
---
gdb/dwarf2/read-debug-names.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
base-commit: 5100d2b65d6fb6d571893be40b2849b3cb2abe59
Comments
>>>>> "Simon" == Simon Marchi <simon.marchi@efficios.com> writes:
Simon> - calls to the build_and_check_* functions, which do some lookups in
Simon> the all_units vector, requiring it to be sorted (oops)
I wonder if either there's some way to assert that the vector must be
sorted on entry, or if there's a way to have a self-sorting data type
here, rendering this kind of bug impossible.
Anyway I think this is ok.
Approved-By: Tom Tromey <tom@tromey.com>
Tom
On 2026-05-20 11:15, Tom Tromey wrote:
>>>>>> "Simon" == Simon Marchi <simon.marchi@efficios.com> writes:
>
> Simon> - calls to the build_and_check_* functions, which do some lookups in
> Simon> the all_units vector, requiring it to be sorted (oops)
>
> I wonder if either there's some way to assert that the vector must be
> sorted on entry, or if there's a way to have a self-sorting data type
> here, rendering this kind of bug impossible.
assert: in dwarf2_find_unit, we could check that the vector is sorted,
but that would replicate the check done by -D_GLIBCXX_DEBUG. I'm not
sure we want to pay this price for every dwarf2_find_unit call in
production binaries.
Alternatively, we could maintain a flag that says "is all_units
currently sorted", clear that flag when doing any modification to the
vector, and set it in the sort_all_units method. All modifications to
the vector should be done through methods to ensure the flag gets
cleared. I think that would be cheap enough to implement, I'll give it
a try.
self-sorting data type: we could insert the new units at the right place
from the start, but that would be less efficient, because that would be
akin to an insertion sort. It's more efficient to do one sort at the
end.
We could perhaps use std::set, but there is also the case in
fill_in_sig_entry_from_dwo_file, when the section of a unit becomes
known at a later time. At this point, the key for the unit would
change, so we would still need to remove it and reinsert it in the set.
I don't see a way to enforce that automatically.
> Anyway I think this is ok.
> Approved-By: Tom Tromey <tom@tromey.com>
Thanks, I'll push it and work on a patch to add the assert.
Simon
@@ -1075,7 +1075,15 @@ dwarf2_read_debug_names (dwarf2_per_objfile *per_objfile)
}
}
+ /* Populate the `all_units` vector. */
create_all_units (per_objfile);
+ create_foreign_type_units_from_debug_names (per_objfile->per_bfd, map);
+
+ /* Sort the `all_units` vector. This must be done before the
+ `build_and_check_*` calls below, since they do binary search lookups in
+ that vector. */
+ finalize_all_units (per_objfile->per_bfd);
+
if (!build_and_check_cu_lists_from_debug_names (per_bfd, map, dwz_map))
return false;
@@ -1097,12 +1105,6 @@ dwarf2_read_debug_names (dwarf2_per_objfile *per_objfile)
return false;
}
- create_foreign_type_units_from_debug_names (per_objfile->per_bfd, map);
-
- /* create_foreign_type_units_from_debug_names may add more entries to the
- ALL_UNITS vector, so it must be called before finalize_all_units. */
- finalize_all_units (per_objfile->per_bfd);
-
per_bfd->debug_aranges.read (per_objfile->objfile);
/* There is a single address map for the whole index (coming from