[2/2] gdb/dwarf: track whether the all_units vector is sorted

Message ID 20260520182101.4074675-2-simon.marchi@polymtl.ca
State New
Headers
Series [1/2] gdb/dwarf: change signatured_type_up to include deleter type |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Simon Marchi May 20, 2026, 6:19 p.m. UTC
  From: Simon Marchi <simon.marchi@polymtl.ca>

dwarf2_find_unit performs a binary search over dwarf2_per_bfd::all_units
using std::lower_bound.  Before calling std::lower_bound, we want to
make sure that all_units is properly sorted.

Track the state of whether all_units is considered sorted with a new
dwarf2_per_bfd::all_units_sorted flag, and assert it in
dwarf2_find_unit.  This will help catch bugs where we call
dwarf2_find_unit on a non sorted vector, for instance what was fixed by
commit fbaef7de7701 ("gdb/dwarf: fix order of operations when reading
.debug_names").

The flag is set:

 - At initialization time, because an empty vector is sorted.

 - Whenever the vector is cleared, for the same reason (added a helper
   method for that).

 - After sorting in dwarf2_per_bfd::sort_all_unit.

It is cleared:

 - Whenever a unit is appended to all_units.  To keep this centralized,
   add a new dwarf2_per_bfd::add_unit helper that appends the unit and
   clears the flag.

 - Whenever a unit's sort key (section / sect_off) is modified.  That is
   done in dwarf2_per_cu::set_section and dwarf2_per_cu::set_sect_off.

Note that the flag is cleared when appending a unit even if the vector
would by chance already be sorted.  This is good, because it will catch
mistakes even on machines where a problem with std::lower_bound would
not manifest.

I checked that this patch would have caught the problem before commit
fbaef7de7701:

    (gdb) file /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index
    Reading symbols from /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index...
    /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18092: internal-error: dwarf2_find_unit: Assertion `per_bfd->all_units_sorted' failed.

Change-Id: I9b3e52bb33b02763594efa381761f383ee344317
---
 gdb/dwarf2/read-debug-names.c |  2 +-
 gdb/dwarf2/read-gdb-index.c   |  4 ++--
 gdb/dwarf2/read.c             | 28 ++++++++++++++++++++++++++--
 gdb/dwarf2/read.h             | 34 ++++++++++++++++++++++------------
 4 files changed, 51 insertions(+), 17 deletions(-)
  

Comments

Tom Tromey May 21, 2026, 2:25 p.m. UTC | #1
>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:

Simon> Note that the flag is cleared when appending a unit even if the vector
Simon> would by chance already be sorted.  This is good, because it will catch
Simon> mistakes even on machines where a problem with std::lower_bound would
Simon> not manifest.

Simon> I checked that this patch would have caught the problem before commit
Simon> fbaef7de7701:

Simon>     (gdb) file /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index
Simon>     Reading symbols from /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index...
Simon>     /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18092: internal-error: dwarf2_find_unit: Assertion `per_bfd->all_units_sorted' failed.

Thanks, and especially for checking this.
Approved-By: Tom Tromey <tom@tromey.com>

Tom
  
Simon Marchi May 21, 2026, 5:45 p.m. UTC | #2
On 5/21/26 10:25 AM, Tom Tromey wrote:
>>>>>> "Simon" == simon marchi <simon.marchi@polymtl.ca> writes:
> 
> Simon> Note that the flag is cleared when appending a unit even if the vector
> Simon> would by chance already be sorted.  This is good, because it will catch
> Simon> mistakes even on machines where a problem with std::lower_bound would
> Simon> not manifest.
> 
> Simon> I checked that this patch would have caught the problem before commit
> Simon> fbaef7de7701:
> 
> Simon>     (gdb) file /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index
> Simon>     Reading symbols from /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/debug-names-bad-cu-index/debug-names-bad-cu-index...
> Simon>     /home/simark/src/binutils-gdb/gdb/dwarf2/read.c:18092: internal-error: dwarf2_find_unit: Assertion `per_bfd->all_units_sorted' failed.
> 
> Thanks, and especially for checking this.
> Approved-By: Tom Tromey <tom@tromey.com>
> 
> Tom

Thanks, pushed both patches.

Simon
  

Patch

diff --git a/gdb/dwarf2/read-debug-names.c b/gdb/dwarf2/read-debug-names.c
index dfa54f7914a2..12ecac74ffdc 100644
--- a/gdb/dwarf2/read-debug-names.c
+++ b/gdb/dwarf2/read-debug-names.c
@@ -1036,7 +1036,7 @@  create_foreign_type_units_from_debug_names (dwarf2_per_bfd *per_bfd,
 
       map.foreign_type_units.emplace_back (sig_type.get ());
       per_bfd->signatured_types.emplace (sig_type.get ());
-      per_bfd->all_units.emplace_back (std::move (sig_type));
+      per_bfd->add_unit (std::move (sig_type));
     }
 }
 
diff --git a/gdb/dwarf2/read-gdb-index.c b/gdb/dwarf2/read-gdb-index.c
index e0f527b7503e..324984c0de2d 100644
--- a/gdb/dwarf2/read-gdb-index.c
+++ b/gdb/dwarf2/read-gdb-index.c
@@ -526,7 +526,7 @@  create_cus_from_gdb_index_list (dwarf2_per_bfd *per_bfd,
       dwarf2_per_cu_up per_cu = per_bfd->allocate_per_cu (section, sect_off,
 							  length, is_dwz);
       units.emplace_back (per_cu.get ());
-      per_bfd->all_units.emplace_back (std::move (per_cu));
+      per_bfd->add_unit (std::move (per_cu));
     }
 }
 
@@ -584,7 +584,7 @@  create_signatured_type_table_from_gdb_index
 
       sig_types_hash.emplace (sig_type.get ());
       units.emplace_back (sig_type.get ());
-      per_bfd->all_units.emplace_back (std::move (sig_type));
+      per_bfd->add_unit (std::move (sig_type));
     }
 
   per_bfd->signatured_types = std::move (sig_types_hash);
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 883b068ad8ae..c0d46683fa79 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -2247,7 +2247,7 @@  add_type_unit (dwarf2_per_bfd *per_bfd, dwarf2_section_info *section,
 					 false /* is_dwz */, sig);
 
   auto emplace_ret = per_bfd->signatured_types.emplace (sig_type.get ());
-  per_bfd->all_units.emplace_back (std::move (sig_type));
+  per_bfd->add_unit (std::move (sig_type));
 
   /* Assert that an insertion took place - that there wasn't a type unit with
      that signature already.  */
@@ -3390,7 +3390,7 @@  read_comp_units_from_section (dwarf2_per_objfile *per_objfile,
 	}
 
       info_ptr = info_ptr + this_cu->length ();
-      per_bfd->all_units.push_back (std::move (this_cu));
+      per_bfd->add_unit (std::move (this_cu));
     }
 }
 
@@ -3405,6 +3405,7 @@  dwarf2_per_bfd::sort_all_units ()
 		 return all_units_less_than (*a, { b->section (),
 						   b->sect_off () });
 	       });
+  this->all_units_sorted = true;
 }
 
 /* See read.h.  */
@@ -17908,6 +17909,27 @@  dwarf2_symbol_mark_computed (const struct attribute *attr, struct symbol *sym,
 
 /* See read.h.  */
 
+void
+dwarf2_per_cu::set_section (dwarf2_section_info *section)
+{
+  gdb_assert (section != nullptr);
+  gdb_assert (m_section == nullptr);
+  m_section = section;
+  m_per_bfd->all_units_sorted = false;
+}
+
+/* See read.h.  */
+
+void
+dwarf2_per_cu::set_sect_off (sect_offset sect_off)
+{
+  gdb_assert (m_sect_off == invalid_sect_offset);
+  m_sect_off = sect_off;
+  m_per_bfd->all_units_sorted = false;
+}
+
+/* See read.h.  */
+
 void
 dwarf2_per_cu::set_lang (enum language lang, dwarf_source_language dw_lang)
 {
@@ -18067,6 +18089,8 @@  dwarf2_find_containing_unit (const section_and_offset &target,
 dwarf2_per_cu *
 dwarf2_find_unit (const section_and_offset &start, dwarf2_per_bfd *per_bfd)
 {
+  gdb_assert (per_bfd->all_units_sorted);
+
   auto it = std::lower_bound (per_bfd->all_units.begin (),
 			      per_bfd->all_units.end (), start,
 			      [] (const dwarf2_per_cu_up &per_cu,
diff --git a/gdb/dwarf2/read.h b/gdb/dwarf2/read.h
index fd6ee1b8b800..3fbcc22b973a 100644
--- a/gdb/dwarf2/read.h
+++ b/gdb/dwarf2/read.h
@@ -285,22 +285,13 @@  struct dwarf2_per_cu
   { return m_section; }
 
   /* Set the section of this unit.  */
-  void set_section (dwarf2_section_info *section)
-  {
-    gdb_assert (section != nullptr);
-    gdb_assert (m_section == nullptr);
-    m_section = section;
-  }
+  void set_section (dwarf2_section_info *section);
 
   sect_offset sect_off () const
   { return m_sect_off; }
 
   /* Set the section offset of this unit.  */
-  void set_sect_off (sect_offset sect_off)
-  {
-    gdb_assert (m_sect_off == invalid_sect_offset);
-    m_sect_off = sect_off;
-  }
+  void set_sect_off (sect_offset sect_off);
 
   bool is_dwz () const
   { return m_is_dwz; }
@@ -601,6 +592,13 @@  struct dwarf2_per_bfd
      dwarf2_find_containing_unit to be able to perform a binary search.  */
   void sort_all_units ();
 
+  /* Clear the all_units vector.  */
+  void clear_all_units ()
+  {
+    this->all_units.clear ();
+    this->all_units_sorted = true;
+  }
+
   /* Return the separate '.dwz' debug file.  If there is no
      .gnu_debugaltlink or .debug_sup section in the file, then the
      result depends on REQUIRE: if REQUIRE is true, error out; if
@@ -641,6 +639,13 @@  struct dwarf2_per_bfd
      This one is used when only the signature is known at creation time.  */
   signatured_type_up allocate_signatured_type (ULONGEST signature);
 
+  /* Append UNIT to ALL_UNITS.  */
+  void add_unit (dwarf2_per_cu_up unit)
+  {
+    this->all_units_sorted = false;
+    this->all_units.emplace_back (std::move (unit));
+  }
+
   /* Map all the DWARF section data needed when scanning
      .debug_info.  */
   void map_info_sections (struct objfile *objfile);
@@ -691,6 +696,11 @@  struct dwarf2_per_bfd
      DW_FORM_ref_addr attributes (reference by section offset).  */
   std::vector<dwarf2_per_cu_up> all_units;
 
+  /* True if ALL_UNITS is currently sorted according to all_units_less_than.
+     Set by sort_all_units, cleared whenever a unit is added to ALL_UNITS or
+     a unit's sort key (section / sect_off) is modified.  */
+  bool all_units_sorted = false;
+
   /* Number of compilation and type units in the ALL_UNITS vector.  */
   unsigned int num_comp_units = 0;
   unsigned int num_type_units = 0;
@@ -769,7 +779,7 @@  struct scoped_remove_all_units
     if (m_per_bfd == nullptr)
       return;
 
-    m_per_bfd->all_units.clear ();
+    m_per_bfd->clear_all_units ();
     m_per_bfd->num_comp_units = 0;
     m_per_bfd->num_type_units = 0;
   }