[PATCHv5,3/4] gdb: improve shared library build-id check for core-files

Message ID dae8d6d63a35d4a7f86c81cd6f035e1f6f9680de.1724263147.git.aburgess@redhat.com
State New
Headers
Series More build-id checking when opening core files |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Andrew Burgess Aug. 21, 2024, 6:01 p.m. UTC
  When GDB opens a core file, in 'core_target::build_file_mappings ()',
we collection information about the files that are mapped into the
core file, specifically, the build-id and the DT_SONAME attribute for
the file, which will be set for some shared libraries.

We then cache the DT_SONAME to build-id information on the core file
bfd object in the function set_cbfd_soname_build_id.

Later, when we are loading the shared libraries for the core file, we
can use the library's file name to look in the DT_SONAME to build-id
map, and, if we find a matching entry, we can use the build-id to
validate that we are loading the correct shared library.

This works OK, but has some limitations: not every shared library will
have a DT_SONAME attribute.  Though it is good practice to add such an
attribute, it's not required.  A library without this attribute will
not have its build-id checked, which can lead to GDB loading the wrong
shared library.

What I want to do in this commit is to improve GDB's ability to use
the build-ids extracted in core_target::build_file_mappings to both
validate the shared libraries being loaded, and then to use these
build-ids to potentially find (via debuginfod) the shared library.

To do this I propose making the following changes to GDB:

(1) Rather than just recording the DT_SONAME to build-id mapping in
set_cbfd_soname_build_id, we should also record, the full filename to
build-id mapping, and also the memory ranges to build-id mapping for
every memory range covered by every mapped file.

(2) Add a new callback solib_ops::find_solib_addr.  This callback
takes a solib object and returns an (optional) address within the
inferior that is part of this library.  We can use this address to
find a mapped file using the stored memory ranges which will increase
the cases in which a match can be found.

(3) Move the mapped file record keeping out of solib.c and into
corelow.c.  Future commits will make use of this information from
other parts of GDB.  This information was never solib specific, it
lived in the solib.c file because that was the only user of the data,
but really, the data is all about the core file, and should be stored
in core_target, other parts of GDB can then query this data as needed.

Now, when we load a shared library for a core file, we do the
following lookups:

  1. Is the exact filename of the shared library found in the filename
  to build-id map?  If so then use this build-id for validation.

  2. Find an address within the shared library using ::find_solib_addr
  and then look for an entry in the mapped address to build-id map.
  If an entry is found then use this build-id.

  3. Finally, look in the soname to build-id map.  If an entry is
  found then use this build-id.

The addition of step #2 here means that GDB is now far more likely to
find a suitable build-id for a shared library.  Having acquired a
build-id the existing code for using debuginfod to lookup a shared
library object can trigger more often.

On top of this, we also create a build-id to filename map.  This is
useful as often a shared library is implemented as a symbolic link to
the actual shared library file.  The mapped file information is stored
based on the actual, real file name, while the shared library
information holds the original symbolic link file name.

If when loading the shared library, we find the symbolic link has
disappeared, we can use the build-id to file name map to check if the
actual file is still around, if it is (and if the build-id matches)
then we can fall back to use that file.  This is another way in which
we can slightly increase the chances that GDB will find the required
files when loading a core file.

Adding all of the above required pretty much a full rewrite of the
existing set_cbfd_soname_build_id function and the corresponding
get_cbfd_soname_build_id function, so I have taken the opportunity to
move the information caching out of solib.c and into corelow.c where
it is now accessed through the function core_target_find_mapped_file.

At this point the benefit of this move is not entirely obvious, though
I don't think the new location is significantly worse than where it
was originally.  The benefit though is that the cached information is
no longer tied to the shared library loading code.

I already have a second set of patches (not in this series) that make
use of this caching from elsewhere in GDB.  I've not included those
patches in this series as this series is already pretty big, but even
if those follow up patches don't arrive, I think the new location is
just as good as the original location.

Rather that caching the information within the core file BFD via the
registry mechanism, the information used for the mapped file lookup is
now stored within the core_file target directly.
---
 gdb/corelow.c                                 | 334 +++++++++++++++++-
 gdb/gdbcore.h                                 |  66 ++++
 gdb/solib-aix.c                               |   5 +
 gdb/solib-darwin.c                            |   5 +
 gdb/solib-dsbt.c                              |   5 +
 gdb/solib-frv.c                               |   5 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   5 +
 gdb/solib.c                                   | 128 +++----
 gdb/solib.h                                   |   7 -
 gdb/solist.h                                  |  22 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 +++
 .../gdb.debuginfod/solib-with-soname.exp      | 260 ++++++++++++++
 15 files changed, 847 insertions(+), 91 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
  

Patch

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 27984952ead..e23d7d78882 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -60,6 +60,119 @@ 
 #define O_LARGEFILE 0
 #endif
 
+/* A mem_range and the build-id associated with the file mapped into the
+   given range.  */
+
+struct mem_range_and_build_id
+{
+  mem_range_and_build_id (mem_range &&r, const bfd_build_id *id)
+    : range (r),
+      build_id (id)
+  { /* Nothing.  */ }
+
+  /* A range of memory addresses.  */
+  mem_range range;
+
+  /* The build-id of the file mapped into RANGE.  */
+  const bfd_build_id *build_id;
+};
+
+/* An instance of this class is created within the core_target and is used
+   to hold all the information that relating to mapped files, their address
+   ranges, and their corresponding build-ids.  */
+
+struct mapped_file_info
+{
+  /* See comment on function definition.  */
+
+  void add (const char *soname, const char *expected_filename,
+	    const char *actual_filename, std::vector<mem_range> &&ranges,
+	    const bfd_build_id *build_id);
+
+  /* See comment on function definition.  */
+
+  std::optional <core_target_mapped_file_info>
+  lookup (const char *filename, const std::optional<CORE_ADDR> &addr);
+
+private:
+
+  /* Helper for ::lookup.  BUILD_ID is a build-id that was found in
+     one of the data structures within this class.  Lookup the
+     corresponding filename in m_build_id_to_filename_map and return a pair
+     containing the build-id and filename.
+
+     If no corresponding filename is found in m_build_id_to_filename_map
+     then the returned pair contains BUILD_ID and an empty string.
+
+     If BUILD_ID is nullptr then the returned pair contains nullptr and an
+     empty string.  */
+
+  struct core_target_mapped_file_info
+  make_result (const bfd_build_id *build_id)
+  {
+    if (build_id != nullptr)
+      {
+      auto it = m_build_id_to_filename_map.find (build_id);
+      if (it != m_build_id_to_filename_map.end ())
+	return { build_id, it->second };
+    }
+
+    return { build_id, {} };
+  }
+
+  /* A type that maps a string to a build-id.  */
+  using string_to_build_id_map
+    = std::unordered_map<std::string, const bfd_build_id *>;
+
+  /* A type that maps a build-id to a string.  */
+  using build_id_to_string_map
+    = std::unordered_map<const bfd_build_id *, std::string>;
+
+  /* When loading a core file, the build-ids are extracted based on the
+     file backed mappings.  This map associates the name of a file that was
+     mapped into the core file with the corresponding build-id.  The
+     build-id pointers in this map will never be nullptr as we only record
+     files if they have a build-id.  */
+
+  string_to_build_id_map m_filename_to_build_id_map;
+
+  /* Map a build-id pointer back to the name of the file that was mapped
+     into the inferior's address space.  If we lookup a matching build-id
+     using either a soname or an address then this map allows us to also
+     provide a full path to a file with a matching build-id.  */
+
+  build_id_to_string_map m_build_id_to_filename_map;
+
+  /* If the file that was mapped into the core file was a shared library
+     then it might have a DT_SONAME tag in its .dynamic section, this tag
+     contains the name of a shared object.  When opening a shared library,
+     if it's basename appears in this map then we can use the corresponding
+     build-id.
+
+     In the rare case that two different files have the same DT_SONAME
+     value then the build-id pointer in this map will be nullptr, this
+     indicates that it's not possible to find a build-id based on the given
+     DT_SONAME value.  */
+
+  string_to_build_id_map m_soname_to_build_id_map;
+
+  /* This vector maps memory ranges onto an associated build-id.  The
+     ranges are those of the files mapped into the core file.
+
+     Entries in this vector must not overlap, and are sorted be increasing
+     memory address.  Within each entry the build-id pointer will not be
+     nullptr.
+
+     While building this vector the entries are not sorted, they are
+     sorted once after the table has finished being built.  */
+
+  std::vector<mem_range_and_build_id> m_address_to_build_id_list;
+
+  /* False if address_to_build_id_list is unsorted, otherwise true.  */
+
+  bool m_address_to_build_id_list_sorted = false;
+};
+
 /* The core file target.  */
 
 static const target_info core_target_info = {
@@ -136,6 +249,13 @@  class core_target final : public process_stratum_target
   /* See definition.  */
   void info_proc_mappings (struct gdbarch *gdbarch);
 
+  std::optional <core_target_mapped_file_info>
+  lookup_mapped_file_info (const char *filename,
+			   const std::optional<CORE_ADDR> &addr)
+  {
+    return m_mapped_file_info.lookup (filename, addr);
+  }
+
 private: /* per-core data */
 
   /* Get rid of the core inferior.  */
@@ -158,7 +278,13 @@  class core_target final : public process_stratum_target
      still be useful.  */
   std::vector<mem_range> m_core_unavailable_mappings;
 
-  /* Build m_core_file_mappings.  Called from the constructor.  */
+  /* Data structure that holds information mapping filenames and address
+     ranges to the corresponding build-ids as well as the reverse build-id
+     to filename mapping.  */
+  mapped_file_info m_mapped_file_info;
+
+  /* Build m_core_file_mappings and m_mapped_file_info.  Called from the
+     constructor.  */
   void build_file_mappings ();
 
   /* FIXME: kettenis/20031023: Eventually this field should
@@ -354,6 +480,10 @@  core_target::build_file_mappings ()
 	    }
 	}
 
+      std::vector<mem_range> ranges;
+      for (const mapped_file::region &region : file_data.regions)
+	ranges.emplace_back (region.start, region.end - region.start);
+
       if (expanded_fname == nullptr
 	  || abfd == nullptr
 	  || !bfd_check_format (abfd.get (), bfd_object))
@@ -430,16 +560,26 @@  core_target::build_file_mappings ()
 	    }
 	}
 
-      /* If this is a bfd of a shared library, record its soname and
-	 build-id.  */
-      if (file_data.build_id != nullptr && abfd != nullptr)
+      /* If this is a bfd with a build-id then record the filename,
+	 optional soname (DT_SONAME .dynamic attribute), and the range of
+	 addresses at which this bfd is mapped.  This information can be
+	 used to perform build-id checking when loading the shared
+	 libraries.  */
+      if (file_data.build_id != nullptr)
 	{
-	  gdb::unique_xmalloc_ptr<char> soname
-	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
+	  normalize_mem_ranges (&ranges);
+
+	  const char *actual_filename = nullptr;
+	  gdb::unique_xmalloc_ptr<char> soname;
+	  if (abfd != nullptr)
+	    {
+	      actual_filename = bfd_get_filename (abfd.get ());
+	      soname = gdb_bfd_read_elf_soname (actual_filename);
+	    }
 
-	  if (soname != nullptr)
-	    set_cbfd_soname_build_id (current_program_space->cbfd,
-				      soname.get (), file_data.build_id);
+	  m_mapped_file_info.add (soname.get (), filename.c_str (),
+				  actual_filename, std::move (ranges),
+				  file_data.build_id);
 	}
     }
 
@@ -1634,6 +1774,182 @@  maintenance_print_core_file_backed_mappings (const char *args, int from_tty)
     targ->info_proc_mappings (targ->core_gdbarch ());
 }
 
+/* Add more details discovered while processing the core-file's mapped file
+   information, we're building maps between filenames and the corresponding
+   build-ids, between address ranges and the corresponding build-ids, and
+   also a reverse map between build-id and the corresponding filename.
+
+   SONAME is the DT_SONAME attribute extracted from the .dynamic section of
+   a shared library that was mapped into the core file.  This can be
+   nullptr if the mapped files was not a shared library, or didn't have a
+   DT_SONAME attribute.
+
+   EXPECTED_FILENAME is the name of the file that was mapped into the
+   inferior as extracted from the core file, this should never be nullptr.
+
+   ACTUAL_FILENAME is the name of the actual file GDB found to provide the
+   mapped file information, this can be nullptr if GDB failed to find a
+   suitable file.  This might be different to EXPECTED_FILENAME, e.g. GDB
+   might have downloaded the file from debuginfod and so ACTUAL_FILENAME
+   will be a file in the debuginfod client cache.
+
+   RANGES is the list of memory ranges at which this file was mapped into
+   the inferior.
+
+   BUILD_ID is the build-id for this mapped file, this will never be
+   nullptr.  Not every mapped file will have a build-id, but there's no
+   point calling this function if we failed to find a build-id; this
+   structure only exists so we can lookup files based on their build-id.  */
+
+void
+mapped_file_info::add (const char *soname,
+		       const char *expected_filename,
+		       const char *actual_filename,
+		       std::vector<mem_range> &&ranges,
+		       const bfd_build_id *build_id)
+{
+  gdb_assert (build_id != nullptr);
+  gdb_assert (expected_filename != nullptr);
+
+  if (soname != nullptr)
+    {
+      /* If we already have an entry with this SONAME then this indicates
+	 that the inferior has two files mapped into memory with different
+	 file names (and most likely different build-ids), but with the
+	 same DT_SONAME attribute.  In this case we can't use the
+	 DT_SONAME to figure out the expected build-id of a shared
+	 library, so poison the entry for this SONAME by setting the entry
+	 to nullptr.  */
+      auto it = m_soname_to_build_id_map.find (soname);
+      if (it != m_soname_to_build_id_map.end ()
+	  && it->second != nullptr
+	  && !build_id_equal (it->second, build_id))
+	m_soname_to_build_id_map[soname] = nullptr;
+      else
+	m_soname_to_build_id_map[soname] = build_id;
+    }
+
+  /* When the core file is initially opened and the mapped files are
+     parsed, we group the build-id information based on the file name.  As
+     a consequence, we should see each EXPECTED_FILENAME value exactly
+     once.  This means that each insertion should always succeed.  */
+  const auto [it, inserted]
+    = m_filename_to_build_id_map.emplace (expected_filename, build_id);
+  gdb_assert (inserted);
+
+  /* Setup the reverse build-id to file name map.  */
+  if (actual_filename != nullptr)
+    m_build_id_to_filename_map.emplace (build_id, actual_filename);
+
+  /* Setup the list of memory range to build-id objects.  */
+  for (mem_range &r : ranges)
+    m_address_to_build_id_list.emplace_back (std::move (r), build_id);
+
+  /* At this point the m_address_to_build_id_list is unsorted (we just
+     added some entries to the end of the list).  All entries should be
+     added before any look-ups are performed, and the list is only sorted
+     when the first look-up is performed.  */
+  gdb_assert (!m_address_to_build_id_list_sorted);
+}
+
+/* FILENAME is the name of a file GDB is trying to load, and ADDR is
+   (optionally) an address within the file in the inferior's address space.
+
+   Search through the information gathered from the core-file's mapped file
+   information looking for a file named FILENAME, or for a file that covers
+   ADDR.  If a match is found then return the build-id for the file along
+   with the location where GDB found the mapped file.
+
+   The location of the mapped file might be the empty string if GDB was
+   unable to find the mapped file.
+
+   If no build-id can be found for FILENAME then GDB will return a pair
+   containing nullptr (for the build-id) and an empty string for the file
+   name.  */
+
+std::optional <core_target_mapped_file_info>
+mapped_file_info::lookup (const char *filename,
+			  const std::optional<CORE_ADDR> &addr)
+{
+  if (filename != nullptr)
+    {
+      /* If there's a matching entry in m_filename_to_build_id_map then the
+	 associated build-id will not be nullptr, and can be used to
+	 validate that FILENAME is correct.  */
+      auto it = m_filename_to_build_id_map.find (filename);
+      if (it != m_filename_to_build_id_map.end ())
+	return make_result (it->second);
+    }
+
+  if (addr.has_value ())
+    {
+      /* On the first lookup, sort the address_to_build_id_list.  */
+      if (!m_address_to_build_id_list_sorted)
+	{
+	  std::sort (m_address_to_build_id_list.begin (),
+		     m_address_to_build_id_list.end (),
+		     [] (const mem_range_and_build_id &a,
+			 const mem_range_and_build_id &b) {
+		       return a.range < b.range;
+		     });
+	  m_address_to_build_id_list_sorted = true;
+	}
+
+      /* Look for the first entry whose range's start address is not less
+	 than, or equal too, the address ADDR.  If we find such an entry,
+	 then the previous entry's range might contain ADDR.  If it does
+	 then that previous entry's build-id can be used.  */
+      auto it = std::lower_bound
+	(m_address_to_build_id_list.begin (),
+	 m_address_to_build_id_list.end (),
+	 *addr,
+	 [] (const mem_range_and_build_id &a,
+	     const CORE_ADDR &b) {
+	  return a.range.start <= b;
+	});
+
+      if (it != m_address_to_build_id_list.begin ())
+	{
+	  --it;
+
+	  if (it->range.contains (*addr))
+	    return make_result (it->build_id);
+	}
+    }
+
+  if (filename != nullptr)
+    {
+      /* If the basename of FILENAME appears in m_soname_to_build_id_map
+	 then when the mapped files were processed, we saw a file with a
+	 DT_SONAME attribute corresponding to FILENAME, use that build-id
+	 to validate FILENAME.
+
+	 However, the build-id in this map might be nullptr if we saw
+	 multiple mapped files with the same DT_SONAME attribute (though
+	 this should be pretty rare).  */
+      auto it
+	= m_soname_to_build_id_map.find (lbasename (filename));
+      if (it != m_soname_to_build_id_map.end ()
+	  && it->second != nullptr)
+	return make_result (it->second);
+    }
+
+  return {};
+}
+
+/* See gdbcore.h.  */
+
+std::optional <core_target_mapped_file_info>
+core_target_find_mapped_file (const char *filename,
+			      std::optional<CORE_ADDR> addr)
+{
+  core_target *targ = get_current_core_target ();
+  if (targ == nullptr || current_program_space->cbfd.get () == nullptr)
+    return {};
+
+  return targ->lookup_mapped_file_info (filename, addr);
+}
+
 void _initialize_corelow ();
 void
 _initialize_corelow ()
diff --git a/gdb/gdbcore.h b/gdb/gdbcore.h
index d6aeb356e37..782643acf56 100644
--- a/gdb/gdbcore.h
+++ b/gdb/gdbcore.h
@@ -196,4 +196,70 @@  class thread_section_name
   std::string m_storage;
 };
 
+/* Type returned from core_target_find_mapped_file.  Holds information
+   about a mapped file that was processed when a core file was initially
+   loaded.  */
+struct core_target_mapped_file_info
+{
+  /* Constructor.  BUILD_ID is not nullptr, and is the build-id for the
+     mapped file.  FILENAME is the location of the file that GDB loaded to
+     provide the mapped file.  This might be different from the name of the
+     mapped file mentioned in the core file, e.g. if GDB downloads a file
+     from debuginfod then FILENAME would point into the debuginfod client
+     cache.  The FILENAME can be the empty string if GDB was unable to find
+     a file to provide the mapped file.  */
+
+  core_target_mapped_file_info (const bfd_build_id *build_id,
+				const std::string filename)
+    : m_build_id (build_id),
+      m_filename (filename)
+  {
+    gdb_assert (m_build_id != nullptr);
+  }
+
+  /* The build-id for this mapped file.  */
+
+  const bfd_build_id *
+  build_id () const
+  {
+    return m_build_id;
+  }
+
+  /* The file GDB used to provide this mapped file.  */
+
+  const std::string &
+  filename () const
+  {
+    return m_filename;
+  }
+
+private:
+  const bfd_build_id *m_build_id = nullptr;
+  const std::string m_filename;
+};
+
+/* If the current inferior has a core_target for its process target, then
+   lookup information about a mapped file that was discovered when the
+   core file was loaded.
+
+   The FILENAME is the file we're looking for.  The ADDR, if provided, is a
+   mapped address within the inferior which is known to be part of the file
+   we are looking for.
+
+   As an example, when loading shared libraries this function can be
+   called, in that case FILENAME will be the name of the shared library
+   that GDB is trying to load and ADDR will be an inferior address which is
+   part of the shared library we are looking for.
+
+   This function looks for a mapped file which matches FILENAME and/or
+   which covers ADDR and returns information about that file.
+
+   The returned information includes the name of the mapped file if known
+   and the build-id for the mapped file if known.
+
+   */
+std::optional<core_target_mapped_file_info>
+core_target_find_mapped_file (const char *filename,
+			      std::optional<CORE_ADDR> addr);
+
 #endif /* !defined (GDBCORE_H) */
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index a50bb165c19..1c319c22223 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -689,6 +689,11 @@  const solib_ops solib_aix_so_ops =
   solib_aix_open_symbol_file_object,
   solib_aix_in_dynsym_resolve_code,
   solib_aix_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_solib_aix ();
diff --git a/gdb/solib-darwin.c b/gdb/solib-darwin.c
index dd4da93a4c8..c7a7f624102 100644
--- a/gdb/solib-darwin.c
+++ b/gdb/solib-darwin.c
@@ -665,4 +665,9 @@  const solib_ops darwin_so_ops =
   open_symbol_file_object,
   darwin_in_dynsym_resolve_code,
   darwin_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-dsbt.c b/gdb/solib-dsbt.c
index d61100463cc..2751ded40c5 100644
--- a/gdb/solib-dsbt.c
+++ b/gdb/solib-dsbt.c
@@ -914,6 +914,11 @@  const solib_ops dsbt_so_ops =
   open_symbol_file_object,
   dsbt_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_dsbt_solib ();
diff --git a/gdb/solib-frv.c b/gdb/solib-frv.c
index c1a57683866..2ce35c4bc28 100644
--- a/gdb/solib-frv.c
+++ b/gdb/solib-frv.c
@@ -1086,4 +1086,9 @@  const solib_ops frv_so_ops =
   open_symbol_file_object,
   frv_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 1a0e5424806..4e8a85d2cef 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -3355,6 +3355,15 @@  svr4_iterate_over_objfiles_in_search_order
     }
 }
 
+/* See solib_ops::find_solib_addr in solist.h.  */
+
+static std::optional<CORE_ADDR>
+svr4_find_solib_addr (solib &so)
+{
+  auto *li = gdb::checked_static_cast<lm_info_svr4 *> (so.lm_info.get ());
+  return li->l_addr_inferior;
+}
+
 const struct solib_ops svr4_so_ops =
 {
   svr4_relocate_section_addresses,
@@ -3369,6 +3378,7 @@  const struct solib_ops svr4_so_ops =
   svr4_keep_data_in_core,
   svr4_update_solib_event_breakpoints,
   svr4_handle_solib_event,
+  svr4_find_solib_addr,
 };
 
 void _initialize_svr4_solib ();
diff --git a/gdb/solib-target.c b/gdb/solib-target.c
index 6563da05a47..3fffa7c2595 100644
--- a/gdb/solib-target.c
+++ b/gdb/solib-target.c
@@ -412,4 +412,9 @@  const solib_ops solib_target_so_ops =
   solib_target_open_symbol_file_object,
   solib_target_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib.c b/gdb/solib.c
index 49f607514dc..184e1901acc 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -469,58 +469,6 @@  solib_bfd_open (const char *pathname)
   return abfd;
 }
 
-/* Mapping of a core file's shared library sonames to their respective
-   build-ids.  Added to the registries of core file bfds.  */
-
-typedef std::unordered_map<std::string, std::string> soname_build_id_map;
-
-/* Key used to associate a soname_build_id_map to a core file bfd.  */
-
-static const struct registry<bfd>::key<soname_build_id_map>
-  cbfd_soname_build_id_data_key;
-
-/* See solib.h.  */
-
-void
-set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname,
-			  const bfd_build_id *build_id)
-{
-  gdb_assert (abfd.get () != nullptr);
-  gdb_assert (soname != nullptr);
-  gdb_assert (build_id != nullptr);
-
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
-
-  if (mapptr == nullptr)
-    mapptr = cbfd_soname_build_id_data_key.emplace (abfd.get ());
-
-  (*mapptr)[soname] = build_id_to_string (build_id);
-}
-
-/* If SONAME had a build-id associated with it in ABFD's registry by a
-   previous call to set_cbfd_soname_build_id then return the build-id
-   as a NULL-terminated hex string.  */
-
-static gdb::unique_xmalloc_ptr<char>
-get_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname)
-{
-  if (abfd.get () == nullptr || soname == nullptr)
-    return {};
-
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
-
-  if (mapptr == nullptr)
-    return {};
-
-  auto it = mapptr->find (lbasename (soname));
-  if (it == mapptr->end ())
-    return {};
-
-  return make_unique_xstrdup (it->second.c_str ());
-}
-
 /* Given a pointer to one of the shared objects in our list of mapped
    objects, use the recorded name to open a bfd descriptor for the
    object, build a section table, relocate all the section addresses
@@ -540,36 +488,60 @@  solib_map_sections (solib &so)
 
   gdb::unique_xmalloc_ptr<char> filename (tilde_expand (so.so_name.c_str ()));
   gdb_bfd_ref_ptr abfd (ops->bfd_open (filename.get ()));
-  gdb::unique_xmalloc_ptr<char> build_id_hexstr
-    = get_cbfd_soname_build_id (current_program_space->cbfd,
-				so.so_name.c_str ());
+
+  /* If we have a core target then the core target might have some helpful
+     information (i.e. build-ids) about the shared libraries we are trying
+     to load.  Grab those hints now and use the below to validate or find
+     the shared libraries.
+
+     If we don't have a core target then this will return an empty struct
+     with no hint information, we then lookup the shared library based on
+     its filename.  */
+  std::optional<CORE_ADDR> solib_addr = ops->find_solib_addr (so);
+  std::optional <const core_target_mapped_file_info> mapped_file_info
+    = core_target_find_mapped_file (so.so_name.c_str (), solib_addr);
 
   /* If we already know the build-id of this solib from a core file, verify
      it matches ABFD's build-id.  If there is a mismatch or the solib wasn't
      found, attempt to query debuginfod for the correct solib.  */
-  if (build_id_hexstr.get () != nullptr)
+  if (mapped_file_info.has_value ())
     {
-      bool mismatch = false;
-
-      if (abfd != nullptr && abfd->build_id != nullptr)
-	{
-	  std::string build_id = build_id_to_string (abfd->build_id);
+      bool mismatch = (abfd != nullptr
+		       && build_id_bfd_get (abfd.get ()) != nullptr
+		       && !build_id_equal (mapped_file_info->build_id (),
+					   build_id_bfd_get (abfd.get ())));
 
-	  if (build_id != build_id_hexstr.get ())
-	    mismatch = true;
-	}
       if (abfd == nullptr || mismatch)
 	{
-	  scoped_fd fd = debuginfod_exec_query (
-	    (const unsigned char *) build_id_hexstr.get (), 0,
-	    so.so_name.c_str (), &filename);
-
-	  if (fd.get () >= 0)
-	    abfd = ops->bfd_open (filename.get ());
-	  else if (mismatch)
-	    warning (_ ("Build-id of %ps does not match core file."),
-		     styled_string (file_name_style.style (),
-				    filename.get ()));
+	  /* If GDB found a suitable file during the file mapping
+	     processing stage then lets use that.  We don't check the
+	     build-id after opening this file, either this file was found
+	     by build-id, in which case it's going to match, or this file
+	     doesn't have a build-id, so checking tells us nothing.
+	     However, if it was good enough during the mapped file
+	     processing, we assume it's good enough now.  */
+	  if (!mapped_file_info->filename ().empty ())
+	    abfd = ops->bfd_open (mapped_file_info->filename ().c_str ());
+	  else
+	    abfd = nullptr;
+
+	  if (abfd == nullptr)
+	    {
+	      scoped_fd fd = debuginfod_exec_query
+		(mapped_file_info->build_id ()->data,
+		 mapped_file_info->build_id ()->size,
+		 so.so_name.c_str (), &filename);
+
+	      if (fd.get () >= 0)
+		abfd = ops->bfd_open (filename.get ());
+	      else if (mismatch)
+		{
+		  warning (_ ("Build-id of %ps does not match core file."),
+			   styled_string (file_name_style.style (),
+					  filename.get ()));
+		  abfd = nullptr;
+		}
+	    }
 	}
     }
 
@@ -1708,6 +1680,14 @@  remove_user_added_objfile (struct objfile *objfile)
     }
 }
 
+/* See solist.h.  */
+
+std::optional<CORE_ADDR>
+default_find_solib_addr (solib &so)
+{
+  return {};
+}
+
 void _initialize_solib ();
 
 void
diff --git a/gdb/solib.h b/gdb/solib.h
index 25ed77c1050..eacff65f9a6 100644
--- a/gdb/solib.h
+++ b/gdb/solib.h
@@ -136,11 +136,4 @@  extern void update_solib_breakpoints (void);
 
 extern void handle_solib_event (void);
 
-/* Associate SONAME with BUILD_ID in ABFD's registry so that it can be
-   retrieved with get_cbfd_soname_build_id.  */
-
-extern void set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd,
-				      const char *soname,
-				      const bfd_build_id *build_id);
-
 #endif /* SOLIB_H */
diff --git a/gdb/solist.h b/gdb/solist.h
index 6b2a97adf7a..b898ae31899 100644
--- a/gdb/solist.h
+++ b/gdb/solist.h
@@ -159,6 +159,23 @@  struct solib_ops
      NULL, in which case no specific preprocessing is necessary
      for this target.  */
   void (*handle_event) (void);
+
+  /* Return an address within the inferior's address space which is known
+     to be part of SO.  If there is no such address, or GDB doesn't know
+     how to figure out such an address then an empty optional is
+     returned.
+
+     The returned address can be used when loading the shared libraries
+     for a core file.  GDB knows the build-ids for (some) files mapped
+     into the inferior's address space, and knows the address ranges which
+     those mapped files cover.  If GDB can figure out a representative
+     address for the library then this can be used to match a library to a
+     mapped file, and thus to a build-id.  GDB can then use this
+     information to help locate the shared library objfile, if the objfile
+     is not in the expected place (as defined by the shared libraries file
+     name).  */
+
+  std::optional<CORE_ADDR> (*find_solib_addr) (solib &so);
 };
 
 /* A unique pointer to a so_list.  */
@@ -178,4 +195,9 @@  extern gdb_bfd_ref_ptr solib_bfd_fopen (const char *pathname, int fd);
 /* Find solib binary file and open it.  */
 extern gdb_bfd_ref_ptr solib_bfd_open (const char *in_pathname);
 
+/* A default implementation of the solib_ops::find_solib_addr callback.
+   This just returns an empty std::optional<CORE_ADDR> indicating GDB is
+   unable to find an address within the library SO.  */
+extern std::optional<CORE_ADDR> default_find_solib_addr (solib &so);
+
 #endif
diff --git a/gdb/testsuite/gdb.base/solib-search.exp b/gdb/testsuite/gdb.base/solib-search.exp
index f038a0425c4..bf021c8a8d2 100644
--- a/gdb/testsuite/gdb.base/solib-search.exp
+++ b/gdb/testsuite/gdb.base/solib-search.exp
@@ -43,7 +43,11 @@  set right_binfile2_lib \
 set binfile1_lib [standard_output_file ${libname1}.so]
 set binfile2_lib [standard_output_file ${libname2}.so]
 
-set lib_flags [list debug ldflags=-Wl,-Bsymbolic]
+# When this test was written, GDB's ability to track down shared
+# libraries for a core file based on the build-id much poorer.  As GDB
+# has improved we now need to disable build-ids in order for this test
+# to function as expected.
+set lib_flags [list debug no-build-id ldflags=-Wl,-Bsymbolic]
 set wrong_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=1"
 set right_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=8192 additional_flags=-DRIGHT"
 
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
new file mode 100644
index 00000000000..30e9267dc01
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
@@ -0,0 +1,39 @@ 
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+/* It is important that these two variables have names of the same length
+   so that the debug information in the two library versions is laid out
+   the same.  If they differ then the .dynamic section might move, which
+   will trigger a different check within GDB than the one we actually want
+   to check.  */
+
+#if LIB_VERSION == 1
+volatile int *library_1_var = (volatile int *) 0x12345678;
+#elif LIB_VERSION == 2
+volatile int *library_2_var = (volatile int *) 0x11223344;
+#else
+# error Unknown library version
+#endif
+
+int
+foo (void)
+{
+  /* This should trigger a core dump.  */
+  abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
new file mode 100644
index 00000000000..a51d48e2f2e
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
@@ -0,0 +1,41 @@ 
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <dlfcn.h>
+#include <stddef.h>
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+/* This is updated by the .exp file.  */
+char *libname = "libfoo_2.so";
+
+int
+main (void)
+{
+  void *handle;
+  int res, tmp;
+
+  handle = dlopen (libname, RTLD_LAZY);
+  assert (handle != NULL);
+
+  res = foo ();
+
+  tmp = dlclose (handle);
+  assert (tmp == 0);
+
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
new file mode 100644
index 00000000000..5bf6817daf8
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -0,0 +1,260 @@ 
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test exercises GDB's ability to validate build-ids when loading
+# shared libraries for a core file.
+#
+# The test creates two "versions" of a shared library, sets up a
+# symlink to point to one version of the library, and creates a core file.
+#
+# We then try re-loading the core file and executable and check that
+# GDB is able to correctly load the shared library.  To confuse things
+# we retarget the library symlink at the other version of the library.
+#
+# After that we repeat the test, but this time deleting the symlink
+# completely.
+#
+# Then we remove the version of the library completely, at this point
+# we do expect GDB to give a warning about being unable to load the library.
+#
+# And finally, we setup debuginfod and have it serve the missing
+# library file, GDB should correctly download the library file.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+require {istarget "*-linux*"}
+require {!is_remote host}
+require {!using_fission}
+
+standard_testfile -1.c -2.c
+
+# Build two similar, but slightly different versions of the shared
+# library.  Both libraries have DT_SONAME set to the generic
+# libfoo.so, we'll create a symlink with that name later.
+set library_1_filename [standard_output_file "libfoo_1.so"]
+set library_2_filename [standard_output_file "libfoo_2.so"]
+
+# The generic name for the library.
+set library_filename [standard_output_file "libfoo.so"]
+
+# When compiling a shared library the -Wl,-soname,NAME option is
+# automatically added based on the final name of the library.  We want
+# to compile libfoo_1.so, but set the soname to libfoo.so.  To achieve
+# this we first compile into libfoo.so, and then rename the library to
+# libfoo_1.so.
+if {[build_executable "build libfoo_1.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=1 }] == -1} {
+    return
+}
+remote_exec build "mv ${library_filename} ${library_1_filename}"
+
+# See the comment above, but this time we rename to libfoo_2.so.
+if {[build_executable "build libfoo_2.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=2 }] == -1} {
+    return
+}
+remote_exec build "mv ${library_filename} ${library_2_filename}"
+
+# Create libfoo.so symlink to the libfoo_1.so library.  If this
+# symlink creation fails then we assume we can't create symlinks on
+# this host.  If this succeeds then later symlink creation is required
+# to succeed, and will trigger an FAIL if it doesn't.
+set status \
+    [remote_exec build \
+	 "ln -sf ${library_1_filename} ${library_filename}"]
+if {[lindex $status 0] != 0} {
+    unsupported "host does not support symbolic links"
+    return
+}
+
+# Build the executable.  This links against libfoo.so, which is
+# poining at libfoo_1.so.  Just to confuse things even more, this
+# executable uses dlopen to load libfoo_2.so.  Weird!
+if { [build_executable "build executable" ${binfile} ${srcfile2} \
+	  [list debug shlib=${library_filename} shlib_load]] == -1 } {
+    return
+}
+
+# If the board file is automatically splitting the debug information
+# into a separate file (e.g. the cc-with-gnu-debuglink.exp board) then
+# this test isn't going to work.
+clean_restart
+gdb_file_cmd $binfile
+if {$gdb_file_cmd_debug_info ne "debug"} {
+    unsupported "failed to find debug information"
+    return
+}
+if {[regexp "${testfile}.debug" $gdb_file_cmd_msg]} {
+    unsupported "debug information has been split to a separate file"
+    return
+}
+
+# Run BINFILE which will generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Helper proc to load global BINFILE and then load global COREFILE.
+#
+# If EXPECT_WARNING is true then we require a warning about being
+# unable to load the shared library symbols, otherwise, EXPECT_WARNING
+# is false and we require no warning.
+#
+# If EXPECT_DOWNLOAD is true then we require a line indicating that
+# the shared library is being downloaded from debuginfod, otherwise
+# the shared library should not be downloaded.
+proc load_exec_and_core_file { expect_warning expect_download testname } {
+    with_test_prefix $testname {
+	clean_restart $::binfile
+
+	set saw_warning false
+	set saw_download false
+	set saw_generated false
+	set saw_terminated false
+
+	gdb_test_multiple "core-file $::corefile" "load core file" {
+	    -re "^Core was generated by \[^\r\n\]+\r\n" {
+		set saw_generated true
+		exp_continue
+	    }
+	    -re "^Program terminated with signal \[^\r\n\]+\r\n" {
+		set saw_terminated true
+		exp_continue
+	    }
+	    -re "^warning: Can't open file \[^\r\n\]+ during file-backed mapping note processing\r\n" {
+		# Ignore warnings from the file backed mapping phase.
+		exp_continue
+	    }
+	    -re "^warning: Could not load shared library symbols for \[^\r\n\]+/libfoo\\.so\\.\r\n" {
+		set saw_warning true
+		exp_continue
+	    }
+	    -re "^Downloading executable for \[^\r\n\]+/libfoo_1\\.so\\.\\.\\.\r\n" {
+		set saw_download true
+		exp_continue
+	    }
+	    -re "^$::gdb_prompt $" {
+		gdb_assert { $saw_generated && $saw_terminated \
+				 && $saw_warning == $expect_warning \
+				 && $saw_download == $expect_download } \
+		    $gdb_test_name
+	    }
+	    -re "^\[^\r\n\]*\r\n" {
+		exp_continue
+	    }
+	}
+
+	# If we don't expect a warning then debug symbols from the
+	# shared library should be available.  Confirm we can read a
+	# variable from the shared library.  If we do expect a warning
+	# then the shared library debug symbols have not loaded, and
+	# the library variable should not be available.
+	if { !$expect_warning } {
+	    gdb_test "print/x library_1_var" " = 0x12345678" \
+		"check library_1_var can be read"
+	} else {
+	    gdb_test "print/x library_1_var" \
+		"^No symbol \"library_1_var\" in current context\\." \
+		"check library_1_var cannot be read"
+	}
+    }
+}
+
+# Initial test, just load the executable and core file.  At this point
+# everything should load fine as everything is where we expect to find
+# it.
+load_exec_and_core_file false false \
+    "load core file, all libraries as expected"
+
+# Update libfoo.so symlink to point at the second library then reload
+# the core file.  GDB should spot that the symlink points to the wrong
+# file, but should be able to figure out the correct file to load as
+# the right file will be in the mapped file list.
+set status [remote_exec build \
+		"ln -sf ${library_2_filename} ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "update library symlink to point to the wrong file"
+
+load_exec_and_core_file false false \
+    "load core file, symlink points to wrong file"
+
+# Remove libfoo.so symlink and reload the core file.  As in the
+# previous test GDB should be able to figure out the correct file to
+# load as the correct file will still appear in the mapped file list.
+set status [remote_exec build "rm -f ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } "remove library symlink"
+
+load_exec_and_core_file false false \
+    "load core file, symlink removed"
+
+# Remove LIBRARY_1_FILENAME.  We'll now see a warning that the mapped
+# file can't be loaded (we ignore that warning), and we'll see a
+# warning that the shared library can't be loaded.
+set library_1_backup_filename ${library_1_filename}.backup
+set status \
+    [remote_exec build \
+	 "mv ${library_1_filename} ${library_1_backup_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "remove libfoo_1.so"
+
+load_exec_and_core_file true false \
+    "load core file, libfoo_1.so removed"
+
+# Setup a debuginfod server which can serve the original shared
+# library file.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_1_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    save_vars { GDBFLAGS } {
+	append GDBFLAGS " -ex \"set debuginfod enabled on\""
+
+	# Reload the executable and core file.  GDB should download
+	# the file libfoo_1.so using debuginfod during the mapped file
+	# phase, but should then reuse that download during the shared
+	# library phase.
+	load_exec_and_core_file false true \
+	    "load core file, use debuginfod"
+    }
+}
+
+stop_debuginfod