[3/4] gdb: improve shared library build-id check for core-files

Message ID d429564b8ceec92624b14d5f8f367206d9e96a4b.1715955328.git.aburgess@redhat.com
State New
Headers
Series More build-id checking when opening core files |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 fail Testing failed
linaro-tcwg-bot/tcwg_gdb_check--master-arm fail Testing failed

Commit Message

Andrew Burgess May 17, 2024, 2:18 p.m. UTC
  When GDB opens a core file, in 'core_target::build_file_mappings ()',
we collection information about the files that are mapped into the
core file, specifically, the build-id and the DT_SONAME attribute for
the file, which will be set for some shared libraries.

We then cache the DT_SONAME to build-id information on the core file
bfd object in the function set_cbfd_soname_build_id.

Later, when we are loading the shared libraries for the core file, we
can use the libraries file name to look in the DT_SONAME to build-id
map, and, if we find a matching entry, we can use the build-id to
validate that we are loading the correct shared library.

This works OK, but has some limitations: not every shared library will
have a DT_SONAME attribute.  Though it is good practice to add such an
attribute, it's not required.  A library without this attribute will
not have its build-id checked, which can lead to GDB loading the wrong
shared library.

What I want to do in this commit is to improve GDB's ability to use
the build-id extracted during the ::build_file_mappings phase to both
validate the shared libraries being loaded, and then to use these
build-ids to potentially find (via debuginfod) the shared library.

To do this two changes need to be made to GDB.

First, rather than just recording the soname to build-id mapping in
set_cbfd_soname_build_id, I now also record, the full mapped filename
to build-id mapping, and also the memory ranges to build-id mapping
for every memory range covered by a mapped file.

Second, I've added a new callback solib_ops::find_solib_addr.  This
callback takes a solib object and returns an (optional) address within
the inferior that is part of this library.

Now, when we load a shared library for a core file, we do the
following lookups:

  1. Is the exact filename of the shared library found in the filename
  to build-id map?  If so then use this build-id for validation.

  2. Find an address within the shared library using ::find_solib_addr
  and then look for an entry in the mapped address to build-id map.
  If an entry is found then use this build-id.

  3. Finally, look in the soname to build-id map.  If an entry is
  found then use this build-id.

The addition of step #2 here means that GDB is now far more likely to
find a suitable build-id for a shared library.  Having acquired a
build-id the existing code for using debuginfod to lookup a shared
library object can trigger more often.

On top of this, I also build a reverse build-id to filename map in
set_cbfd_soname_build_id.  This is useful as often a shared library is
implemented as a symbolic link to the actual shared library file.  The
mapped file information is stored based on the actual final, real file
name, while the shared library information holds the original symbolic
link file name.

If when loading the shared library, we find the symbolic link has
disappeared, then we can use the build-id to file name map to check if
the actual file is still around, if it is (and if the build-id
matches) then we can fall back to use that file.

I've also renamed set_cbfd_soname_build_id to set_cbfd_solib_build_id
and get_cbfd_soname_build_id to get_cbfd_solib_build_id as the
'soname' in the function names was now only part of the larger job
these functions do.
---
 gdb/corelow.c                                 |  33 +-
 gdb/solib-aix.c                               |   6 +
 gdb/solib-darwin.c                            |   6 +
 gdb/solib-dsbt.c                              |   6 +
 gdb/solib-frv.c                               |   6 +
 gdb/solib-svr4.c                              |  10 +
 gdb/solib-target.c                            |   6 +
 gdb/solib.c                                   | 373 +++++++++++++++---
 gdb/solib.h                                   |  39 +-
 gdb/solist.h                                  |  22 ++
 gdb/testsuite/gdb.base/solib-search.exp       |   6 +-
 .../gdb.debuginfod/solib-with-soname-1.c      |  39 ++
 .../gdb.debuginfod/solib-with-soname-2.c      |  41 ++
 .../gdb.debuginfod/solib-with-soname.exp      | 242 ++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  13 +
 15 files changed, 784 insertions(+), 64 deletions(-)
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
 create mode 100644 gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
  

Patch

diff --git a/gdb/corelow.c b/gdb/corelow.c
index 28d619594d8..85bc3c26bea 100644
--- a/gdb/corelow.c
+++ b/gdb/corelow.c
@@ -418,6 +418,10 @@  core_target::build_file_mappings ()
 	    }
 	}
 
+      std::vector<mem_range> ranges;
+      for (const mapped_file::region &region : file_data.regions)
+	ranges.emplace_back (region.start, region.end - region.start);
+
       if (expanded_fname == nullptr
 	  || abfd == nullptr
 	  || !bfd_check_format (abfd.get (), bfd_object))
@@ -494,16 +498,29 @@  core_target::build_file_mappings ()
 	    }
 	}
 
-      /* If this is a bfd of a shared library, record its soname and
-	 build-id.  */
-      if (file_data.build_id != nullptr && abfd != nullptr)
+      /* If this is a bfd with a build-id then record the filename,
+	 optional soname (DT_SONAME .dynamic attribute), and the range of
+	 addresses at which this bfd is mapped.  This information can be
+	 used to perform build-id checking when loading the shared
+	 libraries.  */
+      if (file_data.build_id != nullptr)
 	{
-	  gdb::unique_xmalloc_ptr<char> soname
-	    = gdb_bfd_read_elf_soname (bfd_get_filename (abfd.get ()));
+	  normalize_mem_ranges (&ranges);
+
+	  const char *actual_filename = nullptr;
+	  gdb::unique_xmalloc_ptr<char> soname;
+	  if (abfd != nullptr)
+	    {
+	      actual_filename = bfd_get_filename (abfd.get ());
+	      soname = gdb_bfd_read_elf_soname (actual_filename);
+	    }
 
-	  if (soname != nullptr)
-	    set_cbfd_soname_build_id (current_program_space->cbfd,
-				      soname.get (), file_data.build_id);
+	  set_cbfd_solib_build_id (current_program_space->cbfd,
+				   soname.get (),
+				   filename.c_str (),
+				   actual_filename,
+				   std::move (ranges),
+				   file_data.build_id);
 	}
     }
 
diff --git a/gdb/solib-aix.c b/gdb/solib-aix.c
index a50bb165c19..a9242408137 100644
--- a/gdb/solib-aix.c
+++ b/gdb/solib-aix.c
@@ -689,6 +689,12 @@  const solib_ops solib_aix_so_ops =
   solib_aix_open_symbol_file_object,
   solib_aix_in_dynsym_resolve_code,
   solib_aix_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_solib_aix ();
diff --git a/gdb/solib-darwin.c b/gdb/solib-darwin.c
index f0828fdf102..e9a85db1e5e 100644
--- a/gdb/solib-darwin.c
+++ b/gdb/solib-darwin.c
@@ -665,4 +665,10 @@  const solib_ops darwin_so_ops =
   open_symbol_file_object,
   darwin_in_dynsym_resolve_code,
   darwin_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-dsbt.c b/gdb/solib-dsbt.c
index 11225f72ed0..479f32182d2 100644
--- a/gdb/solib-dsbt.c
+++ b/gdb/solib-dsbt.c
@@ -913,6 +913,12 @@  const solib_ops dsbt_so_ops =
   open_symbol_file_object,
   dsbt_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
 
 void _initialize_dsbt_solib ();
diff --git a/gdb/solib-frv.c b/gdb/solib-frv.c
index 39508fab4c8..7502fe1f560 100644
--- a/gdb/solib-frv.c
+++ b/gdb/solib-frv.c
@@ -1084,4 +1084,10 @@  const solib_ops frv_so_ops =
   open_symbol_file_object,
   frv_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 1d4a50568d7..91359293e75 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -3353,6 +3353,15 @@  svr4_iterate_over_objfiles_in_search_order
     }
 }
 
+/* See solib_ops::find_solib_addr in solist.h.  */
+
+static std::optional<CORE_ADDR>
+svr4_find_solib_addr (solib &so)
+{
+  auto *li = gdb::checked_static_cast<lm_info_svr4 *> (so.lm_info.get ());
+  return li->l_addr_inferior;
+}
+
 const struct solib_ops svr4_so_ops =
 {
   svr4_relocate_section_addresses,
@@ -3368,6 +3377,7 @@  const struct solib_ops svr4_so_ops =
   svr4_keep_data_in_core,
   svr4_update_solib_event_breakpoints,
   svr4_handle_solib_event,
+  svr4_find_solib_addr,
 };
 
 void _initialize_svr4_solib ();
diff --git a/gdb/solib-target.c b/gdb/solib-target.c
index 6563da05a47..f89761f109f 100644
--- a/gdb/solib-target.c
+++ b/gdb/solib-target.c
@@ -412,4 +412,10 @@  const solib_ops solib_target_so_ops =
   solib_target_open_symbol_file_object,
   solib_target_in_dynsym_resolve_code,
   solib_bfd_open,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  nullptr,
+  default_find_solib_addr,
 };
diff --git a/gdb/solib.c b/gdb/solib.c
index db2ff5ce9a3..c39dfbcc78e 100644
--- a/gdb/solib.c
+++ b/gdb/solib.c
@@ -475,56 +475,304 @@  solib_bfd_open (const char *pathname)
   return abfd;
 }
 
-/* Mapping of a core file's shared library sonames to their respective
-   build-ids.  Added to the registries of core file bfds.  */
+/* A mem_range and the build-id associated with the file mapped into the
+   given range.  */
 
-typedef std::unordered_map<std::string, std::string> soname_build_id_map;
+struct mem_range_and_build_id
+{
+  mem_range_and_build_id (mem_range &&r, const bfd_build_id *id)
+    : range (r),
+      build_id (id)
+  { /* Nothing.  */ }
+
+  /* A range of memory addresses.  */
+  mem_range range;
+
+  /* The build-id of the file mapped into RANGE.  */
+  const bfd_build_id *build_id;
+};
+
+/* When a core file is opened set_cbfd_solib_build_id is called and the
+   data it is passed is used to create an instance of this class.
+
+   This class maps various information extracted from parsing the core
+   file onto the corresponding build-id.
+
+   When we latter load the shared libraries as part of the core file
+   opening process we can use this data structure to find the build-id
+   that corresponds to a shared library.
+
+   Knowing the build-id allows for two things, first, we can refuse to
+   load an incorrect shared library (a file is found, but the build-id
+   doesn't match), and second, we can use various strategies to try and
+   find the correct shared library based on the build-id (e.g. we might be
+   able to download the shared library from debuginfod).  */
+
+struct solib_core_file_build_id_data
+{
+  /* A type that maps a string to a build-id.  */
+  using string_to_build_id_map
+    = std::unordered_map<std::string, const bfd_build_id *>;
+
+  /* A type for a reverse build-id to string map.  We only store a sting
+     pointer in this map as we point back to the strings held in a
+     string_to_build_id_map, this avoids duplicating some strings.  */
+  using build_id_to_string_map
+    = std::unordered_map<const bfd_build_id *, std::string>;
+
+  /* When loading a core file, the build-ids are extracted based on the
+     file backed mappings.  This map associates the name of a file that was
+     mapped into the core file with the corresponding build-id.  When
+     opening a shared library, if the shared library filename appears in
+     this map, then we can use the corresponding build-id.  The build-id
+     pointers in this map will never be nullptr.  */
+
+  string_to_build_id_map filename_to_build_id_map;
+
+  /* Map a build-id pointer back to a filename.  The std::string pointers
+     in this map point back to the strings in FILENAME_TO_BUILD_ID_MAP.
+     If we end up figuring out the build-id from SONAME_TO_BUILD_ID_MAP or
+     from ADDRESS_TO_BUILD_ID_LIST then it might be that we can't find an
+     on-disk file.  In this case we can use this reverse map to find the
+     name of the mapped file, which might be usable.  */
+
+  build_id_to_string_map build_id_to_filename_map;
+
+  /* If the file that was mapped into the core file was a shared library
+     then it might have a DT_SONAME tag in its .dynamic section, this tag
+     contains the name of a shared object.  When opening a shared library,
+     if it's basename appears in this map then we can use the corresponding
+     build-id.
+
+     In the rare case that two different files have the same DT_SONAME
+     value then the build-id pointer in this map will be nullptr, this
+     indicates that it's not possible to find a build-id based on the
+     DT_SONAME value.  */
+
+  string_to_build_id_map soname_to_build_id_map;
+
+  /* This vector maps memory ranges onto an associated build-id.  The
+     ranges are those of the files mapped into the core file.  When opening
+     a shared library, if non of the string to build-id maps provide a
+     build-id then we can find a representative address for the library by
+     calling solib_ops::find_solib_addr() then then look in this vector for
+     a memory range that covers this address.  If a suitable match is found
+     then we use the corresponding build-id.
+
+     Entries in this vector should not overlap, and are sorted be
+     increasing memory address.  Within each entry the build-id pointer
+     will not be nullptr.  */
+
+  std::vector<mem_range_and_build_id> address_to_build_id_list;
+
+  /* The address_to_build_id_list is built over multiple calls to
+     set_cbfd_solib_build_id, each call corresponds to a single mapped
+     file and might add multiple entries to address_to_build_id_list.  It
+     would be wasteful to sort address_to_build_id_list each time some new
+     entries are added, better to record that the address_to_build_id_list
+     is unsorted, and then sort once the first time we are going to perform
+     a lookup in address_to_build_id_list, that is what this flag is for.
+     When this flag is false address_to_build_id_list is not sorted, and
+     needs sorting before we search the list.  When this flag is true the
+     address_to_build_id_list is sorted.  */
+
+  bool address_to_build_id_list_sorted = false;
+};
 
 /* Key used to associate a soname_build_id_map to a core file bfd.  */
 
-static const struct registry<bfd>::key<soname_build_id_map>
+static const struct registry<bfd>::key<solib_core_file_build_id_data>
   cbfd_soname_build_id_data_key;
 
 /* See solib.h.  */
 
 void
-set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname,
-			  const bfd_build_id *build_id)
+set_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd, const char *soname,
+			 const char *expected_filename,
+			 const char *actual_filename,
+			 std::vector<mem_range> &&ranges,
+			 const bfd_build_id *build_id)
 {
-  gdb_assert (abfd.get () != nullptr);
-  gdb_assert (soname != nullptr);
+  gdb_assert (cbfd.get () != nullptr);
   gdb_assert (build_id != nullptr);
+  gdb_assert (expected_filename != nullptr);
+
+  solib_core_file_build_id_data *lookup_data
+    = cbfd_soname_build_id_data_key.get (cbfd.get ());
+
+  if (lookup_data == nullptr)
+    {
+      lookup_data = cbfd_soname_build_id_data_key.emplace (cbfd.get ());
+      gdb_assert (lookup_data != nullptr);
+    }
+
+  if (soname != nullptr)
+    {
+      /* If we already have an entry with this SONAME then this indicates
+	 that the inferior has two files mapped into memory with different
+	 file names (and most likely different build-ids), but with the
+	 same DT_SONAME attribute.  In this case we can't use the
+	 DT_SONAME to figure out the expected build-id of a shared
+	 library, so poison the entry for this SONAME by setting the entry
+	 to nullptr.  */
+      auto it
+	= lookup_data->soname_to_build_id_map.find (soname);
+      if (it != lookup_data->soname_to_build_id_map.end ()
+	  && it->second != nullptr
+	  && !build_id_equal (it->second, build_id))
+	lookup_data->soname_to_build_id_map[soname] = nullptr;
+      else
+	lookup_data->soname_to_build_id_map[soname] = build_id;
+    }
+
+  /* When the core file is initially opened and the mapped files are
+     parsed, we group the build-id information based on the file name.  As
+     a consequence, we should see each EXPECTED_FILENAME value exactly
+     once.  This means that each insertion should always succeed.  */
+  const auto [it, inserted]
+    = lookup_data->filename_to_build_id_map.emplace (expected_filename,
+						     build_id);
+  gdb_assert (inserted);
+
+  /* Setup the reverse build-id to file name map.  We store a pointer to
+     the string in the filename_to_build_id_map, this saves creating
+     another copy of this string.  The two maps have the same lifetime, so
+     this should be fine.  */
+  if (actual_filename != nullptr)
+    lookup_data->build_id_to_filename_map.emplace (build_id,
+						   actual_filename);
+
+  /* Setup the list of memory range to build-id objects.  */
+  for (mem_range &r : ranges)
+    lookup_data->address_to_build_id_list.emplace_back (std::move (r),
+							build_id);
+
+  /* At this point the address_to_build_id_list is unsorted (we just added
+     some entries to the end of the list).  All entries should be added
+     before any look-ups are performed, and the list is only sorted when
+     the first look-up is performed.  */
+  gdb_assert (!lookup_data->address_to_build_id_list_sorted);
+}
 
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
+/* Helper for get_cbfd_solib_build_id.  BUILD_ID is a build-id that was
+   found in LOOKUP_DATA.  Look-up the corresponding file name in
+   LOOKUP_DATA.build_id_to_filename_map and return a pair containing
+   BUILD_ID and the file name pointer that we find.  If no corresponding
+   filename is found in the build_id_to_filename_map then the returned
+   pair contains BUILD_ID and an empty string.  If BUILD_ID is nullptr
+   then the returned pair contains nullptr and an empty string.  */
+
+static std::pair<const bfd_build_id *, const std::string>
+get_cbfd_solib_make_result (solib_core_file_build_id_data *lookup_data,
+			    const bfd_build_id *build_id)
+{
+  if (build_id != nullptr)
+    {
+      gdb_assert (lookup_data != nullptr);
 
-  if (mapptr == nullptr)
-    mapptr = cbfd_soname_build_id_data_key.emplace (abfd.get ());
+      auto it = lookup_data->build_id_to_filename_map.find (build_id);
+      if (it != lookup_data->build_id_to_filename_map.end ())
+	return { build_id, it->second };
+    }
 
-  (*mapptr)[soname] = build_id_to_string (build_id);
+  return { build_id, {} };
 }
 
-/* If SONAME had a build-id associated with it in ABFD's registry by a
-   previous call to set_cbfd_soname_build_id then return the build-id
-   as a NULL-terminated hex string.  */
+/* CBFD is the current core file bfd object, which might be nullptr,
+   FILENAME is the file name of the shared library GDB is trying to load,
+   and SOLIB_ADDR is (optionally) an address within the shared library.
 
-static gdb::unique_xmalloc_ptr<char>
-get_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd, const char *soname)
+   Look through the data previously registered via calls to
+   set_cbfd_solib_build_id and try to find a build-id corresponding to
+   FILENAME.
+
+   If a build-id is found then return a pair, the first item is the
+   build-id and the second item is a string which contains the file name of
+   the file GDB opened to provide the file backed mapping.  This file name
+   string might be empty if GDB didn't find a file to provide the file
+   backed mapping.
+
+   If no build-id can be found for FILENAME then GDB will return a pair
+   containing nullptr (for the build-id) and an empty string for the file
+   name.  */
+
+static std::pair<const bfd_build_id *, const std::string>
+get_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd, const char *filename,
+			 const std::optional<CORE_ADDR> &solib_addr)
 {
-  if (abfd.get () == nullptr || soname == nullptr)
-    return {};
+  if (cbfd.get () == nullptr)
+    return { nullptr, {} };
 
-  soname_build_id_map *mapptr
-    = cbfd_soname_build_id_data_key.get (abfd.get ());
+  solib_core_file_build_id_data *lookup_data
+    = cbfd_soname_build_id_data_key.get (cbfd.get ());
 
-  if (mapptr == nullptr)
-    return {};
+  if (lookup_data == nullptr)
+    return { nullptr, {} };
 
-  auto it = mapptr->find (lbasename (soname));
-  if (it == mapptr->end ())
-    return {};
+  if (filename != nullptr)
+    {
+      /* If there's a matching entry in filename_to_build_id_map then the
+	 associated build-id will not be nullptr, and can be used to
+	 validate that FILENAME is correct.  */
+      auto it = lookup_data->filename_to_build_id_map.find (filename);
+      if (it != lookup_data->filename_to_build_id_map.end ())
+	return get_cbfd_solib_make_result (lookup_data, it->second);
+    }
+
+  if (solib_addr.has_value ())
+    {
+      /* On the first lookup, sort the address_to_build_id_list.  */
+      if (!lookup_data->address_to_build_id_list_sorted)
+	{
+	  std::sort (lookup_data->address_to_build_id_list.begin (),
+		     lookup_data->address_to_build_id_list.end (),
+		     [] (const mem_range_and_build_id &a,
+			 const mem_range_and_build_id &b) {
+		       return a.range < b.range;
+		     });
+	  lookup_data->address_to_build_id_list_sorted = true;
+	}
+
+      /* Look for the first entry whose range's start address is not less
+	 than, or equal too, the address SOLIB_ADDR.  If we find such an
+	 entry, then the previous entry's range might contain SOLIB_ADDR.
+	 If it does then that previous entry's build-id can be used.  */
+      auto it = std::lower_bound
+	(lookup_data->address_to_build_id_list.begin (),
+	 lookup_data->address_to_build_id_list.end (),
+	 *solib_addr,
+	 [] (const mem_range_and_build_id &a,
+	     const CORE_ADDR &b) {
+	  return a.range.start <= b;
+	});
+
+      if (it != lookup_data->address_to_build_id_list.begin ())
+	{
+	  --it;
 
-  return make_unique_xstrdup (it->second.c_str ());
+	  if (it->range.contains (*solib_addr))
+	    return get_cbfd_solib_make_result (lookup_data, it->build_id);
+	}
+    }
+
+  if (filename != nullptr)
+    {
+      /* If the basename of FILENAME appears in the soname_to_build_id_map
+	 then when the mapped files were processed, we saw a file with a
+	 DT_SONAME attribute corresponding to FILENAME, use that build-id
+	 to validate FILENAME.
+
+	 However, the build-id in this map might be nullptr if we saw
+	 multiple mapped files with the same DT_SONAME attribute (though
+	 this should be pretty rare).  */
+      auto it
+	= lookup_data->soname_to_build_id_map.find (lbasename (filename));
+      if (it != lookup_data->soname_to_build_id_map.end ()
+	  && it->second != nullptr)
+	return get_cbfd_solib_make_result (lookup_data, it->second);
+    }
+
+  return { nullptr, {} };
 }
 
 /* Given a pointer to one of the shared objects in our list of mapped
@@ -546,36 +794,53 @@  solib_map_sections (solib &so)
 
   gdb::unique_xmalloc_ptr<char> filename (tilde_expand (so.so_name.c_str ()));
   gdb_bfd_ref_ptr abfd (ops->bfd_open (filename.get ()));
-  gdb::unique_xmalloc_ptr<char> build_id_hexstr
-    = get_cbfd_soname_build_id (current_program_space->cbfd,
-				so.so_name.c_str ());
+  std::optional<CORE_ADDR> solib_addr = ops->find_solib_addr (so);
+
+  const auto [expected_build_id, mapped_filename]
+    = get_cbfd_solib_build_id (current_program_space->cbfd,
+				so.so_name.c_str (),
+				solib_addr);
 
   /* If we already know the build-id of this solib from a core file, verify
      it matches ABFD's build-id.  If there is a mismatch or the solib wasn't
      found, attempt to query debuginfod for the correct solib.  */
-  if (build_id_hexstr.get () != nullptr)
+  if (expected_build_id != nullptr)
     {
-      bool mismatch = false;
+      bool mismatch = (abfd != nullptr
+		       && build_id_bfd_get (abfd.get ()) != nullptr
+		       && !build_id_equal (expected_build_id,
+					   build_id_bfd_get (abfd.get ())));
 
-      if (abfd != nullptr && abfd->build_id != nullptr)
-	{
-	  std::string build_id = build_id_to_string (abfd->build_id);
-
-	  if (build_id != build_id_hexstr.get ())
-	    mismatch = true;
-	}
       if (abfd == nullptr || mismatch)
 	{
-	  scoped_fd fd = debuginfod_exec_query (
-	    (const unsigned char *) build_id_hexstr.get (), 0,
-	    so.so_name.c_str (), &filename);
-
-	  if (fd.get () >= 0)
-	    abfd = ops->bfd_open (filename.get ());
-	  else if (mismatch)
-	    warning (_ ("Build-id of %ps does not match core file."),
-		     styled_string (file_name_style.style (),
-				    filename.get ()));
+	  /* If GDB found a suitable file during the file mapping
+	     processing stage then lets use that.  We don't check the
+	     build-id after opening this file, either this file was found
+	     by build-id, in which case it's going to match, or this file
+	     doesn't have a build-id, so checking tells us nothing.
+	     However, if it was good enough during the mapped file
+	     processing, we assume it's good enough now.  */
+	  if (!mapped_filename.empty ())
+	    abfd = ops->bfd_open (mapped_filename.c_str ());
+	  else
+	    abfd = nullptr;
+
+	  if (abfd == nullptr)
+	    {
+	      scoped_fd fd = debuginfod_exec_query
+		(expected_build_id->data, expected_build_id->size,
+		 so.so_name.c_str (), &filename);
+
+	      if (fd.get () >= 0)
+		abfd = ops->bfd_open (filename.get ());
+	      else if (mismatch)
+		{
+		  warning (_ ("Build-id of %ps does not match core file."),
+			   styled_string (file_name_style.style (),
+					  filename.get ()));
+		  abfd = nullptr;
+		}
+	    }
 	}
     }
 
@@ -1705,6 +1970,14 @@  remove_user_added_objfile (struct objfile *objfile)
     }
 }
 
+/* See solist.h.  */
+
+std::optional<CORE_ADDR>
+default_find_solib_addr (solib &so)
+{
+  return {};
+}
+
 void _initialize_solib ();
 
 void
diff --git a/gdb/solib.h b/gdb/solib.h
index f7a93c0718f..9cd9f111cde 100644
--- a/gdb/solib.h
+++ b/gdb/solib.h
@@ -29,6 +29,7 @@  struct program_space;
 #include "gdb_bfd.h"
 #include "symfile-add-flags.h"
 #include "gdbsupport/function-view.h"
+#include "memrange.h"
 
 /* Value of the 'set debug solib' configuration variable.  */
 
@@ -136,11 +137,39 @@  extern void update_solib_breakpoints (void);
 
 extern void handle_solib_event (void);
 
-/* Associate SONAME with BUILD_ID in ABFD's registry so that it can be
-   retrieved with get_cbfd_soname_build_id.  */
+/* This function is used to pass information extracted when processing the
+   core file's mapped file section through to the shared library processing
+   parts of GDB.  When the mapped files are processed GDB is (sometimes)
+   able to find the build-ids for the mapped files.  These build-ids are
+   recorded by calling this function, and can be used during shared library
+   loading to ensure the correct shared libraries are loaded.
 
-extern void set_cbfd_soname_build_id (gdb_bfd_ref_ptr abfd,
-				      const char *soname,
-				      const bfd_build_id *build_id);
+   CBFD is the core file bfd object to record this information against.
+
+   SONAME is the DT_SONAME attribute extracted from the .dynamic section of
+   the shared library that was mapped into the core file.  This can be
+   nullptr.
+
+   EXPECTED_FILENAME is the name of the file that was mapped into the
+   inferior as extracted from the core file, this should never be nullptr.
+
+   ACTUAL_FILENAME is the name of the actual file GDB found to provide the
+   mapped file information, this can be nullptr if GDB failed to find a
+   suitable file.
+
+   RANGES is the list of memory ranges at which this file was mapped into
+   the inferior.
+
+   BUILD_ID this is the build-id for this mapped file, this will never be
+   nullptr.  Not every mapped file will have a build-id, but there's no
+   point calling this function if we failed to find a build-id; this is all
+   about allowing the shared library code to find the build-id.  */
+
+extern void set_cbfd_solib_build_id (gdb_bfd_ref_ptr cbfd,
+				     const char *soname,
+				     const char *expected_filename,
+				     const char *actual_filename,
+				     std::vector<mem_range> &&ranges,
+				     const bfd_build_id *build_id);
 
 #endif /* SOLIB_H */
diff --git a/gdb/solist.h b/gdb/solist.h
index f0d22080de1..bd60987f49c 100644
--- a/gdb/solist.h
+++ b/gdb/solist.h
@@ -167,6 +167,23 @@  struct solib_ops
      NULL, in which case no specific preprocessing is necessary
      for this target.  */
   void (*handle_event) (void);
+
+  /* Return an address within the inferior's address space which is known
+     to be part of SO.  If there is no such address, or GDB doesn't know
+     how to figure out such an address then an empty optional is
+     returned.
+
+     The returned address can be used when loading the shared libraries
+     for a core file.  GDB knows the build-ids for (some) files mapped
+     into the inferior's address space, and knows the address ranges which
+     those mapped files cover.  If GDB can figure out a representative
+     address for the library then this can be used to match a library to a
+     mapped file, and thus to a build-id.  GDB can then use this
+     information to help locate the shared library objfile, if the objfile
+     is not in the expected place (as defined by the shared libraries file
+     name).  */
+
+  std::optional<CORE_ADDR> (*find_solib_addr) (solib &so);
 };
 
 /* A unique pointer to a so_list.  */
@@ -186,4 +203,9 @@  extern gdb_bfd_ref_ptr solib_bfd_fopen (const char *pathname, int fd);
 /* Find solib binary file and open it.  */
 extern gdb_bfd_ref_ptr solib_bfd_open (const char *in_pathname);
 
+/* A default implementation of the solib_ops::find_solib_addr callback.
+   This just returns an empty std::optional<CORE_ADDR> indicating GDB is
+   unable to find an address within the library SO.  */
+extern std::optional<CORE_ADDR> default_find_solib_addr (solib &so);
+
 #endif
diff --git a/gdb/testsuite/gdb.base/solib-search.exp b/gdb/testsuite/gdb.base/solib-search.exp
index 581046a1ed8..bbfe3ae719a 100644
--- a/gdb/testsuite/gdb.base/solib-search.exp
+++ b/gdb/testsuite/gdb.base/solib-search.exp
@@ -43,7 +43,11 @@  set right_binfile2_lib \
 set binfile1_lib [standard_output_file ${libname1}.so]
 set binfile2_lib [standard_output_file ${libname2}.so]
 
-set lib_flags [list debug ldflags=-Wl,-Bsymbolic]
+# When this test was written, GDB's ability to track down shared
+# libraries for a core file based on the build-id much poorer.  As GDB
+# has improved we now need to disable build-ids in order for this test
+# to function as expected.
+set lib_flags [list debug no-build-id ldflags=-Wl,-Bsymbolic]
 set wrong_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=1"
 set right_lib_flags "$lib_flags additional_flags=-DARRAY_SIZE=8192 additional_flags=-DRIGHT"
 
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
new file mode 100644
index 00000000000..30e9267dc01
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-1.c
@@ -0,0 +1,39 @@ 
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+
+/* It is important that these two variables have names of the same length
+   so that the debug information in the two library versions is laid out
+   the same.  If they differ then the .dynamic section might move, which
+   will trigger a different check within GDB than the one we actually want
+   to check.  */
+
+#if LIB_VERSION == 1
+volatile int *library_1_var = (volatile int *) 0x12345678;
+#elif LIB_VERSION == 2
+volatile int *library_2_var = (volatile int *) 0x11223344;
+#else
+# error Unknown library version
+#endif
+
+int
+foo (void)
+{
+  /* This should trigger a core dump.  */
+  abort ();
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
new file mode 100644
index 00000000000..a51d48e2f2e
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname-2.c
@@ -0,0 +1,41 @@ 
+/* Copyright 2024 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <dlfcn.h>
+#include <stddef.h>
+
+/* This is in the shared library.  */
+extern int foo (void);
+
+/* This is updated by the .exp file.  */
+char *libname = "libfoo_2.so";
+
+int
+main (void)
+{
+  void *handle;
+  int res, tmp;
+
+  handle = dlopen (libname, RTLD_LAZY);
+  assert (handle != NULL);
+
+  res = foo ();
+
+  tmp = dlclose (handle);
+  assert (tmp == 0);
+
+  return res;
+}
diff --git a/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
new file mode 100644
index 00000000000..dfc78923436
--- /dev/null
+++ b/gdb/testsuite/gdb.debuginfod/solib-with-soname.exp
@@ -0,0 +1,242 @@ 
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+# This test exercises GDB's ability to validate build-ids when loading
+# shared libraries for a core file.
+#
+# The test creates two "versions" of a shared library, sets up a
+# symlink to point to one version of the library, and creates a core file.
+#
+# We then try re-loading the core file and executable and check that
+# GDB is able to correctly load the shared library.  To confuse things
+# we retarget the library symlink at the other version of the library.
+#
+# After that we repeat the test, but this time deleting the symlink
+# completely.
+#
+# Then we remove the version of the library completely, at this point
+# we do expect GDB to give a warning about being unable to load the library.
+#
+# And finally, we setup debuginfod and have it serve the missing
+# library file, GDB should correctly download the library file.
+#
+# Despite this test living in the gdb.debuginfod/ directory, only the last
+# part of this test actually uses debuginfod, everything up to that point is
+# pretty generic.
+
+load_lib debuginfod-support.exp
+
+require allow_shlib_tests
+require {istarget "*-linux*"}
+
+standard_testfile -1.c -2.c
+
+# Build two similar, but slightly different versions of the shared
+# library.  Both libraries have DT_SONAME set to the generic
+# libfoo.so, we'll create a symlink with that name later.
+set library_1_filename [standard_output_file "libfoo_1.so"]
+set library_2_filename [standard_output_file "libfoo_2.so"]
+
+# The generic name for the library.
+set library_filename [standard_output_file "libfoo.so"]
+
+# When compiling a shared library the -Wl,-soname,NAME option is
+# automatically added based on the final name of the library.  We want
+# to compile libfoo_1.so, but set the soname to libfoo.so.  To achieve
+# this we first compile into libfoo.so, and then rename the library to
+# libfoo_1.so.
+if {[build_executable "build libfoo_1.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=1 }] == -1} {
+    return
+}
+remote_exec host "mv ${library_filename} ${library_1_filename}"
+
+# See the comment above, but this time we rename to libfoo_2.so.
+if {[build_executable "build libfoo_2.so" $library_filename \
+	 $srcfile \
+	      { debug shlib build-id \
+		    additional_flags=-DLIB_VERSION=2 }] == -1} {
+    return
+}
+remote_exec host "mv ${library_filename} ${library_2_filename}"
+
+# Create libfoo.so symlink to the libfoo_1.so library.  If this
+# symlink creation fails then we assume we can't create symlinks on
+# this host.  If this succeeds then later symlink creation is required
+# to succeed, and will trigger an FAIL if it doesn't.
+set status \
+    [remote_exec host \
+	 "ln -sf ${library_1_filename} ${library_filename}"]
+if {[lindex $status 0] != 0} {
+    unsupported "host does not support symbolic links"
+    return
+}
+
+# Build the executable.  This links against libfoo.so, which is
+# poining at libfoo_1.so.  Just to confuse things even more, this
+# executable uses dlopen to load libfoo_2.so.  Weird!
+if { [build_executable "build executable" ${binfile} ${srcfile2} \
+	  [list debug shlib=${library_filename} shlib_load]] == -1 } {
+    return
+}
+
+# Run BINFILE which will generate a corefile.
+set corefile [core_find $binfile]
+if {$corefile eq ""} {
+    untested "could not generate core file"
+    return
+}
+
+# Helper proc to load global BINFILE and then load global COREFILE.
+# If EXPECT_WARNING is true then we expect a warning about being
+# unable to load the shared library symbols, otherwise, EXPECT_WARNING
+# is false and we expect no warnings about loading shared library
+# symbols.
+proc load_exec_and_core_file { expect_warning testname } {
+    with_test_prefix $testname {
+	clean_restart $::binfile
+
+	set saw_warning false
+	gdb_test_multiple "core-file $::corefile" "load core file" {
+	    -re "^core-file \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^\\\[New \[^\r\n\]+\\\]\r\n" {
+		exp_continue
+	    }
+	    -re "^Core was generated by \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^Program terminated with signal \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^#0\\s+\[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^\\s*\r\n" {
+		exp_continue
+	    }
+	    -re "^warning: Can't open file \[^\r\n\]+ during file-backed mapping note processing" {
+		# Ignore warnings from the file backed mapping phase.
+		exp_continue
+	    }
+	    -re "^warning: Could not load shared library symbols for \[^\r\n\]+/libfoo\\.so\\.\r\nDo you need \"set solib-search-path\" or \"set sysroot\".\r\n" {
+		set saw_warning true
+		exp_continue
+	    }
+	    -re "^Downloading separate debug info for \[^\r\n\]+\r\n" {
+		exp_continue
+	    }
+	    -re "^Downloading executable for \[^\r\n\]+/libfoo_1\\.so\\.\\.\\." {
+		exp_continue
+	    }
+	    -re "^$::gdb_prompt $" {
+		gdb_assert { $saw_warning == $expect_warning } $gdb_test_name
+	    }
+	}
+
+	# If we don't expect a warning then debug symbols from the
+	# shared library should be available.  Confirm we can read a
+	# variable from the shared library.  If we do expect a warning
+	# then the shared library debug symbols have not loaded, and
+	# the library variable should not be available.
+	if { !$expect_warning } {
+	    gdb_test "print/x library_1_var" " = 0x12345678" \
+		"check library_1_var can be read"
+	} else {
+	    gdb_test "print/x library_1_var" \
+		"^No symbol \"library_1_var\" in current context\\." \
+		"check library_1_var cannot be read"
+	}
+    }
+}
+
+# Initial test, just load the executable and core file.  At this point
+# everything should load fine as everything is where we expect to find
+# it.
+load_exec_and_core_file false \
+    "load core file, all libraries as expected"
+
+# Update libfoo.so symlink to point at the second library then reload
+# the core file.  GDB should spot that the symlink points to the wrong
+# file, but should be able to figure out the correct file to load as
+# the right file will be in the mapped file list.
+set status [remote_exec host \
+		"ln -sf ${library_2_filename} ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "update library symlink to point to the wrong file"
+
+load_exec_and_core_file false \
+    "load core file, symlink points to wrong file"
+
+# Remove libfoo.so symlink and reload the core file.  As in the
+# previous test GDB should be able to figure out the correct file to
+# load as the correct file will still appear in the mapped file list.
+set status [remote_exec host "rm -f ${library_filename}"]
+gdb_assert { [lindex $status 0] == 0 } "remove library symlink"
+
+load_exec_and_core_file false \
+    "load core file, symlink removed"
+
+# Remove LIBRARY_1_FILENAME.  We'll now see a warning that the mapped
+# file can't be loaded (we ignore that warning), and we'll see a
+# warning that the shared library can't be loaded.
+set library_1_backup_filename ${library_1_filename}.backup
+set status \
+    [remote_exec host \
+	 "mv ${library_1_filename} ${library_1_backup_filename}"]
+gdb_assert { [lindex $status 0] == 0 } \
+    "remove libfoo_1.so"
+
+load_exec_and_core_file true \
+    "load core file, libfoo_1.so removed"
+
+# Setup a debuginfod server which can serve the original shared
+# library file.
+if {![allow_debuginfod_tests]} {
+    untested "skippig debuginfod parts of this test"
+    return
+}
+
+set server_dir [standard_output_file "debuginfod.server"]
+file mkdir $server_dir
+file rename -force $library_1_backup_filename $server_dir
+
+prepare_for_debuginfod cache db
+
+set url [start_debuginfod $db $server_dir]
+if { $url eq "" } {
+    unresolved "failed to start debuginfod server"
+    return
+}
+
+with_debuginfod_env $cache {
+    setenv DEBUGINFOD_URLS $url
+
+    save_vars { GDBFLAGS } {
+	append GDBFLAGS " -ex \"set debuginfod enabled on\""
+
+	# Reload the executable and core file.  GDB should download
+	# the file libfoo_1.so using debuginfod during the mapped file
+	# phase, but should then reuse that download during the shared
+	# library phase.
+	load_exec_and_core_file false \
+	    "load core file, use debuginfod"
+    }
+}
+
+stop_debuginfod
diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index 0d78691c381..c958ff18d2a 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -5211,6 +5211,7 @@  proc quote_for_host { args } {
 #     debug information
 #   - text_segment=addr: Tell the linker to place the text segment at ADDR.
 #   - build-id: Ensure the final binary includes a build-id.
+#   - no-build-id: Ensure the final binary does not include a build-id.
 #   - column-info/no-column-info: Enable/Disable generation of column table
 #     information.
 #
@@ -5316,6 +5317,18 @@  proc gdb_compile {source dest type options} {
 	lappend new_options "additional_flags=-Wl,--build-id"
     }
 
+    # If the 'no-build-id' option is used then disable the build-id.
+    if {[lsearch -exact $options no-build-id] > 0} {
+	lappend new_options "additional_flags=-Wl,--build-id=none"
+    }
+
+    # Sanity check.  If both 'build-id' and 'no-build-id' are used
+    # then what is expected from us!
+    if {[lsearch -exact $options build-id] > 0
+	&& [lsearch -exact $options no-build-id] > 0} {
+	error "cannot use build-id and no-build-id options"
+    }
+
     # Treating .c input files as C++ is deprecated in Clang, so
     # explicitly force C++ language.
     if { !$getting_compiler_info