ctf-reader: add support for looking up debug information in external paths

Message ID 20220507015434.1417130-1-guillermo.e.martinez@oracle.com
State New
Series ctf-reader: add support for looking up debug information in external paths

Commit Message

Guillermo E. Martinez May 7, 2022, 1:54 a.m. UTC
  Hello libabigail team,

This patch adds support for looking up debug information in external
path(s) in the CTF reader.

   * https://sourceware.org/pipermail/libabigail/2022q2/004340.html
   * https://sourceware.org/pipermail/libabigail/2022q2/004343.html
   * https://sourceware.org/pipermail/libabigail/2022q2/004344.html

Please let me know your comments, I'll really appreciate them.

Kind regards,
Guillermo

When a `stripped' ELF file is used to get CTF debug information, the
ELF symbols used by the CTF reader (`symtab_reader::symtab') are split
into a separate file. Even though CTF was designed to remain in the ELF
file after it is `stripped', the .ctf section can also be loaded from an
external .debug file; for instance, the `find-debuginfo' script used to
generate RPM debug packages splits debug information into .debug files.
The location of such files is passed as a standard argument from the
libabigail tools, and the name of the file is gathered from the
`.gnu_debuglink' section.
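
As a rough illustration of the last point, the sketch below pulls the
file name out of a `.gnu_debuglink' payload. It assumes the usual
separate-debug-file layout (a NUL-terminated file name, zero padding up
to a 4-byte boundary, then a 4-byte CRC32). The function name
`debuglink_file_name' is hypothetical and not part of this patch;
`buf' and `size' stand in for `Elf_Data::d_buf' and `Elf_Data::d_size'.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not part of the patch): extract the file name
// stored at the start of a `.gnu_debuglink' payload.
//
// Assumed layout: NUL-terminated file name, zero padding up to a 4-byte
// boundary, then a 4-byte CRC32 of the debug file.
//
// Returns true and sets `name' on success; leaves `name' untouched and
// returns false if the payload is empty or has no terminating NUL.
bool
debuglink_file_name(const void* buf, std::size_t size, std::string& name)
{
  if (buf == nullptr || size == 0)
    return false;

  // Bound the search by the section size, so that a malformed section
  // without a terminating NUL cannot make us read past the buffer.
  const char* start = static_cast<const char*>(buf);
  const void* nul = std::memchr(start, '\0', size);
  if (nul == nullptr)
    return false;

  const char* end = static_cast<const char*>(nul);
  if (end == start)
    return false; // empty file name

  name.assign(start, end);
  return true;
}
```

Bounding the search with the section size is a defensive choice for
this sketch; building a string directly from the buffer relies on the
section being well formed.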

	* include/abg-ctf-reader.h (ctf_reader::create_read_context):
	Add `debug_info_root_paths' argument.
	(ctf_reader::reset_read_context): Likewise.
	* src/abg-ctf-reader.cc: Add `read_context::elf_{handler,fd}_dbg'
	data members.
	(read_context::read_context): Add new `debug_info_root_paths'
	argument.
	(read_context::initialize): Likewise.
	(ctf_reader::create_read_context): Likewise.
	(ctf_reader::close_elf_handler): Release
	`read_context::elf_{handler,fd}_dbg' members.
	(ctf_reader::find_alt_debuginfo): Add new function.
	(ctf_reader::slurp_elf_info): Add new argument `status'. Use
	`find_alt_debuginfo' and `elf_helpers::find_section_by_name'
	to read the symtab and ctf information from an external .debug
	file.
	(ctf_reader::read_corpus): Verify `status' after `slurp_elf_info'.
	(ctf_reader::reset_read_context): Add new `debug_info_root_path'
	argument.
	* src/abg-elf-helpers.cc (elf_helpers::find_section_by_name): Add new
	function.
	* src/abg-elf-helpers.h: Likewise.
	* src/abg-tools-utils.cc (maybe_load_vmlinux_ctf_corpus):
	Adjust `ctf_reader::{create,reset}_read_context'.
	* tests/test-read-ctf.cc: Likewise.
	* tools/abidiff.cc (display_usage): Add `--ctf' command line
	option.
	(main): Adjust `ctf_reader::create_read_context'.
	* tools/abidw.cc (load_corpus_and_write_abixml): Adjust
	`ctf_reader::create_read_context'.
	* tools/abilint.cc: Likewise.
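
The lookup performed by `find_alt_debuginfo' can be sketched roughly as
follows: walk the debug-info roots in order and stop at the first one
that contains the file named by `.gnu_debuglink'. This is only an
illustration under stated assumptions: `dir_lookup' is a hypothetical
stand-in for `tools_utils::find_file_under_dir', and
`find_under_roots' is not a function of this patch.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for tools_utils::find_file_under_dir: returns
// true and fills `result' when `name' exists somewhere under `root'.
typedef std::function<bool(const char* root,
                           const std::string& name,
                           std::string& result)> dir_lookup;

// Walk the debug-info roots in order and stop at the first one that
// contains `name'.  Each root is a char** (a pointer to a char* owned
// elsewhere), which is how the patch passes the roots around.
bool
find_under_roots(const std::vector<char**>& roots,
                 const std::string& name,
                 const dir_lookup& lookup,
                 std::string& found_path)
{
  if (name.empty())
    return false; // no .gnu_debuglink name to look for

  for (std::vector<char**>::const_iterator i = roots.begin();
       i != roots.end();
       ++i)
    {
      if (*i == nullptr || **i == nullptr)
        continue; // skip roots that were never set

      if (lookup(**i, name, found_path))
        return true;
    }
  return false;
}
```

When this returns false, a caller would fall back to the symbol table
and `.ctf' section of the binary itself, as the patch does.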

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h |   2 +
 src/abg-ctf-reader.cc    | 176 ++++++++++++++++++++++++++++++---------
 src/abg-elf-helpers.cc   |   2 +
 src/abg-tools-utils.cc   |  11 ++-
 tests/test-read-ctf.cc   |   3 +
 tools/abidiff.cc         |   6 +-
 tools/abidw.cc           |   6 +-
 tools/abilint.cc         |   7 +-
 8 files changed, 165 insertions(+), 48 deletions(-)


base-commit: c96463e1ad974b7c4561886d7a3aa8a3c9a35607
prerequisite-patch-id: 781b026536589341e1e4378d9529fe258633bb53
prerequisite-patch-id: e6191c510bc90e225bd0858d333e2e01b6e52a62
prerequisite-patch-id: b87e0b761bf3f909eb84147b3b13e2a338de9509
  

Comments

Dodji Seketeli May 13, 2022, 12:39 p.m. UTC | #1
"Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org>
wrote:

> Hello libabigail team,
>
> This patch adds support for looking up debug information in external
> path(s) in the CTF reader.
>
>    * https://sourceware.org/pipermail/libabigail/2022q2/004340.html
>    * https://sourceware.org/pipermail/libabigail/2022q2/004343.html
>    * https://sourceware.org/pipermail/libabigail/2022q2/004344.html
>
> Please let me know your comments, I'll really appreciate them.

Sorry, the patch doesn't seem to apply for me, even though all the
patches above seem to have been applied to master.

Am I doing something wrong?

Cheers,
  

Patch

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index ba7289aa..60f3623d 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -31,6 +31,7 @@  typedef shared_ptr<read_context> read_context_sptr;
 
 read_context_sptr
 create_read_context(const std::string& elf_path,
+                    const vector<char**>& debug_info_root_paths,
                     ir::environment *env);
 corpus_sptr
 read_corpus(read_context *ctxt, elf_reader::status& status);
@@ -47,6 +48,7 @@  set_read_context_corpus_group(read_context& ctxt, corpus_group_sptr& group);
 void
 reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
+                   const vector<char**>& debug_info_root_path,
                    ir::environment*	environment);
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index 3f8c3d03..9512d47d 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -70,6 +70,10 @@  public:
   Elf *elf_handler;
   int elf_fd;
 
+  /// libelf handler for the external .debug file from which the symtab
+  /// and the CTF data may be read, and its corresponding file descriptor.
+  Elf *elf_handler_dbg;
+  int elf_fd_dbg;
 
   /// The symtab read from the ELF file.
   symtab_reader::symtab_sptr symtab;
@@ -83,6 +87,8 @@  public:
   corpus_sptr			cur_corpus_;
   corpus_group_sptr		cur_corpus_group_;
   corpus::exported_decls_builder* exported_decls_builder_;
+  // The set of directories under which to look for debug info.
+  vector<char**>		debug_info_root_paths_;
 
   /// Setter of the exported decls builder object.
   ///
@@ -240,15 +246,20 @@  public:
   ///
   /// @param elf_path the path to the ELF file.
   ///
-  /// @param environment the environment used by the current context.
+  /// @param debug_info_root_paths the paths of the directories
+  /// under which to look for the .debug file.
+  ///
+  /// @param env the environment used by the current context.
   /// This environment contains resources needed by the reader and by
   /// the types and declarations that are to be created later.  Note
   /// that ABI artifacts that are to be compared all need to be
   /// created within the same environment.
-  read_context(const string& elf_path, ir::environment *env) :
-    ctfa(NULL)
+  read_context(const string& elf_path,
+               const vector<char**>& debug_info_root_paths,
+               ir::environment *env) :
+   ctfa(NULL)
   {
-    initialize(elf_path, env);
+    initialize(elf_path, debug_info_root_paths, env);
   }
 
   /// Initializer of read_context.
@@ -256,6 +267,9 @@  public:
   /// @param elf_path the path to the elf file the context is to be
   /// used for.
   ///
+  /// @param debug_info_root_paths the paths of the directories
+  /// under which to look for the .debug file.
+  ///
   /// @param environment the environment used by the current context.
   /// This environment contains resources needed by the reader and by
   /// the types and declarations that are to be created later.  Note
@@ -266,16 +280,21 @@  public:
   /// must be greater than the life time of the resulting @ref
   /// read_context the context uses resources that are allocated in
   /// the environment.
-  void initialize(const string& elf_path, ir::environment *env)
+  void initialize(const string& elf_path,
+                  const vector<char**>& debug_info_root_paths,
+                  ir::environment *env)
   {
     types_map.clear();
     filename = elf_path;
     ir_env = env;
     elf_handler = NULL;
+    elf_handler_dbg = NULL;
     elf_fd = -1;
+    elf_fd_dbg = -1;
     symtab.reset();
     cur_corpus_group_.reset();
     exported_decls_builder_ = 0;
+    debug_info_root_paths_ = debug_info_root_paths;
   }
 
   ~read_context()
@@ -1319,6 +1338,10 @@  close_elf_handler (read_context *ctxt)
   /* Finish the ELF handler and close the associated file.  */
   elf_end(ctxt->elf_handler);
   close(ctxt->elf_fd);
+
+  /* Finish the ELF handler and close the associated debug file.  */
+  elf_end(ctxt->elf_handler_dbg);
+  close(ctxt->elf_fd_dbg);
 }
 
 /// Fill a CTF section description with the information in a given ELF
@@ -1345,38 +1368,114 @@  fill_ctf_section(Elf_Scn *elf_section, ctf_sect_t *ctf_section)
   ctf_section->cts_entsize = section_header->sh_entsize;
 }
 
+/// Find the CTF section and the debug symbols in the external .debug
+/// file named by the .gnu_debuglink section of a given ELF file.
+///
+/// @param ctxt the read context.
+/// @param ctf_dbg_scn the CTF section found in the .debug file, if any.
+static void
+find_alt_debuginfo(read_context *ctxt, Elf_Scn **ctf_dbg_scn)
+{
+  std::string name;
+  Elf_Data *data;
+
+  Elf_Scn *section = elf_helpers::find_section
+    (ctxt->elf_handler, ".gnu_debuglink", SHT_PROGBITS);
+
+  if (section
+      && (data = elf_getdata(section, NULL))
+      && data->d_size != 0)
+    name = (char *) data->d_buf;
+
+  int fd = -1;
+  Elf *hdlr = NULL;
+  *ctf_dbg_scn = NULL;
+
+  if (!name.empty())
+    for (vector<char**>::const_iterator i = ctxt->debug_info_root_paths_.begin();
+         i != ctxt->debug_info_root_paths_.end();
+         ++i)
+      {
+        std::string file_path;
+        if (!tools_utils::find_file_under_dir(**i, name, file_path))
+          continue;
+
+        if ((fd = open(file_path.c_str(), O_RDONLY)) == -1)
+          continue;
+
+        if ((hdlr = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
+          {
+            close(fd);
+            continue;
+          }
+
+        ctxt->symtab =
+          symtab_reader::symtab::load(hdlr, ctxt->ir_env, nullptr);
+
+        // Unlikely to be found: .ctf is designed to stay in the stripped file
+        *ctf_dbg_scn =
+          elf_helpers::find_section(hdlr, ".ctf", SHT_PROGBITS);
+        break;
+
+        elf_end(hdlr);
+        close(fd);
+      }
+
+  // If we don't have a symbol table, use the one in the current ELF file
+  if (!ctxt->symtab)
+    ctxt->symtab =
+     symtab_reader::symtab::load(ctxt->elf_handler, ctxt->ir_env, nullptr);
+
+  ctxt->elf_handler_dbg = hdlr;
+  ctxt->elf_fd_dbg = fd;
+}
+
 /// Slurp certain information from the ELF file described by a given
 /// read context and install it in a libabigail corpus.
 ///
 /// @param ctxt the read context
 /// @param corp the libabigail corpus in which to install the info.
-///
-/// @return 0 if there is an error.
-/// @return 1 otherwise.
+/// @param status the resulting status flags.
 
-static int
-slurp_elf_info(read_context *ctxt, corpus_sptr corp)
+static void
+slurp_elf_info(read_context *ctxt,
+               corpus_sptr corp,
+               elf_reader::status& status)
 {
+  /* Set the ELF architecture.  */
+  GElf_Ehdr *ehdr, eh_mem;
   Elf_Scn *symtab_scn;
-  Elf_Scn *ctf_scn;
+  Elf_Scn *ctf_scn, *ctf_dbg_scn;
   Elf_Scn *strtab_scn;
-  GElf_Ehdr eh_mem;
-  GElf_Ehdr *ehdr = gelf_getehdr(ctxt->elf_handler, &eh_mem);
 
-  /* Set the ELF architecture.  */
+  if (!(ehdr = gelf_getehdr(ctxt->elf_handler, &eh_mem)))
+      return;
+
   corp->set_architecture_name(elf_helpers::e_machine_to_string(ehdr->e_machine));
 
-  /* Read the symtab from the ELF file and set it in the corpus.  */
-  ctxt->symtab =
-    symtab_reader::symtab::load(ctxt->elf_handler, ctxt->ir_env,
-                                0 /* No suppressions.  */);
+  find_alt_debuginfo(ctxt, &ctf_dbg_scn);
+  ABG_ASSERT(ctxt->symtab);
   corp->set_symtab(ctxt->symtab);
 
   if (corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN)
-    return 1;
+    {
+      status |= elf_reader::STATUS_OK;
+      return;
+    }
 
   /* Get the raw ELF section contents for libctf.  */
-  ctf_scn = elf_helpers::find_section(ctxt->elf_handler, ".ctf", SHT_PROGBITS);
+  const char *ctf_name = ".ctf";
+  ctf_scn = elf_helpers::find_section_by_name(ctxt->elf_handler, ctf_name);
+  if (ctf_scn == NULL)
+    {
+      if (ctf_dbg_scn)
+        ctf_scn = ctf_dbg_scn;
+      else
+        {
+          status |= elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
+          return;
+        }
+    }
 
   // ET_{EXEC,DYN} needs .dyn{sym,str} in ctf_arc_bufopen
   const char *symtab_name = ".dynsym";
@@ -1390,15 +1489,17 @@  slurp_elf_info(read_context *ctxt, corpus_sptr corp)
 
   symtab_scn = elf_helpers::find_section_by_name(ctxt->elf_handler, symtab_name);
   strtab_scn = elf_helpers::find_section_by_name(ctxt->elf_handler, strtab_name);
-
-  if (ctf_scn == NULL || symtab_scn == NULL || strtab_scn == NULL)
-    return 0;
+  if (symtab_scn == NULL || strtab_scn == NULL)
+    {
+      status |= elf_reader::STATUS_NO_SYMBOLS_FOUND;
+      return;
+    }
 
   fill_ctf_section(ctf_scn, &ctxt->ctf_sect);
   fill_ctf_section(symtab_scn, &ctxt->symtab_sect);
   fill_ctf_section(strtab_scn, &ctxt->strtab_sect);
 
-  return 1;
+  status |= elf_reader::STATUS_OK;
 }
 
 /// Create and return a new read context to process CTF information
@@ -1409,9 +1510,12 @@  slurp_elf_info(read_context *ctxt, corpus_sptr corp)
 
 read_context_sptr
 create_read_context(const std::string& elf_path,
+                    const vector<char**>& debug_info_root_paths,
                     ir::environment *env)
 {
-  read_context_sptr result(new read_context(elf_path, env));
+  read_context_sptr result(new read_context(elf_path,
+                                            debug_info_root_paths,
+                                            env));
   return result;
 }
 
@@ -1430,17 +1534,12 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
 {
   corpus_sptr corp
     = std::make_shared<corpus>(ctxt->ir_env, ctxt->filename);
-
   ctxt->cur_corpus_ = corp;
-  /* Be optimist.  */
-  status = elf_reader::STATUS_OK;
+  status = elf_reader::STATUS_UNKNOWN;
 
   /* Open the ELF file.  */
   if (!open_elf_handler(ctxt))
-    {
-      status = elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
       return corp;
-    }
 
   bool is_linux_kernel = elf_helpers::is_linux_kernel(ctxt->elf_handler);
   uint32_t origin = corpus::CTF_ORIGIN;
@@ -1452,15 +1551,15 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
   if (ctxt->cur_corpus_group_)
     ctxt->cur_corpus_group_->add_corpus(ctxt->cur_corpus_);
 
-  if (!slurp_elf_info(ctxt, corp) && !is_linux_kernel)
-    {
-      status = elf_reader::STATUS_NO_SYMBOLS_FOUND;
+  slurp_elf_info(ctxt, corp, status);
+  if (!is_linux_kernel
+      && ((status & elf_reader::STATUS_DEBUG_INFO_NOT_FOUND) |
+          (status & elf_reader::STATUS_NO_SYMBOLS_FOUND)))
       return corp;
-    }
 
   // Set the set of exported declaration that are defined.
   ctxt->exported_decls_builder
-   (ctxt->cur_corpus_->get_exported_decls_builder().get());
+    (ctxt->cur_corpus_->get_exported_decls_builder().get());
 
   int errp;
   if (corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN)
@@ -1483,7 +1582,7 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
 
   ctxt->ir_env->canonicalization_is_done(false);
   if (ctxt->ctfa == NULL)
-    status = elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
+    status |= elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
   else
     {
       process_ctf_archive(ctxt, corp);
@@ -1570,10 +1669,11 @@  read_and_add_corpus_to_group_from_elf(read_context* ctxt,
 void
 reset_read_context(read_context_sptr	&ctxt,
                    const std::string&	 elf_path,
+                   const vector<char**>& debug_info_root_path,
                    ir::environment*	 environment)
 {
   if (ctxt)
-    ctxt->initialize(elf_path, environment);
+    ctxt->initialize(elf_path, debug_info_root_path, environment);
 }
 
 /// Returns a key to be use in types_map dict conformed by
diff --git a/src/abg-elf-helpers.cc b/src/abg-elf-helpers.cc
index a1fd4e6c..c41e339e 100644
--- a/src/abg-elf-helpers.cc
+++ b/src/abg-elf-helpers.cc
@@ -300,6 +300,8 @@  e_machine_to_string(GElf_Half e_machine)
 ///
 /// @param elf_handle the elf handle to use.
 ///
+/// @param name the section name.
+///
 /// @return the section found, nor nil if none was found.
 Elf_Scn*
 find_section_by_name(Elf* elf_handle, const std::string& name)
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index f30c3f1d..7ea7ecb7 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2633,6 +2633,9 @@  maybe_load_vmlinux_dwarf_corpus(corpus::origin      origin,
 /// @param root the path of the directory under which the kernel
 /// kernel modules were found.
 ///
+/// @param di_roots the absolute paths of the directories under which
+/// debug info is to be found for the binaries under directory @p root.
+///
 /// @param verbose true if the function has to emit some verbose
 /// messages.
 ///
@@ -2646,6 +2649,7 @@  maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
                               const string&       vmlinux,
                               vector<string>&     modules,
                               const string&       root,
+                              vector<char**>&     di_roots,
                               bool                verbose,
                               timer&              t,
                               environment_sptr&   env)
@@ -2655,7 +2659,7 @@  maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
 
   abigail::elf_reader::status status = abigail::elf_reader::STATUS_OK;
   ctf_reader::read_context_sptr ctxt;
-  ctxt = ctf_reader::create_read_context(vmlinux, env.get());
+  ctxt = ctf_reader::create_read_context(vmlinux, di_roots, env.get());
   group.reset(new corpus_group(env.get(), root));
   set_read_context_corpus_group(*ctxt, group);
 
@@ -2690,7 +2694,7 @@  maybe_load_vmlinux_ctf_corpus(corpus::origin	  origin,
          << "/" << total_nb_modules
          << ") ... " << std::flush;
 
-      reset_read_context(ctxt, *m, env.get());
+      reset_read_context(ctxt, *m, di_roots, env.get());
       set_read_context_corpus_group(*ctxt, group);
 
       t.start();
@@ -2788,7 +2792,8 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
                                       supprs, verbose, t, env);
 #ifdef WITH_CTF
       maybe_load_vmlinux_ctf_corpus(origin, group, vmlinux,
-                                    modules, root, verbose, t, env);
+                                    modules, root, di_roots,
+                                    verbose, t, env);
 #endif
     }
 
diff --git a/tests/test-read-ctf.cc b/tests/test-read-ctf.cc
index 215ed8d6..1f31e90c 100644
--- a/tests/test-read-ctf.cc
+++ b/tests/test-read-ctf.cc
@@ -22,6 +22,7 @@ 
 
 using std::string;
 using std::cerr;
+using std::vector;
 
 using abigail::tests::read_common::InOutSpec;
 using abigail::tests::read_common::test_task;
@@ -346,9 +347,11 @@  test_task_ctf::perform()
   env.reset(new abigail::ir::environment);
   abigail::elf_reader::status status =
     abigail::elf_reader::STATUS_UNKNOWN;
+  vector<char**> di_roots;
   ABG_ASSERT(abigail::tools_utils::file_exists(in_elf_path));
 
   read_context_sptr ctxt = create_read_context(in_elf_path,
+                                               di_roots,
                                                env.get());
   ABG_ASSERT(ctxt);
 
diff --git a/tools/abidiff.cc b/tools/abidiff.cc
index adec742a..6e9b9437 100644
--- a/tools/abidiff.cc
+++ b/tools/abidiff.cc
@@ -244,7 +244,9 @@  display_usage(const string& prog_name, ostream& out)
     << " --dump-diff-tree  emit a debug dump of the internal diff tree to "
     "the error output stream\n"
     <<  " --stats  show statistics about various internal stuff\n"
-    << "  --ctf use CTF instead of DWARF in ELF files\n"
+#ifdef WITH_CTF
+    << " --ctf use CTF instead of DWARF in ELF files\n"
+#endif
 #ifdef WITH_DEBUG_SELF_COMPARISON
     << " --debug debug the process of comparing an ABI corpus against itself"
 #endif
@@ -1177,6 +1179,7 @@  main(int argc, char* argv[])
               {
                 abigail::ctf_reader::read_context_sptr ctxt
                   = abigail::ctf_reader::create_read_context(opts.file1,
+                                                             opts.prepared_di_root_paths1,
                                                              env.get());
                 ABG_ASSERT(ctxt);
                 c1 = abigail::ctf_reader::read_corpus(ctxt.get(),
@@ -1260,6 +1263,7 @@  main(int argc, char* argv[])
               {
                 abigail::ctf_reader::read_context_sptr ctxt
                   = abigail::ctf_reader::create_read_context(opts.file2,
+                                                             opts.prepared_di_root_paths2,
                                                              env.get());
                 ABG_ASSERT(ctxt);
                 c2 = abigail::ctf_reader::read_corpus(ctxt.get(),
diff --git a/tools/abidw.cc b/tools/abidw.cc
index 32d055f5..1ca642bb 100644
--- a/tools/abidw.cc
+++ b/tools/abidw.cc
@@ -540,9 +540,9 @@  load_corpus_and_write_abixml(char* argv[],
   if (opts.use_ctf)
     {
       abigail::ctf_reader::read_context_sptr ctxt
-        = abigail::ctf_reader::create_read_context (opts.in_file_path,
-                                                    env.get());
-
+        = abigail::ctf_reader::create_read_context(opts.in_file_path,
+                                                   opts.prepared_di_root_paths,
+                                                   env.get());
       assert (ctxt);
       t.start();
       corp = abigail::ctf_reader::read_corpus (ctxt, s);
diff --git a/tools/abilint.cc b/tools/abilint.cc
index 4883b557..b45cad19 100644
--- a/tools/abilint.cc
+++ b/tools/abilint.cc
@@ -782,9 +782,10 @@  main(int argc, char* argv[])
 #ifdef WITH_CTF
             if (opts.use_ctf)
               {
-                abigail::ctf_reader::read_context_sptr ctxt
-                  = abigail::ctf_reader::create_read_context(opts.file_path,
-                                                             env.get());
+                abigail::ctf_reader::read_context_sptr ctxt =
+                  abigail::ctf_reader::create_read_context(opts.file_path,
+                                                           di_roots,
+                                                           env.get());
                 ABG_ASSERT(ctxt);
                 corp = abigail::ctf_reader::read_corpus(ctxt.get(), s);
               }
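
The status handling that this patch introduces in `slurp_elf_info` and
`read_corpus` treats `elf_reader::status` as a set of bit flags that
are OR-ed together. The sketch below shows one way to test for several
failure flags at once. The enum is a hypothetical model for this
example only: the flag names mirror the ones used in the patch, but the
numeric values and the helpers `has_any` and `reading_failed` are
assumptions.

```cpp
// Hypothetical bit flags modelled on elf_reader::status.  The numeric
// values are assumptions made for this sketch.
enum status : unsigned
{
  STATUS_UNKNOWN = 0,
  STATUS_OK = 1,
  STATUS_DEBUG_INFO_NOT_FOUND = 1 << 1,
  STATUS_NO_SYMBOLS_FOUND = 1 << 3
};

inline status
operator|(status l, status r)
{
  return static_cast<status>(static_cast<unsigned>(l)
                             | static_cast<unsigned>(r));
}

inline status&
operator|=(status& l, status r)
{
  l = l | r;
  return l;
}

// True if at least one bit of `mask' is set in `s'.
inline bool
has_any(status s, status mask)
{
  return (static_cast<unsigned>(s) & static_cast<unsigned>(mask)) != 0;
}

// True if either failure flag is set; equivalent to testing
// (s & DEBUG_INFO_NOT_FOUND) | (s & NO_SYMBOLS_FOUND) for non-zero.
inline bool
reading_failed(status s)
{
  return has_any(s, STATUS_DEBUG_INFO_NOT_FOUND | STATUS_NO_SYMBOLS_FOUND);
}
```

Combining the mask first and testing once reads a little more directly
than OR-ing two separate tests, and the result is the same.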