[v2] ctf-reader: Improves performance to read CTF information from Linux kernel

Message ID 20220427221656.2216932-1-guillermo.e.martinez@oracle.com
State New
Headers
Series [v2] ctf-reader: Improves performance to read CTF information from Linux kernel |

Commit Message

Guillermo E. Martinez April 27, 2022, 10:16 p.m. UTC
  Hello libabigail team,

This is the v2 patch to improves the performance to extract  CTF
information from Linux kernel, major changes respect to v1 are:

   * Initialize `read_context::ctf_ctfa' per group, since it contains
   the debug information for vmlinux and kernel modules.
   * Get rid of utilities functions to locate, set and get the vmlinux
   archive pointer while corpus group is processed.

Feedback will be welcomed and appreciated as ever!,

Kind Regards,
Guillermo

The _same_ ctf archive is being open/close several times by vmlinux corpus
and for each kernel module belongs to a group, it's hight time consuming
during ctf extraction info from Linux kernel.  So, this patch improves the
performance up to 300% (from ~2m:50s to ~50s), keep open per _group_
`ctf_archive_t' pointer from vmlinux (the main_corpus_from_current_group),
since this ctf archive file contains the information for kernel modules
as well, then it's reused by each module to be processed in the group,
invoking to `reset_read_context' ctf reader function.  Doing so,
`read_context::ctfa` member should be updated iff corpus origin is
`CTF_ORIGIN', otherwise it must contain valid information from the main
element in the corpus group, `ctf_close' must be called after processing
all group's members, that is in `reader_context' destructor.

	* include/abg-ctf-reader.h (ctf_reader::set_vmlinux_ctfa_path):
	Remove declaration.
	* src/abg-ctf-reader.cc (read_context::read_context): Use member
	initialization for `ctfa' member. Remove `vmlinux_ctfa_path_'.
	(read_context::~read_context): Add destructor and delete `ctfa'
	member.
	(ctf_reader::set_vmlinux_ctfa_path): Remove definition.
	(ctf_reader::read_corpus): Adjust `read_context::ctfa' if corpus
	origin is `LINUX_KERNEL_BINARY_ORIGIN', looking for
	'vmlinux.ctfa'. Invoke `process_ctf_archive', `sort_{functions,
	variables}' if ctf debug info was found.
	* src/abg-tools-utils.cc: (get_vmlinux_ctfa_path_from_kernel_dist):
	Remove definition.

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h |  3 ---
 src/abg-ctf-reader.cc    | 48 +++++++++++++++++++---------------------
 src/abg-tools-utils.cc   | 40 ++-------------------------------
 3 files changed, 25 insertions(+), 66 deletions(-)
  

Patch

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index 827d1bc2..ba7289aa 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -48,9 +48,6 @@  void
 reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
                    ir::environment*	environment);
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename);
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
 } // end namespace ctf_reader
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index 2944be07..cc79520d 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -58,10 +58,6 @@  public:
   /// be read from the file then this is NULL.
   ctf_archive_t *ctfa;
 
-  /// The name of the vmlinux file from which the CTF archive got
-  /// extracted.
-  string vmlinux_ctfa_path_;
-
   /// A map associating CTF type ids with libabigail IR types.  This
   /// is used to reuse already generated types.
   std::map<std::string,type_base_sptr> types_map;
@@ -199,7 +195,9 @@  public:
   /// Constructor.
   ///
   /// @param elf_path the path to the ELF file.
-  read_context(const string& elf_path, ir::environment *env)
+  read_context(const string& elf_path, ir::environment *env) :
+    // ctfa data member can be used per courpus group
+    ctfa(NULL)
   {
     initialize(elf_path, env);
   }
@@ -212,11 +210,14 @@  public:
     elf_handler = NULL;
     elf_fd = -1;
     is_elf_exec = false;
-    ctfa = NULL;
-    vmlinux_ctfa_path_ = "";
     symtab.reset();
     cur_corpus_group_.reset();
   }
+
+  ~read_context()
+  {
+    ctf_close(ctfa);
+  }
 }; // end class read_context.
 
 /// Forward reference, needed because several of the process_ctf_*
@@ -1333,7 +1334,15 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
 
   int errp;
   if (corp->get_origin() == corpus::LINUX_KERNEL_BINARY_ORIGIN)
-     ctxt->ctfa = ctf_arc_open(ctxt->vmlinux_ctfa_path_.c_str(), &errp);
+    {
+      std::string filename;
+      if (tools_utils::base_name(ctxt->filename, filename)
+          && filename == "vmlinux")
+        {
+          std::string vmlinux_ctfa_path = ctxt->filename + ".ctfa";
+          ctxt->ctfa = ctf_arc_open(vmlinux_ctfa_path.c_str(), &errp);
+        }
+    }
   else
     /* Build the ctfa from the contents of the relevant ELF sections,
        and process the CTF archive in the read context, if any.
@@ -1346,14 +1355,15 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
   if (ctxt->ctfa == NULL)
     status = elf_reader::STATUS_DEBUG_INFO_NOT_FOUND;
   else
-    process_ctf_archive(ctxt, corp);
+    {
+      process_ctf_archive(ctxt, corp);
+      ctxt->cur_corpus_->sort_functions();
+      ctxt->cur_corpus_->sort_variables();
+    }
 
   ctxt->ir_env->canonicalization_is_done(true);
-  ctxt->cur_corpus_->sort_functions();
-  ctxt->cur_corpus_->sort_variables();
 
   /* Cleanup and return.  */
-  ctf_close(ctxt->ctfa);
   close_elf_handler(ctxt);
   return corp;
 }
@@ -1383,7 +1393,7 @@  set_read_context_corpus_group(read_context& ctxt,
 {
   ctxt.cur_corpus_group_ = group;
 }
-//
+
 /// Read a corpus and add it to a given @ref corpus_group.
 ///
 /// @param ctxt the reading context to consider.
@@ -1436,18 +1446,6 @@  reset_read_context(read_context_sptr	&ctxt,
     ctxt->initialize(elf_path, environment);
 }
 
-/// Set the @ref filename being assigned to the current read context.
-///
-/// @param ctxt the read_context to consider.
-///
-/// @param filename the @ref vmlinux CTFA filename to set.
-void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename)
-{
-  ctxt.vmlinux_ctfa_path_ = filename;
-}
-
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type)
 {
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index b73786a8..47a33022 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2470,31 +2470,6 @@  get_vmlinux_path_from_kernel_dist(const string&	from,
   return found;
 }
 
-/// Get the paths of the CTF vmlinux archive under given directory.
-///
-/// @param from the directory under which to look for.
-///
-/// @param vmlinux_path output parameter.  The path of the CTF vmlinux
-/// binary that was found.
-///
-/// @return true if at least the path to the vmlinux.ctfa binary was found.
-#ifdef WITH_CTF
-bool
-get_vmlinux_ctfa_path_from_kernel_dist(const string& from,
-                                       string& vmlinux_ctfa_path)
-{
-  if (!dir_exists(from))
-    return false;
-
-  string dist_root = from;
-  bool found = false;
-  if (find_vmlinux_path(dist_root, vmlinux_ctfa_path, "vmlinux.ctfa"))
-    found = true;
-
-  return found;
-}
-#endif
-
 /// Get the paths of the vmlinux and kernel module binaries under
 /// given directory.
 ///
@@ -2576,15 +2551,6 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
   t.start();
   bool got_binary_paths =
     get_binary_paths_from_kernel_dist(root, debug_info_root, vmlinux, modules);
-#ifdef WITH_CTF
-  string vmlinux_ctfa;
-  if (got_binary_paths &&
-      env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
-    {
-      got_binary_paths = get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa);
-      ABG_ASSERT(!vmlinux_ctfa.empty());
-    }
-#endif
 
   t.stop();
 
@@ -2684,12 +2650,11 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
       else if (env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
         {
           ctf_reader::read_context_sptr ctxt;
+
           if (!vmlinux.empty())
             {
               ctxt =
-               ctf_reader::create_read_context(vmlinux, env.get());
-              set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
-
+                ctf_reader::create_read_context(vmlinux, env.get());
               group.reset(new corpus_group(env.get(), root));
               set_read_context_corpus_group(*ctxt, group);
 
@@ -2725,7 +2690,6 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
                       << ") ... " << std::flush;
 
                    reset_read_context(ctxt, *m, env.get());
-                   set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
                    set_read_context_corpus_group(*ctxt, group);
 
                    t.start();