ctf-reader: Improves performance to read CTF information from Linux kernel

Message ID 20220405160833.1179590-1-guillermo.e.martinez@oracle.com
State New
Headers
Series ctf-reader: Improves performance to read CTF information from Linux kernel |

Commit Message

Guillermo E. Martinez April 5, 2022, 4:08 p.m. UTC
  Hello libabigail team,

This patch is meant to improves the performance to extract  CTF
information from Linux kernel, it depends of:

https://sourceware.org/pipermail/libabigail/2022q1/004262.html

Feedback will be welcomed and appreciated as ever!,

Kind regards,
Guillermo

The _same_ ctf archive is being open/close several times by vmlinux corpus
and for each kernel module belongs to a group, it's hight time consuming
during ctf extraction info from Linux kernel.  So, this patch improves the
performance up to 300% (from ~2m:50s to ~50s), storing a cached
`ctf_archive_t' pointer from vmlinux (the main_corpus_from_current_group)
using the new `open_vmlinux_ctf_archive' function, and since this ctf
archive file contains the information for kernel modules as well, then
it's reused by each module to be processed in the group by ctf reader
using `set_vmlinux_ctf_archive' function.  Doing so,
`read_context::ctfa` member should be updated if corpus origin is
`CTF_ORIGIN', otherwise it must contain a valid address, and `ctf_close'
must be called after processing all group's members, that is in
`reader_context' destructor.

	* include/abg-ctf-reader.h (ctf_reader::{open_vmlinux_ctf_archive,
	vmlinux_ctf_archive,set_vmlinux_ctf_archive}): Add new functions.
	* src/abg-ctf-reader.cc (read_context::~read_context): Add
	destructor.
	(ctf_reader::read_corpus): Adjust `ctf archive' if corpus
	origin is `CTF_ORIGIN'.
	* src/abg-tools-utils.cc (get_binary_paths_from_kernel_dist):
	Use new `open_vmlinux_ctf_archive' and `set_vmlinux_ctf_archive'
	to store a cached `ctf_archive_t' type gather from vmlinux
	and be reused by the rest of kernel modules.

Signed-off-by: Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
---
 include/abg-ctf-reader.h | 11 ++++++--
 src/abg-ctf-reader.cc    | 56 +++++++++++++++++++++++++++++-----------
 src/abg-tools-utils.cc   | 14 ++++++----
 3 files changed, 59 insertions(+), 22 deletions(-)
  

Comments

Guillermo E. Martinez April 25, 2022, 6:33 p.m. UTC | #1
On Tuesday, April 5, 2022 11:08:33 AM CDT Guillermo E. Martinez wrote:
Hello libabigail team,

Any comment about this patch?
 
> This patch is meant to improves the performance to extract  CTF
> information from Linux kernel, it depends of:
> 
> https://sourceware.org/pipermail/libabigail/2022q1/004262.html
> 
> Feedback will be welcomed and appreciated as ever!,
> [...]
Kind regards,
Guillermo
  
Dodji Seketeli April 29, 2022, 11:11 a.m. UTC | #2
Hello,

"Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org> a
écrit:

> Hello libabigail team,
>
> This patch is meant to improves the performance to extract  CTF
> information from Linux kernel, it depends of:
>
> https://sourceware.org/pipermail/libabigail/2022q1/004262.html
>
> Feedback will be welcomed and appreciated as ever!,

I am sorry, but the patch doesn't apply to the current master branch
anymore :-(

Would you please refresh it on your tree and re-send it?

Thanks.
  
Guillermo E. Martinez April 29, 2022, 1:58 p.m. UTC | #3
On Friday, April 29, 2022 6:11:29 AM CDT Dodji Seketeli wrote:
> Hello,
Hello Dodji,

> "Guillermo E. Martinez via Libabigail" <libabigail@sourceware.org> a
> écrit:
> 
> > Hello libabigail team,
> >
> > This patch is meant to improves the performance to extract  CTF
> > information from Linux kernel, it depends of:
> >
> > https://sourceware.org/pipermail/libabigail/2022q1/004262.html
> >
> > Feedback will be welcomed and appreciated as ever!,
> 
> I am sorry, but the patch doesn't apply to the current master branch
> anymore :-(
Sorry, hmmm my fault, I forget mention that patch:
    https://sourceware.org/pipermail/libabigail/2022q1/004262.html
depends on this:
   https://sourceware.org/pipermail/libabigail/2022q2/004279.html 
> Would you ple|ase refresh it on your tree and re-send it?
Sure, i will refresh this and its dependencies from master :-)

> Thanks.
Thanks!, 
guillermo
  

Patch

diff --git a/include/abg-ctf-reader.h b/include/abg-ctf-reader.h
index 827d1bc2..09bf4020 100644
--- a/include/abg-ctf-reader.h
+++ b/include/abg-ctf-reader.h
@@ -49,8 +49,15 @@  reset_read_context(read_context_sptr &ctxt,
                    const std::string&	elf_path,
                    ir::environment*	environment);
 void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename);
+open_vmlinux_ctf_archive(read_context& ctxt,
+                         const string& vmlinux_ctfa_path);
+ctf_archive_t *
+vmlinux_ctf_archive(read_context& ctxt);
+
+void
+set_vmlinux_ctf_archive(read_context& ctxt,
+                        void *ctfa);
+
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type);
 } // end namespace ctf_reader
diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index b9b3d939..5efa6fc7 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -58,10 +58,6 @@  public:
   /// be read from the file then this is NULL.
   ctf_archive_t *ctfa;
 
-  /// The name of the vmlinux file from which the CTF archive got
-  /// extracted.
-  string vmlinux_ctfa_path_;
-
   /// A map associating CTF type ids with libabigail IR types.  This
   /// is used to reuse already generated types.
   std::map<std::string,type_base_sptr> types_map;
@@ -213,10 +209,14 @@  public:
     elf_fd = -1;
     is_elf_exec = false;
     ctfa = NULL;
-    vmlinux_ctfa_path_ = "";
     symtab.reset();
     cur_corpus_group_.reset();
   }
+
+  ~read_context()
+  {
+    ctf_close(ctfa);
+  }
 }; // end class read_context.
 
 /// Forward reference, needed because several of the process_ctf_*
@@ -1332,9 +1332,7 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
     }
 
   int errp;
-  if (corp->get_origin() == corpus::LINUX_KERNEL_BINARY_ORIGIN)
-     ctxt->ctfa = ctf_arc_open(ctxt->vmlinux_ctfa_path_.c_str(), &errp);
-  else
+  if (corp->get_origin() == corpus::CTF_ORIGIN)
     /* Build the ctfa from the contents of the relevant ELF sections,
        and process the CTF archive in the read context, if any.
        Information about the types, variables, functions, etc contained
@@ -1353,7 +1351,6 @@  read_corpus(read_context *ctxt, elf_reader::status &status)
   ctxt->cur_corpus_->sort_variables();
 
   /* Cleanup and return.  */
-  ctf_close(ctxt->ctfa);
   close_elf_handler(ctxt);
   return corp;
 }
@@ -1383,7 +1380,7 @@  set_read_context_corpus_group(read_context& ctxt,
 {
   ctxt.cur_corpus_group_ = group;
 }
-//
+
 /// Read a corpus and add it to a given @ref corpus_group.
 ///
 /// @param ctxt the reading context to consider.
@@ -1436,18 +1433,47 @@  reset_read_context(read_context_sptr	&ctxt,
     ctxt->initialize(elf_path, environment);
 }
 
-/// Set the @ref filename being assigned to the current read context.
+/// Open CTF archive in a read context for a given path.
 ///
 /// @param ctxt the read_context to consider.
 ///
-/// @param filename the @ref vmlinux CTFA filename to set.
+/// @param vmlinux_ctfa_path the vmlinux CTFA filename
+/// to be opened.
 void
-set_vmlinux_ctfa_path(read_context& ctxt,
-                      const string& filename)
+open_vmlinux_ctf_archive(read_context& ctxt,
+                         const string& vmlinux_ctfa_path)
 {
-  ctxt.vmlinux_ctfa_path_ = filename;
+  int errp;
+  if ((ctxt.ctfa = ctf_arc_open(vmlinux_ctfa_path.c_str(),
+                                &errp)) == NULL )
+    {
+      std::cerr << "cannot open: "
+       << ctf_errmsg (errp)
+       << "\n";
+      std::abort();
+    }
 }
 
+
+/// Set CTF archive member for a given read context.
+///
+/// @param ctxt the read_context to consider.
+///
+/// @param ctfa reference to be set.
+void
+set_vmlinux_ctf_archive(read_context& ctxt,
+                        void *ctfa)
+{
+  ctxt.ctfa = static_cast<ctf_archive_t *>(ctfa);
+}
+
+/// Return CTF archive member for a given read context.
+///
+/// @param ctxt the read_context to consider.
+ctf_archive_t *
+vmlinux_ctf_archive(read_context& ctxt)
+{ return ctxt.ctfa; }
+
 std::string
 dic_type_key(ctf_dict_t *dic, ctf_id_t ctf_type)
 {
diff --git a/src/abg-tools-utils.cc b/src/abg-tools-utils.cc
index b73786a8..5995a4ac 100644
--- a/src/abg-tools-utils.cc
+++ b/src/abg-tools-utils.cc
@@ -2577,12 +2577,13 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
   bool got_binary_paths =
     get_binary_paths_from_kernel_dist(root, debug_info_root, vmlinux, modules);
 #ifdef WITH_CTF
-  string vmlinux_ctfa;
+  string vmlinux_ctfa_path;
   if (got_binary_paths &&
       env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
     {
-      got_binary_paths = get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa);
-      ABG_ASSERT(!vmlinux_ctfa.empty());
+      got_binary_paths =
+       get_vmlinux_ctfa_path_from_kernel_dist(root, vmlinux_ctfa_path);
+      ABG_ASSERT(!vmlinux_ctfa_path.empty());
     }
 #endif
 
@@ -2684,11 +2685,14 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
       else if (env->get_debug_format_type() == environment::CTF_FORMAT_TYPE)
         {
           ctf_reader::read_context_sptr ctxt;
+          void *vmlinux_ctfa = nullptr;
+
           if (!vmlinux.empty())
             {
               ctxt =
                ctf_reader::create_read_context(vmlinux, env.get());
-              set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
+              open_vmlinux_ctf_archive(*ctxt, vmlinux_ctfa_path);
+              vmlinux_ctfa = vmlinux_ctf_archive(*ctxt);
 
               group.reset(new corpus_group(env.get(), root));
               set_read_context_corpus_group(*ctxt, group);
@@ -2725,7 +2729,7 @@  build_corpus_group_from_kernel_dist_under(const string&	root,
                       << ") ... " << std::flush;
 
                    reset_read_context(ctxt, *m, env.get());
-                   set_vmlinux_ctfa_path(*ctxt, vmlinux_ctfa);
+                   set_vmlinux_ctf_archive(*ctxt, vmlinux_ctfa);
                    set_read_context_corpus_group(*ctxt, group);
 
                    t.start();