[for,review] ctf-reader: Enumerate dicts in the archive rather than using their name

Message ID 87h6bvs5wl.fsf@redhat.com
State New
Headers
Series [for,review] ctf-reader: Enumerate dicts in the archive rather than using their name |

Commit Message

Dodji Seketeli Aug. 8, 2024, 9:47 a.m. UTC
  Hello,

Following up my message
https://inbox.sourceware.org/libabigail/87le17s6b9.fsf@seketeli.org/,
Here is a patch I am proposing to avoid having to pass a new
--ctf-parent <ctf-parent-dict-name> option to the libabigail tools.  If
you agree with the spirit of this patch and if it works for you, I'd
prefer this one (or some other patch that avoids having to pass a
--ctf-parent option to the tools) to be applied instead.  So I am
looking forward to hearing your comments.

ctf::reader::process_ctf_archive calls ctf_dict_open with the name of
the (parent) dictionary to open.  If that parent dictionary is set to
an unexpected name, then the call to ctf_dict_open fails.  This can
happen for instance when at link time, the name of the parent
dictionary is set to an arbitrary name using the
"ctf_link_set_memb_name_changer" function.

This patch enumerates the dictionaries of the archive to avoid having
to know the name of the parent dictionary.  The enumeration is done
using the ctf_archive_next function of libctf.

There is currently no binary with an unexpected dictionary name in the
test suite so this patch cannot be tested for that particular case.
I'd be glad to have such binaries added to the test suite.

In the mean time this patch has been tested successfully using "make
fullcheck" on the existing test suite.

	* src/abg-ctf-reader.cc (reader::process_ctf_archive):  Do not use
	ctf_dict_open to open the dictionary by name.  Rather, enumerate
	the dictionaries of the current archive by using ctf_archive_next
	just like what lookup_symbol_in_ctf_archive does.  Set the
	argument of the skip_parent parameter to "false" to ensure we get
	the parent dictionary.
	(lookup_symbol_in_ctf_archive): Clean this up to properly
	initialize the parameters and to document the arguments to the
	parameters of ctf_archive_next.  Use nullptr instead of NULL.
	* tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt: Adjust.
	* tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt: Adjust.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
 src/abg-ctf-reader.cc                         | 52 +++++++++----------
 .../test-diff-pkg-ctf/test-rpm-report-1.txt   |  2 +-
 .../test-diff-pkg-ctf/test-rpm-report-2.txt   |  2 +-
 3 files changed, 26 insertions(+), 30 deletions(-)
  

Comments

Nick Alcock Aug. 8, 2024, 2:58 p.m. UTC | #1
On 8 Aug 2024, Dodji Seketeli spake thusly:

> Hello,
>
> Following up my message
> https://inbox.sourceware.org/libabigail/87le17s6b9.fsf@seketeli.org/,
> Here is a patch I am proposing to avoid having to pass a new
> --ctf-parent <ctf-parent-dict-name> option to the libabigail tools.  If
> you agree with the spirit of this patch and if it works for you, I'd
> prefer this one (or some other patch that avoids having to pass a
> --ctf-parent option to the tools) to be applied instead.  So I am
> looking forward to hearing your comments.

Seems good to me, though I am appalled on your behalf that you have to
do this. I'll fix ctf_dict_open so that passing in a NULL member always
works in future :)

> ctf::reader::process_ctf_archive calls ctf_dict_open with the name of
> the (parent) dictionary to open.  If that parent dictionary is set to
> an unexpected name, then the call to ctf_dict_open fails.  This can

Sorry! Mea culpa...

> +    ctf_dict = initial_ctf_dict;

Just a note that if you do want to extract the parent, you can call
ctf_parent_dict on this dict and get back the parent (or NULL, if this
already *is* the parent). 

> +    // Iterate through the dictionnaries of the archive and get the
> +    // first one, which should be the parent dictionnary.

This is not actually *guaranteed*, but this does happen to be the way
ctf_link_write works right now, and always has, and I have no plans to
change it. There'll be a better way to do this (i.e.
ctf_dict_open(arc, NULL, &err) will work for all archives, even those
whose parent dict has been renamed) long before this behaviour changes
(if it ever does).

>      for (const auto& symbol : symtab_reader::filtered_symtab(*symt, filter))
>        {
>  	std::string sym_name = symbol->get_name();
>@@ -524,11 +515,14 @@ public:
>  	    func_declaration->set_is_in_public_symbol_table(true);
>  	    add_fn_to_exported_or_undefined_decls(func_declaration.get());
>  	  }
> -
> -	ctf_dict = dict_tmp;
> +	if (ctf_dict != initial_ctf_dict)
> +	  {
> +	    ctf_dict_close(initial_ctf_dict);
> +	    initial_ctf_dict = ctf_dict;
> +	  }
>        }
> -
>      ctf_dict_close(ctf_dict);
> +    ctf_next_destroy(it);

That's right :) again, you don't need to do this if you're actually
*iterating* using an iterator, since the end-of-iteration error case
will free the iterator for you. But even then it's harmless:
ctf_next_destroy(NULL) does nothing (as you'd expect).

> -      while ((fp = ctf_archive_next(ctfa, &i, &arcname, 1, &ctf_err)) != NULL)
> +      while ((fp = ctf_archive_next(ctfa, &i, &arcname,
> +				    /*skip_parent=*/true,
> +				    &ctf_err)) != nullptr)

Just a note that skip_parent has the same affliction and assumes the
parent dict is called .ctf :( I'll fix that at the same time.

(The easiest fix for this is to double-check if the dict you get back is
a parent via ctf_parent_dict(), as above, and do a ctf_dict_close /
continue pair if it is. Once I fix ctf_archive_next this code will be a
NOP but will be harmless.)
  
Claudiu Zissulescu-Ianculescu Aug. 19, 2024, 10:19 a.m. UTC | #2
Hello Dodji,

The patch works/looks fine for me.

Thank you,
Claudiu

P.S.
I have tested using the next command line:
abidw --annotate --ctf vmlinux
Without your patch abidw fails.



On 8/8/24 12:47 PM, Dodji Seketeli wrote:
> Hello,
> 
> Following up my message
> https://inbox.sourceware.org/libabigail/87le17s6b9.fsf@seketeli.org/,
> Here is a patch I am proposing to avoid having to pass a new
> --ctf-parent <ctf-parent-dict-name> option to the libabigail tools.  If
> you agree with the spirit of this patch and if it works for you, I'd
> prefer this one (or some other patch that avoids having to pass a
> --ctf-parent option to the tools) to be applied instead.  So I am
> looking forward to hearing your comments.
> 
> ctf::reader::process_ctf_archive calls ctf_dict_open with the name of
> the (parent) dictionary to open.  If that parent dictionary is set to
> an unexpected name, then the call to ctf_dict_open fails.  This can
> happen for instance when at link time, the name of the parent
> dictionary is set to an arbitrary name using the
> "ctf_link_set_memb_name_changer" function.
> 
> This patch enumerates the dictionaries of the archive to avoid having
> to know the name of the parent dictionary.  The enumeration is done
> using the ctf_archive_next function of libctf.
> 
> There is currently no binary with an unexpected dictionary name in the
> test suite so this patch cannot be tested for that particular case.
> I'd be glad to have such binaries added to the test suite.
> 
> In the mean time this patch has been tested successfully using "make
> fullcheck" on the existing test suite.
> 
> 	* src/abg-ctf-reader.cc (reader::process_ctf_archive):  Do not use
> 	ctf_dict_open to open the dictionary by name.  Rather, enumerate
> 	the dictionaries of the current archive by using ctf_archive_next
> 	just like what lookup_symbol_in_ctf_archive does.  Set the
> 	argument of the skip_parent parameter to "false" to ensure we get
> 	the parent dictionary.
> 	(lookup_symbol_in_ctf_archive): Clean this up to properly
> 	initialize the parameters and to document the arguments to the
> 	parameters of ctf_archive_next.  Use nullptr instead of NULL.
> 	* tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt: Adjust.
> 	* tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt: Adjust.
> 
> Signed-off-by: Dodji Seketeli <dodji@redhat.com>
> ---
>  src/abg-ctf-reader.cc                         | 52 +++++++++----------
>  .../test-diff-pkg-ctf/test-rpm-report-1.txt   |  2 +-
>  .../test-diff-pkg-ctf/test-rpm-report-2.txt   |  2 +-
>  3 files changed, 26 insertions(+), 30 deletions(-)
> 
> diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
> index 6879c6e0..40b89af9 100644
> --- a/src/abg-ctf-reader.cc
> +++ b/src/abg-ctf-reader.cc
> @@ -445,35 +445,26 @@ public:
>      corp->add(ir_translation_unit);
>      cur_transl_unit(ir_translation_unit);
>  
> -    int ctf_err;
> -    ctf_dict_t *ctf_dict, *dict_tmp;
> +    ctf_dict_t *ctf_dict = nullptr, *initial_ctf_dict = nullptr;
>      const auto symt = symtab();
>      symtab_reader::symtab_filter filter = symt->make_filter();
>      filter.set_public_symbols();
> -    std::string dict_name;
>  
> -    if ((corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN)
> -	&& corpus_group())
> +    ctf_next_t *it = nullptr;
> +    // Iterate through the dictionnaries of the archive and get the
> +    // first one, which should be the parent dictionnary.
> +    initial_ctf_dict = ctf_archive_next(ctfa, /*iterator=*/&it,
> +					/*dict_name=*/nullptr,
> +					/*skip_parent=*/false,
> +					/*ctf_error=*/nullptr);
> +    if (!initial_ctf_dict)
>        {
> -	tools_utils::base_name(corpus_path(), dict_name);
> -	// remove .* suffix
> -	std::size_t pos = dict_name.find(".");
> -	if (pos != string::npos)
> -	  dict_name.erase(pos);
> -
> -	std::replace(dict_name.begin(), dict_name.end(), '-', '_');
> -      }
> -
> -    if ((ctf_dict = ctf_dict_open(ctfa,
> -				  dict_name.empty() ? NULL : dict_name.c_str(),
> -				  &ctf_err)) == NULL)
> -      {
> -	fprintf(stderr, "ERROR dictionary not found\n");
> -	abort();
> +	std::cerr << "Could not find any dictionnary in the CTF archive\n";
> +	ctf_next_destroy(it);
> +	return;
>        }
>  
> -    dict_tmp = ctf_dict;
> -
> +    ctf_dict = initial_ctf_dict;
>      for (const auto& symbol : symtab_reader::filtered_symtab(*symt, filter))
>        {
>  	std::string sym_name = symbol->get_name();
> @@ -524,11 +515,14 @@ public:
>  	    func_declaration->set_is_in_public_symbol_table(true);
>  	    add_fn_to_exported_or_undefined_decls(func_declaration.get());
>  	  }
> -
> -	ctf_dict = dict_tmp;
> +	if (ctf_dict != initial_ctf_dict)
> +	  {
> +	    ctf_dict_close(initial_ctf_dict);
> +	    initial_ctf_dict = ctf_dict;
> +	  }
>        }
> -
>      ctf_dict_close(ctf_dict);
> +    ctf_next_destroy(it);
>    }
>  
>    /// Add a new type declaration to the given libabigail IR corpus CORP.
> @@ -1667,10 +1661,12 @@ lookup_symbol_in_ctf_archive(ctf_archive_t *ctfa, ctf_dict_t **ctf_dict,
>    if (ctf_type == CTF_ERR)
>      {
>        ctf_dict_t *fp;
> -      ctf_next_t *i = NULL;
> -      const char *arcname;
> +      ctf_next_t *i = nullptr;
> +      const char *arcname = nullptr;
>  
> -      while ((fp = ctf_archive_next(ctfa, &i, &arcname, 1, &ctf_err)) != NULL)
> +      while ((fp = ctf_archive_next(ctfa, &i, &arcname,
> +				    /*skip_parent=*/true,
> +				    &ctf_err)) != nullptr)
>          {
>            if ((ctf_type = ctf_lookup_by_symbol_name (fp, sym_name)) == CTF_ERR)
>              ctf_type = ctf_lookup_variable(fp, sym_name);
> diff --git a/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt b/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt
> index 84d22968..af0afa73 100644
> --- a/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt
> +++ b/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt
> @@ -1,5 +1,5 @@
>  ================ changes of 'libdwarf.so.1.20180129.0'===============
> -  Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> +  Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
>    Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>    Function symbols changes summary: 0 Removed, 1 Added function symbol not referenced by debug info
>    Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
> diff --git a/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt b/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt
> index e22399de..1347b012 100644
> --- a/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt
> +++ b/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt
> @@ -1,5 +1,5 @@
>  ================ changes of 'libdwarf.so.1.20180129.0'===============
> -  Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> +  Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
>    Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>    Function symbols changes summary: 1 Removed, 0 Added function symbol not referenced by debug info
>    Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
  
Dodji Seketeli Aug. 19, 2024, 11:25 a.m. UTC | #3
Claudiu Zissulescu-Ianculescu <claudiu.zissulescu-ianculescu@oracle.com>
a écrit:

[...]

> The patch works/looks fine for me.

Thanks!

[...]

> P.S.
> I have tested using the next command line:
> abidw --annotate --ctf vmlinux
> Without your patch abidw fails.

Ah, cool! Thanks for that.

Is it possible that we get that vmlinux binary into
https://sourceware.org/git/gitweb.cgi?p=libabigail-tests.git somehow?

Cheers,

> Thank you,

You are welcome.

[...]

Cheers,
  
Dodji Seketeli Aug. 20, 2024, 5:39 p.m. UTC | #4
Claudiu Zissulescu-Ianculescu <claudiu.zissulescu-ianculescu@oracle.com>
a écrit:

> Hello Dodji,
>
> The patch works/looks fine for me.

Many thanks, again.

[...]


>> 
>> 	* src/abg-ctf-reader.cc (reader::process_ctf_archive):  Do not use
>> 	ctf_dict_open to open the dictionary by name.  Rather, enumerate
>> 	the dictionaries of the current archive by using ctf_archive_next
>> 	just like what lookup_symbol_in_ctf_archive does.  Set the
>> 	argument of the skip_parent parameter to "false" to ensure we get
>> 	the parent dictionary.
>> 	(lookup_symbol_in_ctf_archive): Clean this up to properly
>> 	initialize the parameters and to document the arguments to the
>> 	parameters of ctf_archive_next.  Use nullptr instead of NULL.
>> 	* tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt: Adjust.
>> 	* tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt: Adjust.
>> 
>> Signed-off-by: Dodji Seketeli <dodji@redhat.com>

Applied to the mainline.

[...]

Cheers,
  

Patch

diff --git a/src/abg-ctf-reader.cc b/src/abg-ctf-reader.cc
index 6879c6e0..40b89af9 100644
--- a/src/abg-ctf-reader.cc
+++ b/src/abg-ctf-reader.cc
@@ -445,35 +445,26 @@  public:
     corp->add(ir_translation_unit);
     cur_transl_unit(ir_translation_unit);
 
-    int ctf_err;
-    ctf_dict_t *ctf_dict, *dict_tmp;
+    ctf_dict_t *ctf_dict = nullptr, *initial_ctf_dict = nullptr;
     const auto symt = symtab();
     symtab_reader::symtab_filter filter = symt->make_filter();
     filter.set_public_symbols();
-    std::string dict_name;
 
-    if ((corp->get_origin() & corpus::LINUX_KERNEL_BINARY_ORIGIN)
-	&& corpus_group())
+    ctf_next_t *it = nullptr;
+    // Iterate through the dictionnaries of the archive and get the
+    // first one, which should be the parent dictionnary.
+    initial_ctf_dict = ctf_archive_next(ctfa, /*iterator=*/&it,
+					/*dict_name=*/nullptr,
+					/*skip_parent=*/false,
+					/*ctf_error=*/nullptr);
+    if (!initial_ctf_dict)
       {
-	tools_utils::base_name(corpus_path(), dict_name);
-	// remove .* suffix
-	std::size_t pos = dict_name.find(".");
-	if (pos != string::npos)
-	  dict_name.erase(pos);
-
-	std::replace(dict_name.begin(), dict_name.end(), '-', '_');
-      }
-
-    if ((ctf_dict = ctf_dict_open(ctfa,
-				  dict_name.empty() ? NULL : dict_name.c_str(),
-				  &ctf_err)) == NULL)
-      {
-	fprintf(stderr, "ERROR dictionary not found\n");
-	abort();
+	std::cerr << "Could not find any dictionnary in the CTF archive\n";
+	ctf_next_destroy(it);
+	return;
       }
 
-    dict_tmp = ctf_dict;
-
+    ctf_dict = initial_ctf_dict;
     for (const auto& symbol : symtab_reader::filtered_symtab(*symt, filter))
       {
 	std::string sym_name = symbol->get_name();
@@ -524,11 +515,14 @@  public:
 	    func_declaration->set_is_in_public_symbol_table(true);
 	    add_fn_to_exported_or_undefined_decls(func_declaration.get());
 	  }
-
-	ctf_dict = dict_tmp;
+	if (ctf_dict != initial_ctf_dict)
+	  {
+	    ctf_dict_close(initial_ctf_dict);
+	    initial_ctf_dict = ctf_dict;
+	  }
       }
-
     ctf_dict_close(ctf_dict);
+    ctf_next_destroy(it);
   }
 
   /// Add a new type declaration to the given libabigail IR corpus CORP.
@@ -1667,10 +1661,12 @@  lookup_symbol_in_ctf_archive(ctf_archive_t *ctfa, ctf_dict_t **ctf_dict,
   if (ctf_type == CTF_ERR)
     {
       ctf_dict_t *fp;
-      ctf_next_t *i = NULL;
-      const char *arcname;
+      ctf_next_t *i = nullptr;
+      const char *arcname = nullptr;
 
-      while ((fp = ctf_archive_next(ctfa, &i, &arcname, 1, &ctf_err)) != NULL)
+      while ((fp = ctf_archive_next(ctfa, &i, &arcname,
+				    /*skip_parent=*/true,
+				    &ctf_err)) != nullptr)
         {
           if ((ctf_type = ctf_lookup_by_symbol_name (fp, sym_name)) == CTF_ERR)
             ctf_type = ctf_lookup_variable(fp, sym_name);
diff --git a/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt b/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt
index 84d22968..af0afa73 100644
--- a/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt
+++ b/tests/data/test-diff-pkg-ctf/test-rpm-report-1.txt
@@ -1,5 +1,5 @@ 
 ================ changes of 'libdwarf.so.1.20180129.0'===============
-  Functions changes summary: 0 Removed, 0 Changed, 0 Added function
+  Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
   Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
   Function symbols changes summary: 0 Removed, 1 Added function symbol not referenced by debug info
   Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
diff --git a/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt b/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt
index e22399de..1347b012 100644
--- a/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt
+++ b/tests/data/test-diff-pkg-ctf/test-rpm-report-2.txt
@@ -1,5 +1,5 @@ 
 ================ changes of 'libdwarf.so.1.20180129.0'===============
-  Functions changes summary: 0 Removed, 0 Changed, 0 Added function
+  Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
   Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
   Function symbols changes summary: 1 Removed, 0 Added function symbol not referenced by debug info
   Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info