abg-writer: faster referenced type emission tests

Message ID 20210827152439.918695-1-gprocida@google.com
State New
Headers
Series abg-writer: faster referenced type emission tests |

Commit Message

Giuliano Procida Aug. 27, 2021, 3:24 p.m. UTC
  When determining whether a referenced type should be emitted, various
tests are done:

- has the type been emitted already? hash table lookup
- does the translation unit match? string comparison
- is this the last translation unit? read bool variable

The translation unit tests were added in recent commits and followed
the hash table lookups. This resulted in a performance regression
affecting Android continuous integration tests.

The lookups require a hash calculation and an equality check if the
hash is present. The equality checks are expensive deep equalities
rather than pointer comparisons.

This change reorders the tests so that the lookups happen last. This
speeds up abidw by more than a factor of 10 for one Android library.

	* src/abg-writer.cc (write_translation_unit): Reorder
	referenced type emission tests for efficiency. Consolidate
	related comments.

Signed-off-by: Giuliano Procida <gprocida@google.com>
Reviewed-by: Matthias Maennich <maennich@google.com>
---
 src/abg-writer.cc | 74 ++++++++++++++++++-----------------------------
 1 file changed, 28 insertions(+), 46 deletions(-)
  

Comments

Dodji Seketeli Sept. 1, 2021, 10:11 a.m. UTC | #1
Hello Giuliano,

Giuliano Procida <gprocida@google.com> a écrit:

> When determining whether a referenced type should be emitted, various
> tests are done:
>
> - has the type been emitted already? hash table lookup
> - does the translation unit match? string comparison
> - is this the last translation unit? read bool variable
>
> The translation unit tests were added in recent commits and followed
> the hash table lookups. This resulted in a performance regression
> affecting Android continuous integration tests.
>
> The lookups require a hash calculation and an equality check if the
> hash is present. The equality checks are expensive deep equalities
> rather than pointer comparisons.
>
> This change reorders the tests so that the lookups happen last. This
> speeds up abidw by more than a factor of 10 for one Android library.

Whoah!  Thank you for this.

[...]

>
> 	* src/abg-writer.cc (write_translation_unit): Reorder
> 	referenced type emission tests for efficiency. Consolidate
> 	related comments.
>
> Signed-off-by: Giuliano Procida <gprocida@google.com>
> Reviewed-by: Matthias Maennich <maennich@google.com>

Applied to master.

> ---
>  src/abg-writer.cc | 74 ++++++++++++++++++-----------------------------
>  1 file changed, 28 insertions(+), 46 deletions(-)
>
> diff --git a/src/abg-writer.cc b/src/abg-writer.cc
> index bf460eed..bf0ba9e4 100644
> --- a/src/abg-writer.cc
> +++ b/src/abg-writer.cc
> @@ -2347,49 +2347,38 @@ write_translation_unit(write_context&		ctxt,
>    // So this map of type -> string is to contain the referenced types
>    // we need to emit.
>    type_ptr_set_type referenced_types_to_emit;
> +
> +  // For each referenced type, ensure that it is either emitted in the
> +  // translation unit to which it belongs or in the last translation
> +  // unit as a last resort.
>    for (type_ptr_set_type::const_iterator i =
>  	 ctxt.get_referenced_types().begin();
>         i != ctxt.get_referenced_types().end();
>         ++i)
> -    {
> -      if (!ctxt.type_is_emitted(*i)
> -	  && !ctxt.decl_only_type_is_emitted(*i)
> -	  // Ensure that the referenced type is emitted in the
> -	  // translation that it belongs to.
> -	  && (is_last
> -	      || ((*i)->get_translation_unit()->get_absolute_path()
> -		  == tu.get_absolute_path())))
> -	referenced_types_to_emit.insert(*i);
> -    }
> +    if ((is_last || (*i)->get_translation_unit()->get_absolute_path()
> +	  == tu.get_absolute_path())
> +	&& !ctxt.type_is_emitted(*i)
> +	&& !ctxt.decl_only_type_is_emitted(*i))
> +      referenced_types_to_emit.insert(*i);

I have factorized the test into a new function that is called here,
instead.

Below is the patch applied.

Thank you !

Cheers,

From 0907d84aeff64744e6ecc5161668477da71ca97e Mon Sep 17 00:00:00 2001
From: Giuliano Procida <gprocida@google.com>
Date: Fri, 27 Aug 2021 16:24:39 +0100
Subject: [PATCH] abg-writer: faster referenced type emission tests

When determining whether a referenced type should be emitted, various
tests are done:

- has the type been emitted already? hash table lookup
- does the translation unit match? string comparison
- is this the last translation unit? read bool variable

The translation unit tests were added in recent commits and followed
the hash table lookups. This resulted in a performance regression
affecting Android continuous integration tests.

The lookups require a hash calculation and an equality check if the
hash is present. The equality checks are expensive deep equalities
rather than pointer comparisons.

This change reorders the tests so that the lookups happen last. This
speeds up abidw by more than a factor of 10 for one Android library.

	* src/abg-writer.cc (write_translation_unit): Reorder
	referenced type emission tests for efficiency. Consolidate
	related comments.

Signed-off-by: Giuliano Procida <gprocida@google.com>
Reviewed-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
 src/abg-writer.cc | 88 ++++++++++++++++++++++-------------------------
 1 file changed, 42 insertions(+), 46 deletions(-)

diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index bf460eed..9f48dc92 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -2242,6 +2242,35 @@ write_canonical_types_of_scope(const scope_decl	&scope,
   return true;
 }
 
+/// Test if a type referenced in a given translation unit should be
+/// emitted or not.
+///
+/// This is a subroutine of @ref write_translation_unit.
+///
+/// @param t the type to consider.
+///
+/// @param ctxt the write context to consider.
+///
+/// @param tu the translation unit to consider.
+///
+/// @param tu_is_last true if @p tu is the last translation unit being
+/// emitted.
+///
+/// @return true iff @p t is to be emitted.
+static bool
+referenced_type_should_be_emitted(const type_base *t,
+				  const write_context& ctxt,
+				  const translation_unit& tu,
+				  bool tu_is_last)
+{
+  if ((tu_is_last || t->get_translation_unit()->get_absolute_path()
+       == tu.get_absolute_path())
+      && !ctxt.type_is_emitted(t)
+      && !ctxt.decl_only_type_is_emitted(t))
+    return true;
+  return false;
+}
+
 /// Serialize a translation unit to an output stream.
 ///
 /// @param ctxt the context of the serialization.  It contains e.g,
@@ -2347,49 +2376,29 @@ write_translation_unit(write_context&		ctxt,
   // So this map of type -> string is to contain the referenced types
   // we need to emit.
   type_ptr_set_type referenced_types_to_emit;
+
+  // For each referenced type, ensure that it is either emitted in the
+  // translation unit to which it belongs or in the last translation
+  // unit as a last resort.
   for (type_ptr_set_type::const_iterator i =
 	 ctxt.get_referenced_types().begin();
        i != ctxt.get_referenced_types().end();
        ++i)
-    {
-      if (!ctxt.type_is_emitted(*i)
-	  && !ctxt.decl_only_type_is_emitted(*i)
-	  // Ensure that the referenced type is emitted in the
-	  // translation that it belongs to.
-	  && (is_last
-	      || ((*i)->get_translation_unit()->get_absolute_path()
-		  == tu.get_absolute_path())))
-	referenced_types_to_emit.insert(*i);
-    }
+    if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
+      referenced_types_to_emit.insert(*i);
 
   for (fn_type_ptr_set_type::const_iterator i =
 	 ctxt.get_referenced_function_types().begin();
        i != ctxt.get_referenced_function_types().end();
        ++i)
-    if (!ctxt.type_is_emitted(*i)
-	&& !ctxt.decl_only_type_is_emitted(*i)
-	// Ensure that the referenced type is emitted in the
-	// translation that it belongs to.
-	&& (is_last
-	    || ((*i)->get_translation_unit()->get_absolute_path()
-		== tu.get_absolute_path())))
-      // A referenced type that was not emitted at all must be
-      // emitted now.
+    if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
       referenced_types_to_emit.insert(*i);
 
   for (type_ptr_set_type::const_iterator i =
 	 ctxt.get_referenced_non_canonical_types().begin();
        i != ctxt.get_referenced_non_canonical_types().end();
        ++i)
-    if (!ctxt.type_is_emitted(*i)
-	&& !ctxt.decl_only_type_is_emitted(*i)
-	// Ensure that the referenced type is emitted in the
-	// translation that it belongs to.
-	&& (is_last
-	    || ((*i)->get_translation_unit()->get_absolute_path()
-		== tu.get_absolute_path())))
-      // A referenced type that was not emitted at all must be
-      // emitted now.
+    if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
       referenced_types_to_emit.insert(*i);
 
   // Ok, now let's emit the referenced type for good.
@@ -2438,34 +2447,21 @@ write_translation_unit(write_context&		ctxt,
       // there are still some referenced types in there that are not
       // emitted yet.  If yes, then we'll emit those again.
 
+      // For each referenced type, ensure that it is either emitted in
+      // the translation unit to which it belongs or in the last
+      // translation unit as a last resort.
       for (type_ptr_set_type::const_iterator i =
 	     ctxt.get_referenced_types().begin();
 	   i != ctxt.get_referenced_types().end();
 	   ++i)
-	if (!ctxt.type_is_emitted(*i)
-	    && !ctxt.decl_only_type_is_emitted(*i)
-	    // Ensure that the referenced type is emitted in the
-	    // translation that it belongs to.
-	    && (is_last
-		|| ((*i)->get_translation_unit()->get_absolute_path()
-		    == tu.get_absolute_path())))
-	  // A referenced type that was not emitted at all must be
-	  // emitted now.
+	if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
 	  referenced_types_to_emit.insert(*i);
 
       for (type_ptr_set_type::const_iterator i =
 	     ctxt.get_referenced_non_canonical_types().begin();
 	   i != ctxt.get_referenced_non_canonical_types().end();
 	   ++i)
-	if (!ctxt.type_is_emitted(*i)
-	    && !ctxt.decl_only_type_is_emitted(*i)
-	    // Ensure that the referenced type is emitted in the
-	    // translation that it belongs to.
-	    && (is_last
-		|| ((*i)->get_translation_unit()->get_absolute_path()
-		    == tu.get_absolute_path())))
-	  // A referenced type that was not emitted at all must be
-	  // emitted now.
+	if (referenced_type_should_be_emitted(*i, ctxt, tu, is_last))
 	  referenced_types_to_emit.insert(*i);
     }
  

Patch

diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index bf460eed..bf0ba9e4 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -2347,49 +2347,38 @@  write_translation_unit(write_context&		ctxt,
   // So this map of type -> string is to contain the referenced types
   // we need to emit.
   type_ptr_set_type referenced_types_to_emit;
+
+  // For each referenced type, ensure that it is either emitted in the
+  // translation unit to which it belongs or in the last translation
+  // unit as a last resort.
   for (type_ptr_set_type::const_iterator i =
 	 ctxt.get_referenced_types().begin();
        i != ctxt.get_referenced_types().end();
        ++i)
-    {
-      if (!ctxt.type_is_emitted(*i)
-	  && !ctxt.decl_only_type_is_emitted(*i)
-	  // Ensure that the referenced type is emitted in the
-	  // translation that it belongs to.
-	  && (is_last
-	      || ((*i)->get_translation_unit()->get_absolute_path()
-		  == tu.get_absolute_path())))
-	referenced_types_to_emit.insert(*i);
-    }
+    if ((is_last || (*i)->get_translation_unit()->get_absolute_path()
+	  == tu.get_absolute_path())
+	&& !ctxt.type_is_emitted(*i)
+	&& !ctxt.decl_only_type_is_emitted(*i))
+      referenced_types_to_emit.insert(*i);
 
   for (fn_type_ptr_set_type::const_iterator i =
 	 ctxt.get_referenced_function_types().begin();
        i != ctxt.get_referenced_function_types().end();
        ++i)
-    if (!ctxt.type_is_emitted(*i)
-	&& !ctxt.decl_only_type_is_emitted(*i)
-	// Ensure that the referenced type is emitted in the
-	// translation that it belongs to.
-	&& (is_last
-	    || ((*i)->get_translation_unit()->get_absolute_path()
-		== tu.get_absolute_path())))
-      // A referenced type that was not emitted at all must be
-      // emitted now.
+    if ((is_last || (*i)->get_translation_unit()->get_absolute_path()
+	  == tu.get_absolute_path())
+	&& !ctxt.type_is_emitted(*i)
+	&& !ctxt.decl_only_type_is_emitted(*i))
       referenced_types_to_emit.insert(*i);
 
   for (type_ptr_set_type::const_iterator i =
 	 ctxt.get_referenced_non_canonical_types().begin();
        i != ctxt.get_referenced_non_canonical_types().end();
        ++i)
-    if (!ctxt.type_is_emitted(*i)
-	&& !ctxt.decl_only_type_is_emitted(*i)
-	// Ensure that the referenced type is emitted in the
-	// translation that it belongs to.
-	&& (is_last
-	    || ((*i)->get_translation_unit()->get_absolute_path()
-		== tu.get_absolute_path())))
-      // A referenced type that was not emitted at all must be
-      // emitted now.
+    if ((is_last || (*i)->get_translation_unit()->get_absolute_path()
+	  == tu.get_absolute_path())
+	&& !ctxt.type_is_emitted(*i)
+	&& !ctxt.decl_only_type_is_emitted(*i))
       referenced_types_to_emit.insert(*i);
 
   // Ok, now let's emit the referenced type for good.
@@ -2438,34 +2427,27 @@  write_translation_unit(write_context&		ctxt,
       // there are still some referenced types in there that are not
       // emitted yet.  If yes, then we'll emit those again.
 
+      // For each referenced type, ensure that it is either emitted in
+      // the translation unit to which it belongs or in the last
+      // translation unit as a last resort.
       for (type_ptr_set_type::const_iterator i =
 	     ctxt.get_referenced_types().begin();
 	   i != ctxt.get_referenced_types().end();
 	   ++i)
-	if (!ctxt.type_is_emitted(*i)
-	    && !ctxt.decl_only_type_is_emitted(*i)
-	    // Ensure that the referenced type is emitted in the
-	    // translation that it belongs to.
-	    && (is_last
-		|| ((*i)->get_translation_unit()->get_absolute_path()
-		    == tu.get_absolute_path())))
-	  // A referenced type that was not emitted at all must be
-	  // emitted now.
+	if ((is_last || (*i)->get_translation_unit()->get_absolute_path()
+	      == tu.get_absolute_path())
+	    && !ctxt.type_is_emitted(*i)
+	    && !ctxt.decl_only_type_is_emitted(*i))
 	  referenced_types_to_emit.insert(*i);
 
       for (type_ptr_set_type::const_iterator i =
 	     ctxt.get_referenced_non_canonical_types().begin();
 	   i != ctxt.get_referenced_non_canonical_types().end();
 	   ++i)
-	if (!ctxt.type_is_emitted(*i)
-	    && !ctxt.decl_only_type_is_emitted(*i)
-	    // Ensure that the referenced type is emitted in the
-	    // translation that it belongs to.
-	    && (is_last
-		|| ((*i)->get_translation_unit()->get_absolute_path()
-		    == tu.get_absolute_path())))
-	  // A referenced type that was not emitted at all must be
-	  // emitted now.
+	if ((is_last || (*i)->get_translation_unit()->get_absolute_path()
+	      == tu.get_absolute_path())
+	    && !ctxt.type_is_emitted(*i)
+	    && !ctxt.decl_only_type_is_emitted(*i))
 	  referenced_types_to_emit.insert(*i);
     }