[v2] abg-reader.cc: track WIP types by pointer not name

Message ID 20200622172504.27660-1-gprocida@google.com
State Dropped, archived
Headers
Series [v2] abg-reader.cc: track WIP types by pointer not name |

Commit Message

Giuliano Procida June 22, 2020, 5:25 p.m. UTC
  When reading ABI XML files, the reader needs to construct types
progressively as any type may depend on other types and even on
itself. Such work-in-progress types are tracked explicitly.

The storage used for this is a map from (external, qualified) type
name to a count of how many times the type (name) has been seen.

However, function type names are invariably stored as "void ()" as
they are incomplete at the point they are added to the map. When the
reader later attempts to remove the marking they have their proper,
different names. In short, the code doesn't do what it's supposed to.

This commit changes the stored value from string to const type_base*.
Equality on type_base_sptr has been defined to be a deep comparison so
storing those wouldn't be quite right.

It also replaces the unordered_map with a simple vector, implementing
a searchable stack. This is simpler and fast, for the expected number
of types, and allows a stronger invariant to be asserted due to the
preservation of insertion order.

Note that this commit removes some of the call paths that result in
incorrect (external) type names being cached, regardless of whether
they are actually used.

	* src/abg-reader.cc (xml_reader::m_wip_types_map): Replace
	with m_wip_types_stack of type vector<const type_base*>.
	(xml_reader::clear_wip_classes_map): Remove.
	(xml_reader::clear_wip_types_stack): New function, clears
	m_wip_types_stack.
	(xml_reader::mark_type_as_wip): Push type pointer onto
	m_wip_types_stack.
	(xml_reader::unmark_type_as_wip): Add assertion that type
	pointer is at top of stack. Pop pointer off m_wip_types_stack.
	(xml_reader::is_wip_type): Test if type pointer in stack.

Signed-off-by: Giuliano Procida <gprocida@google.com>
---
 src/abg-reader.cc | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)
  

Comments

Matthias Männich June 22, 2020, 8:06 p.m. UTC | #1
On Mon, Jun 22, 2020 at 06:25:04PM +0100, Giuliano Procida wrote:
>When reading ABI XML files, the reader needs to construct types
>progressively as any type may depend on other types and even on
>itself. Such work-in-progress types are tracked explicitly.
>
>The storage used for this is a map from (external, qualified) type
>name to a count of how many times the type (name) has been seen.
>
>However, function type names are invariably stored as "void ()" as
>they are incomplete at the point they are added to the map. When the
>reader later attempts to remove the marking they have their proper,
>different names. In short, the code doesn't do what it's supposed to.
>
>This commit changes the stored value from string to const type_base*.
>Equality on type_base_sptr has been defined to be a deep comparison so
>storing those wouldn't be quite right.
>
>It also replaces the unordered_map with a simple vector, implementing
>a searchable stack. This is simpler and fast, for the expected number
>of types, and allows a stronger invariant to be asserted due to the
>preservation of insertion order.
>
>Note that this commit removes some of the call paths that result in
>incorrect (external) type names being cached, regardless of whether
>they are actually used.
>
>	* src/abg-reader.cc (xml_reader::m_wip_types_map): Replace
>	with m_wip_types_stack of type vector<const type_base*>.
>	(xml_reader::clear_wip_classes_map): Remove.
>	(xml_reader::clear_wip_types_stack): New function, clears
>	m_wip_types_stack.
>	(xml_reader::mark_type_as_wip): Push type pointer onto
>	m_wip_types_stack.
>	(xml_reader::unmark_type_as_wip): Add assertion that type
>	pointer is at top of stack. Pop pointer off m_wip_types_stack.
>	(xml_reader::is_wip_type): Test if type pointer in stack.
>

The surroundings of this code look like a bit of RAII could not harm.
Especially given the invariant that unmarking pops the stack, marking
could probably also define the unmarking by using something like a scope
guard.

Reviewed-by: Matthias Maennich <maennich@google.com>

Cheers,
Matthias

>Signed-off-by: Giuliano Procida <gprocida@google.com>
>---
> src/abg-reader.cc | 33 ++++++++++-----------------------
> 1 file changed, 10 insertions(+), 23 deletions(-)
>
>diff --git a/src/abg-reader.cc b/src/abg-reader.cc
>index eb74659f..480d1402 100644
>--- a/src/abg-reader.cc
>+++ b/src/abg-reader.cc
>@@ -26,6 +26,7 @@
> /// native XML format is named "abixml".
>
> #include "config.h"
>+#include <algorithm>
> #include <cstring>
> #include <cstdlib>
> #include <cerrno>
>@@ -119,7 +120,7 @@ private:
>   unordered_map<string, vector<type_base_sptr> >	m_types_map;
>   unordered_map<string, shared_ptr<function_tdecl> >	m_fn_tmpl_map;
>   unordered_map<string, shared_ptr<class_tdecl> >	m_class_tmpl_map;
>-  unordered_map<string, size_t>			m_wip_types_map;
>+  vector<const type_base*>				m_wip_types_stack;
>   vector<type_base_sptr>				m_types_to_canonicalize;
>   string_xml_node_map					m_id_xml_node_map;
>   xml_node_decl_base_sptr_map				m_xml_node_decl_map;
>@@ -522,8 +523,8 @@ public:
>   /// the map of the class that are currently being built, but at not
>   /// yet fully built.
>   void
>-  clear_wip_classes_map()
>-  {m_wip_types_map.clear();}
>+  clear_wip_types_stack()
>+  {m_wip_types_stack.clear();}
>
>   /// Mark a given type as being "Work In Progress"; that is, mark it
>   /// as being currently built.
>@@ -534,12 +535,7 @@ public:
>   {
>     if (!t)
>       return;
>-    string qname = get_type_name(t, /*qualified=*/true);
>-    unordered_map<string, size_t>::iterator it = m_wip_types_map.find(qname);
>-    if (it == m_wip_types_map.end())
>-      m_wip_types_map[qname] = 1;
>-    else
>-      ++it->second;
>+    m_wip_types_stack.push_back(t.get());
>   }
>
>   /// Mark a given class as being *NOT* "Work In Progress" anymore;
>@@ -551,15 +547,9 @@ public:
>   {
>     if (!t)
>       return;
>-
>-    string qname = get_type_name(t, /*qualified=*/true);
>-    unordered_map<string, size_t>::iterator it = m_wip_types_map.find(qname);
>-    if (it == m_wip_types_map.end())
>-      return;
>-    if (it->second)
>-      --it->second;
>-    if (it->second == 0)
>-      m_wip_types_map.erase(it);
>+    ABG_ASSERT(!m_wip_types_stack.empty());
>+    ABG_ASSERT(m_wip_types_stack.back() == t.get());
>+    m_wip_types_stack.pop_back();
>   }
>
>   /// Test if a type is being currently built; that is, if it's "Work
>@@ -571,11 +561,8 @@ public:
>   {
>     if (!t)
>       return false;
>-
>-    string qname = get_type_name(t, /*qualified=*/true);
>-    unordered_map<string, size_t>::const_iterator i =
>-      m_wip_types_map.find(qname);
>-    return i != m_wip_types_map.end();
>+    return std::find(m_wip_types_stack.begin(), m_wip_types_stack.end(), t.get())
>+           != m_wip_types_stack.end();
>   }
>
>   /// Test if two types are equal, without comparing them structurally.
>-- 
>2.27.0.111.gc72c7da667-goog
>
  
Dodji Seketeli July 9, 2020, 12:58 p.m. UTC | #2
Hello,

Giuliano Procida <gprocida@google.com> a écrit:

> When reading ABI XML files, the reader needs to construct types
> progressively as any type may depend on other types and even on
> itself. Such work-in-progress types are tracked explicitly.

Right.

Though I looked at this in details and I think this whole WIP tracking
is useless now, in the abixml reader so we might as well get rid of it
altogether.

It was used back in the old days because there was no type
de-duplication in abixml so the volume of types seen by the abixml
reader could be really huge some times.  So delaying type
canonicalization had a big impact, potentially.  So canonicalizing for a
WIP struct or function type was delayed until all the abixml file is
read.  Now in the code, all structs and function types are delayed
anyway, irrespective of their WIP-ness.  And we received no complaints
of slowness from the abixml reader.

So really, I am going to just dump the support for tracking WIP types in
the abixml reader altogether.  I am nervous about doing the same from
the DWARF reader, though, as we can see *many* more types from DWARF
because of all the duplication inherent to how the DWARF is generated at
the moment.

So I think this patch, altough valid, won't be necessary as the WIP
types tracking is going away.

>
> The storage used for this is a map from (external, qualified) type
> name to a count of how many times the type (name) has been seen.
>
> However, function type names are invariably stored as "void ()" as
> they are incomplete at the point they are added to the map. When the
> reader later attempts to remove the marking they have their proper,
> different names. In short, the code doesn't do what it's supposed to.
>

Right, I agree.  This WIP thing wasdesigned before the support for
function types was added.  And yes, the string representation of a
function types changes during its WIP life-cycle so clearly using that
representation to designate it is problematic.

So really, let's just dump the support for WIP type tracking from the
abixml reader.

I'll send a patch for that right away.

Thanks!

[...]

Cheers,
  

Patch

diff --git a/src/abg-reader.cc b/src/abg-reader.cc
index eb74659f..480d1402 100644
--- a/src/abg-reader.cc
+++ b/src/abg-reader.cc
@@ -26,6 +26,7 @@ 
 /// native XML format is named "abixml".
 
 #include "config.h"
+#include <algorithm>
 #include <cstring>
 #include <cstdlib>
 #include <cerrno>
@@ -119,7 +120,7 @@  private:
   unordered_map<string, vector<type_base_sptr> >	m_types_map;
   unordered_map<string, shared_ptr<function_tdecl> >	m_fn_tmpl_map;
   unordered_map<string, shared_ptr<class_tdecl> >	m_class_tmpl_map;
-  unordered_map<string, size_t>			m_wip_types_map;
+  vector<const type_base*>				m_wip_types_stack;
   vector<type_base_sptr>				m_types_to_canonicalize;
   string_xml_node_map					m_id_xml_node_map;
   xml_node_decl_base_sptr_map				m_xml_node_decl_map;
@@ -522,8 +523,8 @@  public:
   /// the map of the class that are currently being built, but at not
   /// yet fully built.
   void
-  clear_wip_classes_map()
-  {m_wip_types_map.clear();}
+  clear_wip_types_stack()
+  {m_wip_types_stack.clear();}
 
   /// Mark a given type as being "Work In Progress"; that is, mark it
   /// as being currently built.
@@ -534,12 +535,7 @@  public:
   {
     if (!t)
       return;
-    string qname = get_type_name(t, /*qualified=*/true);
-    unordered_map<string, size_t>::iterator it = m_wip_types_map.find(qname);
-    if (it == m_wip_types_map.end())
-      m_wip_types_map[qname] = 1;
-    else
-      ++it->second;
+    m_wip_types_stack.push_back(t.get());
   }
 
   /// Mark a given class as being *NOT* "Work In Progress" anymore;
@@ -551,15 +547,9 @@  public:
   {
     if (!t)
       return;
-
-    string qname = get_type_name(t, /*qualified=*/true);
-    unordered_map<string, size_t>::iterator it = m_wip_types_map.find(qname);
-    if (it == m_wip_types_map.end())
-      return;
-    if (it->second)
-      --it->second;
-    if (it->second == 0)
-      m_wip_types_map.erase(it);
+    ABG_ASSERT(!m_wip_types_stack.empty());
+    ABG_ASSERT(m_wip_types_stack.back() == t.get());
+    m_wip_types_stack.pop_back();
   }
 
   /// Test if a type is being currently built; that is, if it's "Work
@@ -571,11 +561,8 @@  public:
   {
     if (!t)
       return false;
-
-    string qname = get_type_name(t, /*qualified=*/true);
-    unordered_map<string, size_t>::const_iterator i =
-      m_wip_types_map.find(qname);
-    return i != m_wip_types_map.end();
+    return std::find(m_wip_types_stack.begin(), m_wip_types_stack.end(), t.get())
+           != m_wip_types_stack.end();
   }
 
   /// Test if two types are equal, without comparing them structurally.