[applied] Improve type (de)serialization instability debugging

Message ID 87ilxud8z9.fsf@redhat.com
State New

Commit Message

Dodji Seketeli Oct. 18, 2021, 8:37 a.m. UTC
  Hello,

While debugging an issue uncovered by self comparison (abidw
--abidiff <binary>), I realized that I needed a stronger check that
canonical types do not change between type serialization and type
de-serialization.  Namely, when a type T with canonical type C is
serialized, the type de-serialized from it should still have
canonical type C.  Otherwise, some "type instability" took place
during serialization and de-serialization.

This patch implements that verification and also cleans up a few
things I came across while working on this debugging check.
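
As a rough illustration of the verification described above, here is a
minimal, self-contained sketch.  The names self_comp_maps and
canonical_type_is_stable are hypothetical; they only mirror the two
maps kept in environment::priv, and plain integers stand in for the
type_base pointers that the actual patch works on.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the two debugging maps of environment::priv.
struct self_comp_maps
{
  // type-id -> address of the canonical type recorded when the type
  // built from DWARF was serialized.
  std::unordered_map<std::string, std::uintptr_t> type_id_canonical_type_map;
  // address of a type read back from abixml -> its type-id.
  std::unordered_map<std::uintptr_t, std::string> pointer_type_id_map;
};

// Return true iff 'c', the canonical type computed for the re-read
// type 't', is the canonical type recorded at serialization time.
bool
canonical_type_is_stable(const self_comp_maps& m,
			 std::uintptr_t t,
			 std::uintptr_t c)
{
  // Step 1: find the type-id that 't' was constructed from.
  auto id = m.pointer_type_id_map.find(t);
  if (id == m.pointer_type_id_map.end())
    return false;

  // Step 2: find the canonical type recorded for that type-id.
  auto original = m.type_id_canonical_type_map.find(id->second);
  if (original == m.type_id_canonical_type_map.end())
    return false;

  // Step 3: both canonical types must be the very same object.
  return original->second == c;
}
```

A mismatch in step 3 is what this message calls "type instability".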

	* include/abg-fwd.h (is_non_canonicalized_type): Declare new
	function.
	* src/abg-ir-priv.h: Include abg-corpus.h
	(environment::priv::pointer_type_id_map_): Fix comment.
	(environment::priv::check_canonical_type_from_abixml_during_self_comp):
	Define new member function.
	* src/abg-ir.cc (unmark_types_as_being_compared): Factorize this
	from ...
	(return_comparison_result): ... here.  Also, add a parameter to
	control whether this function should perform the "canonical type
	propagation optimization" or not.  By default the optimization is
	performed.  This can be changed for debugging purposes later.
	(type_base::get_canonical_type_for): Re-organise the self
	comparison debugging process to invoke the new function
	environment::priv::check_canonical_type_from_abixml_during_self_comp
	each time a canonical type is computed, in addition to doing the
	previous verification that was done when no canonical type was
	found.  Emit better error messages.
	(is_non_canonicalized_type): Rename the static function
	is_allowed_non_canonicalized_type into this and make it
	non-static.
	(hash_as_canonical_type_or_constant): Adjust.
	* src/abg-reader.cc (maybe_map_type_with_type_id): Define new
	static function.
	(read_context::maybe_check_abixml_canonical_type_stability):
	Ignore types that were not canonicalized.
	(read_corpus_from_input): Set the origin of the corpus early
	enough so that it's available to the canonicalizer even for types
	being canonicalized early.
	(MAYBE_MAP_TYPE_WITH_TYPE_ID): Factorize this macro out of ...
	(build_type): ... this.  That macro is defined only when debugging
	self comparison.
	(build_array_type_def): Map the read subrange type with its
	type-id.
	(handle_{type_decl, qualified_type, pointer_type_def,
	reference_type_def, function_type, array_type_def, enum_type_decl,
	typedef_decl, class_decl, union_decl}): Map the read type with its
	type-id.
	(load_canonical_type_ids): Ignore non-canonicalized types whose
	ids were saved in the type-id file.
	* src/abg-writer.cc (write_type_record): Factorize from ...
	(write_canonical_type_ids): ... here.  Don't forget to write the
	type-ids of decl-only types.  This can be useful for eye
	inspection.
	* tools/abidw.cc (load_corpus_and_write_abixml): Wait until the
	end of the function before removing the type-id file.  This can be
	useful for eye inspection.

Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Applied to master.
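
For readers following the type-id file changes listed in the
ChangeLog above, here is a small sketch of how a record is written
and later filtered when read back.  It assumes, as in the patch, that
0xdeadbabe marks a type without a canonical type.  The helper names
format_type_record, parse_canonical_address and should_record are
hypothetical and are not part of libabigail.

```cpp
#include <cstdint>
#include <ios>
#include <sstream>
#include <string>

// Marker written in place of a canonical type address for types that
// have no canonical type (e.g. decl-only classes).
constexpr std::uintptr_t non_canonicalized_marker = 0xdeadbabe;

// Build one <type> record of the type-id debugging file.  A null
// canonical address is replaced by the marker.
std::string
format_type_record(const std::string& id, std::uintptr_t canonical)
{
  std::ostringstream o;
  o << "  <type>\n"
    << "    <id>" << id << "</id>\n"
    << "    <c>" << std::hex
    << (canonical ? canonical : non_canonicalized_marker)
    << "</c>\n"
    << "  </type>\n";
  return o.str();
}

// Parse the hexadecimal content of a <c> element.
std::uintptr_t
parse_canonical_address(const std::string& text)
{
  std::istringstream s(text);
  std::uintptr_t v = 0;
  s >> std::hex >> v;
  return v;
}

// Decide whether a record read back from the file should be stored:
// records with an empty id or with the marker value are ignored.
bool
should_record(const std::string& id, std::uintptr_t v)
{return !id.empty() && v != non_canonicalized_marker;}
```

Skipping the marker on load keeps decl-only types, which are written
to the file for eye inspection only, out of the stability check.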

---
 include/abg-fwd.h |   6 ++
 src/abg-ir-priv.h |  79 ++++++++++++++++++++++++-
 src/abg-ir.cc     | 143 +++++++++++++++++++++++++++++++++-------------
 src/abg-reader.cc | 114 ++++++++++++++++++++++++++++++------
 src/abg-writer.cc |  73 +++++++++++++++++++----
 tools/abidw.cc    |  11 ++--
 6 files changed, 354 insertions(+), 72 deletions(-)
  

Patch

diff --git a/include/abg-fwd.h b/include/abg-fwd.h
index 03171940..9e353506 100644
--- a/include/abg-fwd.h
+++ b/include/abg-fwd.h
@@ -1379,6 +1379,12 @@  hash_type_or_decl(const type_or_decl_base *);
 size_t
 hash_type_or_decl(const type_or_decl_base_sptr &);
 
+bool
+is_non_canonicalized_type(const type_base *);
+
+bool
+is_non_canonicalized_type(const type_base_sptr&);
+
 bool
 function_decl_is_less_than(const function_decl&f, const function_decl &s);
 
diff --git a/src/abg-ir-priv.h b/src/abg-ir-priv.h
index f09162ab..18623ceb 100644
--- a/src/abg-ir-priv.h
+++ b/src/abg-ir-priv.h
@@ -16,6 +16,7 @@ 
 #include <string>
 
 #include "abg-ir.h"
+#include "abg-corpus.h"
 
 namespace abigail
 {
@@ -377,8 +378,8 @@  struct environment::priv
   //   'abidw --debug-abidiff <binary>'.  It holds the set of mapping of
   // an abixml (canonical) type and its type-id.
   unordered_map<string, uintptr_t>	type_id_canonical_type_map_;
-  // Likewise.  It holds a map that associates the pointer to a type read from
-  // abixml and the type-id string it corresponds to.
+  // Likewise.  It holds a map that associates the pointer to a type
+  // read from abixml and the type-id string it corresponds to.
   unordered_map<uintptr_t, string>	pointer_type_id_map_;
 #endif
   bool					canonicalization_is_done_;
@@ -674,6 +675,80 @@  struct environment::priv
     types_with_non_confirmed_propagated_ct_.erase(i);
   }
 
+#ifdef WITH_DEBUG_SELF_COMPARISON
+  /// When debugging self comparison, verify that a type T
+  /// de-serialized from abixml has the same canonical type as the
+  /// initial type built from DWARF that was serialized into T in the
+  /// first place.
+  ///
+  /// @param t deserialized type (from abixml) to consider.
+  ///
+  /// @param c the canonical type @p t should have.
+  ///
+  /// @return true iff @p c is the canonical type that @p t should
+  /// have.
+  bool
+  check_canonical_type_from_abixml_during_self_comp(const type_base* t,
+						    const type_base* c)
+  {
+    if (!t || !t->get_corpus() || !c)
+      return false;
+
+    if (!(t->get_corpus()->get_origin() == ir::corpus::NATIVE_XML_ORIGIN))
+      return false;
+
+    // Get the abixml type-id that this type was constructed from.
+    string type_id;
+    {
+      unordered_map<uintptr_t, string>::const_iterator it =
+	pointer_type_id_map_.find(reinterpret_cast<uintptr_t>(t));
+      if (it == pointer_type_id_map_.end())
+	return false;
+      type_id = it->second;
+    }
+
+    // Get the canonical type the original in-memory type (constructed
+    // from DWARF) had when it was serialized into abixml in the first place.
+    type_base *original_canonical_type = nullptr;
+    if (!type_id.empty())
+      {
+	unordered_map<string, uintptr_t>::const_iterator it =
+	  type_id_canonical_type_map_.find(type_id);
+	if (it == type_id_canonical_type_map_.end())
+	  return false;
+	original_canonical_type = reinterpret_cast<type_base*>(it->second);
+      }
+
+    // Now perform the real check.
+    //
+    // We want to ensure that the canonical type 'c' of 't' is the
+    // same as the canonical type of initial in-memory type (built
+    // from DWARF) that was serialized into 't' (in abixml) in the
+    // first place.
+    if (original_canonical_type == c)
+      return true;
+
+    return false;
+  }
+
+  /// When debugging self comparison, verify that a type T
+  /// de-serialized from abixml has the same canonical type as the
+  /// initial type built from DWARF that was serialized into T in the
+  /// first place.
+  ///
+  /// @param t deserialized type (from abixml) to consider.
+  ///
+  /// @param c the canonical type @p t should have.
+  ///
+  /// @return true iff @p c is the canonical type that @p t should
+  /// have.
+  bool
+  check_canonical_type_from_abixml_during_self_comp(const type_base_sptr& t,
+						    const type_base_sptr& c)
+  {
+    return check_canonical_type_from_abixml_during_self_comp(t.get(), c.get());
+  }
+#endif
 };// end struct environment::priv
 
 // <class_or_union::priv definitions>
diff --git a/src/abg-ir.cc b/src/abg-ir.cc
index 4f19e07a..3a1fbdd1 100644
--- a/src/abg-ir.cc
+++ b/src/abg-ir.cc
@@ -936,6 +936,26 @@  mark_types_as_being_compared(T& l, T&r)
   push_composite_type_comparison_operands(l, r);
 }
 
+/// Mark a pair of types as being not compared anymore.
+///
+/// This is helpful to later detect recursive cycles in the comparison
+/// stack.
+///
+/// Note that the types must have been passed to
+/// mark_types_as_being_compared prior to calling this function.
+///
+/// @param l the left-hand-side operand of the comparison.
+///
+/// @param r the right-hand-side operand of the comparison.
+template<typename T>
+void
+unmark_types_as_being_compared(T& l, T&r)
+{
+  l.priv_->unmark_as_being_compared(l);
+  l.priv_->unmark_as_being_compared(r);
+  pop_composite_type_comparison_operands(l, r);
+}
+
 /// Return the result of the comparison of two (sub) types.
 ///
 /// The function does the necessary book keeping before returning the
@@ -951,21 +971,25 @@  mark_types_as_being_compared(T& l, T&r)
 ///
 ///   @param r the right-hand-side operand of the type comparison
 ///
+///   @param propagate_canonical_type if true, it means the function
+///   performs the @ref OnTheFlyCanonicalization, aka, "canonical type
+///   propagation optimization".
+///
 ///   @param value the result of the comparison of @p l and @p r.
 ///
 ///   @return the value @p value.
 template<typename T>
 bool
-return_comparison_result(T& l, T& r, bool value)
+return_comparison_result(T& l, T& r, bool value,
+			 bool propagate_canonical_type = true)
 {
-  if (value == true)
+  if (propagate_canonical_type && (value == true))
     maybe_propagate_canonical_type(l, r);
-  l.priv_->unmark_as_being_compared(l);
-  l.priv_->unmark_as_being_compared(r);
 
-  pop_composite_type_comparison_operands(l, r);
+  unmark_types_as_being_compared(l, r);
+
   const environment* env = l.get_environment();
-  if (env->do_on_the_fly_canonicalization())
+  if (propagate_canonical_type && env->do_on_the_fly_canonicalization())
     // We are instructed to perform the "canonical type propagation"
     // optimization, making 'r' to possibly get the canonical type of
     // 'l' if it has one.  This mostly means that we are currently
@@ -3634,7 +3658,7 @@  void
 environment::self_comparison_debug_is_on(bool f)
 {priv_->self_comparison_debug_on_ = f;}
 
-/// Test if the we are in the process of the 'self-comparison
+/// Test if we are in the process of the 'self-comparison
 /// debugging' as triggered by 'abidw --debug-abidiff' command.
 ///
 /// @return true if self comparison debug is on.
@@ -13655,45 +13679,73 @@  type_base::get_canonical_type_for(type_base_sptr t)
 	      break;
 	    }
 	}
-      if (!result)
-	{
 #ifdef WITH_DEBUG_SELF_COMPARISON
-	  if (env->self_comparison_debug_is_on())
+      if (env->self_comparison_debug_is_on())
+	{
+	  // So we are debugging the canonicalization process,
+	  // possibly via the use of 'abidw --debug-abidiff <binary>'.
+	  corpus_sptr corp1, corp2;
+	  env->get_self_comparison_debug_inputs(corp1, corp2);
+	  if (corp1 && corp2 && t->get_corpus() == corp2.get())
 	    {
-	      // So we are debugging the canonicalization process,
-	      // possibly via the use of 'abidw --debug-abidiff <binary>'.
-	      //
 	      // If 't' comes from the second corpus, then it *must*
 	      // be equal to its matching canonical type coming from
 	      // the first corpus because the second corpus is the
 	      // abixml representation of the first corpus.  In other
 	      // words, all types coming from the second corpus must
 	      // have canonical types coming from the first corpus.
-	      //
-	      // We are in the case where 't' is different from all
-	      // the canonical types of the same name that come from
-	      // the first corpus.
-	      //
-	      // If 't' indeed comes from the second corpus then this
-	      // clearly is a canonicalization failure.
-	      //
-	      // There was a problem either during the serialization
-	      // of 't' into abixml, or during the de-serialization
-	      // from abixml into abigail::ir.  Further debugging is
-	      // needed to determine what that root cause problem is.
-	      //
-	      // Note that the first canonicalization problem of this
-	      // kind must be fixed before looking at the subsequent
-	      // ones, because the later might well just be
-	      // consequences of the former.
-	      corpus_sptr corp1, corp2;
-	      env->get_self_comparison_debug_inputs(corp1, corp2);
-	      if (corp1 && corp2 && (t->get_corpus() == corp2.get()))
-		std::cerr << "error: problem detected with type '"
-			  << repr
-			  << "' from second corpus\n" << std::flush;
+	      if (result)
+		{
+		  if (!env->priv_->
+		      check_canonical_type_from_abixml_during_self_comp(t,
+									result))
+		    // The canonical type of the type re-read from
+		    // abixml doesn't match the canonical type that was
+		    // initially serialized.
+		    std::cerr << "error: wrong canonical type for '"
+			      << repr
+			      << "' / type: @"
+			      << std::hex
+			      << t.get()
+			      << "/ canon: @"
+			      << result.get()
+			      << std::endl;
+		}
+	      else //!result
+		{
+		  uintptr_t ptr_val = reinterpret_cast<uintptr_t>(t.get());
+		  string type_id = env->get_type_id_from_pointer(ptr_val);
+		  if (type_id.empty())
+		    type_id = "type-id-<not-found>";
+		  // We are in the case where 't' is different from all
+		  // the canonical types of the same name that come from
+		  // the first corpus.
+		  //
+		  // If 't' indeed comes from the second corpus then this
+		  // clearly is a canonicalization failure.
+		  //
+		  // There was a problem either during the serialization
+		  // of 't' into abixml, or during the de-serialization
+		  // from abixml into abigail::ir.  Further debugging is
+		  // needed to determine what that root cause problem is.
+		  //
+		  // Note that the first canonicalization problem of this
+		  // kind must be fixed before looking at the subsequent
+		  // ones, because the latter might well just be
+		  // consequences of the former.
+		  std::cerr << "error: wrong induced canonical type for '"
+			    << repr
+			    << "' from second corpus"
+			    << ", ptr: " << std::hex << t.get()
+			    << ", type-id: " << type_id
+			    << std::endl;
+		}
 	    }
+	}
 #endif
+
+      if (!result)
+	{
 	  v.push_back(t);
 	  result = t;
 	}
@@ -25010,8 +25062,8 @@  hash_type_or_decl(const type_or_decl_base_sptr& tod)
 ///
 /// @return true iff @p t is a one of the only types allowed to be
 /// non-canonicalized in the system.
-static bool
-is_allowed_non_canonicalized_type(const type_base *t)
+bool
+is_non_canonicalized_type(const type_base *t)
 {
   if (!t)
     return true;
@@ -25020,6 +25072,19 @@  is_allowed_non_canonicalized_type(const type_base *t)
   return is_declaration_only_class_or_union_type(t) || env->is_void_type(t);
 }
 
+/// Test if a given type is allowed to be non canonicalized
+///
+/// This is a subroutine of hash_as_canonical_type_or_constant.
+///
+/// For now, the only types allowed to be non canonicalized in the
+/// system are decl-only class/union and the void type.
+///
+/// @return true iff @p t is one of the only types allowed to be
+/// non-canonicalized in the system.
+bool
+is_non_canonicalized_type(const type_base_sptr& t)
+{return is_non_canonicalized_type(t.get());}
+
 /// Hash a type by either returning the pointer value of its canonical
 /// type or by returning a constant if the type doesn't have a
 /// canonical type.
@@ -25062,7 +25127,7 @@  hash_as_canonical_type_or_constant(const type_base *t)
   // non-canonicalized type.  It must be a decl-only class or a void
   // type, otherwise it means that for some weird reason, the type
   // hasn't been canonicalized.  It should be!
-  ABG_ASSERT(is_allowed_non_canonicalized_type(t));
+  ABG_ASSERT(is_non_canonicalized_type(t));
 
   return 0xDEADBABE;
 }
diff --git a/src/abg-reader.cc b/src/abg-reader.cc
index 1ca36b7c..d2d423ed 100644
--- a/src/abg-reader.cc
+++ b/src/abg-reader.cc
@@ -61,6 +61,14 @@  static bool	read_is_non_reachable_type(xmlNodePtr, bool&);
 static bool	read_naming_typedef_id_string(xmlNodePtr, string&);
 #ifdef WITH_DEBUG_SELF_COMPARISON
 static bool	read_type_id_string(xmlNodePtr, string&);
+static bool	maybe_map_type_with_type_id(const type_base_sptr&,
+					    xmlNodePtr);
+static bool	maybe_map_type_with_type_id(const type_base_sptr&,
+					    const string&);
+#define MAYBE_MAP_TYPE_WITH_TYPE_ID(type, xml_node) \
+  maybe_map_type_with_type_id(type, xml_node)
+#else
+#define MAYBE_MAP_TYPE_WITH_TYPE_ID(type, xml_node)
 #endif
 static void	maybe_set_naming_typedef(read_context&	ctxt, xmlNodePtr, const decl_base_sptr &);
 class read_context;
@@ -839,9 +847,12 @@  public:
 	// was being serialized.
 	auto j = m_env->get_type_id_canonical_type_map().find(type_id);
 	if (j == m_env->get_type_id_canonical_type_map().end())
-	  std::cerr << "error: no type with type-id: '"
-		    << type_id
-		    << "' could be read back from the typeid file\n";
+	  {
+	    if (t->get_naked_canonical_type())
+	      std::cerr << "error: no type with type-id: '"
+			<< type_id
+			<< "' could be read back from the typeid file\n";
+	  }
 	else if (j->second
 		 != reinterpret_cast<uintptr_t>(t->get_canonical_type().get()))
 	  // So thecanonical type of 't' (at abixml de-serialization
@@ -1911,6 +1922,7 @@  read_corpus_from_input(read_context& ctxt)
 	ctxt.clear_per_corpus_data();
 
       corpus& corp = *ctxt.get_corpus();
+      corp.set_origin(corpus::NATIVE_XML_ORIGIN);
       ctxt.set_exported_decls_builder(corp.get_exported_decls_builder().get());
 
       handle_version_attribute(reader, corp);
@@ -1970,6 +1982,8 @@  read_corpus_from_input(read_context& ctxt)
 	ctxt.clear_per_corpus_data();
 
       corpus& corp = *ctxt.get_corpus();
+      corp.set_origin(corpus::NATIVE_XML_ORIGIN);
+
       ctxt.set_exported_decls_builder(corp.get_exported_decls_builder().get());
 
       xml::xml_char_sptr path_str = XML_NODE_GET_ATTRIBUTE(node, "path");
@@ -2036,8 +2050,6 @@  read_corpus_from_input(read_context& ctxt)
 
   ctxt.get_environment()->canonicalization_is_done(true);
 
-  corp.set_origin(corpus::NATIVE_XML_ORIGIN);
-
   if (call_reader_next)
     {
       // This is the necessary counter-part of the xmlTextReaderExpand()
@@ -2863,6 +2875,70 @@  read_type_id_string(xmlNodePtr node, string& type_id)
     }
   return false;
 }
+
+/// Associate a type-id string with the type that was constructed from
+/// it.
+///
+/// Note that if we are not in "self comparison debugging" mode or if
+/// the type we are looking at is not canonicalized, then this
+/// function does nothing.
+///
+/// @param t the type built from a type XML node that has a
+/// particular type-id.
+///
+/// @param type_id the type-id of type @p t.
+///
+/// @return true if the association was performed.
+static bool
+maybe_map_type_with_type_id(const type_base_sptr& t,
+			    const string& type_id)
+{
+  if (!t)
+    return false;
+
+  environment *env = t->get_environment();
+  if (!env->self_comparison_debug_is_on()
+      || is_non_canonicalized_type(t.get()))
+    return false;
+
+  env->get_pointer_type_id_map()[reinterpret_cast<uintptr_t>(t.get())] =
+    type_id;
+
+  return true;
+}
+
+/// Associate a type-id string with the type that was constructed from
+/// it.
+///
+/// Note that if we are not in "self comparison debugging" mode or if
+/// the type we are looking at is not canonicalized, then this
+/// function does nothing.
+///
+/// @param t the type built from a type XML node that has a
+/// particular type-id.
+///
+/// @param node the XML node that carries the type-id of type @p t.
+///
+/// @return true if the association was performed.
+static bool
+maybe_map_type_with_type_id(const type_base_sptr& t,
+			    xmlNodePtr node)
+{
+  if (!t)
+    return false;
+
+  environment *env = t->get_environment();
+  if (!env->self_comparison_debug_is_on()
+      || is_non_canonicalized_type(t.get()))
+    return false;
+
+  string type_id;
+  if (!read_type_id_string(node, type_id) || type_id.empty())
+    return false;
+
+  return maybe_map_type_with_type_id(t, type_id);
+}
+
 #endif
 
 /// Set the naming typedef to a given decl depending on the content of
@@ -4196,6 +4272,7 @@  build_array_type_def(read_context&	ctxt,
 	if (array_type_def::subrange_sptr s =
 	    build_subrange_type(ctxt, n))
 	  {
+	    MAYBE_MAP_TYPE_WITH_TYPE_ID(s, n);
 	    if (add_to_current_scope)
 	      {
 		add_decl_to_scope(s, ctxt.get_cur_scope());
@@ -5705,16 +5782,7 @@  build_type(read_context&	ctxt,
 	abi->record_type_as_reachable_from_public_interfaces(*t);
     }
 
-#ifdef WITH_DEBUG_SELF_COMPARISON
-  environment *env = ctxt.get_environment();
-  if (t && env->self_comparison_debug_is_on())
-    {
-      string type_id;
-      if (read_type_id_string(node, type_id))
-	// Let's store the type-id of this type pointer.
-	env->get_pointer_type_id_map()[reinterpret_cast<uintptr_t>(t.get())] = type_id;
-    }
-#endif
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(t, node);
 
   if (t)
     ctxt.maybe_canonicalize_type(t,/*force_delay=*/false );
@@ -5732,6 +5800,7 @@  handle_type_decl(read_context&	ctxt,
 		 bool		add_to_current_scope)
 {
   type_decl_sptr decl = build_type_decl(ctxt, node, add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5765,6 +5834,7 @@  handle_qualified_type_decl(read_context&	ctxt,
   qualified_type_def_sptr decl =
     build_qualified_type_decl(ctxt, node,
 			      add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5782,6 +5852,7 @@  handle_pointer_type_def(read_context&	ctxt,
 {
   pointer_type_def_sptr decl = build_pointer_type_def(ctxt, node,
 						      add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5799,6 +5870,7 @@  handle_reference_type_def(read_context& ctxt,
 {
   reference_type_def_sptr decl = build_reference_type_def(ctxt, node,
 							  add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5816,6 +5888,7 @@  handle_function_type(read_context&	ctxt,
 {
   function_type_sptr type = build_function_type(ctxt, node,
 						  add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(type, node);
   ctxt.maybe_canonicalize_type(type, /*force_delay=*/true);
   return type;
 }
@@ -5832,6 +5905,7 @@  handle_array_type_def(read_context&	ctxt,
 {
   array_type_def_sptr decl = build_array_type_def(ctxt, node,
 						  add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
 }
@@ -5847,6 +5921,7 @@  handle_enum_type_decl(read_context&	ctxt,
   enum_type_decl_sptr decl =
     build_enum_type_decl_if_not_suppressed(ctxt, node,
 					   add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5862,6 +5937,7 @@  handle_typedef_decl(read_context&	ctxt,
 {
   typedef_decl_sptr decl = build_typedef_decl(ctxt, node,
 					      add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(decl, node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5914,6 +5990,7 @@  handle_class_decl(read_context& ctxt,
 {
   class_decl_sptr decl =
     build_class_decl_if_not_suppressed(ctxt, node, add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(is_type(decl), node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -5932,6 +6009,7 @@  handle_union_decl(read_context& ctxt,
 {
   union_decl_sptr decl =
     build_union_decl_if_not_suppressed(ctxt, node, add_to_current_scope);
+  MAYBE_MAP_TYPE_WITH_TYPE_ID(is_type(decl), node);
   if (decl && decl->get_scope())
     ctxt.maybe_canonicalize_type(decl, /*force_delay=*/false);
   return decl;
@@ -6162,7 +6240,11 @@  load_canonical_type_ids(xml_reader::read_context& ctxt, const string &file_path)
 	  s << canonical_address;
 	  uintptr_t v = 0;
 	  s >>  std::hex >> v;
-	  if (!id.empty())
+	  if (!id.empty()
+	      // 0xdeadbabe is the special value used as the hash of
+	      // types that are not canonicalized.  See function
+	      // hash_as_canonical_type_or_constant for the details.
+	      && v != 0xdeadbabe)
 	    ctxt.get_environment()->get_type_id_canonical_type_map()[id] = v;
 	}
     }
diff --git a/src/abg-writer.cc b/src/abg-writer.cc
index c7e07b70..efc01fa7 100644
--- a/src/abg-writer.cc
+++ b/src/abg-writer.cc
@@ -836,6 +836,20 @@  public:
     m_emitted_decls_map[irepr] = true;
   }
 
+  /// Get the set of types that have been emitted.
+  ///
+  /// @return the set of types that have been emitted.
+  const type_ptr_set_type&
+  get_emitted_types_set() const
+  {return m_emitted_type_set;}
+
+  /// Get the set of declaration-only types that have been emitted.
+  ///
+  /// @return the set of declaration-only types that have been emitted.
+  const type_ptr_set_type&
+  get_emitted_decl_only_types_set() const
+  {return m_emitted_decl_only_set;}
+
   /// Clear the map that contains the IDs of the types that has been
   /// recorded as having been written out to the XML output.
   void
@@ -4764,6 +4778,47 @@  dump_decl_location(const decl_base_sptr d)
 {dump_decl_location(d.get());}
 
 #ifdef WITH_DEBUG_SELF_COMPARISON
+/// Write one of the records of the "type-ids" debugging file.
+///
+/// This is a sub-routine of write_canonical_type_ids.
+///
+/// @param ctxt the context to use.
+///
+/// @param type the type whose canonical type pointer value is emitted.
+///
+/// @param o the output stream to write to.
+static void
+write_type_record(xml_writer::write_context&	ctxt,
+		  const type_base*		type,
+		  ostream&			o)
+{
+  // We want to serialize a type record which content looks like:
+  //
+  //     <type>
+  //       <id>type-id-573</id>
+  //       <c>0x262ee28</c>
+  //     </type>
+  //     <type>
+  //       <id>type-id-569</id>
+  //       <c>0x2628298</c>
+  //     </type>
+  //     <type>
+  //       <id>type-id-575</id>
+  //       <c>0x25f9ba8</c>
+  //     </type>
+
+  string id = ctxt.get_id_for_type (type);
+  o << "  <type>\n"
+    << "    <id>" << id << "</id>\n"
+    << "    <c>"
+    << std::hex
+    << (type->get_canonical_type()
+	? reinterpret_cast<uintptr_t>(type->get_canonical_type().get())
+	: 0xdeadbabe)
+    << "</c>\n"
+    << "  </type>\n";
+}
+
 /// Serialize the map that is stored at
 /// environment::get_type_id_canonical_type_map() to an output stream.
 ///
@@ -4794,17 +4849,13 @@  write_canonical_type_ids(xml_writer::write_context& ctxt, ostream& o)
   // <abixml-types-check>
 
   o << "<abixml-types-check>\n";
-  for (const auto &p : ctxt.get_environment()->get_canonical_types_map())
-    for (const auto& type_sptr : p.second)
-      {
-	string id = ctxt.get_id_for_type (type_sptr);
-	o << "  <type>\n"
-	  << "    <id>" << id << "</id>\n"
-	  << "    <c>"
-	  << std::hex << type_sptr->get_canonical_type().get()
-	  << "</c>\n"
-	  << "  </type>\n";
-      }
+
+  for (const auto &type : ctxt.get_emitted_types_set())
+    write_type_record(ctxt, type, o);
+
+  for (const auto &type : ctxt.get_emitted_decl_only_types_set())
+    write_type_record(ctxt, type, o);
+
   o << "</abixml-types-check>\n";
 }
 
diff --git a/tools/abidw.cc b/tools/abidw.cc
index 3f38c695..54bf3aa4 100644
--- a/tools/abidw.cc
+++ b/tools/abidw.cc
@@ -591,10 +591,7 @@  load_corpus_and_write_abixml(char* argv[],
 #ifdef WITH_DEBUG_SELF_COMPARISON
 	  if (opts.debug_abidiff
 	      && !opts.type_id_file_path.empty())
-	    {
-	      load_canonical_type_ids(*read_ctxt, opts.type_id_file_path);
-	      remove(opts.type_id_file_path.c_str());
-	    }
+	    load_canonical_type_ids(*read_ctxt, opts.type_id_file_path);
 #endif
 	  t.start();
 	  corpus_sptr corp2 =
@@ -636,6 +633,12 @@  load_corpus_and_write_abixml(char* argv[],
 	  return 0;
 	}
 
+#ifdef WITH_DEBUG_SELF_COMPARISON
+	  if (opts.debug_abidiff
+	      && !opts.type_id_file_path.empty())
+	    remove(opts.type_id_file_path.c_str());
+#endif
+
       if (opts.noout)
 	return 0;