Hello,
In the debugging mode of self comparison induced by the invocation of
"abidw --debug-abidiff <binary>", it's useful to be able to ensure the
following invariant:
The pointer value of the canonical type of a type T that is
serialized into abixml with the id string "type-id-12" (for
instance) must keep the same canonical type pointer value when
that abixml file is de-serialized back into memory. This is
possible mainly because libabigail stays loaded in memory all the
time during both serialization and de-serialization.
This patch adds support for detecting when that invariant is not
respected.
In other words it detects when the type build from de-serializing the
type which id is "type-id-12" (for instance) has a canonical type
which pointer value is different from the pointer value of the
canonical type (of the type) that was serialized as having the type id
"type-id-12".
This is done in three phases.
The first phase happens in the code of abidw itself; after the abixml
is written on disk, another file called the "typeid file" is written
on disk as well. That later file contains a set of records; each
record associates a "type id string" (like the type IDs that appear in
the abixml file) to the pointer value of the canonical type that
matches that type id string. That file is thus now available for
manual inspection during a later debugger session. This is done by
invoking the new function write_canonical_type_ids.
The second phase appears right before abixml loading time. The typeid
file is read back and the association "type-id string" <-> is stored
in a hash map that is returned by
environment::get_type_id_canonical_type_map(). This is done by
invoking the new function load_canonical_type_ids.
The third phase happens right after the canonicalization (triggered in
the abixml reader) of a type coming from abixml, corresponding to a
given type id. It checks if the pointer value of the canonicalization
type just computed is the same as the one associated to that type id
in the map returned by environment::get_type_id_canonical_type_map.
This is a way of verifying the "stability" of a canonical type during
its serialization and de-serialization to and from abixml and it's
done as part of "abidw --debug-abidiff <binary>".
Just as an example, here is the kind of error output that I am getting
on a real life debugging session on a binary that exhibits self
comparison error:
$ abidw --debug-abidiff -d <some-binary>
error: problem detected with type 'typedef Vmalloc_t' from second corpus
error: canonical type for type 'typedef Vmalloc_t' of type-id 'type-id-179' changed from '1a083e8' to '21369b8'
[...]
$
From this output, I see that the first type for which libabigail
exhibits an instability on the pointer value of the canonical type is
the type 'typedef Vmalloc_t'. In other words, when that type is saved
to abixml, the type we read back is different. This needs further
debugging but at least it pinpoints exactly what type we are seeing
the core issue on first. This is of a tremendous help in the root
cause analysis needed to understand why the self comparison is
failing.
* include/abg-ir.h (environment::get_type_id_canonical_type_map):
Declare new data member.
* src/abg-ir.cc (environment::priv::type_id_canonical_type_map_):
Define new data member.
(environment::get_type_id_canonical_type_map): Define new method.
* include/abg-reader.h (load_canonical_type_ids): Declare new
function.
* src/abg-reader.cc (read_context::m_pointer_type_id_map):
Define new data member.
(read_context::{get_pointer_type_id_map,
maybe_check_abixml_canonical_type_stability}): Define new methods.
(read_context::{maybe_canonicalize_type,
perform_late_type_canonicalizing}): Invoke
maybe_perform_self_comparison_canonical_type_check after
canonicalization to perform canonicalization type stability
checking.
(build_type): Associate the pointer value for the newly built type
with the type id string identifying it in the abixml. Once the
abixml representation is dropped from memory and we are about to
perform type canonicalization, we can still know what the type id
of a given type coming from abixml was; it's thus possible to
verify that the canonical type associated to that type id is the
same as the one stored in the typeid file.
(read_type_id_string): Define new static function.
(load_canonical_type_ids): Define new function.
* include/abg-writer.h (write_canonical_type_ids): Likewise.
* src/abg-writer.cc (write_canonical_type_ids): Define new
function overloads.
* tools/abidw.cc (options::type_id_file_path): New data member.
(load_corpus_and_write_abixml): Write and read back the typeid
file.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
Applied to master.
---
include/abg-ir.h | 5 +
include/abg-reader.h | 6 ++
include/abg-writer.h | 10 ++
src/abg-ir.cc | 19 ++++
src/abg-reader.cc | 211 ++++++++++++++++++++++++++++++++++++++++++-
src/abg-writer.cc | 67 ++++++++++++++
tools/abidw.cc | 15 +++
7 files changed, 331 insertions(+), 2 deletions(-)
@@ -219,6 +219,11 @@ public:
type_base* get_canonical_type(const char* name, unsigned index);
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ unordered_map<string, uintptr_t>&
+ get_type_id_canonical_type_map() const;
+#endif
+
friend class class_or_union;
friend class class_decl;
friend class function_type;
@@ -82,6 +82,12 @@ void
consider_types_not_reachable_from_public_interfaces(read_context& ctxt,
bool flag);
}//end xml_reader
+
+#ifdef WITH_DEBUG_SELF_COMPARISON
+bool
+load_canonical_type_ids(xml_reader::read_context& ctxt,
+ const string& file_path);
+#endif
}//end namespace abigail
#endif // __ABG_READER_H__
@@ -127,6 +127,16 @@ write_corpus_group(write_context& ctx,
unsigned indent);
}// end namespace xml_writer
+
+#ifdef WITH_DEBUG_SELF_COMPARISON
+void
+write_canonical_type_ids(xml_writer::write_context&, ostream&);
+
+bool
+write_canonical_type_ids(xml_writer::write_context&,
+ const string &);
+#endif
+
}// end namespace abigail
#endif // __ABG_WRITER_H__
@@ -2797,6 +2797,10 @@ struct environment::priv
// This one is the corpus for the ABIXML file representing the
// serialization of the input binary.
corpus_wptr second_self_comparison_corpus_;
+ // This is also used for debugging purposes, when using
+ // 'abidw --debug-abidiff <binary>'. It holds the set of mapping of
+ // an abixml (canonical) type and its type-id.
+ unordered_map<string, uintptr_t> type_id_canonical_type_map_;
#endif
bool canonicalization_is_done_;
bool do_on_the_fly_canonicalization_;
@@ -3307,6 +3311,21 @@ environment::get_canonical_type(const char* name, unsigned index)
return (*types)[index].get();
}
+#ifdef WITH_DEBUG_SELF_COMPARISON
+/// Get the set of abixml type-id and the pointer value of the
+/// (canonical) type it's associated to.
+///
+/// This is useful for debugging purposes, especially in the context
+/// of the use of the command:
+/// 'abidw --debug-abidiff <binary>'.
+///
+/// @return the set of abixml type-id and the pointer value of the
+/// (canonical) type it's associated to.
+unordered_map<string, uintptr_t>&
+environment::get_type_id_canonical_type_map() const
+{return priv_->type_id_canonical_type_map_;}
+#endif
+
// </environment stuff>
// <type_or_decl_base stuff>
@@ -58,6 +58,9 @@ static bool read_is_declaration_only(xmlNodePtr, bool&);
static bool read_is_artificial(xmlNodePtr, bool&);
static bool read_tracking_non_reachable_types(xmlNodePtr, bool&);
static bool read_is_non_reachable_type(xmlNodePtr, bool&);
+#ifdef WITH_DEBUG_SELF_COMPARISON
+static bool read_type_id_string(xmlNodePtr, string&);
+#endif
class read_context;
@@ -96,6 +99,9 @@ private:
unordered_map<string, vector<type_base_sptr> > m_types_map;
unordered_map<string, shared_ptr<function_tdecl> > m_fn_tmpl_map;
unordered_map<string, shared_ptr<class_tdecl> > m_class_tmpl_map;
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ unordered_map<uintptr_t, string> m_pointer_type_id_map;
+#endif
vector<type_base_sptr> m_types_to_canonicalize;
string_xml_node_map m_id_xml_node_map;
xml_node_decl_base_sptr_map m_xml_node_decl_map;
@@ -378,6 +384,25 @@ public:
return i->second;
}
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ /// Getter of the map that associates the values of type pointers to
+ /// their type-id strings.
+ ///
+ /// Note that this map is populated at abixml reading time, (by
+ /// build_type()) when a given XML element representing a type is
+ /// read into a corresponding abigail::ir::type_base.
+ ///
+ /// This is used only for the purpose of debugging the
+ /// self-comparison process. That is, when invoking "abidw
+ /// --debug-abidiff".
+ ///
+ /// @return the map that associates the values of type pointers to
+ /// their type-id strings.
+ unordered_map<uintptr_t, string>&
+ get_pointer_type_id_map()
+ {return m_pointer_type_id_map;}
+#endif
+
/// Return the current lexical scope.
scope_decl*
get_cur_scope() const
@@ -791,6 +816,67 @@ public:
clear_decls_stack();
}
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ /// Perform a debugging routine for the "self-comparison" mode.
+ ///
+ /// This is done when this command is on:
+ ///
+ /// "abidw --debug-abidiff".
+ ///
+ /// Consider a type 't' built from an XML element from the abixml
+ /// reader and that has just been canonicalized.
+ ///
+ /// This function checks if the canonical type of 't' is the same as
+ /// the canonical type of the type which was saved into the abixml
+ /// with the same "type-id" as the one of 't'.
+ ///
+ /// Note that at abixml saving time, a debugging file was saved on
+ /// disk to record the mapping of canonical type pointers and their
+ /// type-ids. Right before reading the abixml again, that file was
+ /// read again and the mapping was loaded in the map returned by
+ /// environment::get_type_id_canonical_type_map().
+ void
+ maybe_check_abixml_canonical_type_stability(type_base_sptr& t)
+ {
+ if (!m_env->self_comparison_debug_is_on()
+ || m_env->get_type_id_canonical_type_map().empty())
+ return ;
+
+ // Let's get the type-id of this type as recorded in the
+ // originating abixml file.
+ string type_id;
+ const auto i =
+ get_pointer_type_id_map().find(reinterpret_cast<uintptr_t>(t.get()));
+ if (i != get_pointer_type_id_map().end())
+ {
+ type_id = i->second;
+ // Now let's get the canonical type that initially led to the
+ // serialization of a type with this type-id, when the abixml
+ // was being serialized.
+ auto j = m_env->get_type_id_canonical_type_map().find(type_id);
+ if (j == m_env->get_type_id_canonical_type_map().end())
+ std::cerr << "error: no type with type-id: '"
+ << type_id
+ << "' could be read back from the abixml file\n";
+ else if (j->second
+ != reinterpret_cast<uintptr_t>(t->get_canonical_type().get()))
+ // So thecanonical type of 't' (at abixml de-serialization
+ // time) is different from the canonical type that led to
+ // the serialization of 't' at abixml serialization time.
+ // Report this because it needs further debugging.
+ std::cerr << "error: canonical type for type '"
+ << t->get_pretty_representation(/*internal=*/false,
+ /*qualified=*/false)
+ << "' of type-id '" << type_id
+ << "' changed from '" << std::hex
+ << j->second << "' to '" << std::hex
+ << reinterpret_cast<uintptr_t>(t->get_canonical_type().get())
+ << std::dec
+ << "'\n";
+ }
+ }
+#endif
+
/// Test if a type should be canonicalized early. If so,
/// canonicalize it right away. Otherwise, schedule it for late
/// canonicalizing; that is, schedule it so that it's going to be
@@ -836,7 +922,12 @@ public:
&& !is_typedef(t)
&& !is_enum_type(t)
&& !is_function_type(t))
- canonicalize(t);
+ {
+ canonicalize(t);
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ maybe_check_abixml_canonical_type_stability(t);
+#endif
+ }
else
{
// We do not want to try to canonicalize a class type that
@@ -865,7 +956,12 @@ public:
for (vector<type_base_sptr>::iterator i = m_types_to_canonicalize.begin();
i != m_types_to_canonicalize.end();
++i)
- canonicalize(*i);
+ {
+ canonicalize(*i);
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ maybe_check_abixml_canonical_type_stability(*i);
+#endif
+ }
}
/// Test whether if a given function suppression matches a function
@@ -2704,6 +2800,26 @@ read_elf_symbol_visibility(xmlNodePtr node, elf_symbol::visibility& v)
return false;
}
+#ifdef WITH_DEBUG_SELF_COMPARISON
+/// Read the value of the 'id' attribute from a given XML node.
+///
+/// @param node the XML node to consider.
+///
+/// @param type_id the string to set the 'id' to.
+///
+/// @return true iff @p type_id was successfully set.
+static bool
+read_type_id_string(xmlNodePtr node, string& type_id)
+{
+ if (xml_char_sptr s = XML_NODE_GET_ATTRIBUTE(node, "id"))
+ {
+ type_id = CHAR_STR(s);
+ return true;
+ }
+ return false;
+}
+#endif
+
/// Build a @ref namespace_decl from an XML element node which name is
/// "namespace-decl". Note that this function recursively reads the
/// content of the namespace and builds the proper IR nodes
@@ -5541,6 +5657,16 @@ build_type(read_context& ctxt,
abi->record_type_as_reachable_from_public_interfaces(*t);
}
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ if (t && ctxt.get_environment()->self_comparison_debug_is_on())
+ {
+ string type_id;
+ if (read_type_id_string(node, type_id))
+ // Let's store the type-id of this type pointer.
+ ctxt.get_pointer_type_id_map()[reinterpret_cast<uintptr_t>(t.get())] = type_id;
+ }
+#endif
+
ctxt.maybe_canonicalize_type(t);
return t;
}
@@ -5913,4 +6039,85 @@ read_corpus_from_native_xml_file(const string& path,
}//end namespace xml_reader
+#ifdef WITH_DEBUG_SELF_COMPARISON
+/// Load the map that is stored at
+/// environment::get_type_id_canonical_type_map().
+///
+/// That map associates type-ids to the pointer value of the canonical
+/// types they correspond to. The map is loaded from a file that was
+/// stored on disk by some debugging primitive that is activated when
+/// the command "abidw --debug-abidiff <binary>' is used."
+///
+/// The function that stored the map in that file is
+/// write_canonical_type_ids.
+///
+/// @param ctxt the read context to use.
+///
+/// @param file_path the path to the file containing the type-ids <->
+/// canonical type mapping.
+///
+/// @return true iff the loading was successful.
+bool
+load_canonical_type_ids(xml_reader::read_context& ctxt, const string &file_path)
+{
+ xmlDocPtr doc = xmlReadFile(file_path.c_str(), NULL, XML_PARSE_NOERROR);
+ if (!doc)
+ return false;
+
+ xmlNodePtr node = xmlDocGetRootElement(doc);
+ if (!node)
+ return false;
+
+ // We expect a file which content looks like:
+ //
+ // <abixml-types-check>
+ // <type>
+ // <id>type-id-573</id>
+ // <c>0x262ee28</c>
+ // </type>
+ // <type>
+ // <id>type-id-569</id>
+ // <c>0x2628298</c>
+ // </type>
+ // <type>
+ // <id>type-id-575</id>
+ // <c>0x25f9ba8</c>
+ // </type>
+ // <abixml-types-check>
+ //
+ // So let's parse it!
+
+ if (xmlStrcmp(node->name, (xmlChar*) "abixml-types-check"))
+ return false;
+
+ for (node = xmlFirstElementChild(node);
+ node;
+ node = xmlNextElementSibling(node))
+ {
+ if (xmlStrcmp(node->name, (xmlChar*) "type"))
+ continue;
+
+ string id, canonical_address;
+ xmlNodePtr data = xmlFirstElementChild(node);
+ if (data && !xmlStrcmp(data->name, (xmlChar*) "id")
+ && data->children && xmlNodeIsText(data->children))
+ id = (char*) XML_GET_CONTENT(data->children);
+
+ data = xmlNextElementSibling(data);
+ if (data && !xmlStrcmp(data->name, (xmlChar*) "c")
+ && data->children && xmlNodeIsText(data->children))
+ {
+ canonical_address = (char*) XML_GET_CONTENT(data->children);
+ std::stringstream s;
+ s << canonical_address;
+ uintptr_t v = 0;
+ s >> std::hex >> v;
+ if (!id.empty())
+ ctxt.get_environment()->get_type_id_canonical_type_map()[id] = v;
+ }
+ }
+ return true;
+}
+#endif
+
}//end namespace abigail
@@ -4689,5 +4689,72 @@ void
dump_decl_location(const decl_base_sptr d)
{dump_decl_location(d.get());}
+#ifdef WITH_DEBUG_SELF_COMPARISON
+/// Serialize the map that is stored at
+/// environment::get_type_id_canonical_type_map() to an output stream.
+///
+/// This is for debugging purposes and is triggered ultimately by
+/// invoking the command 'abidw --debug-abidiff <binary>'.
+///
+/// @param ctxt the write context.
+///
+/// @param o the output stream to serialize the map to.
+void
+write_canonical_type_ids(xml_writer::write_context& ctxt, ostream& o)
+{
+ // We want to serialize a file which content looks like:
+ //
+ // <abixml-types-check>
+ // <type>
+ // <id>type-id-573</id>
+ // <c>0x262ee28</c>
+ // </type>
+ // <type>
+ // <id>type-id-569</id>
+ // <c>0x2628298</c>
+ // </type>
+ // <type>
+ // <id>type-id-575</id>
+ // <c>0x25f9ba8</c>
+ // </type>
+ // <abixml-types-check>
+
+ o << "<abixml-types-check>\n";
+ for (const auto &p : ctxt.get_environment()->get_canonical_types_map())
+ for (const auto& type_sptr : p.second)
+ {
+ string id = ctxt.get_id_for_type (type_sptr);
+ o << " <type>\n"
+ << " <id>" << id << "</id>\n"
+ << " <c>"
+ << std::hex << type_sptr->get_canonical_type().get()
+ << "</c>\n"
+ << " </type>\n";
+ }
+ o << "</abixml-types-check>\n";
+}
+
+/// Serialize the map that is stored at
+/// environment::get_type_id_canonical_type_map() to a file.
+///
+/// This is for debugging purposes and is triggered ultimately by
+/// invoking the command 'abidw --debug-abidiff <binary>'.
+///
+/// @param ctxt the write context.
+///
+/// @param file_path the file to serialize the map to.
+bool
+write_canonical_type_ids(xml_writer::write_context& ctxt,
+ const string &file_path)
+{
+ std:: ofstream o (file_path);
+
+ if (!o.is_open())
+ return true;
+ write_canonical_type_ids(ctxt, o);
+ o.close();
+ return true;
+}
+#endif
// </Debugging routines>
} //end namespace abigail
@@ -108,6 +108,9 @@ struct options
bool drop_private_types;
bool drop_undefined_syms;
type_id_style_kind type_id_style;
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ string type_id_file_path;
+#endif
options()
: display_version(),
@@ -575,9 +578,21 @@ load_corpus_and_write_abixml(char* argv[],
write_corpus(*write_ctxt, corp, 0);
tmp_file->get_stream().flush();
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ if (opts.debug_abidiff)
+ {
+ opts.type_id_file_path = tmp_file->get_path() + string(".typeid");
+ write_canonical_type_ids(*write_ctxt, opts.type_id_file_path);
+ }
+#endif
xml_reader::read_context_sptr read_ctxt =
create_native_xml_read_context(tmp_file->get_path(), env.get());
+#ifdef WITH_DEBUG_SELF_COMPARISON
+ if (opts.debug_abidiff
+ && !opts.type_id_file_path.empty())
+ load_canonical_type_ids(*read_ctxt, opts.type_id_file_path);
+#endif
t.start();
corpus_sptr corp2 =
read_corpus_from_input(*read_ctxt);