Subject: [PATCH 1/3] reader: Handle 'abi-corpus' element being possibly empty
Hello,
The abixml reader wrongly assumes that the 'abi-corpus' element is
always non-empty. Note that until now, the only emitter of abixml
consumed in practice was abg-writer.cc and it only emits non-empty
'abi-corpus' elements. So the issue wasn't exposed.
So, the reader assumes that an 'abi-corpus' element has at least a
text node.
For instance, consider this minimal input file named test-v0.abi:
$cat test-v0.abi
<abi-corpus-group architecture='elf-arm-aarch64'>
<abi-corpus path='vmlinux' architecture='elf-arm-aarch64'>
</abi-corpus>
</abi-corpus-group>
$
Now, compare it to this file where the abi-corpus element is an empty
element (doesn't even contain any text):
$cat test-v1.abi
<abi-corpus-group architecture='elf-arm-aarch64'>
<abi-corpus path='vmlinux' architecture='elf-arm-aarch64'/>
</abi-corpus-group>
$
Comparing the two files with abidiff (wrongly) reports:
$ abidiff test-v0.abi test-v1.abi
ELF architecture changed
Functions changes summary: 0 Removed, 0 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
architecture changed from 'elf-arm-aarch64' to ''
$
What's happening is that read_corpus_from_input is getting out early
when it sees that the node is empty. This is at:
xmlNodePtr node = ctxt.get_corpus_node();
@@ -1907,10 +1925,14 @@ read_corpus_from_input(read_context& ctxt)
corp.set_soname(reinterpret_cast<char*>(soname_str.get()));
}
if (!node->children) // <---- we get out early here and we
return nil; // forget about the properties of
// the current empty corpus element node
So, at its core, fixing the issue at hand involves avoiding the early
return there.
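To make the intent concrete, here is a minimal, self-contained sketch
of "read the properties first, then look at the children". It does
not use libxml2: Element, Corpus, attr_or_empty, read_corpus and
read_empty_corpus_element are hypothetical stand-ins chosen for
illustration, not names from abg-reader.cc.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an XML element; this is NOT libxml2's xmlNode.
struct Element
{
  std::map<std::string, std::string> attrs;
  const Element* children;  // null for an empty element like <abi-corpus .../>

  Element() : children(nullptr) {}
};

// Hypothetical, reduced view of what a corpus carries.
struct Corpus
{
  std::string path;
  std::string architecture;
};

// Returns the value of attribute NAME, or an empty string if absent.
static std::string
attr_or_empty(const Element& e, const std::string& name)
{
  std::map<std::string, std::string>::const_iterator it = e.attrs.find(name);
  return it == e.attrs.end() ? std::string() : it->second;
}

// Reads the element's properties unconditionally.  The absence of
// children only means there is nothing more to process; it is not a
// reason to discard the properties that were just read.
bool
read_corpus(const Element* node, Corpus& corp)
{
  if (!node)
    return false;

  corp.path = attr_or_empty(*node, "path");
  corp.architecture = attr_or_empty(*node, "architecture");

  if (node->children)
    {
      // Child elements (symbols, translation units, ...) would be
      // processed here.
    }
  return true;
}

// Builds an empty element carrying 'path' and 'architecture'
// attributes and reads it back.
Corpus
read_empty_corpus_element()
{
  Element e;
  e.attrs["path"] = "vmlinux";
  e.attrs["architecture"] = "elf-arm-aarch64";

  Corpus corp;
  bool ok = read_corpus(&e, corp);
  assert(ok);
  return corp;
}
```

With this shape, an empty element still yields a corpus whose
architecture is the one written on the element, rather than ''.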
But then, it turns out that's not enough.
In the current setting, the different abixml processing entry points
are designed to be used in a semi "streaming" mode.
So for instance, read_translation_unit_from_input can be invoked
repeatedly to "stream in" the next translation unit at each
invocation.
Alternatively, the lower level xmlTextReaderNext can be used to
iterate over XML nodes until we reach the translation unit XML element
we are interested in. At that point xmlTextReaderExpand can be used
to expand the XML node, then we let the context know that this is
the current node of the corpus that needs to be processed, using
read_context::set_corpus_node. Once we've done that,
read_translation_unit_from_input can be called to process that
particular corpus node. Note that the corpus node at hand, which needs
to be processed, will be retrieved by read_context::get_corpus_node.
These two modes of operation are also available for
read_corpus_from_input, read_symbol_db_from_input,
read_elf_needed_from_input etc.
Today, these functions all assume that the current node returned by
read_context::get_corpus_node is the node /before/ the node of the
corpus to be processed. So they all start looking at the /next sibling/
of the node returned by read_context::get_corpus_node. So the code
was implicitly assuming that read_context::get_corpus_node was
pointing to a text node that was before the node of the corpus that we
want to process.
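The difference between the two conventions can be sketched with a
tiny sibling chain. This is only an illustration under stated
assumptions: Node, find_element, find_old_convention and
find_new_convention are hypothetical names, and Node models just the
type, name and next-sibling link of an XML node rather than the real
libxml2 structure.

```cpp
#include <cassert>
#include <string>

enum NodeType { ELEMENT, TEXT };

// Hypothetical stand-in for an XML node: type, name, next sibling.
struct Node
{
  NodeType type;
  std::string name;
  Node* next;
};

// Returns the first element named NAME in the sibling chain that
// starts at START (START itself included), or null.
const Node*
find_element(const Node* start, const std::string& name)
{
  for (const Node* n = start; n; n = n->next)
    if (n->type == ELEMENT && n->name == name)
      return n;
  return nullptr;
}

// Old convention: the context node is assumed to sit *before* the
// node to process, so the search starts at its next sibling.
const Node*
find_old_convention(const Node* current, const std::string& name)
{return current ? find_element(current->next, name) : nullptr;}

// New convention: the context node *is* the node to process.
const Node*
find_new_convention(const Node* current, const std::string& name)
{return find_element(current, name);}

// Input without whitespace: the first child is already an element,
// so starting at its next sibling misses it.
bool
old_convention_misses_first_element()
{
  Node instr = {ELEMENT, "abi-instr", nullptr};
  Node needed = {ELEMENT, "elf-needed", &instr};
  return (find_old_convention(&needed, "elf-needed") == nullptr
	  && find_new_convention(&needed, "elf-needed") == &needed);
}

// Pretty-printed input: a text node precedes the element, which is
// the only situation in which the old convention finds it.
bool
both_conventions_agree_after_text_node()
{
  Node instr = {ELEMENT, "abi-instr", nullptr};
  Node ws2 = {TEXT, "text", &instr};
  Node needed = {ELEMENT, "elf-needed", &ws2};
  Node ws1 = {TEXT, "text", &needed};
  return (find_old_convention(&ws1, "elf-needed") == &needed
	  && find_new_convention(&ws1, "elf-needed") == &needed);
}
```

In this sketch the old convention only works because a whitespace
text node happens to precede the element of interest.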
This is wrong. read_context::get_corpus_node should just return the
current node of the corpus that needs to be processed and voila.
And so read_context::set_corpus_node should be used to set the current
node of the corpus to the current element node that needs to be processed.
That's the spirit of the change done by this patch.
As its name suggests, the existing
xml::advance_to_next_sibling_element is used to skip non element xml
nodes (including text nodes) and move to the next element node to
process, which is set to the context using
read_context::set_corpus_node.
Then the actual processing functions like read_corpus_from_input get
the node to process, using read_context::get_corpus_node and process
it rather than processing the sibling node that comes after it.
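The "process the current node, then advance to the next element"
protocol described above can be sketched as follows. The Node and
Context types and the process_current_element helper are hypothetical
and only model the idea; the advance function mirrors the behaviour
the patch expects from xml::advance_to_next_sibling_element (skip
non-element siblings, tolerate a nil node) but is not its actual
implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum NodeType { ELEMENT, TEXT };

// Hypothetical stand-in for an XML node: type, name, next sibling.
struct Node
{
  NodeType type;
  std::string name;
  Node* next;
};

// Hypothetical stand-in for the reader context.
struct Context
{
  Node* corpus_node;  // the node that needs to be processed, or null
};

// Returns the next sibling that is an element, skipping text nodes.
// Returns null when given a null node or when no such sibling exists.
Node*
advance_to_next_sibling_element(Node* node)
{
  if (!node)
    return nullptr;
  for (Node* n = node->next; n; n = n->next)
    if (n->type == ELEMENT)
      return n;
  return nullptr;
}

// Processes the node held by the context (here: records its name),
// then makes the context point to the next element to process.
bool
process_current_element(Context& ctxt, std::vector<std::string>& seen)
{
  Node* node = ctxt.corpus_node;
  if (!node)
    return false;
  seen.push_back(node->name);
  ctxt.corpus_node = advance_to_next_sibling_element(node);
  return true;
}

// Walks the children of a pretty-printed element: text, <elf-needed>,
// text, <abi-instr>, text.
std::vector<std::string>
process_pretty_printed_children()
{
  Node ws3 = {TEXT, "text", nullptr};
  Node instr = {ELEMENT, "abi-instr", &ws3};
  Node ws2 = {TEXT, "text", &instr};
  Node needed = {ELEMENT, "elf-needed", &ws2};
  Node ws1 = {TEXT, "text", &needed};

  // Skip the leading text node to land on the first element.
  Context ctxt = {advance_to_next_sibling_element(&ws1)};
  assert(ctxt.corpus_node == &needed);

  std::vector<std::string> seen;
  while (process_current_element(ctxt, seen))
    ;
  return seen;
}
```

Each call leaves the context pointing at an element node or at
nothing, so the next call never has to guess whether it must look at
the current node or at its sibling.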
The other changes prevent related crashes that I noticed while doing
various tests, update the abilint tool used to read and debug abixml
input files, and add better documentation.
* src/abg-reader.cc (read_context::get_corpus_node): Add comment
to this member function.
(read_translation_unit_from_input, read_symbol_db_from_input)
(read_elf_needed_from_input): Start processing the current node of
the corpus that needs to be processed rather than its next
sibling. Once the processing is done, set the new "current node
of the corpus to be processed" properly by skipping to the next
element node to be processed.
(read_corpus_from_input): Don't get out early when the
'abi-corpus' element is empty. If, however, it has child nodes,
skip to the first child element and flag it -- using
read_context::set_corpus_node -- as being the element node to be
processed by the processing facilities of the reader. If we are
in a mode where we called xmlTextReaderExpand ourselves to get the
node to process, then it means we need to free that node
indirectly by calling xmlTextReaderNext. In that case, that node
should not be flagged by read_context::set_corpus_node. Add more
comments.
* src/abg-corpus.cc (corpus::is_empty): Do not crash when no
symtab is around.
* src/abg-libxml-utils.cc (go_to_next_sibling_element_or_stay):
Fix typo in comment.
(advance_to_next_sibling_element): Don't crash when given a nil
node.
* tests/data/test-abidiff/test-PR27616-squished-v0.abi: Add new
test input.
* tests/data/test-abidiff/test-PR27616-squished-v1.abi: Likewise.
* tests/data/test-abidiff/test-PR27616-v0.xml: Likewise.
* tests/data/test-abidiff/test-PR27616-v1.xml: Likewise.
* tests/data/Makefile.am: Add the new test inputs above to source
distribution.
* tests/test-abidiff.cc (specs): Add the new test inputs above to
this harness.
* tools/abilint.cc (main): Support writing corpus groups.
Signed-off-by: Dodji Seketeli <dodji@redhat.com>
---
src/abg-corpus.cc | 2 +-
src/abg-libxml-utils.cc | 6 +-
src/abg-reader.cc | 65 +++++++++++++------
tests/data/Makefile.am | 4 ++
.../test-abidiff/test-PR27616-squished-v0.abi | 43 ++++++++++++
.../test-abidiff/test-PR27616-squished-v1.abi | 1 +
tests/data/test-abidiff/test-PR27616-v0.xml | 4 ++
tests/data/test-abidiff/test-PR27616-v1.xml | 3 +
tests/test-abidiff.cc | 12 ++++
tools/abilint.cc | 9 ++-
10 files changed, 126 insertions(+), 23 deletions(-)
create mode 100644 tests/data/test-abidiff/test-PR27616-squished-v0.abi
create mode 100644 tests/data/test-abidiff/test-PR27616-squished-v1.abi
create mode 100644 tests/data/test-abidiff/test-PR27616-v0.xml
create mode 100644 tests/data/test-abidiff/test-PR27616-v1.xml
@@ -1009,7 +1009,7 @@ corpus::is_empty() const
}
}
return (members_empty
- && !get_symtab()->has_symbols()
+ && (!get_symtab() || !get_symtab()->has_symbols())
&& priv_->soname.empty()
&& priv_->needed.empty());
}
@@ -413,7 +413,8 @@ unescape_xml_comment(const std::string& str)
return result;
}
-/// Maybe get the next sibling element node of an XML node, or stay to the sam
+/// Maybe get the next sibling element node of an XML node, or stay to
+/// the same.
///
/// If there is no next sibling xml element node, the function returns
/// the initial node.
@@ -443,6 +444,9 @@ go_to_next_sibling_element_or_stay(xmlNodePtr node)
xmlNodePtr
advance_to_next_sibling_element(xmlNodePtr node)
{
+ if (!node)
+ return 0;
+
xmlNodePtr n = go_to_next_sibling_element_or_stay(node->next);
if (n == 0 || n->type != XML_ELEMENT_NODE)
return 0;
@@ -196,10 +196,20 @@ public:
get_reader() const
{return m_reader;}
+ /// Getter of the current XML node in the corpus element sub-tree
+ /// that needs to be processed.
+ ///
+ /// @return the current XML node in the corpus element sub-tree that
+ /// needs to be processed.
xmlNodePtr
get_corpus_node() const
{return m_corp_node;}
+ /// Setter of the current XML node in the corpus element sub-tree
+ /// that needs to be processed.
+ ///
+ /// @param node set the current XML node in the corpus element
+ /// sub-tree that needs to be processed.
void
set_corpus_node(xmlNodePtr node)
{m_corp_node = node;}
@@ -1485,7 +1495,7 @@ read_translation_unit_from_input(read_context& ctxt)
else
{
node = 0;
- for (xmlNodePtr n = ctxt.get_corpus_node()->next; n; n = n->next)
+ for (xmlNodePtr n = ctxt.get_corpus_node(); n; n = n->next)
{
if (!n
|| n->type != XML_ELEMENT_NODE)
@@ -1501,15 +1511,17 @@ read_translation_unit_from_input(read_context& ctxt)
return nil;
tu = get_or_read_and_add_translation_unit(ctxt, node);
- // So read_translation_unit() can trigger (under the hood) reading
- // from several translation units just because
- // read_context::get_scope_for_node() has been called. In that
- // case, after that unexpected call to read_translation_unit(), the
- // current corpus node of the context is going to point to that
- // translation unit that has been read under the hood. Let's set
- // the corpus node to the one we initially called
- // read_translation_unit() on here.
- ctxt.set_corpus_node(node);
+
+ if (ctxt.get_corpus_node())
+ {
+ // We are not in the mode where the current corpus node came
+ // from a local invocation of xmlTextReaderExpand. So let's set
+ // ctxt.get_corpus_node to the next child element node of the
+ // corpus that needs to be processed.
+ node = xml::advance_to_next_sibling_element(node);
+ ctxt.set_corpus_node(node);
+ }
+
return tu;
}
@@ -1583,7 +1595,7 @@ read_symbol_db_from_input(read_context& ctxt,
xmlTextReaderNext(reader.get());
}
else
- for (xmlNodePtr n = ctxt.get_corpus_node()->next; n; n = n->next)
+ for (xmlNodePtr n = ctxt.get_corpus_node(); n; n = n->next)
{
if (!n || n->type != XML_ELEMENT_NODE)
continue;
@@ -1594,8 +1606,11 @@ read_symbol_db_from_input(read_context& ctxt,
else if (xmlStrEqual(n->name, BAD_CAST("elf-variable-symbols")))
has_var_syms = true;
else
- break;
- ctxt.set_corpus_node(n);
+ {
+ ctxt.set_corpus_node(n);
+ break;
+ }
+
if (has_fn_syms)
{
fn_symdb = build_elf_symbol_db(ctxt, n, true);
@@ -1688,7 +1703,7 @@ read_elf_needed_from_input(read_context& ctxt,
}
else
{
- for (xmlNodePtr n = ctxt.get_corpus_node()->next; n; n = n->next)
+ for (xmlNodePtr n = ctxt.get_corpus_node(); n; n = n->next)
{
if (!n || n->type != XML_ELEMENT_NODE)
continue;
@@ -1703,6 +1718,7 @@ read_elf_needed_from_input(read_context& ctxt,
if (node)
{
result = build_needed(node, needed);
+ node = xml::advance_to_next_sibling_element(node);
ctxt.set_corpus_node(node);
}
@@ -1806,6 +1822,8 @@ read_corpus_from_input(read_context& ctxt)
if (!reader)
return nil;
+ // This is to remember to call xmlTextReaderNext if we ever call
+ // xmlTextReaderExpand.
bool call_reader_next = false;
xmlNodePtr node = ctxt.get_corpus_node();
@@ -1907,10 +1925,14 @@ read_corpus_from_input(read_context& ctxt)
corp.set_soname(reinterpret_cast<char*>(soname_str.get()));
}
- if (!node->children)
- return nil;
-
- ctxt.set_corpus_node(node->children);
+ // If the corpus element node has children nodes, make
+ // ctxt.get_corpus_node() return the first child element node of
+ // the corpus element that *needs* to be processed.
+ if (node->children)
+ {
+ xmlNodePtr n = xml::advance_to_next_sibling_element(node->children);
+ ctxt.set_corpus_node(n);
+ }
corpus& corp = *ctxt.get_corpus();
@@ -1966,6 +1988,10 @@ read_corpus_from_input(read_context& ctxt)
// This is the necessary counter-part of the xmlTextReaderExpand()
// call at the beginning of the function.
xmlTextReaderNext(reader.get());
+ // The call above invalidates the xml node returned by
+ // xmlTextReaderExpand, which can still be accessed via
+ // ctxt.get_corpus_node.
+ ctxt.set_corpus_node(0);
}
else
{
@@ -1974,7 +2000,8 @@ read_corpus_from_input(read_context& ctxt)
if (!node)
{
node = ctxt.get_corpus_node();
- node = xml::advance_to_next_sibling_element(node->parent);
+ if (node)
+ node = xml::advance_to_next_sibling_element(node->parent);
}
ctxt.set_corpus_node(node);
}
@@ -206,6 +206,10 @@ test-abidiff-exit/test-crc-v1.abi \
test-abidiff-exit/test-missing-alias-report.txt \
test-abidiff-exit/test-missing-alias.abi \
test-abidiff-exit/test-missing-alias.suppr \
+test-abidiff/test-PR27616-squished-v0.abi \
+test-abidiff/test-PR27616-squished-v1.abi \
+test-abidiff/test-PR27616-v0.xml \
+test-abidiff/test-PR27616-v1.xml \
\
test-diff-dwarf/test0-v0.cc \
test-diff-dwarf/test0-v0.o \
new file mode 100644
@@ -0,0 +1,43 @@
+<abi-corpus version='2.0' path='data/test-read-dwarf/test6.so'>
+ <elf-needed>
+ <dependency name='libstdc++.so.6'/>
+ <dependency name='libm.so.6'/>
+ <dependency name='libgcc_s.so.1'/>
+ <dependency name='libc.so.6'/>
+ </elf-needed>
+ <elf-function-symbols>
+ <elf-symbol name='_Z3barv' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/>
+ <elf-symbol name='_Z4blehv' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/>
+ <elf-symbol name='_ZN1B3fooEv' type='func-type' binding='weak-binding' visibility='default-visibility' is-defined='yes'/>
+ <elf-symbol name='_fini' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/>
+ <elf-symbol name='_init' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/>
+ </elf-function-symbols>
+ <elf-variable-symbols>
+ <elf-symbol name='_ZN1CIiE3barE' size='4' type='object-type' binding='gnu-unique-binding' visibility='default-visibility' is-defined='yes'/>
+ <elf-symbol name='_ZZN1B3fooEvE1a' size='4' type='object-type' binding='gnu-unique-binding' visibility='default-visibility' is-defined='yes'/>
+ </elf-variable-symbols>
+ <abi-instr address-size='64' path='test6.cc' comp-dir-path='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf' language='LANG_C_plus_plus'>
+ <type-decl name='int' size-in-bits='32' id='type-id-1'/>
+ <class-decl name='B' size-in-bits='8' is-struct='yes' visibility='default' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='9' column='1' id='type-id-2'>
+ <member-function access='public'>
+ <function-decl name='foo' mangled-name='_ZN1B3fooEv' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='11' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_ZN1B3fooEv'>
+ <parameter type-id='type-id-3' name='this' is-artificial='yes'/>
+ <return type-id='type-id-1'/>
+ </function-decl>
+ </member-function>
+ </class-decl>
+ <class-decl name='C<int>' size-in-bits='8' is-struct='yes' visibility='default' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='26' column='1' id='type-id-4'>
+ <data-member access='public' static='yes'>
+ <var-decl name='bar' type-id='type-id-1' mangled-name='_ZN1CIiE3barE' visibility='default' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='31' column='1' elf-symbol-id='_ZN1CIiE3barE'/>
+ </data-member>
+ </class-decl>
+ <pointer-type-def type-id='type-id-2' size-in-bits='64' id='type-id-5'/>
+ <qualified-type-def type-id='type-id-5' const='yes' id='type-id-3'/>
+ <function-decl name='bar' mangled-name='_Z3barv' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='19' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_Z3barv'>
+ <return type-id='type-id-1'/>
+ </function-decl>
+ <function-decl name='bleh' mangled-name='_Z4blehv' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='34' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_Z4blehv'>
+ <return type-id='type-id-1'/>
+ </function-decl>
+ </abi-instr>
+</abi-corpus>
new file mode 100644
@@ -0,0 +1 @@
+<abi-corpus version='2.0' path='data/test-read-dwarf/test6.so'><elf-needed><dependency name='libstdc++.so.6'/><dependency name='libm.so.6'/><dependency name='libgcc_s.so.1'/><dependency name='libc.so.6'/></elf-needed><elf-function-symbols><elf-symbol name='_Z3barv' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/><elf-symbol name='_Z4blehv' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/><elf-symbol name='_ZN1B3fooEv' type='func-type' binding='weak-binding' visibility='default-visibility' is-defined='yes'/><elf-symbol name='_fini' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/><elf-symbol name='_init' type='func-type' binding='global-binding' visibility='default-visibility' is-defined='yes'/></elf-function-symbols><elf-variable-symbols><elf-symbol name='_ZN1CIiE3barE' size='4' type='object-type' binding='gnu-unique-binding' visibility='default-visibility' is-defined='yes'/><elf-symbol name='_ZZN1B3fooEvE1a' size='4' type='object-type' binding='gnu-unique-binding' visibility='default-visibility' is-defined='yes'/></elf-variable-symbols><abi-instr address-size='64' path='test6.cc' comp-dir-path='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf' language='LANG_C_plus_plus'><type-decl name='int' size-in-bits='32' id='type-id-1'/><class-decl name='B' size-in-bits='8' is-struct='yes' visibility='default' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='9' column='1' id='type-id-2'><member-function access='public'><function-decl name='foo' mangled-name='_ZN1B3fooEv' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='11' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_ZN1B3fooEv'><parameter type-id='type-id-3' name='this' is-artificial='yes'/><return 
type-id='type-id-1'/></function-decl></member-function></class-decl><class-decl name='C<int>' size-in-bits='8' is-struct='yes' visibility='default' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='26' column='1' id='type-id-4'><data-member access='public' static='yes'><var-decl name='bar' type-id='type-id-1' mangled-name='_ZN1CIiE3barE' visibility='default' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='31' column='1' elf-symbol-id='_ZN1CIiE3barE'/></data-member></class-decl><pointer-type-def type-id='type-id-2' size-in-bits='64' id='type-id-5'/><qualified-type-def type-id='type-id-5' const='yes' id='type-id-3'/><function-decl name='bar' mangled-name='_Z3barv' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='19' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_Z3barv'><return type-id='type-id-1'/></function-decl><function-decl name='bleh' mangled-name='_Z4blehv' filepath='/home/skumari/Tasks/source_repo/dodji/libabigail/tests/data/test-read-dwarf/test6.cc' line='34' column='1' visibility='default' binding='global' size-in-bits='64' elf-symbol-id='_Z4blehv'><return type-id='type-id-1'/></function-decl></abi-instr></abi-corpus>
new file mode 100644
@@ -0,0 +1,4 @@
+<abi-corpus-group architecture='elf-arm-aarch64'>
+ <abi-corpus path='vmlinux' architecture='elf-arm-aarch64'>
+ </abi-corpus>
+</abi-corpus-group>
new file mode 100644
@@ -0,0 +1,3 @@
+<abi-corpus-group architecture='elf-arm-aarch64'>
+ <abi-corpus path='vmlinux' architecture='elf-arm-aarch64'/>
+</abi-corpus-group>
@@ -128,6 +128,18 @@ static InOutSpec specs[] =
"data/test-abidiff/test-crc-report.txt",
"output/test-abidiff/test-crc-report-1-2.txt"
},
+ {
+ "data/test-abidiff/test-PR27616-v0.xml",
+ "data/test-abidiff/test-PR27616-v1.xml",
+ "data/test-abidiff/empty-report.txt",
+ "output/test-abidiff/empty-report.txt"
+ },
+ {
+ "data/test-abidiff/test-PR27616-squished-v0.abi",
+ "data/test-abidiff/test-PR27616-squished-v1.abi",
+ "data/test-abidiff/empty-report.txt",
+ "output/test-abidiff/empty-report.txt"
+ },
// This should be the last entry.
{0, 0, 0, 0}
};
@@ -440,11 +440,16 @@ main(int argc, char* argv[])
else
{
if (type == abigail::tools_utils::FILE_TYPE_XML_CORPUS
- ||type == abigail::tools_utils::FILE_TYPE_XML_CORPUS_GROUP
+ || type == abigail::tools_utils::FILE_TYPE_XML_CORPUS_GROUP
|| type == abigail::tools_utils::FILE_TYPE_ELF)
{
if (!opts.noout)
- is_ok = write_corpus(*ctxt, corp, 0);
+ {
+ if (corp)
+ is_ok = write_corpus(*ctxt, corp, 0);
+ else if (group)
+ is_ok = write_corpus_group(*ctxt, group, 0);
+ }
}
}