Message ID | 20201021110815.4025428-1-maennich@google.com |
---|---|
State | New |
Headers | show |
Series | [1/3] Stabilise sort of canonical types | expand |
Hi. On Wed, 21 Oct 2020 at 12:09, Matthias Maennich <maennich@google.com> wrote: > Location expressions might occasionally be unset or invalid. E.g. a > reason for that are thread local variables that do not exactly have a > location to refer to. Compilers/Linkers may choose an odd > representation. E.g. see the dwarfdump output for the added testcase > based on libandroid.so (from AOSP). > > $ dwarfdump libandroid.so|egrep -B9 "gChoreographerE$" > > LOCAL_SYMBOLS: > < 1><0x00000022> DW_TAG_namespace > DW_AT_name android > < 2><0x00000027> DW_TAG_variable > DW_AT_name gChoreographer > DW_AT_type <0x00000b65> > DW_AT_decl_file 0x00000003 .../choreographer.cpp > DW_AT_decl_line 0x00000059 > DW_AT_location len 0x0000: : > DW_AT_linkage_name _ZN7androidL14gChoreographerE > > The DW_AT_location is properly read by elfutils' dwarf_location(), but > is not useful for us to proceed with. Hence early exit on this. > > * src/abg-dwarf-reader.cc (die_location_address): Ignore invalid > location expressions. > * tests/data/Makefile.am: Add new test files. > * tests/data/test-read-dwarf/test-libandroid.so: New test file. > * tests/data/test-read-dwarf/test-libandroid.so.abi: Likewise. > * tests/test-read-dwarf.cc: Add new test case. > > Reported-by: Dan Albert <danalbert@google.com> > Signed-off-by: Matthias Maennich <maennich@google.com> > --- > src/abg-dwarf-reader.cc | 5 + > tests/data/Makefile.am | 2 + > tests/data/test-read-dwarf/test-libandroid.so | Bin 0 -> 2579424 bytes > .../test-read-dwarf/test-libandroid.so.abi | 86949 ++++++++++++++++ > tests/test-read-dwarf.cc | 7 + > 5 files changed, 86963 insertions(+) > create mode 100644 tests/data/test-read-dwarf/test-libandroid.so > create mode 100644 tests/data/test-read-dwarf/test-libandroid.so.abi > > diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc > index 7257052e6e45..c65e1b921945 100644 > --- a/src/abg-dwarf-reader.cc > +++ b/src/abg-dwarf-reader.cc > @@ -10358,6 +10358,11 @@ die_location_address(Dwarf_Die* die, > if (!die_location_expr(die, DW_AT_location, &expr, &expr_len)) > return false; > > + // Ignore invalid location expressions where reading them succeeded, > but they > + // are set to 0x0 in DWARF. > + if (!expr) > + return false; > + > int64_t addr = 0; > if (!eval_last_constant_dwarf_sub_expr(expr, expr_len, addr, > is_tls_address)) > return false; > diff --git a/tests/data/Makefile.am b/tests/data/Makefile.am > index 40a575faa60b..34e7a0f4212f 100644 > --- a/tests/data/Makefile.am > +++ b/tests/data/Makefile.am > @@ -520,6 +520,8 @@ test-read-dwarf/test-PR26568-2.o \ > test-read-dwarf/test-PR26568-1.o \ > test-read-dwarf/test-PR26568-1.o.abi \ > test-read-dwarf/test-PR26568-2.o.abi \ > +test-read-dwarf/test-libandroid.so \ > +test-read-dwarf/test-libandroid.so.abi \ > \ > test-annotate/test0.abi \ > test-annotate/test1.abi \ > [snip due to size] My only concern is that this is a large test case. I hope we get more use out of it than just this one issue. Reviewed-by: Giuliano Procida <gprocida@google.com> Regards, Giuliano.
Hi, On Wed, 2020-10-21 at 16:49 +0100, Giuliano Procida via Libabigail wrote: > On Wed, 21 Oct 2020 at 12:09, Matthias Maennich <maennich@google.com> > wrote: > > > Location expressions might occasionally be unset or invalid. E.g. a > > reason for that are thread local variables that do not exactly have a > > location to refer to. Compilers/Linkers may choose an odd > > representation. E.g. see the dwarfdump output for the added testcase > > based on libandroid.so (from AOSP). > > > > $ dwarfdump libandroid.so|egrep -B9 "gChoreographerE$" > > > > LOCAL_SYMBOLS: > > < 1><0x00000022> DW_TAG_namespace > > DW_AT_name android > > < 2><0x00000027> DW_TAG_variable > > DW_AT_name gChoreographer > > DW_AT_type <0x00000b65> > > DW_AT_decl_file 0x00000003 .../choreographer.cpp > > DW_AT_decl_line 0x00000059 > > DW_AT_location len 0x0000: : > > DW_AT_linkage_name _ZN7androidL14gChoreographerE > > > > The DW_AT_location is properly read by elfutils' dwarf_location(), but > > is not useful for us to proceed with. Hence early exit on this. For some reason I missed the original post, so I don't fully understand what is going on here. Would it be possible to make this libandroid.so available somewhere? Thanks, Mark
Hi! On Wed, Oct 21, 2020 at 08:39:16PM +0200, Mark Wielaard wrote: >Hi, > >On Wed, 2020-10-21 at 16:49 +0100, Giuliano Procida via Libabigail wrote: >> On Wed, 21 Oct 2020 at 12:09, Matthias Maennich <maennich@google.com> >> wrote: >> >> > Location expressions might occasionally be unset or invalid. E.g. a >> > reason for that are thread local variables that do not exactly have a >> > location to refer to. Compilers/Linkers may choose an odd >> > representation. E.g. see the dwarfdump output for the added testcase >> > based on libandroid.so (from AOSP). >> > >> > $ dwarfdump libandroid.so|egrep -B9 "gChoreographerE$" >> > >> > LOCAL_SYMBOLS: >> > < 1><0x00000022> DW_TAG_namespace >> > DW_AT_name android >> > < 2><0x00000027> DW_TAG_variable >> > DW_AT_name gChoreographer >> > DW_AT_type <0x00000b65> >> > DW_AT_decl_file 0x00000003 .../choreographer.cpp >> > DW_AT_decl_line 0x00000059 >> > DW_AT_location len 0x0000: : >> > DW_AT_linkage_name _ZN7androidL14gChoreographerE >> > >> > The DW_AT_location is properly read by elfutils' dwarf_location(), but >> > is not useful for us to proceed with. Hence early exit on this. > >For some reason I missed the original post, so I don't fully understand >what is going on here. Would it be possible to make this libandroid.so >available somewhere? Patch 2/3 got stuck in the moderation queue as it exceeded the message size limit. I staged all patches at https://github.com/metti/libabigail/commits/libandroid 1/3 https://github.com/metti/libabigail/commit/b8a067e1d2c90bd4677cab53a4e01208f3a78f37 2/3 https://github.com/metti/libabigail/commit/e001e06a37036429450b3802b8c26b820b2e464f 3/3 https://github.com/metti/libabigail/commit/4af2c640c9ce76027d1617d5ed61a8df05ee8aee Cheers, Matthias > >Thanks, > >Mark
Hi Matthias, On Thu, 2020-10-22 at 09:50 +0100, Matthias Maennich wrote: > On Wed, Oct 21, 2020 at 08:39:16PM +0200, Mark Wielaard wrote: > > On Wed, 2020-10-21 at 16:49 +0100, Giuliano Procida via Libabigail wrote: > > > On Wed, 21 Oct 2020 at 12:09, Matthias Maennich <maennich@google.com> > > > wrote: > > > > > > > Location expressions might occasionally be unset or invalid. E.g. a > > > > reason for that are thread local variables that do not exactly have a > > > > location to refer to. Compilers/Linkers may choose an odd > > > > representation. E.g. see the dwarfdump output for the added testcase > > > > based on libandroid.so (from AOSP). > > > > > > > > $ dwarfdump libandroid.so|egrep -B9 "gChoreographerE$" > > > > > > > > LOCAL_SYMBOLS: > > > > < 1><0x00000022> DW_TAG_namespace > > > > DW_AT_name android > > > > < 2><0x00000027> DW_TAG_variable > > > > DW_AT_name gChoreographer > > > > DW_AT_type <0x00000b65> > > > > DW_AT_decl_file 0x00000003 .../choreographer.cpp > > > > DW_AT_decl_line 0x00000059 > > > > DW_AT_location len 0x0000: : > > > > DW_AT_linkage_name _ZN7androidL14gChoreographerE > > > > > > > > The DW_AT_location is properly read by elfutils' dwarf_location(), but > > > > is not useful for us to proceed with. Hence early exit on this. > > > > For some reason I missed the original post, so I don't fully understand > > what is going on here. Would it be possible to make this libandroid.so > > available somewhere? > > Patch 2/3 got stuck in the moderation queue as it exceeded the message > size limit. I staged all patches at > > https://github.com/metti/libabigail/commits/libandroid > > 1/3 https://github.com/metti/libabigail/commit/b8a067e1d2c90bd4677cab53a4e01208f3a78f37 > 2/3 https://github.com/metti/libabigail/commit/e001e06a37036429450b3802b8c26b820b2e464f > 3/3 https://github.com/metti/libabigail/commit/4af2c640c9ce76027d1617d5ed61a8df05ee8aee Thanks, I believe I understand now what is going on. gChoreographer has a empty location expression (zero length block). It is correct to interpret these as not having a location expression at all. The spec says "An empty location description consists of a DWARF expression containing no operations. It represents a piece or all of an object that is present in the source but not in the object code" But I think it would be better to make die_location_expr () return false in case expr_len == 0, instead of your check. The commit message and comment make it sound like the expression isn't invalid, but it is actually well defined. BTW. I think that if this variable is actually a thread local variable then it would be better for the DWARF producer to produce an actual (non-empty) location expression using DW_OP_GNU_push_tls_address or DW_OP_form_tls_address, which abg-dwarf-reader should handle correctly. Cheers, Mark
Hi Mark, On Thu, Oct 22, 2020 at 02:25:20PM +0200, Mark Wielaard wrote: >Hi Matthias, > >On Thu, 2020-10-22 at 09:50 +0100, Matthias Maennich wrote: >> On Wed, Oct 21, 2020 at 08:39:16PM +0200, Mark Wielaard wrote: >> > On Wed, 2020-10-21 at 16:49 +0100, Giuliano Procida via Libabigail wrote: >> > > On Wed, 21 Oct 2020 at 12:09, Matthias Maennich <maennich@google.com> >> > > wrote: >> > > >> > > > Location expressions might occasionally be unset or invalid. E.g. a >> > > > reason for that are thread local variables that do not exactly have a >> > > > location to refer to. Compilers/Linkers may choose an odd >> > > > representation. E.g. see the dwarfdump output for the added testcase >> > > > based on libandroid.so (from AOSP). >> > > > >> > > > $ dwarfdump libandroid.so|egrep -B9 "gChoreographerE$" >> > > > >> > > > LOCAL_SYMBOLS: >> > > > < 1><0x00000022> DW_TAG_namespace >> > > > DW_AT_name android >> > > > < 2><0x00000027> DW_TAG_variable >> > > > DW_AT_name gChoreographer >> > > > DW_AT_type <0x00000b65> >> > > > DW_AT_decl_file 0x00000003 .../choreographer.cpp >> > > > DW_AT_decl_line 0x00000059 >> > > > DW_AT_location len 0x0000: : >> > > > DW_AT_linkage_name _ZN7androidL14gChoreographerE >> > > > >> > > > The DW_AT_location is properly read by elfutils' dwarf_location(), but >> > > > is not useful for us to proceed with. Hence early exit on this. >> > >> > For some reason I missed the original post, so I don't fully understand >> > what is going on here. Would it be possible to make this libandroid.so >> > available somewhere? >> >> Patch 2/3 got stuck in the moderation queue as it exceeded the message >> size limit. I staged all patches at >> >> https://github.com/metti/libabigail/commits/libandroid >> >> 1/3 https://github.com/metti/libabigail/commit/b8a067e1d2c90bd4677cab53a4e01208f3a78f37 >> 2/3 https://github.com/metti/libabigail/commit/e001e06a37036429450b3802b8c26b820b2e464f >> 3/3 https://github.com/metti/libabigail/commit/4af2c640c9ce76027d1617d5ed61a8df05ee8aee > >Thanks, I believe I understand now what is going on. gChoreographer has >a empty location expression (zero length block). > >It is correct to interpret these as not having a location expression at >all. The spec says "An empty location description consists of a DWARF >expression containing no operations. It represents a piece or all of an >object that is present in the source but not in the object code" > >But I think it would be better to make die_location_expr () return >false in case expr_len == 0, instead of your check. The commit message >and comment make it sound like the expression isn't invalid, but it is >actually well defined. I will attempt that. Thanks for the hint. > >BTW. I think that if this variable is actually a thread local variable >then it would be better for the DWARF producer to produce an actual >(non-empty) location expression using DW_OP_GNU_push_tls_address or >DW_OP_form_tls_address, which abg-dwarf-reader should handle correctly. I never got to reproduce this with any other binary other than this particular version of the library. A simple C++ example: | static thread_local int i = 1; | void test() { i++; } produces DW_AT_location len 0x000a: 0e0000000000000000e0: DW_OP_const8u 0 DW_OP_GNU_push_tls_address for clang/gcc 10 as you expected and causes no issue. Cheers, Matthias > >Cheers, > >Mark
Matthias Maennich <maennich@google.com> a écrit: > From: Giuliano Procida <gprocida@google.com> > > Some types like unnamed-enum-underlying-type are not distinguished by > type_topo_comp. This can result in nondeterministic output and flakey > tests. > > While a more complete ordering from type_topo_comp would be nice, the > nondeterminism can reduced by preserving the relative order of > identically-named types. > > * src/abg-ir.cc (scope_decl::get_sorted_canonical_types): Sort > canonical types with std::stable_sort(..., type_topo_comp()). Interesting. Did this reduce the non-deterministic result you were seeing? I think a follow-up patch to this one could be to change enum_type_decl::get_pretty_representation to make it use a "flat representation" of enums when the enums is anonymous, like what we do in union_decl::get_pretty_representation. That should reduce the variance we see in the result of sorting. In the mean time, I guess this change shouldn't hurt. Thanks for looking into it! > Signed-off-by: Giuliano Procida <gprocida@google.com> > Reviewed-by: Matthias Maennich <maennich@google.com> > Signed-off-by: Matthias Maennich <maennich@google.com> Applied to master. Cheers, [...]
Hi Dodji. On Wed, 28 Oct 2020 at 15:28, Dodji Seketeli <dodji@seketeli.org> wrote: > Matthias Maennich <maennich@google.com> a écrit: > > > From: Giuliano Procida <gprocida@google.com> > > > > Some types like unnamed-enum-underlying-type are not distinguished by > > type_topo_comp. This can result in nondeterministic output and flakey > > tests. > > > > While a more complete ordering from type_topo_comp would be nice, the > > nondeterminism can reduced by preserving the relative order of > > identically-named types. > > > > * src/abg-ir.cc (scope_decl::get_sorted_canonical_types): Sort > > canonical types with std::stable_sort(..., type_topo_comp()). > > Interesting. Did this reduce the non-deterministic result you were > seeing? > > It reduced, but did not eliminate, the non-determinism. > I think a follow-up patch to this one could be to change > enum_type_decl::get_pretty_representation to make it use a "flat > representation" of enums when the enums is anonymous, like what we do in > union_decl::get_pretty_representation. That should reduce the variance > we see in the result of sorting. > > Interesting. Currently all *underlying* types of enums are considered anonymous. It looks like this was done to address a canonicalisation issue with such types all having the same name. However, there's also dormant code to add the enum name (if non anonymous) to the underlying type name. I was looking at a patch to add the bit size to the end of the name of enum-underlying types and in fact it's something we're likely to use locally as it does eliminate the test non-determinism. This is not ideal as it causes one-off churn to XML and the name switch is picked up as a (normally filtered) harmless change. Finally, I did a brief experiment which failed to cause any test failures. --- a/src/abg-comparison.cc +++ b/src/abg-comparison.cc @@ -4199,7 +4199,7 @@ compute_diff(const enum_type_decl_sptr first, ABG_ASSERT(first->get_environment() == second->get_environment()); diff_sptr ud = compute_diff_for_types(first->get_underlying_type(), - second->get_underlying_type(), + first->get_underlying_type(), ctxt); enum_diff_sptr d(new enum_diff(first, second, ud, ctxt)); I did this due to dubious diff output with my own testing. I'll try to contribute some enum test cases and perhaps even a diff presentation fix. In the mean time, I guess this change shouldn't hurt. Thanks for > looking into it! > You're welcome. Giuliano. > > > Signed-off-by: Giuliano Procida <gprocida@google.com> > > Reviewed-by: Matthias Maennich <maennich@google.com> > > Signed-off-by: Matthias Maennich <maennich@google.com> > > Applied to master. > > Cheers, > > [...] > > -- > Dodji >
diff --git a/src/abg-ir.cc b/src/abg-ir.cc index 8c04797f574c..e491528922d3 100644 --- a/src/abg-ir.cc +++ b/src/abg-ir.cc @@ -6008,9 +6008,9 @@ scope_decl::get_sorted_canonical_types() const priv_->sorted_canonical_types_.push_back(*e); type_topo_comp comp; - std::sort(priv_->sorted_canonical_types_.begin(), - priv_->sorted_canonical_types_.end(), - comp); + std::stable_sort(priv_->sorted_canonical_types_.begin(), + priv_->sorted_canonical_types_.end(), + comp); } return priv_->sorted_canonical_types_; }