abidiff: improve treatment of array types in leaf changes mode

Message ID 20201203150916.3540551-1-gprocida@google.com
State New
Headers
Series abidiff: improve treatment of array types in leaf changes mode |

Commit Message

Giuliano Procida Dec. 3, 2020, 3:09 p.m. UTC
  If an array's element type doesn't change name but has some other
(local) change, then the change should not also be considered local to
the array type.

	* src/abg-ir.cc (types_have_similar_structure): When
	examining array types, always treat element type changes as
	indirect changes.
	* tests/data/Makefile.am: Add new test case files.
	* tests/data/test-abidiff-exit/test-non-leaf-array-report.txt:
	New test case showing correct --leaf-changes-only reporting.
	* tests/data/test-abidiff-exit/test-non-leaf-array-v0.c:
	Likewise.
	* tests/data/test-abidiff-exit/test-non-leaf-array-v0.o:
	Likewise.
	* tests/data/test-abidiff-exit/test-non-leaf-array-v1.c:
	Likewise.
	* tests/data/test-abidiff-exit/test-non-leaf-array-v1.o:
	Likewise.
	* tests/test-abidiff-exit.cc: Run new test case.

Signed-off-by: Giuliano Procida <gprocida@google.com>
---
 src/abg-ir.cc                                    |   2 +-
 tests/data/Makefile.am                           |   5 +++++
 .../test-non-leaf-array-report.txt               |  11 +++++++++++
 .../test-abidiff-exit/test-non-leaf-array-v0.c   |  12 ++++++++++++
 .../test-abidiff-exit/test-non-leaf-array-v0.o   | Bin 0 -> 3096 bytes
 .../test-abidiff-exit/test-non-leaf-array-v1.c   |  12 ++++++++++++
 .../test-abidiff-exit/test-non-leaf-array-v1.o   | Bin 0 -> 3072 bytes
 tests/test-abidiff-exit.cc                       |  11 +++++++++++
 8 files changed, 52 insertions(+), 1 deletion(-)
 create mode 100644 tests/data/test-abidiff-exit/test-non-leaf-array-report.txt
 create mode 100644 tests/data/test-abidiff-exit/test-non-leaf-array-v0.c
 create mode 100644 tests/data/test-abidiff-exit/test-non-leaf-array-v0.o
 create mode 100644 tests/data/test-abidiff-exit/test-non-leaf-array-v1.c
 create mode 100644 tests/data/test-abidiff-exit/test-non-leaf-array-v1.o
  

Comments

Dodji Seketeli Dec. 4, 2020, 11:02 a.m. UTC | #1
Hello Giuliano,

Giuliano Procida <gprocida@google.com> a écrit:

[...]

> diff --git a/src/abg-ir.cc b/src/abg-ir.cc
> index c6f7c13e..b0db9c39 100644
> --- a/src/abg-ir.cc
> +++ b/src/abg-ir.cc
> @@ -23606,7 +23606,7 @@ types_have_similar_structure(const type_base* first,
>  	  || ty1->get_dimension_count() != ty2->get_dimension_count()
>  	  || !types_have_similar_structure(ty1->get_element_type(),
>  					   ty2->get_element_type(),
> -					   indirect_type))
> +					   true))
>  	return false;
>  
>        return true;

I think this change is correct, thanks for spotting that.

However ...

[...]

> If an array's element type doesn't change name but has some other
> (local) change, then the change should not also be considered local to
> the array type.

I find this comment confusing, even if I think I see what you mean.

A change being local or not, is a concept that is not at the same
logical level as the concept of "type similarity" defined in the comment
of the function types_have_similar_structure.  I would say that the type
similarity concept is at a lower logical level.  So seeing this comment
that applies to a higher level concept for a change made to something

But I agree that the comments of types_have_similar_structure are
confusing as well.

So I am proposing two patches following this message, for this issue.
One patch amends the the comments for types_have_similar_structure, and
the second one is your patch, with amended comments.

And we can discuss from there as you like.

[...]

Cheers,
  
Giuliano Procida Dec. 4, 2020, 2:41 p.m. UTC | #2
Hi.

On Fri, 4 Dec 2020 at 11:02, Dodji Seketeli <dodji@seketeli.org> wrote:

> Hello Giuliano,
>
> Giuliano Procida <gprocida@google.com> a écrit:
>
> [...]
>
> > diff --git a/src/abg-ir.cc b/src/abg-ir.cc
> > index c6f7c13e..b0db9c39 100644
> > --- a/src/abg-ir.cc
> > +++ b/src/abg-ir.cc
> > @@ -23606,7 +23606,7 @@ types_have_similar_structure(const type_base*
> first,
> >         || ty1->get_dimension_count() != ty2->get_dimension_count()
> >         || !types_have_similar_structure(ty1->get_element_type(),
> >                                          ty2->get_element_type(),
> > -                                        indirect_type))
> > +                                        true))
> >       return false;
> >
> >        return true;
>
> I think this change is correct, thanks for spotting that.
>
> However ...
>
> [...]
>
> > If an array's element type doesn't change name but has some other
> > (local) change, then the change should not also be considered local to
> > the array type.
>
> I find this comment confusing, even if I think I see what you mean.
>
> A change being local or not, is a concept that is not at the same
> logical level as the concept of "type similarity" defined in the comment
> of the function types_have_similar_structure.  I would say that the type
> similarity concept is at a lower logical level.  So seeing this comment
> that applies to a higher level concept for a change made to something
>
> But I agree that the comments of types_have_similar_structure are
> confusing as well.
>
> So I am proposing two patches following this message, for this issue.
> One patch amends the the comments for types_have_similar_structure, and
> the second one is your patch, with amended comments.
>
> And we can discuss from there as you like.
>
>
I'm happy with your patches and have replied to them. However, I'll be
honest, I've found this aspect (local changes and types with
similar structure) of abidiff really hard to reason about. I'm likely
looking at things the wrong way, but here's roughly how I see things.

We want to identify "leaf" changes. I think of them more as "root" (cause)
changes, but that's just inverting the edges in some graph. These should be
(user-defined) types that have not changed name but have changed in some
way "directly" (indirectly would be if there were a more remote cause of
the same sort). There are also direct changes to symbols' types.
User-defined types with local changes have paths to the symbols whose types
they affect - they are the impacted interfaces.

Now with libabigail, there is the notion of local vs non-local change which
might not correspond exactly with the notion sketched above, but it's
certainly related. And then there are at least two more
pieces: types_have_similar_structure which is used in decisions about
locality and lastly the logic used to filter changes for consideration in
the leaf report - they have to be local plus a lot of other conditions.

The decision-making that goes into leaf reporting seems to be spread across
several bits of code and I speculate that it was an inconsistency between
two of these that resulted in diff nodes being selected for inclusion but
actually have no local diff to report (by the represent helper function).
If I knew what types_have_similar_structure had to be consistent with I
could check all the cases or, more interestingly, see if the decisions
being made in two different places could be made in just one place.

Regards,
Giuliano.


> [...]
>
> Cheers,
>
> --
>                 Dodji
>
  
Dodji Seketeli Dec. 7, 2020, 1:24 p.m. UTC | #3
Hello Giuliano,

Giuliano Procida <gprocida@google.com> a écrit:

[...]

> I've found this aspect (local changes and types with similar
> structure) of abidiff really hard to reason about. I'm likely looking
> at things the wrong way, but here's roughly how I see things.

No, I don't think you are seeing things the "wrong way".  I think there
are several ways of seeing things.

There are several goals libabigail had to fulfill.  To accommodate them
all we adopted some point of views that might make things look less
obvious that they could have been should we have fewer goals to fulfill.

So, here are some of the concepts we use in Libabigail to analyze
changes.  I'll only talk about those related to your discussion here.

An ABI artifact is either a type or a declaration.  An ABI artifact is a
composite entity, meaning that it has sub-parts that are ABI artifacts
as well.  So, a declaration has a type.  A type might have have
sub-types.

In that context, we define the concept of a local change.

A local change to an ABI artifact is a change that does not involve any
of the sub-parts of the artifact.  For instance, a local change for a
typedef would be a change to its name.  But a change to the underlying
type of the typedef is not considered local, as the underlying type is
the sub-type of the typedef.

Said otherwise, a local change to an ABI artifact is a change to the
artifact itself, not a change to any of its sub-parts.

A leaf change is then a local change "that makes sense" from the
end-user point of view, in the context of the leaf reporter.

More precisely, a leaf change:

   * must impact an exported interface of the binary we are
     analyzing.  In other words, it must be a change to a sub-part of
     an ABI artifact (a declaration) that is exported by the binary we
     are looking at.

   * doesn't contain a name change

   * is about a function, variable, class, union or enum type.

There are other restrictions, but there are technical details that are
not useful to mention at this point, think.

The reason why we introduced the concept of the "local change" is that
it's useful to fulfill goals in libabigail, independently from the leaf
reporter requirements.  It's useful to be able to perform things like
redundancy detection and proper categorization of diff nodes.  But I
think this is a topic for a separate discussion.

> We want to identify "leaf" changes. I think of them more as "root" (cause)
> changes, but that's just inverting the edges in some graph. These should be
> (user-defined) types that have not changed name but have changed in some
> way "directly" (indirectly would be if there were a more remote cause of
> the same sort).

I find your way of seeing things fairly accurate.  It's doesn't seem to
go against the view that I exposed above.


> There are also direct changes to symbols' types.

I tend to think more in terms of 'exported interfaces'.  I.e, a binary
exports functions or global variables.  Those constitute the interfaces
of the application binary.  Those interfaces have several properties.
Their type is one of their properties.  In the particular case of ELF
binaries, those exported interfaces must have ELF symbols, which another
property of those exported interfaces.

So, yes, there are also "local changes to the binary interfaces".  This
is a particular case of the definition I laid down earlier.

So I agree with you.

> User-defined types with local changes have paths to the symbols whose types
> they affect - they are the impacted interfaces.

Yes.

> Now with libabigail, there is the notion of local vs non-local change which
> might not correspond exactly with the notion sketched above, but it's
> certainly related.

I hope it's clearer now.

> And then there are at least two more
> pieces: types_have_similar_structure which is used in decisions about
> locality.

Right.  The concept of "type structure similarity" is a detail of the
concept of "local change".

> and lastly the logic used to filter changes for consideration in
> the leaf report - they have to be local plus a lot of other conditions.

Right.

> The decision-making that goes into leaf reporting seems to be spread across
> several bits of code

The pass that walks through diff nodes to detect which ones are leaf
nodes is the 'struct leaf_diff_node_marker_visitor' defined in
abg-comparison.cc.

Now, there are a *lot* of concepts involved here, I agree.  But
ultimately, I think it's related to the complexity of what we are trying
to model here.

> and I speculate that it was an inconsistency between two of these that
> resulted in diff nodes being selected for inclusion but actually have
> no local diff to report (by the represent helper function).

Stricto sensu, I think the problem was us not respecting the definition
of the "type structure similarity" relation.  As that relation is at the
*bottom* of a stack of concepts, it's not surprising that a bug in there
has impacts possibly further down the road.

At the end of the day, this is a complex network of concepts we are
dealing with.

> If I knew what types_have_similar_structure had to be consistent with I
> could check all the cases or, more interestingly, see if the decisions
> being made in two different places could be made in just one place.

Well.  In this particular case, I'd argue that
types_have_similar_structure had to be consistent with itself to begin
with :-)

So I wouldn't beat myself over this too much, seriously.  To me, what is
paramount is that we keep the whole thing as "debuggable" as possible.
Because *that* is how we progressively get better.  At least in my
experience.

Thank you for initiating this discussion.  I hope I haven't made things
even more confusing than they are at the moment.

Cheers,
  

Patch

diff --git a/src/abg-ir.cc b/src/abg-ir.cc
index c6f7c13e..b0db9c39 100644
--- a/src/abg-ir.cc
+++ b/src/abg-ir.cc
@@ -23606,7 +23606,7 @@  types_have_similar_structure(const type_base* first,
 	  || ty1->get_dimension_count() != ty2->get_dimension_count()
 	  || !types_have_similar_structure(ty1->get_element_type(),
 					   ty2->get_element_type(),
-					   indirect_type))
+					   true))
 	return false;
 
       return true;
diff --git a/tests/data/Makefile.am b/tests/data/Makefile.am
index 84447185..f1d7ee45 100644
--- a/tests/data/Makefile.am
+++ b/tests/data/Makefile.am
@@ -184,6 +184,11 @@  test-abidiff-exit/test-headers-dirs/test-headers-dir-v0.c \
 test-abidiff-exit/test-headers-dirs/test-headers-dir-v0.o \
 test-abidiff-exit/test-headers-dirs/test-headers-dir-v1.c \
 test-abidiff-exit/test-headers-dirs/test-headers-dir-v1.o \
+test-abidiff-exit/test-non-leaf-array-v0.c \
+test-abidiff-exit/test-non-leaf-array-v0.o \
+test-abidiff-exit/test-non-leaf-array-v1.c \
+test-abidiff-exit/test-non-leaf-array-v1.o \
+test-abidiff-exit/test-non-leaf-array-report.txt \
 \
 test-diff-dwarf/test0-v0.cc		\
 test-diff-dwarf/test0-v0.o			\
diff --git a/tests/data/test-abidiff-exit/test-non-leaf-array-report.txt b/tests/data/test-abidiff-exit/test-non-leaf-array-report.txt
new file mode 100644
index 00000000..f13cd0d4
--- /dev/null
+++ b/tests/data/test-abidiff-exit/test-non-leaf-array-report.txt
@@ -0,0 +1,11 @@ 
+Leaf changes summary: 1 artifact changed
+Changed leaf types summary: 1 leaf type changed
+Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 0 Added function
+Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable
+
+'struct buzz at test-non-leaf-array-v0.c:1:1' changed:
+  type size changed from 32 to 64 (in bits)
+  there are data member changes:
+    type 'int' of 'buzz::a' changed:
+      type name changed from 'int' to 'long int'
+      type size changed from 32 to 64 (in bits)
diff --git a/tests/data/test-abidiff-exit/test-non-leaf-array-v0.c b/tests/data/test-abidiff-exit/test-non-leaf-array-v0.c
new file mode 100644
index 00000000..9594c396
--- /dev/null
+++ b/tests/data/test-abidiff-exit/test-non-leaf-array-v0.c
@@ -0,0 +1,12 @@ 
+struct buzz {
+  int a;
+};
+
+struct flexible {
+  long count;
+  struct buzz lightyear[0];
+};
+
+struct flexible var;
+void fun(struct flexible flex) { (void) flex; }
+
diff --git a/tests/data/test-abidiff-exit/test-non-leaf-array-v0.o b/tests/data/test-abidiff-exit/test-non-leaf-array-v0.o
new file mode 100644
index 0000000000000000000000000000000000000000..f45eecb207f8ed7a78ae672d348ec9995b25b378
GIT binary patch
literal 3096
zcmbtW-D@0G6hC)oKV~z{n$0RUiO5tcwctz=scozzU5%y<)lv(EqDYwCo&CVw*)lsD
zH&Llb5ekA81Qql_eep%G|AC-?fP()*!B^izAM~6%=WcFpC-~rjz4x5o`Ml?T%-(wW
zm93l*ph&<597qZUxKTRb>yoWQ4d&tcr91a8-FffNyKnv^(O+=POU{;Ij!9k{9F5Nr
zpBKX3!r2Ny<VwULmM`?81f<>;1$&HGffS-Bc@|0!y^EL)MO!LM9!eh(LIWTyi!Sd5
zF^HM!s|b}Xslj2UJ&r&)MJ21c#Fvs2&LY|8KIvl8s(7+ebF9izr7Gr2bB`ik6IR7>
z&O6UL&pD!o)z5-ur{Z>P4MXHlg5zvBmW`F8uA=LTtg8T&gxIc80Q&@N*Uq71`HU9n
z!b0vi*6NE33o87n1uCL`{IeGq0GxsrnoQwos#-1zGi#*Qr$LgXRa&r&qY!^rW^t6K
z0XI)T5{?ot8b)3}4BB20$HAVryW%&cW5I=TnusFIh{Ey*ISbv}{czH2^g|#9E?&9r
zZmynlS6BRJ{1w+5B;zPt?}Skp_nIgMakINVSv%{UJ>z;E*J}r(y{OrZhtY66^2X8I
zy{P3S0md6ae;9S#aWv|6qOj%C0g;GgdYx{v7ve@Y#{2t67rvYfZbEZ7MqjwNxq04w
z>ZP#J3nH%1TYdI4vvQ395A!VcuYs7%iKSyD`z|);F~sRdBwPDJW$we==4^2b4N^FB
zq)U;9KmN&5V)wFO+@5Kl59uw`WRb-;(9~HlJ?)277HF4S(^yc}qyyGUS9}04XTj_P
z7y%aGLC+uy9(WyK5$&qlx>PS8M|@uE&-6GSc}D5h(;ZJqt$YSW#gj%=Ksc>dRazw_
zlEQDNB0R#g!oOk}@1i8yL4CJQGfwtuEd==*PI+&!e8s?7gf|TQ8<x8UPA?_VB%P-9
z_gVhfz+Y$iYo4ioeHZU(xW2BR8CQK!Cq?y^aQP^+-P{pzHjcnu#wn+s7yQxQ0N(?g
zlQ^Bbic7*t0)8t<0`ME75%_V~5Byfx7<YDhQG19w$p?)_9PS=c2XPQ}@QKk~`n@Pr
zE|>!R=5R0w@#XNNVG{cIvU#H<XuicxVRxq;2ZN9c|DV$m`qN&heyVVFW~S$#WwO=j
z1PX+iFT`ctNaIcaQ>df!YWiPgAN?#7WXh|EnK8*u`Ts_OMy4%&#^`@UU9%zfcLI^R
z>_F#=AeEm!fkf;0Gh-5dlELpu(8#o<I!%nse`Rn*^Ci}_r9NlXf0Z{>`BVM$`kM8(
zGGZcxf01YecrBf#qnUpPHCg%J=lrTql%MXdng1qY6jRlStfXt4|2qZntn#b*9mKQz
z|6+glKm2dAzhMT8=7D4bl~>t+!Ty|1!Qit0+7w9{wSEPEicD7jKF{bMeKVB5>MPY_
v=BC&;IKMi-^qryi*7T=6$y(}<R20$^=`<b9^;6s|{~x&iZ>OFqWBUIKkucXI

literal 0
HcmV?d00001

diff --git a/tests/data/test-abidiff-exit/test-non-leaf-array-v1.c b/tests/data/test-abidiff-exit/test-non-leaf-array-v1.c
new file mode 100644
index 00000000..516c8b1f
--- /dev/null
+++ b/tests/data/test-abidiff-exit/test-non-leaf-array-v1.c
@@ -0,0 +1,12 @@ 
+struct buzz {
+  long a;
+};
+
+struct flexible {
+  long count;
+  struct buzz lightyear[0];
+};
+
+struct flexible var;
+void fun(struct flexible flex) { (void) flex; }
+
diff --git a/tests/data/test-abidiff-exit/test-non-leaf-array-v1.o b/tests/data/test-abidiff-exit/test-non-leaf-array-v1.o
new file mode 100644
index 0000000000000000000000000000000000000000..298dda06b0fbf53e001eede352f049a0196e5264
GIT binary patch
literal 3072
zcmbtW-D@0G6hC(-yF1A=+iX^`X;h}7)Pgg~N^N5$>1s4>uoe~iqDYwCz59W?vt?#B
zCJ|9Y#lA=d!7BPvUyA>O_~4Tdg8zizoA@U9py$k-JGr@?;DZPD-gAEE^Pc-L`~FMU
zwn{>PA^{t4C@B=6Rn_>EZ9xMT;l|Zlzg@lc-e0%h`YA<!#&IU)>@>`Yl-CAF<4eTb
zLfBh4TLFkt%3%b%Oe)rw?I{W=A}X0hCR9(7!iI{6)5rL$(m?KMwf1fhftahmg3zp$
zYKZGiLl90y$*Lgn%^X3(IZgJZ&j-lY#S^uLW7U>xb+K4ocnI-^uxgHT$$8Fs))5UX
zdmb!16So^{7;NToaGVWv!je%}({&Xgwj1PCJp<c~i)d}1*W52Jl}=%SzPP-kVxC>1
z%&k*jzOV!^SJvW@DLh4`;*YwZlUkIH5l>6AU<F4Z{;tjAnC?Apm4HOXi5E^nZzO}B
z7erC8@9j4I4*5aZh1o0-3f7z`uY6dt(7iX3d&Blf0x@vq`VDupdC_gI`cM0-t~XAm
zp<M6FP)5TJib2#FtnaN|@GhKpy}s-9f_Ojd45CRmna18Ud}kPTy(GYRF^nc*-<^i>
zupi2<I}DQyA?DahN5lRg*_XKO?dib*tR&-`(3wop1+HvvUUHv&QMQLc$n|;6XU;JT
z52UNaHXOVTVsAk#pQzfmu`LfHPCufwsn6FIJ}GU^SGLd~g9}GG{4?<9KSfIHTM>*K
zlXv)t-a<_gS$xw>odx;fKBBTfyTqEtf>}*EWUXw)hX8XH%-@F*U;&=_0<z$~R{&0<
zU0qw3>gA(|FKYeC5Av~lPM4gnbVh3Bbe9!R8dU+|v|3eZos>ulrw4`T!EDO$Z&}7$
zm=f)vzFX%Qr#NaY1o;|HdGD~iZQvu8uNydv@TP&^WqDxWpRxR@fxp4>*F01G`YwL2
z;rhCMU|jWw)<C3sOL+PirzpSz&t`u%j=^2VDW{$n{CIziF9FU;l+6RhC2}tTzZ)b0
z`0Y3bKa!)s@5=VHzcUPb6V#=A&~8U^_lP=(g0PQ|i|#TShElm;3h+CV@mS)U;fIq%
z`uL`KaT0XiW+yq==|#a<a^e4TT1J2BmFlMo*J36=|16WOPG?Xc%zPnU){QjY^goL_
zI<KbxtL&qnWr9q(iI^Fa?3Dj6Bxq#X(g%$GFSIlpVt>v4M8JR@=v)z`^3%tUXdQoM
zOu`>w@Gc1&nYL7?iIMq_46bUv#G1C$=c4-8cte#x)laXlS%0@6CPMHBCwh$pjm-Qz
zs4L1p;QXpjl%MXdng1qY6jRlSY)UP@jQ144v&ygLcM&i0|DFBa|M35a{S7m)%>&5>
zDzCEtiv2mAf^+}19LX59eg%JlOi}+nDCi%3GnBvTE7fD>rr0g6U!7n2&d_^n`qQ2i
dE%hE1h4ff9<)gWNid*D=hwJ|?_skg6|6ia;)N%j-

literal 0
HcmV?d00001

diff --git a/tests/test-abidiff-exit.cc b/tests/test-abidiff-exit.cc
index bb0262bb..bf1facc4 100644
--- a/tests/test-abidiff-exit.cc
+++ b/tests/test-abidiff-exit.cc
@@ -337,6 +337,17 @@  InOutSpec in_out_specs[] =
     "data/test-abidiff-exit/test-headers-dirs/test-headers-dir-report-2.txt",
     "output/test-abidiff-exit/test-headers-dirs/test-headers-dir-report-2.txt"
   },
+  {
+    "data/test-abidiff-exit/test-non-leaf-array-v0.o",
+    "data/test-abidiff-exit/test-non-leaf-array-v1.o",
+    "",
+    "",
+    "",
+    "--leaf-changes-only",
+    abigail::tools_utils::ABIDIFF_ABI_CHANGE,
+    "data/test-abidiff-exit/test-non-leaf-array-report.txt",
+    "output/test-abidiff-exit/test-non-leaf-array-report.txt"
+  },
   {0, 0, 0 ,0, 0, 0, abigail::tools_utils::ABIDIFF_OK, 0, 0}
 };