[0/3] Type equality refactor and instrumentation

Message ID 20200708095315.948634-1-gprocida@google.com
Headers
Series Type equality refactor and instrumentation |

Message

Giuliano Procida July 8, 2020, 9:53 a.m. UTC
  Hi Dodji.

This series refactors various operator== methods making their common
functionality evident.

The first patch is just a prelude to make the second smaller.

The second patch does the refactoring. I'm not attached to the name
'equality_helper'.

The third patch is not intended for direct inclusion in libabigail but
builds on the refactoring to investigate how equality and canonical
types work in practice. It identifies some potential discrepancies,
but they may be entirely expected.

In general, it can be risky to define operator== in a way where
reflexivity, symmetry and transitivity do not obviously hold or can be
sensitive to small code changes or in way where equality, say for
class_decl_sptr, can be affected by something like canonicalisation.
More instrumentation could be added to check behaviour.

Giuliano Procida (3):
  abg-ir.cc: Tidy some operator== definitions
  abg-ir.cc: Refactor operator== methods with helper
  Add some type equality paranoia.

 src/abg-ir.cc | 185 +++++++++++++++++++++++++++++---------------------
 1 file changed, 106 insertions(+), 79 deletions(-)
  

Comments

Dodji Seketeli July 27, 2020, 7:32 a.m. UTC | #1
Hello Giuliano,

Giuliano Procida <gprocida@google.com> a écrit:

> Hi Dodji.
>
> This series refactors various operator== methods making their common
> functionality evident.
>
> The first patch is just a prelude to make the second smaller.
>
> The second patch does the refactoring. I'm not attached to the name
> 'equality_helper'.

Thank you for working on this.

I have looked into these first two patches and they look good to me in general.
I'll post their individual review as usual, shortly.

> The third patch is not intended for direct inclusion in libabigail but
> builds on the refactoring to investigate how equality and canonical
> types work in practice. It identifies some potential discrepancies,
> but they may be entirely expected.

In it's current form, I'd rather hold the inclusion of this one for now.
It's super intrusive, clutters the code quite a bit and I am not really
sure about its practical use at the moment.

> In general, it can be risky to define operator== in a way where
> reflexivity, symmetry and transitivity do not obviously hold or can be
> sensitive to small code changes or in way where equality, say for
> class_decl_sptr, can be affected by something like canonicalisation.
> More instrumentation could be added to check behaviour.

The comparison code is tricky.  The number one reason for this is that
it has to be fast.  So yes, it's risky.

One category of tests that are "simple" to perform and in practise are
quite powerful to detect and debug potential comparison issues are
"identity tests".  That is, comparing a binary against itself and
requiring that the result be the empty set.  This is why the option
"--abidiff" was added to abidw.

So I tend to favor schemes that keep the code the less cluttered and the
most "debuggable" as possible, as opposed to adding a lot of
instrumentation.  I guess some balance has to found there.

Cheers,
  
Giuliano Procida July 27, 2020, 10:55 a.m. UTC | #2
Hi Dodji.

On Mon, 27 Jul 2020 at 08:33, Dodji Seketeli <dodji@seketeli.org> wrote:
>
> Hello Giuliano,
>
> Giuliano Procida <gprocida@google.com> a écrit:
>
> > Hi Dodji.
> >
> > This series refactors various operator== methods making their common
> > functionality evident.
> >
> > The first patch is just a prelude to make the second smaller.
> >
> > The second patch does the refactoring. I'm not attached to the name
> > 'equality_helper'.
>
> Thank you for working on this.
>
> I have looked into these first two patches and they look good to me in general.
> I'll post their individual review as usual, shortly.
>
> > The third patch is not intended for direct inclusion in libabigail but
> > builds on the refactoring to investigate how equality and canonical
> > types work in practice. It identifies some potential discrepancies,
> > but they may be entirely expected.
>
> In it's current form, I'd rather hold the inclusion of this one for now.
> It's super intrusive, clutters the code quite a bit and I am not really
> sure about its practical use at the moment.

The code is absolutely not intended to be committed to master.
On top of the obvious impact on the code, tests were still running
after 60h with this turned on!

> > In general, it can be risky to define operator== in a way where
> > reflexivity, symmetry and transitivity do not obviously hold or can be
> > sensitive to small code changes or in way where equality, say for
> > class_decl_sptr, can be affected by something like canonicalisation.
> > More instrumentation could be added to check behaviour.
>
> The comparison code is tricky.  The number one reason for this is that
> it has to be fast.  So yes, it's risky.
>
> One category of tests that are "simple" to perform and in practise are
> quite powerful to detect and debug potential comparison issues are
> "identity tests".  That is, comparing a binary against itself and
> requiring that the result be the empty set.  This is why the option
> "--abidiff" was added to abidw.
>
> So I tend to favor schemes that keep the code the less cluttered and the
> most "debuggable" as possible, as opposed to adding a lot of
> instrumentation.  I guess some balance has to found there.

This was intended for one-off (or occasional) instrumentation. Running with
the instrumentation, I noticed various places where canonical and structural
comparison differ. If the comparisons are happening during canonicalisation
then things may be temporarily off. Or I may have a bug in the instrumentation.

At any rate, the commit adds code comments showing where certain discrepancies
were noted during "make check". For example, function_type nodes sometimes
compare as canonically equal but not "structurally" and vice versa.
"Structurally"
is not really fully structurally as recursive calls attempt canonical comparison
first.

If there are any you think are of concern, let me know and I can investigate
further and open a bug with findings.

I've reworked my series to account for your version of the second commit.
The instrumentation commit can be found amongst the changes in
https://github.com/myxoid/libabigail/commits/type-equality-paranoia

Regards,
Giuliano.

> Cheers,
>
> --
>                 Dodji