Message ID | 20211203114622.2944173-4-maennich@google.com |
---|---|
State | New |
Headers |
Return-Path: <libabigail-bounces+patchwork=sourceware.org@sourceware.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B331C3858001 for <patchwork@sourceware.org>; Fri, 3 Dec 2021 12:09:16 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B331C3858001 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1638533356; bh=MQgzZiea8hoG3Vxy854j46b/3JOrXpF0ygXNMV3TFTU=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=UTgux0ocUscqYs3LhoKai7B230kK8h3tvTf+QAMYv/I4fVatThR7lrb/qmyzBkVMn j2pw4YVHKH0wYB/eU2CLloghsWp7uz9WSU7TsaTxyZDPua/Fk4xJK3gFo9Res3l6iN 8+74sxx8e8q1N5jKXiexvLV4X6oZsFIWrLCZQK58= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-wm1-x34a.google.com (mail-wm1-x34a.google.com [IPv6:2a00:1450:4864:20::34a]) by sourceware.org (Postfix) with ESMTPS id 4C0003858022 for <libabigail@sourceware.org>; Fri, 3 Dec 2021 12:09:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 4C0003858022 Received: by mail-wm1-x34a.google.com with SMTP id z138-20020a1c7e90000000b003319c5f9164so3348760wmc.7 for <libabigail@sourceware.org>; Fri, 03 Dec 2021 04:09:12 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=MQgzZiea8hoG3Vxy854j46b/3JOrXpF0ygXNMV3TFTU=; b=sE+OeDt+P33qcQMyv2qTxEejN6sd7AmzZbGoMsUOJT2CswplYgBtJ87IYCFz1SP3UE OeD/CbIiEjdvG6dcHRxAY9kUCRjVx1N8NkLH8iKPuFyb0HtUpX83tMx+CP9MEjxDl8SR RLTtp37t3HxYbhtq5kClnwoxVIxrdvpWRmh2LlVnAQGIHLaJo4Ow7hYyb6h/Jnh4+V18 gBdZk5fNCQqXnUGQ5+ARhrCFEXlBj1fKxvIa3+QukraI724kwXfXr2hQAVcqy15OU7sw XEh8POXNQ75JtysX7BznX6gRsM9hNA9jjGjI77c5PE99E/RuJrBC2mIYYO2U5SHwnxVb RHVQ== X-Gm-Message-State: AOAM532oMm6sYzcPQhOQKMKOQup8O04fUBOFCQLzj+V9UAHFn+flVgSL PbB7baRxPYKXtgdjVHnjxEpKOeB5EH9iWsLdSSQQccRra4eUEp+sd4Chid2Opdb261upnunnp7E 6e7IyOU5N1vbEXrdJoudLJP5VRL7bSPlNhu5qar5oqSZHfY5ATAvT+1RMEHRSVyAM5QCm6k4= X-Google-Smtp-Source: ABdhPJzR3TsrtUPGvjvt9lzVnRm69S/0p7NZGWIWDniyaSjBZYT35qggk3iEE0mCb+HVL2ew6YkTGHTSievRKA== X-Received: from lux.lon.corp.google.com ([2a00:79e0:d:210:47e9:78df:690b:201]) (user=maennich job=sendgmr) by 2002:a7b:ce83:: with SMTP id q3mr14473491wmj.37.1638533351314; Fri, 03 Dec 2021 04:09:11 -0800 (PST) Date: Fri, 3 Dec 2021 11:46:21 +0000 In-Reply-To: <20211203114622.2944173-1-maennich@google.com> Message-Id: <20211203114622.2944173-4-maennich@google.com> Mime-Version: 1.0 References: <20211203114622.2944173-1-maennich@google.com> X-Mailer: git-send-email 2.34.1.400.ga245620fadb-goog Subject: [PATCH 3/5] XML writer: track emitted types by bare pointer To: libabigail@sourceware.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-22.7 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project <libabigail.sourceware.org> List-Unsubscribe: <https://sourceware.org/mailman/options/libabigail>, <mailto:libabigail-request@sourceware.org?subject=unsubscribe> List-Archive: <https://sourceware.org/pipermail/libabigail/> List-Help: <mailto:libabigail-request@sourceware.org?subject=help> List-Subscribe: <https://sourceware.org/mailman/listinfo/libabigail>, <mailto:libabigail-request@sourceware.org?subject=subscribe> From: Matthias Maennich via Libabigail <libabigail@sourceware.org> Reply-To: Matthias Maennich <maennich@google.com> Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libabigail" <libabigail-bounces+patchwork=sourceware.org@sourceware.org> |
Series |
Improvements for the XML Writer
|
|
Commit Message
Matthias Männich
Dec. 3, 2021, 11:46 a.m. UTC
This is a performance and safety improvement made possible by the previous changes which ensure that the same pointers are inserted and looked up. This essentially removes the now unnecessary deep comparison. * src/abg-writer.cc (type_ptr_set_type): Change typedef container type to use default equality and hashing for pointer keys. (fn_type_ptr_set_type): Likewise. Reviewed-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Matthias Maennich <maennich@google.com> --- src/abg-writer.cc | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
Comments
Hello, Matthias Maennich <maennich@google.com> a écrit: [...] > This is a performance and safety improvement made possible by the > previous changes which ensure that the same pointers are inserted and > looked up. > > This essentially removes the now unnecessary deep comparison. [...] > +++ b/src/abg-writer.cc > @@ -123,14 +123,10 @@ typedef unordered_map<type_base*, > abigail::diff_utils::deep_ptr_eq_functor> type_ptr_map; > > // A convenience typedef for a set of type_base*. > -typedef unordered_set<const type_base*, type_hasher, > - abigail::diff_utils::deep_ptr_eq_functor> > -type_ptr_set_type; > +typedef std::unordered_set<const type_base*> type_ptr_set_type; > > /// A convenience typedef for a set of function type*. > -typedef unordered_set<function_type*, type_hasher, > - abigail::diff_utils::deep_ptr_eq_functor> > -fn_type_ptr_set_type; > +typedef std::unordered_set<function_type*> fn_type_ptr_set_type; The problem I see with doing this is that it's possible that two declaration-only classes, that are equivalent but that have different pointer values get into these sets. In that case, they would be considered different even though they are not. So maybe it would be better have an equality operator that uses is_non_canonicalized_type() to detect those rare cases and use structural comparison in those cases? What do you think?
Hi Dodji! Thanks for the review and thanks for your thoughts! On Fri, Dec 10, 2021 at 11:50:49AM +0100, Dodji Seketeli wrote: >Hello, > >Matthias Maennich <maennich@google.com> a écrit: > >[...] > >> This is a performance and safety improvement made possible by the >> previous changes which ensure that the same pointers are inserted and >> looked up. >> >> This essentially removes the now unnecessary deep comparison. > >[...] > >> +++ b/src/abg-writer.cc >> @@ -123,14 +123,10 @@ typedef unordered_map<type_base*, >> abigail::diff_utils::deep_ptr_eq_functor> type_ptr_map; >> >> // A convenience typedef for a set of type_base*. >> -typedef unordered_set<const type_base*, type_hasher, >> - abigail::diff_utils::deep_ptr_eq_functor> >> -type_ptr_set_type; >> +typedef std::unordered_set<const type_base*> type_ptr_set_type; >> >> /// A convenience typedef for a set of function type*. >> -typedef unordered_set<function_type*, type_hasher, >> - abigail::diff_utils::deep_ptr_eq_functor> >> -fn_type_ptr_set_type; >> +typedef std::unordered_set<function_type*> fn_type_ptr_set_type; > >The problem I see with doing this is that it's possible that two >declaration-only classes, that are equivalent but that have different >pointer values get into these sets. > >In that case, they would be considered different even though they are >not. If the XML writer considers two equivalent declaration-only types to be different, one question to ask is: what is the real difference, that is, how will this affect the outcome of abidiff? If the types never change (kind, name or declaration/definition status), nothing should ever be reported. If a type does change... there are two possibilities: either the types were really one type and now perhaps abidiff reports diffs for the same name in two different ways; or the types were really two different ones and abidiff has a simpler job. In my experience, abidiff doesn't always report declaration-only/defined transitions. It doesn't sound like there will be any really bad impact on diffs from having this kind of duplication. However, if someone can come up with a test case of the kind you mention, that would give some extra reassurance. > >So maybe it would be better have an equality operator that uses >is_non_canonicalized_type() to detect those rare cases and use >structural comparison in those cases? That might come at higher cost than it is beneficial. > >What do you think? For us specifically - building with clang and for our use cases - if we keep structural equality of any kind then we need a hash function to go along with this and, as we've sadly found out, this isn't working well at the moment. We are currently on a bit dated version of libabigail for our production use, but would like to close that gap again to come closer to master. The risk of infinite loops and the reality of 30x slowdowns for certain workloads mean we would need to apply these changes to remove structural equality testing from the XML writer and then maintain an Android version of libabigail as a more heavily-patched fork, to whatever extent is feasible. I would rather we find a good solution that works for all to get again close to upstream and not having to maintain such a fork. Yet, as an additional piece of assurance: the testing we have done does not only include kernels, but of course we heavily examined the libabigail test suite. Additionally, we maintain a large set of small test cases specifically created for ABI stability testing and to cover corner cases of all sorts. We are in the process of publishing those as well. So far, this has served as great input for this patch series as well. Does this make sense? What do you think? Cheers, Matthias > >-- > Dodji
Matthias Maennich <maennich@google.com> a écrit: [...] > If the XML writer considers two equivalent declaration-only types to be > different, one question to ask is: what is the real difference, that is, > how will this affect the outcome of abidiff? The problem is not necessarily at the abidiff level per say. The problem would be duplication of decl-only types in the abixml output, I think, and maybe infinite loops in those cases. The infinite loops are easy to debug, though. So I am not concerned about them. > If the types never change > (kind, name or declaration/definition status), nothing should ever be > reported. If a type does change... there are two possibilities: either > the types were really one type and now perhaps abidiff reports diffs for > the same name in two different ways; or the types were really two > different ones and abidiff has a simpler job. In my experience, abidiff > doesn't always report declaration-only/defined transitions. It doesn't > sound like there will be any really bad impact on diffs from having this > kind of duplication. However, if someone can come up with a test case of > the kind you mention, that would give some extra reassurance. The reason why I was pointing to this "general" issue is to make sure you are aware of this. As type duplications in abixml was something you guys were tracking (and rightly so) I thought I'd point out that we still have the risk here. But because the type id map (writer_context::m_type_id_map) is not affected, the duplicated types will correctly be identified as such by the reader; thus I don't think abidiff is going to be affected. >>So maybe it would be better have an equality operator that uses >>is_non_canonicalized_type() to detect those rare cases and use >>structural comparison in those cases? > > That might come at higher cost than it is beneficial. I could not tell, as I don't necessarily have the right binaries at hand. I trust you. > >> >>What do you think? > > For us specifically - building with clang and for our use cases - if we > keep structural equality of any kind then we need a hash function to go > along with this and, as we've sadly found out, this isn't working well > at the moment. We are currently on a bit dated version of libabigail for > our production use, but would like to close that gap again to come > closer to master. > > The risk of infinite loops and the reality of 30x slowdowns for certain > workloads mean we would need to apply these changes to remove structural > equality testing from the XML writer and then maintain an Android > version of libabigail as a more heavily-patched fork, to whatever extent > is feasible. I would rather we find a good solution that works for all > to get again close to upstream and not having to maintain such a fork. > > Yet, as an additional piece of assurance: the testing we have done does > not only include kernels, but of course we heavily examined the > libabigail test suite. Additionally, we maintain a large set of small > test cases specifically created for ABI stability testing and to cover > corner cases of all sorts. We are in the process of publishing those as > well. So far, this has served as great input for this patch series as > well. > > Does this make sense? What do you think? If you don't really care about the potential type duplication in the abixml as stated above, frankly, let's just get this patch in. Are you okay with that? Cheers,
Thanks Dodji for having a look and for sharing your thoughts! That is - as always - very helpful to get a good full picture! On Mon, Jan 10, 2022 at 06:00:04PM +0100, Dodji Seketeli wrote: >Matthias Maennich <maennich@google.com> a écrit: > >[...] > >> If the XML writer considers two equivalent declaration-only types to be >> different, one question to ask is: what is the real difference, that is, >> how will this affect the outcome of abidiff? > >The problem is not necessarily at the abidiff level per say. > >The problem would be duplication of decl-only types in the abixml >output, I think, and maybe infinite loops in those cases. The infinite >loops are easy to debug, though. So I am not concerned about them. > Agreed there might be some duplication. See for example the commentary about tests/data/test-read-dwarf/PR22122-libftdc.so.abi in PATCH 4/5. This series is specifically to eliminate the risk of infinite loops in the libabigail version we have downstream; and also to improve performance. After these fixes there are some more changes that should make infinite loops even less likely. In our case the infinite loops only happened when using Clang's library (hash tables) and were not so easy to debug! >> If the types never change >> (kind, name or declaration/definition status), nothing should ever be >> reported. If a type does change... there are two possibilities: either >> the types were really one type and now perhaps abidiff reports diffs for >> the same name in two different ways; or the types were really two >> different ones and abidiff has a simpler job. In my experience, abidiff >> doesn't always report declaration-only/defined transitions. It doesn't >> sound like there will be any really bad impact on diffs from having this >> kind of duplication. However, if someone can come up with a test case of >> the kind you mention, that would give some extra reassurance. > >The reason why I was pointing to this "general" issue is to make sure >you are aware of this. As type duplications in abixml was something you >guys were tracking (and rightly so) I thought I'd point out that we >still have the risk here. > >But because the type id map (writer_context::m_type_id_map) is not >affected, the duplicated types will correctly be identified as such by >the reader; thus I don't think abidiff is going to be affected. > Duplicates with different type ids could still appear after these changes. But they should not hurt abidiff and may point to problems earlier in the pipeline (even the compiler - we found a Clang bug during the investigation). Duplicates with the same type id can be conflicting or not conflicting. Not conflicting is not ideal, but abidiff can handle this. Conflicting means we have some problem interpreting the XML - which definition is the right one? PATCH 4/5 does indeed affect the type id map specifically so that we avoid the risk of conflicting definitions. >>>So maybe it would be better have an equality operator that uses >>>is_non_canonicalized_type() to detect those rare cases and use >>>structural comparison in those cases? >> >> That might come at higher cost than it is beneficial. > >I could not tell, as I don't necessarily have the right binaries at >hand. I trust you. It is having the binaries but also the tool chain (the prebuilt clang version that we use to build Android is very close to upstream releases, but differences can be subtle - as always). Clang produces different DWARF and has different bugs from GCC but the standard library is also more sensitive to how unordered_map and unordered_set are used. > >> >>> >>>What do you think? >> >> For us specifically - building with clang and for our use cases - if we >> keep structural equality of any kind then we need a hash function to go >> along with this and, as we've sadly found out, this isn't working well >> at the moment. We are currently on a bit dated version of libabigail for >> our production use, but would like to close that gap again to come >> closer to master. >> >> The risk of infinite loops and the reality of 30x slowdowns for certain >> workloads mean we would need to apply these changes to remove structural >> equality testing from the XML writer and then maintain an Android >> version of libabigail as a more heavily-patched fork, to whatever extent >> is feasible. I would rather we find a good solution that works for all >> to get again close to upstream and not having to maintain such a fork. >> >> Yet, as an additional piece of assurance: the testing we have done does >> not only include kernels, but of course we heavily examined the >> libabigail test suite. Additionally, we maintain a large set of small >> test cases specifically created for ABI stability testing and to cover >> corner cases of all sorts. We are in the process of publishing those as >> well. So far, this has served as great input for this patch series as >> well. >> >> Does this make sense? What do you think? > >If you don't really care about the potential type duplication in the >abixml as stated above, frankly, let's just get this patch in. > >Are you okay with that? Yes. Though I think it's important you are somewhat happy with PATCH 4/5 as well as they go together. Cheers, Matthias > >Cheers, > >-- > Dodji
Matthias Maennich <maennich@google.com> a écrit: > This is a performance and safety improvement made possible by the > previous changes which ensure that the same pointers are inserted and > looked up. > > This essentially removes the now unnecessary deep comparison. > > * src/abg-writer.cc (type_ptr_set_type): Change typedef > container type to use default equality and hashing for pointer > keys. > (fn_type_ptr_set_type): Likewise. > > Reviewed-by: Giuliano Procida <gprocida@google.com> > Signed-off-by: Matthias Maennich <maennich@google.com> Applied to master, thanks! [...] Cheers,
diff --git a/src/abg-writer.cc b/src/abg-writer.cc index 4cf2a7fb16af..d2ef19eb7095 100644 --- a/src/abg-writer.cc +++ b/src/abg-writer.cc @@ -123,14 +123,10 @@ typedef unordered_map<type_base*, abigail::diff_utils::deep_ptr_eq_functor> type_ptr_map; // A convenience typedef for a set of type_base*. -typedef unordered_set<const type_base*, type_hasher, - abigail::diff_utils::deep_ptr_eq_functor> -type_ptr_set_type; +typedef std::unordered_set<const type_base*> type_ptr_set_type; /// A convenience typedef for a set of function type*. -typedef unordered_set<function_type*, type_hasher, - abigail::diff_utils::deep_ptr_eq_functor> -fn_type_ptr_set_type; +typedef std::unordered_set<function_type*> fn_type_ptr_set_type; typedef unordered_map<shared_ptr<function_tdecl>, string,