From patchwork Fri Jun 12 09:59:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 39569 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id ADC5D39551D7; Fri, 12 Jun 2020 09:59:32 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org ADC5D39551D7 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1591955972; bh=liUItKTZw7cczRyGU0WDYJc1h/yLpRtDQI8qP2oOknM=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=UhgoUGv8cjDQ++yLu8Vum8E3uqT5fyXLSfNirxL9QOmWuIPHII+BtYzIeFRhB2PRP Uk+++bLrM+J+Q8E1xdNX7SmnqqwCS5k8uYRD3HF6LIWwjAzP3KsKzjXNiDtJCY7V8D uO/tzkplg7l7c4OPLBcbE52aW9Y4TuFllkjeqVDw= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qk1-x74a.google.com (mail-qk1-x74a.google.com [IPv6:2607:f8b0:4864:20::74a]) by sourceware.org (Postfix) with ESMTPS id 1E8A43954C7B for ; Fri, 12 Jun 2020 09:59:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1E8A43954C7B Received: by mail-qk1-x74a.google.com with SMTP id m29so7560730qkm.17 for ; Fri, 12 Jun 2020 02:59:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=liUItKTZw7cczRyGU0WDYJc1h/yLpRtDQI8qP2oOknM=; b=V+lxETOvUjioDfAvldKn4ebMLCkIQlAH8CT6JuaEP/cY8pGv6vXNqlpeWBt+/yE7TQ xP2iMjhzrMQHEYNuFJ72mz9pgDKWPM/dy7aaA8I7V5ZMQYBaHKJqAoc8KB/vMa1ZJNWX 5upWRPUN1bE3mLxYv1G+ypH4ubpuye3aFneaTG2m8yTJym7ipMR61xvv5yL07N/xmqir YoWM3QQEtZFXkHdq6veqPX+AIYBPVtx2rJJ79nNU7opqGtwvqXlRR2rFw2l/wxzCnuEA L7+6fKWmo4Kpq4O0MRb7TdlrKMOdzSveZ931aFvEr4mR5l9oUgveOoqXCKZ264oSKMJQ mHNw== X-Gm-Message-State: AOAM531emwuDnOrZaVFzY4E3tkXO+TqmcJvq22TPn0KC8DCD5+1a1gDY Hs9bxtYqnwIJhquKKaDMYxXw2wQYw+eU1bocvHMaLsVAfaAEGeQkd//4GpVnoqpF5Q+lIYeNAoQ /d/9f3ICOccJlqJoA6HIhPL8zxr+b+thQ0Fcq1aU8//Ym1rKK09x83mdKyesJ/6/hU+GVm14= X-Google-Smtp-Source: ABdhPJwi7OSOtp0CGnf12yXBVAprFV6xV4xnSzHQNq0Cv7qDiDFcLtiUlDPH6cdeZxJNqRy5kOEyYtSH7fvpUg== X-Received: by 2002:ad4:4309:: with SMTP id c9mr11830295qvs.50.1591955968481; Fri, 12 Jun 2020 02:59:28 -0700 (PDT) Date: Fri, 12 Jun 2020 10:59:16 +0100 In-Reply-To: <20200612095917.2146-1-gprocida@google.com> Message-Id: <20200612095917.2146-2-gprocida@google.com> Mime-Version: 1.0 References: <20200612095917.2146-1-gprocida@google.com> X-Mailer: git-send-email 2.27.0.290.gba653c62da-goog Subject: [PATCH 1/2] abg-writer: Add support for stable hash type ids. To: libabigail@sourceware.org X-Spam-Status: No, score=-24.6 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: kernel-team@android.com, gprocida@google.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" The type ids currently emitted by the XML writer are simply type-id-1, type-id-2 etc. Additions or removals of types early in this sequence result in cascading changes to many other XML elements. This commit adds support for stable type ids in the form of hashes of libabigail's internal type names. On fairly rare occasions (typically involving unnamed types), the names of two distinct types can be the same. In any case, if there is a hash collision the XML writer will find the next unused id and so preserve uniqueness. Diffs between XML files produced using --type-id-style hash will be much smaller and easier to review. * doc/manuals/abidw.rst: Replace stray documentation of --named-type-ids with documention of new --type-id-style option. * include/abg-writer.h (type_id_style_kind): Add new enum. (set_type_id_style): Add new write_context setter. (set_common_options): Set type id style in write context. * src/abg-writer.cc (stable_hash): Add new 32-bit FNV-1a hash function. (write_context): Add m_type_id_style member to record type style to use, defaulting to COUNTER. Add m_used_type_id_hashes to record already used hashes. (write_context::get_type_id_style): Add new getter. (write_context::set_type_id_style): Add new setter. (get_id_for_type): Add support for HASH style. (set_type_id_style): Add new helper function. * tools/abidw.cc (options): Add type_id_style member. (display_usage): Add description of --type-id-style option. (parse_command_line): Parse --type-id-style option. Signed-off-by: Giuliano Procida --- doc/manuals/abidw.rst | 15 +++---- include/abg-writer.h | 10 +++++ src/abg-writer.cc | 95 +++++++++++++++++++++++++++++++++++++++---- tools/abidw.cc | 20 ++++++++- 4 files changed, 125 insertions(+), 15 deletions(-) diff --git a/doc/manuals/abidw.rst b/doc/manuals/abidw.rst index e1fce997..9472cfe8 100644 --- a/doc/manuals/abidw.rst +++ b/doc/manuals/abidw.rst @@ -197,13 +197,14 @@ Options 1.8 will not set the default size and will interpret types without a size-in-bits attribute as zero sized. - * ``--named-type-ids`` - - Without this option ids used to reference types in the XML file - use simple numbers. With this option the ids used are derived - from the type name to make it easier to see which type is - referenced and make the XML file more stable in case new types are - added (without this option that might mean all id numbers change). + * ``--type-id-style`` ( ``counter`` | ``hash`` ) + + This option controls how types are idenfied in the generated XML + files. The default ``counter`` style just numbers (with + ``type-id-`` as prefix) the types in the order they are + encountered. The ``hash`` style uses a (stable, portable) hash of + the type names that libabigail uses internally and is intended + to make the XML files easier to diff. * ``--check-alternate-debug-info-base-name`` <*elf-path*> diff --git a/include/abg-writer.h b/include/abg-writer.h index 8086d828..021a95b7 100644 --- a/include/abg-writer.h +++ b/include/abg-writer.h @@ -38,6 +38,12 @@ namespace xml_writer using namespace abigail::ir; +enum type_id_style_kind +{ + COUNTER, + HASH +}; + class write_context; /// A convenience typedef for a shared pointer to write_context. @@ -74,6 +80,9 @@ set_short_locs(write_context& ctxt, bool flag); void set_write_parameter_names(write_context& ctxt, bool flag); +void +set_type_id_style(write_context& ctxt, type_id_style_kind style); + /// A convenience generic function to set common options (usually used /// by Libabigail tools) from a generic options carrying-object, into /// a given @ref write_context. @@ -96,6 +105,7 @@ set_common_options(write_context& ctxt, const OPTS& opts) set_write_parameter_names(ctxt, opts.write_parameter_names); set_short_locs(ctxt, opts.short_locs); set_write_default_sizes(ctxt, opts.default_sizes); + set_type_id_style(ctxt, opts.type_id_style); } void diff --git a/src/abg-writer.cc b/src/abg-writer.cc index bafa3024..f0dbd096 100644 --- a/src/abg-writer.cc +++ b/src/abg-writer.cc @@ -27,6 +27,7 @@ #include "config.h" #include +#include #include #include #include @@ -56,6 +57,39 @@ ABG_BEGIN_EXPORT_DECLARATIONS ABG_END_EXPORT_DECLARATIONS // +namespace +{ +/// Compute a stable string hash. +/// +/// std::hash has no portability or stability guarantees so is +/// unsuitable where reproducibility is a requirement. +/// +/// This is the 32-bit FNV-1a algorithm. The algorithm, reference code +/// and constants are all unencumbered. It is fast and has reasonable +/// distribution properties. +/// +/// https://en.wikipedia.org/wiki/Fowler-Noll-Vo_hash_function +/// +/// @param str the string to hash. +/// +/// @return an unsigned 32 bit hash value. +uint32_t +stable_hash(const std::string& str) +{ + const uint32_t prime = 0x01000193; + const uint32_t offset_basis = 0x811c9dc5; + uint32_t hash = offset_basis; + for (std::string::const_iterator i = str.begin(); i != str.end(); ++i) + { + uint8_t byte = *i; + hash = hash ^ byte; + hash = hash * prime; + } + return hash; +} + +} // namespace + namespace abigail { using std::cerr; @@ -177,7 +211,9 @@ class write_context bool m_write_parameter_names; bool m_short_locs; bool m_write_default_sizes; + type_id_style_kind m_type_id_style; mutable type_ptr_map m_type_id_map; + mutable std::unordered_set m_used_type_id_hashes; mutable type_ptr_set_type m_emitted_type_set; type_ptr_set_type m_emitted_decl_only_set; // A map of types that are referenced by emitted pointers, @@ -214,7 +250,8 @@ public: m_write_elf_needed(true), m_write_parameter_names(true), m_short_locs(false), - m_write_default_sizes(true) + m_write_default_sizes(true), + m_type_id_style(COUNTER) {} /// Getter of the environment we are operating from. @@ -374,6 +411,24 @@ public: set_show_locs(bool f) {m_show_locs = f;} + /// Getter of the "type-id-style" option. + /// + /// This option controls the kind of type ids used in XML output. + /// + /// @return the value of the "type-id-style" option. + type_id_style_kind + get_type_id_style() const + {return m_type_id_style;} + + /// Setter of the "type-id-style" option. + /// + /// This option controls the kind of type ids used in XML output. + /// + /// @param style the new value of the "type-id-style" option. + void + set_type_id_style(type_id_style_kind style) + {m_type_id_style = style;} + /// Getter of the @ref id_manager. /// /// @return the @ref id_manager used by the current instance of @ref @@ -421,14 +476,29 @@ public: c = const_cast(t); type_ptr_map::const_iterator it = m_type_id_map.find(c); - if (it == m_type_id_map.end()) + if (it != m_type_id_map.end()) + return it->second; + + switch (m_type_id_style) { - interned_string id = - get_id_manager().get_id_with_prefix("type-id-"); - m_type_id_map[c] = id; - return id; + case COUNTER: + { + interned_string id = get_id_manager().get_id_with_prefix("type-id-"); + return m_type_id_map[c] = id; + } + case HASH: + { + interned_string pretty = c->get_cached_pretty_representation(true); + size_t hash = stable_hash(*pretty.raw()); + while (!m_used_type_id_hashes.insert(hash).second) + ++hash; + std::ostringstream os; + os << std::hex << std::setfill('0') << std::setw(8) << hash; + return m_type_id_map[c] = c->get_environment()->intern(os.str()); + } } - return it->second; + ABG_ASSERT_NOT_REACHED; + return interned_string(); } string @@ -2131,6 +2201,17 @@ void set_write_default_sizes(write_context& ctxt, bool flag) {ctxt.set_write_default_sizes(flag);} +/// Set the 'type-id-style' property. +/// +/// This property controls the kind of type ids used in XML output. +/// +/// @param ctxt the context to set this property on. +/// +/// @param style the new value of the 'type-id-style' property. +void +set_type_id_style(write_context& ctxt, type_id_style_kind style) +{ctxt.set_type_id_style(style);} + /// Serialize the canonical types of a given scope. /// /// @param scope the scope to consider. diff --git a/tools/abidw.cc b/tools/abidw.cc index 9aec3112..7de31737 100644 --- a/tools/abidw.cc +++ b/tools/abidw.cc @@ -70,7 +70,10 @@ using abigail::comparison::corpus_diff_sptr; using abigail::comparison::compute_diff; using abigail::comparison::diff_context_sptr; using abigail::comparison::diff_context; +using abigail::xml_writer::COUNTER; +using abigail::xml_writer::HASH; using abigail::xml_writer::create_write_context; +using abigail::xml_writer::type_id_style_kind; using abigail::xml_writer::write_context_sptr; using abigail::xml_writer::write_corpus; using abigail::xml_reader::read_corpus_from_native_xml_file; @@ -114,6 +117,7 @@ struct options bool do_log; bool drop_private_types; bool drop_undefined_syms; + type_id_style_kind type_id_style; options() : display_version(), @@ -136,7 +140,8 @@ struct options annotate(), do_log(), drop_private_types(false), - drop_undefined_syms(false) + drop_undefined_syms(false), + type_id_style(COUNTER) {} ~options() @@ -175,6 +180,7 @@ display_usage(const string& prog_name, ostream& out) << " --no-write-default-sizes do not emit pointer size when it equals" " the default address size of the translation unit\n" << " --no-parameter-names do not show names of function parameters\n" + << " --type-id-style (counter|hash) type-id style (counter(default): type-id-number; hash: hexdigits\n" << " --check-alternate-debug-info check alternate debug info " "of \n" << " --check-alternate-debug-info-base-name check alternate " @@ -301,6 +307,18 @@ parse_command_line(int argc, char* argv[], options& opts) opts.default_sizes = false; else if (!strcmp(argv[i], "--no-parameter-names")) opts.write_parameter_names = false; + else if (!strcmp(argv[i], "--type-id-style")) + { + ++i; + if (i >= argc) + return false; + if (!strcmp(argv[i], "counter")) + opts.type_id_style = COUNTER; + else if (!strcmp(argv[i], "hash")) + opts.type_id_style = HASH; + else + return false; + } else if (!strcmp(argv[i], "--check-alternate-debug-info") || !strcmp(argv[i], "--check-alternate-debug-info-base-name")) {