From patchwork Fri Jun 11 15:33:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 43836 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 999803988408 for ; Fri, 11 Jun 2021 15:34:33 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 999803988408 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1623425673; bh=p8mcokSbbopDlaxzeaTeN1RQdO41PUBIGX855WPf4RQ=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=gBF2L517U4EPLO7VBRgjHc4Zph5x1HBrMAWR3qo9RBuvYstuPgTmx4XowO01oicSE 9kyDOVAX232D4rlCQVioJTltmj+p1zjqfZtmHmB+BYzbuBbwldmsYzyaybNoem05aC 8ZnrW5+GkmWgo9lXDxWTujUl5mTVrK4YPbBIWrE0= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qt1-x849.google.com (mail-qt1-x849.google.com [IPv6:2607:f8b0:4864:20::849]) by sourceware.org (Postfix) with ESMTPS id D45293985477 for ; Fri, 11 Jun 2021 15:34:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D45293985477 Received: by mail-qt1-x849.google.com with SMTP id l6-20020ac84cc60000b0290243ab0e481cso2153850qtv.2 for ; Fri, 11 Jun 2021 08:34:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=p8mcokSbbopDlaxzeaTeN1RQdO41PUBIGX855WPf4RQ=; b=CLMNrTk4WOjxa/Bzd/kjYeeyb1FHCScfvgxpVlOBbhv2FDzJkNPp+Y8+aGnm84yU89 OEgAhgmd8h8K6rjDNt31noCltTEzS/GbZ0Rtthp090ktg6/2AEzMAfXIxkLEcVLVMS5u bmo2CC4fZOcLemgk6fxORxWlvVH+FtJ4DHwmh+jW4E8/2DealuOIrefHmS+IqaT1ucKO jmYJ4+3BA/MBHiWpP4U+F6lRVbvQnM58edip3qb814VpgWYQVOmpbvC7esXCrSot9KTU J0N9j0FsWaFelc396qiMMRMJrp6XFCJn2sa080LNLH9uz5bCGK1rfHF/plCikuSP3T17 4szQ== X-Gm-Message-State: AOAM533zlmlock/NekmI2khOQm6QNlwPQ1Le0U2i65lMogknhJ9ghZi1 9wXPrKEMWCr4UGc5EfXmhA1kLdJjd4kALXVPHkQ5uoBNJQgpmbmGC6O76hx1Bl9uh4RVp4R8lcf Yla+o72ssWE10gFKuGQsibFC3yyuU1YTBHtE1+qOrVFTi8zCV9HiBcZfljOSCe5vx6nx31IU= X-Google-Smtp-Source: ABdhPJzR7jpw6zMC3Kr29dw/93m9pS8AyGCqyjQ4L8aSwnuDjr30cQ6golzQ/uo8dAZn8Rz67K2a65FzylX0mw== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:210:c264:c1f9:cf3c:2c0b]) (user=gprocida job=sendgmr) by 2002:a05:6214:1514:: with SMTP id e20mr5293251qvy.55.1623425661387; Fri, 11 Jun 2021 08:34:21 -0700 (PDT) Date: Fri, 11 Jun 2021 16:33:17 +0100 In-Reply-To: <20210611153319.778996-1-gprocida@google.com> Message-Id: <20210611153319.778996-5-gprocida@google.com> Mime-Version: 1.0 References: <20210611153319.778996-1-gprocida@google.com> X-Mailer: git-send-email 2.32.0.272.g935e593368-goog Subject: [PATCH 4/6] BTF: add core functionality To: libabigail@sourceware.org X-Spam-Status: No, score=-22.0 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: teguiani@android.com, maennich@google.com, Maria Teguiani , kernel-team@android.com Errors-To: libabigail-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libabigail" This commit adds core BTF functionality. The code can: - read ELF symbols using the symtab_reader - read BTF into a graph of type and symbol information - optionally perform a text dump of BTF as it's read - compute diffs between two BTF graphs as a diff graph - done using DFS and with the help of the SCC finder - create and cache full names for types - serialise a diff graph as a textual diff report - done using DFS The files are almost a straight copy of the originals. Changes: - Files renamed and #includes updated. - Header guards and include order are now in libabigail style. * COMPILING: Add note about BTF dependency. * configure.ac: Add check for . * src/Makefile.am: Add abg-btf.cc, conditional on C++17 and presence of . * include/abg-btf.h: New file. * src/abg-btf.cc: New file. Co-authored-by: Maria Teguiani Signed-off-by: Giuliano Procida --- COMPILING | 5 + configure.ac | 7 + include/abg-btf.h | 551 ++++++++++++++++++ src/Makefile.am | 9 +- src/abg-btf.cc | 1364 +++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 1935 insertions(+), 1 deletion(-) create mode 100644 include/abg-btf.h create mode 100644 src/abg-btf.cc diff --git a/COMPILING b/COMPILING index 33613a37..75a6588a 100644 --- a/COMPILING +++ b/COMPILING @@ -5,6 +5,11 @@ the moment the dependencies are the following Free Software packages: libtool libxml2 +To build the BTF tools, you will need which can be found +in the development package of: + + linux-libc + If you are building from the Git repository, then you'd also need the following packages: diff --git a/configure.ac b/configure.ac index edd03cf8..d9125d99 100644 --- a/configure.ac +++ b/configure.ac @@ -603,6 +603,13 @@ if test x$ENABLE_CXX17 = xyes; then fi AM_CONDITIONAL(ENABLE_CXX17, test x$ENABLE_CXX17 = xyes) +dnl Check for the presence of the BTF header. +HAVE_BTF=no +AC_CHECK_HEADER([linux/btf.h], + [HAVE_BTF=yes], + [AC_MSG_NOTICE([No linux/btf.h found, omitted BTF support.])]) +AM_CONDITIONAL(HAVE_BTF, test x$HAVE_BTF = xyes) + dnl Set the list of libraries libabigail depends on DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS" diff --git a/include/abg-btf.h b/include/abg-btf.h new file mode 100644 index 00000000..d47b4ac9 --- /dev/null +++ b/include/abg-btf.h @@ -0,0 +1,551 @@ +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// -*- mode: C++ -*- +// +// Copyright (C) 2020-2021 Google, Inc. +// +// Author: Maria Teguiani +// Author: Giuliano Procida + +#ifndef __ABG_BTF_H__ +#define __ABG_BTF_H__ + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "abg-ir.h" +#include "abg-scc.h" + +namespace abigail { +namespace btf { + +#define m_assert(expr, msg) assert(((void)(msg), (expr))) + +using Kind = uint32_t; + +// A Member refers to a data element of a struct or union, used in the context +// of StructUnion +struct Member { + uint32_t typeId_; + uint32_t offset_; + uint32_t bitfieldSize_; +}; + +// A Parameter refers to a variable declared in the function declaration, used +// in the context of Function +struct Parameter { + std::string name_; + uint32_t typeId_; +}; + +struct Secinfo { + uint32_t typeId_; + uint32_t offset_; + uint32_t bytesize_; +}; + +enum class ForwardDeclarationKind { STRUCT, UNION }; + +std::ostream &operator<<(std::ostream &os, ForwardDeclarationKind kind); + +class Type; + +using Comparison = std::pair; + +enum class Precedence { NIL, POINTER, ARRAY_FUNCTION, ATOMIC }; +enum class Side { LEFT, RIGHT }; + +class Name { + public: + explicit Name(std::string_view name) + : left_(name), precedence_(Precedence::NIL), right_() {} + Name(std::string_view left, Precedence precedence, std::string_view right) + : left_(left), precedence_(precedence), right_(right) {} + Name Add(Side side, Precedence precedence, std::string_view text) const; + Name Qualify(const std::set &qualifiers) const; + std::ostream &Print(std::ostream &os) const; + + private: + std::string left_; + Precedence precedence_; + std::string right_; +}; + +std::ostream &operator<<(std::ostream &os, const Name &name); + +using NameCache = std::unordered_map; + +struct DiffDetail { + DiffDetail(const std::string &text, const std::optional &edge) + : text_(text), edge_(edge) {} + std::string text_; + std::optional edge_; +}; + +using Diff = std::vector; + +struct Result { + void AddDiff(const std::string &text) { + equals_ = false; + details_.emplace_back(text, std::optional()); + } + + void AddDiff(const std::string &text, Comparison comparison) { + equals_ = false; + details_.emplace_back(text, comparison); + } + + void MaybeAddDiff(const std::string &text, + const std::pair> &p) { + equals_ &= p.first; + const auto &diff = p.second; + if (diff) + details_.emplace_back(text, diff); + } + + // Maximally powerful lazy version, takes a function that outputs an "edge" + // diff description, to be used only when a diff is present. + void MaybeAddDiff(std::function text, + const std::pair> &p) { + equals_ &= p.first; + const auto &diff = p.second; + if (diff) { + std::ostringstream os; + text(os); + details_.emplace_back(os.str(), diff); + } + } + + template void MaybeAddDiff( + const std::string &text, const T &before, const T &after) { + if (before != after) { + equals_ = false; + std::ostringstream os; + os << text << " changed from " << before << " to " << after; + AddDiff(os.str()); + } + } + + // Lazy version. + template void MaybeAddDiff( + std::function text, + const T &before, const T &after) { + if (before != after) { + equals_ = false; + std::ostringstream os; + text(os); + os << " changed from " << before << " to " << after; + AddDiff(os.str()); + } + } + + bool equals_ = true; + Diff details_; +}; + +using Outcomes = std::map; +// unvisited (absent) -> started (false) -> finished (true) +using Seen = std::map; + +void Print(const Comparison &comparison, const Outcomes &outcomes, Seen &seen, + NameCache &names, std::ostream &os, size_t indent = 0); + +void Print(const Diff &details, const Outcomes &outcomes, Seen &seen, + NameCache &names, std::ostream &os, size_t indent = 0); + +class Type { + public: + Type(const std::vector> &types, uint32_t index, + std::string_view name, Kind kind) + : types_(types), index_(index), name_(name), kind_(kind) {} + virtual ~Type() = default; + uint32_t GetIndex() const { return index_; } + std::string_view GetName() const { return name_; } + Kind GetKind() const { return kind_; } + const std::vector> &GetTypes() const { + return types_; + } + + // as() provides a method to defer downcasting to the base class, + // instead of needing to use dynamic_cast in a local context. If the type is + // not correct, the assert will trigger in debug mode. In release mode, this + // will crash dereferencing the nullptr. + template + const Target &as() const { + static_assert(std::is_convertible::value, + "Target must publically inherit Type"); + const Target *t = dynamic_cast(this); + m_assert(t, "Invalid downcast"); + return *t; + } + // Separate qualifiers from underlying type. + // + // The caller must always be prepared to receive a different type as + // qualifiers are sometimes discarded. + virtual const Type &ResolveQualifiers(std::set &qualifiers) const; + virtual const Type &ResolveTypedef( + std::vector &typedefs) const; + + const Name &GetDescription(NameCache &names) const; + static Result CompareSymbols( + const std::map &lhs, + const std::map &rhs, + Outcomes &outcomes); + + protected: + struct State { + explicit State(Outcomes &o) : outcomes(o), scc(o.value_comp()) { } + Outcomes &outcomes; + std::set known_equal; + SCC scc; + }; + const Type &GetTypeAtIndex(size_t index) const; + + virtual Name MakeDescription(NameCache &names) const = 0; + virtual Result Equals(const Type &other, State &state) const = 0; + static Comparison Removed(const Type &lhs, State &state); + static Comparison Added(const Type &rhs, State &state); + static std::pair> Compare( + const Type &lhs, const Type &rhs, State &state); + + private: + static std::string GetDiffMessage(NameCache &names, + const Type &lhs, const Type &rhs, + const std::string &message = std::string()); + const std::vector> &types_; + const uint32_t index_; + const std::string name_; + const Kind kind_; +}; + +class Void : public Type { + public: + Void(const std::vector> &types, uint32_t index, + std::string_view name, Kind kind) + : Type(types, index, name, kind) {} + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; +}; + +class Ptr : public Type { + public: + Ptr(const std::vector> &types, uint32_t index, + std::string_view name, Kind kind, uint32_t pointeeTypeId) + : Type(types, index, name, kind), + pointeeTypeId_(pointeeTypeId) {} + uint32_t GetPointeeTypeId() const { return pointeeTypeId_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t pointeeTypeId_; +}; + +class Typedef : public Type { + public: + Typedef(const std::vector> &types, uint32_t index, + std::string_view name, Kind kind, uint32_t referredTypeId) + : Type(types, index, name, kind), + referredTypeId_(referredTypeId) {} + uint32_t GetReferredTypeId() const { return referredTypeId_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + const Type &ResolveTypedef( + std::vector &typedefs) const final; + + private: + const uint32_t referredTypeId_; +}; + +class Qualifier : public Type { + public: + Qualifier(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + uint32_t qualifiedTypeId) + : Type(types, index, name, kind), + qualifiedTypeId_(qualifiedTypeId) {} + uint32_t GetQualifiedTypeId() const { return qualifiedTypeId_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + const Type &ResolveQualifiers(std::set &qualifiers) const final; + + private: + const uint32_t qualifiedTypeId_; +}; + +class Integer : public Type { + public: + Integer(const std::vector> &types, uint32_t index, + std::string_view name, Kind kind, uint32_t encoding, + uint32_t offset, uint32_t bitsize, uint32_t bytesize) + : Type(types, index, name, kind), + offset_(offset), + bitsize_(bitsize), + bytesize_(bytesize), + isBool_(encoding & BTF_INT_BOOL), + isSigned_(encoding & BTF_INT_SIGNED), + isChar_(encoding & BTF_INT_CHAR) {} + bool isBool() const { return isBool_; } + bool isSigned() const { return isSigned_; } + bool isChar() const { return isChar_; } + uint32_t GetOffset() const { return offset_; } + + // GetBitSize() gives the semantics of the field. GetByteSize() gives the + // storage size, and is equal or greater than GetBitSize()*8 + uint32_t GetBitSize() const { return bitsize_; } + uint32_t GetByteSize() const { return bytesize_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t offset_; + const uint32_t bitsize_; + const uint32_t bytesize_; + const bool isBool_; + const bool isSigned_; + const bool isChar_; +}; + +class Array : public Type { + public: + Array(const std::vector> &types, uint32_t index, + std::string_view name, Kind kind, uint32_t elementTypeId, + uint32_t indexTypeId, uint32_t numOfElements) + : Type(types, index, name, kind), + elementTypeId_(elementTypeId), + indexTypeId_(indexTypeId), + numOfElements_(numOfElements) {} + uint32_t GetElementTypeId() const { return elementTypeId_; } + uint32_t GetIndexTypeId() const { return indexTypeId_; } + uint32_t GetNumberOfElements() const { return numOfElements_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t elementTypeId_; + const uint32_t indexTypeId_; + const uint32_t numOfElements_; +}; + +class StructUnion : public Type { + public: + StructUnion(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + uint32_t bytesize, std::map members) + : Type(types, index, name, kind), + bytesize_(bytesize), + members_(members) {} + uint32_t GetByteSize() const { return bytesize_; } + const std::map &GetMembers() const { return members_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t bytesize_; + const std::map members_; +}; + +class Enumeration : public Type { + public: + Enumeration(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + uint32_t bytesize, std::map enumerators) + : Type(types, index, name, kind), + bytesize_(bytesize), + enumerators_(enumerators) {} + uint32_t GetByteSize() const { return bytesize_; } + const std::map &GetEnums() const { return enumerators_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t bytesize_; + const std::map enumerators_; +}; + +// BTF only considers structs and unions as forward-declared types, and does not +// include forward-declared enums. They are treated as BTF_KIND_ENUMs with vlen +// set to zero +class ForwardDeclaration : public Type { + public: + ForwardDeclaration(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + bool isUnion) + : Type(types, index, name, kind), + fwdKind_(isUnion ? ForwardDeclarationKind::UNION + : ForwardDeclarationKind::STRUCT) {} + ForwardDeclarationKind GetFwdKind() const { return fwdKind_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const ForwardDeclarationKind fwdKind_; +}; + +class Function : public Type { + public: + enum class Linkage : uint16_t { STATIC, GLOBAL, EXTERN }; + Function(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + uint32_t referredTypeId, Linkage linkage) + : Type(types, index, name, kind), + referredTypeId_(referredTypeId), + linkage_(linkage) {} + uint32_t GetReferredTypeId() const { return referredTypeId_; } + Linkage GetLinkage() const { return linkage_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t referredTypeId_; + const Linkage linkage_; +}; + +class FunctionPrototype : public Type { + public: + FunctionPrototype(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + uint32_t returnTypeId, std::vector parameters) + : Type(types, index, name, kind), + returnTypeId_(returnTypeId), + parameters_(parameters) {} + uint32_t GetReturnTypeId() const { return returnTypeId_; } + const std::vector &GetParameters() const { return parameters_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t returnTypeId_; + const std::vector parameters_; +}; + +class Variable : public Type { + public: + enum class Linkage : uint32_t { STATIC, GLOBAL_ALLOC, GLOBAL_EXTERN }; + Variable(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + unsigned varTypeId, Linkage linkage) + : Type(types, index, name, kind), + varTypeId_(varTypeId), + linkage_(linkage) {} + uint32_t GetVarTypeId() const { return varTypeId_; } + Linkage GetLinkage() const { return linkage_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t varTypeId_; + const Linkage linkage_; +}; + +class DataSection : public Type { + public: + DataSection(const std::vector> &types, + uint32_t index, std::string_view name, Kind kind, + uint32_t bytesize, std::vector secinfos) + : Type(types, index, name, kind), + bytesize_(bytesize), + secinfos_(secinfos) {} + uint32_t GetByteSize() const { return bytesize_; } + const std::vector &GetSecinfos() const { return secinfos_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + const uint32_t bytesize_; + const std::vector secinfos_; +}; + +// Not actually a BTF type but needed for uniformity of representation. +class ElfSymbol : public Type { + public: + ElfSymbol(const std::vector> &types, + abigail::elf_symbol_sptr symbol, uint32_t type_id) + : Type(types, -1, {}, -1), symbol_(symbol), type_id_(type_id) {} + abigail::elf_symbol_sptr GetElfSymbol() const { return symbol_; } + uint32_t GetTypeId() const { return type_id_; } + Name MakeDescription(NameCache &names) const final; + Result Equals(const Type &other, State &state) const final; + + private: + abigail::elf_symbol_sptr symbol_; + uint32_t type_id_; +}; + +std::ostream &operator<<(std::ostream &os, const Type &bt); +std::ostream &operator<<(std::ostream &os, const Ptr &bp); +std::ostream &operator<<(std::ostream &os, const Typedef &bp); +std::ostream &operator<<(std::ostream &os, const Qualifier &bp); +std::ostream &operator<<(std::ostream &os, const Integer &bi); +std::ostream &operator<<(std::ostream &os, const Array &ba); +std::ostream &operator<<(std::ostream &os, const StructUnion &bsu); +std::ostream &operator<<(std::ostream &os, const Enumeration &be); +std::ostream &operator<<(std::ostream &os, const ForwardDeclaration &bfd); +std::ostream &operator<<(std::ostream &os, const Function &bf); +std::ostream &operator<<(std::ostream &os, const FunctionPrototype &bfp); +std::ostream &operator<<(std::ostream &os, const Variable &bv); +std::ostream &operator<<(std::ostream &os, const DataSection &bds); +std::ostream &operator<<(std::ostream &os, const ElfSymbol &bes); +std::ostream &operator<<(std::ostream &os, Variable::Linkage linkage); +std::ostream &operator<<(std::ostream &os, Function::Linkage linkage); + +// BTF Specification: https://www.kernel.org/doc/html/latest/bpf/btf.html +class Structs { + public: + Structs(const char *start, std::unique_ptr env, + const abigail::symtab_reader::symtab_sptr tab, + const bool verbose = false); + ~Structs() = default; + void PrintHeader(); + void BuildTypes(); + void PrintStringSection(); + const std::map &GetSymbols( + bool use_elf_symbols) const { + return use_elf_symbols ? elf_symbols_ : btf_symbols_; + } + + private: + const btf_header *header_; + const btf_type *type_section_; + const char *str_section_; + const std::unique_ptr env_; + const abigail::symtab_reader::symtab_sptr tab_; + const bool verbose_; + + std::vector> types_; + std::unordered_map btf_symbol_types_; + std::map btf_symbols_; + std::map elf_symbols_; + + std::vector bad_names_; + std::string bad_prefix_1_; + std::string bad_prefix_2_; + + int BuildOneType(const btf_type *t, uint32_t index); + void BuildElfSymbols(); + std::map BuildMembers( + bool kflag, const btf_member *members, size_t vlen); + std::map BuildEnums(const struct btf_enum *enums, + size_t vlen); + std::vector BuildParams(const struct btf_param *params, + size_t vlen); + std::vector BuildDatasec(const btf_type *type, size_t vlen); + std::string_view GetName(uint32_t name_off); +}; + +Structs ReadFile(const std::string &path, bool verbose = false); + +} // end namespace btf +} // end namespace abigail + +#endif // __ABG_BTF_H__ diff --git a/src/Makefile.am b/src/Makefile.am index 430ce98d..d67c604c 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -8,6 +8,12 @@ VIZ_SOURCES = abg-viz-common.cc \ abg-viz-dot.cc \ abg-viz-svg.cc +if ENABLE_CXX17 +if HAVE_BTF + BTF_SOURCES = abg-btf.cc +endif +endif + libabigail_la_SOURCES = \ abg-internal.h \ abg-traverse.cc \ @@ -39,7 +45,8 @@ abg-elf-helpers.cc \ abg-regex.cc \ abg-symtab-reader.h \ abg-symtab-reader.cc \ -$(VIZ_SOURCES) +$(VIZ_SOURCES) \ +$(BTF_SOURCES) libabigail_la_LIBADD = $(DEPS_LIBS) libabigail_la_LDFLAGS = -lpthread -Wl,--as-needed -no-undefined diff --git a/src/abg-btf.cc b/src/abg-btf.cc new file mode 100644 index 00000000..8e4f9ea6 --- /dev/null +++ b/src/abg-btf.cc @@ -0,0 +1,1364 @@ +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// -*- mode: C++ -*- +// +// Copyright (C) 2020-2021 Google, Inc. +// +// Author: Maria Teguiani +// Author: Giuliano Procida + +#include + +#include +#include +#include +#include +#include + +#include "abg-elf-helpers.h" +#include "abg-symtab-reader.h" +#include "abg-tools-utils.h" + +#include "abg-btf.h" + +namespace abigail { +namespace btf { + +// This is a helper map that yields the name of a Type in the preferred +// print style by indexing into the vector with a Kind +static constexpr std::array kKindNames = { + "void type", + "integer type", + "pointer type", + "array type", + "struct type", + "union type", + "enum type", + "forward declaration", + "typedef", + "volatile", + "const", + "restrict", + "function", + "function type", + "variable", + "data section" +}; + +// This matches bpftool dump format raw. +static constexpr std::array kRawKindNames = { + "VOID", + "INT", + "PTR", + "ARRAY", + "STRUCT", + "UNION", + "ENUM", + "FWD", + "TYPEDEF", + "VOLATILE", + "CONST", + "RESTRICT", + "FUNC", + "FUNC_PROTO", + "VAR", + "DATASEC" +}; + +static constexpr std::array kVarLinkage = { + "static", + "global-alloc", + "global-extern" // NOTE: bpftool currently says "(unknown)" +}; + +static constexpr std::array kFunLinkage = { + "static", + "global", + "extern" +}; + +Name Name::Add(Side side, Precedence precedence, std::string_view text) const { + bool bracket = precedence < precedence_; + std::ostringstream left; + std::ostringstream right; + + // Bits on the left need (sometimes) to be separated by whitespace. + left << left_; + if (bracket) + left << '('; + else if (side == Side::LEFT) + left << ' '; + + (side == Side::LEFT ? left : right) << text; + + // Bits on the right are arrays [] and functions () and need no whitespace. + if (bracket) + right << ')'; + right << right_; + + return Name{left.str(), precedence, right.str()}; +} + +Name Name::Qualify(const std::set &qualifiers) const { + // this covers the case when bad qualifiers have been dropped + if (qualifiers.empty()) + return *this; + // add qualifiers to the left or right of the type stem + std::ostringstream os; + if (precedence_ == Precedence::NIL) { + for (const auto &qualifier : qualifiers) + os << kKindNames[qualifier] << ' '; + os << left_; + } else if (precedence_ == Precedence::POINTER) { + os << left_; + for (const auto &qualifier : qualifiers) + os << ' ' << kKindNames[qualifier]; + } else { + m_assert(false, "unqualifiable element"); + } + // qualifiers attach without affecting precedence + return Name{os.str(), precedence_, right_}; +} + +std::ostream &Name::Print(std::ostream &os) const { + return os << left_ << right_; +} + +std::ostream &operator<<(std::ostream &os, const Name &name) { + return name.Print(os); +} + +// There are several reasons for treating CV-qualifiers specially. +// 1. They upset the precedence scheme we've got going here. +// 2. Qualifiers need to be placed according to what they qualify. +// 3. The BTF model doesn't preclude ordering and duplication issues. +// 4. A better model would have qualifiers as part of the types. +const Name &Type::GetDescription(NameCache &names) const { + // infinite recursion prevention - insert at most once + static const Name black_hole{"#"}; + + auto insertion = names.insert({this, black_hole}); + Name &cached = insertion.first->second; + + if (insertion.second) { + // newly inserted, need to determine name of type + std::set qualifiers; + const Type &under = ResolveQualifiers(qualifiers); + if (this == &under) { + // unqualified, simple case + cached = MakeDescription(names); + } else { + // qualified, but we may end up adding no qualifiers + auto insertion_under = names.insert({&under, black_hole}); + Name &cached_under = insertion_under.first->second; + + // newly inserted underlying type name + if (insertion_under.second) + cached_under = under.MakeDescription(names); + + // add the qualifiers (to the appropriate side) + cached = cached_under.Qualify(qualifiers); + } + } + + return cached; +} + +std::string GetPlainDescription(NameCache &names, const Type &type) { + std::ostringstream os; + os << '"' << type.GetDescription(names) << '"'; + return os.str(); +} + +std::string GetTypedefDescription(NameCache &names, const Type &given) { + std::ostringstream os; + std::vector typedefs; + const Type &type = given.ResolveTypedef(typedefs); + for (auto td : typedefs) + os << std::quoted(td) << " = "; + os << GetPlainDescription(names, type); + return os.str(); +} + +constexpr size_t INDENT_INCREMENT = 2; + +void Print(const Comparison &comparison, const Outcomes &outcomes, Seen &seen, + NameCache &names, std::ostream &os, size_t indent) { + const auto *lhs = comparison.first; + const auto *rhs = comparison.second; + if (!rhs) { + os << GetPlainDescription(names, *lhs) << " was removed\n"; + return; + } + if (!lhs) { + os << GetPlainDescription(names, *rhs) << " was added\n"; + return; + } + bool td = lhs->GetKind() == BTF_KIND_TYPEDEF + || rhs->GetKind() == BTF_KIND_TYPEDEF; + const std::string lhs_descr = td ? GetTypedefDescription(names, *lhs) + : GetPlainDescription(names, *lhs); + const std::string rhs_descr = td ? GetTypedefDescription(names, *rhs) + : GetPlainDescription(names, *rhs); + if (lhs_descr == rhs_descr) + os << lhs_descr << " changed"; + else + os << "changed from " << lhs_descr << " to " << rhs_descr; + const auto &details = outcomes.find(comparison)->second; + auto insertion = seen.insert({comparison, false}); + if (!insertion.second) { + if (!insertion.first->second) + os << " (being reported)"; + else if (!details.empty()) + os << " (already reported)"; + } + os << '\n'; + if (insertion.second) { + Print(details, outcomes, seen, names, os, indent + INDENT_INCREMENT); + insertion.first->second = true; + } + // paragraph spacing + if (!indent) + os << '\n'; +} + +void Print(const Diff &details, const Outcomes &outcomes, Seen &seen, + NameCache &names, std::ostream &os, size_t indent) { + for (const auto &detail : details) { + os << std::string(indent, ' ') << detail.text_; + if (!detail.edge_) { + os << '\n'; + } else { + os << ' '; + Print(detail.edge_.value(), outcomes, seen, names, os, indent); + } + } +} + +const Type &Type::GetTypeAtIndex(size_t index) const { + m_assert(index < types_.size(), "Index out of bounds."); + return *(types_[index].get()); +} + +std::string QualifiersMessage(Kind qualifier, const std::string &action) { + std::ostringstream os; + os << "qualifier " << kKindNames[qualifier] << ' ' << action; + return os.str(); +} + +Result Type::CompareSymbols( + const std::map &lhs, + const std::map &rhs, + Outcomes &outcomes) { + Result result; + State state(outcomes); + auto lit = lhs.begin(); + auto rit = rhs.begin(); + // Each branch contains a NULL pointer check in case BTF information is + // missing for a symbol. We conservatively have to assume that any symbol + // without BTF information may have changed type. + // + // NOTE: this currently happens for all global variables + while (lit != lhs.end() || rit != rhs.end()) { + if (rit == rhs.end() || (lit != lhs.end() && lit->first < rit->first)) { + // removed + if (lit->second) { + auto diff = Removed(*lit->second, state); + result.AddDiff("symbol", diff); + } else { + std::ostringstream os; + os << "symbol " << std::quoted(lit->first) + << " (of unknown type) was removed"; + result.AddDiff(os.str()); + } + ++lit; + } else if (lit == lhs.end() || (rit != rhs.end() && lit->first > rit->first)) { + // added + if (rit->second) { + auto diff = Added(*rit->second, state); + result.AddDiff("symbol", diff); + } else { + std::ostringstream os; + os << "symbol " << std::quoted(rit->first) + << " (if unknown type) was added"; + result.AddDiff(os.str()); + } + ++rit; + } else { + // in both + if (lit->second && rit->second) { + auto diff = Compare(*lit->second, *rit->second, state); + result.MaybeAddDiff("symbol", diff); + } else { + std::ostringstream os; + os << "symbol " << std::quoted(lit->first) + << " (of unknown type) may have changed"; + result.AddDiff(os.str()); + } + ++lit; + ++rit; + } + } + return result; +} + +Comparison Type::Removed(const Type &lhs, State &state) { + Comparison comparison{&lhs, nullptr}; + state.outcomes.insert({comparison, {}}); + return comparison; +} + +Comparison Type::Added(const Type &rhs, State &state) { + Comparison comparison{nullptr, &rhs}; + state.outcomes.insert({comparison, {}}); + return comparison; +} + +/* + * We compute a diff for every visited node. + * + * Each node has one of: + * 1. equals = true; perhaps only tentative edge differences + * 2. equals = false; at least one definitive node or edge difference + * + * On the first visit to a node we can put a placeholder in, the equals value is + * irrelevant, the diff may contain local and edge differences. If an SCC + * contains only internal edge differences (and equivalently equals is true) + * then the differences can all (eventually) be discarded. + * + * On exit from the first visit to a node, equals reflects the tree of + * comparisons below that node in the DFS and similarly, the diff graph starting + * from the node contains a subtree of this tree plus potentially edges to + * existing nodes to the side or below (already visited SCCs, sharing), or above + * (back links forming cycles). + * + * When an SCC is closed, all equals implies deleting all diffs, any false + * implies updating all to false. + * + * On subsequent visits to a node, there are 2 cases. The node is still open: + * return true and an edge diff. The node is closed, return the stored value and + * an edge diff. + */ +std::pair> Type::Compare( + const Type &lhs, const Type &rhs, Type::State &state) { + Comparison comparison{&lhs, &rhs}; + + // 1. Check if the comparison has an already known result. + if (state.known_equal.count(comparison)) { + // Already visited and closed. Equal. + return {true, {}}; + } + if (state.outcomes.count(comparison)) { + // Already visited and closed. Different. + return {false, {comparison}}; + } + // Either open or not visited at all + + // 2. Record node with Strongly-Connected Component finder. + auto handle = state.scc.Open({comparison, {}}); + if (!handle) { + // Already open. + // + // Return a dummy true outcome and some tentative diffs. The diffs may end + // up not being used and, while it would be nice to be lazier, they encode + // all the cycling-breaking edges needed to recreate a full diff structure. + return {true, {comparison}}; + } + // Comparison opened, need to close it before returning. + + Result result; + + std::set lhs_quals; + std::set rhs_quals; + const Type &l = lhs.ResolveQualifiers(lhs_quals); + const Type &r = rhs.ResolveQualifiers(rhs_quals); + if (!lhs_quals.empty() || !rhs_quals.empty()) { + // 3.1 Qualified type difference. + auto lit = lhs_quals.begin(); + auto rit = rhs_quals.begin(); + auto lend = lhs_quals.end(); + auto rend = rhs_quals.end(); + while (lit != lend || rit != rend) { + if (rit == rend || (lit != lend && *lit < *rit)) { + result.AddDiff(QualifiersMessage(*lit, "removed")); + ++lit; + } else if (lit == lend || (rit != rend && *lit > *rit)) { + result.AddDiff(QualifiersMessage(*rit, "added")); + ++rit; + } else { + ++lit; + ++rit; + } + } + const auto comp = Compare(l, r, state); + result.MaybeAddDiff("underlying type", comp); + } else if (l.GetKind() == BTF_KIND_TYPEDEF + || r.GetKind() == BTF_KIND_TYPEDEF) { + // 3.2 Typedef difference. + std::vector l_typedefs; + std::vector r_typedefs; + const Type &l_ref = l.ResolveTypedef(l_typedefs); + const Type &r_ref = r.ResolveTypedef(r_typedefs); + const auto comp = Compare(l_ref, r_ref, state); + result.MaybeAddDiff("via typedefs", comp); + } else if (typeid(l) != typeid(r)) { + // 4. Incomparable. + result.equals_ = false; + } else { + // 5. Actually compare with dynamic type dispatch. + result = l.Equals(r, state); + } + + // 6. Update result and check for a complete Strongly-Connected Component. + auto comparisons = state.scc.Close(handle.value(), + [&result](Outcomes::value_type &p) { + p.second = result.details_; + }); + if (!comparisons.empty()) { + // Closed SCC. + // + // Note that result now incorporates every inequality and difference in the + // SCC via the DFS spanning tree. + if (result.equals_) { + // Same. Record equalities. + for (auto &c : comparisons) + state.known_equal.insert(c.first); + return {true, {}}; + } else { + // Different. Record diffs. + state.outcomes.insert(std::make_move_iterator(comparisons.begin()), + std::make_move_iterator(comparisons.end())); + } + } + + // Note that both equals and diff are tentative iff comparison is open. + return {result.equals_, {comparison}}; +} + +Name Void::MakeDescription(NameCache &names) const { + return Name{"void"}; +} + +Name Ptr::MakeDescription(NameCache &names) const { + return GetTypeAtIndex(GetPointeeTypeId()).GetDescription(names).Add( + Side::LEFT, Precedence::POINTER, "*"); +} + +Name Typedef::MakeDescription(NameCache &names) const { + return Name{GetName()}; +} + +Name Qualifier::MakeDescription(NameCache &names) const { + m_assert(false, "should not be called"); // NOLINT + return Name{GetName()}; +} + +Name Integer::MakeDescription(NameCache &names) const { + return Name{GetName()}; +} + +Name Array::MakeDescription(NameCache &names) const { + std::ostringstream os; + os << '[' << GetNumberOfElements() << ']'; + return GetTypeAtIndex(GetElementTypeId()).GetDescription(names).Add( + Side::RIGHT, Precedence::ARRAY_FUNCTION, os.str()); +} + +Name StructUnion::MakeDescription(NameCache &names) const { + std::ostringstream os; + os << (GetKind() == BTF_KIND_STRUCT ? "struct" : "union") + << ' ' << (GetName().empty() ? "" : GetName()); + return Name{os.str()}; +} + +Name Enumeration::MakeDescription(NameCache &names) const { + std::ostringstream os; + os << "enum " << (GetName().empty() ? "" : GetName()); + if (GetEnums().empty()) + os << ""; + return Name{os.str()}; +} + +Name ForwardDeclaration::MakeDescription(NameCache &names) const { + std::ostringstream os; + os << GetFwdKind() << ' ' << GetName() << ""; + return Name{os.str()}; +} + +Name FunctionPrototype::MakeDescription(NameCache &names) const { + std::ostringstream os; + os << '('; + bool sep = false; + for (const auto &p : GetParameters()) { + if (sep) + os << ", "; + else + sep = true; + const auto &arg_descr = GetTypeAtIndex(p.typeId_).GetDescription(names); + if (p.name_.empty()) + os << arg_descr; + else + os << arg_descr.Add(Side::LEFT, Precedence::ATOMIC, p.name_); + } + os << ')'; + return GetTypeAtIndex(GetReturnTypeId()).GetDescription(names).Add( + Side::RIGHT, Precedence::ARRAY_FUNCTION, os.str()); +} + +Name Variable::MakeDescription(NameCache &names) const { + return GetTypeAtIndex(GetVarTypeId()).GetDescription(names).Add( + Side::LEFT, Precedence::ATOMIC, GetName()); +} + +Name Function::MakeDescription(NameCache &names) const { + return GetTypeAtIndex(GetReferredTypeId()).GetDescription(names).Add( + Side::LEFT, Precedence::ATOMIC, GetName()); +} + +Name DataSection::MakeDescription(NameCache &names) const { + // NOTE: not yet encountered in the wild + return Name{"Unimplemented"}; +} + +Name ElfSymbol::MakeDescription(NameCache &names) const { + return Name{symbol_->get_name()}; +} + +Result Void::Equals(const Type &other, State &state) const { + return {}; +} + +Result Ptr::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + const auto ref_diff = Compare(GetTypeAtIndex(GetPointeeTypeId()), + o.GetTypeAtIndex(o.GetPointeeTypeId()), state); + result.MaybeAddDiff("pointed-to type", ref_diff); + return result; +} + +Result Typedef::Equals(const Type &other, State &state) const { + m_assert(false, "should not be called"); // NOLINT + return {}; +} + +Result Qualifier::Equals(const Type &other, State &state) const { + m_assert(false, "should not be called"); // NOLINT + return {}; +} + +Result Integer::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + if (isBool() != o.isBool()) { + result.AddDiff(isBool() ? "the first one is a boolean" + : "the second one is a boolean"); + } + if (isSigned() != o.isSigned()) { + result.AddDiff(isSigned() ? "the first one is signed" + : "the second one is signed"); + } + if (isChar() != o.isChar()) { + result.AddDiff(isChar() ? "the first one is a char" + : "the second one is a char"); + } + result.MaybeAddDiff("offset", GetOffset(), o.GetOffset()); + result.MaybeAddDiff("bit size", GetBitSize(), o.GetBitSize()); + if (GetBitSize() != GetByteSize() * 8 && + o.GetBitSize() != o.GetByteSize() * 8) + result.MaybeAddDiff("byte size", GetByteSize(), o.GetByteSize()); + return result; +} + +Result Array::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + result.MaybeAddDiff("number of elements", + GetNumberOfElements(), o.GetNumberOfElements()); + const auto index_type_diff = + Compare(GetTypeAtIndex(GetIndexTypeId()), + o.GetTypeAtIndex(o.GetIndexTypeId()), state); + result.MaybeAddDiff("index type", index_type_diff); + const auto element_type_diff = + Compare(GetTypeAtIndex(GetElementTypeId()), + o.GetTypeAtIndex(o.GetElementTypeId()), state); + result.MaybeAddDiff("element type", element_type_diff); + + return result; +} + +Result StructUnion::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + if (GetKind() != o.GetKind()) { + result.AddDiff(GetKind() == BTF_KIND_STRUCT + ? "changed from struct to union" + : "changed from union to struct"); + } + + result.MaybeAddDiff("byte size", GetByteSize(), o.GetByteSize()); + + const auto &members1 = GetMembers(); + const auto &members2 = o.GetMembers(); + std::set visited_members; + for (const auto &m1 : members1) { + visited_members.insert(m1.first); + auto iter = members2.find(m1.first); + if (iter == members2.end()) { + std::ostringstream os; + os << "member " << std::quoted(m1.first) << " of type"; + auto diff = Removed(GetTypeAtIndex(m1.second.typeId_), state); + result.AddDiff(os.str(), diff); + } else { + result.MaybeAddDiff([&](std::ostream &os) { + os << "member " << std::quoted(m1.first) << " offset"; + }, m1.second.offset_, iter->second.offset_); + result.MaybeAddDiff([&](std::ostream &os) { + os << "member " << std::quoted(m1.first) << " bitfield size"; + }, m1.second.bitfieldSize_, iter->second.bitfieldSize_); + const auto sub_diff = + Compare(GetTypeAtIndex(m1.second.typeId_), + o.GetTypeAtIndex(iter->second.typeId_), state); + result.MaybeAddDiff([&](std::ostream &os) { + os << "member " << std::quoted(m1.first) << " type"; + }, sub_diff); + } + } + for (const auto &m2 : members2) { + if (visited_members.find(m2.first) == visited_members.end()) { + std::ostringstream os; + os << "member " << std::quoted(m2.first) << " of type"; + auto diff = Added(o.GetTypeAtIndex(m2.second.typeId_), state); + result.AddDiff(os.str(), diff); + } + } + + return result; +} + +Result Enumeration::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + result.MaybeAddDiff("byte size", GetByteSize(), o.GetByteSize()); + + const auto &enums1 = GetEnums(); + const auto &enums2 = o.GetEnums(); + std::set visited_enums; + for (const auto &e1 : enums1) { + visited_enums.insert(e1.first); + auto iter = enums2.find(e1.first); + if (iter == enums2.end()) { + std::ostringstream os; + os << "enumerator " << std::quoted(e1.first) + << " (" << e1.second << ") was removed"; + result.AddDiff(os.str()); + } else { + result.MaybeAddDiff([&](std::ostream &os) { + os << "enumerator " << std::quoted(e1.first) << " value"; + }, e1.second, iter->second); + } + } + for (const auto &e2 : enums2) { + if (visited_enums.find(e2.first) == visited_enums.end()) { + std::ostringstream os; + os << "enumerator " << std::quoted(e2.first) + << " (" << e2.second << ") was added"; + result.AddDiff(os.str()); + } + } + + return result; +} + +Result ForwardDeclaration::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + result.MaybeAddDiff("kind", GetFwdKind(), o.GetFwdKind()); + + return result; +} + +Result Function::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + result.MaybeAddDiff("linkage", GetLinkage(), o.GetLinkage()); + const auto func_proto_diff = + Compare(GetTypeAtIndex(GetReferredTypeId()), + o.GetTypeAtIndex(o.GetReferredTypeId()), state); + result.MaybeAddDiff("type", func_proto_diff); + + return result; +} + +Result FunctionPrototype::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + const auto return_type_diff = + Compare(GetTypeAtIndex(GetReturnTypeId()), + o.GetTypeAtIndex(o.GetReturnTypeId()), state); + result.MaybeAddDiff("return type", return_type_diff); + + const auto ¶meters1 = GetParameters(); + const auto ¶meters2 = o.GetParameters(); + size_t min = std::min(parameters1.size(), parameters2.size()); + for (size_t i = 0; i < min; ++i) { + const auto &p1 = parameters1.at(i); + const auto &p2 = parameters2.at(i); + const auto sub_diff = Compare(GetTypeAtIndex(p1.typeId_), + o.GetTypeAtIndex(p2.typeId_), state); + result.MaybeAddDiff([&](std::ostream &os) { + os << "parameter " << i + 1; + const auto &n1 = p1.name_; + const auto &n2 = p2.name_; + if (n1 == n2 && !n1.empty()) { + os << " (" << std::quoted(n1) << ")"; + } else if (n1 != n2) { + os << " ("; + if (!n1.empty()) os << "was " << std::quoted(n1); + if (!n1.empty() && !n2.empty()) os << ", "; + if (!n2.empty()) os << "now " << std::quoted(n2); + os << ")"; + } + os << " type"; + }, sub_diff); + } + + bool added = parameters1.size() < parameters2.size(); + const auto ¶meters = added ? parameters2 : parameters1; + for (size_t i = min; i < parameters.size(); ++i) { + const auto ¶meter = parameters.at(i); + std::ostringstream os; + os << "parameter " << i + 1; + if (!parameter.name_.empty()) + os << " (" << std::quoted(parameter.name_) << ")"; + os << " of type"; + const auto ¶meter_type = GetTypeAtIndex(parameter.typeId_); + auto diff = added ? Added(parameter_type, state) + : Removed(parameter_type, state); + result.AddDiff(os.str(), diff); + } + + return result; +} + +// NOTE: not yet encountered in the wild +Result Variable::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + + result.MaybeAddDiff("linkage", GetLinkage(), o.GetLinkage()); + const auto var_diff = Compare(GetTypeAtIndex(GetVarTypeId()), + o.GetTypeAtIndex(o.GetVarTypeId()), state); + result.MaybeAddDiff("type", var_diff); + + return result; +} + +Result DataSection::Equals(const Type &other, State &state) const { + Result result; + result.AddDiff("Unimplemented"); + + // NOTE: not yet encountered in the wild + m_assert(false, "Unimplemented\n"); // NOLINT + + return result; +} + +Result ElfSymbol::Equals(const Type &other, State &state) const { + Result result; + const auto &o = other.as(); + // TODO: compare ELF symbol attributes + const auto type_diff = Compare(GetTypeAtIndex(type_id_), + o.GetTypeAtIndex(o.type_id_), state); + result.MaybeAddDiff("type", type_diff); + return result; +} + +const Type &Type::ResolveQualifiers(std::set &qualifiers) const { + if (kind_ == BTF_KIND_ARRAY || kind_ == BTF_KIND_FUNC_PROTO) { + // There should be no qualifiers here. + qualifiers.clear(); + } + return *this; +} + +const Type &Qualifier::ResolveQualifiers( + std::set &qualifiers) const { + qualifiers.insert(GetKind()); + return GetTypeAtIndex(GetQualifiedTypeId()).ResolveQualifiers(qualifiers); +} + +const Type &Type::ResolveTypedef( + std::vector &typedefs) const { + return *this; +} + +const Type &Typedef::ResolveTypedef( + std::vector &typedefs) const { + typedefs.push_back(GetName()); + return GetTypeAtIndex(GetReferredTypeId()).ResolveTypedef(typedefs); +} + +std::ostream &operator<<(std::ostream &os, const Type &bt) { + auto name = bt.GetName(); + os << kRawKindNames[bt.GetKind()] + << " '" << (name.empty() ? "(anon)" : name) << '\''; + return os; +} + +std::ostream &operator<<(std::ostream &os, const Ptr &bp) { + os << static_cast(bp) + << " type_id=" << bp.GetPointeeTypeId() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const Typedef &bp) { + os << static_cast(bp) + << " type_id=" << bp.GetReferredTypeId() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const Qualifier &bp) { + os << static_cast(bp) + << " type_id=" << bp.GetQualifiedTypeId() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const Integer &bi) { + os << static_cast(bi) + << " size=" << bi.GetByteSize() + << " bits_offset=" << bi.GetOffset() + << " nr_bits=" << bi.GetBitSize() + << " bool=" << bi.isBool() + << " char=" << bi.isChar() + << " signed=" << bi.isSigned() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const Array &ba) { + os << static_cast(ba) + << " type_id=" << ba.GetElementTypeId() + << " index_type_id=" << ba.GetIndexTypeId() + << " nr_elems=" << ba.GetNumberOfElements() + << '\n'; + return os; + } + +std::ostream &operator<<(std::ostream &os, const StructUnion &bsu) { + os << static_cast(bsu) + << " size=" << bsu.GetByteSize() + << " vlen=" << bsu.GetMembers().size() << '\n'; + for (const auto &member : bsu.GetMembers()) { + os << "\t'" << member.first << '\'' + << " type_id=" << member.second.typeId_ + << " bits_offset=" << member.second.offset_; + if (member.second.bitfieldSize_) + os << " bitfield_size=" << member.second.bitfieldSize_; + os << '\n'; + } + return os; +} + +std::ostream &operator<<(std::ostream &os, const Enumeration &be) { + os << static_cast(be) + << " size=" << be.GetByteSize() + << " vlen=" << be.GetEnums().size() + << '\n'; + for (const auto &e : be.GetEnums()) { + os << "\t'" << e.first << "' val=" << e.second << '\n'; + } + return os; +} + +std::ostream &operator<<(std::ostream &os, const ForwardDeclaration &bfd) { + os << static_cast(bfd) + << " fwd_kind=" << bfd.GetFwdKind() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const Function &bf) { + os << static_cast(bf) + << " type_id=" << bf.GetReferredTypeId() + << " linkage=" << bf.GetLinkage() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const FunctionPrototype &bfp) { + os << static_cast(bfp) + << " ret_type_id=" << bfp.GetReturnTypeId() + << " vlen=" << bfp.GetParameters().size() + << '\n'; + for (const auto ¶m : bfp.GetParameters()) { + os << "\t'" << (param.name_.empty() ? "(anon)" : param.name_) + << "' type_id=" << param.typeId_ << '\n'; + } + return os; +} + +std::ostream &operator<<(std::ostream &os, const Variable &bv) { + // NOTE: The odd comma is to match bpftool dump. + os << static_cast(bv) + << " type_id=" << bv.GetVarTypeId() + << ", linkage=" << bv.GetLinkage() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, const DataSection &bds) { + os << static_cast(bds) + << " size=" << bds.GetByteSize() + << '\n'; + for (const auto &secinfo : bds.GetSecinfos()) { + os << "\ttype_id=" << secinfo.typeId_ + << " offset=" << secinfo.offset_ + << " size=" << secinfo.bytesize_ << '\n'; + } + return os; +} + +std::ostream &operator<<(std::ostream &os, const ElfSymbol &bes) { + os << static_cast(bes); + // TODO: report ELF symbol attributes + os << " type=" << bes.GetTypeId() + << '\n'; + return os; +} + +std::ostream &operator<<(std::ostream &os, ForwardDeclarationKind kind) { + switch (kind) { + case ForwardDeclarationKind::STRUCT: + os << "struct"; + break; + case ForwardDeclarationKind::UNION: + os << "union"; + break; + } + return os; +} + +std::ostream &operator<<(std::ostream &os, Variable::Linkage linkage) { + auto ix = static_cast(linkage); + return os << (ix < kVarLinkage.size() ? kVarLinkage[ix] : "(unknown)"); +} + +std::ostream &operator<<(std::ostream &os, Function::Linkage linkage) { + auto ix = static_cast(linkage); + return os << (ix < kFunLinkage.size() ? kFunLinkage[ix] : "(unknown)"); +} + +Structs::Structs(const char *start, + std::unique_ptr env, + const abigail::symtab_reader::symtab_sptr tab, + const bool verbose) + : env_(std::move(env)), + tab_(tab), + verbose_(verbose) { + header_ = reinterpret_cast(start); + m_assert(header_->magic == 0xEB9F, "Magic field must be 0xEB9F for BTF"); + + type_section_ = reinterpret_cast(start + header_->hdr_len + + header_->type_off); + str_section_ = start + header_->hdr_len + header_->str_off; + + // every btf_type struct in the type section has an implicit type id. + // type id 0 is reserved for void type, so it is set to a Void here. + // The type section is parsed sequentially and type id is assigned to each + // recognized type starting from id 1. + types_.push_back(std::make_unique(types_, 0, "void", 0)); + + if (verbose_) { + PrintHeader(); + } + BuildTypes(); + if (verbose_) { + PrintStringSection(); + } + + // NOTE: a whole bunch of Linux kernel symbols appear with duplicate (but not + // necessarily identical) BTF type information + // + // TODO: find a better way of resolving this + bad_prefix_1_ = "__arm64_sys_"; + bad_prefix_2_ = "__arm64_compat_sys_"; + bad_names_ = { + "arch_prctl_spec_ctrl_get", + "arch_prctl_spec_ctrl_set", + "ioremap_cache", + "kvm_arch_set_irq_inatomic", + "module_frob_arch_sections", + "vsnprintf" + }; +} + +void Structs::PrintHeader() { + std::cout << "BTF header:\n" + << "\tmagic " << header_->magic + << ", version " << static_cast(header_->version) + << ", flags " << static_cast(header_->flags) + << ", hdr_len " << header_->hdr_len << "\n" + << "\ttype_off " << header_->type_off + << ", type_len " << header_->type_len << "\n" + << "\tstr_off " << header_->str_off + << ", str_len " << header_->str_len << "\n"; +} + +// vlen: vector length, the number of struct/union members +std::map Structs::BuildMembers( + bool kflag, const btf_member *members, size_t vlen) { + std::map result; + int anonymous = 0; + for (size_t i = 0; i < vlen; ++i) { + const auto raw_offset = members[i].offset; + Member member{ + .typeId_ = members[i].type, + .offset_ = kflag ? BTF_MEMBER_BIT_OFFSET(raw_offset) : raw_offset, + .bitfieldSize_ = kflag ? BTF_MEMBER_BITFIELD_SIZE(raw_offset) : 0}; + std::string name = std::string(GetName(members[i].name_off)); + if (name.empty()) { + name = "anon member #" + std::to_string(anonymous); + ++anonymous; + } + result.emplace(name, member); + } + return result; +} + +// vlen: vector length, the number of enum values +std::map Structs::BuildEnums(const struct btf_enum *enums, + size_t vlen) { + std::map result; + for (size_t i = 0; i < vlen; ++i) { + result.emplace(std::string(GetName(enums[i].name_off)), enums[i].val); + } + return result; +} + +// vlen: vector length, the number of parameters +std::vector Structs::BuildParams(const struct btf_param *params, + size_t vlen) { + std::vector result; + result.reserve(vlen); + for (size_t i = 0; i < vlen; ++i) { + Parameter parameter{ + .name_ = std::string(GetName(params[i].name_off)), + .typeId_ = params[i].type}; + result.push_back(parameter); + } + return result; +} + +// vlen: vector length, the number of variables +std::vector Structs::BuildDatasec(const btf_type *type, + size_t vlen) { + std::vector result; + result.reserve(vlen); + const auto *secinfos = reinterpret_cast(type + 1); + for (size_t i = 0; i < vlen; ++i) { + Secinfo secinfo{.typeId_ = secinfos[i].type, + .offset_ = secinfos[i].offset, + .bytesize_ = secinfos[i].size}; + result.push_back(secinfo); + } + return result; +} + +void Structs::BuildTypes() { + m_assert(!(header_->type_off & (sizeof(uint32_t) - 1)), "Unaligned type_off"); + if (header_->type_len == 0) { + std::cerr << "No types found"; + return; + } + if (verbose_) { + std::cout << "Type section:\n"; + } + + const char *curr = reinterpret_cast(type_section_); + const char *end = curr + header_->type_len; + uint32_t index = 1; + while (curr < end) { + const btf_type *t = reinterpret_cast(curr); + int type_size = BuildOneType(t, index); + m_assert(type_size > 0, "Could not identify BTF type"); + curr += type_size; + ++index; + } + + BuildElfSymbols(); +} + +int Structs::BuildOneType(const btf_type *t, uint32_t index) { + const auto kind = BTF_INFO_KIND(t->info); + const auto vlen = BTF_INFO_VLEN(t->info); + // Data following the btf_type struct. + const void *data = reinterpret_cast(t + 1); + m_assert(kind >= 0 && kind < NR_BTF_KINDS, "Unknown BTF kind"); + + if (verbose_) std::cout << '[' << index << "] "; + int type_size = sizeof(struct btf_type); + switch (kind) { + case BTF_KIND_INT: { + const auto bits = *reinterpret_cast(data); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, + BTF_INT_ENCODING(bits), BTF_INT_OFFSET(bits), BTF_INT_BITS(bits), + t->size)); + if (verbose_) { + std::cout << types_.back()->as(); + } + type_size += sizeof(uint32_t); + break; + } + case BTF_KIND_PTR: { + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, t->type)); + if (verbose_) { + std::cout << types_.back()->as(); + } + break; + } + case BTF_KIND_TYPEDEF: { + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, t->type)); + if (verbose_) { + std::cout << types_.back()->as(); + } + break; + } + case BTF_KIND_VOLATILE: + case BTF_KIND_CONST: + case BTF_KIND_RESTRICT: { + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, t->type)); + if (verbose_) { + std::cout << types_.back()->as(); + } + break; + } + case BTF_KIND_ARRAY: { + const auto *array = reinterpret_cast(data); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, array->type, + array->index_type, array->nelems)); + if (verbose_) { + std::cout << types_.back()->as(); + } + type_size += sizeof(struct btf_array); + break; + } + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: { + const bool kflag = BTF_INFO_KFLAG(t->info); + const auto *btf_members = reinterpret_cast(data); + const auto members = BuildMembers(kflag, btf_members, vlen); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, t->size, members)); + if (verbose_) { + std::cout << types_.back()->as(); + } + type_size += vlen * sizeof(struct btf_member); + break; + } + case BTF_KIND_ENUM: { + const auto *enums = reinterpret_cast(data); + std::map enumerators = BuildEnums(enums, vlen); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, + t->size, enumerators)); + if (verbose_) { + std::cout << types_.back()->as(); + } + type_size += vlen * sizeof(struct btf_enum); + break; + } + case BTF_KIND_FWD: { + const bool kflag = BTF_INFO_KFLAG(t->info); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, kflag)); + if (verbose_) { + std::cout << types_.back()->as(); + } + break; + } + case BTF_KIND_FUNC: { + const auto name = GetName(t->name_off); + types_.push_back(std::make_unique( + types_, index, name, kind, t->type, + Function::Linkage(vlen))); + if (verbose_) { + std::cout << types_.back()->as(); + } + bool inserted = btf_symbol_types_.insert({name, t->type}).second; + if (!inserted) { + // NOTE: duplicate BTF symbols could be considered a bug in pahole + // + // TODO: remove these checks once resolved + bool known = !name.compare(0, bad_prefix_1_.size(), bad_prefix_1_) || + !name.compare(0, bad_prefix_2_.size(), bad_prefix_2_) || + std::find(bad_names_.begin(), bad_names_.end(), name) != + bad_names_.end(); + m_assert(known, "Insertion failed, duplicate found in symbol map"); + (void)known; + } + btf_symbols_.insert({name, types_.back().get()}); + break; + } + case BTF_KIND_FUNC_PROTO: { + const auto *params = reinterpret_cast(data); + std::vector parameters = BuildParams(params, vlen); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, t->type, + parameters)); + if (verbose_) { + std::cout << types_.back()->as(); + } + type_size += vlen * sizeof(struct btf_param); + break; + } + case BTF_KIND_VAR: { + // NOTE: not yet encountered in the wild + const auto *variable = reinterpret_cast(data); + const auto name = GetName(t->name_off); + types_.push_back(std::make_unique( + types_, index, name, kind, t->type, + Variable::Linkage(variable->linkage))); + if (verbose_) { + std::cout << types_.back()->as(); + } + + bool inserted = btf_symbol_types_.insert({name, t->type}).second; + m_assert(inserted, "Insertion failed, duplicate found in symbol map"); + (void)inserted; + btf_symbols_.insert({name, types_.back().get()}); + + type_size += sizeof(struct btf_var); + break; + } + case BTF_KIND_DATASEC: { + std::vector secinfos = BuildDatasec(t, vlen); + types_.push_back(std::make_unique( + types_, index, GetName(t->name_off), kind, t->size, + secinfos)); + if (verbose_) { + std::cout << types_.back()->as(); + } + type_size += vlen * sizeof(struct btf_var_secinfo); + } + } + return type_size; +} + +std::string_view Structs::GetName(uint32_t name_off) { + m_assert(name_off < header_->str_len, + "The name offset exceeds the section length"); + const char *section_end = str_section_ + header_->str_len; + const char *name_end = std::find(&str_section_[name_off], section_end, '\0'); + m_assert(name_end < section_end, + "The name continues past the string section limit"); + (void)name_end; + + std::string_view name{&str_section_[name_off]}; + return name; +} + +void Structs::PrintStringSection() { + std::cout << "String section:\n"; + const char *curr = str_section_; + const char *limit = str_section_ + header_->str_len; + while (curr < limit) { + const char *pos = std::find(curr, limit, '\0'); + m_assert(pos < limit, "Error reading the string section"); + std::cout << ' ' << curr; + curr = pos + 1; + } + std::cout << '\n'; +} + +void Structs::BuildElfSymbols() { + const auto filter = [&]() { + auto filter = tab_->make_filter(); + filter.set_public_symbols(); + return filter; + }(); + for (const auto &symbol : + abigail::symtab_reader::filtered_symtab(*tab_, filter)) { + const auto &symbol_name = symbol->get_name(); + const auto &main_symbol_name = symbol->get_main_symbol()->get_name(); + auto it = btf_symbol_types_.find(main_symbol_name); + if (it == btf_symbol_types_.end()) { + // missing BTF information is tracked explicitly + std::cerr << "ELF symbol " << std::quoted(symbol_name); + if (symbol_name != main_symbol_name) + std::cerr << " (aliased to " << std::quoted(main_symbol_name) << ')'; + std::cerr << " BTF info missing\n"; + types_.push_back(nullptr); + } else { + uint32_t type_id = it->second; + types_.push_back(std::make_unique(types_, symbol, type_id)); + } + elf_symbols_.emplace(symbol_name, types_.back().get()); + } +} + +class ElfHandle { + public: + ElfHandle(const std::string &path) : dwfl_(nullptr, dwfl_end) { + string name; + tools_utils::base_name(path, name); + elf_version(EV_CURRENT); + + dwfl_ = std::unique_ptr( + dwfl_begin(&offline_callbacks_), dwfl_end); + auto dwfl_module = dwfl_report_offline(dwfl_.get(), name.c_str(), + path.c_str(), -1); + GElf_Addr bias; + elf_handle_ = dwfl_module_getelf(dwfl_module, &bias); + } + + // Conversion operator to act as a drop-in replacement for Elf* + operator Elf *() const { return elf_handle_; } + + Elf *get() const { return elf_handle_; } + + private: + // Dwfl owns all our data, hence only keep track of this + std::unique_ptr dwfl_; + Elf *elf_handle_; + + Dwfl_Callbacks offline_callbacks_; +}; + +Structs ReadFile(const std::string &path, bool verbose) { + using abigail::symtab_reader::symtab; + + ElfHandle elf(path); + m_assert(elf.get() != nullptr, "Could not get elf handle from file."); + + Elf_Scn *btf_section = + abigail::elf_helpers::find_section(elf, ".BTF", SHT_PROGBITS); + m_assert(btf_section != nullptr, + "The given file does not have a BTF section"); + Elf_Data *elf_data = elf_rawdata(btf_section, 0); + m_assert(elf_data != nullptr, "The BTF section is invalid"); + const char *btf_start = static_cast(elf_data->d_buf); + + auto env = std::make_unique(); + auto tab = symtab::load(elf, env.get()); + + return Structs(btf_start, std::move(env), std::move(tab), verbose); +} + +} // end namespace btf +} // end namespace abigail