From patchwork Thu Mar 25 21:51:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42774 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E7D663858001; Thu, 25 Mar 2021 21:52:09 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E7D663858001 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709129; bh=rtNyey0W4rdVw22uiMlDxOS7/ozvLNb9B9VCPHJNb7Y=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=iU/OTs1B5tMKsdEHdjrAKNlVlF65os2QZS1nea7A/T9GkiNGL7WHQk+MS+A1dAidp +X6tjYAqE4Q67AsZ6ieSbji0pdYCIHwktfICZ6dvbMdQibbsxkxMP2hQDv++l233Y+ WWYujGbjuQxWidpnu+0N0+yNZq3RYah61U3aM9Vs= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qv1-xf49.google.com (mail-qv1-xf49.google.com [IPv6:2607:f8b0:4864:20::f49]) by sourceware.org (Postfix) with ESMTPS id 680CF3858001 for ; Thu, 25 Mar 2021 21:52:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 680CF3858001 Received: by mail-qv1-xf49.google.com with SMTP id a7so2196286qvx.10 for ; Thu, 25 Mar 2021 14:52:05 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=rtNyey0W4rdVw22uiMlDxOS7/ozvLNb9B9VCPHJNb7Y=; b=A5TnY52dSSfDv5NrrbOluc7elJcgrJ9YlDyLaNqQYWydyD3PQMMrJoL4GELg8HjkdZ Pu++deHI8SAFG6hr5jpn9hc0oMHYc0c/6DTiOYVFAW2KHl7W7qiZAgM5rs5OB0uBZ+yk cgRL0V5H9RmyvO1Xx7QAGSzp0iwPapHBMO2ZEt7AuhT9p4PTMDZN4S0CAZmpznfSlBOs U5S6BVcobxlE2AUmmv43OnmrFxEaYGcHjCUrtCygLQQoyS6LVcjOYKYizcR4g29T+uy3 BPMRyRCqrwh6urX4Hif9cyNvDU0UwaoIp1txVbpd/yYYUc1Kat23acNg9ET7mxmuYHhD C09A== 
X-Gm-Message-State: AOAM531/aw0mJmUZNEiK4TBsb9J7VfZWAslK1l7HZZ4kv6J9HuiGYfuO Em8g0sATYYch/KVovAki+p5h65zO2I8nyUlMNgHZY7OgsrJWHsvisOm09PVPr1FI1E5lOkqPjmk j0Jl/tV/kEhOisNeKs8a+fZqDGjHTGN5afmFG5obhEYj8w9vkVIrEvzGHCKXIQDqQoMLZpa8= X-Google-Smtp-Source: ABdhPJxjkw/yqeda6xUJYsWGYY/wLKkmdJTozWW64CDQxevjqQcBEVcxPsMKP+vD7y++0G8regty9tbB/mrqyg== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a05:6214:10c7:: with SMTP id r7mr10469828qvs.3.1616709124904; Thu, 25 Mar 2021 14:52:04 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:38 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-2-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 1/9] Add ABI tidying utility To: libabigail@sourceware.org X-Spam-Status: No, score=-22.5 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail"

This initial version:

- reads XML into a DOM
- strips all text (whitespace) from elements
- reindents, assuming empty elements are emitted as a single tag
- writes out XML, excluding the XML declaration

Removing text elements makes other manipulation of the XML DOM
easier. This should be a semantics-preserving transformation, but is not. See https://sourceware.org/bugzilla/show_bug.cgi?id=27616.

* scripts/abitidy.pl (strip_text): New function to remove text nodes from a DOM.
(indent): New function to add whitespace nodes to reindent a DOM.
(...): The rest of the script consists of top-level comments, option handling and DOM read / write.

Signed-off-by: Giuliano Procida
--- scripts/abitidy.pl | 146 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 146 insertions(+) create mode 100755 scripts/abitidy.pl diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl new file mode 100755 index 00000000..66d636d7 --- /dev/null +++ b/scripts/abitidy.pl @@ -0,0 +1,146 @@ +#!/usr/bin/perl + +# This script is intended to consume libabigail ABI XML as generated +# by abidw and produce a possibly smaller representation that captures +# the same ABI. In particular, the output should be such that abidiff +# --harmless reports no differences (or is empty). + +use v5.32.0; +use strict; +use warnings; +use experimental 'signatures'; + +use autodie; + +use Data::Dumper; +use Getopt::Long; +use IO::File; +use XML::LibXML; + +# Overview of ABI XML elements and their roles +# +# ELF +# +# elf-needed - container +# dependency - names a library +# elf-variable-symbols - contains a list of symbols +# elf-function-symbols - contains a list of symbols +# elf-symbol - describes an ELF variable or function +# +# Grouping and scoping +# +# abi-corpus-group +# abi-corpus +# abi-instr - compilation unit containers +# namespace-decl - pure container, possibly named +# +# Types (some introduce scopes, only in C++) +# +# type-decl - defines a primitive type +# typedef-decl - defines a type, links to a type +# qualified-type-def - defines a type, links to a type +# pointer-type-def - defines a type, links to a type +# reference-type-def - defines a type, links to a type +# array-type-def - defines a (multidimensional array) type, refers to element type, 
contains subranges +# subrange - contains array length, refers to element type; defines types (never referred to; duplicated) +# function-type - defines a type +# parameter - belongs to function-type and -decl, links to a type +# return - belongs to function-type and -decl, links to a type +# enum-decl - defines a type, names it, contains a list of enumerators and an underlying-type +# underlying-type - belongs to enum-decl +# enumerator - belongs to enum-decl +# union-decl - defines and names a type, contains member elements linked to other things +# class-decl - defines and names a type, contains base type, member elements linking to other things +# base-class - belongs to class-decl +# data-member - container for a member; holds access level +# member-function - container for a member; holds access level +# member-type - container for a type declaration; holds access level +# member-template - container for a (function?) template declaration; holds access level +# +# Higher order Things +# +# class-template-decl - defines a type (function), but without instantiation this isn't usable +# function-template-decl - defines a type (function), but without instantiation this isn't usable +# template-type-parameter - defines a type (variable), perhaps one which should be excluded from the real type graph +# template-non-type-parameter - names a template parameter, links to a type +# template-parameter-type-composition - container? +# +# Values +# +# var-decl - names a variable, can link to a symbol, links to a type +# function-decl - names a function, can link to a symbol +# has same children as function-type, rather than linking to a type + +# Remove all text nodes. +sub strip_text($dom) { + for my $node ($dom->findnodes('//text()')) { + $node->unbindNode(); + } +} + +# Make XML nicely indented. We could make the code a bit less inside +# out by passing the parent node as an extra argument. Efforts in this +# direction ran into trouble. 
+sub indent; +sub indent($indent, $node) { + if ($node->nodeType == XML_ELEMENT_NODE) { + my @children = $node->childNodes(); + return unless @children; + my $more_indent = $indent + 2; + # The ordering of operations here is incidental. The outcomes we + # want are 1. an extra newline after the opening tag and + # reindenting the closing tag to match, and 2. indentation for the + # children. + $node->insertBefore(new XML::LibXML::Text("\n"), $children[0]); + for my $child (@children) { + $node->insertBefore(new XML::LibXML::Text(' ' x $more_indent), $child); + indent($more_indent, $child); + $node->insertAfter(new XML::LibXML::Text("\n"), $child); + } + $node->appendText(' ' x $indent); + } else { + for my $child ($node->childNodes()) { + indent($indent, $child); + } + } +} + +# Parse arguments. +my $input_opt; +my $output_opt; +my $all_opt; +GetOptions('i|input=s' => \$input_opt, + 'o|output=s' => \$output_opt, + 'a|all' => sub { + 1 + }, + ) and !@ARGV or die("usage: $0", + map { (' ', $_) } ( + '[-i|--input file]', + '[-o|--output file]', + '[-a|--all]', + ), "\n"); + +exit 0 unless defined $input_opt; + +# Load the XML. +my $input = $input_opt eq '-' ? \*STDIN : new IO::File $input_opt, '<'; +my $dom = XML::LibXML->load_xml(IO => $input); +close $input; + +# This simplifies DOM analysis and manipulation. +strip_text($dom); + +exit 0 unless defined $output_opt; + +# Reformat for human consumption. +indent(0, $dom); + +# Emit the XML, removing the XML declaration. +my $output = $output_opt eq '-' ? 
\*STDOUT : new IO::File $output_opt, '>'; +my $out = $dom->toString(); +$out =~ s;^<\?xml .*\?>\n;;m; +print $output $out; +close $output; + +exit 0; From patchwork Thu Mar 25 21:51:39 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42775 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 375283857C6B; Thu, 25 Mar 2021 21:52:10 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 375283857C6B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709130; bh=qwIx8x66TcLFTiJLDjfQgXN7KsJo/u7dvgGZOh5NFQc=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=rAsmFB9pyaol3JF7hPqvDJFq2rMhodIy8lV/JBzc3BFheoXUIvO4U3uAVbII40DmE hXBJesGOsKQZoXP29XJb+2/v2nflX6aPWs0u5KT+owM9iy2lwtzYJrKNefnV3wHRdq ox8B9jCKHN27pEIJp2tzFzZRubGXbUj6UXSQdR7U= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qt1-x849.google.com (mail-qt1-x849.google.com [IPv6:2607:f8b0:4864:20::849]) by sourceware.org (Postfix) with ESMTPS id B94033857C62 for ; Thu, 25 Mar 2021 21:52:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org B94033857C62 Received: by mail-qt1-x849.google.com with SMTP id m8so4100470qtp.14 for ; Thu, 25 Mar 2021 14:52:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=qwIx8x66TcLFTiJLDjfQgXN7KsJo/u7dvgGZOh5NFQc=; b=L2mI7PW4dhLcE85pi9NdEWjoaQ3Pbmrkg9AUrNouqM2H5gjeZ/ujolyBeALNv1i12e GmXmpvtzN+FRFpyI+8XIJhz37s51zW0c1Vya36yl38SMCoKRk1rlbcNEmP4D8vmnOoK6 
mkhVH1XGPsdp8wAjlLK0Tj7xCAjwJvM/nFRQXIKCCyILhOyBbllSRF7X3AH8SCP+xuez kqYu9cSRp+7fYhrfN9lj5Mo0DWfYVCzB+M/zr0NhpF+6OQXG8e8nMroDf6TaE57VR/9a WuaP0FftwbIuCIESxFykLkbK6nKzcGnu3CuAO2pwhnRgf536XoqQefJnWGhIS4DGyHk7 bnrg== X-Gm-Message-State: AOAM5323eljSL+X51RtHP9qe3B/sViDk5KcozklG5W6CimA6PKh0SjlH 5iock3gxZXa2AWagBqGb1JC85c6XEVp9X+DIWK5Ecw81TPyvtrZtEgIy2k9S8GqaEtv6Rvto84r Nz9YuVZ8wInI0IszkcjjFfbsaX01/E1g960n5Nd0lL2EMmfoPgdHP+Tnt1j8odPPf4ZhtOfw= X-Google-Smtp-Source: ABdhPJzkAdOcjQ4RYRNsb9475ZCrl47g/EfR78dxKrR6sT0sFuPx+uQpES7VNek5sfWUc5oi/LE3OxXC+ZKBwg== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a05:6214:18e5:: with SMTP id ep5mr10799403qvb.32.1616709127287; Thu, 25 Mar 2021 14:52:07 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:39 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-3-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 2/9] Add pass to drop empty XML elements To: libabigail@sourceware.org X-Spam-Status: No, score=-22.3 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" Certain elements in ABI XML are 
effectively containers and can be dropped if empty and their attributes don't carry ABI information.

- elf-variable-symbols: pure container
- elf-function-symbols: pure container
- namespace-decl: has a name
- abi-instr: compilation unit (path etc.)
- abi-corpus: binary object (architecture)
- abi-corpus-group: binary objects (architecture)

It could be argued that abi-corpus (or abi-corpus-group) should be kept around to hold the architecture of an object or set of objects. However, if a binary object has no symbols (say, if it is empty), it hardly matters what the architecture is.

Note that:

- abidiff rejects XML files with an XML declaration at the top
- abidiff rejects completely empty files

Resolving the first would make the second moot. In the meantime, we avoid dropping top-level elements.

* scripts/abitidy.pl (drop_if_empty): New variable containing the tags of elements that can be dropped if empty.
(drop_empty): New function that removes empty elements, except top-level ones.

Signed-off-by: Giuliano Procida
--- scripts/abitidy.pl | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index 66d636d7..1f74e267 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -105,20 +105,58 @@ sub indent($indent, $node) { } } +# Remove an XML element and any preceding comment. +sub remove_node($node) { + my $prev = $node->previousSibling(); + if ($prev && $prev->nodeType == XML_COMMENT_NODE) { + $prev->unbindNode(); + } + $node->unbindNode(); +} + +# These container elements can be dropped if empty. +my %drop_if_empty = map { $_ => undef } qw( + elf-variable-symbols + elf-function-symbols + namespace-decl + abi-instr + abi-corpus + abi-corpus-group +); + +# This is an XML DOM traversal, as we want post-order traversal so that we +# delete nodes that become empty during the process. 
+sub drop_empty; +sub drop_empty($node) { + my $node_name = $node->getName(); + for my $child ($node->childNodes()) { + drop_empty($child); + } + if (!$node->hasChildNodes() && $node->nodeType == XML_ELEMENT_NODE && exists $drop_if_empty{$node->getName()}) { + # Until abidiff accepts empty ABIs, avoid dropping top-level elements. + if ($node->parentNode->nodeType == XML_ELEMENT_NODE) { + remove_node($node); + } + } +} + # Parse arguments. my $input_opt; my $output_opt; my $all_opt; +my $drop_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, 'a|all' => sub { - 1 + $drop_opt = 1 }, + 'd|drop-empty!' => \$drop_opt, ) and !@ARGV or die("usage: $0", map { (' ', $_) } ( '[-i|--input file]', '[-o|--output file]', '[-a|--all]', + '[-d|--[no-]drop-empty]', ), "\n"); exit 0 unless defined $input_opt; @@ -131,6 +169,9 @@ close $input; # This simplifies DOM analysis and manipulation. strip_text($dom); +# Drop empty elements. +drop_empty($dom) if $drop_opt; + exit 0 unless defined $output_opt; # Reformat for human consumption. 
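For illustration only (not part of the patch): a minimal sketch of the post-order idea behind drop_empty, using nested hashes as a hypothetical stand-in for the DOM instead of XML::LibXML. The helper names drop_empty_children and tags, and the sample corpus, are assumptions made for this sketch. Children are pruned before their parent is examined, so a container that only held empty containers is removed as well, while the node passed in at the top is never removed.

```perl
use strict;
use warnings;

# Tags of pure containers, taken from the list in the patch above.
my %drop_if_empty = map { $_ => 1 } qw(
  elf-variable-symbols elf-function-symbols namespace-decl
  abi-instr abi-corpus abi-corpus-group
);

# Hypothetical element shape: { tag => 'name', children => [ ... ] }.
# Post-order: prune each child's subtree first, then decide whether
# the child itself is now an empty, droppable container.
sub drop_empty_children {
    my ($node) = @_;
    my @kept;
    for my $child (@{ $node->{children} }) {
        drop_empty_children($child);
        next if !@{ $child->{children} } && $drop_if_empty{ $child->{tag} };
        push @kept, $child;
    }
    $node->{children} = \@kept;
    return $node;
}

# Flatten a tree into a pre-order list of tags, for inspection.
sub tags {
    my ($node) = @_;
    return ($node->{tag}, map { tags($_) } @{ $node->{children} });
}

my $corpus = {
    tag      => 'abi-corpus',
    children => [
        { tag => 'elf-function-symbols', children => [] },
        { tag => 'abi-instr', children => [
            { tag => 'namespace-decl', children => [] },
        ] },
        { tag => 'abi-instr', children => [
            { tag => 'type-decl', children => [] },
        ] },
    ],
};

drop_empty_children($corpus);
print join(' ', tags($corpus)), "\n";   # prints "abi-corpus abi-instr type-decl"
```

The first abi-instr disappears only because its namespace-decl child was removed first; a pre-order walk would have kept it. The empty type-decl survives because it is not in the container list.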
From patchwork Thu Mar 25 21:51:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42776 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A39A53857C62; Thu, 25 Mar 2021 21:52:13 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A39A53857C62 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709133; bh=5G0wBMS0urIiBLECL+XLWnN6kMBqHJC0RDMurAxp21M=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=Z6Z9O8+X60DMH8uQBiQBRA9U2bPlor+hBGR0ySNQEPek1WC0aM5+2Kz3Up4povJxP uFsxVFx3SPoxQcFRuK8zTA64BZ1CAa+26cGKZbJFlLCItPH9G0+CMGITIUAfY+IkWM R6vaIOvfFzSUbS7uWnq6fkxZ4ehcrcHktnPOtcps= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-wr1-x449.google.com (mail-wr1-x449.google.com [IPv6:2a00:1450:4864:20::449]) by sourceware.org (Postfix) with ESMTPS id A57EB385783A for ; Thu, 25 Mar 2021 21:52:10 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org A57EB385783A Received: by mail-wr1-x449.google.com with SMTP id n16so3265050wro.1 for ; Thu, 25 Mar 2021 14:52:10 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=5G0wBMS0urIiBLECL+XLWnN6kMBqHJC0RDMurAxp21M=; b=N4QOc1fvc1HnOu+dza0RgRn8HSG/qQSSogJKB1w2dcVc5iAFOU/OCARb6AqrmbgxIi h4XAJYCK1mgmJMiYQIu7SUdyMtvKbfHgMW/l/4QtuY3Y4+yltdJTbMrSjwYLJGpjDOZj 7H03L6A5y5DGxjRnueZloAra18gcecs58YgwsAp7TvgY04Z52rbzQnj0j6ZqLlW6ddbw pL4gcOUmDJ4W9IQDTnRD99LPk9VKgFjY9Z99cPRbw7HFNHiwz7ndgOf09TrcKa32OB72 gl6uwJP3+loTlvQ6JW1NAL7hWuaf8zCB0NvEfHCmIvouriOqyquCJcuOG+Fn4NepkVO0 qoPQ== 
X-Gm-Message-State: AOAM530ICWwfSJKvHzrSLDpWXIei2B43KZCsxSDofdIQqABuYI9XjSd0 HqMc26U70RbzJB2YJ06WK33D7YYCBwVtKdas8OqJnj5ak9jYRo60wMRTMghKUIUe8X5VPjN8joV vdSeiQCHMiJaPMzp9NewSviglkYIbSvpB3qJ9uoLvpy2HYy2C510Lob0ENvGEZ6D87LA/5q0= X-Google-Smtp-Source: ABdhPJzR69cRHWlBxnLHRpyYwUy5lYupWInDJqWgaTL1J7K+YeVnCihHfVPMv3urLC99Cox8W0L1y7GKRbVbmw== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a7b:c209:: with SMTP id x9mr9916602wmi.92.1616709129551; Thu, 25 Mar 2021 14:52:09 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:40 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-4-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 3/9] Add pass to prune unreachable parts of the ABI To: libabigail@sourceware.org X-Spam-Status: No, score=-23.3 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail"

In response to internal reports about very large ABIs for libraries implemented in C++ but exposing only a C API, I wrote a script as a means of investigating the issue. This has been adapted here.
The aim is to remove parts of the XML that are irrelevant to abidiff. ELF symbols (and shared object dependencies) are taken to be the graph roots for the purpose of reachability analysis. On most XML files, this results in a modest saving (5-10%). However, for the library that sparked the investigation, the resulting XML was more than 2300 times smaller. If this functionality is implemented in libabigail itself, this pass will become a no-op.

* scripts/abitidy.pl (prune_unreachable): New function to determine the reachable declaration and type nodes in an XML ABI and remove all unreachable ones.

Signed-off-by: Giuliano Procida
--- scripts/abitidy.pl | 215 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 214 insertions(+), 1 deletion(-) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index 1f74e267..c9f93ed8 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -140,23 +140,233 @@ sub drop_empty($node) { } } +# Remove unreachable declarations and types. +# +# When making a graph from ABI XML, the following are the types of +# "node" we care about. The "edges" are obtained from a few XML +# attributes as well as via XML element containment. +# +# ELF (exported) symbols +# +# elf-symbol (has a name; the code here currently knows nothing +# about aliases) +# +# Declarations (that mention a symbol) +# +# These live in a scope. In C++ scopes can be nested and include +# namespaces and class types. +# +# var-decl (also used for member variables) +# elf-symbol linked to symbol via mangled-name +# type-id links to a type +# function-decl (also used for member functions) +# elf-symbol linked to symbol via mangled-name +# parameter and return type-ids link to types +# (alas, not just a link to a function type) +# +# Types +# +# These occupy pretty much all the other elements, besides those +# that act as simple containers. +sub prune_unreachable($dom) { + my %elf_symbols; + # Graph vertices (used for statistics and to check that symbols have declarations). + my %vertices; + # Graph edges. 
+ my %edges; + + # Keep track of type / symbol nesting. + my @stack; + + # Traverse the whole XML DOM. + my sub make_graph($node) { + # The XML attributes we care about. + my $name; + my $id; + my $type_id; + my $symbol; + my $naming_typedef_id; + + # Not every node we encounter is an XML element. + if ($node->nodeType == XML_ELEMENT_NODE) { + $name = $node->getAttribute('name'); + $id = $node->getAttribute('id'); + $type_id = $node->getAttribute('type-id'); + $symbol = $node->getAttribute('mangled-name'); + $naming_typedef_id = $node->getAttribute('naming-typedef-id'); + die if defined $id && defined $symbol; + } + + if (defined $name && $node->getName() eq 'elf-symbol') { + $elf_symbols{$name} = undef; + # Early return is safe, but not necessary. + return; + } + + if (defined $id) { + my $vertex = "type:$id"; + # This element defines a type (but there may be more than one + # defining the same type - we cannot rely on uniqueness). + $vertices{$vertex} = undef; + if (defined $naming_typedef_id) { + # This is an odd one, there can be a backwards link from an + # anonymous type to the typedef that refers to it, so we need to + # pull in the typedef, even if nothing else refers to it. + $edges{$vertex}{"type:$naming_typedef_id"} = undef; + } + if (@stack) { + # Parent<->child dependencies; record dependencies both + # ways to avoid holes in XML types and declarations. + $edges{$stack[-1]}{$vertex} = undef; + $edges{$vertex}{$stack[-1]} = undef; + } + push @stack, $vertex; + } + + if (defined $symbol) { + my $vertex = "symbol:$symbol"; + # This element is a declaration linked to a symbol (whether or not + # exported). + $vertices{$vertex} = undef; + if (@stack) { + # Parent<->child dependencies; record dependencies both ways + # to avoid holes in XML types and declarations. + # + # Symbols exist outside of the type hierarchy, so choosing to + # make them depend on a containing type scope and vice versa + # is conservative and probably not necessary. 
+ $edges{$stack[-1]}{$vertex} = undef; + $edges{$vertex}{$stack[-1]} = undef; + } + # The symbol depends on the types mentioned in this element, so + # record it. + push @stack, $vertex; + # In practice there will be at most one symbol on the stack; we + # could verify this here, but it wouldn't achieve anything. + } + + if (defined $type_id) { + if (@stack) { + # The enclosing type or symbol refers to another type. + $edges{$stack[-1]}{"type:$type_id"} = undef; + } + } + + for my $child ($node->childNodes()) { + __SUB__->($child); + } + + if (defined $symbol) { + pop @stack; + } + if (defined $id) { + pop @stack; + } + } + + # Build a graph. + make_graph($dom); + die if @stack; + #warn Dumper(\%elf_symbols, \%vertices, \%edges); + + # DFS visited state. Would be nicer with a flat namespace of nodes. + my %seen; + my sub dfs($vertex) { + no warnings 'recursion'; + return if exists $seen{$vertex}; + $seen{$vertex} = undef; + + my $tos = $edges{$vertex}; + if (defined $tos) { + for my $to (keys %$tos) { + __SUB__->($to); + } + } + } + + # Traverse the graph, starting from the exported symbols. + for my $symbol (keys %elf_symbols) { + my $vertex = "symbol:$symbol"; + if (exists $vertices{$vertex}) { + dfs($vertex); + } else { + warn "no declaration found for ELF symbol $symbol\n"; + } + } + + #warn Dumper(\%seen); + + # Useful counts. + my sub print_report() { + my $count_elf_symbols = scalar keys %elf_symbols; + my $count_vertices = scalar keys %vertices; + my $count_seen = scalar keys %seen; + + warn qq{ELF = $count_elf_symbols +vertices = $count_vertices +seen = $count_seen +}; + } + + #print_report(); + + # XPath selection is too slow as we end up enumerating lots of + # nested items whose preservation is entirely determined by their + # containing items. DFS with early stopping for the win. 
+ my sub remove_unwanted($node) { + my $node_name = $node->getName(); + my $name; + my $id; + my $symbol; + + if ($node->nodeType == XML_ELEMENT_NODE) { + $name = $node->getAttribute('name'); + $id = $node->getAttribute('id'); + $symbol = $node->getAttribute('mangled-name'); + die if defined $id && defined $symbol; + } + + # Return if we know that this is a type or declaration to keep or + # drop in its entirety. + if (defined $id) { + remove_node($node) unless exists $seen{"type:$id"}; + return; + } + if ($node_name eq 'var-decl' || $node_name eq 'function-decl') { + remove_node($node) unless defined $symbol && exists $seen{"symbol:$symbol"}; + return; + } + + # Otherwise, this is not a type, declaration or part thereof, so + # process child elements. + for my $child ($node->childNodes()) { + __SUB__->($child); + } + } + + remove_unwanted($dom); +} + # Parse arguments. my $input_opt; my $output_opt; my $all_opt; my $drop_opt; +my $prune_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, 'a|all' => sub { - $drop_opt = 1 + $drop_opt = $prune_opt = 1 }, 'd|drop-empty!' => \$drop_opt, + 'p|prune-unreachable!' => \$prune_opt, ) and !@ARGV or die("usage: $0", map { (' ', $_) } ( '[-i|--input file]', '[-o|--output file]', '[-a|--all]', '[-d|--[no-]drop-empty]', + '[-p|--[no-]prune-unreachable]', ), "\n"); exit 0 unless defined $input_opt; @@ -169,6 +379,9 @@ close $input; # This simplifies DOM analysis and manipulation. strip_text($dom); +# Prune unreachable elements. +prune_unreachable($dom) if $prune_opt; + # Drop empty elements. 
drop_empty($dom) if $drop_opt; From patchwork Thu Mar 25 21:51:41 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42777 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E98B5385700B; Thu, 25 Mar 2021 21:52:13 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E98B5385700B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709134; bh=HcaYo5hYaKg7TpHYpU+q7TVR30n/F6ug5ZYX4oIPErU=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=VKGYB+XIoKow4lnv0lpVyncqkJzTWxJRM21DX9TXY8JRPRwKfBjREbbRoVyBLYe13 s6i0cW9K6owM4nQ+95WalZTIPr4LkaJLEaPyf38DtJCNb6iimD/CYMmI/0E/gP7aP0 aw1faTejBks9BLc5QXuWIeUYfUtTlpvRnT+gJi4k= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qv1-xf4a.google.com (mail-qv1-xf4a.google.com [IPv6:2607:f8b0:4864:20::f4a]) by sourceware.org (Postfix) with ESMTPS id 727FA3857C62 for ; Thu, 25 Mar 2021 21:52:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 727FA3857C62 Received: by mail-qv1-xf4a.google.com with SMTP id b15so4488833qvz.15 for ; Thu, 25 Mar 2021 14:52:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=HcaYo5hYaKg7TpHYpU+q7TVR30n/F6ug5ZYX4oIPErU=; b=EnsdNz19d325gRnlYzpLgpQjC4Fzs5MYGnuTUKaZhWYxoXvzq+eiOi0bqUAQVQ6KLc UEiELCiAcgtmQnNs6aNHxqmSX2MXFHQwwlSiqbvPE5nj1qZIW1N4jc6CRnoitGczvR9F b0yV2zCMfg/GTy383Bz9koR+5qdoLU9jO0TypANb0clUZ44JoApSz1uiO7nY7ORMFqmP b/iSH4nKJuJ+1qE5GZl1qkOonL8BbkuFRG4DAxSkX1ZdfldmowBmLmRnP5W/bY5cnnKA 
BUxqLTte+fHlXa10gSUmTuA9NpNfuIZc76lCsZJK9tfy/TMZr1lL35ukO8SratO1Q3vN LhOg== X-Gm-Message-State: AOAM530Umo8eA4l+I+Z/LKz+y3R33/HltcuNkuFQmNatUGy6OQZQ2VSa 1QGbCCLFYYK6G+78oJGIZ6cNbvFAKrE5Zb63wfs95PrLxK3Of2TC5xXCo50hhmwzIfWAIeKdVqd 3vQR+bghhBydUTCwKhL8J3PrRCCHrHCheV572hNrIbyMV43xp3KqgOvfLCIaUyiZkicHTJoM= X-Google-Smtp-Source: ABdhPJxgReNnce90qT9+AOuDrD/LnqDuOydufncWU6QMfgctSLTAIQE2ro0lWJMeVJLeGusI2C3kOyKNnf2QbQ== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a05:6214:2ea:: with SMTP id h10mr1376656qvu.55.1616709132010; Thu, 25 Mar 2021 14:52:12 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:41 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-5-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 4/9] Add pass to filter symbols To: libabigail@sourceware.org X-Spam-Status: No, score=-22.6 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" Currently symbol lists can be applied at the point of extraction (abidw) or comparison (abidiff). 
This commit enables symbol lists to be applied directly to ABI XML files, which will simplify workflows that use multiple symbol lists.

* scripts/abitidy.pl (read_symbols): New function to read a symbol list from a file.
(filter_symbols): New function to filter out unlisted symbols.
(symbols_opt): New variable to hold the optional symbol list file to filter by.

Signed-off-by: Giuliano Procida
--- scripts/abitidy.pl | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index c9f93ed8..468eeac4 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -347,14 +347,35 @@ seen = $count_seen remove_unwanted($dom); } +# Read symbols from a file. +sub read_symbols($file) { + my %symbols; + my $fh = new IO::File $file, '<'; + while (<$fh>) { + chomp; + $symbols{$_} = undef; + } + close $fh; + return \%symbols; +} + +# Remove unlisted ELF symbols. +sub filter_symbols($symbols, $dom) { + for my $node ($dom->findnodes('//elf-symbol')) { + remove_node($node) unless exists $symbols->{$node->getAttribute('name')}; + } +} + # Parse arguments. my $input_opt; my $output_opt; +my $symbols_opt; my $all_opt; my $drop_opt; my $prune_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, + 's|symbols=s' => \$symbols_opt, 'a|all' => sub { $drop_opt = $prune_opt = 1 }, @@ -364,6 +385,7 @@ GetOptions('i|input=s' => \$input_opt, map { (' ', $_) } ( '[-i|--input file]', '[-o|--output file]', + '[-s|--symbols file]', '[-a|--all]', '[-d|--[no-]drop-empty]', '[-p|--[no-]prune-unreachable]', @@ -379,6 +401,9 @@ close $input; # This simplifies DOM analysis and manipulation. strip_text($dom); +# Remove unlisted symbols. +filter_symbols(read_symbols($symbols_opt), $dom) if defined $symbols_opt; + # Prune unreachable elements. 
prune_unreachable($dom) if $prune_opt; From patchwork Thu Mar 25 21:51:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42778 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3C5FE3858001; Thu, 25 Mar 2021 21:52:20 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3C5FE3858001 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709140; bh=YJPFkz1kZ4kivaqeQ046NiUbp1bHwylZ5GhGXY5qpJI=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=QzW1LWoVPpOePHcL213nnEGFkorBSuVJDm/KlaV5L4cJHwcLGQ5vZWj53veH7Cgjp jXRb8YmiEQ9e6MPq9ZfT0uyRCfDQQjFnrAspZls7M0Z8OdLhbPrOSvKeUGOzQsyxXG IkX/a73y284KHK5sxKl7sd2VNy6e1ZWWVr7uhxic= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-wr1-x44a.google.com (mail-wr1-x44a.google.com [IPv6:2a00:1450:4864:20::44a]) by sourceware.org (Postfix) with ESMTPS id 7D56B3854801 for ; Thu, 25 Mar 2021 21:52:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 7D56B3854801 Received: by mail-wr1-x44a.google.com with SMTP id z6so3237948wrh.11 for ; Thu, 25 Mar 2021 14:52:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=YJPFkz1kZ4kivaqeQ046NiUbp1bHwylZ5GhGXY5qpJI=; b=i5o28iGUXdSYSxtwOcasemkrhkSv3/xMakNoowXv0DvFuRJfjwjDyn7hRaSwuPGJ7w QHUDBRqvYi3PRuOoLHur9+LxEE+7HwH7htXEB1+7+krYgQvbTwrhIGQd8ORWYrG6/WqR 6PpjtxWgTslTDLV4XLZRwVavqqUEVAnixRzf7yCxrug4+4bVEo0VnerVnBQigIy9fGMI 3EU/XTK19YmjD7h+ZbOl0QQ5UYey6jlvIy0EH0oEWfHX2eVbbi+60DL3AbGagzmmccGp 
qFBb4de3VLOirSAN56a2CYTKyG6+axGpg1D80vwmWYFr7Gbu4/o3QUYrYMHnxf9KTvdh swXw== X-Gm-Message-State: AOAM532fUF37GKYEXX/x4lVHADbmjIjR/Rnkcvd66uFKupfs0SOZgvsv ixxSzUXvE1g+H65SG28cejTK4RXmcmX4yo2FYKyIF2caIEVT7XZVSL6rmZI/LGZtrGVaPU0IieZ PXIGE1yZGsE9Tygjm7xUkfe8YbcKtqMLSej0EoJ674SO4Gqy/fZ1yXDQuFtSTz4WaDiCKdwA= X-Google-Smtp-Source: ABdhPJzilbAjWRb3zx66VcFyTqi8Ow/JqXPcOicRbGP42jN+VtaVPQJR8NBvjyVUPyrmikTi7EsawE3OXyiQ/Q== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a05:600c:2ca:: with SMTP id 10mr10057748wmn.40.1616709134533; Thu, 25 Mar 2021 14:52:14 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:42 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-6-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 5/9] Add pass to normalise anonymous type names To: libabigail@sourceware.org X-Spam-Status: No, score=-22.7 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" Currently libabigail exposes the internal names used for anonymous types in ABI XML. 
The names are not stable - they are subject to renumbering - and cause "harmless" diffs which in turn can contribute to hard-to-read and verbose abidiff --harmless output. * scripts/abitidy.pl (normalise_anonymous_type_names): New function to normalise anonymous type names by stripping off numerical suffixes. Signed-off-by: Giuliano Procida --- scripts/abitidy.pl | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index 468eeac4..321363d7 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -366,6 +366,16 @@ sub filter_symbols($symbols, $dom) { } } +sub normalise_anonymous_type_names($) { + my ($doc) = @_; + my $name_path = new XML::LibXML::XPathExpression('//abi-instr//*[@name]'); + for my $node ($doc->findnodes($name_path)) { + my $value = $node->getAttribute('name'); + $value =~ s;^(__anonymous_[a-z]+__)\d*$;$1;; + $node->setAttribute('name', $value); + } +} + # Parse arguments. my $input_opt; my $output_opt; @@ -373,14 +383,16 @@ my $symbols_opt; my $all_opt; my $drop_opt; my $prune_opt; +my $normalise_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, 's|symbols=s' => \$symbols_opt, 'a|all' => sub { - $drop_opt = $prune_opt = 1 + $drop_opt = $prune_opt = $normalise_opt = 1 }, 'd|drop-empty!' => \$drop_opt, 'p|prune-unreachable!' => \$prune_opt, + 'n|normalise-anonymous!' => \$normalise_opt, ) and !@ARGV or die("usage: $0", map { (' ', $_) } ( '[-i|--input file]', @@ -389,6 +401,7 @@ GetOptions('i|input=s' => \$input_opt, '[-a|--all]', '[-d|--[no-]drop-empty]', '[-p|--[no-]prune-unreachable]', + '[-n|--[no-]normalise-anonymous]', ), "\n"); exit 0 unless defined $input_opt; @@ -404,6 +417,9 @@ strip_text($dom); # Remove unlisted symbols. filter_symbols(read_symbols($symbols_opt), $dom) if defined $symbols_opt; +# Normalise anonymous type names. +normalise_anonymous_type_names($dom) if $normalise_opt; + # Prune unreachable elements. 
prune_unreachable($dom) if $prune_opt; From patchwork Thu Mar 25 21:51:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42780 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id ECA133854801; Thu, 25 Mar 2021 21:52:24 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org ECA133854801 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709145; bh=HciouxOvNM2thMTy8tqVKblL7VMaq9wCpqY1gPsm3ts=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=s+pevrzJSMY+Mc5XJD/5E2EfVIk9Yq/UjCMVVY+H4scCmn5kEMMkEP1nII/OnKzs7 mE3rFQX+OMVuvFvjqRVweNbpuxUxLctn/L8tdwRm1rnNTcVUhXrrquQbgUDTU/EuUz EaXUjheih3vGP2LHQ1XUARJkGXR//qgN6HrOInnk= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qk1-x74a.google.com (mail-qk1-x74a.google.com [IPv6:2607:f8b0:4864:20::74a]) by sourceware.org (Postfix) with ESMTPS id 0E919385800D for ; Thu, 25 Mar 2021 21:52:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 0E919385800D Received: by mail-qk1-x74a.google.com with SMTP id b136so4867489qkc.20 for ; Thu, 25 Mar 2021 14:52:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=HciouxOvNM2thMTy8tqVKblL7VMaq9wCpqY1gPsm3ts=; b=gdTkDpwzasLwdRAoyXt8y8Qcyl9+txpCWOuuwUUpHndXrBYYtloaI3H1IpivzaPUo3 PeZZMm4D8c1qPQq7C+T9L9sY/BPZwRXx/QfZp7dyPTGcK/moFsuId5ag6UMpIMxO0Wjt Gmqia3UkO1Yaiv0IPHWI9+COxdRVunW+BpQ+odE8rVCybslRjbEG/t+NxfZ3XaYD3W1Q B/18lhp+O9hTudP6jOZuM6IwxfYOC7sGQaFL8Uz05kAfAij1oKl2mYI7duftD4QGpB/6 
MeiteXFZWIlX9g4Erga/VuKp8IjexKYu8KF2x2PodHTUs1ksR4SUQ25VXi7iCUJx2EKv 1ImQ== X-Gm-Message-State: AOAM531j/EmmIIOGywgCvkvJuFZdyZha0Y3llo81s5senAQY3Pkvsn5R yktuYHD8omHLCpZ59IBS/DJuSCdf++nbeCtRHtkXSiHSRZrRtcgD19y4wDFAqJ/FxZaBz8FKV51 UevUzPasVVGopJArl/p3F+cRflsjdjebaEObfEm6IvHH5Udajog6nLFtqE8O06DFBVesXLUc= X-Google-Smtp-Source: ABdhPJyhRh4nvgyqQswl8jr0xVZlO4QLdKLgTY7m6H6tMClyeCDV98i3vAtfNFF0lojP3R16xfmNMFpcNj8UUg== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a0c:e38f:: with SMTP id a15mr10677802qvl.18.1616709136554; Thu, 25 Mar 2021 14:52:16 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:43 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-7-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 6/9] Add pass to report duplicate type ids To: libabigail@sourceware.org X-Spam-Status: No, score=-22.8 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" Duplicate type ids sometimes appear in ABI XML files. 
If these relate to subrange elements, they are innocuous, otherwise they represent some duplication or even inconsistency in libabigail output. See https://sourceware.org/bugzilla/show_bug.cgi?id=26591. * scripts/abitidy.pl (report_duplicate_types): New function to report duplicate types. Signed-off-by: Giuliano Procida --- scripts/abitidy.pl | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index 321363d7..d5ddd7ea 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -376,6 +376,22 @@ sub normalise_anonymous_type_names($) { } } +sub report_duplicate_types($dom) { + my %hash; + for my $type ($dom->findnodes('*[@id]')) { + # subranges are not really types and shouldn't be considered + next if $type->getName() eq 'subrange'; + my $id = $type->getAttribute('id'); + for my $ids ($hash{$id}) { + $ids //= []; + push @$ids, $type; + } + } + for my $id (keys %hash) { + warn "residual duplicated types with id $id\n"; + } +} + # Parse arguments. my $input_opt; my $output_opt; @@ -384,15 +400,17 @@ my $all_opt; my $drop_opt; my $prune_opt; my $normalise_opt; +my $report_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, 's|symbols=s' => \$symbols_opt, 'a|all' => sub { - $drop_opt = $prune_opt = $normalise_opt = 1 + $drop_opt = $prune_opt = $normalise_opt = $report_opt = 1 }, 'd|drop-empty!' => \$drop_opt, 'p|prune-unreachable!' => \$prune_opt, 'n|normalise-anonymous!' => \$normalise_opt, + 'r|report-duplicates!' => \$report_opt, ) and !@ARGV or die("usage: $0", map { (' ', $_) } ( '[-i|--input file]', @@ -402,6 +420,7 @@ GetOptions('i|input=s' => \$input_opt, '[-d|--[no-]drop-empty]', '[-p|--[no-]prune-unreachable]', '[-n|--[no-]normalise-anonymous]', + '[-r|--[no-]report-duplicates]', ), "\n"); exit 0 unless defined $input_opt; @@ -420,6 +439,9 @@ filter_symbols(read_symbols($symbols_opt), $dom) if defined $symbols_opt; # Normalise anonymous type names. 
normalise_anonymous_type_names($dom) if $normalise_opt; +# Check for duplicate types. +report_duplicate_types($dom) if $report_opt; + # Prune unreachable elements. prune_unreachable($dom) if $prune_opt; From patchwork Thu Mar 25 21:51:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42782 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8422C3857829; Thu, 25 Mar 2021 21:52:30 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8422C3857829 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709150; bh=1baxkKnI70zEG8dGg3y8vRQg/wk9dbqBpZXn4ILXfxw=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=lROdbvgZR7e5g9yXOZud1vKwBvWyGN3rYG8A+iDrQulMu/vNpk/npKC6YXeU+pVXb WuVEmpavUgbiDv8KQ4H/Ek3RWz0E8K2U/bEXNag7EYRq4gK/ObOQSF+eWlVJR7SKK1 7fM6wRP9k1zr6sBosg/quv//gaob0N9RLhmHjajw= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com [IPv6:2607:f8b0:4864:20::84a]) by sourceware.org (Postfix) with ESMTPS id 0FDB8385802E for ; Thu, 25 Mar 2021 21:52:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 0FDB8385802E Received: by mail-qt1-x84a.google.com with SMTP id t5so4121699qti.5 for ; Thu, 25 Mar 2021 14:52:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=1baxkKnI70zEG8dGg3y8vRQg/wk9dbqBpZXn4ILXfxw=; b=lWntKckOHiFvkILmo8AtsXNrWABx+jwI8JTqZnRTn0h59JsJ9exd1MzCe4U7cWIeu5 PSlfgDWSRIHaNfJKJoxP0oqdGOoy35krUijhA8dBU91UkIaUTvRi2vST3Cb8r1EkBjZh 
9aoNKTG8YF5KT1qhPRLkaS+KbSabHIgvDHbJMQpMsM18uZI7tt+KSP1/zHO758RPV8fS oCLblRnhbr+/cutvzfh+TQ7cFUwuXbgtPF9FTtq0bucwIqzTvi3PmP+KH/Gz3zUIPKXl OmIVezJlVqkhlmk3V4+rlt4nZC7wc+5uTC01ZoOza+wb7XLYy5CBjaFlwLNdwd0zRMv9 /QoQ== X-Gm-Message-State: AOAM533lCzTHy6SZBpqW79AOYOFS+rasegFRlaWfKnCXrXtTEGlDUhtQ F357hC2zo9zPk9I/xWv71AkNzbglQ+hLsoN8SfT3yA3gjFTnEaExRlWnukVcRVyKZyZz0v5A6DL MjcVayW8I2d/wcFeZHBU91wsA1px2A8NQfOEYY/HTYBPv12yhufbI6KlAyjefd5Fd1Qgwr0A= X-Google-Smtp-Source: ABdhPJx7I6PZrVcwW/H01n4J/4nyTQV00sFDStXfHsud9nc4m8h6UpkSluRYNlzt+Ovzy5+eBsBXea3ud4lT+g== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:ad4:52c2:: with SMTP id p2mr10399327qvs.45.1616709138556; Thu, 25 Mar 2021 14:52:18 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:44 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-8-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 7/9] Add pass to eliminate duplicate member-type fragments To: libabigail@sourceware.org X-Spam-Status: No, score=-22.5 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" One common kind of duplicate 
type in ABI XML occurs when the XML writer emits a type declaration with its surrounding scope but that scope actually includes one or more user-defined types. An additional complication is that a nested member type's access can be misapplied to its containing type. The types themselves are later emitted in full and with correct member access specifiers. * scripts/abitidy.pl (sub_tree): New function to determine if one XML node is a sub-tree of another (in the sense that the former can be overlaid without changing the latter), ignoring possibly-incorrect access attributes. (eliminate_duplicate_types): New function that checks nodes with the same type id, determines if there is a maximal node and, if so, removes the other nodes. Signed-off-by: Giuliano Procida --- scripts/abitidy.pl | 112 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 111 insertions(+), 1 deletion(-) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index d5ddd7ea..35b3c054 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -12,9 +12,11 @@ use experimental 'signatures'; use autodie; +use Algorithm::Diff qw(diff); use Data::Dumper; use Getopt::Long; use IO::File; +use List::Util qw(any all uniq); use XML::LibXML; # Overview of ABI XML elements and their roles @@ -376,6 +378,108 @@ sub normalise_anonymous_type_names($) { } } +sub sub_tree; +sub sub_tree($l, $r) { + # Node name must be an exact match. + return 0 if $l->getName() ne $r->getName(); + # Left attributes may be missing on the left, but otherwise must match. + # With one exception: libabigail emits the access specifier for the + # type it's trying to "emit in scope" rather than what may be a + # containing type; so allow access to differ on member-type nodes. + for my $k (keys %$l) { + my $lv = $l->{$k}; + my $rv = $r->{$k}; + return 0 unless defined $rv; + next if $k eq 'access' && $l->getName() eq 'member-type'; + return 0 unless $lv eq $rv; + } + # Left elements must be a subsequence of right elements. 
+ my @lc = grep { $_->nodeType != XML_COMMENT_NODE } $l->childNodes(); + my @rc = grep { $_->nodeType != XML_COMMENT_NODE } $r->childNodes(); + unless (scalar @lc <= scalar @rc) { + return 0; + } + # Only handle three forms of subsequence. Otherwise need some kind + # of backtracking check. + unless (@lc) { + # empty + return 1; + } + if (scalar @lc == 1) { + # singleton + return any { $_ } map { sub_tree($lc[0], $_) } @rc; + } + if (scalar @lc == scalar @rc) { + # same length + return all { $_ } map { sub_tree($lc[$_], $rc[$_]) } (0..$#lc); + } + warn "XML elements have interestingly different subelements\n", + $l->toString(), "\n", $r->toString(), "\n"; + return 0; +} + +sub get_scope($node) { + my @names; + while (1) { + $node = $node->parentNode; + last unless defined $node && $node->nodeType == XML_ELEMENT_NODE; + my $kind = $node->getName(); + last if $kind eq 'abi-instr'; + # Some C++ enums introduce scope but we don't need to worry. + next unless $kind eq 'class-decl' || $kind eq 'union-decl' || $kind eq 'namespace-decl'; + # Anonymous type are not permitted to contain type members (but + # this still works with g++ -fpermissive). + my $name = $node->getAttribute('name'); + unshift @names, $name; + } + return @names; +} + +sub eliminate_duplicate_types($dom) { + my %hash; + # Collect all (unnested) types and their namespace scopes. + for my $type ($dom->findnodes('//abi-corpus/abi-instr/*[@id] | //abi-corpus/abi-instr//namespace-decl/*[@id]')) { + my $id = $type->getAttribute('id'); + my $scope = join(':', get_scope($type)); + for my $list ($hash{$id}{$scope}) { + $list //= []; + push @$list, $type; + } + } + for my $id (keys %hash) { + my $scopes = $hash{$id}; + if (scalar keys %$scopes > 1) { + warn "inconsistent scopes found for duplicate types with id $id\n"; + next; + } + my ($ns) = keys %$scopes; + my $types = $scopes->{$ns}; + next if scalar @$types == 1; + # Find a potential maximal candidate. 
+ my $candidate = 0; + for my $ix (1..$#$types) { + $candidate = $ix if sub_tree($types->[$candidate], $types->[$ix]); + } + # Verify it is indeed maximal. + my @losers = grep { $_ != $candidate } (0..$#$types); + for my $ix (@losers) { + unless (sub_tree($types->[$ix], $types->[$candidate])) { + warn "conflicting duplicate types with id $id\n"; + my @strs = map { $types->[$_]->toString() } ($ix, $candidate); + map { $_ =~ s;><;>\n<;g } @strs; + my @lines = map { [split("\n", $_)] } @strs; + warn Dumper(diff(@lines)); + $candidate = undef; + last; + } + } + if (defined $candidate) { + map { remove_node($types->[$_]) } @losers; + warn "successfully eliminated duplicate types with id $id\n"; + } + } +} + sub report_duplicate_types($dom) { my %hash; for my $type ($dom->findnodes('*[@id]')) { @@ -400,16 +504,18 @@ my $all_opt; my $drop_opt; my $prune_opt; my $normalise_opt; +my $eliminate_opt; my $report_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, 's|symbols=s' => \$symbols_opt, 'a|all' => sub { - $drop_opt = $prune_opt = $normalise_opt = $report_opt = 1 + $drop_opt = $prune_opt = $normalise_opt = $eliminate_opt = $report_opt = 1 }, 'd|drop-empty!' => \$drop_opt, 'p|prune-unreachable!' => \$prune_opt, 'n|normalise-anonymous!' => \$normalise_opt, + 'e|eliminate-duplicates!' => \$eliminate_opt, 'r|report-duplicates!' => \$report_opt, ) and !@ARGV or die("usage: $0", map { (' ', $_) } ( @@ -420,6 +526,7 @@ GetOptions('i|input=s' => \$input_opt, '[-d|--[no-]drop-empty]', '[-p|--[no-]prune-unreachable]', '[-n|--[no-]normalise-anonymous]', + '[-e|--[no-]eliminate-duplicates]', '[-r|--[no-]report-duplicates]', ), "\n"); @@ -439,6 +546,9 @@ filter_symbols(read_symbols($symbols_opt), $dom) if defined $symbols_opt; # Normalise anonymous type names. normalise_anonymous_type_names($dom) if $normalise_opt; +# Eliminate complete duplicates and extra fragments of types. +eliminate_duplicate_types($dom) if $eliminate_opt; + # Check for duplicate types. 
report_duplicate_types($dom) if $report_opt; From patchwork Thu Mar 25 21:51:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42779 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 88167385800D; Thu, 25 Mar 2021 21:52:24 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 88167385800D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709144; bh=/S8nG4KAD608CGa20OEvBU0+Hj8j1yBvsrwlLsVF62A=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=ojSOvZyVKZUCSSJJu/8gdwc2MZI1+QROAvBbC/lhrRujsem9CdzaRoUhr9HI8Z7KM tncCvGPyxrTZxDYNeR1RZVmQYTslcwveuiv8OcXVyvBGIPw7GTzKEieM3NXM093Afm zBtxpylULd1cb5mIkHxFEIR0Z9o5/YRGJ0PwxgS8= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qk1-x749.google.com (mail-qk1-x749.google.com [IPv6:2607:f8b0:4864:20::749]) by sourceware.org (Postfix) with ESMTPS id 1F667385BF9E for ; Thu, 25 Mar 2021 21:52:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 1F667385BF9E Received: by mail-qk1-x749.google.com with SMTP id g18so4874069qki.15 for ; Thu, 25 Mar 2021 14:52:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=/S8nG4KAD608CGa20OEvBU0+Hj8j1yBvsrwlLsVF62A=; b=QG1TKpYjCVh7s4m6sFk+Zhubwo0//oxZj6+toWR7aUFbRDL64dLBRb23usirZpYDHc auevtcV5aOq04PbO28HwV+zegQYemKhTIXK6nA31h0hcIIS1m0XR37ma6XzXlPL6N40Z AdNg+x8AP4G/o/VViQwOgfi1b6gkzLYPEohz0jZD0PndLQRi8HozrYgr7j9imroK82mg A/KmDKRZereE1MZ3TYcBHE9qC1T9mWIcbnH1VOAOYij78xN2NTI/eQgjMYSH2H0FNU1K 
X1VypwgzM/Q3ZFKtRLofiniRq/wBpcSU0uXkmoIUYOGGljAAgUflCCEnGDS0WGipAike ZK4w== X-Gm-Message-State: AOAM5329txmOhgsJLW8aNPZvBJXgWSbyn7XNmM9jCsmH9LmY7KoNRmoS DC7JO4pH7QeMkPdHx8ypxa1ndTuf+9AyXnNwZ+kJWx2X0gSmrnhLj7rTSPXfV+h1PVYlnJ4RFKv Qcz2MFUnT9ymfjewWNjaMvRyxmmx6h1jjxp8EZjTNMxmwexSHy6lZZlYNKmGoUvaX840ay/A= X-Google-Smtp-Source: ABdhPJyqNcRzILN3b3vP9vdeZIM2LgS8as3G0FPfK167DiCnYkxsYEarDqdsHTIk4FE81x3zmKZ5BlEbhskM2w== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:a0c:c248:: with SMTP id w8mr10366073qvh.58.1616709140702; Thu, 25 Mar 2021 14:52:20 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:45 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-9-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 8/9] Add pass to stabilise types and declarations To: libabigail@sourceware.org X-Spam-Status: No, score=-22.7 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Mailing list of the Libabigail project List-Unsubscribe: , List-Archive: List-Help: List-Subscribe: , X-Patchwork-Original-From: Giuliano Procida via Libabigail From: Giuliano Procida Reply-To: Giuliano Procida Cc: maennich@google.com, kernel-team@android.com Errors-To: libabigail-bounces@sourceware.org Sender: "Libabigail" This commit adds a pass which consolidates all types and declarations within an abi-corpus into a replacement abi-instr. 
These elements are then ordered by id and name respectively, with types before declarations. * scripts/abitidy.pl (stabilise_types_and_declarations): New function to order types and declarations as deterministically as possible within an abi-corpus. Signed-off-by: Giuliano Procida --- scripts/abitidy.pl | 84 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 83 insertions(+), 1 deletion(-) diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl index 35b3c054..67fe3a69 100755 --- a/scripts/abitidy.pl +++ b/scripts/abitidy.pl @@ -496,6 +496,82 @@ sub report_duplicate_types($dom) { } } +# Stabilise types and declarations. +sub stabilise_types_and_declarations($dom) { + my $corpus_path = new XML::LibXML::XPathExpression('//abi-corpus'); + my $instr_path = new XML::LibXML::XPathExpression('abi-instr'); + my $type_or_decl_path = new XML::LibXML::XPathExpression('*[@id]|*[@name]'); + + my @corpora = $dom->findnodes($corpus_path); + for my $coprus (@corpora) { + # Let's squish it. We expect its abi-instr elements to have + # consistent attributes, ignoring path. + my %attrs; + my @children = $coprus->findnodes($instr_path); + for my $child (@children) { + for my $attr (keys %$child) { + my $value = $child->{$attr}; + $attrs{$attr}{$value} = undef; + } + } + next unless scalar keys %attrs; + + # Create a replacement abi-instr node. + my $replacement = new XML::LibXML::Element('abi-instr'); + # Original attribute ordering is lost, may as well be deterministic here. + for my $attr (sort keys %attrs) { + # Check attribute consistency. + for my $values ($attrs{$attr}) { + if (scalar keys %{$values} > 1) { + die "unexpected non-constant abi-instr attribute $attr\n" + unless $attr eq 'path' || $attr eq 'comp-dir-path' || $attr eq 'language'; + $values = { 'various' => undef }; + } + for my $value (keys %$values) { + $replacement->setAttribute($attr, $value); + } + } + } + + # Gather sorted types and decls. 
+ my @types_and_decls = sort { + my $a_id = $a->{id}; + my $a_name = $a->{name}; + die unless defined $a_id || defined $a_name; + my $b_id = $b->{id}; + my $b_name = $b->{name}; + die unless defined $b_id || defined $b_name; + # types before declarations + # order types by id + # order declarations by name + defined $a_id != defined $b_id ? !defined $a_id <=> !defined $b_id + : defined $a_id ? $a_id cmp $b_id + : $a_name cmp $b_name + } map { $_->findnodes($type_or_decl_path) } @children; + + # Add them to replacement abi-instr + map { + my $prev = $_->previousSibling(); + if ($prev && $prev->nodeType == XML_COMMENT_NODE) { + $prev->unbindNode(); + $replacement->appendChild($prev); + } + $_->unbindNode(); + $replacement->appendChild($_) + } @types_and_decls; + # Remove the old abi-instr nodes. + for my $child (@children) { + if ($child->hasChildNodes()) { + warn "failed to evacuate abi-instr: ", $child->toString(), "\n"; + next; + } + remove_node($child); + } + # Add the replacement abi-instr node to the abi-corpus. + $coprus->appendChild($replacement); + } +} + # Parse arguments. my $input_opt; my $output_opt; @@ -505,17 +581,19 @@ my $drop_opt; my $prune_opt; my $normalise_opt; my $eliminate_opt; +my $stabilise_opt; my $report_opt; GetOptions('i|input=s' => \$input_opt, 'o|output=s' => \$output_opt, 's|symbols=s' => \$symbols_opt, 'a|all' => sub { - $drop_opt = $prune_opt = $normalise_opt = $eliminate_opt = $report_opt = 1 + $drop_opt = $prune_opt = $normalise_opt = $eliminate_opt = $stabilise_opt = $report_opt = 1 }, 'd|drop-empty!' => \$drop_opt, 'p|prune-unreachable!' => \$prune_opt, 'n|normalise-anonymous!' => \$normalise_opt, 'e|eliminate-duplicates!' => \$eliminate_opt, + 't|stabilise-order!' => \$stabilise_opt, 'r|report-duplicates!' 
=> \$report_opt, ) and !@ARGV or die("usage: $0", map { (' ', $_) } ( @@ -527,6 +605,7 @@ GetOptions('i|input=s' => \$input_opt, '[-p|--[no-]prune-unreachable]', '[-n|--[no-]normalise-anonymous]', '[-e|--[no-]eliminate-duplicates]', + '[-t|--[no-]stabilise-order]', '[-r|--[no-]report-duplicates]', ), "\n"); @@ -555,6 +634,9 @@ report_duplicate_types($dom) if $report_opt; # Prune unreachable elements. prune_unreachable($dom) if $prune_opt; +# Stabilise types and declarations. +stabilise_types_and_declarations($dom) if $stabilise_opt; + # Drop empty elements. drop_empty($dom) if $drop_opt; From patchwork Thu Mar 25 21:51:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Giuliano Procida X-Patchwork-Id: 42781 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3EFDE385802E; Thu, 25 Mar 2021 21:52:30 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 3EFDE385802E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1616709150; bh=wM1L44w1X6HU2wCWTA+LIWhl361jehAOeFiIsQdyteA=; h=Date:In-Reply-To:References:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=wSHjxYQHnsCogvEJJMfQxRKiOjJjRsoPkk0YDhvrZSITVPRGDjO6ZHKN34l0FgIwu h5YVSXSZP+YNxa6TnUczE5k3AOPtkR0N/YivpL6Zm9Jv/CexJFUWhvVI7fbmrNpYiB WNyzBzvZvcgDzT4/KEJkZPFqHvFSoHRvnr5y/Dx8= X-Original-To: libabigail@sourceware.org Delivered-To: libabigail@sourceware.org Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com [IPv6:2607:f8b0:4864:20::84a]) by sourceware.org (Postfix) with ESMTPS id 31C0A3857C73 for ; Thu, 25 Mar 2021 21:52:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 31C0A3857C73 Received: by mail-qt1-x84a.google.com with SMTP id f26so4092323qtq.17 for ; Thu, 25 Mar 2021 14:52:23 
-0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=wM1L44w1X6HU2wCWTA+LIWhl361jehAOeFiIsQdyteA=; b=qyJB+UwQ6yQTfNSBOJus/zJHFFL+RuBdOSBFFG8HK8qjd8ABwweN2HZEdPiUTcmhE1 stFtn/T6/fCwEFa+XO11nZOFbngGLuiiNJFHvCsNPMlAn2huWD3cRsqQGtflayKHRBHf 23kvUJ3k7nXjZnwIh0h2qvgJ3m5DAHU/4ZIz1vktPFnrp4W9fV5WeqwGe/RqONY2OSIw NRJKcnyIuSs/q3B7aBr85beIT/NQIDXeJdJCRm1LimLOyw5un6lV8kn6FUNJbExadlwu WEGVB8h4rg0mmpUOTYtW12Ji9Qw2cilNpaF2Zch1B0GDx43R/BtgvBSm2hoyO49OdUKm qMPw== X-Gm-Message-State: AOAM532mpOeT21BYPwGMxpBhkhvBc5Vh28Tb+jUsEpdUYWF3/7jWYUeX W/iQtodhh87AX3LwlEnQnVZqCpNcGcqTmWrDWeuKb6m1LCjTZy/cgt4zNNefNRIycje7YEp7J7m U9lIMSk7VD7rn3QcpktkarZBNm1a7W/kirts5qYgUZp+1Q4WsIaxIe2I2xsN9SVu52e4sfWg= X-Google-Smtp-Source: ABdhPJwvWkYJWAVgZKpRnYldSZ7kwavW+1yqfa3zq49L7mYS/mWHN48FvjjvdTkS4hi6XIHuNP0eNXW20WSWiQ== X-Received: from tef.lon.corp.google.com ([2a00:79e0:d:110:2df6:f24a:7f54:86a8]) (user=gprocida job=sendgmr) by 2002:ad4:4c4c:: with SMTP id cs12mr10333922qvb.35.1616709142805; Thu, 25 Mar 2021 14:52:22 -0700 (PDT) Date: Thu, 25 Mar 2021 21:51:46 +0000 In-Reply-To: <20210325215146.3597963-1-gprocida@google.com> Message-Id: <20210325215146.3597963-10-gprocida@google.com> Mime-Version: 1.0 References: <20210316165509.2658452-1-gprocida@google.com> <20210325215146.3597963-1-gprocida@google.com> X-Mailer: git-send-email 2.31.0.291.g576ba9dcdaf-goog Subject: [RFC PATCH 9/9] Add pass to resolve stray forward type declarations To: libabigail@sourceware.org X-Spam-Status: No, score=-22.5 required=5.0 tests=BAYES_00, DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libabigail@sourceware.org X-Mailman-Version: 2.1.29 
Precedence: list
List-Id: Mailing list of the Libabigail project
List-Unsubscribe: ,
List-Archive:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Giuliano Procida via Libabigail
From: Giuliano Procida
Reply-To: Giuliano Procida
Cc: maennich@google.com, kernel-team@android.com
Errors-To: libabigail-bounces@sourceware.org
Sender: "Libabigail"

This can be used to improve the precision of ABI reporting for C++ and
the Linux kernel, both of which have some form of One Definition Rule
for types.

TODO: handle naming-typedef-id references as well

	* scripts/abitidy.pl (substitute_type_ids): New function to
	perform renaming of type-id attributes within XML elements.
	(resolve_forward_declarations): New function that resolves
	forward declarations of types to their definitions, assuming a
	consistent universe of type names.

Signed-off-by: Giuliano Procida
---
 scripts/abitidy.pl | 62 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 57 insertions(+), 5 deletions(-)

diff --git a/scripts/abitidy.pl b/scripts/abitidy.pl
index 67fe3a69..d45a82bb 100755
--- a/scripts/abitidy.pl
+++ b/scripts/abitidy.pl
@@ -464,11 +464,8 @@ sub eliminate_duplicate_types($dom) {
       my @losers = grep { $_ != $candidate } (0..$#$types);
       for my $ix (@losers) {
         unless (sub_tree($types->[$ix], $types->[$candidate])) {
-          warn "conflicting duplicate types with id $id\n";
           my @strs = map { $types->[$_]->toString() } ($ix, $candidate);
-          map { $_ =~ s;><;>\n<;g } @strs;
-          my @lines = map { [split("\n", $_)] } @strs;
-          warn Dumper(diff(@lines));
+          warn "conflicting duplicate types with id $id:\n", map { " $_\n" } @strs, "\n";
           $candidate = undef;
           last;
         }
@@ -572,6 +569,55 @@ sub stabilise_types_and_declarations($dom) {
   }
 }
 
+# Substitute a set of type ids with another.
+sub substitute_type_ids($winner, $losers, $dom) {
+  for my $ref ($dom->findnodes('//*[@type-id]')) {
+    my $type_id = $ref->getAttribute('type-id');
+    $ref->setAttribute('type-id', $winner) if exists $losers->{$type_id};
+  }
+}
+
+# Find definitions and declarations for the same thing and replace
+# references to the latter with the former. naming-typedef-id may be
+# an added complication.
+sub resolve_forward_declarations($dom) {
+  for my $corpus ($dom->findnodes('//abi-corpus')) {
+    my %synonyms;
+    # Safe to extend to deeper-nested types? Need get_scopes.
+    for my $type ($corpus->findnodes('abi-instr/*[@id]')) {
+      my $kind = $type->getName();
+      my $name = $type->getAttribute('name');
+      next unless defined $name;
+      next if $name =~ m;^__anonymous_;;
+      my $key = "$kind:$name";
+      $synonyms{$key} //= [];
+      push @{$synonyms{$key}}, $type;
+    }
+
+    for my $key (keys %synonyms) {
+      my $types = $synonyms{$key};
+      next if scalar(@$types) == 1;
+      my @decls = grep { $_->hasAttribute('is-declaration-only') } @$types;
+      my @defns = grep { !$_->hasAttribute('is-declaration-only') } @$types;
+      next unless @decls and @defns;
+      # Have declarations and definitions, check that top-level ids
+      # are the only differences.
+      my ($kind, $name) = split(':', $key, 2);
+      my @decl_strs = uniq map { my $str = $_->toString(); my $id = $_->getAttribute('id'); $str =~ s; id='$id';;g; $str } @decls;
+      my @defn_strs = uniq map { my $str = $_->toString(); my $id = $_->getAttribute('id'); $str =~ s; id='$id';;g; $str } @defns;
+      unless (scalar @decl_strs == 1 && scalar @defn_strs == 1) {
+        warn "cannot resolve duplicate $kind types with name $name\n";
+        next;
+      }
+      my $winner = $defns[0]->getAttribute('id');
+      my @losers = grep { $_ ne $winner } map { $_->getAttribute('id') } @$types;
+      warn "resolved $kind $name: substituting @losers with $winner\n";
+      substitute_type_ids($winner, {map { $_ => undef } @losers}, $dom);
+      map { remove_node($_) } (@defns[1..$#defns], @decls);
+    }
+  }
+}
+
 # Parse arguments.
 my $input_opt;
 my $output_opt;
@@ -582,18 +628,20 @@ my $prune_opt;
 my $normalise_opt;
 my $eliminate_opt;
 my $stabilise_opt;
+my $forward_opt;
 my $report_opt;
 GetOptions('i|input=s' => \$input_opt,
            'o|output=s' => \$output_opt,
            's|symbols=s' => \$symbols_opt,
            'a|all' => sub {
-             $drop_opt = $prune_opt = $normalise_opt = $eliminate_opt = $stabilise_opt = $report_opt = 1
+             $drop_opt = $prune_opt = $normalise_opt = $eliminate_opt = $stabilise_opt = $forward_opt = $report_opt = 1
            },
            'd|drop-empty!' => \$drop_opt,
            'p|prune-unreachable!' => \$prune_opt,
            'n|normalise-anonymous!' => \$normalise_opt,
            'e|eliminate-duplicates!' => \$eliminate_opt,
            't|stabilise-order!' => \$stabilise_opt,
+           'f|resolve-forward!' => \$forward_opt,
            'r|report-duplicates!' => \$report_opt,
            ) and !@ARGV or die("usage: $0",
                                map { (' ', $_) } (
@@ -606,6 +654,7 @@ GetOptions('i|input=s' => \$input_opt,
                                  '[-n|--[no-]normalise-anonymous]',
                                  '[-e|--[no-]eliminate-duplicates]',
                                  '[-t|--[no-]stabilise-order]',
+                                 '[-f|--[no-]resolve-forward]',
                                  '[-r|--[no-]report-duplicates]',
                                ), "\n");
 
@@ -631,6 +680,9 @@ eliminate_duplicate_types($dom) if $eliminate_opt;
 
 # Check for duplicate types.
 report_duplicate_types($dom) if $report_opt;
 
+# Check for types which are both declared and defined.
+resolve_forward_declarations($dom) if $forward_opt;
+
 # Prune unreachable elements.
 prune_unreachable($dom) if $prune_opt;