From patchwork Fri Oct 18 14:47:45 2019
X-Patchwork-Submitter: Krzysztof Koch
X-Patchwork-Id: 35144
From: Krzysztof Koch <Krzysztof.Koch@arm.com>
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [RFC] Add script for plotting string benchmark results
Date: Fri, 18 Oct 2019 14:47:45 +0000
Add a script for visualizing the JSON output generated by existing glibc
string microbenchmarks.

Overview:

plot_strings.py can plot both absolute and relative (--relative flag)
timings. For both types of graph, the ifuncs to plot can be selected
explicitly from those listed in the benchmark results file; this is done
with the --ifuncs parameter.

For absolute plots, it is possible to show the exact timing values
(--values flag); the default behavior is to hide these numbers.

When plotting relative timing differences between ifuncs, the first ifunc
listed in the input JSON file is the baseline, unless the baseline
implementation is explicitly chosen with the --baseline parameter. For
ease of reading, the script marks statistically insignificant differences
on the graphs. The default threshold is +-5%, but this value can be
controlled with the --threshold parameter.

Finally, if a window manager is available, one can enable interactive
figure display, which allows navigating through the graph. To enable this
feature, use the --display flag and make sure a GUI-based matplotlib
backend is in use.

To accommodate the heterogeneity of string benchmark results files, one
can control, for example, the x-axis scale or the key used to access the
varied parameter value in the JSON tree, with options like --logarithmic
or --param. This ensures that the script works with all files which pass
JSON schema validation. The schema to validate against can be chosen with
the --schema parameter.
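To make the relative mode concrete, the snippet below is a minimal sketch
(not part of the patch; the timing values are made up) of the comparison
described above: timings are expressed as a percentage difference from the
baseline ifunc, and differences inside the +-threshold band count as
insignificant.

import numpy as np

baseline = np.array([10.0, 20.0, 40.0])   # timings of the baseline ifunc
candidate = np.array([9.0, 21.0, 40.5])   # timings of another ifunc
threshold = 5.0                           # percent, as with --threshold 5

# Relative difference in percent, as shown on the y-axis in --relative mode.
relative = (candidate - baseline) / baseline * 100.0

# Points inside the +-threshold band are considered insignificant.
significant = np.abs(relative) > threshold

print(relative)     # -10.0, 5.0 and 1.25 (percent)
print(significant)  # True, False, False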
Implementation:

plot_strings.py traverses the JSON tree until a 'results' array is found
and generates a separate figure for each such array. The figure is then
saved to a file in one of the available formats (controlled with the
--extension parameter).

As the tree is traversed, the recursive function tracks metadata about the
test being run, so that each figure has a unique and meaningful title and
filename. Given the current state of the benchmarks, this data is limited
to the top-level 'bench-variant'.

While plot_strings.py works with existing benchmarks, provisions have been
made to allow adding more structure and metadata to these benchmarks.

Currently, many benchmarks produce multiple timing values for the same
value of the varied parameter (typically 'length'). Multiple data points
for the same parameter usually mean that some other parameter was varied
as well, for example whether memmove's src and dst buffers overlap (see
bench-memmove-walk.c and bench-memmove-walk.out). Unfortunately, this
information is not exposed by the benchmark, so plot_strings.py has to
resort to computing the geometric mean of these multiple values. In the
process, useful information about the benchmark configuration is lost.
Also, averaging the timings for different alignments can hide useful
characteristics of the benchmarked ifuncs.

Testing:

The script has been tested on all existing string microbenchmarks which
produce results in JSON format. plot_strings.py was tested on both Windows
and Ubuntu 16.04.2 LTS. It runs on both Python 2 and 3 (2.7.12 and 3.5.12
tested).

Useful commands:

1. Plot absolute timings for all ifuncs in bench-strlen.out:

$ ./plot_strings.py bench-strlen.out

2. Display help:

$ ./plot_strings.py -h

3. Plot absolute timings for selected ifuncs. Save the generated figures
   in pdf format to the 'results' directory. Use logarithmic x-axis scale
   and show grid lines as well as exact timing numbers:

$ ./plot_strings.py bench.out -o results/ -g -v -l -e pdf \
  -i __memset_avx512_unaligned_erms __memset_avx512_unaligned

4. Plot relative timings for all ifuncs in bench.out, with
   __generic_memset as the baseline. Mark a percentage difference
   threshold of +-10%:

$ ./plot_strings.py bench.out -r -b __generic_memset -t 10

Discussion:

1. I would like to propose relaxing benchout_strings.schema.json to allow
specifying either a 'results' array with 'timings' (as before) or a
'variants' array right after the array of 'ifuncs'. See the example below:

{
  "timing_type": "hp_timing",
  "functions": {
    "memcpy": {
      "bench-variant": "default",
      "ifuncs": ["generic_memcpy", "__memcpy_thunderx"],
      "variants": [
        {
          "name": "powers of 2",
          "variants": [
            {
              "name": "both aligned",
              "results": [
                { "length": 1, "align1": 0, "align2": 0, "timings": [x, y] },
                { "length": 2, "align1": 0, "align2": 0, "timings": [x, y] },
                ...
                { "length": 65536, "align1": 0, "align2": 0, "timings": [x, y] }]
            },
            {
              "name": "dst misaligned",
              "results": [
                { "length": 1, "align1": 0, "align2": 0, "timings": [x, y] },
                { "length": 2, "align1": 0, "align2": 1, "timings": [x, y] },
                ...

The 'variants' array consists of objects, each of which has a 'name'
attribute describing the configuration of a particular test in the
benchmark, for example how the parameter was varied or which buffer
alignment was tested. The 'name' attribute is then followed by either
another 'variants' array or a 'results' array. The nesting of variants
allows arbitrary grouping of benchmark timings, while also allowing a
description of these groups. Using recursion, one can procedurally create
titles and filenames for the figures being generated.
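To make the proposal concrete, here is a rough sketch, in jsonschema terms,
of how a node could be allowed to carry either a flat 'results' array or
nested 'variants'. This is not the actual benchout_strings.schema.json;
the schema fragment and the sample document below are made up purely for
illustration.

import jsonschema

# Each node must have exactly one of 'results' or 'variants'; 'variants'
# recursively contains more nodes of the same shape.
variant_schema = {
    "definitions": {
        "node": {
            "type": "object",
            "oneOf": [
                {"required": ["results"]},
                {"required": ["variants"]},
            ],
            "properties": {
                "name": {"type": "string"},
                "results": {"type": "array"},
                "variants": {
                    "type": "array",
                    "items": {"$ref": "#/definitions/node"},
                },
            },
        }
    },
    "allOf": [{"$ref": "#/definitions/node"}],
}

sample = {
    "name": "powers of 2",
    "variants": [
        {"name": "both aligned",
         "results": [{"length": 1, "align1": 0, "align2": 0,
                      "timings": [1.0, 2.0]}]},
    ],
}

jsonschema.validate(sample, variant_schema)  # raises if the shape is wrong
print("sample matches the sketched 'variants'/'results' shape")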
RFC:

1. Is the idea of adding a 'variants' array to benchout_strings.schema.json
a good one? The goal is not to break existing scripts but to start the
process of tidying up the string benchmarks and adding metadata to them.

2. Would people find this script useful? Are there any features missing?
Have I made some wrong assumptions?

3. I am aware that compare_strings.py is already capable of plotting string
benchmark results; however, the readability of the generated graphs is
questionable. One problem with that script is that it labels each
individual data point with the specific parameter values, because multiple
parameters are varied simultaneously. This approach only works for a small
number of data points. If we add more structure to the string benchmark
output, it will become possible to produce many more timings without
overwhelming the reader. For example, bench-memcpy.c increments the
'length' by 1 to generate some of the 'timings', but it only goes from 0 to
31, which I believe is mostly motivated by the fear of generating too much
data.
---
 benchtests/scripts/plot_strings.py | 263 +++++++++++++++++++++++++++++++++++++
 1 file changed, 263 insertions(+)
 create mode 100755 benchtests/scripts/plot_strings.py

diff --git a/benchtests/scripts/plot_strings.py b/benchtests/scripts/plot_strings.py
new file mode 100755
index 0000000..34d39b9
--- /dev/null
+++ b/benchtests/scripts/plot_strings.py
@@ -0,0 +1,263 @@
+#!/usr/bin/python3
+# Plot GNU C Library string microbenchmark output.
+# Copyright (C) 2019 Free Software Foundation, Inc.
+# This file is part of the GNU C Library.
+#
+# The GNU C Library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# The GNU C Library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with the GNU C Library; if not, see
+# <https://www.gnu.org/licenses/>.
+"""Plot string microbenchmark results.
+
+Given a benchmark results file in json format and a benchmark schema file,
+plot either absolute or relative benchmark timings.
+
+A separate figure is generated and saved to a file for each 'results'
+array found in the input file.
+"""
+import matplotlib as mpl
+mpl.use("Agg")  # Remove this line to use the (default) GUI backend
+
+import argparse
+from collections import defaultdict
+import json
+import matplotlib.pyplot as plt
+import numpy as np
+import os
+
+try:
+    import jsonschema as validator
+except ImportError:
+    print("Could not find jsonschema module.")
+    raise
+
+# Use pre-selected markers for plotting lines to improve readability
+markers = [".", "s", "x", "o", "^", "|", "*", "1", "+", "D"]
+
+# Benchmark variants for which the x-axis scale should be logarithmic
+log_variants = {"powers of 2"}
+
+
+def gmean(numbers, axis=None):
+    """Compute geometric mean.
+
+    Args:
+        numbers: numbers to compute geometric mean of
+    Return:
+        geometric mean of numbers
+    """
+    a = np.array(numbers, dtype=np.complex)
+    means = a.prod(axis) ** (1.0 / len(a))
+    return np.real(means)
+
+
+def relativeDifference(x, x_reference):
+    """Compute relative difference.
+
+    Args:
+        x: values
+        x_reference: reference (baseline) values
+    Return:
+        relative difference between x and x_reference (in %)
+    """
+    abs_diff = np.subtract(x, x_reference)
+    return np.divide(np.multiply(abs_diff, 100.0), x_reference)
+
+
+def plotRecursive(json_iter, routine, ifuncs, bench_variant, title, outpath,
+                  x_scale):
+    """Plot benchmark timings.
+
+    Args:
+        json_iter: reference to json object
+        routine: benchmarked string routine
+        ifuncs: list of ifuncs tested
+        bench_variant: top-level benchmark variant name
+        title: generated figure's title
+        outpath: output file path (without extension)
+        x_scale: x-axis scale
+    """
+
+    # RECURSIVE CASE:
+    # 'variants' array found, continue recursive search for 'results' array
+    if "variants" in json_iter:
+        for variant in json_iter["variants"]:
+            new_title = "%s%s, " % (title, variant["name"])
+            new_outpath = "%s_%s" % (outpath, variant["name"].replace(" ", "_"))
+            new_x_scale = "log" if variant["name"] in log_variants else x_scale
+
+            plotRecursive(variant, routine, ifuncs, bench_variant, new_title,
+                          new_outpath, new_x_scale)
+        return
+
+    # BASE CASE:
+    # 'results' array found, collect and plot timings
+    domain = []
+    timings = defaultdict(list)
+
+    for result in json_iter["results"]:
+        domain.append(result[args.param])
+        timings[result[args.param]].append(result["timings"])
+
+    domain = np.unique(np.array(domain))
+    averages = []
+
+    # Use geometric mean if there are multiple timings for each parameter
+    # value.
+    for parameter in domain:
+        averages.append(gmean(timings[parameter], axis=0))
+
+    averages = np.array(averages).transpose()
+
+    # Choose ifuncs to plot
+    if args.ifuncs:
+        plotted_ifuncs = [x.replace("__", "") for x in args.ifuncs]
+    else:
+        plotted_ifuncs = ifuncs
+
+    plotted_indices = [ifuncs.index(x) for x in plotted_ifuncs]
+
+    # Plot relative timings
+    if args.relative:
+
+        # Choose the baseline ifunc
+        if args.baseline:
+            baseline = args.baseline.replace("__", "")
+        else:
+            baseline = ifuncs[0]
+
+        baseline_index = ifuncs.index(baseline)
+
+        # Compare the timings against baseline
+        codomain = relativeDifference(averages[plotted_indices,:],
+                                      averages[baseline_index])
+
+        # Start plotting relative timings
+        plt.figure()
+        plt.axhspan(-args.threshold, args.threshold, color="lightgray",
+                    alpha=0.3)
+        plt.axhline(0, color="k", linestyle="--", linewidth=0.4)
+        plt.ylabel("Relative timing (in %)")
+        title = "Relative timing comparison against %s\nfor %s benchmark, %s" \
+                % (baseline, bench_variant, title)
+        outpath = os.path.join(args.outdir, "%s_%s%s"
+                               % (baseline, bench_variant, outpath))
+
+    # Plot absolute timings
+    else:
+        codomain = averages[plotted_indices,:]
+        plt.figure()
+
+        if not args.values:
+            plt.axes().yaxis.set_major_formatter(plt.NullFormatter())
+
+        plt.ylabel("Timing")
+        title = "%s %s benchmark results\n%s" % (routine, bench_variant, title)
+        outpath = os.path.join(args.outdir, "%s_%s%s"
+                               % (routine, bench_variant, outpath))
+
+    # Finish generating Figure
+    plt.xlabel(args.param)
+    plt.xscale(x_scale)
+    plt.title(title)
+
+    if args.grid:
+        plt.grid(color="k", linestyle=":", linewidth=0.5, alpha=0.5)
+
+    for i in range(len(plotted_ifuncs)):
+        plt.plot(domain, codomain[i], marker=markers[i % len(markers)],
+                 label=plotted_ifuncs[i])
+
+    plt.legend(loc="best")
+    plt.savefig("%s_%s.%s" % (outpath, x_scale, args.extension),
+                format=args.extension)
+
+    if args.display:
+        plt.show()
+
+    plt.close()
+
+
+def main(args):
+    """Program Entry Point.
+
+    Args:
+        args: command line arguments (excluding program name)
+    """
+
+    schema = None
+
+    with open(args.schema, "r") as f:
+        schema = json.load(f)
+
+    for filename in args.bench:
+        bench = None
+
+        with open(filename, "r") as f:
+            bench = json.load(f)
+
+        validator.validate(bench, schema)
+
+        for function in bench["functions"]:
+            bench_variant = bench["functions"][function]["bench-variant"]
+            x_scale = "log" if args.logarithmic else "linear"
+            ifuncs = bench["functions"][function]["ifuncs"]
+            ifuncs_trimmed = [x.replace("__", "") for x in ifuncs]
+
+            plotRecursive(bench["functions"][function], function,
+                          ifuncs_trimmed, bench_variant, "", "", x_scale)
+
+
+""" main() """
+if __name__ == "__main__":
+
+    parser = argparse.ArgumentParser(
+        description="Plot string microbenchmark timings",
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+
+    # Required parameter
+    parser.add_argument("bench", nargs="+",
+                        help="benchmark results file(s) in json format")
+
+    # Optional parameters
+    parser.add_argument("-b", "--baseline", type=str,
+                        help="baseline ifunc for relative plot")
+    parser.add_argument("-d", "--display", action="store_true",
+                        help="display figures (requires window manager)")
+    parser.add_argument("-e", "--extension", type=str, default="png",
+                        choices=["png", "pdf", "svg"],
+                        help="output file(s) extension")
+    parser.add_argument("-g", "--grid", action="store_true", help="show grid")
+    parser.add_argument("-i", "--ifuncs", nargs="+", help="ifuncs to plot")
+    parser.add_argument("-l", "--logarithmic", action="store_true",
+                        help="use logarithmic x-axis scale")
+    parser.add_argument("-o", "--outdir", type=str, default=os.getcwd(),
+                        help="output directory")
+    parser.add_argument("-p", "--param", type=str, default="length",
+                        help="varied parameter name")
+    parser.add_argument("-r", "--relative", action="store_true",
+                        help="plot relative timing differences")
+    parser.add_argument("-s", "--schema", type=str,
+                        default=os.path.join(os.path.dirname(
+                            os.path.realpath(__file__)),
+                            "benchout_strings.schema.json"),
+                        help="schema file to validate the result file.")
+    parser.add_argument("-t", "--threshold", type=int, default=5,
+                        help="difference threshold to mark in graph (in %%)")
+    parser.add_argument("-v", "--values", action="store_true",
+                        help="show actual timing values")
+
+    args = parser.parse_args()
+
+    if (not args.relative and args.baseline) or (args.relative and args.values):
+        raise ValueError("Conflicting command line arguments")
+
+    main(args)
\ No newline at end of file