From patchwork Thu Jun 18 16:34:08 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ondrej Bilka X-Patchwork-Id: 7241 Received: (qmail 33176 invoked by alias); 18 Jun 2015 16:34:34 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 33164 invoked by uid 89); 18 Jun 2015 16:34:33 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, KAM_LOTSOFHASH, SPF_NEUTRAL autolearn=no version=3.3.2 X-HELO: popelka.ms.mff.cuni.cz Date: Thu, 18 Jun 2015 18:34:08 +0200 From: =?utf-8?B?T25kxZllaiBCw61sa2E=?= To: Carlos O'Donell Cc: libc-alpha@sourceware.org Subject: Re: [PING] Wordexp benchtest: good, bad and unreliable. Message-ID: <20150618163408.GA16797@domone> References: <20150513124945.GA4067@domone> <55538A90.9000908@redhat.com> <20150513183554.GA31157@domone> <55539E20.4010506@redhat.com> <20150606105711.GA7215@domone> <20150616173255.GC9387@domone> <558077D3.1080903@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <558077D3.1080903@redhat.com> User-Agent: Mutt/1.5.20 (2009-06-14) On Tue, Jun 16, 2015 at 03:24:03PM -0400, Carlos O'Donell wrote: > On 06/16/2015 01:32 PM, Ondřej Bílka wrote: > >> As I mentioned before for wordexp performance depends on too many > >> parameters. So here is microbenchmark that shows some improvements and > >> some regressions. There is lot of noise. I ran these three times and on > >> one * and ffffffffoo pattern become twice slower for no reason. > > There is a reason, but we just don't know it yet. > My guess that difference was caused by stat variance but I couldn't verify that. > >> So how do you decide with conflicted data like this? > > You can't make an "automatic" or "automated" decision, but as a baseline > the statistics do help. With noisy benchmarks you can still look at > potential trends in the noise. We don't want the bad patterns to get 5x > slower. > Thats for most of functions simply unavoidable, what I submitted is classical example that you cannot avoid some paterns being 5 times slower as you are shifting worst case. Here with my wordexp change patterns that didn't used strchr(ifs,x) would get five times slower when user uses big IFS. On other hand patterns that checked strchr(ifs,x) will become five times faster with this change. These tradeoffs are actually quite routine, for example most of string routines tend to be slow when inputs are always at page boundary. You cannot avoid that as alternatives move worst case to much more probable scenarios like empirical probability ~1/8 that 16 bytes cross 64 byte boundary. I could improve performance by optimizing that case aggresively. However it would likely be net loss as cross-page is most of time cold path and I try to optimize it to size. > Similarly we will have conflicting workloads modeled as benchmarks, and > the hard part is that as an expert we have to make a difficult choice. > We have to understand the workload and decide "Yes, we ignore this performance > decrease because we expect fewer people will suffer." However, in order > to do that we need comments about exactly what workload we're trying to model. > Thats problem that you need to collect lot of domain knowledge to be usable. Problem is that I didn't find yet application that uses it. > For most of our microbenchmarks the workloads are "raw performance of function > call" and that's very simplistic, it's also very generic. > Thats quite problem when there are many variants that need to be tried what should you priorize, as this migth be improved or not depending on average ifs size and average size of pattern. > This group of tests should have a comment about what's it's testing. > > What workload are we modeling? > None in particular, just trying to measure performance change. > >> + > >> + > >> + do_test ("*", 0); > >> + setenv("IFS"," \ > >> + ", 1); > >> + > >> + do_test ("foo", 0); > >> + do_test ("ffoo", 0); > >> + do_test ("ffffoo", 0); > >> + do_test ("ffffffffoo", 0); > >> + do_test ("ffffffffffffffffoo", 0); > >> + do_test ("ffffffffffffffffffffffffffffffffoo", 0); > >> +do_test ("ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo", 0); > >> + do_test ("*", 0); > >> + > > Similarly with the IFS change. This is not a normal IFS change for an average workload. > > Usually one changes IFS to be a series of characters, or just tab, or just newline etc. > These are worst case for my change. There is constant overhead per call to parse IFS, then my patch saves some cycles per character depending on which path we take. On paths of f...ffoo form you save cost of conversion strchr ("\n|&;<>(){}", ch) per character. I read code more and I could also measure saving of strchr(ifs, ch) per character for patterns ?fff...fffo. However measuring exact performance is bit difficult again due to noise as wordexp calls glob which takes ~70000 cycles. Thats lot compared with savings in wordexp itself, difference between longest pattern was 4500 which this change reduced to 500 cycles. I am not completely sure about that numbers as I got that estimate by subtracting performance of glob without ifs. old one: wordexp Pattern foo flags 0: 1131.86 Pattern ffoo flags 0: 629.5 Pattern ffffoo flags 0: 709.562 Pattern ffffffffoo flags 0: 866.359 Pattern ffffffffffffffffoo flags 0: 1115.53 Pattern ffffffffffffffffffffffffffffffffoo flags 0: 1591.81 Pattern ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 2608.45 Pattern ?foo flags 0: 74733.9 Pattern ?ffoo flags 0: 71044.9 Pattern ?ffffoo flags 0: 71165.1 Pattern ?ffffffffoo flags 0: 71009 Pattern ?ffffffffffffffffoo flags 0: 71283.4 Pattern ?ffffffffffffffffffffffffffffffffoo flags 0: 71748.2 Pattern ?ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 73446.5 Pattern * flags 0: 135284 Hundred byte IFS Pattern foo flags 0: 1067.41 Pattern ffoo flags 0: 1103.17 Pattern ffffoo flags 0: 1164.02 Pattern ffffffffoo flags 0: 1449.36 Pattern ffffffffffffffffoo flags 0: 2375.23 Pattern ffffffffffffffffffffffffffffffffoo flags 0: 2170.95 Pattern ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 3120.31 Pattern ?foo flags 0: 72397 Pattern ?ffoo flags 0: 71573.4 Pattern ?ffffoo flags 0: 71768.1 Pattern ?ffffffffoo flags 0: 71925.9 Pattern ?ffffffffffffffffoo flags 0: 72591.7 Pattern ?ffffffffffffffffffffffffffffffffoo flags 0: 73172.4 Pattern ?ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 78188.5 Pattern * flags 0: 134651 new one: wordexp Pattern foo flags 0: 1056.98 Pattern ffoo flags 0: 680.562 Pattern ffffoo flags 0: 690.219 Pattern ffffffffoo flags 0: 842.516 Pattern ffffffffffffffffoo flags 0: 1016.67 Pattern ffffffffffffffffffffffffffffffffoo flags 0: 1411.91 Pattern ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 2174.77 Pattern ?foo flags 0: 81948.1 Pattern ?ffoo flags 0: 71048.2 Pattern ?ffffoo flags 0: 71345 Pattern ?ffffffffoo flags 0: 71098.2 Pattern ?ffffffffffffffffoo flags 0: 71314.9 Pattern ?ffffffffffffffffffffffffffffffffoo flags 0: 71669 Pattern ?ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 72204.1 Pattern * flags 0: 137048 Hundred byte IFS Pattern foo flags 0: 1568.56 Pattern ffoo flags 0: 1581.34 Pattern ffffoo flags 0: 1627.89 Pattern ffffffffoo flags 0: 1924.27 Pattern ffffffffffffffffoo flags 0: 2109.28 Pattern ffffffffffffffffffffffffffffffffoo flags 0: 2456.67 Pattern ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 3204.42 Pattern ?foo flags 0: 74253.8 Pattern ?ffoo flags 0: 72561.5 Pattern ?ffffoo flags 0: 72630.3 Pattern ?ffffffffoo flags 0: 72794.3 Pattern ?ffffffffffffffffoo flags 0: 72951 Pattern ?ffffffffffffffffffffffffffffffffoo flags 0: 73194.2 Pattern ?ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo flags 0: 74068.2 Pattern * flags 0: 137750 diff --git a/benchtests/Makefile b/benchtests/Makefile index 8e615e5..fb737da 100644 --- a/benchtests/Makefile +++ b/benchtests/Makefile @@ -35,7 +35,7 @@ string-bench := bcopy bzero memccpy memchr memcmp memcpy memmem memmove \ strcat strchr strchrnul strcmp strcpy strcspn strlen \ strncasecmp strncat strncmp strncpy strnlen strpbrk strrchr \ strspn strstr strcpy_chk stpcpy_chk memrchr strsep strtok \ - strcoll + strcoll wordexp string-bench-all := $(string-bench) # We have to generate locales diff --git a/benchtests/bench-wordexp.c b/benchtests/bench-wordexp.c new file mode 100644 index 0000000..7696f73 --- /dev/null +++ b/benchtests/bench-wordexp.c @@ -0,0 +1,118 @@ +/* Measure wordexp functions. + Copyright (C) 2015 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define TEST_MAIN +#define TEST_NAME "wordexp" +#include "bench-string.h" + +#include + +typedef int (*proto_t) (const char *, wordexp_t *, int); + +IMPL (wordexp, 1) + + +static void +do_one_test (impl_t *impl, const char *s1, int flag) +{ + size_t i, iters = INNER_LOOP_ITERS; + timing_t start, stop, cur; + + TIMING_NOW (start); + for (i = 0; i < iters; ++i) + { + wordexp_t out; + CALL (impl, s1, &out, flag); + } + TIMING_NOW (stop); + + TIMING_DIFF (cur, start, stop); + + TIMING_PRINT_MEAN ((double) cur, (double) iters); +} + + +static void +do_test (const char *s1, int flag) +{ + + printf ("Pattern %s flags %i:", s1, flag); + + FOR_EACH_IMPL (impl, 0) + do_one_test (impl, s1, flag); + + putchar ('\n'); +} + +static int +test_main (void) +{ + test_init (); + + printf ("%23s", ""); + FOR_EACH_IMPL (impl, 0) + printf ("\t%s", impl->name); + putchar ('\n'); + + do_test ("foo", 0); + do_test ("ffoo", 0); + do_test ("ffffoo", 0); + do_test ("ffffffffoo", 0); + do_test ("ffffffffffffffffoo", 0); + do_test ("ffffffffffffffffffffffffffffffffoo", 0); +do_test ("ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo", 0); + + do_test ("?foo", 0); + do_test ("?ffoo", 0); + do_test ("?ffffoo", 0); + do_test ("?ffffffffoo", 0); + do_test ("?ffffffffffffffffoo", 0); + do_test ("?ffffffffffffffffffffffffffffffffoo", 0); +do_test ("?ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo", 0); + + + do_test ("*", 0); + + printf ("Hundred byte IFS\n"); + + + setenv("IFS"," \ + ", 1); + + do_test ("foo", 0); + do_test ("ffoo", 0); + do_test ("ffffoo", 0); + do_test ("ffffffffoo", 0); + do_test ("ffffffffffffffffoo", 0); + do_test ("ffffffffffffffffffffffffffffffffoo", 0); +do_test ("ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo", 0); + + do_test ("?foo", 0); + do_test ("?ffoo", 0); + do_test ("?ffffoo", 0); + do_test ("?ffffffffoo", 0); + do_test ("?ffffffffffffffffoo", 0); + do_test ("?ffffffffffffffffffffffffffffffffoo", 0); +do_test ("?ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffoo", 0); + + do_test ("*", 0); + + return ret; +} + +#include "../test-skeleton.c"