From patchwork Mon Oct 22 22:37:11 2018
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 29841
From: "H.J. Lu"
To: libc-alpha@sourceware.org
Subject: [PATCH] x86: Add --enable-rdtscp-in-benchtests
Date: Mon, 22 Oct 2018 15:37:11 -0700
Message-Id: <20181022223711.26910-1-hjl.tools@gmail.com>

RDTSCP waits until all previous instructions have executed and all
previous loads are globally visible before reading the counter.  RDTSC
doesn't wait until all previous instructions have been executed before
reading the counter.  This patch adds --enable-rdtscp-in-benchtests to
use RDTSCP in benchtests.

NOTE: Benchtests in an RDTSCP-enabled glibc require CPUs that support
the RDTSCP instruction.  All x86 processors since 2010 support the
RDTSCP instruction.

	* INSTALL: Regenerated.
	* configure: Likewise.
	* NEWS: Mention --enable-rdtscp-in-benchtests.
	* configure.ac: Add --enable-rdtscp-in-benchtests.
	* benchtests/Makefile (CPPFLAGS-nonlib): Add -DENABLE_RDTSCP
	for --enable-rdtscp-in-benchtests.
	* manual/install.texi: Document --enable-rdtscp-in-benchtests.
	* sysdeps/x86/hp-timing.h (HP_TIMING_NOW): Use RDTSCP if
	ENABLE_RDTSCP is defined.
---
 INSTALL                 | 10 ++++++++++
 NEWS                    |  7 +++++++
 benchtests/Makefile     |  4 ++++
 configure               | 13 +++++++++++++
 configure.ac            |  7 +++++++
 manual/install.texi     |  9 +++++++++
 sysdeps/x86/hp-timing.h | 15 ++++++++++++++-
 7 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/INSTALL b/INSTALL
index f9c5cbb9a3..01e4f7af0c 100644
--- a/INSTALL
+++ b/INSTALL
@@ -153,6 +153,16 @@ if 'CFLAGS' is specified it must enable optimization.  For example:
      library.  This option hardcodes the newly built C library path in
      dynamic tests so that they can be invoked directly.
 
+'--enable-rdtscp-in-benchtests'
+     This x86 only option enables RDTSCP instruction in benchtests.
+     When the GNU C Library is built with '--enable-rdtscp-in-benchtests',
+     benchtests will use RDTSCP instruction, instead of RDTSC
+     instruction, for high precision, low overhead timing, which
+     provides more precise timing data.  The resulting library is
+     unchanged by this option.  Note that benchtests in RDTSCP-enabled
+     the GNU C Library require CPUs capable of RDTSCP instruction.  All
+     x86 processors since 2010 support RDTSCP instruction.
+
 '--disable-timezone-tools'
      By default, timezone related utilities ('zic', 'zdump', and
      'tzselect') are installed with the GNU C Library.
If you are
diff --git a/NEWS b/NEWS
index f054dc0433..d6007ec4f5 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,13 @@ Version 2.29
 
 Major new features:
 
+* The GNU C Library can now be built with --enable-rdtscp-in-benchtests
+  to use RDTSCP instruction in benchtests for high precision, low
+  overhead timing on x86 CPUs, which provides more precise timing data.
+  The resulting library is unchanged.  Benchtests in RDTSCP-enabled glibc
+  require CPUs capable of RDTSCP instruction.  All x86 processors since
+  2010 support RDTSCP instruction.
+
 * A new convenience target has been added for distribution maintainers
   to build and install all locales as directories with files.  The new
   target is run by issuing the following command in your build tree:
diff --git a/benchtests/Makefile b/benchtests/Makefile
index bcd6a9c26d..16326c9e41 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -131,6 +131,10 @@ CPPFLAGS-nonlib += -DDURATION=$(BENCH_DURATION) -D_ISOMAC
 # HP_TIMING if it is available.
 ifdef USE_CLOCK_GETTIME
 CPPFLAGS-nonlib += -DUSE_CLOCK_GETTIME
+else
+ifeq (${enable-rdtscp-in-benchtests},yes)
+CPPFLAGS-nonlib += -DENABLE_RDTSCP
+endif
 endif
 
 DETAILED_OPT :=
diff --git a/configure b/configure
index f30c31afdc..a562849490 100755
--- a/configure
+++ b/configure
@@ -794,6 +794,7 @@ enable_pt_chown
 enable_tunables
 enable_mathvec
 enable_cet
+enable_rdtscp_in_benchtests
 with_cpu
 '
       ac_precious_vars='build_alias
@@ -1471,6 +1472,8 @@ Optional Features:
                           depends on architecture]
   --enable-cet            enable Intel Control-flow Enforcement Technology
                           (CET), x86 only
+  --enable-rdtscp-in-benchtests
+                          enable RDTSCP in benchtests, x86 only [default=no]
 
 Optional Packages:
   --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
@@ -3785,6 +3788,16 @@ else
 fi
 
 
+# Check whether --enable-rdtscp-in-benchtests was given.
+if test "${enable_rdtscp_in_benchtests+set}" = set; then :
+  enableval=$enable_rdtscp_in_benchtests; rdtscp_in_benchtests=$enableval
+else
+  rdtscp_in_benchtests=no
+fi
+
+config_vars="$config_vars
+enable-rdtscp-in-benchtests = $rdtscp_in_benchtests"
+
 # We keep the original values in `$config_*' and never modify them, so we
 # can write them unchanged into config.make.  Everything else uses
 # $machine, $vendor, and $os, and changes them whenever convenient.
diff --git a/configure.ac b/configure.ac
index e983fd8faa..d3cfb2c728 100644
--- a/configure.ac
+++ b/configure.ac
@@ -478,6 +478,13 @@ AC_ARG_ENABLE([cet],
 	      [enable_cet=$enableval],
 	      [enable_cet=no])
 
+AC_ARG_ENABLE([rdtscp-in-benchtests],
+	      AC_HELP_STRING([--enable-rdtscp-in-benchtests],
+			     [enable RDTSCP in benchtests, x86 only @<:@default=no@:>@]),
+	      [rdtscp_in_benchtests=$enableval],
+	      [rdtscp_in_benchtests=no])
+LIBC_CONFIG_VAR([enable-rdtscp-in-benchtests], [$rdtscp_in_benchtests])
+
 # We keep the original values in `$config_*' and never modify them, so we
 # can write them unchanged into config.make.  Everything else uses
 # $machine, $vendor, and $os, and changes them whenever convenient.
diff --git a/manual/install.texi b/manual/install.texi
index 61178dadcd..fa3fb912f6 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -182,6 +182,15 @@ By default, dynamic tests are linked to run with the installed C library.
 This option hardcodes the newly built C library path in dynamic tests
 so that they can be invoked directly.
 
+@item --enable-rdtscp-in-benchtests
+This x86 only option enables RDTSCP instruction in benchtests.  When
+@theglibc{} is built with @option{--enable-rdtscp-in-benchtests}, benchtests
+will use RDTSCP instruction, instead of RDTSC instruction, for high
+precision, low overhead timing, which provides more precise timing data.
+The resulting library is unchanged by this option.  Note that benchtests
+in RDTSCP-enabled @theglibc{} require CPUs capable of RDTSCP instruction.
+All x86 processors since 2010 support RDTSCP instruction.
+
 @item --disable-timezone-tools
 By default, timezone related utilities (@command{zic}, @command{zdump},
 and @command{tzselect}) are installed with @theglibc{}.  If you are building
diff --git a/sysdeps/x86/hp-timing.h b/sysdeps/x86/hp-timing.h
index 77a1360748..fd840314fa 100644
--- a/sysdeps/x86/hp-timing.h
+++ b/sysdeps/x86/hp-timing.h
@@ -40,7 +40,20 @@ typedef unsigned long long int hp_timing_t;
 
    NB: Use __builtin_ia32_rdtsc directly since including <x86intrin.h>
    makes building glibc very slow.  */
-# define HP_TIMING_NOW(Var) ((Var) = __builtin_ia32_rdtsc ())
+# ifdef ENABLE_RDTSCP
+/* RDTSCP waits until all previous instructions have executed and all
+   previous loads are globally visible before reading the counter.
+   RDTSC doesn't wait until all previous instructions have been executed
+   before reading the counter.  When --enable-rdtscp-in-benchtests is
+   used, use RDTSCP in benchtests.  */
+# define HP_TIMING_NOW(Var) \
+  (__extension__ ({ \
+    unsigned int __aux; \
+    (Var) = __builtin_ia32_rdtscp (&__aux); \
+  }))
+# else
+# define HP_TIMING_NOW(Var) ((Var) = __builtin_ia32_rdtsc ())
+# endif
 
 # include <hp-timing-common.h>
 #else
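
For illustration only, and not part of the patch: a minimal C sketch of
how the two HP_TIMING_NOW variants could be compared outside glibc.  It
assumes GCC or Clang on x86.  The names read_tsc, read_tscp, bench_fn,
min_cycles and touch_buffer are hypothetical and do not appear in
benchtests; only the two __builtin_ia32_* calls come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if !defined(__x86_64__) && !defined(__i386__)
# error "x86 only: this sketch relies on the RDTSC/RDTSCP builtins"
#endif

/* Hypothetical helper: plain RDTSC, as in the default HP_TIMING_NOW.  */
static inline uint64_t
read_tsc (void)
{
  return __builtin_ia32_rdtsc ();
}

/* Hypothetical helper: RDTSCP, as in the ENABLE_RDTSCP variant.  The
   builtin also stores the IA32_TSC_AUX value through its argument; it
   is ignored here, just as the patch ignores __aux.  */
static inline uint64_t
read_tscp (void)
{
  unsigned int aux;
  return __builtin_ia32_rdtscp (&aux);
}

typedef void (*bench_fn) (void *);

/* Run FN (ARG) ITERS times and return the smallest observed cycle
   delta, read with RDTSCP if USE_RDTSCP is nonzero, else with RDTSC.  */
static uint64_t
min_cycles (bench_fn fn, void *arg, unsigned int iters, int use_rdtscp)
{
  assert (iters > 0);
  uint64_t best = UINT64_MAX;
  for (unsigned int i = 0; i < iters; i++)
    {
      uint64_t start = use_rdtscp ? read_tscp () : read_tsc ();
      fn (arg);
      uint64_t end = use_rdtscp ? read_tscp () : read_tsc ();
      uint64_t delta = end - start;
      if (delta < best)
        best = delta;
    }
  return best;
}

/* Hypothetical workload: write 4096 bytes through a volatile pointer
   so that the stores are not optimized away.  */
static void
touch_buffer (void *arg)
{
  volatile unsigned char *p = arg;
  for (size_t i = 0; i < 4096; i++)
    p[i] = (unsigned char) i;
}
```

A caller could pass a 4096-byte buffer and compare
min_cycles (touch_buffer, buf, 1000, 0) against
min_cycles (touch_buffer, buf, 1000, 1).  Taking the minimum over many
runs is a design choice of this sketch to reduce noise from interrupts
and cache misses; it is not something the patch itself does.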