V2 [PATCH] x86: Use _rdtsc intrinsic for HP_TIMING_NOW
Commit Message
On 10/20/18, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Florian Weimer:
>
>
>> I'm trying to remove the #include <hp-timing.h> from
>> <libc-internal.h>, but I don't know how hard this will be.
>
> That doesn't help. It's pulled in via <tls.h> and <descr.h> as well.
> We would need a separate <hp_timing_t.h> header file to avoid that
> because hp_timing_t is required in a few central places (thread
> descriptor and link map).
>
Please try this.
Comments
* H. J. Lu:
> + NB: Use __builtin_ia32_rdtsc directly since including <x86intrin.h>
> + is very slow with GCC 6. */
Thanks for the patch.
It fixes the issue, but the comment is wrong: I see the same slowdown
with GCC 8.
From a4fac17d8bc95fc405faa1fac0a383719b72bccf Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Sat, 20 Oct 2018 15:22:01 -0700
Subject: [PATCH] x86: Don't include <x86intrin.h>
Use __builtin_ia32_rdtsc directly since including <x86intrin.h> is very
slow with GCC 6.
* sysdeps/x86/hp-timing.h: Don't include <x86intrin.h>.
(HP_TIMING_NOW): Replace _rdtsc with __builtin_ia32_rdtsc.
---
sysdeps/x86/hp-timing.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
@@ -22,8 +22,6 @@
#include <isa.h>
#if MINIMUM_ISA == 686 || MINIMUM_ISA == 8664
-# include <x86intrin.h>
-
/* We always assume having the timestamp register. */
# define HP_TIMING_AVAIL (1)
# define HP_SMALL_TIMING_AVAIL (1)
@@ -38,8 +36,11 @@ typedef unsigned long long int hp_timing_t;
might not be 100% accurate since there might be some more instructions
running in this moment. This could be changed by using a barrier like
'cpuid' right before the `rdtsc' instruciton. But we are not interested
- in accurate clock cycles here so we don't do this. */
-# define HP_TIMING_NOW(Var) ((Var) = _rdtsc ())
+ in accurate clock cycles here so we don't do this.
+
+ NB: Use __builtin_ia32_rdtsc directly since including <x86intrin.h>
+ is very slow with GCC 6. */
+# define HP_TIMING_NOW(Var) ((Var) = __builtin_ia32_rdtsc ())
# include <hp-timing-common.h>
#else
--
2.17.2
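
For readers who want to try the macro outside the glibc tree, here is a minimal standalone sketch of the approach the patch takes: sampling the timestamp counter through the compiler builtin without including <x86intrin.h>. The hp_timing_t typedef and the HP_TIMING_NOW definition mirror the patch. The non-x86 fallback and the hp_timing_elapsed helper are assumptions added for illustration only and are not part of glibc.

```c
#include <time.h>

/* Same underlying type that the patch context shows for hp_timing_t.  */
typedef unsigned long long int hp_timing_t;

#if defined __x86_64__ || defined __i386__
/* As in the patch: read the TSC through the compiler builtin, so
   <x86intrin.h> never has to be included.  */
# define HP_TIMING_NOW(Var) ((Var) = __builtin_ia32_rdtsc ())
#else
/* Hypothetical fallback, not part of the patch, so that this sketch
   still compiles on non-x86 targets.  */
# define HP_TIMING_NOW(Var) ((Var) = (hp_timing_t) clock ())
#endif

/* Hypothetical helper: ticks elapsed between two samples.  Unsigned
   subtraction keeps the result well defined if the counter wraps.  */
static inline hp_timing_t
hp_timing_elapsed (hp_timing_t start, hp_timing_t end)
{
  return end - start;
}
```

A caller would sample twice, for example HP_TIMING_NOW (t0) before the measured region and HP_TIMING_NOW (t1) after it, then pass both values to hp_timing_elapsed. As the existing comment in hp-timing.h notes, no serializing barrier is issued before rdtsc, so the result is only approximate.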