x86: Don't include <x86intrin.h>

Message ID CAMe9rOpkqy+pr7mzkLr5Q4opkeubmXeQrn5=R1zjyW5tUQos5w@mail.gmail.com
State New, archived

Commit Message

H.J. Lu Oct. 21, 2018, 1:11 a.m. UTC
  On 10/20/18, H.J. Lu <hjl.tools@gmail.com> wrote:
> On 10/20/18, Florian Weimer <fw@deneb.enyo.de> wrote:
>> * Florian Weimer:
>>
>>
>>> I'm trying to remove the #include <hp-timing.h> from
>>> <libc-internal.h>, but I don't know how hard this will be.
>>
>> That doesn't help.  It's pulled in via <tls.h> and <descr.h> as well.
>> We would need a separate <hp_timing_t.h> header file to avoid that
>> because hp_timing_t is required in a few central places (thread
>> descriptor and link map).
>>
>
> Please try this.
>

It is not about GCC 6.  GCC 8 has the same issue.  Here is the updated
patch.
  

Comments

Florian Weimer Oct. 21, 2018, 7:04 a.m. UTC | #1
* H. J. Lu:

> It is not about GCC 6.  GCC 8 has the same issue.  Here is the updated
> patch.

Ah, sorry, I didn't see your second message.  Yes, the patch looks good.
  

Patch

From 157f63aef12d8f6abe62996638903c483306fc6a Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Sat, 20 Oct 2018 15:22:01 -0700
Subject: [PATCH] x86: Don't include <x86intrin.h>

Use __builtin_ia32_rdtsc directly since including <x86intrin.h> makes
building glibc very slow.  On Intel Core i5-6260U, this patch reduces
x86-64 build time from 8 minutes 33 seconds to 3 minutes 48 seconds
with "make -j4" and GCC 8.2.1.

	* sysdeps/x86/hp-timing.h: Don't include <x86intrin.h>.
	(HP_TIMING_NOW): Replace _rdtsc with __builtin_ia32_rdtsc.
---
 sysdeps/x86/hp-timing.h | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/sysdeps/x86/hp-timing.h b/sysdeps/x86/hp-timing.h
index 1c20e9d828..77a1360748 100644
--- a/sysdeps/x86/hp-timing.h
+++ b/sysdeps/x86/hp-timing.h
@@ -22,8 +22,6 @@ 
 #include <isa.h>
 
 #if MINIMUM_ISA == 686 || MINIMUM_ISA == 8664
-# include <x86intrin.h>
-
 /* We always assume having the timestamp register.  */
 # define HP_TIMING_AVAIL	(1)
 # define HP_SMALL_TIMING_AVAIL	(1)
@@ -38,8 +36,11 @@  typedef unsigned long long int hp_timing_t;
    might not be 100% accurate since there might be some more instructions
    running in this moment.  This could be changed by using a barrier like
    'cpuid' right before the `rdtsc' instruciton.  But we are not interested
-   in accurate clock cycles here so we don't do this.  */
-# define HP_TIMING_NOW(Var)	((Var) = _rdtsc ())
+   in accurate clock cycles here so we don't do this.
+
+   NB: Use __builtin_ia32_rdtsc directly since including <x86intrin.h>
+   makes building glibc very slow.  */
+# define HP_TIMING_NOW(Var)	((Var) = __builtin_ia32_rdtsc ())
 
 # include <hp-timing-common.h>
 #else
-- 
2.17.2
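
For readers who want to try the idea outside the glibc tree, here is a
minimal standalone sketch of the same pattern: read the timestamp counter
through the compiler builtin, with no <x86intrin.h> include.  It assumes
GCC or Clang targeting x86.  The architecture guard, the non-x86 fallback
and the sample_ticks helper are illustrative assumptions and are not part
of the patch; glibc itself selects the branch with MINIMUM_ISA from <isa.h>.

```c
/* Same underlying type as hp_timing_t in sysdeps/x86/hp-timing.h.  */
typedef unsigned long long int hp_timing_t;

#if defined (__x86_64__) || defined (__i386__)
/* GCC and Clang expose rdtsc as a builtin on x86, so no header is
   needed to use it.  */
# define HP_TIMING_AVAIL	(1)
# define HP_TIMING_NOW(Var)	((Var) = __builtin_ia32_rdtsc ())
#else
/* Assumption for this sketch only: on other targets report that no
   timestamp counter is available and store 0.  */
# define HP_TIMING_AVAIL	(0)
# define HP_TIMING_NOW(Var)	((Var) = 0)
#endif

/* Hypothetical helper: take two back-to-back readings and return the
   difference.  As the comment in hp-timing.h notes, no barrier is
   used, so the result is only an approximate tick count.  */
static hp_timing_t
sample_ticks (void)
{
  hp_timing_t start, end;

  HP_TIMING_NOW (start);
  HP_TIMING_NOW (end);
  return end - start;
}
```

Since rdtsc is not a serializing instruction, the value returned by
sample_ticks is a rough indication rather than an exact cycle count,
which matches the accuracy HP_TIMING_NOW already accepts.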