V2 [PATCH] x86: Use _rdtsc intrinsic for HP_TIMING_NOW
Commit Message
On Tue, Oct 2, 2018 at 11:06 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Oct 2, 2018 at 10:23 AM Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
> >
> >
> >
> > On 02/10/2018 13:56, H.J. Lu wrote:
> > > On Tue, Oct 2, 2018 at 6:45 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >>
> > >> On Tue, Oct 2, 2018 at 6:15 AM Adhemerval Zanella
> > >> <adhemerval.zanella@linaro.org> wrote:
> > >
> > >>> I am not very found of cross abi includes like this one, wouldn't be better
> > >>> to create a 'sysdeps/x86/hp-timing.h' and for 32-bit use _rdtsc iff
> > >>> __i686__ is defined otherwise include generic header?
> > >>
> > >> I can do that.
> > >
> > > Here is the V2 patch. OK for master?
> > >
> >
> > > NB: Checking if __i686__ isn't sufficient since __i686__ may not be
> > > defined when building for i686 class processors.
> >
> > Right, it seems gcc does not define it for -march newer than pentium4.
> >
> > > diff --git a/sysdeps/i386/i586/init-arch.h b/sysdeps/i386/i586/isa.h
> > > similarity index 89%
> > > rename from sysdeps/i386/i586/init-arch.h
> > > rename to sysdeps/i386/i586/isa.h
> > > index 72fb46c61e..a2b5d585d4 100644
> > > --- a/sysdeps/i386/i586/init-arch.h
> > > +++ b/sysdeps/i386/i586/isa.h
> > > @@ -1,4 +1,4 @@
> > > -/* Copyright (C) 2015-2018 Free Software Foundation, Inc.
> > > +/* Copyright (C) 2018 Free Software Foundation, Inc.
> > > This file is part of the GNU C Library.
> > >
> > > The GNU C Library is free software; you can redistribute it and/or
> > > @@ -15,5 +15,9 @@
> > > License along with the GNU C Library; if not, see
> > > <http://www.gnu.org/licenses/>. */
> > >
> > > +#ifndef _isa_h
> > > +#define _isa_h
> > > +
> >
> > Not sure if this is explicit on our coding rules, but usually we
> > use uppercase caps for preprocessor guards. I think it requires
> > a one-line description as well (same for other isa.h files).
>
> I will change that.
>
> > > -#ifndef _HP_TIMING_H
> > > -#define _HP_TIMING_H 1
> > > +#include <isa.h>
> > > +
> > > +#if MINIMUM_ISA == 686 || MINIMUM_ISA == 8664
> > > +# ifndef _HP_TIMING_H
> > > +# define _HP_TIMING_H 1
> > > +
> >
> > The include guard inside the MINIMUM_ISA check seems off, why do
> > you need it?
>
> <sysdeps/generic/hp-timing.h> will be skipped if _HP_TIMING_H is
> defined. That is the draw back for the single x86 hp-timing.h.
Here is the updated patch. OK for master?
Comments
On 03/10/2018 20:08, H.J. Lu wrote:
> On Tue, Oct 2, 2018 at 11:06 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Oct 2, 2018 at 10:23 AM Adhemerval Zanella
>> <adhemerval.zanella@linaro.org> wrote:
>>>
>>>
>>> On 02/10/2018 13:56, H.J. Lu wrote:
>>>> On Tue, Oct 2, 2018 at 6:45 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> On Tue, Oct 2, 2018 at 6:15 AM Adhemerval Zanella
>>>>> <adhemerval.zanella@linaro.org> wrote:
>>>>>> I am not very found of cross abi includes like this one, wouldn't be better
>>>>>> to create a 'sysdeps/x86/hp-timing.h' and for 32-bit use _rdtsc iff
>>>>>> __i686__ is defined otherwise include generic header?
>>>>> I can do that.
>>>> Here is the V2 patch. OK for master?
>>>>
>>>> NB: Checking if __i686__ isn't sufficient since __i686__ may not be
>>>> defined when building for i686 class processors.
>>> Right, it seems gcc does not define it for -march newer than pentium4.
>>>
>>>> diff --git a/sysdeps/i386/i586/init-arch.h b/sysdeps/i386/i586/isa.h
>>>> similarity index 89%
>>>> rename from sysdeps/i386/i586/init-arch.h
>>>> rename to sysdeps/i386/i586/isa.h
>>>> index 72fb46c61e..a2b5d585d4 100644
>>>> --- a/sysdeps/i386/i586/init-arch.h
>>>> +++ b/sysdeps/i386/i586/isa.h
>>>> @@ -1,4 +1,4 @@
>>>> -/* Copyright (C) 2015-2018 Free Software Foundation, Inc.
>>>> +/* Copyright (C) 2018 Free Software Foundation, Inc.
>>>> This file is part of the GNU C Library.
>>>>
>>>> The GNU C Library is free software; you can redistribute it and/or
>>>> @@ -15,5 +15,9 @@
>>>> License along with the GNU C Library; if not, see
>>>> <http://www.gnu.org/licenses/>. */
>>>>
>>>> +#ifndef _isa_h
>>>> +#define _isa_h
>>>> +
>>> Not sure if this is explicit on our coding rules, but usually we
>>> use uppercase caps for preprocessor guards. I think it requires
>>> a one-line description as well (same for other isa.h files).
>> I will change that.
>>
>>>> -#ifndef _HP_TIMING_H
>>>> -#define _HP_TIMING_H 1
>>>> +#include <isa.h>
>>>> +
>>>> +#if MINIMUM_ISA == 686 || MINIMUM_ISA == 8664
>>>> +# ifndef _HP_TIMING_H
>>>> +# define _HP_TIMING_H 1
>>>> +
>>> The include guard inside the MINIMUM_ISA check seems off, why do
>>> you need it?
>> <sysdeps/generic/hp-timing.h> will be skipped if _HP_TIMING_H is
>> defined. That is the draw back for the single x86 hp-timing.h.
> Here is the updated patch. OK for master?
LGTM.
> index 59af526fdb..1c20e9d828 100644
> --- a/sysdeps/i386/i686/hp-timing.h
> +++ b/sysdeps/x86/hp-timing.h
> @@ -1,7 +1,6 @@
> -/* High precision, low overhead timing functions. i686 version.
> - Copyright (C) 1998-2018 Free Software Foundation, Inc.
> +/* High precision, low overhead timing functions. x86 version.
> + Copyright (C) 2018 Free Software Foundation, Inc.
> This file is part of the GNU C Library.
> - Contributed by Ulrich Drepper <drepper@cygnus.com>, 1998.
>
> The GNU C Library is free software; you can redistribute it and/or
> modify it under the terms of the GNU Lesser General Public
> @@ -20,12 +19,17 @@
> #ifndef _HP_TIMING_H
> #define _HP_TIMING_H 1
Another possibility is define _HP_TIMING_X86_1 and not undefine it below,
I don't have a strong opinion, so either way look good to me.
* H. J. Lu:
> Since _rdtsc intrinsic is supported in GCC 4.9, we can use it for
> HP_TIMING_NOW. This patch
>
> 1. Create x86 hp-timing.h to replace i686 and x86_64 hp-timing.h.
> 2. Move MINIMUM_ISA from init-arch.h to isa.h so that x86 hp-timing.h
> can check minimum x86 ISA to decide if _rdtsc can be used.
>
> NB: Checking if __i686__ isn't sufficient since __i686__ may not be
> defined when building for i686 class processors.
>
> * sysdeps/i386/init-arch.h: Removed.
> * sysdeps/i386/i586/init-arch.h: Likewise.
> * sysdeps/i386/i686/init-arch.h: Likewise.
> * sysdeps/i386/i686/hp-timing.h: Likewise.
> * sysdeps/x86_64/hp-timing.h: Likewise.
> * sysdeps/i386/isa.h: New file.
> * sysdeps/i386/i586/isa.h: Likewise.
> * sysdeps/i386/i686/isa.h: Likewise.
> * sysdeps/x86_64/isa.h: Likewise.
> * sysdeps/x86/hp-timing.h: New file.
> * sysdeps/x86/init-arch.h: Include <isa.h>.
This patch results in a substantial build time increase on Debian with
GCC 6: instead of less than two minutes I now see more than four
minutes for a full build (excluding testing).
I tried to fix this by avoiding the #undef in sysdeps/x86/hp-timing.h,
as Adhemerval suggested, but it did not help. It turns out that
including <x86intrin.h> in the old header (sysdeps/x86_64/hp-timing.h)
is sufficient to trigger the regression in build performance.
I think the regression substantial enough to revert the patch.
Comments?
I'm trying to remove the #include <hp-timing.h> from
<libc-internal.h>, but I don't know how hard this will be.
* Florian Weimer:
> I'm trying to remove the #include <hp-timing.h> from
> <libc-internal.h>, but I don't know how hard this will be.
That doesn't help. It's pulled in via <tls.h> and <descr.h> as well.
We would need a separate <hp_timing_t.h> header file to avoid that
because hp_timing_t is required in a few central places (thread
descriptor and link map).
From 01974a82401e0facbefe46f5d5a1423314083d7d Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 1 Oct 2018 15:05:34 -0700
Subject: [PATCH] x86: Use _rdtsc intrinsic for HP_TIMING_NOW
Since _rdtsc intrinsic is supported in GCC 4.9, we can use it for
HP_TIMING_NOW. This patch
1. Create x86 hp-timing.h to replace i686 and x86_64 hp-timing.h.
2. Move MINIMUM_ISA from init-arch.h to isa.h so that x86 hp-timing.h
can check minimum x86 ISA to decide if _rdtsc can be used.
NB: Checking if __i686__ isn't sufficient since __i686__ may not be
defined when building for i686 class processors.
* sysdeps/i386/init-arch.h: Removed.
* sysdeps/i386/i586/init-arch.h: Likewise.
* sysdeps/i386/i686/init-arch.h: Likewise.
* sysdeps/i386/i686/hp-timing.h: Likewise.
* sysdeps/x86_64/hp-timing.h: Likewise.
* sysdeps/i386/isa.h: New file.
* sysdeps/i386/i586/isa.h: Likewise.
* sysdeps/i386/i686/isa.h: Likewise.
* sysdeps/x86_64/isa.h: Likewise.
* sysdeps/x86/hp-timing.h: New file.
* sysdeps/x86/init-arch.h: Include <isa.h>.
---
sysdeps/i386/i586/{init-arch.h => isa.h} | 9 ++++--
sysdeps/i386/i686/{init-arch.h => isa.h} | 9 ++++--
sysdeps/i386/{init-arch.h => isa.h} | 9 ++++--
sysdeps/{i386/i686 => x86}/hp-timing.h | 28 +++++++++++------
sysdeps/x86/init-arch.h | 1 +
sysdeps/x86_64/hp-timing.h | 40 ------------------------
sysdeps/x86_64/isa.h | 24 ++++++++++++++
7 files changed, 65 insertions(+), 55 deletions(-)
rename sysdeps/i386/i586/{init-arch.h => isa.h} (85%)
rename sysdeps/i386/i686/{init-arch.h => isa.h} (85%)
rename sysdeps/i386/{init-arch.h => isa.h} (85%)
rename sysdeps/{i386/i686 => x86}/hp-timing.h (69%)
delete mode 100644 sysdeps/x86_64/hp-timing.h
create mode 100644 sysdeps/x86_64/isa.h
similarity index 85%
rename from sysdeps/i386/i586/init-arch.h
rename to sysdeps/i386/i586/isa.h
@@ -1,4 +1,5 @@
-/* Copyright (C) 2015-2018 Free Software Foundation, Inc.
+/* x86 ISA info. i586 version.
+ Copyright (C) 2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -15,5 +16,9 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#ifndef _ISA_H
+#define _ISA_H
+
#define MINIMUM_ISA 586
-#include <sysdeps/x86/init-arch.h>
+
+#endif
similarity index 85%
rename from sysdeps/i386/i686/init-arch.h
rename to sysdeps/i386/i686/isa.h
@@ -1,4 +1,5 @@
-/* Copyright (C) 2015-2018 Free Software Foundation, Inc.
+/* x86 ISA info. i686 version.
+ Copyright (C) 2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -15,5 +16,9 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#ifndef _ISA_H
+#define _ISA_H
+
#define MINIMUM_ISA 686
-#include <sysdeps/x86/init-arch.h>
+
+#endif
similarity index 85%
rename from sysdeps/i386/init-arch.h
rename to sysdeps/i386/isa.h
@@ -1,4 +1,5 @@
-/* Copyright (C) 2015-2018 Free Software Foundation, Inc.
+/* x86 ISA info. i486 version.
+ Copyright (C) 2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -15,5 +16,9 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#ifndef _ISA_H
+#define _ISA_H
+
#define MINIMUM_ISA 486
-#include <sysdeps/x86/init-arch.h>
+
+#endif
similarity index 69%
rename from sysdeps/i386/i686/hp-timing.h
rename to sysdeps/x86/hp-timing.h
@@ -1,7 +1,6 @@
-/* High precision, low overhead timing functions. i686 version.
- Copyright (C) 1998-2018 Free Software Foundation, Inc.
+/* High precision, low overhead timing functions. x86 version.
+ Copyright (C) 2018 Free Software Foundation, Inc.
This file is part of the GNU C Library.
- Contributed by Ulrich Drepper <drepper@cygnus.com>, 1998.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
@@ -20,12 +19,17 @@
#ifndef _HP_TIMING_H
#define _HP_TIMING_H 1
+#include <isa.h>
+
+#if MINIMUM_ISA == 686 || MINIMUM_ISA == 8664
+# include <x86intrin.h>
+
/* We always assume having the timestamp register. */
-#define HP_TIMING_AVAIL (1)
-#define HP_SMALL_TIMING_AVAIL (1)
+# define HP_TIMING_AVAIL (1)
+# define HP_SMALL_TIMING_AVAIL (1)
/* We indeed have inlined functions. */
-#define HP_TIMING_INLINE (1)
+# define HP_TIMING_INLINE (1)
/* We use 64bit values for the times. */
typedef unsigned long long int hp_timing_t;
@@ -35,8 +39,14 @@ typedef unsigned long long int hp_timing_t;
running in this moment. This could be changed by using a barrier like
'cpuid' right before the `rdtsc' instruciton. But we are not interested
in accurate clock cycles here so we don't do this. */
-#define HP_TIMING_NOW(Var) __asm__ __volatile__ ("rdtsc" : "=A" (Var))
+# define HP_TIMING_NOW(Var) ((Var) = _rdtsc ())
-#include <hp-timing-common.h>
+# include <hp-timing-common.h>
+#else
+/* NB: Undefine _HP_TIMING_H so that <sysdeps/generic/hp-timing.h> will
+ be included. */
+# undef _HP_TIMING_H
+# include <sysdeps/generic/hp-timing.h>
+#endif
-#endif /* hp-timing.h */
+#endif /* hp-timing.h */
@@ -21,6 +21,7 @@
# include <ldsodefs.h>
#endif
#include <ifunc-init.h>
+#include <isa.h>
#ifndef __x86_64__
/* Due to the reordering and the other nifty extensions in i686, it is
deleted file mode 100644
@@ -1,40 +0,0 @@
-/* High precision, low overhead timing functions. x86-64 version.
- Copyright (C) 2002-2018 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#ifndef _HP_TIMING_H
-#define _HP_TIMING_H 1
-
-/* We always assume having the timestamp register. */
-#define HP_TIMING_AVAIL (1)
-#define HP_SMALL_TIMING_AVAIL (1)
-
-/* We indeed have inlined functions. */
-#define HP_TIMING_INLINE (1)
-
-/* We use 64bit values for the times. */
-typedef unsigned long long int hp_timing_t;
-
-/* The "=A" constraint used in 32-bit mode does not work in 64-bit mode. */
-#define HP_TIMING_NOW(Var) \
- ({ unsigned int _hi, _lo; \
- asm volatile ("rdtsc" : "=a" (_lo), "=d" (_hi)); \
- (Var) = ((unsigned long long int) _hi << 32) | _lo; })
-
-#include <hp-timing-common.h>
-
-#endif /* hp-timing.h */
new file mode 100644
@@ -0,0 +1,24 @@
+/* x86 ISA info. x86-64 version.
+ Copyright (C) 2018 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _ISA_H
+#define _ISA_H
+
+#define MINIMUM_ISA 8664
+
+#endif
--
2.17.1