Fix HP_SMALL_TIMING_AVAIL undef warnings

Message ID 1398897187.5201.23.camel@ubuntu-sellcey
State Superseded

Commit Message

Steve Ellcey April 30, 2014, 10:33 p.m. UTC
  Here is a new *untested* patch for the HP_SMALL_TIMING_AVAIL warnings but
the patch actually does much more than get rid of these warnings.  There
was a lot of duplication in the various platform hp-timing.h header files,
but these were all different from the generic hp-timing.h header file which
assumes you are not implementing any timing functionality.  So I created
a generic hp-timing-enabled.h header file, put the macros in there and
had the various platform files include that.

The two macros I did not define in hp-timing-enabled.h are HP_TIMING_NOW
and HP_TIMING_ACCUM.  These are pretty different across all the targets
and so I left each target to define its own.  I also couldn't put the
typedef of hp_timing_t in the shared file because it is different for
one target (alpha), where it is 32 bits instead of 64 like it is on the
other targets, but I did create a default HP_TIMING_TYPE that could
be used in the hp_timing_t typedef declaration for each target and only
alpha had to redefine it.
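
The override scheme described above can be sketched in one translation unit. This is only an illustration under stated assumptions, not code from the patch: the two "headers" are inlined, and every EXAMPLE_ name (including the EXAMPLE_TARGET_32BIT switch) is a hypothetical stand-in for the real HP_TIMING_* macros.

```c
#include <stdint.h>

/* --- Stand-in for the shared header (hypothetical EXAMPLE_ names). --- */
/* Default: 64-bit timestamps, as on most targets.  */
#define EXAMPLE_TIMING_TYPE int64_t
#define EXAMPLE_TIMING_DIFF(Diff, Start, End) ((Diff) = (End) - (Start))

/* --- Stand-in for a target header that needs a narrower type. --- */
#ifdef EXAMPLE_TARGET_32BIT
# undef EXAMPLE_TIMING_TYPE
# define EXAMPLE_TIMING_TYPE unsigned int
#endif

/* Each target declares the typedef itself, after any override.  */
typedef EXAMPLE_TIMING_TYPE example_timing_t;

/* Generic helper built only from the shared macros.  */
static inline example_timing_t
example_elapsed (example_timing_t start, example_timing_t end)
{
  example_timing_t diff;
  EXAMPLE_TIMING_DIFF (diff, start, end);
  return diff;
}
```

Building with -DEXAMPLE_TARGET_32BIT would select the narrower type without touching the shared definitions, which is the role the alpha header plays here.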

This isn't ready for checkin since it isn't tested (I did do a build on
x86) but I was hoping someone could review it for design issues and
that the various platform maintainers for the platforms affected by the
change (x86, alpha, sparc, ia64, powerpc) could try it out.

Steve Ellcey
sellcey@mips.com


2014-04-30  Steve Ellcey  <sellcey@mips.com>

	* sysdeps/generic/hp-timing-enabled.h: New file.
	* sysdeps/generic/hp-timing.h (HP_SMALL_TIMING_AVAIL):
	Set default value.
	* sysdeps/powerpc/powerpc32/hp-timing.h (HP_SMALL_TIMING_AVAIL): Ditto.
	* sysdeps/alpha/hp-timing.h: Include hp-timing-enabled.h to get
	default defintions of HP_TIMING_* macros.
	* sysdeps/i386/i686/hp-timing.h: Ditto.
	* sysdeps/ia64/hp-timing.h: Ditto.
	* sysdeps/powerpc/powerpc32/power4/hp-timing.h: Ditto.
	* sysdeps/powerpc/powerpc64/hp-timing.h: Ditto.
	* sysdeps/sparc/sparc32/sparcv9/hp-timing.h: Ditto.
	* sysdeps/sparc/sparc64/hp-timing.h: Ditto.
	* elf/dl-support.c: Use #if to check HP_SMALL_TIMING_AVAIL.
	* elf/rtld.c: Ditto.
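
A note on why the last two entries switch from #ifdef to #if: once every header defines the macro (to 0 or 1), #ifdef is always true, so the test has to look at the value instead. A minimal sketch, using a hypothetical EXAMPLE_ macro rather than the real one:

```c
/* Every header now defines the macro, to 0 or 1 (hypothetical name).  */
#define EXAMPLE_SMALL_TIMING_AVAIL (0)

/* #ifdef only asks whether the name is defined, so it is true here.  */
#ifdef EXAMPLE_SMALL_TIMING_AVAIL
# define EXAMPLE_IFDEF_TAKEN 1
#else
# define EXAMPLE_IFDEF_TAKEN 0
#endif

/* #if evaluates the value, so the branch is skipped when it is 0.  */
#if EXAMPLE_SMALL_TIMING_AVAIL
# define EXAMPLE_IF_TAKEN 1
#else
# define EXAMPLE_IF_TAKEN 0
#endif
```

With #if, an identifier that is not defined at all is treated as 0, and GCC's -Wundef option reports that case; giving the macro a default value in every header avoids it.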
  

Comments

Will Newton June 25, 2014, 4 p.m. UTC | #1
On 30 April 2014 23:33, Steve Ellcey <sellcey@mips.com> wrote:

> Here is a new *untested* patch for the HP_SMALL_TIMING_AVAIL warnings but
> the patch actually does much more than get rid of these warnings.  There
> was a lot of duplication in the various platform hp-timing.h header files,
> but these were all different from the generic hp-timing.h header file which
> assumes you are not implementing any timing functionality.  So I created
> a generic hp-timing-enabled.h header file, put the macros in there and
> had the various platform files include that.
>
> The two macros I did not define in hp-timing-enabled.h are HP_TIMING_NOW
> and HP_TIMING_ACCUM.  These are pretty different across all the targets
> and so I left each target to define its own.  I also couldn't put the
> typedef of hp_timing_t in the shared file because it is different for
> one target (alpha), where it is 32 bits instead of 64 like it is on the
> other targets, but I did create a default HP_TIMING_TYPE that could
> be used in the hp_timing_t typedef declaration for each target and only
> alpha had to redefine it.
>
> This isn't ready for checkin since it isn't tested (I did do a build on
> x86) but I was hoping someone could review it for design issues and
> that the various platform maintainers for the platforms affected by the
> change (x86, alpha, sparc, ia64, powerpc) could try it out.
>
> Steve Ellcey
> sellcey@mips.com
>
>
> 2014-04-30  Steve Ellcey  <sellcey@mips.com>
>
>         * sysdeps/generic/hp-timing-enabled.h: New file.
>         * sysdeps/generic/hp-timing.h (HP_SMALL_TIMING_AVAIL):
>         Set default value.
>         * sysdeps/powerpc/powerpc32/hp-timing.h (HP_SMALL_TIMING_AVAIL): Ditto.
>         * sysdeps/alpha/hp-timing.h: Include hp-timing-enabled.h to get
>         default defintions of HP_TIMING_* macros.

Typo: "definitions".

>         * sysdeps/i386/i686/hp-timing.h: Ditto.
>         * sysdeps/ia64/hp-timing.h: Ditto.
>         * sysdeps/powerpc/powerpc32/power4/hp-timing.h: Ditto.
>         * sysdeps/powerpc/powerpc64/hp-timing.h: Ditto.
>         * sysdeps/sparc/sparc32/sparcv9/hp-timing.h: Ditto.
>         * sysdeps/sparc/sparc64/hp-timing.h: Ditto.
>         * elf/dl-support.c: Use #if to check HP_SMALL_TIMING_AVAIL.
>         * elf/rtld.c: Ditto.

Sorry for taking so long to look at this. A couple of things:

HP_TIMING_MAX is documented but it's actually HP_TIMING_TYPE_MAX. I
wonder whether it might be as well to always make every architecture
define hp_timing_t and HP_TIMING_TYPE_MAX at the cost of a little
duplication (or we could make it ~(hp_timing_t)0?).
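
One caveat about the ~(hp_timing_t)0 idea, shown as a small sketch with hypothetical EXAMPLE_ names: the all-ones pattern is the maximum only when the type is unsigned. The patch's default HP_TIMING_TYPE is int64_t, and for a signed type the same expression yields -1.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not part of the patch.  */
typedef uint64_t example_unsigned_timing_t;
typedef int64_t example_signed_timing_t;

/* All bits set: the maximum value when the type is unsigned.  The outer
   cast guards against integer promotion for types narrower than int.  */
#define EXAMPLE_UNSIGNED_MAX \
  ((example_unsigned_timing_t) ~(example_unsigned_timing_t) 0)

/* The same expression on a signed type yields -1, not the maximum.  */
#define EXAMPLE_SIGNED_ALL_ONES \
  ((example_signed_timing_t) ~(example_signed_timing_t) 0)
```

So the shortcut would only work if every architecture's hp_timing_t were unsigned; otherwise an explicit per-type maximum is still needed.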

The original warnings are fixed but new ones are added of the form:

 In file included from ../sysdeps/x86_64/hp-timing.h:22:0,
                 from ../include/libc-internal.h:7,
                 from ../sysdeps/x86_64/nptl/tls.h:29,
                 from ../sysdeps/unix/sysv/linux/x86_64/sysdep.h:23,
                 from <stdin>:1:
../sysdeps/i386/i686/hp-timing.h:38:0: warning: "HP_TIMING_ACCUM" redefined [enabled by default]
 #define HP_TIMING_ACCUM(Sum, Diff) \
 ^
In file included from ../sysdeps/i386/i686/hp-timing.h:26:0,
                 from ../sysdeps/x86_64/hp-timing.h:22,
                 from ../include/libc-internal.h:7,
                 from ../sysdeps/x86_64/nptl/tls.h:29,
                 from ../sysdeps/unix/sysv/linux/x86_64/sysdep.h:23,
                 from <stdin>:1:
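
The include chain above suggests that the shared header's HP_TIMING_ACCUM is seen first and the i686 header then defines its own version on top of it. One possible way to avoid the diagnostic, sketched with hypothetical EXAMPLE_ names, is to #undef the generic macro before redefining it, as the alpha header in this patch already does for HP_TIMING_AVAIL:

```c
/* Stand-in for the shared header: a generic default.
   EXAMPLE_ names are hypothetical, not from glibc.  */
#define EXAMPLE_OVERHEAD 2
#define EXAMPLE_ADJUST(Diff) (Diff)

/* Stand-in for a target header that wants its own version.  Without the
   #undef, redefining the macro with a different body is what draws the
   "redefined" diagnostic.  */
#undef EXAMPLE_ADJUST
#define EXAMPLE_ADJUST(Diff) ((Diff) - EXAMPLE_OVERHEAD)
```

An alternative would be to leave the macro out of the shared header altogether, which would match the patch description's statement that each target defines HP_TIMING_ACCUM itself.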
  

Patch

diff --git a/elf/dl-support.c b/elf/dl-support.c
index e435436..e9cfa64 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -130,7 +130,7 @@  void *_dl_random;
 #include <dl-procinfo.c>
 
 /* We expect less than a second for relocation.  */
-#ifdef HP_SMALL_TIMING_AVAIL
+#if HP_SMALL_TIMING_AVAIL
 # undef HP_TIMING_AVAIL
 # define HP_TIMING_AVAIL HP_SMALL_TIMING_AVAIL
 #endif
diff --git a/elf/rtld.c b/elf/rtld.c
index 9d121dc..5dcb61a 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -196,7 +196,7 @@  static struct libname_list _dl_rtld_libname;
 static struct libname_list _dl_rtld_libname2;
 
 /* We expect less than a second for relocation.  */
-#ifdef HP_SMALL_TIMING_AVAIL
+#if HP_SMALL_TIMING_AVAIL
 # undef HP_TIMING_AVAIL
 # define HP_TIMING_AVAIL HP_SMALL_TIMING_AVAIL
 #endif
diff --git a/sysdeps/alpha/hp-timing.h b/sysdeps/alpha/hp-timing.h
index 90f9b9d..0e2dfc5 100644
--- a/sysdeps/alpha/hp-timing.h
+++ b/sysdeps/alpha/hp-timing.h
@@ -23,63 +23,21 @@ 
 #include <string.h>
 #include <sys/param.h>
 #include <_itoa.h>
+#include <hp-timing-enabled.h>
 
-/* The macros defined here use the timestamp counter in IA-64.  They
-   provide a very accurate way to measure the time with very little
-   overhead.  The time values themself have no real meaning, only
-   differences are interesting.
-
-   The list of macros we need includes the following:
-
-   - HP_TIMING_AVAIL: test for availability.
-
-   - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
-     implemented using function calls but instead uses some inlined code
-     which might simply consist of a few assembler instructions.  We have to
-     know this since we might want to use the macros here in places where we
-     cannot make function calls.
-
-   - hp_timing_t: This is the type for variables used to store the time
-     values.
-
-   - HP_TIMING_ZERO: clear `hp_timing_t' object.
-
-   - HP_TIMING_NOW: place timestamp for current time in variable given as
-     parameter.
-
-   - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the
-     HP_TIMING_DIFF macro.
-
-   - HP_TIMING_DIFF: compute difference between two times and store it
-     in a third.  Source and destination might overlap.
-
-   - HP_TIMING_ACCUM: add time difference to another variable.  This might
-     be a bit more complicated to implement for some platforms as the
-     operation should be thread-safe and 64bit arithmetic on 32bit platforms
-     is not.
-
-   - HP_TIMING_ACCUM_NT: this is the variant for situations where we know
-     there are no threads involved.
+/* We use 32 bit values for the times.  */
+#undef HP_TIMING_TYPE
+#define HP_TIMING_TYPE unsigned int
 
-   - HP_TIMING_PRINT: write decimal representation of the timing value into
-     the given string.  This operation need not be inline even though
-     HP_TIMING_INLINE is specified.
-*/
+typedef HP_TIMING_TYPE hp_timing_t;
 
 /* We always have the timestamp register, but it's got only a 4 second
    range.  Use it for ld.so profiling only.  */
+#undef HP_TIMING_AVAIL
 #define HP_TIMING_AVAIL		(0)
+#undef HP_SMALL_TIMING_AVAIL
 #define HP_SMALL_TIMING_AVAIL	(1)
 
-/* We indeed have inlined functions.  */
-#define HP_TIMING_INLINE	(1)
-
-/* We use 32 bit values for the times.  */
-typedef unsigned int hp_timing_t;
-
-/* Set timestamp value to zero.  */
-#define HP_TIMING_ZERO(VAR)	(VAR) = (0)
-
 /* The "rpcc" instruction returns a 32-bit counting half and a 32-bit
    "virtual cycle counter displacement".  Subtracting the two gives us
    a virtual cycle count.  */
@@ -91,27 +49,10 @@  typedef unsigned int hp_timing_t;
   } while (0)
 
 /* ??? Two rpcc instructions can be scheduled simultaneously.  */
+#undef HP_TIMING_DIFF_INIT
 #define HP_TIMING_DIFF_INIT() do { } while (0)
 
-/* It's simple arithmetic for us.  */
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 /* ??? Don't bother, since we're only used for ld.so.  */
 #define HP_TIMING_ACCUM(Sum, Diff)  not implemented
 
-/* No threads, no extra work.  */
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-/* Print the time value.  */
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa_word (Val, __buf + sizeof (__buf), 10, 0);	      \
-    int __len = (Len);							      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " clock cycles", MIN (__len, sizeof (" clock cycles")));  \
-  } while (0)
-
 #endif	/* hp-timing.h */
diff --git a/sysdeps/generic/hp-timing-enabled.h b/sysdeps/generic/hp-timing-enabled.h
new file mode 100644
index 0000000..1ee137a
--- /dev/null
+++ b/sysdeps/generic/hp-timing-enabled.h
@@ -0,0 +1,142 @@ 
+/* High precision, low overhead timing functions.  Generic version.
+   Copyright (C) 1998-2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Ulrich Drepper <drepper@cygnus.com>, 1998.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _HP_TIMING_ENABLED_H
+#define _HP_TIMING_ENABLED_H	1
+
+/* This file defines *most* of the macros that platforms defining a timer
+   would need.  The HP_TIMING_NOW and HP_TIMING_ACCUM macros are not defined
+   here and must be defined in a platform specific header because they are
+   almost certainly going to contain inline assembly code.   The other macros
+   here may use HP_TIMING_NOW but are otherwise generic enough to be shared.
+   Some platforms may still need or want to undefine one or more of these and
+   redefine specific ones though.
+
+   Here is the list of macros a platform needs to support timers
+
+   - HP_TIMING_AVAIL: test for availability.
+
+   - HP_SMALL_TIMING_AVAIL: test for availability.
+
+   - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
+     implemented using function calls but instead uses some inlined code
+     which might simply consist of a few assembler instructions.  We have to
+     know this since we might want to use the macros here in places where we
+     cannot make function calls.
+
+   - HP_TIMING_TYPE: This is the type for variables used to store the time
+     values.
+
+   - HP_TIMING_MAX:  The MAX value for HP_TIMING_TYPE, used in initializing
+     a HP_TIMING_TYPE variable in HP_TIMING_DIFF_INIT.
+
+   - HP_TIMING_ZERO: clear `hp_timing_t' object.
+
+   - HP_TIMING_NOW: place timestamp for current time in variable given as
+     parameter.
+
+   - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the
+     HP_TIMING_DIFF macro.
+
+   - HP_TIMING_DIFF: compute difference between two times and store it
+     in a third.  Source and destination might overlap.
+
+   - HP_TIMING_ACCUM: add time difference to another variable.  This might
+     be a bit more complicated to implement for some platforms as the
+     operation should be thread-safe and 64bit arithmetic on 32bit platforms
+     is not.
+
+   - HP_TIMING_ACCUM_NT: this is the variant for situations where we know
+     there are no threads involved.
+
+   - HP_TIMING_PRINT: write decimal representation of the timing value into
+     the given string.  This operation need not be inline even though
+     HP_TIMING_INLINE is specified.
+
+   - hp_timing_t: This is not a macro but a typedef that should be declared
+     as type HP_TIMING_TYPE.
+
+*/
+
+/*
+   Provide definitions for a platform that has a timestamp register.
+   Some may be overridden with platform specific versions and HP_TIMING_NOW
+   must always be specified by platforms because there is no generic version
+   possible.
+*/
+
+#define HP_TIMING_AVAIL		(1)
+#define HP_SMALL_TIMING_AVAIL	(0)
+#define HP_TIMING_INLINE	(1)
+
+#define HP_TIMING_TYPE		int64_t
+#define HP_TIMING_TYPE_MAX	INT64_MAX
+
+#define HP_TIMING_ZERO(Var)	(Var) = (0)
+
+/* Use two HP_TIMING_NOW sequences in a row to find out how long it takes.  */
+#define HP_TIMING_DIFF_INIT() \
+  do {									      \
+    int __cnt = 5;							      \
+    GLRO(dl_hp_timing_overhead) = HP_TIMING_TYPE_MAX;			      \
+    do									      \
+      {									      \
+	hp_timing_t __t1, __t2;						      \
+	HP_TIMING_NOW (__t1);						      \
+	HP_TIMING_NOW (__t2);						      \
+	if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))			      \
+	  GLRO(dl_hp_timing_overhead) = __t2 - __t1;			      \
+      }									      \
+    while (--__cnt > 0);						      \
+  } while (0)
+
+/* It's simple arithmetic for us.  */
+#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
+
+/* We have to jump through hoops to get this correctly implemented.  */
+#define HP_TIMING_ACCUM(Sum, Diff) \
+  do {									      \
+    hp_timing_t __oldval;						      \
+    hp_timing_t __diff = (Diff) - GLRO(dl_hp_timing_overhead);		      \
+    hp_timing_t __newval;						      \
+    do									      \
+      {									      \
+	__oldval = (Sum);						      \
+	__newval = __oldval + __diff;					      \
+      }									      \
+    while (! __sync_bool_compare_and_swap (&Sum, __oldval, __newval));	      \
+  } while (0)
+
+/* No threads, no extra work.  */
+#define HP_TIMING_ACCUM_NT(Sum, Diff)		(Sum) += (Diff)
+
+/* Print the time value.  */
+#define HP_TIMING_PRINT(Buf, Len, Val) \
+  do {									      \
+    char __buf[20];							      \
+    char *__cp = _itoa_word (Val, __buf + sizeof (__buf), 10, 0);	      \
+    int __len = (Len);							      \
+    char *__dest = (Buf);						      \
+    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
+      *__dest++ = *__cp++;						      \
+    memcpy (__dest, " clock cycles", MIN (__len,			      \
+					  (int) sizeof (" clock cycles")));   \
+  } while (0)
+
+#endif	/* hp-timing-enabled.h */
diff --git a/sysdeps/generic/hp-timing.h b/sysdeps/generic/hp-timing.h
index eddc971..11559ca 100644
--- a/sysdeps/generic/hp-timing.h
+++ b/sysdeps/generic/hp-timing.h
@@ -30,6 +30,8 @@ 
 
    - HP_TIMING_AVAIL: test for availability.
 
+   - HP_SMALL_TIMING_AVAIL: test for availability.
+
    - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
      implemented using function calls but instead uses some inlined code
      which might simply consist of a few assembler instructions.  We have to
@@ -66,6 +68,7 @@ 
 
 /* Provide dummy definitions.  */
 #define HP_TIMING_AVAIL		(0)
+#define HP_SMALL_TIMING_AVAIL	(0)
 #define HP_TIMING_INLINE	(0)
 typedef int hp_timing_t;
 #define HP_TIMING_ZERO(Var)
diff --git a/sysdeps/i386/i686/hp-timing.h b/sysdeps/i386/i686/hp-timing.h
index 4a2006e..9a3334f 100644
--- a/sysdeps/i386/i686/hp-timing.h
+++ b/sysdeps/i386/i686/hp-timing.h
@@ -23,68 +23,9 @@ 
 #include <string.h>
 #include <sys/param.h>
 #include <_itoa.h>
+#include <hp-timing-enabled.h>
 
-/* The macros defined here use the timestamp counter in i586 and up versions
-   of the x86 processors.  They provide a very accurate way to measure the
-   time with very little overhead.  The time values themself have no real
-   meaning, only differences are interesting.
-
-   This version is for the i686 processors.  The difference to the i586
-   version is that the timerstamp register is unconditionally used.  This is
-   not the case for the i586 version where we have to perform runtime test
-   whether the processor really has this capability.  We have to make this
-   distinction since the sysdeps/i386/i586 code is supposed to work on all
-   platforms while the i686 already contains i686-specific code.
-
-   The list of macros we need includes the following:
-
-   - HP_TIMING_AVAIL: test for availability.
-
-   - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
-     implemented using function calls but instead uses some inlined code
-     which might simply consist of a few assembler instructions.  We have to
-     know this since we might want to use the macros here in places where we
-     cannot make function calls.
-
-   - hp_timing_t: This is the type for variables used to store the time
-     values.
-
-   - HP_TIMING_ZERO: clear `hp_timing_t' object.
-
-   - HP_TIMING_NOW: place timestamp for current time in variable given as
-     parameter.
-
-   - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the
-     HP_TIMING_DIFF macro.
-
-   - HP_TIMING_DIFF: compute difference between two times and store it
-     in a third.  Source and destination might overlap.
-
-   - HP_TIMING_ACCUM: add time difference to another variable.  This might
-     be a bit more complicated to implement for some platforms as the
-     operation should be thread-safe and 64bit arithmetic on 32bit platforms
-     is not.
-
-   - HP_TIMING_ACCUM_NT: this is the variant for situations where we know
-     there are no threads involved.
-
-   - HP_TIMING_PRINT: write decimal representation of the timing value into
-     the given string.  This operation need not be inline even though
-     HP_TIMING_INLINE is specified.
-
-*/
-
-/* We always assume having the timestamp register.  */
-#define HP_TIMING_AVAIL		(1)
-
-/* We indeed have inlined functions.  */
-#define HP_TIMING_INLINE	(1)
-
-/* We use 64bit values for the times.  */
-typedef unsigned long long int hp_timing_t;
-
-/* Set timestamp value to zero.  */
-#define HP_TIMING_ZERO(Var)	(Var) = (0)
+typedef HP_TIMING_TYPE hp_timing_t;
 
 /* That's quite simple.  Use the `rdtsc' instruction.  Note that the value
    might not be 100% accurate since there might be some more instructions
@@ -93,28 +34,6 @@  typedef unsigned long long int hp_timing_t;
    in accurate clock cycles here so we don't do this.  */
 #define HP_TIMING_NOW(Var)	__asm__ __volatile__ ("rdtsc" : "=A" (Var))
 
-/* Use two 'rdtsc' instructions in a row to find out how long it takes.  */
-#define HP_TIMING_DIFF_INIT() \
-  do {									      \
-    if (GLRO(dl_hp_timing_overhead) == 0)				      \
-      {									      \
-	int __cnt = 5;							      \
-	GLRO(dl_hp_timing_overhead) = ~0ull;				      \
-	do								      \
-	  {								      \
-	    hp_timing_t __t1, __t2;					      \
-	    HP_TIMING_NOW (__t1);					      \
-	    HP_TIMING_NOW (__t2);					      \
-	    if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))		      \
-	      GLRO(dl_hp_timing_overhead) = __t2 - __t1;		      \
-	  }								      \
-	while (--__cnt > 0);						      \
-      }									      \
-  } while (0)
-
-/* It's simple arithmetic for us.  */
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 /* We have to jump through hoops to get this correctly implemented.  */
 #define HP_TIMING_ACCUM(Sum, Diff) \
   do {									      \
@@ -138,19 +57,4 @@  typedef unsigned long long int hp_timing_t;
     while ((unsigned char) __not_done);					      \
   } while (0)
 
-/* No threads, no extra work.  */
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-/* Print the time value.  */
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa (Val, __buf + sizeof (__buf), 10, 0);		      \
-    size_t __len = (Len);						      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " clock cycles", MIN (__len, sizeof (" clock cycles")));  \
-  } while (0)
-
 #endif	/* hp-timing.h */
diff --git a/sysdeps/ia64/hp-timing.h b/sysdeps/ia64/hp-timing.h
index bf97b47..be7eb6e 100644
--- a/sysdeps/ia64/hp-timing.h
+++ b/sysdeps/ia64/hp-timing.h
@@ -24,62 +24,9 @@ 
 #include <sys/param.h>
 #include <_itoa.h>
 #include <ia64intrin.h>
+#include <hp-timing-enabled.h>
 
-/* The macros defined here use the timestamp counter in IA-64.  They
-   provide a very accurate way to measure the time with very little
-   overhead.  The time values themself have no real meaning, only
-   differences are interesting.
-
-   The list of macros we need includes the following:
-
-   - HP_TIMING_AVAIL: test for availability.
-
-   - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
-     implemented using function calls but instead uses some inlined code
-     which might simply consist of a few assembler instructions.  We have to
-     know this since we might want to use the macros here in places where we
-     cannot make function calls.
-
-   - hp_timing_t: This is the type for variables used to store the time
-     values.
-
-   - HP_TIMING_ZERO: clear `hp_timing_t' object.
-
-   - HP_TIMING_NOW: place timestamp for current time in variable given as
-     parameter.
-
-   - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the
-     HP_TIMING_DIFF macro.
-
-   - HP_TIMING_DIFF: compute difference between two times and store it
-     in a third.  Source and destination might overlap.
-
-   - HP_TIMING_ACCUM: add time difference to another variable.  This might
-     be a bit more complicated to implement for some platforms as the
-     operation should be thread-safe and 64bit arithmetic on 32bit platforms
-     is not.
-
-   - HP_TIMING_ACCUM_NT: this is the variant for situations where we know
-     there are no threads involved.
-
-   - HP_TIMING_PRINT: write decimal representation of the timing value into
-     the given string.  This operation need not be inline even though
-     HP_TIMING_INLINE is specified.
-
-*/
-
-/* We always assume having the timestamp register.  */
-#define HP_TIMING_AVAIL		(1)
-
-/* We indeed have inlined functions.  */
-#define HP_TIMING_INLINE	(1)
-
-/* We use 64bit values for the times.  */
-typedef unsigned long int hp_timing_t;
-
-/* Set timestamp value to zero.  */
-#define HP_TIMING_ZERO(Var)	(Var) = (0)
-
+typedef HP_TIMING_TYPE hp_timing_t;
 
 /* The Itanium/Merced has a bug where the ar.itc register value read
    is not correct in some situations.  The solution is to read again.
@@ -95,25 +42,6 @@  typedef unsigned long int hp_timing_t;
      while (REPEAT_READ (__itc));					      \
      Var = __itc; })
 
-/* Use two 'ar.itc' instructions in a row to find out how long it takes.  */
-#define HP_TIMING_DIFF_INIT() \
-  do {									      \
-    int __cnt = 5;							      \
-    GLRO(dl_hp_timing_overhead) = ~0ul;					      \
-    do									      \
-      {									      \
-	hp_timing_t __t1, __t2;						      \
-	HP_TIMING_NOW (__t1);						      \
-	HP_TIMING_NOW (__t2);						      \
-	if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))			      \
-	  GLRO(dl_hp_timing_overhead) = __t2 - __t1;			      \
-      }									      \
-    while (--__cnt > 0);						      \
-  } while (0)
-
-/* It's simple arithmetic for us.  */
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 /* We have to jump through hoops to get this correctly implemented.  */
 #define HP_TIMING_ACCUM(Sum, Diff) \
   do {									      \
@@ -128,20 +56,4 @@  typedef unsigned long int hp_timing_t;
     while (! __sync_bool_compare_and_swap (&Sum, __oldvar, __newval));	      \
   } while (0)
 
-/* No threads, no extra work.  */
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-/* Print the time value.  */
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa_word (Val, __buf + sizeof (__buf), 10, 0);	      \
-    int __len = (Len);							      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " clock cycles", MIN (__len,			      \
-					  (int) sizeof (" clock cycles")));   \
-  } while (0)
-
 #endif	/* hp-timing.h */
diff --git a/sysdeps/powerpc/powerpc32/hp-timing.h b/sysdeps/powerpc/powerpc32/hp-timing.h
index 9ff52eb..2931063 100644
--- a/sysdeps/powerpc/powerpc32/hp-timing.h
+++ b/sysdeps/powerpc/powerpc32/hp-timing.h
@@ -65,6 +65,7 @@ 
 
 /* Provide dummy definitions.  */
 #define HP_TIMING_AVAIL		(0)
+#define HP_SMALL_TIMING_AVAIL	(0)
 #define HP_TIMING_INLINE	(0)
 typedef unsigned long long int hp_timing_t;
 #define HP_TIMING_ZERO(Var)
@@ -78,4 +79,7 @@  typedef unsigned long long int hp_timing_t;
 /* Since this implementation is not available we tell the user about it.  */
 #define HP_TIMING_NONAVAIL	1
 
-#endif /* hp-timing.h */
+/* Provide a dummy hp_timing_t type.  */
+typedef int hp_timing_t;
+
+#endif /* hp-timing.h */
diff --git a/sysdeps/powerpc/powerpc32/power4/hp-timing.h b/sysdeps/powerpc/powerpc32/power4/hp-timing.h
index f1e3beb..47dabcd 100644
--- a/sysdeps/powerpc/powerpc32/power4/hp-timing.h
+++ b/sysdeps/powerpc/powerpc32/power4/hp-timing.h
@@ -24,60 +24,9 @@ 
 #include <sys/param.h>
 #include <_itoa.h>
 #include <atomic.h>
+#include <hp-timing-enabled.h>
 
-/* The macros defined here use the powerpc 64-bit time base register.
-   The time base is nominally clocked at 1/8th the CPU clock, but this
-   can vary.
-
-   The list of macros we need includes the following:
-
-   - HP_TIMING_AVAIL: test for availability.
-
-   - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
-     implemented using function calls but instead uses some inlined code
-     which might simply consist of a few assembler instructions.  We have to
-     know this since we might want to use the macros here in places where we
-     cannot make function calls.
-
-   - hp_timing_t: This is the type for variables used to store the time
-     values.
-
-   - HP_TIMING_ZERO: clear `hp_timing_t' object.
-
-   - HP_TIMING_NOW: place timestamp for current time in variable given as
-     parameter.
-
-   - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the
-     HP_TIMING_DIFF macro.
-
-   - HP_TIMING_DIFF: compute difference between two times and store it
-     in a third.  Source and destination might overlap.
-
-   - HP_TIMING_ACCUM: add time difference to another variable.  This might
-     be a bit more complicated to implement for some platforms as the
-     operation should be thread-safe and 64bit arithmetic on 32bit platforms
-     is not.
-
-   - HP_TIMING_ACCUM_NT: this is the variant for situations where we know
-     there are no threads involved.
-
-   - HP_TIMING_PRINT: write decimal representation of the timing value into
-     the given string.  This operation need not be inline even though
-     HP_TIMING_INLINE is specified.
-
-*/
-
-/* We always assume having the timestamp register.  */
-#define HP_TIMING_AVAIL		(1)
-
-/* We indeed have inlined functions.  */
-#define HP_TIMING_INLINE	(1)
-
-/* We use 64bit values for the times.  */
-typedef unsigned long long int hp_timing_t;
-
-/* Set timestamp value to zero.  */
-#define HP_TIMING_ZERO(Var)	(Var) = (0)
+typedef HP_TIMING_TYPE hp_timing_t;
 
 /* That's quite simple.  Use the `mftb' instruction.  Note that the value
    might not be 100% accurate since there might be some more instructions
@@ -98,30 +47,6 @@  typedef unsigned long long int hp_timing_t;
     Var = ((hp_timing_t) hi << 32) | lo;				\
   } while (0)
 
-
-/* Use two 'mftb' instructions in a row to find out how long it takes.
-   On current POWER4, POWER5, and 970 processors mftb take ~10 cycles.  */
-#define HP_TIMING_DIFF_INIT() \
-  do {									      \
-    if (GLRO(dl_hp_timing_overhead) == 0)				      \
-      {									      \
-	int __cnt = 5;							      \
-	GLRO(dl_hp_timing_overhead) = ~0ull;				      \
-	do								      \
-	  {								      \
-	    hp_timing_t __t1, __t2;					      \
-	    HP_TIMING_NOW (__t1);					      \
-	    HP_TIMING_NOW (__t2);					      \
-	    if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))		      \
-	      GLRO(dl_hp_timing_overhead) = __t2 - __t1;		      \
-	  }								      \
-	while (--__cnt > 0);						      \
-      }									      \
-  } while (0)
-
-/* It's simple arithmetic in 64-bit.  */
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 /* We need to insure that this add is atomic in threaded environments.  We use
    __arch_atomic_exchange_and_add_64 from atomic.h to get thread safety.  */
 #define HP_TIMING_ACCUM(Sum, Diff) \
@@ -130,19 +55,4 @@  typedef unsigned long long int hp_timing_t;
     __arch_atomic_exchange_and_add_64 (&(Sum), __diff);	                      \
   } while (0)
 
-/* No threads, no extra work.  */
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-/* Print the time value.  */
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa (Val, __buf + sizeof (__buf), 10, 0);		      \
-    size_t __len = (Len);						      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " ticks", MIN (__len, sizeof (" ticks")));  \
-  } while (0)
-
 #endif	/* hp-timing.h */
diff --git a/sysdeps/powerpc/powerpc64/hp-timing.h b/sysdeps/powerpc/powerpc64/hp-timing.h
index f1efa12..38ab607 100644
--- a/sysdeps/powerpc/powerpc64/hp-timing.h
+++ b/sysdeps/powerpc/powerpc64/hp-timing.h
@@ -24,60 +24,9 @@ 
 #include <sys/param.h>
 #include <_itoa.h>
 #include <atomic.h>
+#include <hp-timing-enabled.h>
 
-/* The macros defined here use the powerpc 64-bit time base register.
-   The time base is nominally clocked at 1/8th the CPU clock, but this
-   can vary.
-
-   The list of macros we need includes the following:
-
-   - HP_TIMING_AVAIL: test for availability.
-
-   - HP_TIMING_INLINE: this macro is non-zero if the functionality is not
-     implemented using function calls but instead uses some inlined code
-     which might simply consist of a few assembler instructions.  We have to
-     know this since we might want to use the macros here in places where we
-     cannot make function calls.
-
-   - hp_timing_t: This is the type for variables used to store the time
-     values.
-
-   - HP_TIMING_ZERO: clear `hp_timing_t' object.
-
-   - HP_TIMING_NOW: place timestamp for current time in variable given as
-     parameter.
-
-   - HP_TIMING_DIFF_INIT: do whatever is necessary to be able to use the
-     HP_TIMING_DIFF macro.
-
-   - HP_TIMING_DIFF: compute difference between two times and store it
-     in a third.  Source and destination might overlap.
-
-   - HP_TIMING_ACCUM: add time difference to another variable.  This might
-     be a bit more complicated to implement for some platforms as the
-     operation should be thread-safe and 64bit arithmetic on 32bit platforms
-     is not.
-
-   - HP_TIMING_ACCUM_NT: this is the variant for situations where we know
-     there are no threads involved.
-
-   - HP_TIMING_PRINT: write decimal representation of the timing value into
-     the given string.  This operation need not be inline even though
-     HP_TIMING_INLINE is specified.
-
-*/
-
-/* We always assume having the timestamp register.  */
-#define HP_TIMING_AVAIL		(1)
-
-/* We indeed have inlined functions.  */
-#define HP_TIMING_INLINE	(1)
-
-/* We use 64bit values for the times.  */
-typedef unsigned long long int hp_timing_t;
-
-/* Set timestamp value to zero.  */
-#define HP_TIMING_ZERO(Var)	(Var) = (0)
+typedef HP_TIMING_TYPE hp_timing_t;
 
 /* That's quite simple.  Use the `mftb' instruction.  Note that the value
    might not be 100% accurate since there might be some more instructions
@@ -90,29 +39,6 @@  typedef unsigned long long int hp_timing_t;
 #define HP_TIMING_NOW(Var)	__asm__ __volatile__ ("mftb %0" : "=r" (Var))
 #endif
 
-/* Use two 'mftb' instructions in a row to find out how long it takes.
-   On current POWER4, POWER5, and 970 processors mftb take ~10 cycles.  */
-#define HP_TIMING_DIFF_INIT() \
-  do {									      \
-    if (GLRO(dl_hp_timing_overhead) == 0)				      \
-      {									      \
-	int __cnt = 5;							      \
-	GLRO(dl_hp_timing_overhead) = ~0ull;				      \
-	do								      \
-	  {								      \
-	    hp_timing_t __t1, __t2;					      \
-	    HP_TIMING_NOW (__t1);					      \
-	    HP_TIMING_NOW (__t2);					      \
-	    if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))		      \
-	      GLRO(dl_hp_timing_overhead) = __t2 - __t1;		      \
-	  }								      \
-	while (--__cnt > 0);						      \
-      }									      \
-  } while (0)
-
-/* It's simple arithmetic in 64-bit.  */
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 /* We need to insure that this add is atomic in threaded environments.  We use
    __arch_atomic_exchange_and_add_64 from atomic.h to get thread safety.  */
 #define HP_TIMING_ACCUM(Sum, Diff) \
@@ -121,19 +47,4 @@  typedef unsigned long long int hp_timing_t;
     __arch_atomic_exchange_and_add_64 (&(Sum), __diff);	                      \
   } while (0)
 
-/* No threads, no extra work.  */
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-/* Print the time value.  */
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa (Val, __buf + sizeof (__buf), 10, 0);		      \
-    size_t __len = (Len);						      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " ticks", MIN (__len, sizeof (" ticks")));  \
-  } while (0)
-
 #endif	/* hp-timing.h */
diff --git a/sysdeps/sparc/sparc32/sparcv9/hp-timing.h b/sysdeps/sparc/sparc32/sparcv9/hp-timing.h
index fd7e76e..3d46091 100644
--- a/sysdeps/sparc/sparc32/sparcv9/hp-timing.h
+++ b/sysdeps/sparc/sparc32/sparcv9/hp-timing.h
@@ -23,36 +23,15 @@ 
 #include <string.h>
 #include <sys/param.h>
 #include <_itoa.h>
+#include <hp-timing-enabled.h>
 
-#define HP_TIMING_AVAIL		(1)
-#define HP_TIMING_INLINE	(1)
-
-typedef unsigned long long int hp_timing_t;
-
-#define HP_TIMING_ZERO(Var)	(Var) = (0)
+typedef HP_TIMING_TYPE hp_timing_t;
 
 #define HP_TIMING_NOW(Var) \
       __asm__ __volatile__ ("rd %%tick, %L0\n\t" \
 			    "srlx %L0, 32, %H0" \
 			    : "=r" (Var))
 
-#define HP_TIMING_DIFF_INIT() \
-  do {									      \
-    int __cnt = 5;							      \
-    GLRO(dl_hp_timing_overhead) = ~0ull;				      \
-    do									      \
-      {									      \
-	hp_timing_t __t1, __t2;						      \
-	HP_TIMING_NOW (__t1);						      \
-	HP_TIMING_NOW (__t2);						      \
-	if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))			      \
-	  GLRO(dl_hp_timing_overhead) = __t2 - __t1;			      \
-      }									      \
-    while (--__cnt > 0);						      \
-  } while (0)
-
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 #define HP_TIMING_ACCUM(Sum, Diff)				\
 do {								\
   hp_timing_t __diff = (Diff) - GLRO(dl_hp_timing_overhead);	\
@@ -70,17 +49,4 @@  do {								\
 		       : "memory", "g1", "g5", "g6");		\
 } while(0)
 
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa (Val, __buf + sizeof (__buf), 10, 0);		      \
-    int __len = (Len);							      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " clock cycles", MIN (__len, sizeof (" clock cycles")));  \
-  } while (0)
-
 #endif	/* hp-timing.h */
diff --git a/sysdeps/sparc/sparc64/hp-timing.h b/sysdeps/sparc/sparc64/hp-timing.h
index fa08cc8..a5d5cde 100644
--- a/sysdeps/sparc/sparc64/hp-timing.h
+++ b/sysdeps/sparc/sparc64/hp-timing.h
@@ -23,33 +23,12 @@ 
 #include <string.h>
 #include <sys/param.h>
 #include <_itoa.h>
+#include <hp-timing-enabled.h>
 
-#define HP_TIMING_AVAIL		(1)
-#define HP_TIMING_INLINE	(1)
-
-typedef unsigned long int hp_timing_t;
-
-#define HP_TIMING_ZERO(Var)	(Var) = (0)
+typedef HP_TIMING_TYPE hp_timing_t;
 
 #define HP_TIMING_NOW(Var) __asm__ __volatile__ ("rd %%tick, %0" : "=r" (Var))
 
-#define HP_TIMING_DIFF_INIT() \
-  do {									      \
-    int __cnt = 5;							      \
-    GLRO(dl_hp_timing_overhead) = ~0ull;				      \
-    do									      \
-      {									      \
-	hp_timing_t __t1, __t2;						      \
-	HP_TIMING_NOW (__t1);						      \
-	HP_TIMING_NOW (__t2);						      \
-	if (__t2 - __t1 < GLRO(dl_hp_timing_overhead))			      \
-	  GLRO(dl_hp_timing_overhead) = __t2 - __t1;			      \
-      }									      \
-    while (--__cnt > 0);						      \
-  } while (0)
-
-#define HP_TIMING_DIFF(Diff, Start, End)	(Diff) = ((End) - (Start))
-
 #define HP_TIMING_ACCUM(Sum, Diff)				\
 do {								\
   hp_timing_t __diff = (Diff) - GLRO(dl_hp_timing_overhead);	\
@@ -65,17 +44,4 @@  do {								\
 		       : "memory", "g1", "g5", "g6");		\
 } while(0)
 
-#define HP_TIMING_ACCUM_NT(Sum, Diff)	(Sum) += (Diff)
-
-#define HP_TIMING_PRINT(Buf, Len, Val) \
-  do {									      \
-    char __buf[20];							      \
-    char *__cp = _itoa (Val, __buf + sizeof (__buf), 10, 0);		      \
-    int __len = (Len);							      \
-    char *__dest = (Buf);						      \
-    while (__len-- > 0 && __cp < __buf + sizeof (__buf))		      \
-      *__dest++ = *__cp++;						      \
-    memcpy (__dest, " clock cycles", MIN (__len, sizeof (" clock cycles")));  \
-  } while (0)
-
 #endif	/* hp-timing.h */
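
For reviewers who want to exercise the shared macros outside of glibc, here is a minimal, untested-in-tree sketch of what the consolidated definitions do. It is not the contents of hp-timing-enabled.h. It assumes a portable HP_TIMING_NOW built on clock_gettime in place of the per-target asm, and a plain variable, sketch_overhead, in place of GLRO(dl_hp_timing_overhead). The names sketch_overhead and sketch_measure are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Stand-in for HP_TIMING_TYPE; in the patch a target may override it.  */
typedef uint64_t hp_timing_t;

/* Hypothetical replacement for GLRO(dl_hp_timing_overhead).  */
static hp_timing_t sketch_overhead;

/* Assumption: a portable clock instead of rdtsc/mftb/rd %tick.  */
#define HP_TIMING_NOW(Var) \
  do {									\
    struct timespec ts_;						\
    clock_gettime (CLOCK_MONOTONIC, &ts_);				\
    (Var) = (hp_timing_t) ts_.tv_sec * 1000000000ull			\
	    + (hp_timing_t) ts_.tv_nsec;				\
  } while (0)

/* Take two timestamps back to back, five times, and keep the smallest
   delta as the cost of one HP_TIMING_NOW.  Guarded so that calibration
   is skipped once a non-zero overhead has been recorded.  */
#define HP_TIMING_DIFF_INIT() \
  do {									\
    if (sketch_overhead == 0)						\
      {									\
	int cnt_ = 5;							\
	sketch_overhead = ~0ull;					\
	do								\
	  {								\
	    hp_timing_t t1_, t2_;					\
	    HP_TIMING_NOW (t1_);					\
	    HP_TIMING_NOW (t2_);					\
	    if (t2_ - t1_ < sketch_overhead)				\
	      sketch_overhead = t2_ - t1_;				\
	  }								\
	while (--cnt_ > 0);						\
      }									\
  } while (0)

/* Plain 64-bit arithmetic, as in the removed per-target copies.  */
#define HP_TIMING_DIFF(Diff, Start, End)	((Diff) = (End) - (Start))

/* Non-threaded accumulate: no atomics needed.  */
#define HP_TIMING_ACCUM_NT(Sum, Diff)	((Sum) += (Diff))

/* Time one call of FN and add the overhead-corrected result to *SUM.  */
static inline void
sketch_measure (void (*fn) (void), hp_timing_t *sum)
{
  hp_timing_t start, end, diff;

  HP_TIMING_DIFF_INIT ();
  HP_TIMING_NOW (start);
  fn ();
  HP_TIMING_NOW (end);
  assert (end >= start);	/* CLOCK_MONOTONIC must not go backwards.  */
  HP_TIMING_DIFF (diff, start, end);
  /* Clamp rather than wrap if the region was cheaper than the overhead.  */
  diff = diff > sketch_overhead ? diff - sketch_overhead : 0;
  HP_TIMING_ACCUM_NT (*sum, diff);
}
```

The clamp in sketch_measure is a choice made for this sketch only. The target HP_TIMING_ACCUM macros kept by the patch subtract the overhead unconditionally and use an atomic add.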