Patchwork [v2] Common cpuid wrappers, use SYS_cpuid when available

login
register
mail settings
Submitter Piotr Henryk Dabrowski
Date March 10, 2016, 8:12 p.m.
Message ID <20160310211238.590c6afa@ultra.tux-net>
Download mbox | patch
Permalink /patch/11300/
State New
Headers show

Comments

Piotr Henryk Dabrowski - March 10, 2016, 8:12 p.m.
Thank you for your comments. I have modified the patch to include most of them.
Please let me know if the v2 [1] is closer to something that could pass a
review.


Florian Weimer <fw@deneb.enyo.de> writes:

> <cpuid.h> is provided by GCC, it would have to change as well.

I don't think <cpuid.h> would *have to* change.
Especially that all the original __cpuid* macros need to stay as they are, as
they are used even in the Linux kernel itself and would cause a circular
recursion here.
Of course providing kernel-adjusted cpuid* functions besides the standard asm
__cpuid* macros would solve a lot of problems when modifying programs to use
this feature.
Including the glibc, but here that would require bumping the minimum required
GCC version (to one with updated cpuid.h), so I guess this is not a possible
solution (yet).

> The real challenge is the pervasive use of inline assembly, though.
>
> Currently, you cannot invoke system calls from IFUNC selectors.  This
> means that one major application for IFUNC selectors cannot use the
> system call.

Is this really a big drawback? This is a feature that may have its own
limitations. You don't call to cpuid op that much after all. Obviously this
will never be as portable as the simple __cpuid* asm macros from <cpuid.h>.

> I also wonder if the relevant CPUID flags should rather be part of auxv

Thank you for these suggestions.

But wouldn't providing auxv entries make this feature ELF-only?

And how could we decide which flags are relevant? There are at least 11
different relevant op/count cpuid calls, returning 4 u32 registers each [2].

Plus it takes us far away from the original idea of simply replacing cpuid calls
within the application [3]. This would require switching the detection of CPU
features to a completely new model, which programmers might find hard to adapt
to. Which, in turn, would ruin the whole plan of making kernel-adjusted cpuid
widely adopted.
Just my opinion though :-)

> or if this functionality should be a vsyscall instead.

vsyscall or vdso? I will look into this idea.

> (you may need to do a CPUID before doing a system call, and vice versa).

Why? You mean checking the cpuid bits if we have syscall/sysret features?
Still you can easily call the original asm __cpuid macro in such case.


Joseph Myers <joseph@codesourcery.com> writes:

> Non-sysdeps files should not have anything architecture-specific.
>
> Now, *if* you need a configure test, you can't avoid changing the 
> architecture-independent config.h.in (we don't yet have a way to split 
> that by architecture).  But the rest can be avoided by using sysdeps 
> configure fragments and headers.  And you don't need a configure test 
> anyway - the code can use #ifdef __NR_cpuid to test if the syscall is 
> available at compile time, and __ASSUME_CPUID to test if the syscall can 
> be presumed to work (if the runtime kernel is known to be recent enough, 
> since the kernel headers used to build glibc may be more recent than the 
> kernel used by glibc at runtime).  Finally, misc/cpuid.h should go under a 
> sysdeps directory, suitably named not to conflict with the compiler's 
> <cpuid.h>.
>
> There's a complication: sysdeps/x86 shouldn't contain anything 
> Linux-specific either.  So what that suggests is that you have e.g. 
> sysdeps/x86/x86-cpuid.h that uses just the cpuid instruction but that can 
> be overridden by an OS-specific header that supports the syscall - and 
> then have such an OS-specific version in sysdeps/unix/sysv/linux/x86 (you 
> can also do more complicated schemes to avoid duplicating the code to use 
> the instruction).

The configure test was removed. It wasn't really necessary, although maybe it
might be a good idea to somehow log which version of cpuid is being used
during the compile time.

And with the current code the __ASSUME_CPUID should not be necessary either.


Mike Frysinger <vapier@gentoo.org> writes:

> don't think you need this here.  you can define __ASSUME_CPUID in
> kernel-features.h and use that everywhere.  look at that file and
> symbols it defins as an example.

Removed with the configure checks.

> nope -- you'll need to sign copyright papers w/the FSF

So the copyright attribution line must display the FSF only, plus I need to sign
the legal papers, right?
How can I obtain a copy for signing? Just in case.


Andreas Schwab <schwab@suse.de> writes:

> You cannot use the host libc to check for features.

Removed with the configure checks.


[1] https://gitlab.com/ultr/glibc/tags/ultr-sys_cpuid-v2
[2]
    0x00000001, 0
    0x00000006, 0
    0x00000007, 0
    0x0000000D, 1
    0x0000000F, 0
    0x0000000F, 1
    0x80000001, 0
    0x80000008, 0
    0x8000000A, 0
    0x80860001, 0
    0xC0000001, 0
[3]
    __cpuid_count(level, count, eax, ebx, ecx, edx);
    vs
    int ret = syscall(SYS_cpuid, level, count, &eax, &ebx, &ecx, &edx);


Regards,
Piotr Henryk Dabrowski


--

Common cpuid wrappers, use SYS_cpuid when available

	* misc/common_cpuid.h: Common cpuid wrappers
	* sysdeps/generic/local_cpuid.h: Common cpuid wrappers
	* sysdeps/unix/sysv/linux/x86/local_cpuid.h: use SYS_cpuid if available
	* sysdeps/x86/cpu-features.c: Use local_cpuid.h
	* sysdeps/x86/fpu/test-fenv-clear-sse.c: Use local_cpuid.h
	* sysdeps/x86/fpu/test-fenv-sse-2.c: Use local_cpuid.h
	* sysdeps/x86/fpu/test-fenv-sse.c: Use local_cpuid.h
	* sysdeps/x86_64/cacheinfo.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-audit10.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-audit4.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-audit6.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-auditmod10b.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-auditmod4b.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-auditmod6b.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-auditmod6c.c: Use local_cpuid.h
	* sysdeps/x86_64/tst-auditmod7b.c: Use local_cpuid.h
---
 ChangeLog                                 | 19 ++++++++
 misc/common_cpuid.h                       | 73 +++++++++++++++++++++++++++++++
 sysdeps/generic/local_cpuid.h             | 33 ++++++++++++++
 sysdeps/unix/sysv/linux/x86/local_cpuid.h | 40 +++++++++++++++++
 sysdeps/x86/cpu-features.c                | 37 ++++++++--------
 sysdeps/x86/fpu/test-fenv-clear-sse.c     |  4 +-
 sysdeps/x86/fpu/test-fenv-sse-2.c         |  4 +-
 sysdeps/x86/fpu/test-fenv-sse.c           |  4 +-
 sysdeps/x86_64/cacheinfo.c                | 24 +++++-----
 sysdeps/x86_64/tst-audit10.c              |  6 +--
 sysdeps/x86_64/tst-audit4.c               |  4 +-
 sysdeps/x86_64/tst-audit6.c               |  4 +-
 sysdeps/x86_64/tst-auditmod10b.c          |  6 +--
 sysdeps/x86_64/tst-auditmod4b.c           |  4 +-
 sysdeps/x86_64/tst-auditmod6b.c           |  4 +-
 sysdeps/x86_64/tst-auditmod6c.c           |  4 +-
 sysdeps/x86_64/tst-auditmod7b.c           |  4 +-
 17 files changed, 220 insertions(+), 54 deletions(-)
 create mode 100644 misc/common_cpuid.h
 create mode 100644 sysdeps/generic/local_cpuid.h
 create mode 100644 sysdeps/unix/sysv/linux/x86/local_cpuid.h
Joseph Myers - March 10, 2016, 9:37 p.m.
On Thu, 10 Mar 2016, Piotr Henryk Dabrowski wrote:

> 	* misc/common_cpuid.h: Common cpuid wrappers
> 	* sysdeps/generic/local_cpuid.h: Common cpuid wrappers

The cpuid concept is x86-specific.  Thus, nothing should go in misc/ or 
sysdeps/generic/.  Use sysdeps/x86/.

> +{
> +	cpuid_count (level, 0, eax, ebx, ecx, edx);

Formatting of course needs to be in GNU style (so two-column indents).

> +#ifdef __NR_cpuid
> +	if (INLINE_SYSCALL (cpuid, 6, level, count, eax, ebx, ecx, edx) == 0)
> +		return;
> +#endif
> +	__cpuid_count (level, count, *eax, *ebx, *ecx, *edx);

If the kernel used at runtime supports the syscall, is it ever possible 
for it to fail?  If not, you should have __ASSUME_CPUID to disable the 
fallback.
Mike Frysinger - March 10, 2016, 10:45 p.m.
On 10 Mar 2016 21:12, Piotr Henryk Dabrowski wrote:
> > or if this functionality should be a vsyscall instead.
> 
> vsyscall or vdso? I will look into this idea.

vdso.  vsyscall is an old synonym now for it.  see `man 7 vdso`.

> > nope -- you'll need to sign copyright papers w/the FSF
> 
> So the copyright attribution line must display the FSF only, plus I need to sign
> the legal papers, right?

yes, you must do both

> How can I obtain a copy for signing? Just in case.

see this page:
https://sourceware.org/glibc/wiki/Contribution%20checklist#FSF_copyright_Assignment
-mike

Patch

diff --git a/ChangeLog b/ChangeLog
index 727516e..d17e167 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,22 @@ 
+2016-03-10  Piotr Henryk Dabrowski  <ultr@ultr.pl>
+
+	* misc/common_cpuid.h: Common cpuid wrappers
+	* sysdeps/generic/local_cpuid.h: Common cpuid wrappers
+	* sysdeps/unix/sysv/linux/x86/local_cpuid.h: use SYS_cpuid if available
+	* sysdeps/x86/cpu-features.c: Use local_cpuid.h
+	* sysdeps/x86/fpu/test-fenv-clear-sse.c: Use local_cpuid.h
+	* sysdeps/x86/fpu/test-fenv-sse-2.c: Use local_cpuid.h
+	* sysdeps/x86/fpu/test-fenv-sse.c: Use local_cpuid.h
+	* sysdeps/x86_64/cacheinfo.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-audit10.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-audit4.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-audit6.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-auditmod10b.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-auditmod4b.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-auditmod6b.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-auditmod6c.c: Use local_cpuid.h
+	* sysdeps/x86_64/tst-auditmod7b.c: Use local_cpuid.h
+
 2016-03-09  Aurelien Jarno  <aurelien@aurel32.net>
 
 	[BZ #19792]
diff --git a/misc/common_cpuid.h b/misc/common_cpuid.h
new file mode 100644
index 0000000..89b7e93
--- /dev/null
+++ b/misc/common_cpuid.h
@@ -0,0 +1,73 @@ 
+/* CPUID wrapper functions.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Piotr Henryk Dabrowski (ultr@ultr.pl), 2016.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOCAL_CPUID_H
+# error "Do not use <misc/common_cpuid.h> directly; include <local_cpuid.h> instead."
+#endif
+
+#ifndef _COMMON_CPUID_H
+#define _COMMON_CPUID_H 1
+
+#include <cpuid.h>
+#include <errno.h>
+
+#define get_cpuid_max __get_cpuid_max
+
+/* NOTE: for new Linux kernels these functions try to use kernel-adjusted
+   values for cpuid returned by the SYS_cpuid sys call.
+   Otherwise they fallback to native cpuid implementation.  */
+
+/* Return cpuid data for requested cpuid level (eax) and count register (ecx),
+   as found in returned eax, ebx, ecx and edx registers.
+   All pointers are required to be non-null.
+   Implementation is system dependant.  */
+static inline void
+cpuid_count (unsigned int level, unsigned int count,
+	     unsigned int *eax, unsigned int *ebx,
+	     unsigned int *ecx, unsigned int *edx);
+
+/* Return cpuid data for requested cpuid level (eax),
+   as found in returned eax, ebx, ecx and edx registers.
+   All pointers are required to be non-null.  */
+static inline void
+cpuid (unsigned int level,
+       unsigned int *eax, unsigned int *ebx,
+       unsigned int *ecx, unsigned int *edx)
+{
+	cpuid_count (level, 0, eax, ebx, ecx, edx);
+}
+
+/* Return cpuid data for requested cpuid level (eax),
+   as found in returned eax, ebx, ecx and edx registers.
+   The function checks if cpuid is supported and returns 1 for valid cpuid
+   information or 0 for unsupported cpuid level.
+   All pointers are required to be non-null.  */
+static inline int
+get_cpuid (unsigned int level,
+	   unsigned int *eax, unsigned int *ebx,
+	   unsigned int *ecx, unsigned int *edx)
+{
+	unsigned int ext = level & 0x80000000;
+	if (get_cpuid_max (ext, 0) < level)
+		return 0;
+	cpuid (level, eax, ebx, ecx, edx);
+	return 1;
+}
+
+#endif /* common_cpuid.h */
diff --git a/sysdeps/generic/local_cpuid.h b/sysdeps/generic/local_cpuid.h
new file mode 100644
index 0000000..7350293
--- /dev/null
+++ b/sysdeps/generic/local_cpuid.h
@@ -0,0 +1,33 @@ 
+/* CPUID wrapper functions.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Piotr Henryk Dabrowski (ultr@ultr.pl), 2016.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOCAL_CPUID_H
+#define _LOCAL_CPUID_H 1
+
+#include <misc/common_cpuid.h>
+
+static inline void
+cpuid_count (unsigned int level, unsigned int count,
+	     unsigned int *eax, unsigned int *ebx,
+	     unsigned int *ecx, unsigned int *edx)
+{
+	__cpuid_count (level, count, *eax, *ebx, *ecx, *edx);
+}
+
+#endif /* local_cpuid.h */
diff --git a/sysdeps/unix/sysv/linux/x86/local_cpuid.h b/sysdeps/unix/sysv/linux/x86/local_cpuid.h
new file mode 100644
index 0000000..8a459d4
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86/local_cpuid.h
@@ -0,0 +1,40 @@ 
+/* CPUID wrapper functions.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Piotr Henryk Dabrowski (ultr@ultr.pl), 2016.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _LOCAL_CPUID_H
+#define _LOCAL_CPUID_H 1
+
+#include <misc/common_cpuid.h>
+
+#include <sysdep.h>
+#include <sys/syscall.h>
+
+static inline void
+cpuid_count (unsigned int level, unsigned int count,
+	     unsigned int *eax, unsigned int *ebx,
+	     unsigned int *ecx, unsigned int *edx)
+{
+#ifdef __NR_cpuid
+	if (INLINE_SYSCALL (cpuid, 6, level, count, eax, ebx, ecx, edx) == 0)
+		return;
+#endif
+	__cpuid_count (level, count, *eax, *ebx, *ecx, *edx);
+}
+
+#endif /* local_cpuid.h */
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 218ff2b..dc5be3f 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <cpu-features.h>
 
 static inline void
@@ -25,9 +25,9 @@  get_common_indeces (struct cpu_features *cpu_features,
 		    unsigned int *extended_model)
 {
   unsigned int eax;
-  __cpuid (1, eax, cpu_features->cpuid[COMMON_CPUID_INDEX_1].ebx,
-	   cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx,
-	   cpu_features->cpuid[COMMON_CPUID_INDEX_1].edx);
+  cpuid (1, &eax, &(cpu_features->cpuid[COMMON_CPUID_INDEX_1].ebx),
+	 &(cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx),
+	 &(cpu_features->cpuid[COMMON_CPUID_INDEX_1].edx));
   GLRO(dl_x86_cpu_features).cpuid[COMMON_CPUID_INDEX_1].eax = eax;
   *family = (eax >> 8) & 0x0f;
   *model = (eax >> 4) & 0x0f;
@@ -42,20 +42,21 @@  get_common_indeces (struct cpu_features *cpu_features,
 static inline void
 init_cpu_features (struct cpu_features *cpu_features)
 {
-  unsigned int ebx, ecx, edx;
+  unsigned int eax, ebx, ecx, edx;
   unsigned int family = 0;
   unsigned int model = 0;
   enum cpu_features_kind kind;
 
 #if !HAS_CPUID
-  if (__get_cpuid_max (0, 0) == 0)
+  if (get_cpuid_max (0, 0) == 0)
     {
       kind = arch_kind_other;
       goto no_cpuid;
     }
 #endif
 
-  __cpuid (0, cpu_features->max_cpuid, ebx, ecx, edx);
+  cpuid (0, &eax, &ebx, &ecx, &edx);
+  cpu_features->max_cpuid = eax;
 
   /* This spells out "GenuineIntel".  */
   if (ebx == 0x756e6547 && ecx == 0x6c65746e && edx == 0x49656e69)
@@ -147,13 +148,13 @@  init_cpu_features (struct cpu_features *cpu_features)
       ecx = cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx;
 
       unsigned int eax;
-      __cpuid (0x80000000, eax, ebx, ecx, edx);
+      cpuid (0x80000000, &eax, &ebx, &ecx, &edx);
       if (eax >= 0x80000001)
-	__cpuid (0x80000001,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].eax,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ebx,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
+	cpuid (0x80000001,
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].eax),
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ebx),
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx),
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx));
 
       if (family == 0x15)
 	{
@@ -175,11 +176,11 @@  init_cpu_features (struct cpu_features *cpu_features)
     cpu_features->feature[index_I686] |= bit_I686;
 
   if (cpu_features->max_cpuid >= 7)
-    __cpuid_count (7, 0,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].eax,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].ebx,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].ecx,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].edx);
+    cpuid_count (7, 0,
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].eax),
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].ebx),
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].ecx),
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].edx));
 
   /* Can we call xgetbv?  */
   if (HAS_CPU_FEATURE (OSXSAVE))
diff --git a/sysdeps/x86/fpu/test-fenv-clear-sse.c b/sysdeps/x86/fpu/test-fenv-clear-sse.c
index cc4b3f0..45de9eb 100644
--- a/sysdeps/x86/fpu/test-fenv-clear-sse.c
+++ b/sysdeps/x86/fpu/test-fenv-clear-sse.c
@@ -17,7 +17,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <stdbool.h>
 
 static bool
@@ -25,7 +25,7 @@  have_sse2 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+  if (!get_cpuid (1, &eax, &ebx, &ecx, &edx))
     return false;
 
   return (edx & bit_SSE2) != 0;
diff --git a/sysdeps/x86/fpu/test-fenv-sse-2.c b/sysdeps/x86/fpu/test-fenv-sse-2.c
index d3197c3..92ee3f2 100644
--- a/sysdeps/x86/fpu/test-fenv-sse-2.c
+++ b/sysdeps/x86/fpu/test-fenv-sse-2.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <fenv.h>
 #include <float.h>
 #include <stdbool.h>
@@ -28,7 +28,7 @@  have_sse2 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+  if (!get_cpuid (1, &eax, &ebx, &ecx, &edx))
     return false;
 
   return (edx & bit_SSE2) != 0;
diff --git a/sysdeps/x86/fpu/test-fenv-sse.c b/sysdeps/x86/fpu/test-fenv-sse.c
index 4f4ff6a..5836b95 100644
--- a/sysdeps/x86/fpu/test-fenv-sse.c
+++ b/sysdeps/x86/fpu/test-fenv-sse.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <fenv.h>
 #include <float.h>
 #include <stdbool.h>
@@ -27,7 +27,7 @@  have_sse2 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+  if (!get_cpuid (1, &eax, &ebx, &ecx, &edx))
     return false;
 
   return (edx & bit_SSE2) != 0;
diff --git a/sysdeps/x86_64/cacheinfo.c b/sysdeps/x86_64/cacheinfo.c
index 96463df..18da742 100644
--- a/sysdeps/x86_64/cacheinfo.c
+++ b/sysdeps/x86_64/cacheinfo.c
@@ -20,7 +20,7 @@ 
 #include <stdbool.h>
 #include <stdlib.h>
 #include <unistd.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <init-arch.h>
 
 #define is_intel GLRO(dl_x86_cpu_features).kind == arch_kind_intel
@@ -162,7 +162,7 @@  intel_check_word (int name, unsigned int value, bool *has_level_2,
 	  unsigned int round = 0;
 	  while (1)
 	    {
-	      __cpuid_count (4, round, eax, ebx, ecx, edx);
+	      cpuid_count (4, round, &eax, &ebx, &ecx, &edx);
 
 	      enum { null = 0, data = 1, inst = 2, uni = 3 } type = eax & 0x1f;
 	      if (type == null)
@@ -275,7 +275,7 @@  handle_intel (int name, unsigned int maxidx)
       unsigned int ebx;
       unsigned int ecx;
       unsigned int edx;
-      __cpuid (2, eax, ebx, ecx, edx);
+      cpuid (2, &eax, &ebx, &ecx, &edx);
 
       /* The low byte of EAX in the first round contain the number of
 	 rounds we have to make.  At least one, the one we are already
@@ -319,7 +319,7 @@  handle_amd (int name)
   unsigned int ebx;
   unsigned int ecx;
   unsigned int edx;
-  __cpuid (0x80000000, eax, ebx, ecx, edx);
+  cpuid (0x80000000, &eax, &ebx, &ecx, &edx);
 
   /* No level 4 cache (yet).  */
   if (name > _SC_LEVEL3_CACHE_LINESIZE)
@@ -329,7 +329,7 @@  handle_amd (int name)
   if (eax < fn)
     return 0;
 
-  __cpuid (fn, eax, ebx, ecx, edx);
+  cpuid (fn, &eax, &ebx, &ecx, &edx);
 
   if (name < _SC_LEVEL1_DCACHE_SIZE)
     {
@@ -479,7 +479,7 @@  init_cacheinfo (void)
   unsigned int ebx;
   unsigned int ecx;
   unsigned int edx;
-  int max_cpuid_ex;
+  unsigned int max_cpuid_ex;
   long int data = -1;
   long int shared = -1;
   unsigned int level;
@@ -512,7 +512,7 @@  init_cacheinfo (void)
 	  /* Query until desired cache level is enumerated.  */
 	  do
 	    {
-	      __cpuid_count (4, i++, eax, ebx, ecx, edx);
+	      cpuid_count (4, i++, &eax, &ebx, &ecx, &edx);
 
 	      /* There seems to be a bug in at least some Pentium Ds
 		 which sometimes fail to iterate all cache parameters.
@@ -536,7 +536,7 @@  init_cacheinfo (void)
 	      i = 0;
 	      while (1)
 		{
-		  __cpuid_count (11, i++, eax, ebx, ecx, edx);
+		  cpuid_count (11, i++, &eax, &ebx, &ecx, &edx);
 
 		  int shipped = ebx & 0xff;
 		  int type = ecx & 0xff0;
@@ -598,7 +598,7 @@  init_cacheinfo (void)
       shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
 
       /* Get maximum extended function. */
-      __cpuid (0x80000000, max_cpuid_ex, ebx, ecx, edx);
+      cpuid (0x80000000, &max_cpuid_ex, &ebx, &ecx, &edx);
 
       if (shared <= 0)
 	/* No shared L3 cache.  All we have is the L2 cache.  */
@@ -609,7 +609,7 @@  init_cacheinfo (void)
 	  if (max_cpuid_ex >= 0x80000008)
 	    {
 	      /* Get width of APIC ID.  */
-	      __cpuid (0x80000008, max_cpuid_ex, ebx, ecx, edx);
+	      cpuid (0x80000008, &max_cpuid_ex, &ebx, &ecx, &edx);
 	      threads = 1 << ((ecx >> 12) & 0x0f);
 	    }
 
@@ -617,7 +617,7 @@  init_cacheinfo (void)
 	    {
 	      /* If APIC ID width is not available, use logical
 		 processor count.  */
-	      __cpuid (0x00000001, max_cpuid_ex, ebx, ecx, edx);
+	      cpuid (0x00000001, &max_cpuid_ex, &ebx, &ecx, &edx);
 
 	      if ((edx & (1 << 28)) != 0)
 		threads = (ebx >> 16) & 0xff;
@@ -635,7 +635,7 @@  init_cacheinfo (void)
 #ifndef DISABLE_PREFETCHW
       if (max_cpuid_ex >= 0x80000001)
 	{
-	  __cpuid (0x80000001, eax, ebx, ecx, edx);
+	  cpuid (0x80000001, &eax, &ebx, &ecx, &edx);
 	  /*  PREFETCHW     || 3DNow!  */
 	  if ((ecx & 0x100) || (edx & 0x80000000))
 	    __x86_prefetchw = -1;
diff --git a/sysdeps/x86_64/tst-audit10.c b/sysdeps/x86_64/tst-audit10.c
index a487b40..5a23774 100644
--- a/sysdeps/x86_64/tst-audit10.c
+++ b/sysdeps/x86_64/tst-audit10.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <cpu-features.h>
 
 int tst_audit10_aux (void);
@@ -26,11 +26,11 @@  avx512_enabled (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
-  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+  cpuid_count (7, 0, &eax, &ebx, &ecx, &edx);
   if (!(ebx & bit_AVX512F))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-audit4.c b/sysdeps/x86_64/tst-audit4.c
index d8e2ab1..0ab51ed 100644
--- a/sysdeps/x86_64/tst-audit4.c
+++ b/sysdeps/x86_64/tst-audit4.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <local_cpuid.h>
 
 int tst_audit4_aux (void);
 
@@ -25,7 +25,7 @@  avx_enabled (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-audit6.c b/sysdeps/x86_64/tst-audit6.c
index f2f6a48..85689a0 100644
--- a/sysdeps/x86_64/tst-audit6.c
+++ b/sysdeps/x86_64/tst-audit6.c
@@ -2,7 +2,7 @@ 
 
 #include <stdlib.h>
 #include <string.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 #include <emmintrin.h>
 
 extern __m128i audit_test (__m128i, __m128i, __m128i, __m128i,
@@ -14,7 +14,7 @@  avx_enabled (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-auditmod10b.c b/sysdeps/x86_64/tst-auditmod10b.c
index ad6fcaf..9c3093f 100644
--- a/sysdeps/x86_64/tst-auditmod10b.c
+++ b/sysdeps/x86_64/tst-auditmod10b.c
@@ -125,18 +125,18 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
 
 #ifdef __AVX512F__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 
 static int
 check_avx512 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
-  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+  cpuid_count (7, 0, &eax, &ebx, &ecx, &edx);
   if (!(ebx & bit_AVX512F))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-auditmod4b.c b/sysdeps/x86_64/tst-auditmod4b.c
index 2b0d827..df74ea3 100644
--- a/sysdeps/x86_64/tst-auditmod4b.c
+++ b/sysdeps/x86_64/tst-auditmod4b.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else
diff --git a/sysdeps/x86_64/tst-auditmod6b.c b/sysdeps/x86_64/tst-auditmod6b.c
index 886fc33..521441c 100644
--- a/sysdeps/x86_64/tst-auditmod6b.c
+++ b/sysdeps/x86_64/tst-auditmod6b.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else
diff --git a/sysdeps/x86_64/tst-auditmod6c.c b/sysdeps/x86_64/tst-auditmod6c.c
index b2ee24d..d4ca5c8 100644
--- a/sysdeps/x86_64/tst-auditmod6c.c
+++ b/sysdeps/x86_64/tst-auditmod6c.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else
diff --git a/sysdeps/x86_64/tst-auditmod7b.c b/sysdeps/x86_64/tst-auditmod7b.c
index f27076d..343a27e 100644
--- a/sysdeps/x86_64/tst-auditmod7b.c
+++ b/sysdeps/x86_64/tst-auditmod7b.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t *refcook,
 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <local_cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else