Patchwork Common cpuid wrappers, use SYS_cpuid when available

login
register
mail settings
Submitter Piotr Henryk Dabrowski
Date March 9, 2016, 1:24 a.m.
Message ID <20160309022443.3e3530f6@ultra.tux-net>
Download mbox | patch
Permalink /patch/11278/
State New
Headers show

Comments

Piotr Henryk Dabrowski - March 9, 2016, 1:24 a.m.
* config.h.in: Check for SYS_cpuid and define HAVE_SYS_CPUID
	* configure: Check for SYS_cpuid and define HAVE_SYS_CPUID
	* configure.ac: Check for SYS_cpuid and define HAVE_SYS_CPUID
	* misc/cpuid.h: Common cpuid wrappers, use SYS_cpuid when available
	* sysdeps/x86/cpu-features.c: Use misc/cpuid.h wrappers
	* sysdeps/x86/fpu/test-fenv-clear-sse.c: Use misc/cpuid.h wrappers
	* sysdeps/x86/fpu/test-fenv-sse-2.c: Use misc/cpuid.h wrappers
	* sysdeps/x86/fpu/test-fenv-sse.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/cacheinfo.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-audit10.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-audit4.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-audit6.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-auditmod10b.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-auditmod4b.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-auditmod6b.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-auditmod6c.c: Use misc/cpuid.h wrappers
	* sysdeps/x86_64/tst-auditmod7b.c: Use misc/cpuid.h wrappers
---
 ChangeLog                             | 20 +++++++++
 config.h.in                           |  3 ++
 configure                             | 37 ++++++++++++++++
 configure.ac                          | 18 ++++++++
 misc/cpuid.h                          | 82 +++++++++++++++++++++++++++++++++++
 sysdeps/x86/cpu-features.c            | 37 ++++++++--------
 sysdeps/x86/fpu/test-fenv-clear-sse.c |  4 +-
 sysdeps/x86/fpu/test-fenv-sse-2.c     |  4 +-
 sysdeps/x86/fpu/test-fenv-sse.c       |  4 +-
 sysdeps/x86_64/cacheinfo.c            | 24 +++++-----
 sysdeps/x86_64/tst-audit10.c          |  6 +--
 sysdeps/x86_64/tst-audit4.c           |  4 +-
 sysdeps/x86_64/tst-audit6.c           |  4 +-
 sysdeps/x86_64/tst-auditmod10b.c      |  6 +--
 sysdeps/x86_64/tst-auditmod4b.c       |  4 +-
 sysdeps/x86_64/tst-auditmod6b.c       |  4 +-
 sysdeps/x86_64/tst-auditmod6c.c       |  4 +-
 sysdeps/x86_64/tst-auditmod7b.c       |  4 +-
 18 files changed, 215 insertions(+), 54 deletions(-)
 create mode 100644 misc/cpuid.h
Piotr Henryk Dabrowski - March 9, 2016, 1:27 a.m.
Currently there is no way of disabling CPU features reported by the CPUID
instruction. Which sometimes turn out to be broken [1] or undesired [2].
We can assume we will run into similar situations again sooner or later.
The only way to fix this is to do a microcode update (if it is available),
as the BIOS does not provide a way to disable CPUID bits either. When there is
no new microcode, then there is no way to tell your system not to use certain
CPU features. This sometimes leads to an unbootable and/or unusable system.
Plus the ability to quickly disable certain CPU extensions would be handy for
debugging.

This patch aims at providing system-wide support for the kernel-adjusted CPUID:
* The kernel takes a command line parameter (cpu-=...) allowing for an easy way
  to disable any of the known CPUID capability bits [3]. Plus the kernel may
  disable certain features by itself as well.
* Then the kernel provides a system call for obtaining the adjusted data [4]
  (SYS_cpuid, to be used instead of the __cpuid* macros from GCC's cpuid.h).

Since the cpuid instruction is available from the user-space, use of SYS_cpuid
cannot be enforced on programmers. But it can be encouraged, and making glibc
respect it is a good start (and a requirement for this purpose).
The expected impact is, after the new versions of kernel and glibc are widely
adopted, to discourage use of low-level __cpuid* macros for checking supported
CPU features on Linux as a coding issue that workarounds and breaks system
features.
And we may expect users to report bugs for programs that do not respect features
being disabled. Especially that they will be trivial to fix.
It will take time, but if this is introduced now, it may become a widely used
solution in a few years that will finally allow us to easily disable unwanted
CPU features on demand.

This feature is NOT implemented in the Linux kernel yet.
However I would like to ask you to say if you *would* ACK this *if* the
SYS_cpuid system call were to be adopted into the kernel.
Obviously shipping either without the other does not make any sense.

This is also my very first patch for glibc, so please let me know of any code
quality issues or improvement suggestions.

On GitLab you can find trees with both this patch [5] and the latest Linux
kernel patched [6]. And I attach a test program for the SYS_cpuid below [7].

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=800574
[2] https://devtalk.nvidia.com/default/topic/893325/newest-and-beta-linux-driver-causing-segmentation-fault-core-dumped-on-all-skylake-platforms/
[3] e.g. 'linux ... nosplash quiet cpu-=mmx,sse,sse2'
[4] long sys_cpuid(const u32 level, const u32 count,
                   u32 __user *eax, u32 __user *ebx,
                   u32 __user *ecx, u32 __user *edx);
[5] https://gitlab.com/ultr/glibc/tags/ultr-sys_cpuid
[6] https://gitlab.com/ultr/linux/tags/ultr-sys_cpuid-master
[7] SYS_cpuid test program:
- - - - cut here - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#include <stdio.h>
#include <stdint.h>

#include <unistd.h>
#include <sys/syscall.h>

#include <cpuid.h>

#ifndef __linux__
    #warning Not a Linux!
#endif

#ifndef SYS_cpuid
    #warning Defining undefined SYS_cpuid!
    #ifdef __x86_64__
        #define SYS_cpuid 327
    #else
        #define SYS_cpuid 378
    #endif
#endif

void get_kernel(const uint32_t level, const uint32_t count) {
    uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
    int ret = syscall(SYS_cpuid, level, count, &eax, &ebx, &ecx, &edx);
    printf("sys_cpuid==%d:\t[0x%08lX,%lu] => [0x%08lX,0x%08lX,0x%08lX,0x%08lX]\n", ret, level, count, eax, ebx, ecx, edx);
}

void get_native(const uint32_t level, const uint32_t count) {
    register uint32_t eax = 0, ebx = 0, ecx = 0, edx = 0;
    __cpuid_count(level, count, eax, ebx, ecx, edx);
    printf("native cpuid:\t[0x%08lX,%lu] => [0x%08lX,0x%08lX,0x%08lX,0x%08lX]\n", level, count, eax, ebx, ecx, edx);
}

void get(const uint32_t level, const uint32_t count) {
    get_native(level, count);
    get_kernel(level, count);
}

int main(int argc, char **argv) {
    printf("SYS_cpuid = %d\n", SYS_cpuid);
    get(0x00000001, 0);
    get(0x00000006, 0);
    get(0x00000007, 0);
    get(0x0000000D, 1);
    get(0x0000000F, 0);
    get(0x0000000F, 1);
    get(0x80000001, 0);
    get(0x80000008, 0);
    get(0x8000000A, 0);
    get(0x80860001, 0);
    get(0xC0000001, 0);

    get(0x00000002, 0);
    get(0x00000004, 0);
    get(0x00000004, 1);
    get(0x00000004, 2);
    get(0x00000004, 3);
    return 0;
}
- - - - cut here - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Regards,
Piotr Henryk Dabrowski
Mike Frysinger - March 9, 2016, 3:52 a.m.
On 09 Mar 2016 02:24, Piotr Henryk Dabrowski wrote:
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1704,6 +1704,24 @@ AC_SUBST(libc_cv_cxx_thread_local)
>  AC_LANG_POP([C++])
>  dnl End of C++ feature tests.
>  
> +# SYS_cpuid syscall
> +libc_cv_sys_cpuid=no
> +AC_MSG_CHECKING(for x86 kernel with SYS_cpuid support)
> +AC_TRY_COMPILE([
> +  #if (defined(__i386__) || defined(__x86_64__)) && defined(__linux__)
> +  #include <sys/syscall.h>
> +  #if !defined(SYS_cpuid) || !defined(__NR_cpuid)
> +  #error SYS_cpuid not defined
> +  #endif
> +  #else
> +  #error Not a x86 Linux
> +  #endif
> +], [], [libc_cv_sys_cpuid=yes], [libc_cv_sys_cpuid=no])
> +if test "$libc_cv_sys_cpuid" = yes; then
> +  AC_DEFINE(HAVE_SYS_CPUID)
> +fi
> +AC_MSG_RESULT($libc_cv_sys_cpuid)

don't think you need this here.  you can define __ASSUME_CPUID in
kernel-features.h and use that everywhere.  look at that file and
symbols it defins as an example.

> +   Copyright (C) 2016 Piotr Henryk Dabrowski <ultr@ultr.pl>

nope -- you'll need to sign copyright papers w/the FSF

> +/* NOTE: for new Linux kernels these functions try to use kernel-adjusted
> +   values for cpuid returned by the SYS_cpuid sys call.
> +   Otherwise they fallback to native cpuid implementation. */

GNU style: two spaces after periods

> +	if (INLINE_SYSCALL(cpuid, 6, level, count, eax, ebx, ecx, edx) == 0)

GNU style: put spaces before the ( w/func calls
-mike
Adhemerval Zanella Netto - March 9, 2016, 4:49 a.m.
On 09-03-2016 08:27, Piotr Henryk Dabrowski wrote:
> 
> This feature is NOT implemented in the Linux kernel yet.
> However I would like to ask you to say if you *would* ACK this *if* the
> SYS_cpuid system call were to be adopted into the kernel.
> Obviously shipping either without the other does not make any sense.

If such functionality is indeed accepted upstream in Linux I see no reason
on not possible ack this patch (the ack it self will depend on patch
quality and architecture maintainer feedback). I see the idea is reasonable,
the only drawback is a slight large latency on program startup (due syscall
issuing in cacheinfo.c).

Before start to ask for patch revision, you need to sort out the copyright
by signing the papers with the FSF as pointed out by Mike Frysinger.
Andreas Schwab - March 9, 2016, 8:16 a.m.
Piotr Henryk Dabrowski <ultr@ultr.pl> writes:

> +# SYS_cpuid syscall
> +libc_cv_sys_cpuid=no
> +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for x86 kernel with
> SYS_cpuid support" >&5 +$as_echo_n "checking for x86 kernel with SYS_cpuid
> support... " >&6; } +cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> +/* end confdefs.h.  */
> +
> +  #if (defined(__i386__) || defined(__x86_64__)) && defined(__linux__)
> +  #include <sys/syscall.h>

You cannot use the host libc to check for features.

Andreas.
Florian Weimer - March 9, 2016, 8:22 p.m.
* Piotr Henryk Dabrowski:

> However I would like to ask you to say if you *would* ACK this *if* the
> SYS_cpuid system call were to be adopted into the kernel.

<cpuid.h> is provided by GCC, it would have to change as well.  The
real challenge is the pervasive use of inline assembly, though.

Currently, you cannot invoke system calls from IFUNC selectors.  This
means that one major application for IFUNC selectors cannot use the
system call.

I also wonder if the relevant CPUID flags should rather be part of
auxv, or if this functionality should be a vsyscall instead.  Both
would avoid circularity issues (you may need to do a CPUID before
doing a system call, and vice versa).

Patch

diff --git a/ChangeLog b/ChangeLog
index a6be762..3c67632 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,23 @@ 
+2016-03-07  Piotr Henryk Dabrowski  <ultr@ultr.pl>
+
+	* config.h.in: Check for SYS_cpuid and define HAVE_SYS_CPUID
+	* configure: Check for SYS_cpuid and define HAVE_SYS_CPUID
+	* configure.ac: Check for SYS_cpuid and define HAVE_SYS_CPUID
+	* misc/cpuid.h: Common cpuid wrappers, use SYS_cpuid when available
+	* sysdeps/x86/cpu-features.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86/fpu/test-fenv-clear-sse.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86/fpu/test-fenv-sse-2.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86/fpu/test-fenv-sse.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/cacheinfo.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-audit10.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-audit4.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-audit6.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-auditmod10b.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-auditmod4b.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-auditmod6b.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-auditmod6c.c: Use misc/cpuid.h wrappers
+	* sysdeps/x86_64/tst-auditmod7b.c: Use misc/cpuid.h wrappers
+
 2016-03-09  Joseph Myers  <joseph@codesourcery.com>
 
 	[BZ #19790]
diff --git a/config.h.in b/config.h.in
index 0147ba3..0f2231a 100644
--- a/config.h.in
+++ b/config.h.in
@@ -121,6 +121,9 @@ 
 /* Mach/i386 specific: define if the `i386_set_gdt' RPC is available.  */
 #undef	HAVE_I386_SET_GDT
 
+/* Define if the x86 kernel supports SYS_cpuid syscall.  */
+#undef HAVE_SYS_CPUID
+
 /* Defined of libidn is available.  */
 #undef HAVE_LIBIDN
 
diff --git a/configure b/configure
index 8fe5937..1138642 100755
--- a/configure
+++ b/configure
@@ -6420,6 +6420,43 @@  ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS
$LDFLAGS conftest.$ac_ext $ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
+# SYS_cpuid syscall
+libc_cv_sys_cpuid=no
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for x86 kernel with
SYS_cpuid support" >&5 +$as_echo_n "checking for x86 kernel with SYS_cpuid
support... " >&6; } +cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+  #if (defined(__i386__) || defined(__x86_64__)) && defined(__linux__)
+  #include <sys/syscall.h>
+  #if !defined(SYS_cpuid) || !defined(__NR_cpuid)
+  #error SYS_cpuid not defined
+  #endif
+  #else
+  #error Not a x86 Linux
+  #endif
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  libc_cv_sys_cpuid=yes
+else
+  libc_cv_sys_cpuid=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+if test "$libc_cv_sys_cpuid" = yes; then
+  $as_echo "#define HAVE_SYS_CPUID 1" >>confdefs.h
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_sys_cpuid" >&5
+$as_echo "$libc_cv_sys_cpuid" >&6; }
+
 ### End of automated tests.
 ### Now run sysdeps configure fragments.
 
diff --git a/configure.ac b/configure.ac
index 3c766b7..89a1779 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1704,6 +1704,24 @@  AC_SUBST(libc_cv_cxx_thread_local)
 AC_LANG_POP([C++])
 dnl End of C++ feature tests.
 
+# SYS_cpuid syscall
+libc_cv_sys_cpuid=no
+AC_MSG_CHECKING(for x86 kernel with SYS_cpuid support)
+AC_TRY_COMPILE([
+  #if (defined(__i386__) || defined(__x86_64__)) && defined(__linux__)
+  #include <sys/syscall.h>
+  #if !defined(SYS_cpuid) || !defined(__NR_cpuid)
+  #error SYS_cpuid not defined
+  #endif
+  #else
+  #error Not a x86 Linux
+  #endif
+], [], [libc_cv_sys_cpuid=yes], [libc_cv_sys_cpuid=no])
+if test "$libc_cv_sys_cpuid" = yes; then
+  AC_DEFINE(HAVE_SYS_CPUID)
+fi
+AC_MSG_RESULT($libc_cv_sys_cpuid)
+
 ### End of automated tests.
 ### Now run sysdeps configure fragments.
 
diff --git a/misc/cpuid.h b/misc/cpuid.h
new file mode 100644
index 0000000..6caea13
--- /dev/null
+++ b/misc/cpuid.h
@@ -0,0 +1,82 @@ 
+/* CPUID wrapper functions.
+   This file is part of the GNU C Library.
+   Copyright (C) 2016 Piotr Henryk Dabrowski <ultr@ultr.pl>
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _CPUID_H
+#define _CPUID_H 1
+
+/* NOTE: for new Linux kernels these functions try to use kernel-adjusted
+   values for cpuid returned by the SYS_cpuid sys call.
+   Otherwise they fallback to native cpuid implementation. */
+
+#include <config.h>
+
+#include <cpuid.h>
+#include <errno.h>
+#include <stddef.h>
+
+#ifdef HAVE_SYS_CPUID
+#include <sysdep.h>
+#include <sys/syscall.h>
+#endif
+
+#define get_cpuid_max __get_cpuid_max
+
+/* Return cpuid data for requested cpuid level (eax) and count register (ecx),
+   as found in returned eax, ebx, ecx and edx registers.
+   All pointers are required to be non-null. */
+static inline void
+cpuid_count (unsigned int level, unsigned int count,
+	     unsigned int *eax, unsigned int *ebx,
+	     unsigned int *ecx, unsigned int *edx)
+{
+#ifdef HAVE_SYS_CPUID
+	if (INLINE_SYSCALL(cpuid, 6, level, count, eax, ebx, ecx, edx) == 0)
+		return;
+#endif
+	__cpuid_count(level, count, *eax, *ebx, *ecx, *edx);
+}
+
+/* Return cpuid data for requested cpuid level (eax),
+   as found in returned eax, ebx, ecx and edx registers.
+   All pointers are required to be non-null. */
+static inline void
+cpuid (unsigned int level,
+       unsigned int *eax, unsigned int *ebx,
+       unsigned int *ecx, unsigned int *edx)
+{
+	cpuid_count(level, 0, eax, ebx, ecx, edx);
+}
+
+/* Return cpuid data for requested cpuid level (eax),
+   as found in returned eax, ebx, ecx and edx registers.
+   The function checks if cpuid is supported and returns 1 for valid cpuid
+   information or 0 for unsupported cpuid level.
+   All pointers are required to be non-null. */
+static inline int
+get_cpuid (unsigned int level,
+	   unsigned int *eax, unsigned int *ebx,
+	   unsigned int *ecx, unsigned int *edx)
+{
+	unsigned int ext = level & 0x80000000;
+	if (get_cpuid_max (ext, 0) < level)
+		return 0;
+	cpuid (level, eax, ebx, ecx, edx);
+	return 1;
+}
+
+#endif /* cpuid.h */
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 218ff2b..a71c701 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <cpu-features.h>
 
 static inline void
@@ -25,9 +25,9 @@  get_common_indeces (struct cpu_features *cpu_features,
 		    unsigned int *extended_model)
 {
   unsigned int eax;
-  __cpuid (1, eax, cpu_features->cpuid[COMMON_CPUID_INDEX_1].ebx,
-	   cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx,
-	   cpu_features->cpuid[COMMON_CPUID_INDEX_1].edx);
+  cpuid (1, &eax, &(cpu_features->cpuid[COMMON_CPUID_INDEX_1].ebx),
+	 &(cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx),
+	 &(cpu_features->cpuid[COMMON_CPUID_INDEX_1].edx));
   GLRO(dl_x86_cpu_features).cpuid[COMMON_CPUID_INDEX_1].eax = eax;
   *family = (eax >> 8) & 0x0f;
   *model = (eax >> 4) & 0x0f;
@@ -42,20 +42,21 @@  get_common_indeces (struct cpu_features *cpu_features,
 static inline void
 init_cpu_features (struct cpu_features *cpu_features)
 {
-  unsigned int ebx, ecx, edx;
+  unsigned int eax, ebx, ecx, edx;
   unsigned int family = 0;
   unsigned int model = 0;
   enum cpu_features_kind kind;
 
 #if !HAS_CPUID
-  if (__get_cpuid_max (0, 0) == 0)
+  if (get_cpuid_max (0, 0) == 0)
     {
       kind = arch_kind_other;
       goto no_cpuid;
     }
 #endif
 
-  __cpuid (0, cpu_features->max_cpuid, ebx, ecx, edx);
+  cpuid (0, &eax, &ebx, &ecx, &edx);
+  cpu_features->max_cpuid = eax;
 
   /* This spells out "GenuineIntel".  */
   if (ebx == 0x756e6547 && ecx == 0x6c65746e && edx == 0x49656e69)
@@ -147,13 +148,13 @@  init_cpu_features (struct cpu_features *cpu_features)
       ecx = cpu_features->cpuid[COMMON_CPUID_INDEX_1].ecx;
 
       unsigned int eax;
-      __cpuid (0x80000000, eax, ebx, ecx, edx);
+      cpuid (0x80000000, &eax, &ebx, &ecx, &edx);
       if (eax >= 0x80000001)
-	__cpuid (0x80000001,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].eax,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ebx,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
-		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
+	cpuid (0x80000001,
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].eax),
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ebx),
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx),
+	       &(cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx));
 
       if (family == 0x15)
 	{
@@ -175,11 +176,11 @@  init_cpu_features (struct cpu_features *cpu_features)
     cpu_features->feature[index_I686] |= bit_I686;
 
   if (cpu_features->max_cpuid >= 7)
-    __cpuid_count (7, 0,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].eax,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].ebx,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].ecx,
-		   cpu_features->cpuid[COMMON_CPUID_INDEX_7].edx);
+    cpuid_count (7, 0,
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].eax),
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].ebx),
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].ecx),
+		 &(cpu_features->cpuid[COMMON_CPUID_INDEX_7].edx));
 
   /* Can we call xgetbv?  */
   if (HAS_CPU_FEATURE (OSXSAVE))
diff --git a/sysdeps/x86/fpu/test-fenv-clear-sse.c
b/sysdeps/x86/fpu/test-fenv-clear-sse.c index cc4b3f0..816470b 100644
--- a/sysdeps/x86/fpu/test-fenv-clear-sse.c
+++ b/sysdeps/x86/fpu/test-fenv-clear-sse.c
@@ -17,7 +17,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <stdbool.h>
 
 static bool
@@ -25,7 +25,7 @@  have_sse2 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+  if (!get_cpuid (1, &eax, &ebx, &ecx, &edx))
     return false;
 
   return (edx & bit_SSE2) != 0;
diff --git a/sysdeps/x86/fpu/test-fenv-sse-2.c
b/sysdeps/x86/fpu/test-fenv-sse-2.c index d3197c3..c72d5ad 100644
--- a/sysdeps/x86/fpu/test-fenv-sse-2.c
+++ b/sysdeps/x86/fpu/test-fenv-sse-2.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <fenv.h>
 #include <float.h>
 #include <stdbool.h>
@@ -28,7 +28,7 @@  have_sse2 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+  if (!get_cpuid (1, &eax, &ebx, &ecx, &edx))
     return false;
 
   return (edx & bit_SSE2) != 0;
diff --git a/sysdeps/x86/fpu/test-fenv-sse.c b/sysdeps/x86/fpu/test-fenv-sse.c
index 4f4ff6a..c8f1497 100644
--- a/sysdeps/x86/fpu/test-fenv-sse.c
+++ b/sysdeps/x86/fpu/test-fenv-sse.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <fenv.h>
 #include <float.h>
 #include <stdbool.h>
@@ -27,7 +27,7 @@  have_sse2 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+  if (!get_cpuid (1, &eax, &ebx, &ecx, &edx))
     return false;
 
   return (edx & bit_SSE2) != 0;
diff --git a/sysdeps/x86_64/cacheinfo.c b/sysdeps/x86_64/cacheinfo.c
index 96463df..76cd03d 100644
--- a/sysdeps/x86_64/cacheinfo.c
+++ b/sysdeps/x86_64/cacheinfo.c
@@ -20,7 +20,7 @@ 
 #include <stdbool.h>
 #include <stdlib.h>
 #include <unistd.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <init-arch.h>
 
 #define is_intel GLRO(dl_x86_cpu_features).kind == arch_kind_intel
@@ -162,7 +162,7 @@  intel_check_word (int name, unsigned int value, bool
*has_level_2, unsigned int round = 0;
 	  while (1)
 	    {
-	      __cpuid_count (4, round, eax, ebx, ecx, edx);
+	      cpuid_count (4, round, &eax, &ebx, &ecx, &edx);
 
 	      enum { null = 0, data = 1, inst = 2, uni = 3 } type = eax & 0x1f;
 	      if (type == null)
@@ -275,7 +275,7 @@  handle_intel (int name, unsigned int maxidx)
       unsigned int ebx;
       unsigned int ecx;
       unsigned int edx;
-      __cpuid (2, eax, ebx, ecx, edx);
+      cpuid (2, &eax, &ebx, &ecx, &edx);
 
       /* The low byte of EAX in the first round contain the number of
 	 rounds we have to make.  At least one, the one we are already
@@ -319,7 +319,7 @@  handle_amd (int name)
   unsigned int ebx;
   unsigned int ecx;
   unsigned int edx;
-  __cpuid (0x80000000, eax, ebx, ecx, edx);
+  cpuid (0x80000000, &eax, &ebx, &ecx, &edx);
 
   /* No level 4 cache (yet).  */
   if (name > _SC_LEVEL3_CACHE_LINESIZE)
@@ -329,7 +329,7 @@  handle_amd (int name)
   if (eax < fn)
     return 0;
 
-  __cpuid (fn, eax, ebx, ecx, edx);
+  cpuid (fn, &eax, &ebx, &ecx, &edx);
 
   if (name < _SC_LEVEL1_DCACHE_SIZE)
     {
@@ -479,7 +479,7 @@  init_cacheinfo (void)
   unsigned int ebx;
   unsigned int ecx;
   unsigned int edx;
-  int max_cpuid_ex;
+  unsigned int max_cpuid_ex;
   long int data = -1;
   long int shared = -1;
   unsigned int level;
@@ -512,7 +512,7 @@  init_cacheinfo (void)
 	  /* Query until desired cache level is enumerated.  */
 	  do
 	    {
-	      __cpuid_count (4, i++, eax, ebx, ecx, edx);
+	      cpuid_count (4, i++, &eax, &ebx, &ecx, &edx);
 
 	      /* There seems to be a bug in at least some Pentium Ds
 		 which sometimes fail to iterate all cache parameters.
@@ -536,7 +536,7 @@  init_cacheinfo (void)
 	      i = 0;
 	      while (1)
 		{
-		  __cpuid_count (11, i++, eax, ebx, ecx, edx);
+		  cpuid_count (11, i++, &eax, &ebx, &ecx, &edx);
 
 		  int shipped = ebx & 0xff;
 		  int type = ecx & 0xff0;
@@ -598,7 +598,7 @@  init_cacheinfo (void)
       shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
 
       /* Get maximum extended function. */
-      __cpuid (0x80000000, max_cpuid_ex, ebx, ecx, edx);
+      cpuid (0x80000000, &max_cpuid_ex, &ebx, &ecx, &edx);
 
       if (shared <= 0)
 	/* No shared L3 cache.  All we have is the L2 cache.  */
@@ -609,7 +609,7 @@  init_cacheinfo (void)
 	  if (max_cpuid_ex >= 0x80000008)
 	    {
 	      /* Get width of APIC ID.  */
-	      __cpuid (0x80000008, max_cpuid_ex, ebx, ecx, edx);
+	      cpuid (0x80000008, &max_cpuid_ex, &ebx, &ecx, &edx);
 	      threads = 1 << ((ecx >> 12) & 0x0f);
 	    }
 
@@ -617,7 +617,7 @@  init_cacheinfo (void)
 	    {
 	      /* If APIC ID width is not available, use logical
 		 processor count.  */
-	      __cpuid (0x00000001, max_cpuid_ex, ebx, ecx, edx);
+	      cpuid (0x00000001, &max_cpuid_ex, &ebx, &ecx, &edx);
 
 	      if ((edx & (1 << 28)) != 0)
 		threads = (ebx >> 16) & 0xff;
@@ -635,7 +635,7 @@  init_cacheinfo (void)
 #ifndef DISABLE_PREFETCHW
       if (max_cpuid_ex >= 0x80000001)
 	{
-	  __cpuid (0x80000001, eax, ebx, ecx, edx);
+	  cpuid (0x80000001, &eax, &ebx, &ecx, &edx);
 	  /*  PREFETCHW     || 3DNow!  */
 	  if ((ecx & 0x100) || (edx & 0x80000000))
 	    __x86_prefetchw = -1;
diff --git a/sysdeps/x86_64/tst-audit10.c b/sysdeps/x86_64/tst-audit10.c
index a487b40..7442f46 100644
--- a/sysdeps/x86_64/tst-audit10.c
+++ b/sysdeps/x86_64/tst-audit10.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <cpu-features.h>
 
 int tst_audit10_aux (void);
@@ -26,11 +26,11 @@  avx512_enabled (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
-  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+  cpuid_count (7, 0, &eax, &ebx, &ecx, &edx);
   if (!(ebx & bit_AVX512F))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-audit4.c b/sysdeps/x86_64/tst-audit4.c
index d8e2ab1..5aeaebf 100644
--- a/sysdeps/x86_64/tst-audit4.c
+++ b/sysdeps/x86_64/tst-audit4.c
@@ -16,7 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <cpuid.h>
+#include <misc/cpuid.h>
 
 int tst_audit4_aux (void);
 
@@ -25,7 +25,7 @@  avx_enabled (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-audit6.c b/sysdeps/x86_64/tst-audit6.c
index f2f6a48..e7cc9d6 100644
--- a/sysdeps/x86_64/tst-audit6.c
+++ b/sysdeps/x86_64/tst-audit6.c
@@ -2,7 +2,7 @@ 
 
 #include <stdlib.h>
 #include <string.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 #include <emmintrin.h>
 
 extern __m128i audit_test (__m128i, __m128i, __m128i, __m128i,
@@ -14,7 +14,7 @@  avx_enabled (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-auditmod10b.c b/sysdeps/x86_64/tst-auditmod10b.c
index ad6fcaf..3b050d0 100644
--- a/sysdeps/x86_64/tst-auditmod10b.c
+++ b/sysdeps/x86_64/tst-auditmod10b.c
@@ -125,18 +125,18 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t
*refcook, 
 #ifdef __AVX512F__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 
 static int
 check_avx512 (void)
 {
   unsigned int eax, ebx, ecx, edx;
 
-  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
+  if (get_cpuid (1, &eax, &ebx, &ecx, &edx) == 0
       || (ecx & (bit_AVX | bit_OSXSAVE)) != (bit_AVX | bit_OSXSAVE))
     return 0;
 
-  __cpuid_count (7, 0, eax, ebx, ecx, edx);
+  cpuid_count (7, 0, &eax, &ebx, &ecx, &edx);
   if (!(ebx & bit_AVX512F))
     return 0;
 
diff --git a/sysdeps/x86_64/tst-auditmod4b.c b/sysdeps/x86_64/tst-auditmod4b.c
index 2b0d827..c980887 100644
--- a/sysdeps/x86_64/tst-auditmod4b.c
+++ b/sysdeps/x86_64/tst-auditmod4b.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t
*refcook, 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else
diff --git a/sysdeps/x86_64/tst-auditmod6b.c b/sysdeps/x86_64/tst-auditmod6b.c
index 886fc33..d77d949 100644
--- a/sysdeps/x86_64/tst-auditmod6b.c
+++ b/sysdeps/x86_64/tst-auditmod6b.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t
*refcook, 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else
diff --git a/sysdeps/x86_64/tst-auditmod6c.c b/sysdeps/x86_64/tst-auditmod6c.c
index b2ee24d..0a5f143 100644
--- a/sysdeps/x86_64/tst-auditmod6c.c
+++ b/sysdeps/x86_64/tst-auditmod6c.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t
*refcook, 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else
diff --git a/sysdeps/x86_64/tst-auditmod7b.c b/sysdeps/x86_64/tst-auditmod7b.c
index f27076d..72257cd 100644
--- a/sysdeps/x86_64/tst-auditmod7b.c
+++ b/sysdeps/x86_64/tst-auditmod7b.c
@@ -108,7 +108,7 @@  la_symbind64 (Elf64_Sym *sym, unsigned int ndx, uintptr_t
*refcook, 
 #ifdef __AVX__
 #include <immintrin.h>
-#include <cpuid.h>
+#include <misc/cpuid.h>
 
 static int avx = -1;
 
@@ -120,7 +120,7 @@  check_avx (void)
     {
       unsigned int eax, ebx, ecx, edx;
 
-      if (__get_cpuid (1, &eax, &ebx, &ecx, &edx)
+      if (get_cpuid (1, &eax, &ebx, &ecx, &edx)
 	  && (ecx & bit_AVX))
 	avx = 1;
       else