From patchwork Mon Jun 22 20:25:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "H.J. Lu" X-Patchwork-Id: 39771 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7E357388B022; Mon, 22 Jun 2020 20:26:38 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7E357388B022 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1592857598; bh=Cy0Z8J8OwIeZr+KkYQWgJFsI0A7x4JJVIdv8a6geI4Q=; h=References:In-Reply-To:Date:Subject:To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=unzALub7hhxXirlvxRjyRhD4jbWbx9GgXma0ox0cIrIyKQFOJmbZ5kR6eL5guJHu1 zPJoOVtdr67EkYWmJmhfJiyFOqbXq8FCJyMQ/Gxz9bq23PcKST0TWHdtS5PZ+/UBFg POG0Q8HD6BdCanP1Xa3CuJTwAS7WF7fS/4sPHmV8= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-io1-xd43.google.com (mail-io1-xd43.google.com [IPv6:2607:f8b0:4864:20::d43]) by sourceware.org (Postfix) with ESMTPS id DE199388A80E for ; Mon, 22 Jun 2020 20:26:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org DE199388A80E Received: by mail-io1-xd43.google.com with SMTP id h4so13403562ior.5 for ; Mon, 22 Jun 2020 13:26:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Cy0Z8J8OwIeZr+KkYQWgJFsI0A7x4JJVIdv8a6geI4Q=; b=bqSCJkyaizLZlQFrBxNXHm23eaAmbkWu247j9wQqHUoEPy+R0cbymcM/PL48/GHrs8 Zt3VHcc1n6O6fOp6xLQWqsywsGvPZBMFMTbDepj+TKL951N3pJ0ZZgU7kSzkwwLHFDb6 8mAfu+FRYtx9GGXlPKCzBP+xCOxwfa64K9dW0i9SfrbaaZQv5eitSWQ00OlG1t3bmSZ+ H2R9NMmevvoP0T5PAb39lSyOTdtm/tmJZlY4/66x7ObN31P9uNDeO5QoQSCKNDxwwz92 jokHFrTZrLF9FV8dbHS36no2qxNpatOKziuMNX2fLhACgsTF6JMfRcv0c9n32j2M6ZJZ hMNw== X-Gm-Message-State: AOAM5333iXNtBEMHR7ZHjCca1J51dXAg7RnoWRWzCr3wGoZGetIg3or6 EoKzD6s6vXrudKeLxG6oeFP7lR+KtYbxGNxyStw= X-Google-Smtp-Source: ABdhPJx9p2nMElBjdVc1oS+GdasVImyTTBlJuEUnPmx69s8JPvWe01jVfSokngerCY/pbth/wyF2/rYvLLNuRLp9bwg= X-Received: by 2002:a05:6602:2e05:: with SMTP id o5mr21472492iow.28.1592857593105; Mon, 22 Jun 2020 13:26:33 -0700 (PDT) MIME-Version: 1.0 References: <20200617193100.1115529-1-hjl.tools@gmail.com> <87mu50rccl.fsf@oldenburg2.str.redhat.com> <87o8pbpiuy.fsf@oldenburg2.str.redhat.com> In-Reply-To: <87o8pbpiuy.fsf@oldenburg2.str.redhat.com> Date: Mon, 22 Jun 2020 13:25:57 -0700 Message-ID: Subject: V3: [PATCH] x86: Install [BZ #26124] To: Florian Weimer X-Spam-Status: No, score=-9.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "H.J. Lu via Libc-alpha" From: "H.J. Lu" Reply-To: "H.J. Lu" Cc: "H.J. Lu via Libc-alpha" , Joseph Myers Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" On Mon, Jun 22, 2020 at 2:09 AM Florian Weimer wrote: > > * H. J. Lu via Libc-alpha: > > > On Thu, Jun 18, 2020 at 1:45 AM Florian Weimer wrote: > >> > >> * H. J. Lu via Libc-alpha: > >> > >> > exports only: > >> > > >> > struct cpu_features > >> > { > >> > struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; > >> > unsigned int feature[FEATURE_INDEX_MAX]; > >> > struct cpu_features_basic basic; > >> > }; > >> > >> The struct cpu_features ABI does not appear to be stable, as I wrote on > >> the other thread: > >> > >> > >> > > > > Here is the updated patch to use > > > > struct cpu_features > > { > > struct cpu_features_basic basic; > > unsigned int usable[USABLE_FEATURE_INDEX_MAX]; > > struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; > > }; > > > > This should be backward compatible for both .o and .so files. > > This is an improvement. I still see issues with this interface, though. > > Programmers will need something like this for IFUNC resolvers, but this > is not usable there if the relocation order does not match the explicit > ELF dependency order (e.g., due to symbol interposition). This is a > generic problem with IFUNC resolvers. Other architectures solve this by > adding arguments to IFUNC resolvers, so that the required can be This issue is orthogonal to . > accessed without relocations. __builtin_cpu_supports does not have this > issue. > > The manual does not explain the difference between “available” and > “usable”. I do not think we should expose both. I don't see any > circumstances under which “available” would provide useful information. Fixed. "available" means supported by the operating system, which is very useful information. > struct cpu_features (even in its reduced form) is fairly large. We will > never be able to reduce its size again if it becomes public ABI. Fixed by struct cpu_features { struct cpu_features_basic basic; unsigned int *usable_p; struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; }; > I'm not convinced that this interface conforms to the API contract for > the CPUID instruction. Aren't the raw bits in the cpuid field subject > to interpretation based on family and model? HAS_CPU_FEATURE does not > reflect this at all. Yes, it does. The cpuid array is populated based on family and model. It has accurate information about processor features. > My preferred interface would look like this: > > * The low-level function call ABI is: > > unsigned long long long int __x86_get_cpu_features (unsigned int word) > __attribute__ ((const)); > > * IFUNC resolvers receive two arguments, > > unsigned long long long int *feature_words, size_t feature_words_length > > * CPU_FEATURE_USABLE (NAME) would expand to something like > > ((__x86_get_cpu_features (NAME_word) & NAME_bitmask) != 0) > > The implementation of __x86_get_cpu_features simply returns 0 if the > index is too large, which solves the extension problem. > > * CPU_FEATURE_USABLE_ARRAY (WORDS, WORDS_LENGTH, NAME) would expand to this: > > (((NAME_word < WORDS_LENGTH ? WORDS[NAME_word] ? : 0) & NAME_bitmask) != 0) > > This version is expected to be used in IFUNC resolvers. The macro > name is just a strawman. > > Only the “usable” bits are exposed, so the interface makes it clear that > the implementation performs some normalization. My new interface is expandable and backward compatible. It is also forward compatible since the older libc.so works with binaries compiled against the newer as long as the cpuid and usable array sizes are the same. > Alternatively, the __builtin_cpu_supports interface could be enhanced to > cover all the features you need. __builtin_cpu_supports is equivalent to CPU_FEATURE_USABLE and it doesn't support HAS_CPU_FEATURE which does provide useful information. From 4ee18b75b4f187487842b58062408d704c0af777 Mon Sep 17 00:00:00 2001 From: "H.J. Lu" Date: Wed, 17 Jun 2020 09:12:18 -0700 Subject: [PATCH] x86: Install [BZ #26124] Install so that programmers can do #if __has_include() #include #endif ... if (HAS_CPU_FEATURE (SSE2)) ... if (CPU_FEATURE_USABLE (AVX2)) ... exports only: struct cpu_features { struct cpu_features_basic basic; unsigned int *usable_p; struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; }; /* Get a pointer to the CPU features structure. */ extern const struct cpu_features *__x86_get_cpu_features (void) __attribute__ ((const)); Since all feature checks are done through macros, programs compiled with a newer header are compatible with the older ones as long as the layout of struct cpu_features is identical. The cpuid array can be expanded with backward binary compatibility for both .o and .so files. Note: Although GCC has __builtin_cpu_supports, it only supports a subset of and it is equivalent to CPU_FEATURE_USABLE. It doesn't support HAS_CPU_FEATURE. --- NEWS | 2 + manual/platform.texi | 23 ++ sysdeps/unix/sysv/linux/i386/ld.abilist | 1 + sysdeps/unix/sysv/linux/x86_64/64/ld.abilist | 1 + sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist | 1 + sysdeps/x86/Makefile | 1 + sysdeps/x86/Versions | 4 +- sysdeps/x86/dl-get-cpu-features.c | 4 +- sysdeps/x86/include/cpu-features.h | 207 ++++++++++++++++++ .../{cpu-features.h => sys/platform/x86.h} | 172 +-------------- sysdeps/x86/tst-get-cpu-features.c | 6 +- sysdeps/x86_64/fpu/math-tests-arch.h | 8 +- sysdeps/x86_64/multiarch/test-multiarch.c | 10 +- 13 files changed, 264 insertions(+), 176 deletions(-) create mode 100644 sysdeps/x86/include/cpu-features.h rename sysdeps/x86/{cpu-features.h => sys/platform/x86.h} (77%) diff --git a/NEWS b/NEWS index a660fc59a8..ae7d1ece35 100644 --- a/NEWS +++ b/NEWS @@ -9,6 +9,8 @@ Version 2.32 Major new features: +* Add to provide query macros for x86 CPU features. + * Unicode 12.1.0 Support: Character encoding, character type info, and transliteration tables are all updated to Unicode 12.1.0, using generator scripts contributed by Mike FABIAN (Red Hat). diff --git a/manual/platform.texi b/manual/platform.texi index 504addc956..39cb68bd78 100644 --- a/manual/platform.texi +++ b/manual/platform.texi @@ -7,6 +7,7 @@ @menu * PowerPC:: Facilities Specific to the PowerPC Architecture * RISC-V:: Facilities Specific to the RISC-V Architecture +* X86:: Facilities Specific to the X86 Architecture @end menu @node PowerPC @@ -134,3 +135,25 @@ all threads in the current process. Setting the ordering on only the current thread is necessary. All other flag bits are reserved. @end deftypefun + +@node X86 +@appendixsec X86-specific Facilities + +Facilities specific to X86 that are not specific to a particular +operating system are declared in @file{sys/platform/x86.h}. + +@deftypefun {const struct cpu_features *} __x86_get_cpu_features (void) +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} +Return a pointer to x86 CPU feature structure used by query macros for x86 +CPU features. +@end deftypefun + +@deftypefn Macro int HAS_CPU_FEATURE (@var{name}) +This macro returns a nonzero value (true) if the processor feature +@var{name} is available. +@end deftypefn + +@deftypefn Macro int CPU_FEATURE_USABLE (@var{name}) +This macro returns a nonzero value (true) if the processor feature +@var{name} is supported by the operating system. +@end deftypefn diff --git a/sysdeps/unix/sysv/linux/i386/ld.abilist b/sysdeps/unix/sysv/linux/i386/ld.abilist index 0478e22071..1226876689 100644 --- a/sysdeps/unix/sysv/linux/i386/ld.abilist +++ b/sysdeps/unix/sysv/linux/i386/ld.abilist @@ -3,3 +3,4 @@ GLIBC_2.1 __libc_stack_end D 0x4 GLIBC_2.1 _dl_mcount F GLIBC_2.3 ___tls_get_addr F GLIBC_2.3 __tls_get_addr F +GLIBC_2.32 __x86_get_cpu_features F diff --git a/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist b/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist index d3cdf7611e..886e57abd5 100644 --- a/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist @@ -2,3 +2,4 @@ GLIBC_2.2.5 __libc_stack_end D 0x8 GLIBC_2.2.5 _dl_mcount F GLIBC_2.2.5 _r_debug D 0x28 GLIBC_2.3 __tls_get_addr F +GLIBC_2.32 __x86_get_cpu_features F diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist index c70bccf782..0d2f8a2cc5 100644 --- a/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist +++ b/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist @@ -2,3 +2,4 @@ GLIBC_2.16 __libc_stack_end D 0x4 GLIBC_2.16 __tls_get_addr F GLIBC_2.16 _dl_mcount F GLIBC_2.16 _r_debug D 0x14 +GLIBC_2.32 __x86_get_cpu_features F diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index beab426f67..0e4d132803 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -4,6 +4,7 @@ endif ifeq ($(subdir),elf) sysdep-dl-routines += dl-get-cpu-features +sysdep_headers += sys/platform/x86.h tests += tst-get-cpu-features tst-get-cpu-features-static tests-static += tst-get-cpu-features-static diff --git a/sysdeps/x86/Versions b/sysdeps/x86/Versions index e02923708e..7e3139dbb1 100644 --- a/sysdeps/x86/Versions +++ b/sysdeps/x86/Versions @@ -1,5 +1,5 @@ ld { - GLIBC_PRIVATE { - __get_cpu_features; + GLIBC_2.32 { + __x86_get_cpu_features; } } diff --git a/sysdeps/x86/dl-get-cpu-features.c b/sysdeps/x86/dl-get-cpu-features.c index 9d61cd56be..fa1a1caa87 100644 --- a/sysdeps/x86/dl-get-cpu-features.c +++ b/sysdeps/x86/dl-get-cpu-features.c @@ -18,10 +18,10 @@ #include -#undef __get_cpu_features +#undef __x86_get_cpu_features const struct cpu_features * -__get_cpu_features (void) +__x86_get_cpu_features (void) { return &GLRO(dl_x86_cpu_features); } diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h new file mode 100644 index 0000000000..c6efa3f066 --- /dev/null +++ b/sysdeps/x86/include/cpu-features.h @@ -0,0 +1,207 @@ +/* Data structure for x86 CPU features. + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _PRIVATE_CPU_FEATURES_H +#define _PRIVATE_CPU_FEATURES_H 1 + +#ifdef _CPU_FEATURES_H +# error this should be impossible +#endif + +#ifndef _ISOMAC +/* Get most of the contents from the public header, but we define a + different `struct cpu_features' type for private use. */ +# define cpu_features cpu_features_public +# define __x86_get_cpu_features __x86_get_cpu_features_public +#endif + +#include + +#ifndef _ISOMAC + +# undef cpu_features +# undef __x86_get_cpu_features +# define __get_cpu_features __x86_get_cpu_features + +enum +{ + /* The integer bit array index for the first set of preferred feature + bits. */ + PREFERRED_FEATURE_INDEX_1 = 0, + /* The current maximum size of the feature integer bit array. */ + PREFERRED_FEATURE_INDEX_MAX +}; + +# undef CPU_FEATURES_ARCH_P +# define CPU_FEATURES_ARCH_P(ptr, name) \ + ((ptr->feature_##name[index_arch_##name] & (bit_arch_##name)) != 0) + +/* HAS_ARCH_FEATURE evaluates to true if we may use the feature at + runtime. */ +# define HAS_ARCH_FEATURE(name) \ + CPU_FEATURES_ARCH_P (__x86_get_cpu_features (), name) +/* CPU_FEATURE_USABLE evaluates to true if the feature is usable. */ +# undef CPU_FEATURE_USABLE +# define CPU_FEATURE_USABLE(name) \ + CPU_FEATURES_ARCH_P (__x86_get_cpu_features (), name##_Usable) + +/* USABLE_FEATURE_INDEX_1. */ +# define feature_AVX_Usable usable +# define feature_AVX2_Usable usable +# define feature_AVX512F_Usable usable +# define feature_AVX512CD_Usable usable +# define feature_AVX512ER_Usable usable +# define feature_AVX512PF_Usable usable +# define feature_AVX512VL_Usable usable +# define feature_AVX512BW_Usable usable +# define feature_AVX512DQ_Usable usable +# define feature_AVX512_4FMAPS_Usable usable +# define feature_AVX512_4VNNIW_Usable usable +# define feature_AVX512_BITALG_Usable usable +# define feature_AVX512_IFMA_Usable usable +# define feature_AVX512_VBMI_Usable usable +# define feature_AVX512_VBMI2_Usable usable +# define feature_AVX512_VNNI_Usable usable +# define feature_AVX512_VPOPCNTDQ_Usable usable +# define feature_FMA_Usable usable +# define feature_FMA4_Usable usable +# define feature_VAES_Usable usable +# define feature_VPCLMULQDQ_Usable usable +# define feature_XOP_Usable usable +# define feature_XSAVEC_Usable usable +# define feature_F16C_Usable usable +# define feature_AVX512_VP2INTERSECT_Usable usable +# define feature_AVX512_BF16_Usable usable +# define feature_PKU_Usable usable + +/* PREFERRED_FEATURE_INDEX_1. */ +# define bit_arch_I586 (1u << 0) +# define bit_arch_I686 (1u << 1) +# define bit_arch_Fast_Rep_String (1u << 2) +# define bit_arch_Fast_Copy_Backward (1u << 3) +# define bit_arch_Fast_Unaligned_Load (1u << 4) +# define bit_arch_Fast_Unaligned_Copy (1u << 5) +# define bit_arch_Slow_BSF (1u << 6) +# define bit_arch_Slow_SSE4_2 (1u << 7) +# define bit_arch_AVX_Fast_Unaligned_Load (1u << 8) +# define bit_arch_Prefer_MAP_32BIT_EXEC (1u << 9) +# define bit_arch_Prefer_PMINUB_for_stringop (1u << 10) +# define bit_arch_Prefer_No_VZEROUPPER (1u << 11) +# define bit_arch_Prefer_ERMS (1u << 12) +# define bit_arch_Prefer_FSRM (1u << 13) +# define bit_arch_Prefer_No_AVX512 (1u << 14) +# define bit_arch_MathVec_Prefer_No_AVX512 (1u << 15) + +# define index_arch_Fast_Rep_String PREFERRED_FEATURE_INDEX_1 +# define index_arch_Fast_Copy_Backward PREFERRED_FEATURE_INDEX_1 +# define index_arch_Slow_BSF PREFERRED_FEATURE_INDEX_1 +# define index_arch_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1 +# define index_arch_Prefer_PMINUB_for_stringop PREFERRED_FEATURE_INDEX_1 +# define index_arch_Fast_Unaligned_Copy PREFERRED_FEATURE_INDEX_1 +# define index_arch_I586 PREFERRED_FEATURE_INDEX_1 +# define index_arch_I686 PREFERRED_FEATURE_INDEX_1 +# define index_arch_Slow_SSE4_2 PREFERRED_FEATURE_INDEX_1 +# define index_arch_AVX_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1 +# define index_arch_Prefer_MAP_32BIT_EXEC PREFERRED_FEATURE_INDEX_1 +# define index_arch_Prefer_No_VZEROUPPER PREFERRED_FEATURE_INDEX_1 +# define index_arch_Prefer_ERMS PREFERRED_FEATURE_INDEX_1 +# define index_arch_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1 +# define index_arch_MathVec_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1 +# define index_arch_Prefer_FSRM PREFERRED_FEATURE_INDEX_1 + +# define feature_Fast_Rep_String preferred +# define feature_Fast_Copy_Backward preferred +# define feature_Slow_BSF preferred +# define feature_Fast_Unaligned_Load preferred +# define feature_Prefer_PMINUB_for_stringop preferred +# define feature_Fast_Unaligned_Copy preferred +# define feature_I586 preferred +# define feature_I686 preferred +# define feature_Slow_SSE4_2 preferred +# define feature_AVX_Fast_Unaligned_Load preferred +# define feature_Prefer_MAP_32BIT_EXEC preferred +# define feature_Prefer_No_VZEROUPPER preferred +# define feature_Prefer_ERMS preferred +# define feature_Prefer_No_AVX512 preferred +# define feature_MathVec_Prefer_No_AVX512 preferred +# define feature_Prefer_FSRM preferred + +/* XCR0 Feature flags. */ +# define bit_XMM_state (1u << 1) +# define bit_YMM_state (1u << 2) +# define bit_Opmask_state (1u << 5) +# define bit_ZMM0_15_state (1u << 6) +# define bit_ZMM16_31_state (1u << 7) + +struct cpu_features +{ + struct cpu_features_basic basic; + unsigned int *usable_p; + struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; + unsigned int usable[USABLE_FEATURE_INDEX_MAX]; + unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; + /* The state size for XSAVEC or XSAVE. The type must be unsigned long + int so that we use + + sub xsave_state_size_offset(%rip) %RSP_LP + + in _dl_runtime_resolve. */ + unsigned long int xsave_state_size; + /* The full state size for XSAVE when XSAVEC is disabled by + + GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable + */ + unsigned int xsave_state_full_size; + /* Data cache size for use in memory and string routines, typically + L1 size. */ + unsigned long int data_cache_size; + /* Shared cache size for use in memory and string routines, typically + L2 or L3 size. */ + unsigned long int shared_cache_size; + /* Threshold to use non temporal store. */ + unsigned long int non_temporal_threshold; +}; + +# if defined (_LIBC) && !IS_IN (nonlib) +/* Unused for x86. */ +# define INIT_ARCH() +# define __x86_get_cpu_features() (&GLRO(dl_x86_cpu_features)) +# define x86_get_cpuid_registers(i) \ + (&(GLRO(dl_x86_cpu_features).cpuid[i])) +# endif + +# ifdef __x86_64__ +# define HAS_CPUID 1 +# elif (defined __i586__ || defined __pentium__ \ + || defined __geode__ || defined __k6__) +# define HAS_CPUID 1 +# define HAS_I586 1 +# define HAS_I686 HAS_ARCH_FEATURE (I686) +# elif defined __i486__ +# define HAS_CPUID 0 +# define HAS_I586 HAS_ARCH_FEATURE (I586) +# define HAS_I686 HAS_ARCH_FEATURE (I686) +# else +# define HAS_CPUID 1 +# define HAS_I586 1 +# define HAS_I686 1 +# endif + +#endif /* !_ISOMAC */ + +#endif /* include/cpu-features.h */ diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/sys/platform/x86.h similarity index 77% rename from sysdeps/x86/cpu-features.h rename to sysdeps/x86/sys/platform/x86.h index 574f055e0c..43e286cde3 100644 --- a/sysdeps/x86/cpu-features.h +++ b/sysdeps/x86/sys/platform/x86.h @@ -1,4 +1,5 @@ -/* This file is part of the GNU C Library. +/* Data structure for x86 CPU features. + This file is part of the GNU C Library. Copyright (C) 2008-2020 Free Software Foundation, Inc. The GNU C Library is free software; you can redistribute it and/or @@ -15,8 +16,8 @@ License along with the GNU C Library; if not, see . */ -#ifndef cpu_features_h -#define cpu_features_h +#ifndef _SYS_PLATFORM_X86_H +#define _SYS_PLATFORM_X86_H enum { @@ -27,15 +28,6 @@ enum USABLE_FEATURE_INDEX_MAX }; -enum -{ - /* The integer bit array index for the first set of preferred feature - bits. */ - PREFERRED_FEATURE_INDEX_1 = 0, - /* The current maximum size of the feature integer bit array. */ - PREFERRED_FEATURE_INDEX_MAX -}; - enum { COMMON_CPUID_INDEX_1 = 0, @@ -80,51 +72,23 @@ struct cpu_features struct cpu_features_basic basic; unsigned int *usable_p; struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX]; - unsigned int usable[USABLE_FEATURE_INDEX_MAX]; - unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX]; - /* The state size for XSAVEC or XSAVE. The type must be unsigned long - int so that we use - - sub xsave_state_size_offset(%rip) %RSP_LP - - in _dl_runtime_resolve. */ - unsigned long int xsave_state_size; - /* The full state size for XSAVE when XSAVEC is disabled by - - GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable - */ - unsigned int xsave_state_full_size; - /* Data cache size for use in memory and string routines, typically - L1 size. */ - unsigned long int data_cache_size; - /* Shared cache size for use in memory and string routines, typically - L2 or L3 size. */ - unsigned long int shared_cache_size; - /* Threshold to use non temporal store. */ - unsigned long int non_temporal_threshold; }; -/* Used from outside of glibc to get access to the CPU features - structure. */ -extern const struct cpu_features *__get_cpu_features (void) +/* Get a pointer to the CPU features structure. */ +extern const struct cpu_features *__x86_get_cpu_features (void) __attribute__ ((const)); -/* Only used directly in cpu-features.c. */ -# define CPU_FEATURES_CPU_P(ptr, name) \ +#define CPU_FEATURES_CPU_P(ptr, name) \ ((ptr->cpuid[index_cpu_##name].reg_##name & (bit_cpu_##name)) != 0) -# define CPU_FEATURES_ARCH_P(ptr, name) \ - ((ptr->feature_##name[index_arch_##name] & (bit_arch_##name)) != 0) +#define CPU_FEATURES_ARCH_P(ptr, name) \ + ((ptr->usable_p[index_arch_##name] & (bit_arch_##name)) != 0) /* HAS_CPU_FEATURE evaluates to true if CPU supports the feature. */ #define HAS_CPU_FEATURE(name) \ - CPU_FEATURES_CPU_P (__get_cpu_features (), name) -/* HAS_ARCH_FEATURE evaluates to true if we may use the feature at - runtime. */ -# define HAS_ARCH_FEATURE(name) \ - CPU_FEATURES_ARCH_P (__get_cpu_features (), name) + CPU_FEATURES_CPU_P (__x86_get_cpu_features (), name) /* CPU_FEATURE_USABLE evaluates to true if the feature is usable. */ #define CPU_FEATURE_USABLE(name) \ - HAS_ARCH_FEATURE (name##_Usable) + CPU_FEATURES_ARCH_P (__x86_get_cpu_features (), name##_Usable) /* Architecture features. */ @@ -185,34 +149,6 @@ extern const struct cpu_features *__get_cpu_features (void) #define index_arch_AVX512_BF16_Usable USABLE_FEATURE_INDEX_1 #define index_arch_PKU_Usable USABLE_FEATURE_INDEX_1 -#define feature_AVX_Usable usable -#define feature_AVX2_Usable usable -#define feature_AVX512F_Usable usable -#define feature_AVX512CD_Usable usable -#define feature_AVX512ER_Usable usable -#define feature_AVX512PF_Usable usable -#define feature_AVX512VL_Usable usable -#define feature_AVX512BW_Usable usable -#define feature_AVX512DQ_Usable usable -#define feature_AVX512_4FMAPS_Usable usable -#define feature_AVX512_4VNNIW_Usable usable -#define feature_AVX512_BITALG_Usable usable -#define feature_AVX512_IFMA_Usable usable -#define feature_AVX512_VBMI_Usable usable -#define feature_AVX512_VBMI2_Usable usable -#define feature_AVX512_VNNI_Usable usable -#define feature_AVX512_VPOPCNTDQ_Usable usable -#define feature_FMA_Usable usable -#define feature_FMA4_Usable usable -#define feature_VAES_Usable usable -#define feature_VPCLMULQDQ_Usable usable -#define feature_XOP_Usable usable -#define feature_XSAVEC_Usable usable -#define feature_F16C_Usable usable -#define feature_AVX512_VP2INTERSECT_Usable usable -#define feature_AVX512_BF16_Usable usable -#define feature_PKU_Usable usable - /* CPU features. */ /* COMMON_CPUID_INDEX_1. */ @@ -761,88 +697,4 @@ extern const struct cpu_features *__get_cpu_features (void) /* EAX. */ #define reg_AVX512_BF16 eax -/* FEATURE_INDEX_2. */ -#define bit_arch_I586 (1u << 0) -#define bit_arch_I686 (1u << 1) -#define bit_arch_Fast_Rep_String (1u << 2) -#define bit_arch_Fast_Copy_Backward (1u << 3) -#define bit_arch_Fast_Unaligned_Load (1u << 4) -#define bit_arch_Fast_Unaligned_Copy (1u << 5) -#define bit_arch_Slow_BSF (1u << 6) -#define bit_arch_Slow_SSE4_2 (1u << 7) -#define bit_arch_AVX_Fast_Unaligned_Load (1u << 8) -#define bit_arch_Prefer_MAP_32BIT_EXEC (1u << 9) -#define bit_arch_Prefer_PMINUB_for_stringop (1u << 10) -#define bit_arch_Prefer_No_VZEROUPPER (1u << 11) -#define bit_arch_Prefer_ERMS (1u << 12) -#define bit_arch_Prefer_FSRM (1u << 13) -#define bit_arch_Prefer_No_AVX512 (1u << 14) -#define bit_arch_MathVec_Prefer_No_AVX512 (1u << 15) - -#define index_arch_Fast_Rep_String PREFERRED_FEATURE_INDEX_1 -#define index_arch_Fast_Copy_Backward PREFERRED_FEATURE_INDEX_1 -#define index_arch_Slow_BSF PREFERRED_FEATURE_INDEX_1 -#define index_arch_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1 -#define index_arch_Prefer_PMINUB_for_stringop PREFERRED_FEATURE_INDEX_1 -#define index_arch_Fast_Unaligned_Copy PREFERRED_FEATURE_INDEX_1 -#define index_arch_I586 PREFERRED_FEATURE_INDEX_1 -#define index_arch_I686 PREFERRED_FEATURE_INDEX_1 -#define index_arch_Slow_SSE4_2 PREFERRED_FEATURE_INDEX_1 -#define index_arch_AVX_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1 -#define index_arch_Prefer_MAP_32BIT_EXEC PREFERRED_FEATURE_INDEX_1 -#define index_arch_Prefer_No_VZEROUPPER PREFERRED_FEATURE_INDEX_1 -#define index_arch_Prefer_ERMS PREFERRED_FEATURE_INDEX_1 -#define index_arch_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1 -#define index_arch_MathVec_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1 -#define index_arch_Prefer_FSRM PREFERRED_FEATURE_INDEX_1 - -#define feature_Fast_Rep_String preferred -#define feature_Fast_Copy_Backward preferred -#define feature_Slow_BSF preferred -#define feature_Fast_Unaligned_Load preferred -#define feature_Prefer_PMINUB_for_stringop preferred -#define feature_Fast_Unaligned_Copy preferred -#define feature_I586 preferred -#define feature_I686 preferred -#define feature_Slow_SSE4_2 preferred -#define feature_AVX_Fast_Unaligned_Load preferred -#define feature_Prefer_MAP_32BIT_EXEC preferred -#define feature_Prefer_No_VZEROUPPER preferred -#define feature_Prefer_ERMS preferred -#define feature_Prefer_No_AVX512 preferred -#define feature_MathVec_Prefer_No_AVX512 preferred -#define feature_Prefer_FSRM preferred - -/* XCR0 Feature flags. */ -#define bit_XMM_state (1u << 1) -#define bit_YMM_state (1u << 2) -#define bit_Opmask_state (1u << 5) -#define bit_ZMM0_15_state (1u << 6) -#define bit_ZMM16_31_state (1u << 7) - -# if defined (_LIBC) && !IS_IN (nonlib) -/* Unused for x86. */ -# define INIT_ARCH() -# define __get_cpu_features() (&GLRO(dl_x86_cpu_features)) -# define x86_get_cpuid_registers(i) \ - (&(GLRO(dl_x86_cpu_features).cpuid[i])) -# endif - -#ifdef __x86_64__ -# define HAS_CPUID 1 -#elif (defined __i586__ || defined __pentium__ \ - || defined __geode__ || defined __k6__) -# define HAS_CPUID 1 -# define HAS_I586 1 -# define HAS_I686 HAS_ARCH_FEATURE (I686) -#elif defined __i486__ -# define HAS_CPUID 0 -# define HAS_I586 HAS_ARCH_FEATURE (I586) -# define HAS_I686 HAS_ARCH_FEATURE (I686) -#else -# define HAS_CPUID 1 -# define HAS_I586 1 -# define HAS_I686 1 -#endif - -#endif /* cpu_features_h */ +#endif /* _SYS_PLATFORM_X86_H */ diff --git a/sysdeps/x86/tst-get-cpu-features.c b/sysdeps/x86/tst-get-cpu-features.c index c60918cf00..c470bfb345 100644 --- a/sysdeps/x86/tst-get-cpu-features.c +++ b/sysdeps/x86/tst-get-cpu-features.c @@ -1,4 +1,4 @@ -/* Test case for x86 __get_cpu_features interface +/* Test case for __x86_get_cpu_features interface Copyright (C) 2015-2020 Free Software Foundation, Inc. This file is part of the GNU C Library. @@ -18,7 +18,7 @@ #include #include -#include +#include #include #define CHECK_CPU_FEATURE(name) \ @@ -45,7 +45,7 @@ static const char * const cpu_kinds[] = static int do_test (void) { - const struct cpu_features *cpu_features = __get_cpu_features (); + const struct cpu_features *cpu_features = __x86_get_cpu_features (); switch (cpu_features->basic.kind) { diff --git a/sysdeps/x86_64/fpu/math-tests-arch.h b/sysdeps/x86_64/fpu/math-tests-arch.h index 435ddad991..cc3c2b0c11 100644 --- a/sysdeps/x86_64/fpu/math-tests-arch.h +++ b/sysdeps/x86_64/fpu/math-tests-arch.h @@ -16,7 +16,7 @@ License along with the GNU C Library; if not, see . */ -#include +#include #if defined REQUIRE_AVX @@ -24,7 +24,7 @@ # define CHECK_ARCH_EXT \ do \ { \ - if (!HAS_ARCH_FEATURE (AVX_Usable)) return; \ + if (!CPU_FEATURE_USABLE (AVX)) return; \ } \ while (0) @@ -34,7 +34,7 @@ # define CHECK_ARCH_EXT \ do \ { \ - if (!HAS_ARCH_FEATURE (AVX2_Usable)) return; \ + if (!CPU_FEATURE_USABLE (AVX2)) return; \ } \ while (0) @@ -44,7 +44,7 @@ # define CHECK_ARCH_EXT \ do \ { \ - if (!HAS_ARCH_FEATURE (AVX512F_Usable)) return; \ + if (!CPU_FEATURE_USABLE (AVX512F)) return; \ } \ while (0) diff --git a/sysdeps/x86_64/multiarch/test-multiarch.c b/sysdeps/x86_64/multiarch/test-multiarch.c index 317373ceda..9feaf057e5 100644 --- a/sysdeps/x86_64/multiarch/test-multiarch.c +++ b/sysdeps/x86_64/multiarch/test-multiarch.c @@ -16,7 +16,7 @@ License along with the GNU C Library; if not, see . */ -#include +#include #include #include #include @@ -75,10 +75,10 @@ do_test (int argc, char **argv) int fails; get_cpuinfo (); - fails = check_proc ("avx", HAS_ARCH_FEATURE (AVX_Usable), - "HAS_ARCH_FEATURE (AVX_Usable)"); - fails += check_proc ("fma4", HAS_ARCH_FEATURE (FMA4_Usable), - "HAS_ARCH_FEATURE (FMA4_Usable)"); + fails = check_proc ("avx", CPU_FEATURE_USABLE (AVX), + "CPU_FEATURE_USABLE (AVX)"); + fails += check_proc ("fma4", CPU_FEATURE_USABLE (FMA4), + "CPU_FEATURE_USABLE (FMA4)"); fails += check_proc ("sse4_2", HAS_CPU_FEATURE (SSE4_2), "HAS_CPU_FEATURE (SSE4_2)"); fails += check_proc ("sse4_1", HAS_CPU_FEATURE (SSE4_1) -- 2.26.2