From patchwork Wed Apr 19 18:35:32 2017
X-Patchwork-Submitter: "Lu, Hongjiu"
X-Patchwork-Id: 20098
Delivered-To: mailing list libc-alpha@sourceware.org
Date: Wed, 19 Apr 2017 11:35:32 -0700
From: "H.J. Lu"
To: GNU C Library
Subject: [PATCH] [BZ #21391] x86: Set dl_platform and dl_hwcap from CPU features
Message-ID: <20170419183532.GA18407@intel.com>
Reply-To: "H.J. Lu"

dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very early
during startup. They are used by the dynamic linker to determine the
platform and to build an array of hardware capability names, which are
added to the search path when loading a shared object.

dl_platform and dl_hwcap are unused on x86-64. On i386, the i386, i486,
i586 and i686 platforms were supported and only the SSE2 capability was
used.

On x86, using AT_PLATFORM and AT_HWCAP to determine the platform and
processor capabilities is obsolete, since all of the information is
available in dl_x86_cpu_features. This patch sets dl_platform and dl_hwcap
from dl_x86_cpu_features in the dynamic linker.

On i386, the available platforms are changed to i586 and i686 since i386
has been deprecated.
On x86-64, the available platforms are haswell, which is for Haswell class
processors with BMI1, BMI2, LZCNT, MOVBE, POPCNT, AVX2 and FMA, and
xeon_phi, which is for Xeon Phi class processors with AVX512F, AVX512CD,
AVX512ER and AVX512PF. A capability, avx512_1, is also added to x86-64 for
the AVX512 ISAs: AVX512F, AVX512CD, AVX512BW, AVX512DQ and AVX512VL.

Any comments?

H.J.
---
	[BZ #21391]
	* sysdeps/i386/dl-machine.h (dl_platform_init) [IS_IN (rtld)]: Only
	call init_cpu_features.
	[!IS_IN (rtld)]: Only set GLRO(dl_platform) to NULL if needed.
	* sysdeps/x86_64/dl-machine.h (dl_platform_init): Likewise.
	* sysdeps/i386/dl-procinfo.h: Removed.
	* sysdeps/unix/sysv/linux/i386/dl-procinfo.h: Don't include nor .
	Include .
	(_dl_procinfo): Replace _DL_HWCAP_COUNT with 32.
	* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h [!IS_IN (ldconfig)]:
	Include instead of .
	* sysdeps/x86/cpu-features.c: Include .
	(init_cpu_features): Set dl_platform, dl_hwcap and dl_hwcap_mask.
	* sysdeps/x86/cpu-features.h (bit_cpu_LZCNT): New.
	(bit_cpu_MOVBE): Likewise.
	(bit_cpu_BMI1): Likewise.
	(bit_cpu_BMI2): Likewise.
	(index_cpu_BMI1): Likewise.
	(index_cpu_BMI2): Likewise.
	(index_cpu_LZCNT): Likewise.
	(index_cpu_MOVBE): Likewise.
	(index_cpu_POPCNT): Likewise.
	(reg_BMI1): Likewise.
	(reg_BMI2): Likewise.
	(reg_LZCNT): Likewise.
	(reg_MOVBE): Likewise.
	(reg_POPCNT): Likewise.
	* sysdeps/x86/dl-hwcap.h: New file.
	* sysdeps/x86/dl-procinfo.h: Likewise.
	* sysdeps/x86/dl-procinfo.c (_dl_x86_hwcap_flags): New.
	(_dl_x86_platforms): Likewise.
---
 sysdeps/i386/dl-machine.h                    |  10 +--
 sysdeps/i386/dl-procinfo.c                   |  21 +-----
 sysdeps/i386/dl-procinfo.h                   | 102 ---------------------------
 sysdeps/unix/sysv/linux/i386/dl-procinfo.h   |   6 +-
 sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h |   2 +-
 sysdeps/x86/cpu-features.c                   |  48 +++++++++++++
 sysdeps/x86/cpu-features.h                   |  15 ++++
 sysdeps/x86/dl-hwcap.h                       |  75 ++++++++++++++++++++
 sysdeps/x86/dl-procinfo.c                    |  38 +++++++++-
 sysdeps/x86/dl-procinfo.h                    |  48 +++++++++++++
 sysdeps/x86_64/dl-machine.h                  |  10 +--
 11 files changed, 237 insertions(+), 138 deletions(-)
 delete mode 100644 sysdeps/i386/dl-procinfo.h
 create mode 100644 sysdeps/x86/dl-hwcap.h
 create mode 100644 sysdeps/x86/dl-procinfo.h

diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
index 99a72f6..57d4a0b 100644
--- a/sysdeps/i386/dl-machine.h
+++ b/sysdeps/i386/dl-machine.h
@@ -233,14 +233,14 @@ _dl_start_user:\n\
 static inline void __attribute__ ((unused))
 dl_platform_init (void)
 {
-  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
-    /* Avoid an empty string which would disturb us. */
-    GLRO(dl_platform) = NULL;
-
-#ifdef SHARED
+#if IS_IN (rtld)
   /* init_cpu_features has been called early from __libc_start_main in
      static executable. */
   init_cpu_features (&GLRO(dl_x86_cpu_features));
+#else
+  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
+    /* Avoid an empty string which would disturb us. */
+    GLRO(dl_platform) = NULL;
 #endif
 }
 
diff --git a/sysdeps/i386/dl-procinfo.c b/sysdeps/i386/dl-procinfo.c
index b832830..7237f77 100644
--- a/sysdeps/i386/dl-procinfo.c
+++ b/sysdeps/i386/dl-procinfo.c
@@ -17,10 +17,7 @@
    License along with the GNU C Library; if not, see
    . */
 
-/* This information must be kept in sync with the _DL_HWCAP_COUNT and
-   _DL_PLATFORM_COUNT definitions in procinfo.h.
-
-   If anything should be added here check whether the size of each string
+/* If anything should be added here check whether the size of each string
    is still ok with the given array size.
 
    All the #ifdefs in the definitions are quite irritating but
@@ -64,21 +61,5 @@ PROCINFO_CLASS const char _dl_x86_cap_flags[32][8]
 ,
 #endif
 
-#if !defined PROCINFO_DECL && defined SHARED
-  ._dl_x86_platforms
-#else
-PROCINFO_CLASS const char _dl_x86_platforms[4][5]
-#endif
-#ifndef PROCINFO_DECL
-= {
-    "i386", "i486", "i586", "i686"
-  }
-#endif
-#if !defined SHARED || defined PROCINFO_DECL
-;
-#else
-,
-#endif
-
 #undef PROCINFO_DECL
 #undef PROCINFO_CLASS
diff --git a/sysdeps/i386/dl-procinfo.h b/sysdeps/i386/dl-procinfo.h
deleted file mode 100644
index 9c38846..0000000
--- a/sysdeps/i386/dl-procinfo.h
+++ /dev/null
@@ -1,102 +0,0 @@
-/* i386 version of processor capability information handling macros.
-   Copyright (C) 1998-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper , 1998.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   . */
-
-#ifndef _DL_PROCINFO_H
-#define _DL_PROCINFO_H 1
-#include
-
-#define _DL_HWCAP_COUNT 32
-
-#define _DL_PLATFORMS_COUNT 4
-
-/* Start at 48 to reserve some space. */
-#define _DL_FIRST_PLATFORM 48
-/* Mask to filter out platforms. */
-#define _DL_HWCAP_PLATFORM (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
-                            << _DL_FIRST_PLATFORM)
-
-enum
-{
-  HWCAP_I386_FPU = 1 << 0,
-  HWCAP_I386_VME = 1 << 1,
-  HWCAP_I386_DE = 1 << 2,
-  HWCAP_I386_PSE = 1 << 3,
-  HWCAP_I386_TSC = 1 << 4,
-  HWCAP_I386_MSR = 1 << 5,
-  HWCAP_I386_PAE = 1 << 6,
-  HWCAP_I386_MCE = 1 << 7,
-  HWCAP_I386_CX8 = 1 << 8,
-  HWCAP_I386_APIC = 1 << 9,
-  HWCAP_I386_SEP = 1 << 11,
-  HWCAP_I386_MTRR = 1 << 12,
-  HWCAP_I386_PGE = 1 << 13,
-  HWCAP_I386_MCA = 1 << 14,
-  HWCAP_I386_CMOV = 1 << 15,
-  HWCAP_I386_FCMOV = 1 << 16,
-  HWCAP_I386_MMX = 1 << 23,
-  HWCAP_I386_OSFXSR = 1 << 24,
-  HWCAP_I386_XMM = 1 << 25,
-  HWCAP_I386_XMM2 = 1 << 26,
-  HWCAP_I386_AMD3D = 1 << 31,
-
-  /* XXX Which others to add here? */
-  HWCAP_IMPORTANT = (HWCAP_I386_XMM2)
-
-};
-
-/* We cannot provide a general printing function. */
-#define _dl_procinfo(type, word) -1
-
-static inline const char *
-__attribute__ ((unused))
-_dl_hwcap_string (int idx)
-{
-  return GLRO(dl_x86_cap_flags)[idx];
-};
-
-static inline int
-__attribute__ ((unused, always_inline))
-_dl_string_hwcap (const char *str)
-{
-  int i;
-
-  for (i = 0; i < _DL_HWCAP_COUNT; i++)
-    {
-      if (strcmp (str, GLRO(dl_x86_cap_flags)[i]) == 0)
-        return i;
-    }
-  return -1;
-};
-
-static inline int
-__attribute__ ((unused, always_inline))
-_dl_string_platform (const char *str)
-{
-  int i;
-
-  if (str != NULL)
-    for (i = 0; i < _DL_PLATFORMS_COUNT; ++i)
-      {
-        if (strcmp (str, GLRO(dl_x86_platforms)[i]) == 0)
-          return _DL_FIRST_PLATFORM + i;
-      }
-  return -1;
-};
-
-#endif /* dl-procinfo.h */
diff --git a/sysdeps/unix/sysv/linux/i386/dl-procinfo.h b/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
index d49638c..a3a5f9d 100644
--- a/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
+++ b/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
@@ -17,9 +17,7 @@
    License along with the GNU C Library; if not, see
    . */
 
-#include
-#include
-
+#include
 
 #undef _dl_procinfo
 static inline int
@@ -36,7 +34,7 @@ _dl_procinfo (unsigned int type, unsigned long int word)
 
   _dl_printf ("AT_HWCAP: ");
 
-  for (i = 0; i < _DL_HWCAP_COUNT; ++i)
+  for (i = 0; i < 32; ++i)
     if (word & (1 << i))
       _dl_printf (" %s", GLRO(dl_x86_cap_flags)[i]);
 
diff --git a/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h b/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
index 7829e1c..7b45fe4 100644
--- a/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
+++ b/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
@@ -1,5 +1,5 @@
 #if IS_IN (ldconfig)
 # include
 #else
-# include
+# include
 #endif
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index f30918d..b481f50 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -18,6 +18,7 @@
 
 #include
 #include
+#include
 
 static void
 get_common_indeces (struct cpu_features *cpu_features,
@@ -310,4 +311,51 @@ no_cpuid:
   cpu_features->family = family;
   cpu_features->model = model;
   cpu_features->kind = kind;
+
+#if IS_IN (rtld)
+  /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86. */
+  GLRO(dl_platform) = NULL;
+  GLRO(dl_hwcap) = 0;
+  GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT;
+
+# ifdef __x86_64__
+  if (cpu_features->kind == arch_kind_intel)
+    {
+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable)
+          && CPU_FEATURES_CPU_P (cpu_features, AVX512CD))
+        {
+          if (CPU_FEATURES_CPU_P (cpu_features, AVX512ER))
+            {
+              if (CPU_FEATURES_CPU_P (cpu_features, AVX512PF))
+                GLRO(dl_platform) = "xeon_phi";
+            }
+          else
+            {
+              if (CPU_FEATURES_CPU_P (cpu_features, AVX512BW)
+                  && CPU_FEATURES_CPU_P (cpu_features, AVX512DQ)
+                  && CPU_FEATURES_CPU_P (cpu_features, AVX512VL))
+                GLRO(dl_hwcap) |= HWCAP_X86_AVX512_1;
+            }
+        }
+
+      if (GLRO(dl_platform) == NULL
+          && CPU_FEATURES_ARCH_P (cpu_features, AVX2_Usable)
+          && CPU_FEATURES_ARCH_P (cpu_features, FMA_Usable)
+          && CPU_FEATURES_CPU_P (cpu_features, BMI1)
+          && CPU_FEATURES_CPU_P (cpu_features, BMI2)
+          && CPU_FEATURES_CPU_P (cpu_features, LZCNT)
+          && CPU_FEATURES_CPU_P (cpu_features, MOVBE)
+          && CPU_FEATURES_CPU_P (cpu_features, POPCNT))
+        GLRO(dl_platform) = "haswell";
+    }
+# else
+  if (CPU_FEATURES_CPU_P (cpu_features, SSE2))
+    GLRO(dl_hwcap) |= HWCAP_X86_SSE2;
+
+  if (CPU_FEATURES_ARCH_P (cpu_features, I686))
+    GLRO(dl_platform) = "i686";
+  else if (CPU_FEATURES_ARCH_P (cpu_features, I586))
+    GLRO(dl_platform) = "i586";
+# endif
+#endif
 }
diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index 85a39e7..31c7c80 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -57,8 +57,13 @@
 #define bit_cpu_FMA     (1 << 12)
 #define bit_cpu_FMA4    (1 << 16)
 #define bit_cpu_HTT     (1 << 28)
+#define bit_cpu_LZCNT   (1 << 5)
+#define bit_cpu_MOVBE   (1 << 22)
+#define bit_cpu_POPCNT  (1 << 23)
 
 /* COMMON_CPUID_INDEX_7. */
+#define bit_cpu_BMI1    (1 << 3)
+#define bit_cpu_BMI2    (1 << 8)
 #define bit_cpu_ERMS    (1 << 9)
 #define bit_cpu_RTM     (1 << 11)
 #define bit_cpu_AVX2    (1 << 5)
@@ -258,6 +263,11 @@ extern const struct cpu_features *__get_cpu_features (void)
 # define index_cpu_POPCOUNT COMMON_CPUID_INDEX_1
 # define index_cpu_OSXSAVE  COMMON_CPUID_INDEX_1
 # define index_cpu_HTT      COMMON_CPUID_INDEX_1
+# define index_cpu_BMI1     COMMON_CPUID_INDEX_7
+# define index_cpu_BMI2     COMMON_CPUID_INDEX_7
+# define index_cpu_LZCNT    COMMON_CPUID_INDEX_1
+# define index_cpu_MOVBE    COMMON_CPUID_INDEX_1
+# define index_cpu_POPCNT   COMMON_CPUID_INDEX_1
 
 # define reg_CX8            edx
 # define reg_CMOV           edx
@@ -282,6 +292,11 @@ extern const struct cpu_features *__get_cpu_features (void)
 # define reg_POPCOUNT       ecx
 # define reg_OSXSAVE        ecx
 # define reg_HTT            edx
+# define reg_BMI1           ebx
+# define reg_BMI2           ebx
+# define reg_LZCNT          ecx
+# define reg_MOVBE          ecx
+# define reg_POPCNT         ecx
 
 # define index_arch_Fast_Rep_String     FEATURE_INDEX_1
 # define index_arch_Fast_Copy_Backward  FEATURE_INDEX_1
diff --git a/sysdeps/x86/dl-hwcap.h b/sysdeps/x86/dl-hwcap.h
new file mode 100644
index 0000000..c956684
--- /dev/null
+++ b/sysdeps/x86/dl-hwcap.h
@@ -0,0 +1,75 @@
+/* x86 version of hardware capability information handling macros.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _DL_HWCAP_H
+#define _DL_HWCAP_H
+
+#if IS_IN (ldconfig)
+/* Since ldconfig processes both i386 and x86-64 libraries, it needs
+   to cover all platforms and hardware capabilities. */
+# define HWCAP_PLATFORMS_START 0
+# define HWCAP_PLATFORMS_COUNT 4
+# define HWCAP_START 0
+# define HWCAP_COUNT 2
+# define HWCAP_IMPORTANT (HWCAP_X86_SSE2 | HWCAP_X86_AVX512_1)
+#elif defined __x86_64__
+/* For 64 bit, only cover x86-64 platforms and capabilities. */
+# define HWCAP_PLATFORMS_START 2
+# define HWCAP_PLATFORMS_COUNT 4
+# define HWCAP_START 1
+# define HWCAP_COUNT 2
+# define HWCAP_IMPORTANT (HWCAP_X86_AVX512_1)
+#else
+/* For 32 bit, only cover i586, i686 and SSE2. */
+# define HWCAP_PLATFORMS_START 0
+# define HWCAP_PLATFORMS_COUNT 2
+# define HWCAP_START 0
+# define HWCAP_COUNT 1
+# define HWCAP_IMPORTANT (HWCAP_X86_SSE2)
+#endif
+
+enum
+{
+  HWCAP_X86_SSE2 = 1 << 0,
+  HWCAP_X86_AVX512_1 = 1 << 1
+};
+
+static inline const char *
+__attribute__ ((unused))
+_dl_hwcap_string (int idx)
+{
+  return GLRO(dl_x86_hwcap_flags)[idx];
+};
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_hwcap (const char *str)
+{
+  int i;
+
+  for (i = HWCAP_START; i < HWCAP_COUNT; i++)
+    {
+      if (strcmp (str, GLRO(dl_x86_hwcap_flags)[i]) == 0)
+        return i;
+    }
+  return -1;
+};
+
+/* We cannot provide a general printing function. */
+#define _dl_procinfo(type, word) -1
+
+#endif /* dl-hwcap.h */
diff --git a/sysdeps/x86/dl-procinfo.c b/sysdeps/x86/dl-procinfo.c
index 9d154bf..43ab8fe 100644
--- a/sysdeps/x86/dl-procinfo.c
+++ b/sysdeps/x86/dl-procinfo.c
@@ -16,7 +16,11 @@
    License along with the GNU C Library; if not, see
    . */
 
-/* If anything should be added here check whether the size of each string
+/* This information must be kept in sync with the _DL_HWCAP_COUNT,
+   HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT definitions in
+   dl-hwcap.h.
+
+   If anything should be added here check whether the size of each string
    is still ok with the given array size.
 
    All the #ifdefs in the definitions are quite irritating but
@@ -50,3 +54,35 @@ PROCINFO_CLASS struct cpu_features _dl_x86_cpu_features
 ,
 # endif
 #endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_x86_hwcap_flags
+#else
+PROCINFO_CLASS const char _dl_x86_hwcap_flags[2][9]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    "sse2", "avx512_1"
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_x86_platforms
+#else
+PROCINFO_CLASS const char _dl_x86_platforms[4][9]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    "i586", "i686", "haswell", "xeon_phi"
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
diff --git a/sysdeps/x86/dl-procinfo.h b/sysdeps/x86/dl-procinfo.h
new file mode 100644
index 0000000..5feb146
--- /dev/null
+++ b/sysdeps/x86/dl-procinfo.h
@@ -0,0 +1,48 @@
+/* x86 version of processor capability information handling macros.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H 1
+#include
+#include
+
+#define _DL_HWCAP_COUNT HWCAP_COUNT
+#define _DL_PLATFORMS_COUNT HWCAP_PLATFORMS_COUNT
+
+/* Start at 48 to reserve spaces for hardware capabilities. */
+#define _DL_FIRST_PLATFORM 48
+/* Mask to filter out platforms. */
+#define _DL_HWCAP_PLATFORM (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
+                            << _DL_FIRST_PLATFORM)
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_platform (const char *str)
+{
+  int i;
+
+  if (str != NULL)
+    for (i = HWCAP_PLATFORMS_START; i < HWCAP_PLATFORMS_COUNT; ++i)
+      {
+        if (strcmp (str, GLRO(dl_x86_platforms)[i]) == 0)
+          return _DL_FIRST_PLATFORM + i;
+      }
+  return -1;
+};
+
+#endif /* dl-procinfo.h */
diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index daf4d8c..0015db4 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -240,14 +240,14 @@ _dl_start_user:\n\
 static inline void __attribute__ ((unused))
 dl_platform_init (void)
 {
-  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
-    /* Avoid an empty string which would disturb us. */
-    GLRO(dl_platform) = NULL;
-
-#ifdef SHARED
+#if IS_IN (rtld)
   /* init_cpu_features has been called early from __libc_start_main in
      static executable. */
   init_cpu_features (&GLRO(dl_x86_cpu_features));
+#else
+  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
+    /* Avoid an empty string which would disturb us. */
+    GLRO(dl_platform) = NULL;
 #endif
 }
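
As a reading aid for the x86-64 branch that this patch adds to
init_cpu_features, the order of the checks can be sketched as a standalone
function. This is only a sketch under stated assumptions: struct features,
its field names and select_platform are hypothetical stand-ins for
struct cpu_features and the CPU_FEATURES_CPU_P / CPU_FEATURES_ARCH_P
checks, and are not part of glibc. Only the HWCAP_X86_* values and the
platform strings are taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpu_features: one flag per check
   made in the x86-64 branch of init_cpu_features. */
struct features
{
  bool is_intel;
  bool avx512f_usable, avx512cd, avx512er, avx512pf;
  bool avx512bw, avx512dq, avx512vl;
  bool avx2_usable, fma_usable;
  bool bmi1, bmi2, lzcnt, movbe, popcnt;
};

/* Same values as the enum in sysdeps/x86/dl-hwcap.h. */
enum
{
  HWCAP_X86_SSE2 = 1 << 0,
  HWCAP_X86_AVX512_1 = 1 << 1
};

/* Return the platform name (or NULL) and store the capability bits in
   *HWCAP, following the order of checks used in the patch. */
static const char *
select_platform (const struct features *f, unsigned long int *hwcap)
{
  const char *platform = NULL;
  *hwcap = 0;

  /* The patch only classifies Intel processors on x86-64. */
  if (!f->is_intel)
    return NULL;

  if (f->avx512f_usable && f->avx512cd)
    {
      if (f->avx512er)
        {
          /* Xeon Phi class: AVX512F, AVX512CD, AVX512ER, AVX512PF. */
          if (f->avx512pf)
            platform = "xeon_phi";
        }
      else if (f->avx512bw && f->avx512dq && f->avx512vl)
        /* Capability bit only; the platform is decided below. */
        *hwcap |= HWCAP_X86_AVX512_1;
    }

  /* Haswell class is the fallback when xeon_phi was not selected. */
  if (platform == NULL
      && f->avx2_usable && f->fma_usable
      && f->bmi1 && f->bmi2 && f->lzcnt && f->movbe && f->popcnt)
    platform = "haswell";

  return platform;
}
```

With these assumptions, a processor that reports AVX512F, AVX512CD,
AVX512BW, AVX512DQ and AVX512VL together with the Haswell feature set gets
the haswell platform plus the avx512_1 capability, while one that reports
AVX512ER and AVX512PF gets xeon_phi and no capability bits.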