From patchwork Fri Sep 1 17:40:34 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Lu, Hongjiu"
X-Patchwork-Id: 22504
Date: Fri, 1 Sep 2017 10:40:34 -0700
From: "H.J. Lu"
To: GNU C Library
Cc: Andrew Senkevich
Subject: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967]
Message-ID: <20170901174034.GA17543@gmail.com>
Reply-To: "H.J. Lu"

The AVX512 functions in mathvec are used on machines with AVX512.  An
AVX2 wrapper is also provided, and it can be used when the AVX512 version
isn't profitable.  MathVec_Prefer_No_AVX512 is added to cpu-features.
If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in the
GLIBC_TUNABLES environment variable, the AVX2 wrapper will be used.

Tested on x86-64 machines with and without AVX512.  Also verified
glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on an AVX512 machine.

Any comments?

H.J.
---
        [BZ #21967]
        * sysdeps/x86/cpu-features.h (bit_arch_MathVec_Prefer_No_AVX512):
        New.
        (index_arch_MathVec_Prefer_No_AVX512): Likewise.
        * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
        Handle MathVec_Prefer_No_AVX512.
---
 sysdeps/x86/cpu-features.h                          |  2 ++
 sysdeps/x86/cpu-tunables.c                          |  7 +++++++
 sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h | 13 ++++++++-----
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index 9e01781424..a032a2e168 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -40,6 +40,7 @@
 #define bit_arch_Use_dl_runtime_resolve_opt (1 << 20)
 #define bit_arch_Use_dl_runtime_resolve_slow (1 << 21)
 #define bit_arch_Prefer_No_AVX512 (1 << 22)
+#define bit_arch_MathVec_Prefer_No_AVX512 (1 << 23)
 
 /* CPUID Feature flags. */
 
@@ -239,6 +240,7 @@ extern const struct cpu_features *__get_cpu_features (void)
 # define index_arch_Use_dl_runtime_resolve_opt FEATURE_INDEX_1
 # define index_arch_Use_dl_runtime_resolve_slow FEATURE_INDEX_1
 # define index_arch_Prefer_No_AVX512 FEATURE_INDEX_1
+# define index_arch_MathVec_Prefer_No_AVX512 FEATURE_INDEX_1
 
 #endif /* !__ASSEMBLER__ */
 
diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
index 0ab708cca8..ec72d86f08 100644
--- a/sysdeps/x86/cpu-tunables.c
+++ b/sysdeps/x86/cpu-tunables.c
@@ -303,6 +303,13 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
                                  disable, 23);
             }
           break;
+        case 24:
+            {
+              CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH
+                (n, cpu_features, MathVec_Prefer_No_AVX512,
+                 AVX512F_Usable, disable, 24);
+            }
+          break;
         case 26:
             {
               CHECK_GLIBC_IFUNC_ARCH_NEED_CPU_BOTH
diff --git a/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h b/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
index 1857e1f760..fffc9da114 100644
--- a/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
+++ b/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
@@ -32,11 +32,14 @@ IFUNC_SELECTOR (void)
 {
   const struct cpu_features* cpu_features = __get_cpu_features ();
 
-  if (CPU_FEATURES_ARCH_P (cpu_features, AVX512DQ_Usable))
-    return OPTIMIZE (skx);
-
-  if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable))
-    return OPTIMIZE (knl);
+  if (!CPU_FEATURES_ARCH_P (cpu_features, MathVec_Prefer_No_AVX512))
+    {
+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512DQ_Usable))
+        return OPTIMIZE (skx);
+
+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable))
+        return OPTIMIZE (knl);
+    }
 
   return OPTIMIZE (avx2_wrapper);
 }
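
For readers who want to check the selection order outside glibc, the new
IFUNC_SELECTOR logic can be sketched as a standalone function.  This is
only an illustration under stated assumptions: struct mathvec_features,
enum mathvec_variant and select_mathvec_variant are hypothetical names
that stand in for glibc's cpu_features, the OPTIMIZE () variants and the
selector itself; they are not part of the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three bits the selector consults.
   The real glibc struct cpu_features looks different.  */
struct mathvec_features
{
  bool avx512f_usable;            /* models AVX512F_Usable */
  bool avx512dq_usable;           /* models AVX512DQ_Usable */
  bool mathvec_prefer_no_avx512;  /* models MathVec_Prefer_No_AVX512 */
};

/* Hypothetical names for the three variants the selector can return.  */
enum mathvec_variant
{
  VARIANT_AVX2_WRAPPER,
  VARIANT_KNL,
  VARIANT_SKX
};

/* Mirrors the order of checks in the patched selector: the AVX512
   variants are considered only when the preference bit is clear, and
   the AVX2 wrapper is the fallback in every other case.  */
static enum mathvec_variant
select_mathvec_variant (const struct mathvec_features *f)
{
  if (!f->mathvec_prefer_no_avx512)
    {
      if (f->avx512dq_usable)
        return VARIANT_SKX;

      if (f->avx512f_usable)
        return VARIANT_KNL;
    }

  return VARIANT_AVX2_WRAPPER;
}
```

With the preference bit set, the sketch returns the AVX2 wrapper even when
both AVX512 bits are usable, which is the behavior the commit message
describes for glibc.tune.hwcaps=MathVec_Prefer_No_AVX512.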