Message ID | 20181011182352.25394-1-adhemerval.zanella@linaro.org |
---|---|
State | Dropped |
Headers |
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Subject: [PATCH] x86: Fix Haswell strong flags (BZ#23709)
Date: Thu, 11 Oct 2018 15:23:52 -0300
Message-Id: <20181011182352.25394-1-adhemerval.zanella@linaro.org> |
Commit Message
Adhemerval Zanella Netto
Oct. 11, 2018, 6:23 p.m. UTC
The commit 'Disable TSX on some Haswell processors.' (2702856bf4) changed the default flags for Haswell models. Previously, new models were handled by the default switch path, which assumed a Core i3/i5/i7 if AVX is available. After that patch, Haswell models (0x3f, 0x3c, 0x45, 0x46) no longer set the flags Fast_Rep_String, Fast_Unaligned_Load, Fast_Unaligned_Copy, and Prefer_PMINUB_for_stringop (only the TSX one). This patch fixes it by disentangling the TSX flag handling from the memory optimization ones. The strstr case cited in the patch now selects __strstr_sse2_unaligned as expected for the Haswell CPU.

Checked on x86_64-linux-gnu.

	[BZ #23709]
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set TSX bits
	independently of other flags.
---
 ChangeLog                  | 6 ++++++
 sysdeps/x86/cpu-features.c | 6 ++++++
 2 files changed, 12 insertions(+)
Comments
* Adhemerval Zanella:

> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index f4e0f5a2ed..80b3054cf8 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -316,7 +316,13 @@ init_cpu_features (struct cpu_features *cpu_features)
>  			| bit_arch_Fast_Unaligned_Copy
>  			| bit_arch_Prefer_PMINUB_for_stringop);
>  	  break;
> +	}
>
> +      /* Disable TSX on some Haswell processors to avoid TSX on kernels that
> +	 weren't updated with the latest microcode package (which disables
> +	 broken feature by default).  */
> +      switch (model)
> +	{
>  	case 0x3f:
>  	  /* Xeon E7 v3 with stepping >= 4 has working TSX.  */
>  	  if (stepping >= 4)

I think the change is okay as posted.  It will need some testing in the field because the newly selected implementations could have unexpected performance drawbacks.

Thanks,
Florian
On 23/10/2018 06:55, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
>> index f4e0f5a2ed..80b3054cf8 100644
>> --- a/sysdeps/x86/cpu-features.c
>> +++ b/sysdeps/x86/cpu-features.c
>> @@ -316,7 +316,13 @@ init_cpu_features (struct cpu_features *cpu_features)
>>  			| bit_arch_Fast_Unaligned_Copy
>>  			| bit_arch_Prefer_PMINUB_for_stringop);
>>  	  break;
>> +	}
>>
>> +      /* Disable TSX on some Haswell processors to avoid TSX on kernels that
>> +	 weren't updated with the latest microcode package (which disables
>> +	 broken feature by default).  */
>> +      switch (model)
>> +	{
>>  	case 0x3f:
>>  	  /* Xeon E7 v3 with stepping >= 4 has working TSX.  */
>>  	  if (stepping >= 4)
>
> I think the change is okay as posted.  It will need some testing in the
> field because the newly selected implementations could have unexpected
> performance drawbacks.

This fix only changes ifunc selection for the Haswell chips [1]:

  Haswell (Client)  GT3E       0 0x6 0x4 0x6  Family 6 Model 70
                    ULT        0 0x6 0x4 0x5  Family 6 Model 69
                    S          0 0x6 0x3 0xC  Family 6 Model 60
  Haswell (Server)  E, EP, EX  0 0x6 0x3 0xF  Family 6 Model 63

Without this fix, on these chips the internal glibc flags bit_arch_Fast_Rep_String, bit_arch_Fast_Unaligned_Load, bit_arch_Fast_Unaligned_Copy, and bit_arch_Prefer_PMINUB_for_stringop are not set:

* bit_arch_Fast_Rep_String is only used in ifunc selection on i686 (32-bit), where it selects the *ssse3_rep* memcpy, memmove, bcopy, and mempcpy variants. If it is not set, the *ssse3* variants are used instead. I am not sure what the performance difference between them is, but my expectation is that ssse3_rep should be faster than ssse3.

* bit_arch_Fast_Unaligned_Load influences both i686 and x86_64. For x86_64 it influences the selection of the SSE2 unaligned optimized variants of stpncpy, strcpy, strncpy, stpcpy, strncat, strcat, and strstr. For all but strstr an ssse3 or sse2 variant is used instead. I am not sure what the performance difference between them is, but my expectation is that the unaligned versions should be faster.

* bit_arch_Fast_Unaligned_Copy influences mempcpy, memmove, and memcpy. If the chip does not have SSE3, the bit will select either an ERMS or an unaligned variant. For Haswell the *_avx_unaligned_erms variants will be selected, so this bit won't interfere with the best selection.

* bit_arch_Prefer_PMINUB_for_stringop is not used in ifunc selection.

[1] https://en.wikichip.org/wiki/intel/cpuid
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index f4e0f5a2ed..80b3054cf8 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -316,7 +316,13 @@ init_cpu_features (struct cpu_features *cpu_features)
 			| bit_arch_Fast_Unaligned_Copy
 			| bit_arch_Prefer_PMINUB_for_stringop);
 	  break;
+	}
 
+      /* Disable TSX on some Haswell processors to avoid TSX on kernels that
+	 weren't updated with the latest microcode package (which disables
+	 broken feature by default).  */
+      switch (model)
+	{
 	case 0x3f:
 	  /* Xeon E7 v3 with stepping >= 4 has working TSX.  */
 	  if (stepping >= 4)