Message ID | alpine.DEB.2.21.1809172305170.13111@digraph.polyomino.org.uk |
---|---|
State | New, archived |
Headers |
Received: (qmail 36888 invoked by alias); 17 Sep 2018 23:07:33 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 36522 invoked by uid 89); 17 Sep 2018 23:07:33 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.5 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=former, availability X-HELO: relay1.mentorg.com Date: Mon, 17 Sep 2018 23:07:26 +0000 From: Joseph Myers <joseph@codesourcery.com> To: <libc-alpha@sourceware.org> Subject: Update list of i686-class processors in sysdeps/x86/cpu-features.h Message-ID: <alpine.DEB.2.21.1809172305170.13111@digraph.polyomino.org.uk> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" |
Commit Message
Joseph Myers
Sept. 17, 2018, 11:07 p.m. UTC
I noticed that sysdeps/x86/cpu-features.h had conditionals on whether to define HAS_CPUID, HAS_I586 and HAS_I686 with a long list of preprocessor macros for i686-and-later processors which however was out of date. This patch adds more such macros based on the list in current GCC. It seems HAS_I586 and HAS_I686 are unused so the only effect of these macros being missing is that 32-bit glibc built for one of these processors would end up doing runtime detection of CPUID availability. Tested for x86. Question: would it be better to implement this conditional in a negative sense (!defined __i486__ && !defined __i586__ && !defined __geode__, based on what macros are in the fixed conditional, but maybe using __k6__ instead of __geode__ based on GCC's understanding of the options in question) to reduce the chances of it needing updating in future? (__i586__ is actually handled above.) Question: is the inclusion of __k6__ in the present conditional logically incorrect, and the omission of __geode__ logically incorrect, as regards whether to define HAVE_I686? The way GCC defines -march=k6 and -march=geode, it seems to think that the former excludes CMOV and the latter includes it. (-march=geode is specifically "AMD Geode embedded processor with MMX and 3DNow!@: instruction set support.".) 2018-09-17 Joseph Myers <joseph@codesourcery.com> * sysdeps/x86/cpu-features.h [__goldmont__ || __goldmont_plus__ || __tremont__ || __knm__ || __skylake__ || __skylake_avx512__ || __cannonlake__ || __icelake_client__ || __icelake_server__ || __znver1__]: Also count as i686 processors.
Comments
* Joseph Myers: > Question: would it be better to implement this conditional in a > negative sense (!defined __i486__ && !defined __i586__ && !defined > __geode__, based on what macros are in the fixed conditional, but > maybe using __k6__ instead of __geode__ based on GCC's understanding > of the options in question) to reduce the chances of it needing > updating in future? (__i586__ is actually handled above.) I think the GCC developers expect that we use the individual feature test macros (__SSE2__ etc.). The model-based macros are useless because CPU features are increasingly non-monotonic, and it does not make sense to replicate GCC's view of the Intel and AMD roadmaps in glibc. > Question: is the inclusion of __k6__ in the present conditional > logically incorrect, and the omission of __geode__ logically > incorrect, as regards whether to define HAVE_I686? The way GCC > defines -march=k6 and -march=geode, it seems to think that the former > excludes CMOV and the latter includes it. (-march=geode is > specifically "AMD Geode embedded processor with MMX and 3DNow!@: > instruction set support.".) Geode lacks support for long NOPs, but our i686 port requires them.
On Wed, 19 Sep 2018, Florian Weimer wrote: > * Joseph Myers: > > > Question: would it be better to implement this conditional in a > > negative sense (!defined __i486__ && !defined __i586__ && !defined > > __geode__, based on what macros are in the fixed conditional, but > > maybe using __k6__ instead of __geode__ based on GCC's understanding > > of the options in question) to reduce the chances of it needing > > updating in future? (__i586__ is actually handled above.) > > I think the GCC developers expect that we use the individual feature > test macros (__SSE2__ etc.). The model-based macros are useless Which don't exist for CMOV, for example. We have a general principle that it's best for compile-time selection of subarchitecture variants of functions to be based on how the compiler behaves rather than configuring for a variant host triplet or using --with-cpu. We could e.g. select SSE functions by default based on __SSE2__ (even if we don't do so at present). But testing whether the compiler produces a CMOV instruction for particular code seems more fragile than testing #if conditions for processors without CMOV. As I suggested in the discussion in bug 23630, I think it would make sense to eliminate, or at least minimize, the i586 / i686 / i786 sysdeps directories, to reduce the number of function variants to be maintained; maybe there are a few places where having conditionals in files on whether CMOV is supported would be awkward, but in many places either an i686-tuned version can be used as generic, or only small local conditionals for CMOV support are needed. E.g. sysdeps/i386/i686/pthread_spin_trylock.S defines HAVE_CMOV; that ought to be something available in a generic header, based on some way on what the compiler predefines. > > Question: is the inclusion of __k6__ in the present conditional > > logically incorrect, and the omission of __geode__ logically > > incorrect, as regards whether to define HAVE_I686? The way GCC > > defines -march=k6 and -march=geode, it seems to think that the former > > excludes CMOV and the latter includes it. (-march=geode is > > specifically "AMD Geode embedded processor with MMX and 3DNow!@: > > instruction set support.".) > > Geode lacks support for long NOPs, but our i686 port requires them. Requires them where? Through the use of -Wa,-mtune=i686 in sysdeps/i386/i686/Makefile, or also through explicit uses?
On Wed, Sep 19, 2018 at 5:39 AM, Joseph Myers <joseph@codesourcery.com> wrote: >> >> Geode lacks support for long NOPs, but our i686 port requires them. > > Requires them where? Through the use of -Wa,-mtune=i686 in > sysdeps/i386/i686/Makefile, or also through explicit uses? CET is allowed for i686, which uses long NOPs.
On Wed, 19 Sep 2018, H.J. Lu wrote: > On Wed, Sep 19, 2018 at 5:39 AM, Joseph Myers <joseph@codesourcery.com> wrote: > > >> > >> Geode lacks support for long NOPs, but our i686 port requires them. > > > > Requires them where? Through the use of -Wa,-mtune=i686 in > > sysdeps/i386/i686/Makefile, or also through explicit uses? > > CET is allowed for i686, which uses long NOPs. Isn't that just a matter of what's enabled by an explicit configure option, and so not relevant to conditionals that might be needed when merging / moving some code from i686 sysdeps directories to the generic i386 versions?
Joseph Myers <joseph@codesourcery.com> writes: > On Wed, 19 Sep 2018, Florian Weimer wrote: > >> * Joseph Myers: >> >> > Question: would it be better to implement this conditional in a >> > negative sense (!defined __i486__ && !defined __i586__ && !defined >> > __geode__, based on what macros are in the fixed conditional, but >> > maybe using __k6__ instead of __geode__ based on GCC's understanding >> > of the options in question) to reduce the chances of it needing >> > updating in future? (__i586__ is actually handled above.) >> >> I think the GCC developers expect that we use the individual feature >> test macros (__SSE2__ etc.). The model-based macros are useless > > Which don't exist for CMOV, for example. That's really unfortunate. > We have a general principle that it's best for compile-time selection of > subarchitecture variants of functions to be based on how the compiler > behaves rather than configuring for a variant host triplet or using > --with-cpu. We could e.g. select SSE functions by default based on > __SSE2__ (even if we don't do so at present). But testing whether the > compiler produces a CMOV instruction for particular code seems more > fragile than testing #if conditions for processors without CMOV. I agree about CMOV because it is extremely unlikely that we will see another useful x86 CPU that doesn't have CMOV. > As I suggested in the discussion in bug 23630, I think it would make sense > to eliminate, or at least minimize, the i586 / i686 / i786 sysdeps > directories, to reduce the number of function variants to be maintained; > maybe there are a few places where having conditionals in files on whether > CMOV is supported would be awkward, but in many places either an > i686-tuned version can be used as generic, or only small local > conditionals for CMOV support are needed. E.g. > sysdeps/i386/i686/pthread_spin_trylock.S defines HAVE_CMOV; that ought to > be something available in a generic header, based on some way on what the > compiler predefines. That particular function should probably be written in C, so that the compiler can produce a better branch-free sequence, without using CMOV. (I would be slightly surprised if the availability of CMOV would matter to assembler code tuned for generic x86.) >> > Question: is the inclusion of __k6__ in the present conditional >> > logically incorrect, and the omission of __geode__ logically >> > incorrect, as regards whether to define HAVE_I686? The way GCC >> > defines -march=k6 and -march=geode, it seems to think that the former >> > excludes CMOV and the latter includes it. (-march=geode is >> > specifically "AMD Geode embedded processor with MMX and 3DNow!@: >> > instruction set support.".) >> >> Geode lacks support for long NOPs, but our i686 port requires them. > > Requires them where? Through the use of -Wa,-mtune=i686 in > sysdeps/i386/i686/Makefile, or also through explicit uses? Hmm. I may have misremembered. The current port should be okay as long as CET isn't enabled. For a while, GAS generated long NOPs with -Wa,-mtune=i686, but I found this commit in Fedora: commit 55a30785d463ecd9cc1fa0e4b384b9f8b93a370a Author: Jeff Law <law@redhat.com> Date: Mon Sep 24 09:25:31 2012 -0600 - Remove most of fedora-nscd patch as we no longer use the old init files, but systemd instead. - Remove path-to-vi patch. With the usr-move changes that patch is totally unnecessary. - Remove i686-nopl patch. Gas was changed back in 2011 to avoid nopl. - Move gai-rfc1918 patch to submitted upstream status And that re-instantiated the -Wa,-mtune=i686 options which were patched out before. We also didn't receive any complaints that Fedora 27 did not run on AMD Geode, so glibc 2.26 at least was still okay. Thanks, Florian
diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h index d342664c64..e086b5f09e 100644 --- a/sysdeps/x86/cpu-features.h +++ b/sysdeps/x86/cpu-features.h @@ -268,12 +268,18 @@ extern const struct cpu_features *__get_cpu_features (void) || defined __core_avx2__ || defined __nehalem__ \ || defined __sandybridge__ || defined __haswell__ \ || defined __knl__ || defined __bonnell__ \ - || defined __silvermont__ \ + || defined __silvermont__ || defined __goldmont__ \ + || defined __goldmont_plus__ || defined __tremont__ \ + || defined __knm__ || defined __skylake__ \ + || defined __skylake_avx512__ || defined __cannonlake__ \ + || defined __icelake_client__ \ + || defined __icelake_server__ \ || defined __k6__ || defined __k8__ \ || defined __athlon__ || defined __amdfam10__ \ || defined __bdver1__ || defined __bdver2__ \ || defined __bdver3__ || defined __bdver4__ \ - || defined __btver1__ || defined __btver2__) + || defined __btver1__ || defined __btver2__ \ + || defined __znver1__) # define HAS_CPUID 1 # define HAS_I586 1 # define HAS_I686 1