Message ID | 20200317044646.29707-1-PMallappa@amd.com |
---|---|
Headers | From: Prem Mallappa <prem.mallappa@gmail.com> (original From: Prem Mallappa <PMallappa@amd.com>) To: libc-alpha@sourceware.org, codonell@redhat.com, schwab@suse.com, FWeimer@redhat.com Cc: Prem Mallappa <Premachandra.Mallappa@amd.com> Subject: [PATCH 0/3] RFC: Platform Support for AMD Zen and AVX2/AVX Date: Tue, 17 Mar 2020 10:16:43 +0530 Message-Id: <20200317044646.29707-1-PMallappa@amd.com> |
Series | RFC: Platform Support for AMD Zen and AVX2/AVX |
Message
Prem Mallappa
March 17, 2020, 4:46 a.m. UTC
From: Prem Mallappa <Premachandra.Mallappa@amd.com>

Hello Glibc Community,

== (cross-posting to libc-alpha, apologies for the spam) ==

This is in response to:

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=24979
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=24080
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=23249

It is clear that there is no panacea here. However, here is an attempt to address them in parts.

From [1]: enable customers who already have "haswell" libraries, and have seen performance benefits from them, to load those libraries on AMD Zen. (Libraries are loaded by placing them in LD_LIBRARY_PATH/zen, or via a symbolic link zen->haswell.)

From [2] and [3]: future generic-avx2/generic-avx libraries would let OS vendors supply one optimized set. haswell/zen are really supersets of these, hence keeping them made sense.

With this we would like to open the naming up for discussion: haswell/zen could instead be intel/amd (or any other names), with ifunc-based loading supplied internally.

Prem Mallappa (3):
  x86: Refactor platform support in cpu_features
  x86: Add AMD Zen and AVX2/AVX platform support
  x86: test to load from PLATFORM path

 sysdeps/x86/cpu-features.c | 113 ++++++++++++++++++++++---------------
 sysdeps/x86_64/Makefile    |   3 +-
 2 files changed, 69 insertions(+), 47 deletions(-)
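The zen->haswell loading described in the cover letter can be sketched as a shell session. The directory and library names below are hypothetical examples; the mechanism is simply a platform subdirectory (or a symlink to one) on the loader's library search path, as the patch set proposes:

```shell
# Sketch of the proposed layout (paths and library names are made up).
# A vendor that already ships Haswell-tuned libraries under
# $LIBDIR/haswell could expose them on AMD Zen machines by a symlink.
LIBDIR=$(mktemp -d)/libs            # stand-in for a real library dir
mkdir -p "$LIBDIR/haswell"
touch "$LIBDIR/haswell/libfoo.so"   # placeholder optimized library

# Symbolic link zen -> haswell, so a platform search for "zen"
# finds the Haswell-tuned builds.
ln -s haswell "$LIBDIR/zen"

# The application would then be run with the parent directory on the
# search path; the loader appends the platform name itself:
#   LD_LIBRARY_PATH="$LIBDIR" ./app
ls -l "$LIBDIR/zen/libfoo.so"
```

The symlink keeps a single on-disk copy of the libraries while satisfying either platform name the loader probes.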
Comments
* Prem Mallappa via Libc-alpha:

> [...]
> By this we would like to open it up for discussion
> The haswell/zen can be intel/amd
> (or any other name, and supply ifunc based loading
> internally)

I think we cannot use the platform subdirectory for that because there is just a single one. If we want an Intel/AMD split, we need to enhance the dynamic loader to try the CPU vendor directory first, and then fall back to a shared subdirectory. Most distributions do not want to test and ship binaries specific to Intel or AMD CPUs.

That's a generic loader change which will need some time to implement, but we can work on something else in the meantime:

We need to check for *all* relevant CPU flags such code can use, and only enable a subdirectory if they are present. This is necessary because virtualization and microcode updates can disable individual CPU features.

For the new shared subdirectory, I think we should not restrict ourselves just to AVX2, but we should also include useful extensions that are in practice always implemented in silicon along with AVX2, but can be separately tweaked.
This seems to be a reasonable list of CPU feature flags to start with:

3DNOW, 3DNOWEXT, 3DNOWPREFETCH, ABM, ADX, AES, AVX, AVX2, BMI, BMI2, CET, CLFLUSH, CLFLUSHOPT, CLWB, CLZERO, CMPXCHG16B, ERMS, F16C, FMA, FMA4, FSGSBASE, FSRM, FXSR, HLE, LAHF, LZCNT, MOVBE, MWAITX, PCLMUL, PCOMMIT, PKU, POPCNT, PREFETCHW, RDPID, RDRAND, RDSEED, RDTSCP, RTM, SHA, SSE3, SSE4.1, SSE4.2, SSE4A, SSSE3, TSC, XGETBV, XSAVE, XSAVEC, XSAVEOPT, XSAVES

You (as in AMD) need to go through this list and come back with the subset that you think should be enabled for current and future CPUs, based on your internal roadmap and known errata for existing CPUs. We do not need a rationale for how you filter down the list, merely the outcome. (I already have the trimmed-down list from Intel.)
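The "only enable a subdirectory if all required flags are present" rule can be checked mechanically, e.g. against the `flags` line of /proc/cpuinfo. The following is an illustrative sketch, not the loader's actual implementation (ld.so reads CPUID directly); the flag names follow the lowercase /proc/cpuinfo convention, and the sample flag string is made up so the result is deterministic:

```shell
# Succeed only if every flag in $1 appears in the flag set $2.
# (Kernel flag names are lowercase: avx2, bmi2, fma, ...)
have_all_flags() {
    needed=$1
    present=" $2 "
    for f in $needed; do
        case "$present" in
            *" $f "*) ;;     # found, keep checking
            *) return 1 ;;   # one missing flag disables the directory
        esac
    done
    return 0
}

# On a real system one would read the flags from the kernel:
#   cpu_flags=$(awk -F: '/^flags/ {print $2; exit}' /proc/cpuinfo)
# A made-up sample is used here for demonstration.
cpu_flags="fpu sse3 ssse3 sse4_1 sse4_2 avx avx2 bmi1 bmi2 fma movbe"

if have_all_flags "avx2 bmi2 fma movbe" "$cpu_flags"; then
    echo "avx2 subdirectory: enabled"
else
    echo "avx2 subdirectory: disabled"
fi
```

The all-or-nothing check is what protects against virtualization and microcode updates: if a hypervisor masks even one flag the optimized subdirectory is simply skipped.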
On 3/17/20 5:02 AM, Florian Weimer wrote:
> I think we cannot use the platform subdirectory for that because there
> is just a single one. If we want an Intel/AMD split, we need to
> enhance the dynamic loader to try the CPU vendor directory first, and
> then fall back to a shared subdirectory. Most distributions do not
> want to test and ship binaries specific to Intel or AMD CPUs.

I agree. The additional burden of testing, maintaining, and supporting distinct libraries is not feasible.

> That's a generic loader change which will need some time to implement,
> but we can work on something else in the meantime:
>
> We need to check for *all* relevant CPU flags such code can use, and
> only enable a subdirectory if they are present. This is necessary
> because virtualization and microcode updates can disable individual
> CPU features.

Agreed. This is the only sensible plan. The platform directories already imply some of this, but it's not well structured.

> For the new shared subdirectory, I think we should not restrict
> ourselves just to AVX2, but we should also include useful extensions
> that are in practice always implemented in silicon along with AVX2,
> but can be separately tweaked.

Agreed.

> This seems to be a reasonable list of CPU feature flags to start with:
> [...]
>
> You (as in AMD) need to go through this list and come back with the
> subset that you think should be enabled for current and future CPUs,
> based on your internal roadmap and known errata for existing CPUs. We
> do not need a rationale for how you filter down the list, merely the
> outcome.

And this is the hard part that we can't solve without AMD's help.

Even if you ignore "future CPUs", it would be useful to get this list for all currently released CPUs, given your architectural knowledge, errata, and other factors such as microcode.

> (I already have the trimmed-down list from Intel.)
On 17/03/2020 10:17, Carlos O'Donell via Libc-alpha wrote:
> On 3/17/20 5:02 AM, Florian Weimer wrote:
>> We need to check for *all* relevant CPU flags such code can use, and
>> only enable a subdirectory if they are present. This is necessary
>> because virtualization and microcode updates can disable individual
>> CPU features.
>
> Agreed. This is the only sensible plan. The platform directories already
> imply some of this, but it's not well structured.

What should our policy be regarding platform names across releases? Should names set in a previous release be supported in a backward-compatible manner, or should they be unconstrained (as for tunables) and subject to change?

If the former, with a defined subset of the CPU feature flags it might be possible to share a folder across x86 chips without using chip release names.

> And this is the hard part that we can't solve without AMD's help.
>
> Even if you ignore "future CPUs", it would be useful to get this list
> for all currently released CPUs, given your architectural knowledge,
> errata, and other factors such as microcode.

So the question is how we should move forward: let each chip vendor (Intel, AMD, etc.) define its own naming scheme based on its own chip roadmap, or create subsets of features that define a common folder naming (as suggested by Richard Biener in BZ#24080 [1])?

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=24080
On 3/17/20 3:27 PM, Adhemerval Zanella via Libc-alpha wrote:
> What should our policy be regarding platform names across releases?
> Should names set in a previous release be supported in a
> backward-compatible manner, or should they be unconstrained (as for
> tunables) and subject to change?

It should be subject to change, just like tunables. It should be an optimization, not a requirement, and applications should always provide a fallback implementation so the application can load. Failure to load the libraries *may* result in a failure to start the application, and we need to be OK with that. That is to say, a particular package may ship only an optimized library (going against the recommendation).

We should verify that downstream distributions can use /etc/ld.so.conf as a way to add directories back into the search alongside the existing additional multilib search directories, e.g. add back /lib64/haswell for a few years.

Does that answer your question?

> If the former, with a defined subset of the CPU feature flags it might
> be possible to share a folder across x86 chips without using chip
> release names.

Yes.

> So the question is how we should move forward: let each chip vendor
> (Intel, AMD, etc.) define its own naming scheme based on its own chip
> roadmap, or create subsets of features that define a common folder
> naming (as suggested by Richard Biener in BZ#24080 [1])?

In the end I think we'll want:

(a) Try CPU vendor directories first.
    - Each vendor should name their directories and state the explicit
      compiler options to target them (printed by LD_DEBUG).

(b) Try shared directories second.
    - Based on a common set of identified features.
    - Compiler options to target the shared set should be explicitly
      stated (printed by LD_DEBUG).

My understanding is that Florian is asking for help with (b): identifying what should be enabled for current CPUs. We'll then compare that list against the Intel list and make a common shared directory that downstream distributions can use for the most optimized library we can have in common.

What we can do:
- Clean up the generic code to allow a CPU vendor split.
- Untangle the current mix of hwcap bit handling and list handling for
  platforms, which is also a bit messy.
- Allow vendors to drop in their CPU vendor search list ahead of the
  shared list, again based on feature presence.
- Prepare the code for a shared common directory based on some shared
  subset of features, and enable it only if those features are present.
  Today doing this is a bit messy in cpu-features.c.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=24080
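The /etc/ld.so.conf add-back idea mentioned above could look like the following sketch. The drop-in is written to a temporary directory rather than /etc so the example is side-effect free, and /lib64/haswell is just the example path from this discussion:

```shell
# Sketch: a distribution adding the legacy haswell directory back to
# the search path via an ld.so.conf drop-in (temp dir stands in for
# /etc, so nothing on the real system is modified).
ETCDIR=$(mktemp -d)
mkdir -p "$ETCDIR/ld.so.conf.d"
cat > "$ETCDIR/ld.so.conf.d/haswell-compat.conf" <<'EOF'
# Compatibility: keep the legacy haswell platform directory searched
# for a few releases after the loader stops probing it natively.
/lib64/haswell
EOF
cat "$ETCDIR/ld.so.conf.d/haswell-compat.conf"

# On a real system the cache would then be rebuilt with:
#   ldconfig
# and the directories the loader actually searches for a binary can be
# inspected with its debug output:
#   LD_DEBUG=libs /bin/true 2>&1 | head
```

Whether this mechanism is sufficient is exactly what needs verifying: ld.so.conf entries feed the ldconfig cache rather than the per-binary platform probing, which is the concern raised in the follow-up.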
* Carlos O'Donell via Libc-alpha:

> It should be subject to change, just like tunables.

I disagree; for a subset of the directories, we should guarantee stability.

> It should be an optimization, not a requirement, and applications
> should always provide a fallback implementation so the application
> can load.

Agreed. Programmers need to assume that future glibc versions may stop selecting certain subdirectories. However, I'm not sure if we can suddenly start selecting directories on systems where we did not do so before.

> We should verify that downstream distributions can use /etc/ld.so.conf
> as a way to add directories back into the search alongside the existing
> additional multilib search directories, e.g. add back /lib64/haswell
> for a few years.

I don't think that works.

> In the end I think we'll want:
>
> (a) Try CPU vendor directories first.
> (b) Try shared directories second.
>
> My understanding is that Florian is asking for help with (b) [...]

The results for (b) also feed into (a) to some extent: if research for (b) reveals that certain CPU features have been disabled by microcode updates, we probably do not want them for (a), either.