Message ID | 20190302220618.706-1-gabriel@inconstante.eti.br |
---|---|
State | Committed |
Headers |
Received: (qmail 110837 invoked by alias); 2 Mar 2019 22:06:29 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 110825 invoked by uid 89); 2 Mar 2019 22:06:29 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY autolearn=ham version=3.3.2 spammy=HX-Envelope-From:sk:gabriel, 2145, 2.14.5, bind X-HELO: smtpout1.mo528.mail-out.ovh.net From: "Gabriel F. T. Gomes" <gabriel@inconstante.eti.br> To: <libc-alpha@sourceware.org>, <adhemerval.zanella@linaro.org>, <tuliom@ascii.art.br> Subject: [PATCH] powerpc: Fix build of wcscpy with --disable-multi-arch Date: Sat, 2 Mar 2019 19:06:18 -0300 Message-ID: <20190302220618.706-1-gabriel@inconstante.eti.br> MIME-Version: 1.0 Content-Type: text/plain X-Ovh-Tracer-Id: 18329650486068825795 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: 0 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedutddrvdejgdduieduucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuqfggjfdpvefjgfevmfevgfenuceurghilhhouhhtmecuhedttdenuc |
Commit Message
Gabriel F. T. Gomes
March 2, 2019, 10:06 p.m. UTC
Since the commit commit 81a14439417552324ec6ca71f65ddf8e7cdd51c7 Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> Date: Tue Feb 5 17:35:12 2019 -0200 wcsmbs: optimize wcscat powerpc64 and powerpc64le builds fail when configured with --disable-multi-arch, due to an undefined reference to __GI___wcscpy. This patch fixes this on sysdeps/powerpc/powerpc64/power6/wcscpy.c, which is only used when multi-arch is disabled. This patch does nothing for the failures on 32-bits powerpc builds, because the file is under the powerpc64 subdirectory, however, powerpc builds were already failing with --disable-multi-arch, with multiple error messages, even before the aforementioned commit. Tested for powerpc, powerpc64, and powerpc64le with multi-arch enabled (all pass) and disabled (powerpc still fails as explained above). * sysdeps/powerpc/powerpc64/power6/wcscpy.c (WCSCPY): Define to __wcscpy, then use libc_hidden_def and weak_alias to bind it to __GI___wcscpy and wcscpy. --- sysdeps/powerpc/powerpc64/power6/wcscpy.c | 3 +++ 1 file changed, 3 insertions(+)
Comments
On 02/03/2019 19:06, Gabriel F. T. Gomes wrote: > Since the commit > > commit 81a14439417552324ec6ca71f65ddf8e7cdd51c7 > Author: Adhemerval Zanella <adhemerval.zanella@linaro.org> > Date: Tue Feb 5 17:35:12 2019 -0200 > > wcsmbs: optimize wcscat > > powerpc64 and powerpc64le builds fail when configured with > --disable-multi-arch, due to an undefined reference to __GI___wcscpy. > This patch fixes this on sysdeps/powerpc/powerpc64/power6/wcscpy.c, > which is only used when multi-arch is disabled. > > This patch does nothing for the failures on 32-bits powerpc builds, > because the file is under the powerpc64 subdirectory, however, powerpc > builds were already failing with --disable-multi-arch, with multiple > error messages, even before the aforementioned commit. > > Tested for powerpc, powerpc64, and powerpc64le with multi-arch enabled > (all pass) and disabled (powerpc still fails as explained above). > > * sysdeps/powerpc/powerpc64/power6/wcscpy.c (WCSCPY): Define to > __wcscpy, then use libc_hidden_def and weak_alias to bind it to > __GI___wcscpy and wcscpy. LGTM. In fact you need either your compiler to setup to to use power6+ as default or configure with --with-cpu (the powerpc64le gcc I used to check in fact do not have neither set). As a side note, this is exactly the kind of annoyance I tried to avoid when I advocated to reduce the build permutations on glibc for the recent s390 ifunc refactor (and it bit me with an issue recently [1]). So I would ask again if these micro-optimization indeed pays for the over-complicated build options we need to actually test to make sure nothing broken glibc build. For powerpc64 there are two issue: 1. can't we just infer the minimum cpu support from compiler or CC and get rid of --with-cpu? I would like to avoid to add the possible permutation of --with-cpu=powerX and --disable-multi-arch of the required build to check. It won't help much now since powerpc still organizes itself with sysdeps subfolder, but it can be a first step. Second, and this is minor one, I will send a patch to just remove the power6 wcs* optimized routines. They are rarely used in the wild, the optimization are just loop unrolling that we can either rely on compiler (sadly gcc does not unroll this kind of loop either with aggressive optimization flags) or add on generic implementation, and power6 and power7 are essentially the same binary for powerpc64. [1] https://sourceware.org/ml/libc-alpha/2019-02/msg00140.html > --- > sysdeps/powerpc/powerpc64/power6/wcscpy.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/sysdeps/powerpc/powerpc64/power6/wcscpy.c b/sysdeps/powerpc/powerpc64/power6/wcscpy.c > index 722c8f995b..59cfb28832 100644 > --- a/sysdeps/powerpc/powerpc64/power6/wcscpy.c > +++ b/sysdeps/powerpc/powerpc64/power6/wcscpy.c > @@ -1 +1,4 @@ > +#define WCSCPY __wcscpy > #include <sysdeps/powerpc/power6/wcscpy.c> > +libc_hidden_def (__wcscpy) > +weak_alias (__wcscpy, wcscpy) >
On Sun, Mar 03 2019, Adhemerval Zanella wrote: > > On 02/03/2019 19:06, Gabriel F. T. Gomes wrote: > > > > powerpc64 and powerpc64le builds fail when configured with > > --disable-multi-arch, due to an undefined reference to __GI___wcscpy. > > This patch fixes this on sysdeps/powerpc/powerpc64/power6/wcscpy.c, > > which is only used when multi-arch is disabled. > > > > This patch does nothing for the failures on 32-bits powerpc builds, > > because the file is under the powerpc64 subdirectory, however, powerpc > > builds were already failing with --disable-multi-arch, with multiple > > error messages, even before the aforementioned commit. > > > > Tested for powerpc, powerpc64, and powerpc64le with multi-arch enabled > > (all pass) and disabled (powerpc still fails as explained above). > > > > LGTM. Thanks. > In fact you need either your compiler to setup to to use power6+ > as default or configure with --with-cpu (the powerpc64le gcc I used > to check in fact do not have neither set). I usually build powerpc64le with --with-cpu=power8, because that's the machine where little-endian is supposed to work, and since that's what distros typically use, and that's why I saw the failure. I don't do the same for powerpce64... I only wrote that such build is failing, because I ran an extra test, just to check what was going on (not because I know of any distros that use --with-cpu=power6 (or newer)), I'm sorry if I gave the impression that this was the case. Anyhow, yes, one would need to use --with-cpu to get the failure, thanks for making that clear. I will commit this patch with that information. > As a side note, this is exactly the kind of annoyance I tried to avoid > when I advocated to reduce the build permutations on glibc for the > recent s390 ifunc refactor (and it bit me with an issue recently [1]). > So I would ask again if these micro-optimization indeed pays for the > over-complicated build options we need to actually test to make sure > nothing broken glibc build. We will have to check if they actually do, and I'll review your new patch set. I'm not so sure about the representativeness (I hope that this is an actual word in English) of the benchmarks, though. This has always been an open-question for me. > For powerpc64 there are two issue: 1. can't we just infer the minimum > cpu support from compiler or CC and get rid of --with-cpu? I thought that --with-cpu was useful for distros, so that they could decide what minimum cpu they want to build support for. Are you implying that they could set this in the compiler? > I would like > to avoid to add the possible permutation of --with-cpu=powerX and > --disable-multi-arch of the required build to check. It won't help much > now since powerpc still organizes itself with sysdeps subfolder, but > it can be a first step. This is new for me. I thought that the options were good, in the sense that distros can use whatever combination they deem appropriate for their users. I don't even know the reasons why one would disable multi-arch builds, I only assumed that distros and users would know better. > Second, and this is minor one, I will send a patch to just remove the > power6 wcs* optimized routines. They are rarely used in the wild, the > optimization are just loop unrolling that we can either rely on > compiler (sadly gcc does not unroll this kind of loop either with > aggressive optimization flags) or add on generic implementation, > and power6 and power7 are essentially the same binary for > powerpc64. Thanks for doing this.
On 05/03/2019 10:35, Gabriel F. T. Gomes wrote: > On Sun, Mar 03 2019, Adhemerval Zanella wrote: >> >> On 02/03/2019 19:06, Gabriel F. T. Gomes wrote: >>> >>> powerpc64 and powerpc64le builds fail when configured with >>> --disable-multi-arch, due to an undefined reference to __GI___wcscpy. >>> This patch fixes this on sysdeps/powerpc/powerpc64/power6/wcscpy.c, >>> which is only used when multi-arch is disabled. >>> >>> This patch does nothing for the failures on 32-bits powerpc builds, >>> because the file is under the powerpc64 subdirectory, however, powerpc >>> builds were already failing with --disable-multi-arch, with multiple >>> error messages, even before the aforementioned commit. >>> >>> Tested for powerpc, powerpc64, and powerpc64le with multi-arch enabled >>> (all pass) and disabled (powerpc still fails as explained above). >>> >> >> LGTM. > > Thanks. > >> In fact you need either your compiler to setup to to use power6+ >> as default or configure with --with-cpu (the powerpc64le gcc I used >> to check in fact do not have neither set). > > I usually build powerpc64le with --with-cpu=power8, because that's the > machine where little-endian is supposed to work, and since that's what > distros typically use, and that's why I saw the failure. > > I don't do the same for powerpce64... I only wrote that such build is > failing, because I ran an extra test, just to check what was going on > (not because I know of any distros that use --with-cpu=power6 (or > newer)), I'm sorry if I gave the impression that this was the case. > > Anyhow, yes, one would need to use --with-cpu to get the failure, thanks > for making that clear. > > I will commit this patch with that information. > >> As a side note, this is exactly the kind of annoyance I tried to avoid >> when I advocated to reduce the build permutations on glibc for the >> recent s390 ifunc refactor (and it bit me with an issue recently [1]). >> So I would ask again if these micro-optimization indeed pays for the >> over-complicated build options we need to actually test to make sure >> nothing broken glibc build. > > We will have to check if they actually do, and I'll review your new > patch set. I'm not so sure about the representativeness (I hope that > this is an actual word in English) of the benchmarks, though. This has > always been an open-question for me. As Wilco has pointed out, we can just the similar optimization on generic implementation instead and avoid a lot of powerpc specific code. > >> For powerpc64 there are two issue: 1. can't we just infer the minimum >> cpu support from compiler or CC and get rid of --with-cpu? > > I thought that --with-cpu was useful for distros, so that they could > decide what minimum cpu they want to build support for. Are you > implying that they could set this in the compiler? Currently --with-cpu is still required to correctly infer the internal sysdeps folders if you aim to build glibc to run on a specific chip as the minimum base (power8 for instance). However this same information can be inferred from -mcpu flag from CC, so I see this extra configure option as just another build permutation that incur in more extra testing. My idea is to make powerpc behave as other architectures and avoid the multiple possible configuration options provide by each chip/ISA and instead have: 1. powerpc-linux-gnu: make ifunc default for all build instead of just power4. If I recall correctly the only impeding reason was the hp-timing support that was added only on ISA 2.02. However with current hp-timing refactor, we can conditionally build its support iff compiler builds for power4. 2. powerpc64-linux-gnu: remove all the powerX variations, add some missing ifunc to remove chip-specific implementation that are only active on some chips (add_n.S and sub_n.S for power7 for instance). A possible optimization to enable optimized base symbol to be used as default (memcpy for instance) it to implement something similar s390 did and make some ifunc build conditional depending of the compiler option. I am not a huge fan of this approach, since it just moves the build permutations from configure option to compiler flags. 3. powerpc64le-linux-gnu: as forr powerpc64-linux-gnu, remove all powerX variations and make power8 the true default by avoid build and export the various pre-power8 variations. The powerpc64{le} has an additional issue which seems that only power8 routines are shared between LE and BE abi (it seems power9 is not support for BE). So one idea might be to move power8 arch-specific optimization to a common folder user by LE and BE. Another issue powerpc64le might also use different some specific implementation from older chips. I don't have a best approach on how to select it, but one thing that came to my mind might be reimplement some routines as: --- sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c #ifdef _ARCH_PWR6 double __libm_copysign (double x, double y) { __builtin_copysign (x, y); } #else #define NO_MATH_REDIRECT /* Redefine copysign so that the compiler won't complain about the type mismatch with the IFUNC selector in strong_alias below. */ #undef __copysign #define __copysign __redirect_copysign #include <math.h> #include <math_ldbl_opt.h> #undef __copysign #include <shlib-compat.h> #include "init-arch.h" #include <libm-alias-double.h> extern __typeof (__redirect_copysign) __copysign_ppc64 attribute_hidden; extern __typeof (__redirect_copysign) __copysign_power6 attribute_hidden; extern __typeof (__redirect_copysign) __libm_copysign; libc_ifunc (__libm_copysign, (hwcap & PPC_FEATURE_ARCH_2_05) ? __copysign_power6 : __copysign_ppc64); #endif strong_alias (__libm_copysign, __copysign) libm_alias_double (__copysign, copysign) #if LONG_DOUBLE_COMPAT (libc, GLIBC_2_0) compat_symbol (libc, __copysign, copysignl, GLIBC_2_0); #endif --- But I am not sure it is really pays off in term of simplify glibc code and build. Maybe an option is just to use __builtin_* when we know compiler generates an efficient instruction and just use the generic implementation instead. > >> I would like >> to avoid to add the possible permutation of --with-cpu=powerX and >> --disable-multi-arch of the required build to check. It won't help much >> now since powerpc still organizes itself with sysdeps subfolder, but >> it can be a first step. > > This is new for me. I thought that the options were good, in the sense > that distros can use whatever combination they deem appropriate for > their users. I don't even know the reasons why one would disable > multi-arch builds, I only assumed that distros and users would know > better. I see the --disable-multi-arch useful just when used along with --with-cpu, because since the idea is just to provide a chip specific build there is no sense of adding all possible ifunc variants. But afaik distros usually do not use it. > >> Second, and this is minor one, I will send a patch to just remove the >> power6 wcs* optimized routines. They are rarely used in the wild, the >> optimization are just loop unrolling that we can either rely on >> compiler (sadly gcc does not unroll this kind of loop either with >> aggressive optimization flags) or add on generic implementation, >> and power6 and power7 are essentially the same binary for >> powerpc64. > > Thanks for doing this. >
On Sun, 3 Mar 2019, Adhemerval Zanella wrote: > For powerpc64 there are two issue: 1. can't we just infer the minimum > cpu support from compiler or CC and get rid of --with-cpu? I would like We should do that in general to get rid of --with-cpu (regarding powerpc, see my review comments <https://sourceware.org/ml/libc-alpha/2017-02/msg00164.html>).
diff --git a/sysdeps/powerpc/powerpc64/power6/wcscpy.c b/sysdeps/powerpc/powerpc64/power6/wcscpy.c index 722c8f995b..59cfb28832 100644 --- a/sysdeps/powerpc/powerpc64/power6/wcscpy.c +++ b/sysdeps/powerpc/powerpc64/power6/wcscpy.c @@ -1 +1,4 @@ +#define WCSCPY __wcscpy #include <sysdeps/powerpc/power6/wcscpy.c> +libc_hidden_def (__wcscpy) +weak_alias (__wcscpy, wcscpy)