[POWERPC] Don't override gcc -mcpu with wrong asm .machine
Commit Message
If powerpc gcc is modified to not pass -many to the assembler, glibc
builds break. -many is a sticky option, which means .machine in the
source isn't really effective in selecting an instruction set. You do
get the insns you want, in particular with the operand variations for
the given cpu, but then all other cpu instructions are made available
too. So, ".machine power4" in power4/memcmp.S doesn't stop you using
power7 insns as used by the little-endian code in that file. However
if -many is not in force, power4/memcmp.S won't compile for
powerpc64le.
It's rather strange that these files use .machine at all. Perhaps
that was a workaround for versions of gcc that didn't pass -m<cpu> (or
passed a wrong -m<cpu>) to the assembler unless -mcpu was given? Even
with such a gcc, I believe you'll get the correct -m<cpu> if the glibc
makefiles always pass -mcpu=<cpu> to gcc. Since I haven't verified
that all possible powerpc build variants do that, I haven't completely
removed .machine, but instead put the .machine directive inside an
ifdef test of a macro defined by gcc. Now if the Makefile passes
-mcpu=power8 the assembly won't disagree by using ".machine power7".
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Don't emit .machine
when __LITTLE_ENDIAN__.
* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Don't emit .machine
when _ARCH_PWR8.
* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
Comments
Alan Modra <amodra@gmail.com> writes:
> If powerpc gcc is modified to not pass -many to the assembler, glibc
> builds break. -many is a sticky option, which means .machine in the
> source isn't really effective in selecting an instruction set. You do
> get the insns you want, in particular with the operand variations for
> the given cpu, but then all other cpu instructions are made available
> too.
> So, ".machine power4" in power4/memcmp.S doesn't stop you using
> power7 insns as used by the little-endian code in that file. However
> if -many is not in force, power4/memcmp.S won't compile for
> powerpc64le.
Isn't this the expected behavior?
After all, this file is in the power4 directory and, AFAIU, shouldn't be using
instructions incompatible with POWER4.
I understand that ppc64le was never tested on POWER4, but I'm afraid this
failure is actually showing there is another problem.
> It's rather strange that these files use .machine at all. Perhaps
> that was a workaround for versions of gcc that didn't pass -m<cpu> (or
> passed a wrong -m<cpu>) to the assembler unless -mcpu was given?
I think that's a (tentative?) way to guarantee the file would never have
unsupported instructions for the given cpu.
> Even with such a gcc, I believe you'll get the correct -m<cpu> if the glibc
> makefiles always pass -mcpu=<cpu> to gcc.
I don't think that's guaranteed to happen, e.g. if glibc is not configured
with --with-cpu on big endian, it may not pass -mcpu to GCC.
> Since I haven't verified that all possible powerpc build variants do that
Ack. I'm running some tests and will get back to you.
On Fri, 20 Oct 2017, Tulio Magno Quites Machado Filho wrote:
> I understand that ppc64le was never tested on POWER4, but I'm afraid this
> failure is actually showing there is another problem.
There is a configure test that for powerpc64le the (compiler + options
used) is building for POWER8 or later. *But* this does not ensure that
the power8 sysdeps directories are used.
The preferred approach for deciding which sub-architecture sysdeps
directories to use is that used in sysdeps/arm/preconfigure.ac - examine
what machine the compiler builds for. However, powerpc is still using the
older approach of requiring --with-cpu= options to select such sysdeps
directories, and there is no requirement to use --with-cpu=power8 for
powerpc64le. (Some architectures use a third approach, of basing things
on the host triplet.)
It would be a good idea to move powerpc to testing what the compiler does,
and generally to obsolete --with-cpu= for all architectures.
Joseph Myers <joseph@codesourcery.com> writes:
> On Fri, 20 Oct 2017, Tulio Magno Quites Machado Filho wrote:
>
>> I understand that ppc64le was never tested on POWER4, but I'm afraid this
>> failure is actually showing there is another problem.
>
> There is a configure test that for powerpc64le the (compiler + options
> used) is building for POWER8 or later. *But* this does not ensure that
> the power8 sysdeps directories are used.
>
> The preferred approach for deciding which sub-architecture sysdeps
> directories to use is that used in sysdeps/arm/preconfigure.ac - examine
> what machine the compiler builds for. However, powerpc is still using the
> older approach of requiring --with-cpu= options to select such sysdeps
> directories, and there is no requirement to use --with-cpu=power8 for
> powerpc64le. (Some architectures use a third approach, of basing things
> on the host triplet.)
>
> It would be a good idea to move powerpc to testing what the compiler does,
> and generally to obsolete --with-cpu= for all architectures.
Yep. I still have to make the changes to my last try on this [1].
[1] https://patchwork.sourceware.org/patch/19173/
On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
> I understand that ppc64le was never tested on POWER4, but I'm afraid this
> failure is actually showing there is another problem.
Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
powerpc64le? And even with --disable-multiarch!
Alan Modra <amodra@gmail.com> writes:
> On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
>> I understand that ppc64le was never tested on POWER4, but I'm afraid this
>> failure is actually showing there is another problem.
>
> Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
> powerpc64le?
powerpc64 code is reused in powerpc64le. It has been working well, but I
believe we can improve how the Implies mechanism is being used.
> And even with --disable-multiarch!
Do you mean --disable-multi-arch?
In my tests, powerpc64/power8/memcmp.S is being used instead.
On Fri, Oct 20, 2017 at 10:02:12PM -0200, Tulio Magno Quites Machado Filho wrote:
> Alan Modra <amodra@gmail.com> writes:
>
> > On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
> >> I understand that ppc64le was never tested on POWER4, but I'm afraid this
> >> failure is actually showing there is another problem.
> >
> > Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
> > powerpc64le?
>
> powerpc64 code is reused in powerpc64le. It has been working well, but I
> believe we can improve how the Implies mechanism is being used.
Yes, powerpc64le is only supported on power8 and later machines, so
there is no point in having LE code in any of the power4 through
power7 directories, or compiling it. Adding multi-arch code into the
shared lib that will never be used just bloats it.
I have a fairly large patch in progress to fix this.
Also, if power8 or later includes code from earlier directories then
it would be cleaner to not do so if to support LE we need to put
power7 instructions in power4 directories, as is done in
power4/memcmp.S.
> > And even with --disable-multiarch!
>
> Do you mean --disable-multi-arch?
> In my tests, powerpc64/power8/memcmp.S is being used instead.
Ugh, yes, I typoed the option in my configure script. With
--disable-mult-arch I don't get any power4 code.
BTW, do you know why strcmp and strncmp are ifunc only in libc.so and
not in libc.a?
On Sat, 21 Oct 2017, Alan Modra wrote:
> On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
> > I understand that ppc64le was never tested on POWER4, but I'm afraid this
> > failure is actually showing there is another problem.
>
> Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
> powerpc64le? And even with --disable-multiarch!
Because unlike e.g. ARM, the powerpc port does not deduce which sysdeps
directories to use based on which subarchitecture the compiler is
generating code for. (ARM also has optimizations to avoid IFUNC variants
that are unnecessary based on what the compiler is generating code for.
Such optimizations could get complicated in general; ideally you'd both
not build any variants that will not get used on any processor supporting
the compiler-generated code, and eliminate the IFUNCs altogether if that
means there is only one variant of a function left.)
@@ -26,7 +26,9 @@
# define MEMCMP memcmp
#endif
+#ifndef __LITTLE_ENDIAN__
.machine power4
+#endif
ENTRY_TOCLESS (MEMCMP, 4)
CALL_MCOUNT 3
@@ -98,6 +98,7 @@
# define VPOPCNTD_V8_V8 vpopcntd v8, v8;
# define VADDUQM_V7_V8 vadduqm v9, v7, v8;
#else
+ .machine power7
# define VCLZD_V8_v7 .long 0x11003fc2
# define MFVRD_R3_V1 .long 0x7c230067
# define VSUBUDM_V9_V8 .long 0x112944c0
@@ -105,8 +106,6 @@
# define VADDUQM_V7_V8 .long 0x11274100
#endif
- .machine power7
-
ENTRY (__STRCASECMP)
#ifdef USE_AS_STRNCASECMP
CALL_MCOUNT 3
@@ -78,13 +78,11 @@
#ifdef _ARCH_PWR8
#define VCLZD_V8_v7 vclzd v8, v7;
#else
+ .machine power7
#define VCLZD_V8_v7 .long 0x11003fc2
#endif
#define FRAMESIZE (FRAME_MIN_SIZE+48)
-/* TODO: change this to .machine power8 when the minimum required binutils
- allows it. */
- .machine power7
ENTRY (STRCASESTR, 4)
CALL_MCOUNT 2
mflr r0 /* Load link register LR to r0. */