[POWERPC] Don't override gcc -mcpu with wrong asm .machine

Message ID 20171020130241.GA27132@bubble.grove.modra.org
State Superseded
Delegated to: Tulio Magno Quites Machado Filho
Headers

Commit Message

Alan Modra Oct. 20, 2017, 1:02 p.m. UTC
  If powerpc gcc is modified to not pass -many to the assembler, glibc
builds break.  -many is a sticky option, which means .machine in the
source isn't really effective in selecting an instruction set.  You do
get the insns you want, in particular with the operand variations for
the given cpu, but then all other cpu instructions are made available
too.  So, ".machine power4" in power4/memcmp.S doesn't stop you using
power7 insns as used by the little-endian code in that file.  However
if -many is not in force, power4/memcmp.S won't compile for
powerpc64le.

It's rather strange that these files use .machine at all.  Perhaps
that was a workaround for versions of gcc that didn't pass -m<cpu> (or
passed a wrong -m<cpu>) to the assembler unless -mcpu was given?  Even
with such a gcc, I believe you'll get the correct -m<cpu> if the glibc
makefiles always pass -mcpu=<cpu> to gcc.  Since I haven't verified
that all possible powerpc build variants do that, I haven't completely
removed .machine, but instead put the .machine directive inside an
ifdef test of a macro defined by gcc.  Now if the Makefile passes
-mcpu=power8 the assembly won't disagree by using ".machine power7".

	* sysdeps/powerpc/powerpc64/power4/memcmp.S: Don't emit .machine
	when __LITTLE_ENDIAN__.
	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Don't emit .machine
	when _ARCH_PWR8.
	* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
  

Comments

Tulio Magno Quites Machado Filho Oct. 20, 2017, 5:51 p.m. UTC | #1
Alan Modra <amodra@gmail.com> writes:

> If powerpc gcc is modified to not pass -many to the assembler, glibc
> builds break.  -many is a sticky option, which means .machine in the
> source isn't really effective in selecting an instruction set.  You do
> get the insns you want, in particular with the operand variations for
> the given cpu, but then all other cpu instructions are made available
> too.
> So, ".machine power4" in power4/memcmp.S doesn't stop you using
> power7 insns as used by the little-endian code in that file.  However
> if -many is not in force, power4/memcmp.S won't compile for
> powerpc64le.

Isn't this the expected behavior?
After all, this file is in the power4 directory and, AFAIU, shouldn't be using
instructions incompatible with POWER4.

I understand that ppc64le was never tested on POWER4, but I'm afraid this
failure is actually showing there is another problem.

> It's rather strange that these files use .machine at all. Perhaps
> that was a workaround for versions of gcc that didn't pass -m<cpu> (or
> passed a wrong -m<cpu>) to the assembler unless -mcpu was given?

I think that's a (tentative?) way to guarantee the file would never have
unsupported instructions for the given cpu.

> Even with such a gcc, I believe you'll get the correct -m<cpu> if the glibc
> makefiles always pass -mcpu=<cpu> to gcc.

I don't think that's guaranteed to happen, e.g. if glibc is not configured
with --with-cpu on big endian, it may not pass -mcpu to GCC.

> Since I haven't verified that all possible powerpc build variants do that

Ack.  I'm running some tests and will get back to you.
  
Joseph Myers Oct. 20, 2017, 8:02 p.m. UTC | #2
On Fri, 20 Oct 2017, Tulio Magno Quites Machado Filho wrote:

> I understand that ppc64le was never tested on POWER4, but I'm afraid this
> failure is actually showing there is another problem.

There is a configure test that for powerpc64le the (compiler + options 
used) is building for POWER8 or later.  *But* this does not ensure that 
the power8 sysdeps directories are used.

The preferred approach for deciding which sub-architecture sysdeps 
directories to use is that used in sysdeps/arm/preconfigure.ac - examine 
what machine the compiler builds for.  However, powerpc is still using the 
older approach of requiring --with-cpu= options to select such sysdeps 
directories, and there is no requirement to use --with-cpu=power8 for 
powerpc64le.  (Some architectures use a third approach, of basing things 
on the host triplet.)

It would be a good idea to move powerpc to testing what the compiler does, 
and generally to obsolete --with-cpu= for all architectures.
  
Tulio Magno Quites Machado Filho Oct. 20, 2017, 8:35 p.m. UTC | #3
Joseph Myers <joseph@codesourcery.com> writes:

> On Fri, 20 Oct 2017, Tulio Magno Quites Machado Filho wrote:
>
>> I understand that ppc64le was never tested on POWER4, but I'm afraid this
>> failure is actually showing there is another problem.
>
> There is a configure test that for powerpc64le the (compiler + options 
> used) is building for POWER8 or later.  *But* this does not ensure that 
> the power8 sysdeps directories are used.
>
> The preferred approach for deciding which sub-architecture sysdeps 
> directories to use is that used in sysdeps/arm/preconfigure.ac - examine 
> what machine the compiler builds for.  However, powerpc is still using the 
> older approach of requiring --with-cpu= options to select such sysdeps 
> directories, and there is no requirement to use --with-cpu=power8 for 
> powerpc64le.  (Some architectures use a third approach, of basing things 
> on the host triplet.)
>
> It would be a good idea to move powerpc to testing what the compiler does, 
> and generally to obsolete --with-cpu= for all architectures.

Yep.  I still have to make the changes to my last try on this [1].

[1] https://patchwork.sourceware.org/patch/19173/
  
Alan Modra Oct. 20, 2017, 11:51 p.m. UTC | #4
On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
> I understand that ppc64le was never tested on POWER4, but I'm afraid this
> failure is actually showing there is another problem.

Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
powerpc64le?  And even with --disable-multiarch!
  
Tulio Magno Quites Machado Filho Oct. 21, 2017, 12:02 a.m. UTC | #5
Alan Modra <amodra@gmail.com> writes:

> On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
>> I understand that ppc64le was never tested on POWER4, but I'm afraid this
>> failure is actually showing there is another problem.
>
> Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
> powerpc64le?

powerpc64 code is reused in powerpc64le.  It has been working well, but I
believe we can improve how the Implies mechanism is being used.

>   And even with --disable-multiarch!

Do you mean --disable-multi-arch?
In my tests, powerpc64/power8/memcmp.S is being used instead.
  
Alan Modra Oct. 22, 2017, 11:52 a.m. UTC | #6
On Fri, Oct 20, 2017 at 10:02:12PM -0200, Tulio Magno Quites Machado Filho wrote:
> Alan Modra <amodra@gmail.com> writes:
> 
> > On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
> >> I understand that ppc64le was never tested on POWER4, but I'm afraid this
> >> failure is actually showing there is another problem.
> >
> > Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
> > powerpc64le?
> 
> powerpc64 code is reused in powerpc64le.  It has been working well, but I
> believe we can improve how the Implies mechanism is being used.

Yes, powerpc64le is only supported on power8 and later machines, so
there is no point in having LE code in any of the power4 through
power7 directories, or compiling it.  Adding multi-arch code into the
shared lib that will never be used just bloats it.

I have a fairly large patch in progress to fix this.

Also, if power8 or later includes code from earlier directories then
it would be cleaner to not do so if to support LE we need to put
power7 instructions in power4 directories, as is done in
power4/memcmp.S.

> >   And even with --disable-multiarch!
> 
> Do you mean --disable-multi-arch?
> In my tests, powerpc64/power8/memcmp.S is being used instead.

Ugh, yes, I typoed the option in my configure script.  With
--disable-mult-arch I don't get any power4 code.

BTW, do you know why strcmp and strncmp are ifunc only in libc.so and
not in libc.a?
  
Joseph Myers Oct. 23, 2017, 3:42 p.m. UTC | #7
On Sat, 21 Oct 2017, Alan Modra wrote:

> On Fri, Oct 20, 2017 at 03:51:54PM -0200, Tulio Magno Quites Machado Filho wrote:
> > I understand that ppc64le was never tested on POWER4, but I'm afraid this
> > failure is actually showing there is another problem.
> 
> Yes, why is powerpc64/multiarch/memcmp-power4.S being compiled for
> powerpc64le?  And even with --disable-multiarch!

Because unlike e.g. ARM, the powerpc port does not deduce which sysdeps 
directories to use based on which subarchitecture the compiler is 
generating code for.  (ARM also has optimizations to avoid IFUNC variants 
that are unnecessary based on what the compiler is generating code for.  
Such optimizations could get complicated in general; ideally you'd both 
not build any variants that will not get used on any processor supporting 
the compiler-generated code, and eliminate the IFUNCs altogether if that 
means there is only one variant of a function left.)
  

Patch

diff --git a/sysdeps/powerpc/powerpc64/power4/memcmp.S b/sysdeps/powerpc/powerpc64/power4/memcmp.S
index c366801..6cb4a16 100644
--- a/sysdeps/powerpc/powerpc64/power4/memcmp.S
+++ b/sysdeps/powerpc/powerpc64/power4/memcmp.S
@@ -26,7 +26,9 @@ 
 # define MEMCMP memcmp
 #endif
 
+#ifndef __LITTLE_ENDIAN__
 	.machine power4
+#endif
 ENTRY_TOCLESS (MEMCMP, 4)
 	CALL_MCOUNT 3
 
diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
index 88b17a6..0655bd7 100644
--- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
+++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
@@ -98,6 +98,7 @@ 
 #  define VPOPCNTD_V8_V8	vpopcntd v8, v8;
 #  define VADDUQM_V7_V8	vadduqm	v9, v7, v8;
 #else
+	.machine  power7
 #  define VCLZD_V8_v7	.long	0x11003fc2
 #  define MFVRD_R3_V1	.long	0x7c230067
 #  define VSUBUDM_V9_V8	.long	0x112944c0
@@ -105,8 +106,6 @@ 
 #  define VADDUQM_V7_V8	.long	0x11274100
 #endif
 
-	.machine  power7
-
 ENTRY (__STRCASECMP)
 #ifdef USE_AS_STRNCASECMP
 	CALL_MCOUNT 3
diff --git a/sysdeps/powerpc/powerpc64/power8/strcasestr.S b/sysdeps/powerpc/powerpc64/power8/strcasestr.S
index 3f59cb0..c301387 100644
--- a/sysdeps/powerpc/powerpc64/power8/strcasestr.S
+++ b/sysdeps/powerpc/powerpc64/power8/strcasestr.S
@@ -78,13 +78,11 @@ 
 #ifdef _ARCH_PWR8
 #define VCLZD_V8_v7	vclzd	v8, v7;
 #else
+	.machine  power7
 #define VCLZD_V8_v7	.long	0x11003fc2
 #endif
 
 #define	FRAMESIZE	(FRAME_MIN_SIZE+48)
-/* TODO: change this to .machine power8 when the minimum required binutils
-   allows it.  */
-	.machine  power7
 ENTRY (STRCASESTR, 4)
 	CALL_MCOUNT 2
 	mflr	r0			/* Load link register LR to r0.  */