[11,backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

Message ID afcb3f33-131a-6073-c1b5-0c338becc180@linux.ibm.com
State New
Headers
Series [11,backport] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082] |

Commit Message

Li, Pan2 via Gcc-patches Feb. 10, 2022, 6:22 p.m. UTC
  Hi!

This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
These built-ins were misimplemented as always having big-endian semantics.

Because the built-in infrastructure has changed, the modifications to the
source are different but achieve the same purpose.  The modifications to
the test suite are identical (after fixing the issue with -mbig that David
pointed out with the original patch).

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for releases/gcc-11?

Thanks!
Bill


2022-02-10  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	PR target/95082
	* config/rs6000/rs6000-builtin.def (VCLZLSBB_V16QI): Change default
	pattern.
	(VCLZLSBB_V8HI): Likewise.
	(VCLZLSBB_V4SI): Likewise.
	(VCTZLSBB_V16QI): Likewise.
	(VCTZLSBB_V8HI): Likewise.
	(VCTZLSBB_V4SI): Likewise.
	* config/rs6000/rs6000-call.c (rs6000_expand_builtin): Make big-endian
	adjustments to P9V_BUILTIN_VC[LT]ZLSBB_* built-in expansions.

gcc/testsuite/
	PR target/95082
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to big-endian.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to big-endian.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
---
 gcc/config/rs6000/rs6000-builtin.def          | 12 ++++----
 gcc/config/rs6000/rs6000-call.c               | 30 +++++++++++++++++++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c | 15 ++++++++++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c | 15 ++++++++++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c |  1 +
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c | 15 ++++++++++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c | 15 ++++++++++
 10 files changed, 100 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c
  

Comments

Segher Boessenkool Feb. 10, 2022, 8:06 p.m. UTC | #1
Hi!

On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
> These built-ins were misimplemented as always having big-endian semantics.

What is different compared to the trunk version?


Segher
  
Li, Pan2 via Gcc-patches Feb. 10, 2022, 8:15 p.m. UTC | #2
Hi!

On 2/10/22 2:06 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
>> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
>> These built-ins were misimplemented as always having big-endian semantics.
> What is different compared to the trunk version?

The infrastructure changed, so:

(1) Instead of changing the default pattern in rs6000-builtins.def, I have
to change it in rs6000-builtin.def.  (Note the missing "s".)

(2) Instead of having the endian change driven by an "endian" flag in the
built-in description in rs6000-builtins.def, I have to add some more ad-hoc
code in rs6000_expand_builtin to handle the change to the big-endian
pattern.

That's all.

Thanks!
Bill

>
>
> Segher
  
Segher Boessenkool Feb. 10, 2022, 8:50 p.m. UTC | #3
On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
> These built-ins were misimplemented as always having big-endian semantics.
> 
> Because the built-in infrastructure has changed, the modifications to the
> source are different but achieve the same purpose.  The modifications to
> the test suite are identical (after fixing the issue with -mbig that David
> pointed out with the original patch).

>  /* 1 argument vector functions added in ISA 3.0 (power9). */
> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vclzlsbb_v4si)
> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vctzlsbb_v4si)
> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vctzlsbb_v4si)
> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vclzlsbb_v4si)

Please change the default to be equal to the builtin name, so, the BE
version.  We do that everywhere else as well, and it makes a lot more
sense (since everything in Power has BE numbering).

The trunk version has this correct afaics?

> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile { target { powerpc*-*-* } } } */

(Delete the redundant target clause when modifying any testcase, please).

>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>  /* { dg-options "-mdejagnu-cpu=power9" } */
> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */

You don't need the target clause, if it already is BE by default it does
not do anything to add it redundantly.

But this is wrong anyway: the name of the target triple does not say
whether we are BE or LE.  Instead you should use the be or le selectors.
But again, just add -mbig always.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */

And here you do it correctly :-)

Okay with those fixes (all happen a few times).  Thanks!


Segher
  
Li, Pan2 via Gcc-patches Feb. 10, 2022, 9:17 p.m. UTC | #4
Hi!

On 2/10/22 2:50 PM, Segher Boessenkool wrote:
> On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
>> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
>> These built-ins were misimplemented as always having big-endian semantics.
>>
>> Because the built-in infrastructure has changed, the modifications to the
>> source are different but achieve the same purpose.  The modifications to
>> the test suite are identical (after fixing the issue with -mbig that David
>> pointed out with the original patch).
>>  /* 1 argument vector functions added in ISA 3.0 (power9). */
>> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
>> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
>> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vclzlsbb_v4si)
>> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
>> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
>> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vctzlsbb_v4si)
>> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
>> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
>> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vctzlsbb_v4si)
>> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
>> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
>> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vclzlsbb_v4si)
> Please change the default to be equal to the builtin name, so, the BE
> version.  We do that everywhere else as well, and it makes a lot more
> sense (since everything in Power has BE numbering).
>
> The trunk version has this correct afaics?

No, trunk has this, for example:

  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
    VCLZLSBB_V16QI vctzlsbb_v16qi {endian}

So the backport matches what is on trunk.  

Throughout the new builtin infrastructure, the defaults are set for
little-endian, and the "endian" flag changes behavior for big-endian.

>
>> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
>> @@ -1,6 +1,7 @@
>>  /* { dg-do compile { target { powerpc*-*-* } } } */
> (Delete the redundant target clause when modifying any testcase, please).

Okay.
>
>>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>>  /* { dg-options "-mdejagnu-cpu=power9" } */
>> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> You don't need the target clause, if it already is BE by default it does
> not do anything to add it redundantly.
>
> But this is wrong anyway: the name of the target triple does not say
> whether we are BE or LE.  Instead you should use the be or le selectors.
> But again, just add -mbig always.

This was added by David Edelsohn to the trunk version of the patch, because
-mbig actually is not supported on all subtargets.  (I found that quite
surprising also.)  Apparently this doesn't work on AIX, for example.  But 
-mlittle works everywhere.  Go figure.

That's something that should be fixed, I guess, but it's orthogonal
to this patch.

Thanks!
Bill

>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
>> @@ -0,0 +1,15 @@
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>> +/* { dg-require-effective-target powerpc_p9vector_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
> And here you do it correctly :-)
>
> Okay with those fixes (all happen a few times).  Thanks!
>
>
> Segher
  
Segher Boessenkool Feb. 10, 2022, 10:11 p.m. UTC | #5
On Thu, Feb 10, 2022 at 03:17:05PM -0600, Bill Schmidt wrote:
> >>  /* 1 argument vector functions added in ISA 3.0 (power9). */
> >> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vclzlsbb_v4si)
> >> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vclzlsbb_v4si)
> > Please change the default to be equal to the builtin name, so, the BE
> > version.  We do that everywhere else as well, and it makes a lot more
> > sense (since everything in Power has BE numbering).
> >
> > The trunk version has this correct afaics?
> 
> No, trunk has this, for example:
> 
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}

I see this on trunk:

  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
    VCLZLSBB_V16QI vclzlsbb_v16qi {}

Oh, you changed it?  Please fix it, then.

> Throughout the new builtin infrastructure, the defaults are set for
> little-endian, and the "endian" flag changes behavior for big-endian.

That is a big mistake.  There are many machine instructions  that are
*always* big-endian (most even!), and none that are always
little-endian.  So this should be fixed, sooner rather than later :-(

> >>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> >>  /* { dg-options "-mdejagnu-cpu=power9" } */
> >> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> > You don't need the target clause, if it already is BE by default it does
> > not do anything to add it redundantly.
> >
> > But this is wrong anyway: the name of the target triple does not say
> > whether we are BE or LE.  Instead you should use the be or le selectors.
> > But again, just add -mbig always.
> 
> This was added by David Edelsohn to the trunk version of the patch, because
> -mbig actually is not supported on all subtargets.  (I found that quite
> surprising also.)

Huh.  Yeah I think I encountered that before.

So this is because these options are in sysv4.opt .

> Apparently this doesn't work on AIX, for example.  But 
> -mlittle works everywhere.  Go figure.

... and -mlittle is exactly the same?  Wtw.

I only looked at the .opt files, maybe one of them is handled directly,
or more likely in specs?  And not symmetrically?

> That's something that should be fixed, I guess, but it's orthogonal
> to this patch.

Fixing it later is more work :-(

Please at least open a bug report for it.


The other things need fixing before the patch is okay.


Segher
  
Li, Pan2 via Gcc-patches Feb. 10, 2022, 10:28 p.m. UTC | #6
Hi!

On 2/10/22 4:11 PM, Segher Boessenkool wrote:
> On Thu, Feb 10, 2022 at 03:17:05PM -0600, Bill Schmidt wrote:
>>>>  /* 1 argument vector functions added in ISA 3.0 (power9). */
>>>> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
>>>> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
>>>> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vclzlsbb_v4si)
>>>> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
>>>> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
>>>> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vctzlsbb_v4si)
>>>> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
>>>> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
>>>> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vctzlsbb_v4si)
>>>> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
>>>> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
>>>> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vclzlsbb_v4si)
>>> Please change the default to be equal to the builtin name, so, the BE
>>> version.  We do that everywhere else as well, and it makes a lot more
>>> sense (since everything in Power has BE numbering).
>>>
>>> The trunk version has this correct afaics?
>> No, trunk has this, for example:
>>
>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
> I see this on trunk:
>
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>     VCLZLSBB_V16QI vclzlsbb_v16qi {}
>
> Oh, you changed it?  Please fix it, then.

In a patch you approved, yes.  I don't really understand why you want
it changed now.  You must not be looking at the most recent trunk
revision.

>
>> Throughout the new builtin infrastructure, the defaults are set for
>> little-endian, and the "endian" flag changes behavior for big-endian.
> That is a big mistake.  There are many machine instructions  that are
> *always* big-endian (most even!), and none that are always
> little-endian.  So this should be fixed, sooner rather than later :-(

That does not seem like a good idea in stage 4 to me.  That requires
yet another patch to reverse a bunch of other things unnecessarily.

This is a purely arbitrary choice.  The endian flag is only used when
a built-in function must have one behavior for big-endian, and another
behavior for little-endian.  Which one is chosen as the default is
absolutely arbitrary.  When we expand the built-in we will either
accept the default or change to the other.  The existence of machine
instructions that are only big-endian has nothing to do with the case;
what matters is the existence of built-in functions that have two
behaviors.

>>>>  /* { dg-require-effective-target powerpc_p9vector_ok } */
>>>>  /* { dg-options "-mdejagnu-cpu=power9" } */
>>>> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
>>> You don't need the target clause, if it already is BE by default it does
>>> not do anything to add it redundantly.
>>>
>>> But this is wrong anyway: the name of the target triple does not say
>>> whether we are BE or LE.  Instead you should use the be or le selectors.
>>> But again, just add -mbig always.
>> This was added by David Edelsohn to the trunk version of the patch, because
>> -mbig actually is not supported on all subtargets.  (I found that quite
>> surprising also.)
> Huh.  Yeah I think I encountered that before.
>
> So this is because these options are in sysv4.opt .
>
>> Apparently this doesn't work on AIX, for example.  But 
>> -mlittle works everywhere.  Go figure.
> ... and -mlittle is exactly the same?  Wtw.
>
> I only looked at the .opt files, maybe one of them is handled directly,
> or more likely in specs?  And not symmetrically?
>
>> That's something that should be fixed, I guess, but it's orthogonal
>> to this patch.
> Fixing it later is more work :-(
>
> Please at least open a bug report for it.

I can do that.

>
>
> The other things need fixing before the patch is okay.

I'd ask you to reconsider, as explained above.

Thanks,
Bill

>
>
> Segher
  
David Edelsohn Feb. 10, 2022, 10:43 p.m. UTC | #7
On Thu, Feb 10, 2022 at 4:17 PM Bill Schmidt <wschmidt@linux.ibm.com> wrote:
>
> Hi!
>
> On 2/10/22 2:50 PM, Segher Boessenkool wrote:
> > On Thu, Feb 10, 2022 at 12:22:28PM -0600, Bill Schmidt wrote:
> >> This is a backport from mainline 3f30f2d1dbb3228b8468b26239fe60c2974ce2ac.
> >> These built-ins were misimplemented as always having big-endian semantics.
> >>
> >> Because the built-in infrastructure has changed, the modifications to the
> >> source are different but achieve the same purpose.  The modifications to
> >> the test suite are identical (after fixing the issue with -mbig that David
> >> pointed out with the original patch).
> >>  /* 1 argument vector functions added in ISA 3.0 (power9). */
> >> -BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",      CONST,  vclzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",        CONST,  vclzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",        CONST,  vclzlsbb_v4si)
> >> -BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",      CONST,  vctzlsbb_v16qi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",        CONST,  vctzlsbb_v8hi)
> >> -BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",        CONST,  vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",      CONST,  vctzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",        CONST,  vctzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",        CONST,  vctzlsbb_v4si)
> >> +BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",      CONST,  vclzlsbb_v16qi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",        CONST,  vclzlsbb_v8hi)
> >> +BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",        CONST,  vclzlsbb_v4si)
> > Please change the default to be equal to the builtin name, so, the BE
> > version.  We do that everywhere else as well, and it makes a lot more
> > sense (since everything in Power has BE numbering).
> >
> > The trunk version has this correct afaics?
>
> No, trunk has this, for example:
>
>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
>
> So the backport matches what is on trunk.
>
> Throughout the new builtin infrastructure, the defaults are set for
> little-endian, and the "endian" flag changes behavior for big-endian.
>
> >
> >> --- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
> >> @@ -1,6 +1,7 @@
> >>  /* { dg-do compile { target { powerpc*-*-* } } } */
> > (Delete the redundant target clause when modifying any testcase, please).
>
> Okay.
> >
> >>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> >>  /* { dg-options "-mdejagnu-cpu=power9" } */
> >> +/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
> > You don't need the target clause, if it already is BE by default it does
> > not do anything to add it redundantly.
> >
> > But this is wrong anyway: the name of the target triple does not say
> > whether we are BE or LE.  Instead you should use the be or le selectors.
> > But again, just add -mbig always.
>
> This was added by David Edelsohn to the trunk version of the patch, because
> -mbig actually is not supported on all subtargets.  (I found that quite
> surprising also.)  Apparently this doesn't work on AIX, for example.  But
> -mlittle works everywhere.  Go figure.

-mbig/-mlittle only applies to Linux, not AIX and not Darwin.

I changed the BE testcases to add "-mbig" for little endian default
targets because the compiler implicitly should be operating in big
endian mode for other targets and the testcases should succeed.

For the LE testcases, I changed the target selector to
"powrpc*-*-linux*" because that is the only PowerPC target that can
operate as little endian.  I could not find a generic "le" target
selector.  powerpc*-*-linux* understands "-mlittle", so I left the
dg-options clause because there is no need to separate out "-mlittle"
for that subset of PowerPC targets.

Thanks, David

>
> That's something that should be fixed, I guess, but it's orthogonal
> to this patch.
>
> Thanks!
> Bill
>
> >
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
> >> @@ -0,0 +1,15 @@
> >> +/* { dg-do compile { target { powerpc*-*-* } } } */
> >> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> >> +/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
> > And here you do it correctly :-)
> >
> > Okay with those fixes (all happen a few times).  Thanks!
> >
> >
> > Segher
  
Jakub Jelinek Feb. 10, 2022, 11:33 p.m. UTC | #8
On Thu, Feb 10, 2022 at 05:43:26PM -0500, David Edelsohn via Gcc-patches wrote:
> For the LE testcases, I changed the target selector to
> "powrpc*-*-linux*" because that is the only PowerPC target that can
> operate as little endian.  I could not find a generic "le" target
> selector.  powerpc*-*-linux* understands "-mlittle", so I left the

There is
proc check_effective_target_be { } {
    return [check_no_compiler_messages be object {
        int dummy[__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ ? 1 : -1];
    }]
}
and
proc check_effective_target_le { } {
    return [check_no_compiler_messages le object {
        int dummy[__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ ? 1 : -1];
    }]
}
so you can just use { dg-do compile { target { le } } } etc.

	Jakub
  
Segher Boessenkool Feb. 11, 2022, 7:17 a.m. UTC | #9
Hi!

On Thu, Feb 10, 2022 at 05:43:26PM -0500, David Edelsohn wrote:
> -mbig/-mlittle only applies to Linux, not AIX and not Darwin.
> 
> I changed the BE testcases to add "-mbig" for little endian default
> targets because the compiler implicitly should be operating in big
> endian mode for other targets and the testcases should succeed.
> 
> For the LE testcases, I changed the target selector to
> "powrpc*-*-linux*" because that is the only PowerPC target that can
> operate as little endian.  I could not find a generic "le" target
> selector.

It is fully generic.  I added it in
  commit 89453706e0032f9a9c2107631873d9dad38dc14c
  Author: Segher Boessenkool <segher@kernel.crashing.org>
  Date:   Wed May 23 19:31:05 2018 +0200

      testsuite: Introduce be/le selectors

It is very useful, just like ilp32 / lp64 :-)

> powerpc*-*-linux* understands "-mlittle", so I left the
> dg-options clause because there is no need to separate out "-mlittle"
> for that subset of PowerPC targets.

Yes.  powerpc64le-linux is the only problematic thing (you can make
powerpc64-linux be LE as well, and in principle you can make
powerpc64le-linux be BE, that just doesn't work right now -- this all
*does* work for powerpc-linux and powerpcle-linux).

Implicit assumptions make everything harder.


Segher
  
Segher Boessenkool Feb. 11, 2022, 7:31 a.m. UTC | #10
Hi!

On Thu, Feb 10, 2022 at 04:28:02PM -0600, Bill Schmidt wrote:
> On 2/10/22 4:11 PM, Segher Boessenkool wrote:
> >> No, trunk has this, for example:
> >>
> >>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
> >>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
> > I see this on trunk:
> >
> >   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
> >     VCLZLSBB_V16QI vclzlsbb_v16qi {}
> >
> > Oh, you changed it?  Please fix it, then.
> 
> In a patch you approved, yes.

Yes, I missed it.  That is not an argument that it would be good or
should not be change.

> I don't really understand why you want
> it changed now.

Because it is wrong.

> You must not be looking at the most recent trunk revision.

Indeed I haven't been able to update master for a week or so, it does
not bootstrap, as we have talked about.

> >> Throughout the new builtin infrastructure, the defaults are set for
> >> little-endian, and the "endian" flag changes behavior for big-endian.
> > That is a big mistake.  There are many machine instructions  that are
> > *always* big-endian (most even!), and none that are always
> > little-endian.  So this should be fixed, sooner rather than later :-(
> 
> That does not seem like a good idea in stage 4 to me.  That requires
> yet another patch to reverse a bunch of other things unnecessarily.

Things that were added in stage 4, a few days ago even.  Things that are
broken and wrong.  Things I do not want to have to release with and deal
with all the pain of having broken released versions.

> This is a purely arbitrary choice.

No, it is not.  It flies in the face of consistency.

> The endian flag is only used when
> a built-in function must have one behavior for big-endian, and another
> behavior for little-endian.  Which one is chosen as the default is
> absolutely arbitrary.

The one that corresponds to the name should be the default.  I don't see
how you can argue otherwise.

> When we expand the built-in we will either
> accept the default or change to the other.  The existence of machine
> instructions that are only big-endian has nothing to do with the case;
> what matters is the existence of built-in functions that have two
> behaviors.

Everything in our backend is BE by default, just like everything in the
architecture is.  Yes, LE works almost as well (or just as well) in most
places, but everything is named assuming BE.  This consistency is hugely
important, without it the reader will not understand things as well and
as easily.

> >> That's something that should be fixed, I guess, but it's orthogonal
> >> to this patch.
> > Fixing it later is more work :-(
> >
> > Please at least open a bug report for it.
> 
> I can do that.

Thanks!

> > The other things need fixing before the patch is okay.
> 
> I'd ask you to reconsider, as explained above.

It is purely an implementation thing, and it is completely trivial to
do.  If you truly are afraid of breaking things (you should not be), it
is marginally acceptable to do this as the very first thing in stage 1.

Consistency matters.  Naming matters.  These shape how we think about
things.


Segher
  
Li, Pan2 via Gcc-patches Feb. 11, 2022, 1:55 p.m. UTC | #11
Fine.  I withdraw the patch request, and will remove my name from
the bugzilla.  Somebody else can deal with it.  I have more important
things to worry about.

Bill

On 2/11/22 1:31 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 10, 2022 at 04:28:02PM -0600, Bill Schmidt wrote:
>> On 2/10/22 4:11 PM, Segher Boessenkool wrote:
>>>> No, trunk has this, for example:
>>>>
>>>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>>>     VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
>>> I see this on trunk:
>>>
>>>   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
>>>     VCLZLSBB_V16QI vclzlsbb_v16qi {}
>>>
>>> Oh, you changed it?  Please fix it, then.
>> In a patch you approved, yes.
> Yes, I missed it.  That is not an argument that it would be good or
> should not be change.
>
>> I don't really understand why you want
>> it changed now.
> Because it is wrong.
>
>> You must not be looking at the most recent trunk revision.
> Indeed I haven't been able to update master for a week or so, it does
> not bootstrap, as we have talked about.
>
>>>> Throughout the new builtin infrastructure, the defaults are set for
>>>> little-endian, and the "endian" flag changes behavior for big-endian.
>>> That is a big mistake.  There are many machine instructions  that are
>>> *always* big-endian (most even!), and none that are always
>>> little-endian.  So this should be fixed, sooner rather than later :-(
>> That does not seem like a good idea in stage 4 to me.  That requires
>> yet another patch to reverse a bunch of other things unnecessarily.
> Things that were added in stage 4, a few days ago even.  Things that are
> broken and wrong.  Things I do not want to have to release with and deal
> with all the pain of having broken released versions.
>
>> This is a purely arbitrary choice.
> No, it is not.  It flies in the face of consistency.
>
>> The endian flag is only used when
>> a built-in function must have one behavior for big-endian, and another
>> behavior for little-endian.  Which one is chosen as the default is
>> absolutely arbitrary.
> The one that corresponds to the name should be the default.  I don't see
> how you can argue otherwise.
>
>> When we expand the built-in we will either
>> accept the default or change to the other.  The existence of machine
>> instructions that are only big-endian has nothing to do with the case;
>> what matters is the existence of built-in functions that have two
>> behaviors.
> Everything in our backend is BE by default, just like everything in the
> architecture is.  Yes, LE works almost as well (or just as well) in most
> places, but everything is named assuming BE.  This consistency is hugely
> important, without it the reader will not understand things as well and
> as easily.
>
>>>> That's something that should be fixed, I guess, but it's orthogonal
>>>> to this patch.
>>> Fixing it later is more work :-(
>>>
>>> Please at least open a bug report for it.
>> I can do that.
> Thanks!
>
>>> The other things need fixing before the patch is okay.
>> I'd ask you to reconsider, as explained above.
> It is purely an implementation thing, and it is completely trivial to
> do.  If you truly are afraid of breaking things (you should not be), it
> is marginally acceptable to do this as the very first thing in stage 1.
>
> Consistency matters.  Naming matters.  These shape how we think about
> things.
>
>
> Segher
  

Patch

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index 6270444ef70..b28ee02070a 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2678,12 +2678,12 @@  BU_P9V_64BIT_AV_X (STXVL,	"stxvl",	MISC)
 BU_P9V_64BIT_AV_X (XST_LEN_R,	"xst_len_r",	MISC)
 
 /* 1 argument vector functions added in ISA 3.0 (power9). */
-BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
-BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
-BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vclzlsbb_v4si)
-BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
-BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
-BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vctzlsbb_v4si)
+BU_P9V_AV_1 (VCLZLSBB_V16QI, "vclzlsbb_v16qi",	CONST,	vctzlsbb_v16qi)
+BU_P9V_AV_1 (VCLZLSBB_V8HI, "vclzlsbb_v8hi",	CONST,	vctzlsbb_v8hi)
+BU_P9V_AV_1 (VCLZLSBB_V4SI, "vclzlsbb_v4si",	CONST,	vctzlsbb_v4si)
+BU_P9V_AV_1 (VCTZLSBB_V16QI, "vctzlsbb_v16qi",	CONST,	vclzlsbb_v16qi)
+BU_P9V_AV_1 (VCTZLSBB_V8HI, "vctzlsbb_v8hi",	CONST,	vclzlsbb_v8hi)
+BU_P9V_AV_1 (VCTZLSBB_V4SI, "vctzlsbb_v4si",	CONST,	vclzlsbb_v4si)
 
 /* Built-in support for Power9 "VSU option" string operations includes
    new awareness of the "vector compare not equal" (vcmpneb, vcmpneb.,
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index ef20cb30388..27bb25fa4d8 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13221,6 +13221,36 @@  rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 	}
       break;
 
+    case P9V_BUILTIN_VCLZLSBB_V16QI:
+      if (BYTES_BIG_ENDIAN)
+	icode = CODE_FOR_vclzlsbb_v16qi;
+      break;
+
+    case P9V_BUILTIN_VCLZLSBB_V8HI:
+      if (BYTES_BIG_ENDIAN)
+	icode = CODE_FOR_vclzlsbb_v8hi;
+      break;
+
+    case P9V_BUILTIN_VCLZLSBB_V4SI:
+      if (BYTES_BIG_ENDIAN)
+	icode = CODE_FOR_vclzlsbb_v4si;
+      break;
+
+    case P9V_BUILTIN_VCTZLSBB_V16QI:
+      if (BYTES_BIG_ENDIAN)
+	icode = CODE_FOR_vctzlsbb_v16qi;
+      break;
+
+    case P9V_BUILTIN_VCTZLSBB_V8HI:
+      if (BYTES_BIG_ENDIAN)
+	icode = CODE_FOR_vctzlsbb_v8hi;
+      break;
+
+    case P9V_BUILTIN_VCTZLSBB_V4SI:
+      if (BYTES_BIG_ENDIAN)
+	icode = CODE_FOR_vctzlsbb_v4si;
+      break;
+
     default:
       break;
     }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
index 0faf233425e..e2f6b08bc0f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
@@ -1,6 +1,7 @@ 
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
 
 #include <altivec.h>
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c
index 201ed17e2fd..a9647e749ae 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c
@@ -1,6 +1,7 @@ 
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
 
 #include <altivec.h>
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
new file mode 100644
index 00000000000..6ee31a11aee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
+
+#include <altivec.h>
+
+int
+count_leading_zero_byte_bits (vector signed char *arg1_p)
+{
+  vector signed char arg_1 = *arg1_p;
+
+  return vec_cntlz_lsbb (arg_1);
+}
+
+/* { dg-final { scan-assembler "vctzlsbb" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
new file mode 100644
index 00000000000..6105091b016
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
+
+#include <altivec.h>
+
+int
+count_leading_zero_byte_bits (vector unsigned char *arg1_p)
+{
+  vector unsigned char arg_1 = *arg1_p;
+
+  return vec_cntlz_lsbb (arg_1);
+}
+
+/* { dg-final { scan-assembler "vctzlsbb" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c
index 70a398ac401..7ee2b6e87db 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c
@@ -1,6 +1,7 @@ 
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
 
 #include <altivec.h>
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c
index f6d41e3e728..d87a7903fce 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c
@@ -1,6 +1,7 @@ 
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-options "-mdejagnu-cpu=power9" } */
+/* { dg-additional-options "-mbig" { target powerpc64le-*-* } } */
 
 #include <altivec.h>
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
new file mode 100644
index 00000000000..a9245d8200c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
+
+#include <altivec.h>
+
+int
+count_trailing_zero_byte_bits (vector signed char *arg1_p)
+{
+  vector signed char arg_1 = *arg1_p;
+
+  return vec_cnttz_lsbb (arg_1);
+}
+
+/* { dg-final { scan-assembler "vclzlsbb" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c
new file mode 100644
index 00000000000..71fea5306c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -mlittle" } */
+
+#include <altivec.h>
+
+int
+count_trailing_zero_byte_bits (vector unsigned char *arg1_p)
+{
+  vector unsigned char arg_1 = *arg1_p;
+
+  return vec_cnttz_lsbb (arg_1);
+}
+
+/* { dg-final { scan-assembler "vclzlsbb" } } */