rs6000: Fix a handful of 32-bit built-in function problems in the new support

Message ID 4792cfc9-59bb-15b9-4f45-67f1e2f9c037@linux.ibm.com
State New
Headers
Series rs6000: Fix a handful of 32-bit built-in function problems in the new support |

Commit Message

Li, Pan2 via Gcc-patches Nov. 9, 2021, 9:46 p.m. UTC
  Hi!  Some time ago I realized I hadn't tested the new builtin support against 32-bit
big-endian in quite a while.  When I did, I found a handful of errors that needed
correcting.
 - One builtin needs to be disabled for 32-bit.
 - One builtin needs to be restricted to 32-bit only.
 - One builtin used unsigned long when it needed unsigned long long.
 - Six builtins used unsigned long long when they needed unsigned long.
 - One test case needed its expected error message adjusted.
Otherwise things were fine.

Bootstrapped and tested on powerpc64le-linux-gnu and powerpc64-linux-gnu with no
regressions.  Is this okay for trunk?

Thanks!
Bill


2021-09-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
	(BPERMD): Flag as 32bit.
	(UNPACK_TD): Return unsigned long long instead of unsigned long.
	(SET_TEXASR): Pass unsigned long instead of unsigned long long.
	(SET_TEXASRU): Likewise.
	(SET_TFHAR): Likewise.
	(SET_TFIAR): Likewise.
	(TABORTDC): Likewise.
	(TABORTDCI): Likewise.
	* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
	handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.

gcc/testsuite/
	* gcc.target/powerpc/cmpb-3.c: Adjust error message.
---
 gcc/config/rs6000/rs6000-builtin-new.def  | 22 +++++++++++-----------
 gcc/config/rs6000/rs6000-call.c           |  9 ++++++---
 gcc/testsuite/gcc.target/powerpc/cmpb-3.c |  2 +-
 3 files changed, 18 insertions(+), 15 deletions(-)
  

Comments

Segher Boessenkool Nov. 10, 2021, 8:33 a.m. UTC | #1
On Tue, Nov 09, 2021 at 03:46:54PM -0600, Bill Schmidt wrote:
> Hi!  Some time ago I realized I hadn't tested the new builtin support against 32-bit
> big-endian in quite a while.  When I did, I found a handful of errors that needed
> correcting.
>  - One builtin needs to be disabled for 32-bit.
>  - One builtin needs to be restricted to 32-bit only.
>  - One builtin used unsigned long when it needed unsigned long long.
>  - Six builtins used unsigned long long when they needed unsigned long.
>  - One test case needed its expected error message adjusted.
> Otherwise things were fine.

> Bootstrapped and tested on powerpc64le-linux-gnu and powerpc64-linux-gnu with no
> regressions.

{-m32,-m64} for the latter, right?

> 	* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
> 	(BPERMD): Flag as 32bit.
> 	(UNPACK_TD): Return unsigned long long instead of unsigned long.
> 	(SET_TEXASR): Pass unsigned long instead of unsigned long long.
> 	(SET_TEXASRU): Likewise.
> 	(SET_TFHAR): Likewise.
> 	(SET_TFIAR): Likewise.
> 	(TABORTDC): Likewise.
> 	(TABORTDCI): Likewise.
> 	* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
> 	handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/cmpb-3.c: Adjust error message.


>    const signed long __builtin_bpermd (signed long, signed long);
> -    BPERMD bpermd_di {}
> +    BPERMD bpermd_di {32bit}

That is not what the old code does?

    case POWER7_BUILTIN_BPERMD:
      return rs6000_expand_binop_builtin (((TARGET_64BIT)
                                           ? CODE_FOR_bpermd_di
                                           : CODE_FOR_bpermd_si), exp, target);

> -  void __builtin_set_texasr (unsigned long long);
> +  void __builtin_set_texasr (unsigned long);
>      SET_TEXASR nothing {htm,htmspr}
>  
> -  void __builtin_set_texasru (unsigned long long);
> +  void __builtin_set_texasru (unsigned long);
>      SET_TEXASRU nothing {htm,htmspr}
>  
> -  void __builtin_set_tfhar (unsigned long long);
> +  void __builtin_set_tfhar (unsigned long);
>      SET_TFHAR nothing {htm,htmspr}
>  
> -  void __builtin_set_tfiar (unsigned long long);
> +  void __builtin_set_tfiar (unsigned long);
>      SET_TFIAR nothing {htm,htmspr}

This does not seem to be what the exiting code does, either?  Try with
-m32 -mpowerpc64 (it extends to 64 bit there, so the builtin does not
have long int as parameter, it has long long int).

> @@ -15758,6 +15759,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
>      {
>        if (fcode == RS6000_BIF_MFTB)
>  	icode = CODE_FOR_rs6000_mftb_si;
> +      else if (fcode == RS6000_BIF_BPERMD)
> +	icode = CODE_FOR_bpermd_si;
>        else
>  	gcc_unreachable ();
>      }

But you disabled it for 32 bit now, so huh.

> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
> @@ -8,7 +8,7 @@ void abort ();
>  long long int
>  do_compare (long long int a, long long int b)
>  {
> -  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_cmpb' is not supported in this compiler configuration" } */
> +  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' is not supported in 32-bit mode" } */
>  }

The original spelling is the correct one?


Segher
  
Li, Pan2 via Gcc-patches Nov. 10, 2021, 9:28 p.m. UTC | #2
On 11/10/21 2:33 AM, Segher Boessenkool wrote:
> On Tue, Nov 09, 2021 at 03:46:54PM -0600, Bill Schmidt wrote:
>> Hi!  Some time ago I realized I hadn't tested the new builtin support against 32-bit
>> big-endian in quite a while.  When I did, I found a handful of errors that needed
>> correcting.
>>  - One builtin needs to be disabled for 32-bit.
>>  - One builtin needs to be restricted to 32-bit only.
>>  - One builtin used unsigned long when it needed unsigned long long.
>>  - Six builtins used unsigned long long when they needed unsigned long.
>>  - One test case needed its expected error message adjusted.
>> Otherwise things were fine.
>> Bootstrapped and tested on powerpc64le-linux-gnu and powerpc64-linux-gnu with no
>> regressions.
> {-m32,-m64} for the latter, right?

Yes, indeed.
>
>> 	* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>> 	(BPERMD): Flag as 32bit.
>> 	(UNPACK_TD): Return unsigned long long instead of unsigned long.
>> 	(SET_TEXASR): Pass unsigned long instead of unsigned long long.
>> 	(SET_TEXASRU): Likewise.
>> 	(SET_TFHAR): Likewise.
>> 	(SET_TFIAR): Likewise.
>> 	(TABORTDC): Likewise.
>> 	(TABORTDCI): Likewise.
>> 	* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>> 	handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
>>
>> gcc/testsuite/
>> 	* gcc.target/powerpc/cmpb-3.c: Adjust error message.
>
>>    const signed long __builtin_bpermd (signed long, signed long);
>> -    BPERMD bpermd_di {}
>> +    BPERMD bpermd_di {32bit}
> That is not what the old code does?
>
>     case POWER7_BUILTIN_BPERMD:
>       return rs6000_expand_binop_builtin (((TARGET_64BIT)
>                                            ? CODE_FOR_bpermd_di
>                                            : CODE_FOR_bpermd_si), exp, target);

Sorry!  I mischaracterized what "32bit" implies.  It means we have to perform special
processing for 32-bit, not that it is restricted to 32-bit only.  So below, you see
the code added to handle it and subsitute the icode as required.

>> -  void __builtin_set_texasr (unsigned long long);
>> +  void __builtin_set_texasr (unsigned long);
>>      SET_TEXASR nothing {htm,htmspr}
>>  
>> -  void __builtin_set_texasru (unsigned long long);
>> +  void __builtin_set_texasru (unsigned long);
>>      SET_TEXASRU nothing {htm,htmspr}
>>  
>> -  void __builtin_set_tfhar (unsigned long long);
>> +  void __builtin_set_tfhar (unsigned long);
>>      SET_TFHAR nothing {htm,htmspr}
>>  
>> -  void __builtin_set_tfiar (unsigned long long);
>> +  void __builtin_set_tfiar (unsigned long);
>>      SET_TFIAR nothing {htm,htmspr}
> This does not seem to be what the exiting code does, either?  Try with
> -m32 -mpowerpc64 (it extends to 64 bit there, so the builtin does not
> have long int as parameter, it has long long int).

This uses a tfiar_t, which is a typedef for uintptr_t, so long int is appropriate.
This is necessary to make the HTM tests pass on 32-bit powerpc64.

>
>> @@ -15758,6 +15759,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
>>      {
>>        if (fcode == RS6000_BIF_MFTB)
>>  	icode = CODE_FOR_rs6000_mftb_si;
>> +      else if (fcode == RS6000_BIF_BPERMD)
>> +	icode = CODE_FOR_bpermd_si;
>>        else
>>  	gcc_unreachable ();
>>      }
> But you disabled it for 32 bit now, so huh.

See above for my former mis-explanation.  Oops...

>
>> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
>> @@ -8,7 +8,7 @@ void abort ();
>>  long long int
>>  do_compare (long long int a, long long int b)
>>  {
>> -  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_cmpb' is not supported in this compiler configuration" } */
>> +  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' is not supported in 32-bit mode" } */
>>  }
> The original spelling is the correct one?

This is something I have on my to-do list for the future, to see whether I
can improve it.  The overloaded function __builtin_cmpb gets translated to
the underlying non-overloaded builtin __builtin_p6_cmpb, and that's the only
name that's still around by the time we get to the error processing.  I want
to see whether I can add some infrastructure to recover the overloaded
function name in such cases.  Is it okay to defer this for now?

Thanks for the review!
Bill

>
>
> Segher
  
Segher Boessenkool Nov. 11, 2021, 1:11 p.m. UTC | #3
On Wed, Nov 10, 2021 at 03:28:18PM -0600, Bill Schmidt wrote:
> On 11/10/21 2:33 AM, Segher Boessenkool wrote:
> > On Tue, Nov 09, 2021 at 03:46:54PM -0600, Bill Schmidt wrote:
> >> 	* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
> >> 	(BPERMD): Flag as 32bit.

So, change this to something like "flag this as needing special handling
on 32 bit" or something?

> >> -  void __builtin_set_texasr (unsigned long long);
> >> +  void __builtin_set_texasr (unsigned long);
> >>      SET_TEXASR nothing {htm,htmspr}
> >>  
> >> -  void __builtin_set_texasru (unsigned long long);
> >> +  void __builtin_set_texasru (unsigned long);
> >>      SET_TEXASRU nothing {htm,htmspr}
> >>  
> >> -  void __builtin_set_tfhar (unsigned long long);
> >> +  void __builtin_set_tfhar (unsigned long);
> >>      SET_TFHAR nothing {htm,htmspr}
> >>  
> >> -  void __builtin_set_tfiar (unsigned long long);
> >> +  void __builtin_set_tfiar (unsigned long);
> >>      SET_TFIAR nothing {htm,htmspr}
> > This does not seem to be what the exiting code does, either?  Try with
> > -m32 -mpowerpc64 (it extends to 64 bit there, so the builtin does not
> > have long int as parameter, it has long long int).
> 
> This uses a tfiar_t, which is a typedef for uintptr_t, so long int is appropriate.
> This is necessary to make the HTM tests pass on 32-bit powerpc64.

void f(long x) { __builtin_set_texasr(x); }

built with -m32 -mpowerpc64 gives (in the expand dump):

void f (long int x)
{
  long long unsigned int _1;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _1 = (long long unsigned int) x_2(D);
  __builtin_set_texasr (_1); [tail call]
  return;
;;    succ:       EXIT

}

The builtins have a "long long" argument in the existing code, in this
configuration.  And this is not the same as "long" here.

> >> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
> >> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
> >> @@ -8,7 +8,7 @@ void abort ();
> >>  long long int
> >>  do_compare (long long int a, long long int b)
> >>  {
> >> -  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_cmpb' is not supported in this compiler configuration" } */
> >> +  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' is not supported in 32-bit mode" } */
> >>  }
> > The original spelling is the correct one?
> 
> This is something I have on my to-do list for the future, to see whether I
> can improve it.  The overloaded function __builtin_cmpb gets translated to
> the underlying non-overloaded builtin __builtin_p6_cmpb, and that's the only
> name that's still around by the time we get to the error processing.  I want
> to see whether I can add some infrastructure to recover the overloaded
> function name in such cases.  Is it okay to defer this for now?

It is fine to defer it.  It is not fine to change the testcase like
this.  The user did not write __builtin_p6_cmpb (which is not even
documented btw), so the compiler should not talk about that.  It is
fine to leave the test failing for now.


Segher
  
Li, Pan2 via Gcc-patches Nov. 11, 2021, 4:50 p.m. UTC | #4
Hi!

On 11/11/21 7:11 AM, Segher Boessenkool wrote:
> On Wed, Nov 10, 2021 at 03:28:18PM -0600, Bill Schmidt wrote:
>> On 11/10/21 2:33 AM, Segher Boessenkool wrote:
>>> On Tue, Nov 09, 2021 at 03:46:54PM -0600, Bill Schmidt wrote:
>>>> 	* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>>>> 	(BPERMD): Flag as 32bit.
> So, change this to something like "flag this as needing special handling
> on 32 bit" or something?

Sure.
>
>>>> -  void __builtin_set_texasr (unsigned long long);
>>>> +  void __builtin_set_texasr (unsigned long);
>>>>      SET_TEXASR nothing {htm,htmspr}
>>>>  
>>>> -  void __builtin_set_texasru (unsigned long long);
>>>> +  void __builtin_set_texasru (unsigned long);
>>>>      SET_TEXASRU nothing {htm,htmspr}
>>>>  
>>>> -  void __builtin_set_tfhar (unsigned long long);
>>>> +  void __builtin_set_tfhar (unsigned long);
>>>>      SET_TFHAR nothing {htm,htmspr}
>>>>  
>>>> -  void __builtin_set_tfiar (unsigned long long);
>>>> +  void __builtin_set_tfiar (unsigned long);
>>>>      SET_TFIAR nothing {htm,htmspr}
>>> This does not seem to be what the exiting code does, either?  Try with
>>> -m32 -mpowerpc64 (it extends to 64 bit there, so the builtin does not
>>> have long int as parameter, it has long long int).
>> This uses a tfiar_t, which is a typedef for uintptr_t, so long int is appropriate.
>> This is necessary to make the HTM tests pass on 32-bit powerpc64.
> void f(long x) { __builtin_set_texasr(x); }
>
> built with -m32 -mpowerpc64 gives (in the expand dump):
>
> void f (long int x)
> {
>   long long unsigned int _1;
>
> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   _1 = (long long unsigned int) x_2(D);
>   __builtin_set_texasr (_1); [tail call]
>   return;
> ;;    succ:       EXIT
>
> }
>
> The builtins have a "long long" argument in the existing code, in this
> configuration.  And this is not the same as "long" here.

Hm, strange.  I'll have to go back and revisit this.  Something subtle going on.

Thanks,
Bill

>
>>>> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
>>>> @@ -8,7 +8,7 @@ void abort ();
>>>>  long long int
>>>>  do_compare (long long int a, long long int b)
>>>>  {
>>>> -  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_cmpb' is not supported in this compiler configuration" } */
>>>> +  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' is not supported in 32-bit mode" } */
>>>>  }
>>> The original spelling is the correct one?
>> This is something I have on my to-do list for the future, to see whether I
>> can improve it.  The overloaded function __builtin_cmpb gets translated to
>> the underlying non-overloaded builtin __builtin_p6_cmpb, and that's the only
>> name that's still around by the time we get to the error processing.  I want
>> to see whether I can add some infrastructure to recover the overloaded
>> function name in such cases.  Is it okay to defer this for now?
> It is fine to defer it.  It is not fine to change the testcase like
> this.  The user did not write __builtin_p6_cmpb (which is not even
> documented btw), so the compiler should not talk about that.  It is
> fine to leave the test failing for now.
>
>
> Segher
  
Li, Pan2 via Gcc-patches Nov. 14, 2021, 2:17 p.m. UTC | #5
On 11/11/21 10:50 AM, Bill Schmidt wrote:
> On 11/11/21 7:11 AM, Segher Boessenkool wrote:
>> void f(long x) { __builtin_set_texasr(x); }
>>
>> built with -m32 -mpowerpc64 gives (in the expand dump):
>>
>> void f (long int x)
>> {
>>   long long unsigned int _1;
>>
>> ;;   basic block 2, loop depth 0
>> ;;    pred:       ENTRY
>>   _1 = (long long unsigned int) x_2(D);
>>   __builtin_set_texasr (_1); [tail call]
>>   return;
>> ;;    succ:       EXIT
>>
>> }
>>
>> The builtins have a "long long" argument in the existing code, in this
>> configuration.  And this is not the same as "long" here.
> Hm, strange.  I'll have to go back and revisit this.  Something subtle going on.
>
So, we have one of the more bizarre API situations here that I've ever seen.

We have three 64-bit HTM registers:  TEXASR, TFHAR, and TFIAR.  We also have the
32-bit TEXASRU, which is the upper half of TEXASR.  The documnted interfaces for
reading and modifying these registers are:

  unsigned long __builtin_get_texasr (void);
  unsigned long __builtin_get_texasru (void);
  unsigned long __builtin_get_tfhar (void);
  unsigned long __builtin_get_tfiar (void);

  void __builtin_set_texasr (unsigned long);
  void __builtin_set_texasru (unsigned long);
  void __builtin_set_tfhar (unsigned long);
  void __builtin_set_tfiar (unsigned long);

In reality, these interfaces are defined this way for pure 32-bit and pure 64-bit,
but for -m32 -mpowerpc64 we have some grotesque hackery that overrides the
expected interfaces to be:

  unsigned long long __builtin_get_texasr (void);
  unsigned long long __builtin_get_texasru (void);
  unsigned long long __builtin_get_tfhar (void);
  unsigned long long __builtin_get_tfiar (void);

  void __builtin_set_texasr (unsigned long long);
  void __builtin_set_texasru (unsigned long long);
  void __builtin_set_tfhar (unsigned long long);
  void __builtin_set_tfiar (unsigned long long);

An undocumented conditional API is a really, really bad idea, given that it
forces users of this interface for general code to #ifdef on the -m32
-mpowerpc64 condition.  Not to mention treating 32-bit registers the same as
64-bit ones, and only modifying half the register on a 32-bit system.  (Is HTM
even supported on a 32-bit system?)

It would have likely been better to have one consistent interface, using
int for TEXASRU and long long for the others, even though that requires
dealing with two registers for the 32-bit case; but that's all water under
the bridge.  We have what we have.

If I sound irritated, it's because, just for this case, I'll have to add a
bunch of extra machinery to track up to two prototypes for each builtin
function, and perform conditional initialization when it applies.  The one
good thing is that these already have a builtin attribute "htmspr" that I
can key off of to do the extra processing.

And somebody ought to fix the misleading documentation...

Thanks,
Bill
  
Segher Boessenkool Nov. 14, 2021, 3:29 p.m. UTC | #6
Hi!

On Sun, Nov 14, 2021 at 08:17:41AM -0600, Bill Schmidt wrote:
> On 11/11/21 10:50 AM, Bill Schmidt wrote:
> > On 11/11/21 7:11 AM, Segher Boessenkool wrote:
> >> void f(long x) { __builtin_set_texasr(x); }
> >>
> >> built with -m32 -mpowerpc64 gives (in the expand dump):
> >>
> >> void f (long int x)
> >> {
> >>   long long unsigned int _1;
> >>
> >> ;;   basic block 2, loop depth 0
> >> ;;    pred:       ENTRY
> >>   _1 = (long long unsigned int) x_2(D);
> >>   __builtin_set_texasr (_1); [tail call]
> >>   return;
> >> ;;    succ:       EXIT
> >>
> >> }
> >>
> >> The builtins have a "long long" argument in the existing code, in this
> >> configuration.  And this is not the same as "long" here.
> > Hm, strange.  I'll have to go back and revisit this.  Something subtle going on.
> >
> So, we have one of the more bizarre API situations here that I've ever seen.
> 
> We have three 64-bit HTM registers:  TEXASR, TFHAR, and TFIAR.  We also have the
> 32-bit TEXASRU, which is the upper half of TEXASR.  The documnted interfaces for
> reading and modifying these registers are:
> 
>   unsigned long __builtin_get_texasr (void);
>   unsigned long __builtin_get_texasru (void);
>   unsigned long __builtin_get_tfhar (void);
>   unsigned long __builtin_get_tfiar (void);
> 
>   void __builtin_set_texasr (unsigned long);
>   void __builtin_set_texasru (unsigned long);
>   void __builtin_set_tfhar (unsigned long);
>   void __builtin_set_tfiar (unsigned long);
> 
> In reality, these interfaces are defined this way for pure 32-bit and pure 64-bit,
> but for -m32 -mpowerpc64 we have some grotesque hackery that overrides the
> expected interfaces to be:
> 
>   unsigned long long __builtin_get_texasr (void);
>   unsigned long long __builtin_get_texasru (void);
>   unsigned long long __builtin_get_tfhar (void);
>   unsigned long long __builtin_get_tfiar (void);
> 
>   void __builtin_set_texasr (unsigned long long);
>   void __builtin_set_texasru (unsigned long long);
>   void __builtin_set_tfhar (unsigned long long);
>   void __builtin_set_tfiar (unsigned long long);

Yes.  Everything in -m32 -mpowerpc64 should follow the 32-bit ABI.  If
you consider these builtins part of the ABI (are they documented there?)
then this is simply a bug.

> An undocumented conditional API is a really, really bad idea, given that it
> forces users of this interface for general code to #ifdef on the -m32
> -mpowerpc64 condition.  Not to mention treating 32-bit registers the same as
> 64-bit ones, and only modifying half the register on a 32-bit system.  (Is HTM
> even supported on a 32-bit system?)

There are no pure 32 bit CPUs that implement HTM, to my knowledge.  But
of course HTM works fine with SF=0 (that is the reason TEXASRU exists!
Compare to TB and TBU).

> It would have likely been better to have one consistent interface, using
> int for TEXASRU and long long for the others, even though that requires
> dealing with two registers for the 32-bit case; but that's all water under
> the bridge.  We have what we have.

"long" for the others, actually.

TFHAR and TFIAR hold code addresses.  TEXASR gets only the low 32 bits
of the register read, that is why TEXASRU exists :-)

> If I sound irritated, it's because, just for this case, I'll have to add a
> bunch of extra machinery to track up to two prototypes for each builtin
> function, and perform conditional initialization when it applies.  The one
> good thing is that these already have a builtin attribute "htmspr" that I
> can key off of to do the extra processing.

Another option might be to finally fix this.  There still are shipping
CPUs that support HTM ;-)

And essentially no one uses -m32 -mpowerpc64 on Linux or AIX.  On Linux
because ucontext_t and jmp_buf do not deal with the high half of the
registers, and iiuc on AIX the kernel doesn't deal with it in context
switches even.  Darwin does use it, but afaik no one runs Darwin on a
CPU with HTM.

> And somebody ought to fix the misleading documentation...

Yes.

Do you want to fix this mess?  I will take a patch using "long" for
all these registers and builtins (just like we have for essentially all
other SPRs!)


Segher
  
Li, Pan2 via Gcc-patches Nov. 15, 2021, 2:32 p.m. UTC | #7
Hi Segher,

On 11/14/21 9:29 AM, Segher Boessenkool wrote:
> Hi!
>
> On Sun, Nov 14, 2021 at 08:17:41AM -0600, Bill Schmidt wrote:
>> On 11/11/21 10:50 AM, Bill Schmidt wrote:
>>> On 11/11/21 7:11 AM, Segher Boessenkool wrote:
>>>> void f(long x) { __builtin_set_texasr(x); }
>>>>
>>>> built with -m32 -mpowerpc64 gives (in the expand dump):
>>>>
>>>> void f (long int x)
>>>> {
>>>>   long long unsigned int _1;
>>>>
>>>> ;;   basic block 2, loop depth 0
>>>> ;;    pred:       ENTRY
>>>>   _1 = (long long unsigned int) x_2(D);
>>>>   __builtin_set_texasr (_1); [tail call]
>>>>   return;
>>>> ;;    succ:       EXIT
>>>>
>>>> }
>>>>
>>>> The builtins have a "long long" argument in the existing code, in this
>>>> configuration.  And this is not the same as "long" here.
>>> Hm, strange.  I'll have to go back and revisit this.  Something subtle going on.
>>>
>> So, we have one of the more bizarre API situations here that I've ever seen.
>>
>> We have three 64-bit HTM registers:  TEXASR, TFHAR, and TFIAR.  We also have the
>> 32-bit TEXASRU, which is the upper half of TEXASR.  The documnted interfaces for
>> reading and modifying these registers are:
>>
>>   unsigned long __builtin_get_texasr (void);
>>   unsigned long __builtin_get_texasru (void);
>>   unsigned long __builtin_get_tfhar (void);
>>   unsigned long __builtin_get_tfiar (void);
>>
>>   void __builtin_set_texasr (unsigned long);
>>   void __builtin_set_texasru (unsigned long);
>>   void __builtin_set_tfhar (unsigned long);
>>   void __builtin_set_tfiar (unsigned long);
>>
>> In reality, these interfaces are defined this way for pure 32-bit and pure 64-bit,
>> but for -m32 -mpowerpc64 we have some grotesque hackery that overrides the
>> expected interfaces to be:
>>
>>   unsigned long long __builtin_get_texasr (void);
>>   unsigned long long __builtin_get_texasru (void);
>>   unsigned long long __builtin_get_tfhar (void);
>>   unsigned long long __builtin_get_tfiar (void);
>>
>>   void __builtin_set_texasr (unsigned long long);
>>   void __builtin_set_texasru (unsigned long long);
>>   void __builtin_set_tfhar (unsigned long long);
>>   void __builtin_set_tfiar (unsigned long long);
> Yes.  Everything in -m32 -mpowerpc64 should follow the 32-bit ABI.  If
> you consider these builtins part of the ABI (are they documented there?)
> then this is simply a bug.
>
>> An undocumented conditional API is a really, really bad idea, given that it
>> forces users of this interface for general code to #ifdef on the -m32
>> -mpowerpc64 condition.  Not to mention treating 32-bit registers the same as
>> 64-bit ones, and only modifying half the register on a 32-bit system.  (Is HTM
>> even supported on a 32-bit system?)
> There are no pure 32 bit CPUs that implement HTM, to my knowledge.  But
> of course HTM works fine with SF=0 (that is the reason TEXASRU exists!
> Compare to TB and TBU).
>
>> It would have likely been better to have one consistent interface, using
>> int for TEXASRU and long long for the others, even though that requires
>> dealing with two registers for the 32-bit case; but that's all water under
>> the bridge.  We have what we have.
> "long" for the others, actually.
>
> TFHAR and TFIAR hold code addresses.  TEXASR gets only the low 32 bits
> of the register read, that is why TEXASRU exists :-)

Yes, right - That makes sense, and is what is currently documented.

>
>> If I sound irritated, it's because, just for this case, I'll have to add a
>> bunch of extra machinery to track up to two prototypes for each builtin
>> function, and perform conditional initialization when it applies.  The one
>> good thing is that these already have a builtin attribute "htmspr" that I
>> can key off of to do the extra processing.
> Another option might be to finally fix this.  There still are shipping
> CPUs that support HTM ;-)
>
> And essentially no one uses -m32 -mpowerpc64 on Linux or AIX.  On Linux
> because ucontext_t and jmp_buf do not deal with the high half of the
> registers, and iiuc on AIX the kernel doesn't deal with it in context
> switches even.  Darwin does use it, but afaik no one runs Darwin on a
> CPU with HTM.

Agreed, this seems like an odd use case from the beginning.

>
>> And somebody ought to fix the misleading documentation...
> Yes.
>
> Do you want to fix this mess?  I will take a patch using "long" for
> all these registers and builtins (just like we have for essentially all
> other SPRs!)

Sure!  In fact, that's what my existing patch does.

Thanks -- I will use the time that frees up to take a look at how to get
the overloaded builtin name the user wrote to show up on error messages,
instead of translating it to the specific builtin name.  Then I'll
repost the remaining pieces of the testsuite patch and the 32-bit patch
with all outstanding issues resolved.

Thanks again for all the help.

Bill

>
>
> Segher
  

Patch

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 1dd8f6b40b2..b9f7768c2f4 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -267,7 +267,7 @@ 
 ; Power6 builtins (ISA 2.05).
 [power6]
   const signed long __builtin_p6_cmpb (signed long, signed long);
-    CMPB cmpbdi3 {}
+    CMPB cmpbdi3 {no32bit}
 
   const signed int __builtin_p6_cmpb_32 (signed int, signed int);
     CMPB_32 cmpbsi3 {}
@@ -2015,7 +2015,7 @@ 
     ADDG6S addg6s {}
 
   const signed long __builtin_bpermd (signed long, signed long);
-    BPERMD bpermd_di {}
+    BPERMD bpermd_di {32bit}
 
   const unsigned int __builtin_cbcdtd (unsigned int);
     CBCDTD cbcdtd {}
@@ -2968,7 +2968,7 @@ 
   void __builtin_set_fpscr_drn (const int[0,7]);
     SET_FPSCR_DRN rs6000_set_fpscr_drn {}
 
-  const unsigned long __builtin_unpack_dec128 (_Decimal128, const int<1>);
+  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const int<1>);
     UNPACK_TD unpacktd {}
 
 
@@ -3023,27 +3023,27 @@ 
   unsigned long long __builtin_get_tfiar ();
     GET_TFIAR nothing {htm,htmspr}
 
-  void __builtin_set_texasr (unsigned long long);
+  void __builtin_set_texasr (unsigned long);
     SET_TEXASR nothing {htm,htmspr}
 
-  void __builtin_set_texasru (unsigned long long);
+  void __builtin_set_texasru (unsigned long);
     SET_TEXASRU nothing {htm,htmspr}
 
-  void __builtin_set_tfhar (unsigned long long);
+  void __builtin_set_tfhar (unsigned long);
     SET_TFHAR nothing {htm,htmspr}
 
-  void __builtin_set_tfiar (unsigned long long);
+  void __builtin_set_tfiar (unsigned long);
     SET_TFIAR nothing {htm,htmspr}
 
   unsigned int __builtin_tabort (unsigned int);
     TABORT tabort {htm,htmcr}
 
-  unsigned int __builtin_tabortdc (unsigned long long, unsigned long long, \
-                                   unsigned long long);
+  unsigned int __builtin_tabortdc (unsigned long, unsigned long, \
+                                   unsigned long);
     TABORTDC tabortdc {htm,htmcr}
 
-  unsigned int __builtin_tabortdci (unsigned long long, unsigned long long, \
-                                    unsigned long long);
+  unsigned int __builtin_tabortdci (unsigned long, unsigned long, \
+                                    unsigned long);
     TABORTDCI tabortdci {htm,htmcr}
 
   unsigned int __builtin_tabortwc (unsigned int, unsigned int, unsigned int);
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 83e1abb6118..800a3bb8a33 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -15732,9 +15732,10 @@  rs6000_expand_new_builtin (tree exp, rtx target,
     }
 
   if (bif_is_no32bit (*bifaddr) && TARGET_32BIT)
-    fatal_error (input_location,
-		 "%<%s%> is not supported in 32-bit mode",
-		 bifaddr->bifname);
+    {
+      error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname);
+      return const0_rtx;
+    }
 
   if (bif_is_cpu (*bifaddr))
     return new_cpu_expand_builtin (fcode, exp, target);
@@ -15758,6 +15759,8 @@  rs6000_expand_new_builtin (tree exp, rtx target,
     {
       if (fcode == RS6000_BIF_MFTB)
 	icode = CODE_FOR_rs6000_mftb_si;
+      else if (fcode == RS6000_BIF_BPERMD)
+	icode = CODE_FOR_bpermd_si;
       else
 	gcc_unreachable ();
     }
diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
index de111a80144..9c465de383f 100644
--- a/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/cmpb-3.c
@@ -8,7 +8,7 @@  void abort ();
 long long int
 do_compare (long long int a, long long int b)
 {
-  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_cmpb' is not supported in this compiler configuration" } */
+  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' is not supported in 32-bit mode" } */
 }
 
 void expect (long long int pattern, long long int value)