rs6000: Builtin test changes for int_128bit-runnable.c

Message ID 1f9aab5c-ee1b-a676-551c-3f7e4671c26b@linux.ibm.com
State New
Headers
Series rs6000: Builtin test changes for int_128bit-runnable.c |

Commit Message

Li, Pan2 via Gcc-patches Nov. 18, 2021, 4:09 p.m. UTC
  Hi!  This patch is broken out from the test case patch for the new builtins 
support.

The old builtins code performs gimple folding on 128-bit compares.  This
results in correct but very inefficient code.  (I suspect we may be
missing some optab entries, misleading gimple into 64-bit emulation.)  In
the new support, I disabled this gimple folding, which results in us
directly generating the 128-bit comparison instructions.  This patch
adjusts the scan-assembler-times counts for those instructions.

I've opened PR103316 to track this.

Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
no regressions.  Is this okay for trunk?

Thanks!
Bill


2021-11-17  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/testsuite/
	* gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
	counts since we do better by not gimple-folding some builtins.
---
 gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
  

Comments

Segher Boessenkool Nov. 18, 2021, 4:22 p.m. UTC | #1
On Thu, Nov 18, 2021 at 10:09:53AM -0600, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test case patch for the new builtins 
> support.
> 
> The old builtins code performs gimple folding on 128-bit compares.  This
> results in correct but very inefficient code.  (I suspect we may be
> missing some optab entries, misleading gimple into 64-bit emulation.)

It is sub-optimal if Gimple ever does this: it is better done later, at
expand time.

> In
> the new support, I disabled this gimple folding, which results in us
> directly generating the 128-bit comparison instructions.  This patch
> adjusts the scan-assembler-times counts for those instructions.
> 
> I've opened PR103316 to track this.

Thanks.

So when the generic code wisens up this testcase will still pass?  But
you do then need to re-introduce the folding here?

> gcc/testsuite/
> 	* gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
> 	counts since we do better by not gimple-folding some builtins.

Wrap later please?  80 chars is fine, 79 chars is fine, 10 chars or 70
chars is not :-(

(Not that it matters much *here* of course; it just annoys me

Also, s/ since.*/./ please.  Changelogs say what changed, not why, and
the "why" you say here is only half of the story, pretty misleading for
future archaeologists.

> --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> @@ -11,9 +11,9 @@
>  /* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */
>  /* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */
>  /* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */
> -/* { dg-final { scan-assembler-times {\mvcmpequq\M} 16 } } */
> -/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 16 } } */
> -/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 16 } } */
> +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 24 } } */
> +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 26 } } */
> +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 26 } } */
>  /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */
>  /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */
>  /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */

If you think it actually generates better code now, and this is expected
code, then okay for trunk.  Thanks!


Segher
  
Li, Pan2 via Gcc-patches Nov. 18, 2021, 4:31 p.m. UTC | #2
Hi!

On 11/18/21 10:22 AM, Segher Boessenkool wrote:
> On Thu, Nov 18, 2021 at 10:09:53AM -0600, Bill Schmidt wrote:
>> Hi!  This patch is broken out from the test case patch for the new builtins 
>> support.
>>
>> The old builtins code performs gimple folding on 128-bit compares.  This
>> results in correct but very inefficient code.  (I suspect we may be
>> missing some optab entries, misleading gimple into 64-bit emulation.)
> It is sub-optimal if Gimple ever does this: it is better done later, at
> expand time.
>
>> In
>> the new support, I disabled this gimple folding, which results in us
>> directly generating the 128-bit comparison instructions.  This patch
>> adjusts the scan-assembler-times counts for those instructions.
>>
>> I've opened PR103316 to track this.
> Thanks.
>
> So when the generic code wisens up this testcase will still pass?  But
> you do then need to re-introduce the folding here?

Right.  If we fix the generic code, the test will still pass.  We need
to re-introduce the folding to leverage it, and will only do that if
the test still passes.  We always want these single-instruction
comparisons to fall out for simple tests like these.

>
>> gcc/testsuite/
>> 	* gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
>> 	counts since we do better by not gimple-folding some builtins.
> Wrap later please?  80 chars is fine, 79 chars is fine, 10 chars or 70
> chars is not :-(
>
> (Not that it matters much *here* of course; it just annoys me

A slight argument in favor of earlier wrapping:  With Git, the ChangeLog
entries in the commit messages get indented, so wrapping a little
earlier makes those much easier to read.  That's why I started reducing
the length of my entries a little.  Not a big deal either way, but it's
really noticeable in git log output.

>
> Also, s/ since.*/./ please.  Changelogs say what changed, not why, and
> the "why" you say here is only half of the story, pretty misleading for
> future archaeologists.

Good call.
>
>> --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
>> @@ -11,9 +11,9 @@
>>  /* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */
>>  /* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */
>>  /* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */
>> -/* { dg-final { scan-assembler-times {\mvcmpequq\M} 16 } } */
>> -/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 16 } } */
>> -/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 16 } } */
>> +/* { dg-final { scan-assembler-times {\mvcmpequq\M} 24 } } */
>> +/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 26 } } */
>> +/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 26 } } */
>>  /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */
>>  /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */
>>  /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */
> If you think it actually generates better code now, and this is expected
> code, then okay for trunk.  Thanks!

Thanks very much for the review!
Bill
>
>
> Segher
  

Patch

diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
index 1255ee9f0ab..1356793635a 100644
--- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
@@ -11,9 +11,9 @@ 
 /* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mvcmpequq\M} 16 } } */
-/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 16 } } */
-/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 16 } } */
+/* { dg-final { scan-assembler-times {\mvcmpequq\M} 24 } } */
+/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 26 } } */
+/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 26 } } */
 /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */