rs6000: Support vector float/double for vec_sldw

Message ID db2df66f-01b3-bf94-38b5-86eef28bc265@linux.ibm.com
State New

Commit Message

Bill Schmidt via Gcc-patches Jan. 21, 2022, 5:31 p.m. UTC
  Hi,

It was recently discovered that Clang supports a couple of variants of vec_sldw that
GCC does not.  After some discussion, we decided that these variants are reasonable,
and GCC will also support them.  This patch adds that support.

I updated an existing test and discovered it wasn't actually checking for generation
of the xxsldwi instruction, so I added that check as well.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this okay
for trunk?

Thanks!
Bill


2022-01-21  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-overload.def (VEC_SLDW): Add instances for
	vector float and vector double.

gcc/testsuite/
	* gcc.target/powerpc/builtins-4.c: Add two test variants.  Adjust
	assembler counts.
---
 gcc/config/rs6000/rs6000-overload.def         |  4 +++
 gcc/testsuite/gcc.target/powerpc/builtins-4.c | 34 +++++++++++++------
 2 files changed, 28 insertions(+), 10 deletions(-)
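
For readers unfamiliar with vec_sldw, the following is a minimal scalar sketch of what the builtin computes, assuming the usual description of xxsldwi: the two inputs are concatenated, shifted left by a whole number of 32-bit words, and the leftmost four words are kept. The names vec128_words and model_sldw are hypothetical and are not part of GCC or the patch. The model numbers words left to right and ignores little-endian element-order details. Because it works on raw words, the same model applies to the new vector float and vector double overloads, which only change how the 128 bits are typed.

```c
#include <stdint.h>

/* Hypothetical helper type, not part of GCC: a 128-bit vector viewed as
   four 32-bit words, numbered left to right.  */
typedef struct { uint32_t w[4]; } vec128_words;

/* Scalar model of vec_sldw (a, b, shift_words): concatenate A and B,
   shift left by SHIFT_WORDS words, keep the leftmost four words.  */
static vec128_words
model_sldw (vec128_words a, vec128_words b, unsigned shift_words)
{
  uint32_t concat[8];
  vec128_words r;
  unsigned i;

  for (i = 0; i < 4; i++)
    {
      concat[i] = a.w[i];
      concat[i + 4] = b.w[i];
    }

  /* Assumption of this model: only shifts 0..3 are meaningful, so the
     argument is masked to two bits rather than rejected.  */
  shift_words &= 3;

  for (i = 0; i < 4; i++)
    r.w[i] = concat[shift_words + i];

  return r;
}
```

With a = (1, 2, 3, 4) and b = (5, 6, 7, 8), a shift of 3 yields (4, 5, 6, 7) and a shift of 1 yields (2, 3, 4, 5). For vector double with a shift of 1, as in the new test, each result double is therefore assembled from halves of two adjacent input doubles.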
  

Comments

Segher Boessenkool Jan. 21, 2022, 5:47 p.m. UTC | #1
Hi!

On Fri, Jan 21, 2022 at 11:31:34AM -0600, Bill Schmidt wrote:
> It was recently discovered that Clang supports a couple of variants of vec_sldw that
> GCC does not.  After some discussion, we decided that these variants are reasonable,
> and GCC will also support them.  This patch adds that support.

As we discussed, this is reasonable only because we already allow
non-integer inputs (and outputs) for all(?) other permute class
instructions.

> I updated an existing test and discovered it wasn't actually checking for generation
> of the xxsldwi instruction, so I added that check as well.

It can always generate vsldoi instead, which is a strict superset (if
all registers used are VRs).  They will not likely be here, because
these are such simple functions, but that is a bit fragile.

> 	* gcc.target/powerpc/builtins-4.c: Add two test variants.  Adjust
> 	assembler counts.

Is there any justification for the new counts?

... Ah, it didn't count the sld's at all before.  Okay.

> @@ -161,6 +175,6 @@ test_sll_vuill_vuill_vuc (vector unsigned long long int x,
>  /* { dg-final { scan-assembler-times "xvnabssp"  1 } } */
>  /* { dg-final { scan-assembler-times "xvnabsdp"  1 } } */
>  /* { dg-final { scan-assembler-times "vslo"      4 } } */
> -/* { dg-final { scan-assembler-times "xxlor"     30 } } */
> +/* { dg-final { scan-assembler-times "xxlor"     32 } } */

This will need modification for the phase of the moon.  It also does not
count only xxlor insns (it matches xxlorc insns too, for example).

> +/* { dg-final { scan-assembler-times "xxsldwi"   10 } } */

Okay if you make this
  \mxxsldwi\M
or even
  \m(?:xxsldwi|vsldoi)\M

Thanks!


Segher
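
To illustrate the point about unanchored patterns, here is a small sketch in C that contrasts plain substring counting (what a bare "xxlor" pattern effectively does) with whole-word counting (what the Tcl \m and \M word-boundary anchors request). The helper names count_substring and count_whole_word are hypothetical and are not part of DejaGnu or the GCC testsuite. Word characters are assumed to be alphanumerics and underscore.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumption: a "word" character is alphanumeric or underscore.  */
static int
is_word_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Count every non-overlapping occurrence of NEEDLE in TEXT, the way an
   unanchored pattern such as "xxlor" would match.  */
static size_t
count_substring (const char *text, const char *needle)
{
  size_t n = 0, len = strlen (needle);
  const char *p = text;

  if (len == 0)
    return 0;
  while ((p = strstr (p, needle)) != NULL)
    {
      n++;
      p += len;
    }
  return n;
}

/* Count only occurrences that start and end on a word boundary, the way
   an anchored pattern such as {\mxxlor\M} would match.  */
static size_t
count_whole_word (const char *text, const char *needle)
{
  size_t n = 0, len = strlen (needle);
  const char *p = text;

  if (len == 0)
    return 0;
  while ((p = strstr (p, needle)) != NULL)
    {
      int starts_word = (p == text) || !is_word_char (p[-1]);
      int ends_word = !is_word_char (p[len]);

      if (starts_word && ends_word)
        {
          n++;
          p += len;
        }
      else
        p += 1;
    }
  return n;
}
```

On assembly text containing one xxlor and one xxlorc instruction, count_substring finds "xxlor" twice while count_whole_word finds it once, which is why the anchored form gives a count that does not drift when unrelated instructions with a shared prefix appear.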
  
Bill Schmidt via Gcc-patches Jan. 21, 2022, 7:15 p.m. UTC | #2
Thanks!  Pushed as r12-6806 with the testcase adjusted.

Bill

  

Patch

diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index dea6f5d4258..cdc703e9764 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -3405,6 +3405,10 @@ 
     XXSLDWI_2DI  XXSLDWI_VSLL
   vull __builtin_vec_sldw (vull, vull, const int);
     XXSLDWI_2DI  XXSLDWI_VULL
+  vf __builtin_vec_sldw (vf, vf, const int);
+    XXSLDWI_4SF  XXSLDWI_VF
+  vd __builtin_vec_sldw (vd, vd, const int);
+    XXSLDWI_2DF  XXSLDWI_VD
 
 [VEC_SLL, vec_sll, __builtin_vec_sll]
   vsc __builtin_vec_sll (vsc, vuc);
diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-4.c b/gcc/testsuite/gcc.target/powerpc/builtins-4.c
index 4e3b543f242..df012e9b7d6 100644
--- a/gcc/testsuite/gcc.target/powerpc/builtins-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/builtins-4.c
@@ -119,6 +119,18 @@  test_vul_sldw_vul_vul (vector unsigned long long x,
 	return vec_sldw (x, y, 3);
 }
 
+vector float
+test_vf_sldw_vf_vf (vector float x, vector float y)
+{
+  return vec_sldw (x, y, 3);
+}
+
+vector double
+test_vd_sldw_vd_vd (vector double x, vector double y)
+{
+  return vec_sldw (x, y, 1);
+}
+
 vector signed int long long
 test_sll_vsill_vsill_vuc (vector signed long long int x,
 			  vector unsigned char y)
@@ -146,14 +158,16 @@  test_sll_vuill_vuill_vuc (vector unsigned long long int x,
      test_slo_vsll_slo_vsll_vuc    1 vslo
      test_slo_vull_slo_vull_vsc    1 vslo
      test_slo_vull_slo_vull_vuc    1 vslo
-     test_vsc_sldw_vsc_vsc         1 xxlor
-     test_vuc_sldw_vuc_vuc         1 xxlor
-     test_vssi_sldw_vssi_vssi      1 xxlor
-     test_vusi_sldw_vusi_vusi      1 xxlor
-     test_vsi_sldw_vsi_vsi         1 xxlor
-     test_vui_sldw_vui_vui         1 xxlor
-     test_vsl_sldw_vsl_vsl         1 xxlor
-     test_vul_sldw_vul_vul         1 xxlor
+     test_vsc_sldw_vsc_vsc         1 xxlor, 1 xxsldwi
+     test_vuc_sldw_vuc_vuc         1 xxlor, 1 xxsldwi
+     test_vssi_sldw_vssi_vssi      1 xxlor, 1 xxsldwi
+     test_vusi_sldw_vusi_vusi      1 xxlor, 1 xxsldwi
+     test_vsi_sldw_vsi_vsi         1 xxlor, 1 xxsldwi
+     test_vui_sldw_vui_vui         1 xxlor, 1 xxsldwi
+     test_vsl_sldw_vsl_vsl         1 xxlor, 1 xxsldwi
+     test_vul_sldw_vul_vul         1 xxlor, 1 xxsldwi
+     test_vf_sldw_vf_vf            1 xxlor, 1 xxsldwi
+     test_vd_sldw_vd_vd            1 xxlor, 1 xxsldwi
      test_sll_vsill_vsill_vuc      1 vsl
      test_sll_vuill_vuill_vuc      1 vsl  */
 
@@ -161,6 +175,6 @@  test_sll_vuill_vuill_vuc (vector unsigned long long int x,
 /* { dg-final { scan-assembler-times "xvnabssp"  1 } } */
 /* { dg-final { scan-assembler-times "xvnabsdp"  1 } } */
 /* { dg-final { scan-assembler-times "vslo"      4 } } */
-/* { dg-final { scan-assembler-times "xxlor"     30 } } */
+/* { dg-final { scan-assembler-times "xxlor"     32 } } */
 /* { dg-final { scan-assembler-times {\mvsl\M}   5 } } */
-
+/* { dg-final { scan-assembler-times "xxsldwi"   10 } } */