[2/2] rs6000: Refine integer comparison handlings in rs6000_emit_vector_compare

Message ID 247bf71b-e0ab-7cf7-098b-a106a0764301@linux.ibm.com
State New
Series [1/2] rs6000: Emit vector fp comparison directly in rs6000_emit_vector_compare

Commit Message

Kewen.Lin Nov. 16, 2022, 6:51 a.m. UTC
Hi,

The current handling in rs6000_emit_vector_compare is a bit
complicated to me, especially now that we emit the vector float
comparison insn with the given code directly.  This patch refines
the handling of the vector integer comparison operators: the
function is no longer recursive, and the helper function
rs6000_emit_vector_compare_inner is no longer needed.
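
For readers who want to see why one direct comparison plus an optional
operand swap and an optional bitwise NOT is enough, here is a minimal
sketch on scalar int32_t lanes.  It only models the identities; it is not
GCC code.  The names cmp_code, lane_eq, lane_gt, lane_compare,
lane_reference and check_all_codes are hypothetical and exist only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the signed rtx comparison codes.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_GT, CMP_GE, CMP_LT, CMP_LE };

/* The two comparisons assumed to be available directly.  Each yields
   an all-ones (-1) or all-zeros (0) lane mask.  */
static int32_t lane_eq (int32_t a, int32_t b) { return a == b ? -1 : 0; }
static int32_t lane_gt (int32_t a, int32_t b) { return a > b ? -1 : 0; }

/* Compute any of the six comparisons from EQ or GT, using at most one
   operand swap and one final bitwise NOT.  */
static int32_t
lane_compare (enum cmp_code rcode, int32_t op0, int32_t op1)
{
  bool swap_operands = false;
  bool need_invert = false;
  enum cmp_code code = rcode;

  switch (rcode)
    {
    case CMP_EQ:
    case CMP_GT:
      break;                    /* Available as is.  */
    case CMP_LT:                /* a < b   ==  b > a      */
      code = CMP_GT;
      swap_operands = true;
      break;
    case CMP_NE:                /* a != b  ==  ~(a == b)  */
      code = CMP_EQ;
      need_invert = true;
      break;
    case CMP_LE:                /* a <= b  ==  ~(a > b)   */
      code = CMP_GT;
      need_invert = true;
      break;
    case CMP_GE:                /* a >= b  ==  ~(b > a)   */
      code = CMP_GT;
      swap_operands = true;
      need_invert = true;
      break;
    }

  if (swap_operands)
    {
      int32_t tmp = op0;
      op0 = op1;
      op1 = tmp;
    }

  int32_t mask = code == CMP_EQ ? lane_eq (op0, op1) : lane_gt (op0, op1);
  return need_invert ? ~mask : mask;
}

/* Reference result computed with the plain C operator.  */
static int32_t
lane_reference (enum cmp_code rcode, int32_t a, int32_t b)
{
  bool r = false;
  switch (rcode)
    {
    case CMP_EQ: r = a == b; break;
    case CMP_NE: r = a != b; break;
    case CMP_GT: r = a > b; break;
    case CMP_GE: r = a >= b; break;
    case CMP_LT: r = a < b; break;
    case CMP_LE: r = a <= b; break;
    }
  return r ? -1 : 0;
}

/* Assert that the decomposition agrees with the reference for every
   code over a few boundary values.  */
static void
check_all_codes (void)
{
  static const int32_t vals[] = { INT32_MIN, -1, 0, 1, INT32_MAX };
  const int n = sizeof vals / sizeof vals[0];
  for (int c = CMP_EQ; c <= CMP_LE; c++)
    for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++)
        assert (lane_compare ((enum cmp_code) c, vals[i], vals[j])
                == lane_reference ((enum cmp_code) c, vals[i], vals[j]));
}
```

Calling check_all_codes () from a small main exercises every code.  The
unsigned codes LTU/LEU/GEU follow the same pattern with GTU in place of
GT.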

Bootstrapped and regtested on powerpc64-linux-gnu P7 and P8,
and powerpc64le-linux-gnu P9 and P10.

I'm going to push this later this week if there are no objections.

BR,
Kewen
-----
gcc/ChangeLog:

	* config/rs6000/rs6000.cc (rs6000_emit_vector_compare_inner): Remove.
	(rs6000_emit_vector_compare): Refine it by directly using the reversed
	or swapped code, to avoid the recursion.
---
 gcc/config/rs6000/rs6000.cc | 159 ++++++++----------------------------
 1 file changed, 34 insertions(+), 125 deletions(-)

--
2.27.0
  

Comments

Segher Boessenkool Nov. 16, 2022, 6:58 p.m. UTC | #1
Hi!

On Wed, Nov 16, 2022 at 02:51:04PM +0800, Kewen.Lin wrote:
> The current handlings in rs6000_emit_vector_compare is a bit
> complicated to me, especially after we emit vector float
> comparison insn with the given code directly.  This patch is
> to refine the handlings for vector integer comparison operators,
> it becomes not recursive, and we don't need the helper function
> rs6000_emit_vector_compare_inner any more.

That sounds nice.

>    /* In vector.md, we support all kinds of vector float point
>       comparison operators in a comparison rtl pattern, we can
>       just emit the comparison rtx insn directly here.  Besides,
>       we should have a centralized place to handle the possibility
> -     of raising invalid exception.  */
> -  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT)
> +     of raising invalid exception.  Also emit directly for vector
> +     integer comparison operators EQ/GT/GTU.  */
> +  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
> +      || rcode == EQ
> +      || rcode == GT
> +      || rcode == GTU)

The comment still says it handles FP only.  That would be best to keep
imo: add a separate block of code to handle the integer stuff you want
to add.  You get the same or better generated code, the compiler is
smart enough.  Code is for the user to read, and C is not a portable
assembler language.

This whole series needs to be factored better, it does way too many
things, and only marginally related things, at every step.  Or I don't
see it anyway :-)


Segher
  
Kewen.Lin Nov. 17, 2022, 7:52 a.m. UTC | #2
Hi Segher,

Thanks for the comments!

on 2022/11/17 02:58, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Nov 16, 2022 at 02:51:04PM +0800, Kewen.Lin wrote:
>> The current handlings in rs6000_emit_vector_compare is a bit
>> complicated to me, especially after we emit vector float
>> comparison insn with the given code directly.  This patch is
>> to refine the handlings for vector integer comparison operators,
>> it becomes not recursive, and we don't need the helper function
>> rs6000_emit_vector_compare_inner any more.
> 
> That sounds nice.
> 
>>    /* In vector.md, we support all kinds of vector float point
>>       comparison operators in a comparison rtl pattern, we can
>>       just emit the comparison rtx insn directly here.  Besides,
>>       we should have a centralized place to handle the possibility
>> -     of raising invalid exception.  */
>> -  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT)
>> +     of raising invalid exception.  Also emit directly for vector
>> +     integer comparison operators EQ/GT/GTU.  */
>> +  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
>> +      || rcode == EQ
>> +      || rcode == GT
>> +      || rcode == GTU)
> 
> The comment still says it handles FP only.  That would be best to keep
> imo: add a separate block of code to handle the integer stuff you want
> to add.  You get the same or better generated code, the compiler is
> smart enough.  Code is for the user to read, and C is not a portable
> assembler language.

OK, I'll make two separate blocks, one for FP and one for integer.  I
struggled a bit to update the comments in this hunk to cover the integer
comparisons; one could argue that both cases can share the same handling
once the condition is updated.

> 
> This whole series needs to be factored better, it does way too many
> things, and only marginally related things, at every step.  Or I don't
> see it anyway :-)

OK, my thinking was that patch 1/2 unifies the current vector float
comparison handling and patch 2/2 refines the remaining handling
for vector integer comparison.  I'm happy to factor it better; any
suggestions on concrete code are highly appreciated.  :)

btw, I constructed the test case below; the generated assembly is the
same before and after this patch.

#define GT(a, b) (((a) > (b)))
#define GE(a, b) (((a) >= (b)))
#define LT(a, b) (((a) < (b)))
#define LE(a, b) (((a) <= (b)))
#define EQ(a, b) (((a) == (b)))
#define NE(a, b) (((a) != (b)))

#define TEST_VECT(NAME, TYPE)                                                  \
  __attribute__ ((noipa)) void test_##NAME##_##TYPE (TYPE *x, TYPE *y,         \
                                                     int *res, int n)          \
  {                                                                            \
    for (int i = 0; i < n; i++)                                                \
      res[i] = NAME (x[i], y[i]);                                              \
  }

#include <stdint.h>

#define TEST(TYPE)                                                             \
  TEST_VECT (GT, TYPE)                                                         \
  TEST_VECT (GE, TYPE)                                                         \
  TEST_VECT (LT, TYPE)                                                         \
  TEST_VECT (LE, TYPE)                                                         \
  TEST_VECT (EQ, TYPE)                                                         \
  TEST_VECT (NE, TYPE)

TEST (int64_t)
TEST (uint64_t)
TEST (int32_t)
TEST (uint32_t)
TEST (int16_t)
TEST (uint16_t)
TEST (int8_t)
TEST (uint8_t)



BR,
Kewen
  
Segher Boessenkool Nov. 18, 2022, 3:18 p.m. UTC | #3
Hi!

On Thu, Nov 17, 2022 at 03:52:26PM +0800, Kewen.Lin wrote:
> on 2022/11/17 02:58, Segher Boessenkool wrote:
> > On Wed, Nov 16, 2022 at 02:51:04PM +0800, Kewen.Lin wrote:
> >>    /* In vector.md, we support all kinds of vector float point
> >>       comparison operators in a comparison rtl pattern, we can
> >>       just emit the comparison rtx insn directly here.  Besides,
> >>       we should have a centralized place to handle the possibility
> >> -     of raising invalid exception.  */
> >> -  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT)
> >> +     of raising invalid exception.  Also emit directly for vector
> >> +     integer comparison operators EQ/GT/GTU.  */
> >> +  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
> >> +      || rcode == EQ
> >> +      || rcode == GT
> >> +      || rcode == GTU)
> > 
> > The comment still says it handles FP only.  That would be best to keep
> > imo: add a separate block of code to handle the integer stuff you want
> > to add.  You get the same or better generated code, the compiler is
> > smart enough.  Code is for the user to read, and C is not a portable
> > assembler language.
> 
> OK, I'll make two blocks for FP and integer respectively.  I struggled
> a bit updating this hunk with comments for integer comparison
> consideration, someone could argue that both can share the same handling
> if updating the condition.

The golden rule of programming: if something is hard to explain, you
probably overcomplicated it.  Sometimes more code is much easier to
read, too.

> > This whole series needs to be factored better, it does way too many
> > things, and only marginally related things, at every step.  Or I don't
> > see it anyway :-)
> 
> OK, I was thinking patch 1/2 is to unify the current vector float
> comparison handlings, patch 2/2 is to refine the remaining handlings
> for vector integer comparison.  I'm pleased to factor it better, any
> suggestions on concrete code is highly appreciated.  :)

Often it helps to start with a patch (or patches) that only restructures
existing code, without changing what it does, so that the patch that
does change anything material is smaller and easier to read and review.
The (perhaps big) patch that doesn't change anything is easy to review
as well then, simply because it *says* it does not change anything, and
reviewing it boils down to verifying that is true.


Segher
  

Patch

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 56db12f08a0..21f4cda7b80 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -15639,169 +15639,78 @@  output_cbranch (rtx op, const char *label, int reversed, rtx_insn *insn)
   return string;
 }

-/* Return insn for VSX or Altivec comparisons.  */
-
-static rtx
-rs6000_emit_vector_compare_inner (enum rtx_code code, rtx op0, rtx op1)
-{
-  rtx mask;
-  machine_mode mode = GET_MODE (op0);
-
-  switch (code)
-    {
-    default:
-      break;
-
-    case GE:
-      if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
-	return NULL_RTX;
-      /* FALLTHRU */
-
-    case EQ:
-    case GT:
-    case GTU:
-      mask = gen_reg_rtx (mode);
-      emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, mode, op0, op1)));
-      return mask;
-    }
-
-  return NULL_RTX;
-}
-
 /* Emit vector compare for operands OP0 and OP1 using code RCODE.
-   DMODE is expected destination mode. This is a recursive function.  */
+   DMODE is expected destination mode.  */

 static rtx
 rs6000_emit_vector_compare (enum rtx_code rcode,
 			    rtx op0, rtx op1,
 			    machine_mode dmode)
 {
-  rtx mask;
   gcc_assert (VECTOR_UNIT_ALTIVEC_OR_VSX_P (dmode));
   gcc_assert (GET_MODE (op0) == GET_MODE (op1));
+  rtx mask = gen_reg_rtx (dmode);

   /* In vector.md, we support all kinds of vector float point
      comparison operators in a comparison rtl pattern, we can
      just emit the comparison rtx insn directly here.  Besides,
      we should have a centralized place to handle the possibility
-     of raising invalid exception.  */
-  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT)
+     of raising invalid exception.  Also emit directly for vector
+     integer comparison operators EQ/GT/GTU.  */
+  if (GET_MODE_CLASS (dmode) == MODE_VECTOR_FLOAT
+      || rcode == EQ
+      || rcode == GT
+      || rcode == GTU)
     {
-      mask = gen_reg_rtx (dmode);
       emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (rcode, dmode, op0, op1)));
       return mask;
     }

   bool swap_operands = false;
-  bool try_again = false;
-
-  /* See if the comparison works as is.  */
-  mask = rs6000_emit_vector_compare_inner (rcode, op0, op1);
-  if (mask)
-    return mask;
+  bool need_invert = false;
+  enum rtx_code code = UNKNOWN;

   switch (rcode)
     {
     case LT:
-      rcode = GT;
-      swap_operands = true;
-      try_again = true;
-      break;
     case LTU:
-      rcode = GTU;
+      code = swap_condition (rcode);
       swap_operands = true;
-      try_again = true;
       break;
     case NE:
-      /* Invert condition and try again.
-	 e.g., A != B becomes ~(A==B).  */
-      {
-	enum rtx_code rev_code;
-	enum insn_code nor_code;
-	rtx mask2;
-
-	rev_code = reverse_condition_maybe_unordered (rcode);
-	if (rev_code == UNKNOWN)
-	  return NULL_RTX;
-
-	nor_code = optab_handler (one_cmpl_optab, dmode);
-	if (nor_code == CODE_FOR_nothing)
-	  return NULL_RTX;
-
-	mask2 = rs6000_emit_vector_compare (rev_code, op0, op1, dmode);
-	if (!mask2)
-	  return NULL_RTX;
-
-	mask = gen_reg_rtx (dmode);
-	emit_insn (GEN_FCN (nor_code) (mask, mask2));
-	return mask;
-      }
+    case LE:
+    case LEU:
+      code = reverse_condition (rcode);
+      need_invert = true;
       break;
     case GE:
+      code = GT;
+      swap_operands = true;
+      need_invert = true;
+      break;
     case GEU:
-    case LE:
-    case LEU:
-      /* Try GT/GTU/LT/LTU OR EQ */
-      {
-	rtx c_rtx, eq_rtx;
-	enum insn_code ior_code;
-	enum rtx_code new_code;
-
-	switch (rcode)
-	  {
-	  case  GE:
-	    new_code = GT;
-	    break;
-
-	  case GEU:
-	    new_code = GTU;
-	    break;
-
-	  case LE:
-	    new_code = LT;
-	    break;
-
-	  case LEU:
-	    new_code = LTU;
-	    break;
-
-	  default:
-	    gcc_unreachable ();
-	  }
-
-	ior_code = optab_handler (ior_optab, dmode);
-	if (ior_code == CODE_FOR_nothing)
-	  return NULL_RTX;
-
-	c_rtx = rs6000_emit_vector_compare (new_code, op0, op1, dmode);
-	if (!c_rtx)
-	  return NULL_RTX;
-
-	eq_rtx = rs6000_emit_vector_compare (EQ, op0, op1, dmode);
-	if (!eq_rtx)
-	  return NULL_RTX;
-
-	mask = gen_reg_rtx (dmode);
-	emit_insn (GEN_FCN (ior_code) (mask, c_rtx, eq_rtx));
-	return mask;
-      }
+      code = GTU;
+      swap_operands = true;
+      need_invert = true;
       break;
     default:
-      return NULL_RTX;
+      gcc_unreachable ();
+      break;
     }

-  if (try_again)
-    {
-      if (swap_operands)
-	std::swap (op0, op1);
+  if (swap_operands)
+    std::swap (op0, op1);
+
+  emit_insn (gen_rtx_SET (mask, gen_rtx_fmt_ee (code, dmode, op0, op1)));

-      mask = rs6000_emit_vector_compare_inner (rcode, op0, op1);
-      if (mask)
-	return mask;
+  if (need_invert)
+    {
+      enum insn_code nor_code = optab_handler (one_cmpl_optab, dmode);
+      gcc_assert (nor_code != CODE_FOR_nothing);
+      emit_insn (GEN_FCN (nor_code) (mask, mask));
     }

-  /* You only get two chances.  */
-  return NULL_RTX;
+  return mask;
 }

 /* Emit vector conditional expression.  DEST is destination. OP_TRUE and