[x86] Support *testdi_not_doubleword during STV pass.

Message ID 043a01d89220$6bb82a70$43287f50$@nextmovesoftware.com
State New

Commit Message

Roger Sayle July 7, 2022, 4:41 p.m. UTC
  This patch fixes the current two FAILs of pr65105-5.c on x86 when
compiled with -m32.  These (temporary) breakages were fallout from my
patches to improve/upgrade (scalar) double word comparisons.
On mainline, the i386 backend currently represents a critical comparison
using (compare (and (not reg1) reg2) (const_int 0)), which was not
recognized by the STV pass's convertible_comparison_p.  This simple STV
patch adds support for this pattern (*testdi_not_doubleword) and
generates the vector pandn and ptest instructions expected by the
existing (failing) test case.
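
For illustration only, the comparison described above can be modelled in plain C.  This is a sketch, not code from the patch or from the test case: the function names are hypothetical, and the "wide" form only models the result that a pandn followed by a ptest is meant to decide, assuming the ptest checks whether the pandn result is all zeros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar double-word form: with -m32 a 64-bit value lives in two 32-bit
   words, so the test needs one ANDN per word and an OR of the results.  */
static bool
andn_is_zero_doubleword (uint32_t a_lo, uint32_t a_hi,
                         uint32_t b_lo, uint32_t b_hi)
{
  return ((~a_lo & b_lo) | (~a_hi & b_hi)) == 0;
}

/* Whole-value form: (compare (and (not a) b) (const_int 0)) evaluated on
   the full 64 bits at once, which is what the vector sequence models.  */
static bool
andn_is_zero_wide (uint64_t a, uint64_t b)
{
  return (~a & b) == 0;
}

/* Both forms must agree for any pair of inputs.  */
static bool
forms_agree (uint64_t a, uint64_t b)
{
  return andn_is_zero_doubleword ((uint32_t) a, (uint32_t) (a >> 32),
                                  (uint32_t) b, (uint32_t) (b >> 32))
         == andn_is_zero_wide (a, b);
}

/* Small self-check over a few hand-picked values.  */
static void
self_check (void)
{
  assert (andn_is_zero_wide (UINT64_MAX, 0x123));   /* ~a is zero.  */
  assert (!andn_is_zero_wide (0, 1));               /* bit 0 survives.  */
  assert (forms_agree (0x00000001FFFFFFFFULL, 0x0000000200000000ULL));
}
```

Calling self_check () exercises both forms; the point is only that the double-word and whole-value tests are equivalent, which is why the comparison can be moved to a vector register.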

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, where with --target_board=unix{-m32} there are two
fewer failures, and without, there are no new failures.
Ok for mainline?


2022-07-07  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/i386-features.cc (convert_compare): Add support
        for *testdi_not_doubleword pattern (i.e. "(compare (and (not ...")
        by generating a pandn followed by ptest.
        (convertible_comparison_p): Recognize both *cmpdi_doubleword and
        recent *testdi_not_doubleword comparison patterns.
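
As a rough model of the two shapes that convertible_comparison_p accepts after this change, the following sketch uses a toy expression tree.  Everything prefixed with toy_ or TOY_ is hypothetical and stands in for GCC's real rtx, REG_P, MEM_P, CONST_INT_P and GET_MODE; it is not the backend code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes and machine modes (assumptions).  */
enum toy_code { TOY_REG, TOY_MEM, TOY_CONST_INT, TOY_AND, TOY_NOT };
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode mode;
  long value;                   /* Meaningful only for TOY_CONST_INT.  */
  const struct toy_rtx *op0;    /* First operand, or NULL.  */
  const struct toy_rtx *op1;    /* Second operand, or NULL.  */
};

/* A register or memory operand of the expected mode.  */
static bool
toy_reg_or_mem_p (const struct toy_rtx *x, enum toy_mode mode)
{
  return (x->code == TOY_REG || x->code == TOY_MEM) && x->mode == mode;
}

/* Operand accepted by the plain double-word compare shape.  */
static bool
toy_cmp_operand_p (const struct toy_rtx *x, enum toy_mode mode)
{
  return x->code == TOY_CONST_INT || toy_reg_or_mem_p (x, mode);
}

static bool
toy_convertible_comparison_p (const struct toy_rtx *op1,
                              const struct toy_rtx *op2,
                              enum toy_mode mode)
{
  /* Shape 1: compare of constants, registers or memory.  */
  if (toy_cmp_operand_p (op1, mode) && toy_cmp_operand_p (op2, mode))
    return true;

  /* Shape 2: (and (not x) y) compared against zero.  */
  if (op2->code == TOY_CONST_INT && op2->value == 0
      && op1->code == TOY_AND && op1->op0->code == TOY_NOT)
    return toy_reg_or_mem_p (op1->op0->op0, mode)
           && toy_reg_or_mem_p (op1->op1, mode);

  return false;
}

/* Build (and (not reg) mem) against zero and check it is accepted.  */
static void
toy_demo (void)
{
  struct toy_rtx zero = { TOY_CONST_INT, TOY_VOIDmode, 0, NULL, NULL };
  struct toy_rtx reg = { TOY_REG, TOY_DImode, 0, NULL, NULL };
  struct toy_rtx mem = { TOY_MEM, TOY_DImode, 0, NULL, NULL };
  struct toy_rtx not_reg = { TOY_NOT, TOY_DImode, 0, &reg, NULL };
  struct toy_rtx andn = { TOY_AND, TOY_DImode, 0, &not_reg, &mem };
  assert (toy_convertible_comparison_p (&andn, &zero, TOY_DImode));
}
```

The predicate returns true as soon as either shape matches and false otherwise, mirroring the restructuring in the diff from "reject on mismatch" to "accept on match".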


Thanks in advance,
Roger
--
  

Comments

Uros Bizjak July 8, 2022, 6:48 a.m. UTC | #1
On Thu, Jul 7, 2022 at 6:41 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch fixes the current two FAILs of pr65105-5.c on x86 when
> compiled with -m32.  These (temporary) breakages were fallout from my
> patches to improve/upgrade (scalar) double word comparisons.
> On mainline, the i386 backend currently represents a critical comparison
> using (compare (and (not reg1) reg2) (const_int 0)) which isn't/wasn't
> recognized by the STV pass' convertible_comparison_p.  This simple STV
> patch adds support for this pattern (*testdi_not_doubleword) and
> generates the vector pandn and ptest instructions expected in the
> existing (failing) test case.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, where with --target_board=unix{-m32} there are two
> fewer failures, and without, there are no new failures.
> Ok for mainline?
>
>
> 2022-07-07  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386-features.cc (convert_compare): Add support
>         for *testdi_not_doubleword pattern (i.e. "(compare (and (not ...")
>         by generating a pandn followed by ptest.
>         (convertible_comparison_p): Recognize both *cmpdi_doubleword and
>         recent *testdi_not_doubleword comparison patterns.

OK, I think this is the correct approach to ANDN handling.

Thanks,
Uros.

>
> Thanks in advance,
> Roger
> --
>
  

Patch

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index be38586..a7bd172 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -938,10 +938,10 @@  general_scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn)
 {
   rtx tmp = gen_reg_rtx (vmode);
   rtx src;
-  convert_op (&op1, insn);
   /* Comparison against anything other than zero, requires an XOR.  */
   if (op2 != const0_rtx)
     {
+      convert_op (&op1, insn);
       convert_op (&op2, insn);
       /* If both operands are MEMs, explicitly load the OP1 into TMP.  */
       if (MEM_P (op1) && MEM_P (op2))
@@ -953,8 +953,25 @@  general_scalar_chain::convert_compare (rtx op1, rtx op2, rtx_insn *insn)
 	src = op1;
       src = gen_rtx_XOR (vmode, src, op2);
     }
+  else if (GET_CODE (op1) == AND
+	   && GET_CODE (XEXP (op1, 0)) == NOT)
+    {
+      rtx op11 = XEXP (XEXP (op1, 0), 0);
+      rtx op12 = XEXP (op1, 1);
+      convert_op (&op11, insn);
+      convert_op (&op12, insn);
+      if (MEM_P (op11))
+	{
+	  emit_insn_before (gen_rtx_SET (tmp, op11), insn);
+	  op11 = tmp;
+	}
+      src = gen_rtx_AND (vmode, gen_rtx_NOT (vmode, op11), op12);
+    }
   else
-    src = op1;
+    {
+      convert_op (&op1, insn);
+      src = op1;
+    }
   emit_insn_before (gen_rtx_SET (tmp, src), insn);
 
   if (vmode == V2DImode)
@@ -1399,17 +1416,29 @@  convertible_comparison_p (rtx_insn *insn, enum machine_mode mode)
   rtx op1 = XEXP (src, 0);
   rtx op2 = XEXP (src, 1);
 
-  if (!CONST_INT_P (op1)
-      && ((!REG_P (op1) && !MEM_P (op1))
-	  || GET_MODE (op1) != mode))
-    return false;
-
-  if (!CONST_INT_P (op2)
-      && ((!REG_P (op2) && !MEM_P (op2))
-	  || GET_MODE (op2) != mode))
-    return false;
+  /* *cmp<dwi>_doubleword.  */
+  if ((CONST_INT_P (op1)
+       || ((REG_P (op1) || MEM_P (op1))
+           && GET_MODE (op1) == mode))
+      && (CONST_INT_P (op2)
+	  || ((REG_P (op2) || MEM_P (op2))
+	      && GET_MODE (op2) == mode)))
+    return true;
+
+  /* *test<dwi>_not_doubleword.  */
+  if (op2 == const0_rtx
+      && GET_CODE (op1) == AND
+      && GET_CODE (XEXP (op1, 0)) == NOT)
+    {
+      rtx op11 = XEXP (XEXP (op1, 0), 0);
+      rtx op12 = XEXP (op1, 1);
+      return (REG_P (op11) || MEM_P (op11))
+	     && (REG_P (op12) || MEM_P (op12))
+	     && GET_MODE (op11) == mode
+	     && GET_MODE (op12) == mode;
+    }
 
-  return true;
+  return false;
 }
 
 /* The general version of scalar_to_vector_candidate_p.  */