[rs6000] Use CC for BCD operations [PR100736]

Message ID 887ce9bb-3a34-b880-41b1-4ac2cfb743eb@linux.ibm.com
State New
Headers
Series [rs6000] Use CC for BCD operations [PR100736] |

Commit Message

HAO CHEN GUI June 16, 2022, 6:36 a.m. UTC
  Hi,
  This patch uses CC instead of CCFP for all BCD operations. Thus, infinite
math flag has no impact on BCD operations. To support BCD overflow and
invalid coding, ordered and unordered are added into CC mode. With CC,
"ge" and "le" are converted to reverse comparison. So the invalid coding
needs to be tested separately.

  This patch also replaces bcdadd with bcdsub for BCD invaliding coding
expand. The bcdsub with two identical numbers doesn't cause overflow while
bcdadd could.

  Another patch which creates a dedicated CC mode for BCD operations is
ready. With this patch, we don't need ordered and unordered in CC. Please
advice if I can submit it.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

2022-06-16 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	PR target/100736
	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP with
	CC
	(*bcd<bcd_add_sub>_test_<mode>): Replace CCFP with CC.  Generate
	condition insn with CC mode.
	(*bcdinvalid_<mode>): Replace CCFP with CC.  Replace bcdadd. with
	bcdsub.
	(bcdinvalid_<mode>): Replace CCFP with CC.
	(bcdshift_v16qi): Likewise.
	(bcdmul10_v16qi): Likewise.
	(bcddiv10_v16qi): Likewise.
	(peephole for bcd_add/sub): Likewise.
	* config/rs6000/predicates.md (branch_comparison_operator): Add
	ordered and unordered in CC mode.
	* config/rs6000/rs6000.cc (validate_condition_mode): Likewise.

gcc/testsuite/
	PR target/100736
	* gcc.target/powerpc/bcd-4.c: Adjust number of bcdadd and bcdsub.
	Scan no cror insns.

patch.diff
  

Comments

Segher Boessenkool June 16, 2022, 11:24 a.m. UTC | #1
Hi!

On Thu, Jun 16, 2022 at 02:36:53PM +0800, HAO CHEN GUI wrote:
>   This patch uses CC instead of CCFP for all BCD operations. Thus, infinite
> math flag has no impact on BCD operations. To support BCD overflow and
> invalid coding, ordered and unordered are added into CC mode.

This is wrong.  Unordered is an FP concept.  What the BCD insns do is
write to bit 3 in a CR field, which for FP comparisons is the
"unordered" comparison result (FP comparisons result in one of four
options: lt, gt, eq, or un).  For integer comparisons there are only
three possible comparison results (lt, gt, and eq), and bit 3 is set to
a copy of XER[SO], the "summary overflow" bit.

The BCD insns have the one-out-of-three comparison result as well, and
they set bit 3 if overflow happened.  There is the extra gotcha that it
sets the four condition field bits to 0001 if there is an invalidly
coded input, but we can ignore that: it is not supposed to happen.

There is no normal way to get at bit 3 of a CR field.  We can use some
unspec though, which is total cheating but it does work, and it is
safe, albeit sometimes suboptimal.

> With CC,
> "ge" and "le" are converted to reverse comparison. So the invalid coding
> needs to be tested separately.

Bit 3 has no meaning in integer comparisons, in GCC, it is not modeled.
We handle this with some unspec usually.  We need to for vector insns
that set a CR field for example: they can set more than one bit to 1, or
all bits to 0, neither of which is valid in any MODE_CC we have.

We should add proper CC modes for all the ways that instructions can set
the CR field bits.  But this is much less trivial than it may seem, so
I'd like the PR to be fixed in a simple way first (a way that can be
backported, too!)

> 	* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP with
> 	CC

Sentences end in a full stop.  I'd say
	....................................................: Replace CCFP
	with CC.
so that it doesn't look as silly / strange :-)

> +  rtx cr6 = gen_rtx_REG (CCmode, CR6_REGNO);
> +  rtx condition_rtx = gen_rtx_<CODE> (SImode, cr6, const0_rtx);
> +  rtx_code cond_code = GET_CODE (condition_rtx);

That GET_CODE always would return <CODE>, fwiw.

You shouldn't need anything like this, bcdinvalid will work just fine if
written as bcdadd_ov (with vector of 0 as the second op)?


Segher
  
HAO CHEN GUI June 17, 2022, 8:19 a.m. UTC | #2
Hi,

On 16/6/2022 下午 7:24, Segher Boessenkool wrote:
> There is no normal way to get at bit 3 of a CR field.  We can use some
> unspec though, which is total cheating but it does work, and it is
> safe, albeit sometimes suboptimal.

Thanks so much for your advice. I will use an unspec for setting reg from
the BCD overflow bit.
> 
> You shouldn't need anything like this, bcdinvalid will work just fine if
> written as bcdadd_ov (with vector of 0 as the second op)?

The vector of 0 is not equal to BCD 0, I think. The BCD number contains
preferred sign (PS) bit. So all zeros itself is an invalid encoding. We may
use bcdsub_ov with duplicated op to implement bcdinvalid.
  
Segher Boessenkool June 17, 2022, 6:08 p.m. UTC | #3
Hi!

On Fri, Jun 17, 2022 at 04:19:37PM +0800, HAO CHEN GUI wrote:
> On 16/6/2022 下午 7:24, Segher Boessenkool wrote:
> > You shouldn't need anything like this, bcdinvalid will work just fine if
> > written as bcdadd_ov (with vector of 0 as the second op)?
> 
> The vector of 0 is not equal to BCD 0, I think. The BCD number contains
> preferred sign (PS) bit. So all zeros itself is an invalid encoding. We may
> use bcdsub_ov with duplicated op to implement bcdinvalid.

For the machine, you should use 0x0c or 0x0f, sure.  But in RTL we can
do whatever we want.

But bcdsub is easier indeed, and we don't need to construct a 0 first
then.


Segher
  

Patch

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index efc8ae35c2e..5ffbab17a9e 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -4379,7 +4379,7 @@  (define_insn "bcd<bcd_add_sub>_<mode>"
 		      (match_operand:VBCD 2 "register_operand" "v")
 		      (match_operand:QI 3 "const_0_to_1_operand" "n")]
 		     UNSPEC_BCD_ADD_SUB))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P8_VECTOR"
   "bcd<bcd_add_sub>. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
@@ -4389,9 +4389,9 @@  (define_insn "bcd<bcd_add_sub>_<mode>"
 ;; UNORDERED test on an integer type (like V1TImode) is not defined.  The type
 ;; probably should be one that can go in the VMX (Altivec) registers, so we
 ;; can't use DDmode or DFmode.
-(define_insn "*bcd<bcd_add_sub>_test_<mode>"
-  [(set (reg:CCFP CR6_REGNO)
-	(compare:CCFP
+(define_insn "bcd<bcd_add_sub>_test_<mode>"
+  [(set (reg:CC CR6_REGNO)
+	(compare:CC
 	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
 		       (match_operand:VBCD 2 "register_operand" "v")
 		       (match_operand:QI 3 "const_0_to_1_operand" "i")]
@@ -4408,8 +4408,8 @@  (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
 		      (match_operand:VBCD 2 "register_operand" "v")
 		      (match_operand:QI 3 "const_0_to_1_operand" "i")]
 		     UNSPEC_BCD_ADD_SUB))
-   (set (reg:CCFP CR6_REGNO)
-	(compare:CCFP
+   (set (reg:CC CR6_REGNO)
+	(compare:CC
 	 (unspec:V2DF [(match_dup 1)
 		       (match_dup 2)
 		       (match_dup 3)]
@@ -4502,8 +4502,8 @@  (define_insn "vclrrb"
    [(set_attr "type" "vecsimple")])

 (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
-  [(parallel [(set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+  [(parallel [(set (reg:CC CR6_REGNO)
+		   (compare:CC
 		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")
 				  (match_operand:VBCD 2 "register_operand")
 				  (match_operand:QI 3 "const_0_to_1_operand")]
@@ -4511,33 +4511,56 @@  (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
 		    (match_dup 4)))
 	      (clobber (match_scratch:VBCD 5))])
    (set (match_operand:SI 0 "register_operand")
-	(BCD_TEST:SI (reg:CCFP CR6_REGNO)
+	(BCD_TEST:SI (reg:CC CR6_REGNO)
 		     (const_int 0)))]
   "TARGET_P8_VECTOR"
 {
   operands[4] = CONST0_RTX (V2DFmode);
+  emit_insn (gen_bcd<bcd_add_sub>_test_<mode> (operands[0], operands[1],
+					       operands[2], operands[3],
+					       operands[4]));
+
+  rtx cr6 = gen_rtx_REG (CCmode, CR6_REGNO);
+  rtx condition_rtx = gen_rtx_<CODE> (SImode, cr6, const0_rtx);
+  rtx_code cond_code = GET_CODE (condition_rtx);
+
+  if (cond_code == GE || cond_code == LE)
+    {
+      rtx not_result = gen_reg_rtx (CCEQmode);
+      rtx not_op, rev_cond_rtx;
+      rev_cond_rtx = gen_rtx_fmt_ee (rs6000_reverse_condition (SImode,
+							       cond_code),
+				     SImode, XEXP (condition_rtx, 0),
+				     const0_rtx);
+      not_op = gen_rtx_COMPARE (CCEQmode, rev_cond_rtx, const0_rtx);
+      emit_insn (gen_rtx_SET (not_result, not_op));
+      condition_rtx = gen_rtx_EQ (SImode, not_result, const0_rtx);
+    }
+
+  emit_insn (gen_rtx_SET (operands[0], condition_rtx));
+  DONE;
 })

 (define_insn "*bcdinvalid_<mode>"
-  [(set (reg:CCFP CR6_REGNO)
-	(compare:CCFP
+  [(set (reg:CC CR6_REGNO)
+	(compare:CC
 	 (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
 		      UNSPEC_BCDADD)
 	 (match_operand:V2DF 2 "zero_constant" "j")))
    (clobber (match_scratch:VBCD 0 "=v"))]
   "TARGET_P8_VECTOR"
-  "bcdadd. %0,%1,%1,0"
+  "bcdsub. %0,%1,%1,0"
   [(set_attr "type" "vecsimple")])

 (define_expand "bcdinvalid_<mode>"
-  [(parallel [(set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+  [(parallel [(set (reg:CC CR6_REGNO)
+		   (compare:CC
 		    (unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
 				 UNSPEC_BCDADD)
 		    (match_dup 2)))
 	      (clobber (match_scratch:VBCD 3))])
    (set (match_operand:SI 0 "register_operand")
-	(unordered:SI (reg:CCFP CR6_REGNO)
+	(unordered:SI (reg:CC CR6_REGNO)
 		      (const_int 0)))]
   "TARGET_P8_VECTOR"
 {
@@ -4550,7 +4573,7 @@  (define_insn "bcdshift_v16qi"
 		       (match_operand:V16QI 2 "register_operand" "v")
 		       (match_operand:QI 3 "const_0_to_1_operand" "n")]
 		     UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P8_VECTOR"
   "bcds. %0,%1,%2,%3"
   [(set_attr "type" "vecsimple")])
@@ -4559,7 +4582,7 @@  (define_expand "bcdmul10_v16qi"
   [(set (match_operand:V16QI 0 "register_operand")
 	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
 		      UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P9_VECTOR"
 {
   rtx one = gen_reg_rtx (V16QImode);
@@ -4574,7 +4597,7 @@  (define_expand "bcddiv10_v16qi"
   [(set (match_operand:V16QI 0 "register_operand")
 	(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
 		      UNSPEC_BCDSHIFT))
-   (clobber (reg:CCFP CR6_REGNO))]
+   (clobber (reg:CC CR6_REGNO))]
   "TARGET_P9_VECTOR"
 {
   rtx one = gen_reg_rtx (V16QImode);
@@ -4598,9 +4621,9 @@  (define_peephole2
 				 (match_operand:V1TI 2 "register_operand")
 				 (match_operand:QI 3 "const_0_to_1_operand")]
 				UNSPEC_BCD_ADD_SUB))
-	      (clobber (reg:CCFP CR6_REGNO))])
-   (parallel [(set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+	      (clobber (reg:CC CR6_REGNO))])
+   (parallel [(set (reg:CC CR6_REGNO)
+		   (compare:CC
 		    (unspec:V2DF [(match_dup 1)
 				  (match_dup 2)
 				  (match_dup 3)]
@@ -4613,8 +4636,8 @@  (define_peephole2
 				 (match_dup 2)
 				 (match_dup 3)]
 				UNSPEC_BCD_ADD_SUB))
-	      (set (reg:CCFP CR6_REGNO)
-		   (compare:CCFP
+	      (set (reg:CC CR6_REGNO)
+		   (compare:CC
 		    (unspec:V2DF [(match_dup 1)
 				  (match_dup 2)
 				  (match_dup 3)]
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index b1fcc69bb60..28567bcc64c 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -1316,7 +1316,7 @@  (define_predicate "branch_comparison_operator"
 	  (if_then_else (match_test "flag_finite_math_only")
 	    (match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
 	    (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
-	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
+	  (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne,unordered,ordered"))
 	(match_test "validate_condition_mode (GET_CODE (op),
 					      GET_MODE (XEXP (op, 0))),
 		     1")))
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d4defc855d0..15f813d9cf5 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11249,11 +11249,13 @@  validate_condition_mode (enum rtx_code code, machine_mode mode)
 	      || mode == CCUNSmode);

   gcc_assert (mode == CCFPmode
-	      || (code != ORDERED && code != UNORDERED
-		  && code != UNEQ && code != LTGT
+	      || (code != UNEQ && code != LTGT
 		  && code != UNGT && code != UNLT
 		  && code != UNGE && code != UNLE));

+  gcc_assert (mode == CCFPmode || mode == CCmode
+	      || (code != ORDERED && code != UNORDERED));
+
   /* These are invalid; the information is not there.  */
   gcc_assert (mode != CCEQmode || code == EQ || code == NE);
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bcd-4.c b/gcc/testsuite/gcc.target/powerpc/bcd-4.c
index 2c8554dfe82..3c25ed60e17 100644
--- a/gcc/testsuite/gcc.target/powerpc/bcd-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/bcd-4.c
@@ -2,10 +2,11 @@ 
 /* { dg-require-effective-target int128 } */
 /* { dg-require-effective-target power10_hw } */
 /* { dg-options "-mdejagnu-cpu=power10 -O2 -save-temps" } */
-/* { dg-final { scan-assembler-times {\mbcdadd\M} 7 } } */
-/* { dg-final { scan-assembler-times {\mbcdsub\M} 18 } } */
+/* { dg-final { scan-assembler-times {\mbcdadd\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mbcdsub\M} 20 } } */
 /* { dg-final { scan-assembler-times {\mbcds\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mdenbcdq\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mcror\M} 1 } } */

 #include <altivec.h>