[rs6000] Use CC for BCD operations [PR100736]
Commit Message
Hi,
This patch uses CC instead of CCFP for all BCD operations. Thus, infinite
math flag has no impact on BCD operations. To support BCD overflow and
invalid coding, ordered and unordered are added into CC mode. With CC,
"ge" and "le" are converted to reverse comparison. So the invalid coding
needs to be tested separately.
This patch also replaces bcdadd with bcdsub for BCD invaliding coding
expand. The bcdsub with two identical numbers doesn't cause overflow while
bcdadd could.
Another patch which creates a dedicated CC mode for BCD operations is
ready. With this patch, we don't need ordered and unordered in CC. Please
advice if I can submit it.
Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.
2022-06-16 Haochen Gui <guihaoc@linux.ibm.com>
gcc/
PR target/100736
* config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP with
CC
(*bcd<bcd_add_sub>_test_<mode>): Replace CCFP with CC. Generate
condition insn with CC mode.
(*bcdinvalid_<mode>): Replace CCFP with CC. Replace bcdadd. with
bcdsub.
(bcdinvalid_<mode>): Replace CCFP with CC.
(bcdshift_v16qi): Likewise.
(bcdmul10_v16qi): Likewise.
(bcddiv10_v16qi): Likewise.
(peephole for bcd_add/sub): Likewise.
* config/rs6000/predicates.md (branch_comparison_operator): Add
ordered and unordered in CC mode.
* config/rs6000/rs6000.cc (validate_condition_mode): Likewise.
gcc/testsuite/
PR target/100736
* gcc.target/powerpc/bcd-4.c: Adjust number of bcdadd and bcdsub.
Scan no cror insns.
patch.diff
Comments
Hi!
On Thu, Jun 16, 2022 at 02:36:53PM +0800, HAO CHEN GUI wrote:
> This patch uses CC instead of CCFP for all BCD operations. Thus, infinite
> math flag has no impact on BCD operations. To support BCD overflow and
> invalid coding, ordered and unordered are added into CC mode.
This is wrong. Unordered is an FP concept. What the BCD insns do is
write to bit 3 in a CR field, which for FP comparisons is the
"unordered" comparison result (FP comparisons result in one of four
options: lt, gt, eq, or un). For integer comparisons there are only
three possible comparison results (lt, gt, and eq), and bit 3 is set to
a copy of XER[SO], the "summary overflow" bit.
The BCD insns have the one-out-of-three comparison result as well, and
they set bit 3 if overflow happened. There is the extra gotcha that it
sets the four condition field bits to 0001 if there is an invalidly
coded input, but we can ignore that: it is not supposed to happen.
There is no normal way to get at bit 3 of a CR field. We can use some
unspec though, which is total cheating but it does work, and it is
safe, albeit sometimes suboptimal.
> With CC,
> "ge" and "le" are converted to reverse comparison. So the invalid coding
> needs to be tested separately.
Bit 3 has no meaning in integer comparisons, in GCC, it is not modeled.
We handle this with some unspec usually. We need to for vector insns
that set a CR field for example: they can set more than one bit to 1, or
all bits to 0, neither of which is valid in any MODE_CC we have.
We should add proper CC modes for all the ways that instructions can set
the CR field bits. But this is much less trivial than it may seem, so
I'd like the PR to be fixed in a simple way first (a way that can be
backported, too!)
> * config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP with
> CC
Sentences end in a full stop. I'd say
....................................................: Replace CCFP
with CC.
so that it doesn't look as silly / strange :-)
> + rtx cr6 = gen_rtx_REG (CCmode, CR6_REGNO);
> + rtx condition_rtx = gen_rtx_<CODE> (SImode, cr6, const0_rtx);
> + rtx_code cond_code = GET_CODE (condition_rtx);
That GET_CODE always would return <CODE>, fwiw.
You shouldn't need anything like this, bcdinvalid will work just fine if
written as bcdadd_ov (with vector of 0 as the second op)?
Segher
Hi,
On 16/6/2022 下午 7:24, Segher Boessenkool wrote:
> There is no normal way to get at bit 3 of a CR field. We can use some
> unspec though, which is total cheating but it does work, and it is
> safe, albeit sometimes suboptimal.
Thanks so much for your advice. I will use an unspec for setting reg from
the BCD overflow bit.
>
> You shouldn't need anything like this, bcdinvalid will work just fine if
> written as bcdadd_ov (with vector of 0 as the second op)?
The vector of 0 is not equal to BCD 0, I think. The BCD number contains
preferred sign (PS) bit. So all zeros itself is an invalid encoding. We may
use bcdsub_ov with duplicated op to implement bcdinvalid.
Hi!
On Fri, Jun 17, 2022 at 04:19:37PM +0800, HAO CHEN GUI wrote:
> On 16/6/2022 下午 7:24, Segher Boessenkool wrote:
> > You shouldn't need anything like this, bcdinvalid will work just fine if
> > written as bcdadd_ov (with vector of 0 as the second op)?
>
> The vector of 0 is not equal to BCD 0, I think. The BCD number contains
> preferred sign (PS) bit. So all zeros itself is an invalid encoding. We may
> use bcdsub_ov with duplicated op to implement bcdinvalid.
For the machine, you should use 0x0c or 0x0f, sure. But in RTL we can
do whatever we want.
But bcdsub is easier indeed, and we don't need to construct a 0 first
then.
Segher
@@ -4379,7 +4379,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
(match_operand:VBCD 2 "register_operand" "v")
(match_operand:QI 3 "const_0_to_1_operand" "n")]
UNSPEC_BCD_ADD_SUB))
- (clobber (reg:CCFP CR6_REGNO))]
+ (clobber (reg:CC CR6_REGNO))]
"TARGET_P8_VECTOR"
"bcd<bcd_add_sub>. %0,%1,%2,%3"
[(set_attr "type" "vecsimple")])
@@ -4389,9 +4389,9 @@ (define_insn "bcd<bcd_add_sub>_<mode>"
;; UNORDERED test on an integer type (like V1TImode) is not defined. The type
;; probably should be one that can go in the VMX (Altivec) registers, so we
;; can't use DDmode or DFmode.
-(define_insn "*bcd<bcd_add_sub>_test_<mode>"
- [(set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+(define_insn "bcd<bcd_add_sub>_test_<mode>"
+ [(set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")
(match_operand:VBCD 2 "register_operand" "v")
(match_operand:QI 3 "const_0_to_1_operand" "i")]
@@ -4408,8 +4408,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>"
(match_operand:VBCD 2 "register_operand" "v")
(match_operand:QI 3 "const_0_to_1_operand" "i")]
UNSPEC_BCD_ADD_SUB))
- (set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+ (set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_dup 1)
(match_dup 2)
(match_dup 3)]
@@ -4502,8 +4502,8 @@ (define_insn "vclrrb"
[(set_attr "type" "vecsimple")])
(define_expand "bcd<bcd_add_sub>_<code>_<mode>"
- [(parallel [(set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+ [(parallel [(set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_operand:VBCD 1 "register_operand")
(match_operand:VBCD 2 "register_operand")
(match_operand:QI 3 "const_0_to_1_operand")]
@@ -4511,33 +4511,56 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>"
(match_dup 4)))
(clobber (match_scratch:VBCD 5))])
(set (match_operand:SI 0 "register_operand")
- (BCD_TEST:SI (reg:CCFP CR6_REGNO)
+ (BCD_TEST:SI (reg:CC CR6_REGNO)
(const_int 0)))]
"TARGET_P8_VECTOR"
{
operands[4] = CONST0_RTX (V2DFmode);
+ emit_insn (gen_bcd<bcd_add_sub>_test_<mode> (operands[0], operands[1],
+ operands[2], operands[3],
+ operands[4]));
+
+ rtx cr6 = gen_rtx_REG (CCmode, CR6_REGNO);
+ rtx condition_rtx = gen_rtx_<CODE> (SImode, cr6, const0_rtx);
+ rtx_code cond_code = GET_CODE (condition_rtx);
+
+ if (cond_code == GE || cond_code == LE)
+ {
+ rtx not_result = gen_reg_rtx (CCEQmode);
+ rtx not_op, rev_cond_rtx;
+ rev_cond_rtx = gen_rtx_fmt_ee (rs6000_reverse_condition (SImode,
+ cond_code),
+ SImode, XEXP (condition_rtx, 0),
+ const0_rtx);
+ not_op = gen_rtx_COMPARE (CCEQmode, rev_cond_rtx, const0_rtx);
+ emit_insn (gen_rtx_SET (not_result, not_op));
+ condition_rtx = gen_rtx_EQ (SImode, not_result, const0_rtx);
+ }
+
+ emit_insn (gen_rtx_SET (operands[0], condition_rtx));
+ DONE;
})
(define_insn "*bcdinvalid_<mode>"
- [(set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+ [(set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")]
UNSPEC_BCDADD)
(match_operand:V2DF 2 "zero_constant" "j")))
(clobber (match_scratch:VBCD 0 "=v"))]
"TARGET_P8_VECTOR"
- "bcdadd. %0,%1,%1,0"
+ "bcdsub. %0,%1,%1,0"
[(set_attr "type" "vecsimple")])
(define_expand "bcdinvalid_<mode>"
- [(parallel [(set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+ [(parallel [(set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_operand:VBCD 1 "register_operand")]
UNSPEC_BCDADD)
(match_dup 2)))
(clobber (match_scratch:VBCD 3))])
(set (match_operand:SI 0 "register_operand")
- (unordered:SI (reg:CCFP CR6_REGNO)
+ (unordered:SI (reg:CC CR6_REGNO)
(const_int 0)))]
"TARGET_P8_VECTOR"
{
@@ -4550,7 +4573,7 @@ (define_insn "bcdshift_v16qi"
(match_operand:V16QI 2 "register_operand" "v")
(match_operand:QI 3 "const_0_to_1_operand" "n")]
UNSPEC_BCDSHIFT))
- (clobber (reg:CCFP CR6_REGNO))]
+ (clobber (reg:CC CR6_REGNO))]
"TARGET_P8_VECTOR"
"bcds. %0,%1,%2,%3"
[(set_attr "type" "vecsimple")])
@@ -4559,7 +4582,7 @@ (define_expand "bcdmul10_v16qi"
[(set (match_operand:V16QI 0 "register_operand")
(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
UNSPEC_BCDSHIFT))
- (clobber (reg:CCFP CR6_REGNO))]
+ (clobber (reg:CC CR6_REGNO))]
"TARGET_P9_VECTOR"
{
rtx one = gen_reg_rtx (V16QImode);
@@ -4574,7 +4597,7 @@ (define_expand "bcddiv10_v16qi"
[(set (match_operand:V16QI 0 "register_operand")
(unspec:V16QI [(match_operand:V16QI 1 "register_operand")]
UNSPEC_BCDSHIFT))
- (clobber (reg:CCFP CR6_REGNO))]
+ (clobber (reg:CC CR6_REGNO))]
"TARGET_P9_VECTOR"
{
rtx one = gen_reg_rtx (V16QImode);
@@ -4598,9 +4621,9 @@ (define_peephole2
(match_operand:V1TI 2 "register_operand")
(match_operand:QI 3 "const_0_to_1_operand")]
UNSPEC_BCD_ADD_SUB))
- (clobber (reg:CCFP CR6_REGNO))])
- (parallel [(set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+ (clobber (reg:CC CR6_REGNO))])
+ (parallel [(set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_dup 1)
(match_dup 2)
(match_dup 3)]
@@ -4613,8 +4636,8 @@ (define_peephole2
(match_dup 2)
(match_dup 3)]
UNSPEC_BCD_ADD_SUB))
- (set (reg:CCFP CR6_REGNO)
- (compare:CCFP
+ (set (reg:CC CR6_REGNO)
+ (compare:CC
(unspec:V2DF [(match_dup 1)
(match_dup 2)
(match_dup 3)]
@@ -1316,7 +1316,7 @@ (define_predicate "branch_comparison_operator"
(if_then_else (match_test "flag_finite_math_only")
(match_code "lt,le,gt,ge,eq,ne,unordered,ordered")
(match_code "lt,gt,eq,unordered,unge,unle,ne,ordered"))
- (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))
+ (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne,unordered,ordered"))
(match_test "validate_condition_mode (GET_CODE (op),
GET_MODE (XEXP (op, 0))),
1")))
@@ -11249,11 +11249,13 @@ validate_condition_mode (enum rtx_code code, machine_mode mode)
|| mode == CCUNSmode);
gcc_assert (mode == CCFPmode
- || (code != ORDERED && code != UNORDERED
- && code != UNEQ && code != LTGT
+ || (code != UNEQ && code != LTGT
&& code != UNGT && code != UNLT
&& code != UNGE && code != UNLE));
+ gcc_assert (mode == CCFPmode || mode == CCmode
+ || (code != ORDERED && code != UNORDERED));
+
/* These are invalid; the information is not there. */
gcc_assert (mode != CCEQmode || code == EQ || code == NE);
}
@@ -2,10 +2,11 @@
/* { dg-require-effective-target int128 } */
/* { dg-require-effective-target power10_hw } */
/* { dg-options "-mdejagnu-cpu=power10 -O2 -save-temps" } */
-/* { dg-final { scan-assembler-times {\mbcdadd\M} 7 } } */
-/* { dg-final { scan-assembler-times {\mbcdsub\M} 18 } } */
+/* { dg-final { scan-assembler-times {\mbcdadd\M} 5 } } */
+/* { dg-final { scan-assembler-times {\mbcdsub\M} 20 } } */
/* { dg-final { scan-assembler-times {\mbcds\M} 2 } } */
/* { dg-final { scan-assembler-times {\mdenbcdq\M} 1 } } */
+/* { dg-final { scan-assembler-not {\mcror\M} 1 } } */
#include <altivec.h>