[v4,1/2] Match: Simplify (x != 0 ? x + ~0 : 0) to (x - (x != 0)).
Checks
Context | Check | Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 | success | Test passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm | success | Test passed
Commit Message
From: xuli <xuli1@eswincomputing.com>
When the immediate operand is 1 (op1 = 1) in the unsigned scalar sat_sub
form 2 below, we can simplify (x != 0 ? x + ~0 : 0) to (x - (x != 0)),
thereby eliminating a branch instruction. This simplification also applies
to signed integers.
Form2:
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \
{ \
return x >= (T)IMM ? x - (T)IMM : 0; \
}
Take the following instantiation of form 2 as an example:
DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1)
Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
uint8_t _1;
uint8_t _3;
<bb 2> [local count: 1073741824]:
if (x_2(D) != 0)
goto <bb 3>; [50.00%]
else
goto <bb 4>; [50.00%]
<bb 3> [local count: 536870912]:
_3 = x_2(D) + 255;
<bb 4> [local count: 1073741824]:
# _1 = PHI <x_2(D)(2), _3(3)>
return _1;
}
Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
beq a0,zero,.L2
addiw a0,a0,-1
andi a0,a0,0xff
.L2:
ret
After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
_Bool _1;
unsigned char _2;
uint8_t _4;
<bb 2> [local count: 1073741824]:
_1 = x_3(D) != 0;
_2 = (unsigned char) _1;
_4 = x_3(D) - _2;
return _4;
}
Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
snez a5,a0
subw a0,a0,a5
andi a0,a0,0xff
ret
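The equivalence behind the before/after dumps can be checked exhaustively for uint8_t. The following is a minimal sketch rather than part of the patch; the function names sat_u_sub1_branchy, sat_u_sub1_branchless, count_mismatches_u8 and verify_u8 are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Form 2 with IMM = 1, written as in the commit message.  */
uint8_t sat_u_sub1_branchy (uint8_t x)
{
  return x >= (uint8_t) 1 ? x - (uint8_t) 1 : 0;
}

/* The branchless shape the new pattern produces: x - (x != 0).
   The comparison yields 0 or 1, so 0 stays 0 and every other value
   is decremented by one.  */
uint8_t sat_u_sub1_branchless (uint8_t x)
{
  return (uint8_t) (x - (uint8_t) (x != 0));
}

/* Hypothetical helper: count the inputs on which the two forms disagree.  */
int count_mismatches_u8 (void)
{
  int mismatches = 0;
  for (unsigned v = 0; v <= UINT8_MAX; v++)
    if (sat_u_sub1_branchy ((uint8_t) v) != sat_u_sub1_branchless ((uint8_t) v))
      mismatches++;
  return mismatches;
}

/* Hypothetical driver hook: aborts if any of the 256 inputs differs.  */
void verify_u8 (void)
{
  assert (count_mismatches_u8 () == 0);
}
```

Calling verify_u8 () from a small test driver should return without tripping the assertion. The same loop can be widened to uint16_t if desired.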
The following test suites passed with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:
* match.pd: Simplify (x != 0 ? x + ~0 : 0) to (x - (x != 0)).
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/phi-opt-44.c: New test.
* gcc.dg/tree-ssa/phi-opt-45.c: New test.
---
gcc/match.pd | 10 +++++++++
gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c | 26 ++++++++++++++++++++++
gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c | 26 ++++++++++++++++++++++
3 files changed, 62 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
Comments
On Thu, Oct 24, 2024 at 6:22 PM Li Xu <xuli1@eswincomputing.com> wrote:
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 0455dfa6993..f48fd7d52ba 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3383,6 +3383,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> }
> (if (wi::eq_p (sum, wi::uhwi (0, precision)))))))
>
> +/* The boundary condition for case 10: IMM = 1:
> + SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> + simplify (X != 0 ? X + ~0 : 0) to (X - X != 0). */
> +(simplify
> + (cond (ne@1 @0 integer_zerop)
> + (nop_convert? (plus (nop_convert? @0) integer_all_onesp))
> + integer_zerop)
> + (if (INTEGRAL_TYPE_P (type))
> + (minus @0 (convert @1))))
This looks good to me, though I can't approve it.
Thanks,
Andrew
> +
> /* Signed saturation sub, case 1:
> T minus = (T)((UT)X - (UT)Y);
> SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> new file mode 100644
> index 00000000000..962bf0954f6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-phiopt1" } */
> +
> +#include <stdint.h>
> +
> +uint8_t f1 (uint8_t x)
> +{
> + return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
> +}
> +
> +uint16_t f2 (uint16_t x)
> +{
> + return x >= (uint16_t)1 ? x - (uint16_t)1 : 0;
> +}
> +
> +uint32_t f3 (uint32_t x)
> +{
> + return x >= (uint32_t)1 ? x - (uint32_t)1 : 0;
> +}
> +
> +uint64_t f4 (uint64_t x)
> +{
> + return x >= (uint64_t)1 ? x - (uint64_t)1 : 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
> new file mode 100644
> index 00000000000..62a2ab63184
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-phiopt1" } */
> +
> +#include <stdint.h>
> +
> +int8_t f1 (int8_t x)
> +{
> + return x != 0 ? x - (int8_t)1 : 0;
> +}
> +
> +int16_t f2 (int16_t x)
> +{
> + return x != 0 ? x - (int16_t)1 : 0;
> +}
> +
> +int32_t f3 (int32_t x)
> +{
> + return x != 0 ? x - (int32_t)1 : 0;
> +}
> +
> +int64_t f4 (int64_t x)
> +{
> + return x != 0 ? x - (int64_t)1 : 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
> --
> 2.17.1
>
On Sat, Oct 26, 2024 at 12:20 AM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Thu, Oct 24, 2024 at 6:22 PM Li Xu <xuli1@eswincomputing.com> wrote:
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 0455dfa6993..f48fd7d52ba 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3383,6 +3383,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > }
> > (if (wi::eq_p (sum, wi::uhwi (0, precision)))))))
> >
> > +/* The boundary condition for case 10: IMM = 1:
> > + SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> > + simplify (X != 0 ? X + ~0 : 0) to (X - X != 0). */
> > +(simplify
> > + (cond (ne@1 @0 integer_zerop)
> > + (nop_convert? (plus (nop_convert? @0) integer_all_onesp))
> > + integer_zerop)
> > + (if (INTEGRAL_TYPE_P (type))
> > + (minus @0 (convert @1))))
>
> This looks good to me, though I can't approve it.
OK.
Thanks,
Richard.
> Thanks,
> Andrew
>
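diff --git a/gcc/match.pd b/gcc/match.pd
index 0455dfa6993..f48fd7d52ba 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3383,6 +3383,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)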
}
(if (wi::eq_p (sum, wi::uhwi (0, precision)))))))
+/* The boundary condition for case 10: IMM = 1:
+ SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
+ simplify (X != 0 ? X + ~0 : 0) to (X - X != 0). */
+(simplify
+ (cond (ne@1 @0 integer_zerop)
+ (nop_convert? (plus (nop_convert? @0) integer_all_onesp))
+ integer_zerop)
+ (if (INTEGRAL_TYPE_P (type))
+ (minus @0 (convert @1))))
+
/* Signed saturation sub, case 1:
T minus = (T)((UT)X - (UT)Y);
SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
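diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
new file mode 100644
index 00000000000..962bf0954f6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c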
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1" } */
+
+#include <stdint.h>
+
+uint8_t f1 (uint8_t x)
+{
+ return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
+}
+
+uint16_t f2 (uint16_t x)
+{
+ return x >= (uint16_t)1 ? x - (uint16_t)1 : 0;
+}
+
+uint32_t f3 (uint32_t x)
+{
+ return x >= (uint32_t)1 ? x - (uint32_t)1 : 0;
+}
+
+uint64_t f4 (uint64_t x)
+{
+ return x >= (uint64_t)1 ? x - (uint64_t)1 : 0;
+}
+
+/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
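diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c
new file mode 100644
index 00000000000..62a2ab63184
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-45.c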
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1" } */
+
+#include <stdint.h>
+
+int8_t f1 (int8_t x)
+{
+ return x != 0 ? x - (int8_t)1 : 0;
+}
+
+int16_t f2 (int16_t x)
+{
+ return x != 0 ? x - (int16_t)1 : 0;
+}
+
+int32_t f3 (int32_t x)
+{
+ return x != 0 ? x - (int32_t)1 : 0;
+}
+
+int64_t f4 (int64_t x)
+{
+ return x != 0 ? x - (int64_t)1 : 0;
+}
+
+/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
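The signed tests above rely on the same identity, x != 0 ? x - 1 : 0 being equal to x - (x != 0). A minimal sketch of a check for int16_t follows; it is an illustration only, not part of the patch, and the names dec_nonzero_branchy, dec_nonzero_branchless and count_mismatches_s16 are hypothetical. INT16_MIN is skipped on purpose, since INT16_MIN - 1 does not fit in int16_t and the narrowing conversion of that value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Signed source form, as in phi-opt-45.c.  */
int16_t dec_nonzero_branchy (int16_t x)
{
  return x != 0 ? x - (int16_t) 1 : 0;
}

/* Branchless shape: subtract the 0/1 result of the comparison.  */
int16_t dec_nonzero_branchless (int16_t x)
{
  return (int16_t) (x - (x != 0));
}

/* Hypothetical helper: count disagreements over every int16_t value
   except INT16_MIN, whose decrement is out of range for the type.  */
int count_mismatches_s16 (void)
{
  int mismatches = 0;
  for (long v = INT16_MIN + 1L; v <= INT16_MAX; v++)
    if (dec_nonzero_branchy ((int16_t) v) != dec_nonzero_branchless ((int16_t) v))
      mismatches++;
  return mismatches;
}
```

A driver that asserts count_mismatches_s16 () == 0 should pass. Both forms compute the same wider intermediate for INT16_MIN as well, so they are expected to agree there too on a given implementation, but the sketch does not assert it.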