[1/2,v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

Message ID 20241023090847.17589-1-xuli1@eswincomputing.com
State New
Series [1/2,v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

Commit Message

Li Xu Oct. 23, 2024, 9:08 a.m. UTC
  From: xuli <xuli1@eswincomputing.com>

When the immediate operand op1 is 1 in the unsigned scalar sat_sub form 2 below,
we can simplify (x != 0 ? x + max : 0) to (x - (x != 0)), thereby eliminating
a branch instruction.

Form2:
T __attribute__((noinline))             \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x)  \
{                                       \
  return x >= (T)IMM ? x - (T)IMM : 0;  \
}

Take the following instance of form 2 as an example:
DEF_SAT_U_SUB_IMM_FMT_2(uint8_t, 1)

Before this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
  uint8_t _1;
  uint8_t _3;

  <bb 2> [local count: 1073741824]:
  if (x_2(D) != 0)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870912]:
  _3 = x_2(D) + 255;

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <x_2(D)(2), _3(3)>
  return _1;

}

Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
	beq	a0,zero,.L2
	addiw	a0,a0,-1
	andi	a0,a0,0xff
.L2:
	ret

After this patch:
__attribute__((noinline))
uint8_t sat_u_sub_imm1_uint8_t_fmt_2 (uint8_t x)
{
  _Bool _1;
  unsigned char _2;
  uint8_t _4;

  <bb 2> [local count: 1073741824]:
  _1 = x_3(D) != 0;
  _2 = (unsigned char) _1;
  _4 = x_3(D) - _2;
  return _4;

}

Assembly code:
sat_u_sub_imm1_uint8_t_fmt_2:
	snez	a5,a0
	subw	a0,a0,a5
	andi	a0,a0,0xff
	ret
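
The equivalence behind this rewrite can be checked exhaustively for uint8_t. The following is a minimal sketch, not part of the patch; the function names are hypothetical and chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Branchy source form: saturating subtraction of 1 (form 2 with IMM = 1).  */
static uint8_t
sat_sub1_branchy (uint8_t x)
{
  return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
}

/* Branchless form after the simplification: x - (x != 0).  */
static uint8_t
sat_sub1_branchless (uint8_t x)
{
  return (uint8_t)(x - (x != 0));
}

/* Hypothetical helper: assert both forms agree on every uint8_t input.  */
static void
check_sat_sub1_u8 (void)
{
  for (unsigned v = 0; v <= UINT8_MAX; v++)
    assert (sat_sub1_branchy ((uint8_t)v) == sat_sub1_branchless ((uint8_t)v));
}
```

Calling check_sat_sub1_u8 () from a test driver should complete without an assertion failure. x = 0 is the only input for which the subtrahend (x != 0) is 0, which is exactly the saturating case.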

The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>
gcc/ChangeLog:

	* match.pd: Simplify (x != 0 ? x + max : 0) to (x - x != 0).

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/phi-opt-44.c: New test.

---
 gcc/match.pd                               |  9 ++++++++
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c | 26 ++++++++++++++++++++++
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
  

Comments

Andrew Pinski Oct. 24, 2024, 2:23 a.m. UTC | #1
On Wed, Oct 23, 2024 at 2:08 AM Li Xu <xuli1@eswincomputing.com> wrote:
> +/* The boundary condition for case 10: IMM = 1:
> +   SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> +   simplify (X != 0 ? X + max : 0) to (X - X != 0).  */
> +(simplify
> + (cond (ne @0 integer_zerop) (plus @0 integer_all_onesp) integer_zerop)
> + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> +     && types_match (type, @0))

You don't need TYPE_UNSIGNED any more because integer_all_onesp
handles both signed and unsigned "all ones".
You also don't need types_match since they will match due to the rules
of PLUS_EXPR.

Once you remove the UNSIGNED check, please add testcases for the
signed case too.
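
A signed test case along these lines might look like the sketch below. The function names are hypothetical, and whether the phiopt1 dump ends up free of `goto` for every width is an assumption that would need checking against the final pattern (narrow types may involve extra conversions):

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-phiopt1" } */

#include <stdint.h>

/* Signed variants: for signed types x + ~0 is x + (-1), i.e. x - 1.
   Names are illustrative only.  */
int8_t sub1_nonzero_int8_t (int8_t x)
{
  return x != 0 ? x - (int8_t)1 : 0;
}

int32_t sub1_nonzero_int32_t (int32_t x)
{
  return x != 0 ? x - (int32_t)1 : 0;
}

int64_t sub1_nonzero_int64_t (int64_t x)
{
  return x != 0 ? x - (int64_t)1 : 0;
}

/* Assumed to hold once the pattern accepts signed types.  */
/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */
```

The directives mirror the ones in the unsigned test in this patch; only the element types and the `!= 0` condition differ.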

> +   (minus @0 (convert (ne @0 { build_zero_cst (type); })))))

Note if you capture the ne expression originally you can just reuse
that instead of recreating it.

Sorry I missed that in the original review.

Other than that this looks good to me (but I can't approve it).

Thanks,
Andrew

  
Li Xu Oct. 24, 2024, 6:42 a.m. UTC | #2
> -----Original Message-----
> From: "Andrew Pinski" <pinskia@gmail.com>
> Sent: 2024-10-24 10:23:01 (Thursday)
> To: "Li Xu" <xuli1@eswincomputing.com>
> Cc: gcc-patches@gcc.gnu.org, kito.cheng@gmail.com, richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, pan2.li@intel.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com
> Subject: Re: [PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)
> 
> On Wed, Oct 23, 2024 at 2:08 AM Li Xu <xuli1@eswincomputing.com> wrote:
> > +/* The boundary condition for case 10: IMM = 1:
> > +   SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> > +   simplify (X != 0 ? X + max : 0) to (X - X != 0).  */
> > +(simplify
> > + (cond (ne @0 integer_zerop) (plus @0 integer_all_onesp) integer_zerop)
> > + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> > +     && types_match (type, @0))
> 
> You don't need TYPE_UNSIGNED any more because integer_all_onesp
> handles both signed and unsigned "all ones".
> You also don't need types_match since they will match due to the rules
> of PLUS_EXPR.

Thanks for reviewing.

This simplification only applies to unsigned saturating subtraction. For example, with x = 3 and max = 127 (int8_t) or 255 (uint8_t):
             (X != 0 ? X + max : 0)         (X - (X != 0))
uint8_t      (3 != 0 ? 3 + 255 : 0) = 2     (3 - (3 != 0)) = 2   correct
int8_t       (3 != 0 ? 3 + 127 : 0) = -126  (3 - (3 != 0)) = 2   not correct

I think it should be changed to the following form:
(if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))

> 
> Once you remove the UNSIGNED check, please add testcases for the
> signed case too.
> 
> > +   (minus @0 (convert (ne @0 { build_zero_cst (type); })))))
> 
> Note if you capture the ne expression originally you can just reuse
> that instead of recreating it.

I am not familiar with the simplify syntax. Do you mean like this?
(minus @0 (convert (ne @0 integer_zerop))))
integer_zerop is a predicate, so writing it this way is a problem.
How can I reuse it?

  
Andrew Pinski Oct. 24, 2024, 6:55 a.m. UTC | #3
On Wed, Oct 23, 2024 at 11:43 PM Li Xu <xuli1@eswincomputing.com> wrote:
> > -----Original Message-----
> > From: "Andrew Pinski" <pinskia@gmail.com>
> > Sent: 2024-10-24 10:23:01 (Thursday)
> > To: "Li Xu" <xuli1@eswincomputing.com>
> > Cc: gcc-patches@gcc.gnu.org, kito.cheng@gmail.com, richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai, pan2.li@intel.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com
> > Subject: Re: [PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)
> >
> > On Wed, Oct 23, 2024 at 2:08 AM Li Xu <xuli1@eswincomputing.com> wrote:
> > > +/* The boundary condition for case 10: IMM = 1:
> > > +   SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
> > > +   simplify (X != 0 ? X + max : 0) to (X - X != 0).  */
> > > +(simplify
> > > + (cond (ne @0 integer_zerop) (plus @0 integer_all_onesp) integer_zerop)
> > > + (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
> > > +     && types_match (type, @0))
> >
> > You don't need TYPE_UNSIGNED any more because integer_all_onesp
> > handles both signed and unsigned "all ones".
> > You also don't need types_match since they will match due to the rules
> > of PLUS_EXPR.
>
> Thanks for reviewing.
>
> This simplification only applies to unsigned saturating subtraction. For example, with x = 3 and max = 127 (int8_t) or 255 (uint8_t):
>              (X != 0 ? X + max : 0)         (X - (X != 0))
> uint8_t      (3 != 0 ? 3 + 255 : 0) = 2     (3 - (3 != 0)) = 2   correct
> int8_t       (3 != 0 ? 3 + 127 : 0) = -126  (3 - (3 != 0)) = 2   not correct
>
> I think it should be changed to the following form:
> (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type))

Oh, you misunderstood what integer_all_onesp means here. Basically it
means ~0, which is either -1 (for signed types) or the max value (for
unsigned types).
So in the case of uint8_t it will match 255, while for int8_t it will match -1.
That is why I requested you to use integer_all_onesp in the first place.

It is not a match for the max value as such; there is already the
max_value predicate for that.
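
To make the distinction concrete, here is a small C sketch (illustrative names, not GCC code) showing that the all-ones constant is -1 for int8_t and 255 for uint8_t, and that with -1 the conditional form and x - (x != 0) agree for every int8_t input:

```c
#include <assert.h>
#include <stdint.h>

/* ~0 narrowed to each type: the "all ones" bit pattern.  */
static int8_t  all_ones_i8 (void) { return (int8_t)~0; }   /* -1 */
static uint8_t all_ones_u8 (void) { return (uint8_t)~0; }  /* 255 */

/* Conditional form using the all-ones constant.  The sum is computed in
   int and narrowed; both forms narrow the same int value, so they agree
   even where the narrowing is implementation-defined.  */
static int8_t
cond_form_i8 (int8_t x)
{
  return x != 0 ? (int8_t)(x + all_ones_i8 ()) : 0;
}

/* Simplified form: x - (x != 0).  */
static int8_t
minus_form_i8 (int8_t x)
{
  return (int8_t)(x - (x != 0));
}

/* Hypothetical helper: check the constants and the identity.  */
static void
check_all_ones (void)
{
  assert (all_ones_i8 () == -1);
  assert (all_ones_i8 () != INT8_MAX);     /* all ones is not 127 */
  assert (all_ones_u8 () == UINT8_MAX);    /* for unsigned it is the max */
  for (int v = INT8_MIN; v <= INT8_MAX; v++)
    assert (cond_form_i8 ((int8_t)v) == minus_form_i8 ((int8_t)v));
}
```

With x = 3 the signed conditional form gives 3 + (-1) = 2, the same as 3 - (3 != 0); the value 127 never enters the picture.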

>
> >
> > Once you remove the UNSIGNED check, please add testcases for the
> > signed case too.
> >
> > > +   (minus @0 (convert (ne @0 { build_zero_cst (type); })))))
> >
> > Note if you capture the ne expression originally you can just reuse
> > that instead of recreating it.
>
> I am not familiar with the simplify syntax. Do you mean like this?
> (minus @0 (convert (ne @0 integer_zerop))))
> integer_zerop is a predicate, so writing it this way is a problem.
> How can I reuse it?

Something like:
```
/* (x != 0) ? (x + ~0) : 0 -> x - (x != 0).  */
(simplify
 (cond (ne@1 @0 integer_zerop) (plus @0 integer_all_onesp) integer_zerop)
 (if (INTEGRAL_TYPE_P (type))
   (minus @0 (convert @1))))
```

Notice the @1 on the `ne` there. You are capturing the whole `ne`
expression (or, for GIMPLE, the SSA name that is set to the `ne`
expression).
I hope this helps you.

Thanks,
Andrew Pinski

  

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index 0455dfa6993..6a245f8e0d3 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3383,6 +3383,15 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   }
   (if (wi::eq_p (sum, wi::uhwi (0, precision)))))))
 
+/* The boundary condition for case 10: IMM = 1:
+   SAT_U_SUB = X >= IMM ? (X - IMM) : 0.
+   simplify (X != 0 ? X + max : 0) to (X - X != 0).  */
+(simplify
+ (cond (ne @0 integer_zerop) (plus @0 integer_all_onesp) integer_zerop)
+ (if (INTEGRAL_TYPE_P (type) && TYPE_UNSIGNED (type)
+     && types_match (type, @0))
+   (minus @0 (convert (ne @0 { build_zero_cst (type); })))))
+
 /* Signed saturation sub, case 1:
    T minus = (T)((UT)X - (UT)Y);
    SAT_S_SUB = (X ^ Y) & (X ^ minus) < 0 ? (-(T)(X < 0) ^ MAX) : minus;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
new file mode 100644
index 00000000000..756ba065d84
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-44.c
@@ -0,0 +1,26 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1" } */
+
+#include <stdint.h>
+
+uint8_t sat_u_imm1_uint8_t (uint8_t x)
+{
+  return x >= (uint8_t)1 ? x - (uint8_t)1 : 0;
+}
+
+uint16_t sat_u_imm1_uint16_t (uint16_t x)
+{
+  return x >= (uint16_t)1 ? x - (uint16_t)1 : 0;
+}
+
+uint32_t sat_u_imm1_uint32_t (uint32_t x)
+{
+  return x >= (uint32_t)1 ? x - (uint32_t)1 : 0;
+}
+
+uint64_t sat_u_imm1_uint64_t (uint64_t x)
+{
+  return x >= (uint64_t)1 ? x - (uint64_t)1 : 0;
+}
+
+/* { dg-final { scan-tree-dump-not "goto" "phiopt1" } } */