rtl: Remove invalid compare simplification [PR117186]

Message ID mptzfk45brm.fsf@arm.com
State New
Headers
Series rtl: Remove invalid compare simplification [PR117186] |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed

Commit Message

Richard Sandiford Jan. 6, 2025, 1:10 p.m. UTC
  g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at
https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
added code to treat:

  (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))

as a nop.  This PR shows that that isn't always correct.
The compare in the set above is between two 0/1 booleans (at least
on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
produced the incoming (reg:CC cc) is unconstrained; it could be between
arbitrary integers, or even floats.  The fold is therefore replacing a
cc that is valid for both signed and unsigned comparisons with one that
is only known to be valid for signed comparisons.

  (gt (compare (gt cc 0) (lt cc 0) 0)

does simplify to:

  (gt cc 0)

but:

  (gtu (compare (gt cc 0) (lt cc 0) 0)

does not simplify to:

  (gtu cc 0)

The optimisation didn't come with a testcase, but it was added for
i386's cmpstrsi, now cmpstrnsi.  That probably doesn't matter as much
as it once did, since it's now conditional on -minline-all-stringops.
But the patch is almost 25 years old, so whatever the original
motivation was, it seems likely that other things now rely on it.

It therefore seems better to try to preserve the optimisation on rtl
rather than get rid of it.  To do that, we need to look at how the
result of the outer compare is used.  We'd therefore be looking at four
instructions (the gt, the lt, the compare, and the use of the compare),
but combine already allows that for 3-instruction combinations thanks
to:

  /* If the source is a COMPARE, look for the use of the comparison result
     and try to simplify it unless we already have used undobuf.other_insn.  */

When applied to boolean inputs, a comparison operator is
effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
simplify_logical_relational_operation already had code to simplify
logical operators between two comparison results, but:

* It only handled IOR, which doesn't cover all the cases needed here.
  The others are easily added.

* It treated comparisons of integers as having an ORDERED/UNORDERED result.
  Therefore:

  * it would not treat "true for LT + EQ + GT" as "always true" for
    comparisons between integers, because the mask excluded the UNORDERED
    condition.

  * it would try to convert "true for LT + GT" into LTGT even for comparisons
    between integers.  To prevent an ICE later, the code used:

       /* Many comparison codes are only valid for certain mode classes.  */
       if (!comparison_code_valid_for_mode (code, mode))
         return 0;

    However, this used the wrong mode, since "mode" is here the integer
    result of the comparisons (and the mode of the IOR), not the mode of
    the things being compared.  Thus the effect was to reject all
    floating-point-only codes, even when comparing floats.

  I think instead the code should detect whether the comparison is between
  integer values and remove UNORDERED from consideration if so.  It then
  always produces a valid comparison (or an always true/false result),
  and so comparison_code_valid_for_mode is not needed.  In particular,
  "true for LT + GT" becomes NE for comparisons between integers but
  remains LTGT for comparisons between floats.

* There was a missing check for whether the comparison inputs had
  side effects.

While there, it also seemed worth extending
simplify_logical_relational_operation to unsigned comparisons, since
that makes the testing easier.

As far as that testing goes: the patch exhaustively tests all
combinations of integer comparisons in:

  (cmp1 (cmp2 X Y) (cmp3 X Y))

for the 10 integer comparisons, giving 1000 fold attempts in total.
It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
on the result of the fold, giving 9 checks per fold, or 9000 in total.
That's probably more than is typical for self-tests, but it seems to
complete in neglible time, even for -O0 builds.

Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?

The patch isn't exactly a spot fix, and the bug is ancient, so I suppose
the patch probably isn't suitable for backports.

Richard


gcc/
	PR rtl-optimization/117186
	* rtl.h (simplify_context::simplify_logical_relational_operation): Add
	an invert0_p parameter.
	* simplify-rtx.cc (unsigned_comparison_to_mask): New function.
	(mask_to_unsigned_comparison): Likewise.
	(comparison_code_valid_for_mode): Delete.
	(simplify_context::simplify_logical_relational_operation): Add
	an invert0_p parameter.  Handle AND and XOR.  Handle unsigned
	comparisons.  Handle always-false results.  Ignore the low bit
	of the mask if the operands are always ordered and remove the
	then-redundant check of comparison_code_valid_for_mode.  Check
	for side-effects in the operands before simplifying them away.
	(simplify_context::simplify_binary_operation_1): Remove
	simplification of (compare (gt ...) (lt ...)) and instead...
	(simplify_context::simplify_relational_operation_1): ...handle
	comparisons of comparisons here.
	(test_comparisons): New function.
	(test_scalar_ops): Call it.

gcc/testsuite/
	PR rtl-optimization/117186
	* gcc.dg/torture/pr117186.c: New test.
	* gcc.target/aarch64/pr117186.c: Likewise.
---
 gcc/rtl.h                                   |   3 +-
 gcc/simplify-rtx.cc                         | 276 ++++++++++++++------
 gcc/testsuite/gcc.dg/torture/pr117186.c     |  15 ++
 gcc/testsuite/gcc.target/aarch64/pr117186.c | 128 +++++++++
 4 files changed, 339 insertions(+), 83 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr117186.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr117186.c
  

Comments

Richard Biener Jan. 9, 2025, 1:02 p.m. UTC | #1
On Mon, Jan 6, 2025 at 2:12 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
> added code to treat:
>
>   (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))
>
> as a nop.  This PR shows that that isn't always correct.
> The compare in the set above is between two 0/1 booleans (at least
> on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
> produced the incoming (reg:CC cc) is unconstrained; it could be between
> arbitrary integers, or even floats.  The fold is therefore replacing a
> cc that is valid for both signed and unsigned comparisons with one that
> is only known to be valid for signed comparisons.
>
>   (gt (compare (gt cc 0) (lt cc 0) 0)
>
> does simplify to:
>
>   (gt cc 0)
>
> but:
>
>   (gtu (compare (gt cc 0) (lt cc 0) 0)
>
> does not simplify to:
>
>   (gtu cc 0)
>
> The optimisation didn't come with a testcase, but it was added for
> i386's cmpstrsi, now cmpstrnsi.  That probably doesn't matter as much
> as it once did, since it's now conditional on -minline-all-stringops.
> But the patch is almost 25 years old, so whatever the original
> motivation was, it seems likely that other things now rely on it.
>
> It therefore seems better to try to preserve the optimisation on rtl
> rather than get rid of it.  To do that, we need to look at how the
> result of the outer compare is used.  We'd therefore be looking at four
> instructions (the gt, the lt, the compare, and the use of the compare),
> but combine already allows that for 3-instruction combinations thanks
> to:
>
>   /* If the source is a COMPARE, look for the use of the comparison result
>      and try to simplify it unless we already have used undobuf.other_insn.  */
>
> When applied to boolean inputs, a comparison operator is
> effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
> simplify_logical_relational_operation already had code to simplify
> logical operators between two comparison results, but:
>
> * It only handled IOR, which doesn't cover all the cases needed here.
>   The others are easily added.
>
> * It treated comparisons of integers as having an ORDERED/UNORDERED result.
>   Therefore:
>
>   * it would not treat "true for LT + EQ + GT" as "always true" for
>     comparisons between integers, because the mask excluded the UNORDERED
>     condition.
>
>   * it would try to convert "true for LT + GT" into LTGT even for comparisons
>     between integers.  To prevent an ICE later, the code used:
>
>        /* Many comparison codes are only valid for certain mode classes.  */
>        if (!comparison_code_valid_for_mode (code, mode))
>          return 0;
>
>     However, this used the wrong mode, since "mode" is here the integer
>     result of the comparisons (and the mode of the IOR), not the mode of
>     the things being compared.  Thus the effect was to reject all
>     floating-point-only codes, even when comparing floats.
>
>   I think instead the code should detect whether the comparison is between
>   integer values and remove UNORDERED from consideration if so.  It then
>   always produces a valid comparison (or an always true/false result),
>   and so comparison_code_valid_for_mode is not needed.  In particular,
>   "true for LT + GT" becomes NE for comparisons between integers but
>   remains LTGT for comparisons between floats.
>
> * There was a missing check for whether the comparison inputs had
>   side effects.
>
> While there, it also seemed worth extending
> simplify_logical_relational_operation to unsigned comparisons, since
> that makes the testing easier.
>
> As far as that testing goes: the patch exhaustively tests all
> combinations of integer comparisons in:
>
>   (cmp1 (cmp2 X Y) (cmp3 X Y))
>
> for the 10 integer comparisons, giving 1000 fold attempts in total.
> It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
> on the result of the fold, giving 9 checks per fold, or 9000 in total.
> That's probably more than is typical for self-tests, but it seems to
> complete in neglible time, even for -O0 builds.
>
> Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?

OK.

> The patch isn't exactly a spot fix, and the bug is ancient, so I suppose
> the patch probably isn't suitable for backports.

Maybe for GCC 14, but not without some soaking time of course.

Thanks,
Richard.

> Richard
>
>
> gcc/
>         PR rtl-optimization/117186
>         * rtl.h (simplify_context::simplify_logical_relational_operation): Add
>         an invert0_p parameter.
>         * simplify-rtx.cc (unsigned_comparison_to_mask): New function.
>         (mask_to_unsigned_comparison): Likewise.
>         (comparison_code_valid_for_mode): Delete.
>         (simplify_context::simplify_logical_relational_operation): Add
>         an invert0_p parameter.  Handle AND and XOR.  Handle unsigned
>         comparisons.  Handle always-false results.  Ignore the low bit
>         of the mask if the operands are always ordered and remove the
>         then-redundant check of comparison_code_valid_for_mode.  Check
>         for side-effects in the operands before simplifying them away.
>         (simplify_context::simplify_binary_operation_1): Remove
>         simplification of (compare (gt ...) (lt ...)) and instead...
>         (simplify_context::simplify_relational_operation_1): ...handle
>         comparisons of comparisons here.
>         (test_comparisons): New function.
>         (test_scalar_ops): Call it.
>
> gcc/testsuite/
>         PR rtl-optimization/117186
>         * gcc.dg/torture/pr117186.c: New test.
>         * gcc.target/aarch64/pr117186.c: Likewise.
> ---
>  gcc/rtl.h                                   |   3 +-
>  gcc/simplify-rtx.cc                         | 276 ++++++++++++++------
>  gcc/testsuite/gcc.dg/torture/pr117186.c     |  15 ++
>  gcc/testsuite/gcc.target/aarch64/pr117186.c | 128 +++++++++
>  4 files changed, 339 insertions(+), 83 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr117186.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr117186.c
>
> diff --git a/gcc/rtl.h b/gcc/rtl.h
> index 3d211953b7d..3b676c46880 100644
> --- a/gcc/rtl.h
> +++ b/gcc/rtl.h
> @@ -3478,7 +3478,8 @@ private:
>    rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
>    rtx simplify_associative_operation (rtx_code, machine_mode, rtx, rtx);
>    rtx simplify_distributive_operation (rtx_code, machine_mode, rtx, rtx);
> -  rtx simplify_logical_relational_operation (rtx_code, machine_mode, rtx, rtx);
> +  rtx simplify_logical_relational_operation (rtx_code, machine_mode, rtx, rtx,
> +                                            bool = false);
>    rtx simplify_binary_operation_series (rtx_code, machine_mode, rtx, rtx);
>    rtx simplify_distribute_over_subregs (rtx_code, machine_mode, rtx, rtx);
>    rtx simplify_shift_const_int (rtx_code, machine_mode, rtx, unsigned int);
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index c77ce956332..71c5d3c1b1b 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -2446,6 +2446,61 @@ simplify_context::simplify_associative_operation (rtx_code code,
>    return 0;
>  }
>
> +/* If COMPARISON can be treated as an unsigned comparison, return a mask
> +   that represents it (8 if it includes <, 4 if it includes > and 2
> +   if it includes ==).  Return 0 otherwise.  */
> +static int
> +unsigned_comparison_to_mask (rtx_code comparison)
> +{
> +  switch (comparison)
> +    {
> +    case LTU:
> +      return 8;
> +    case GTU:
> +      return 4;
> +    case EQ:
> +      return 2;
> +
> +    case LEU:
> +      return 10;
> +    case GEU:
> +      return 6;
> +
> +    case NE:
> +      return 12;
> +
> +    default:
> +      return 0;
> +    }
> +}
> +
> +/* Reverse the mapping in unsigned_comparison_to_mask, going from masks
> +   to comparisons.  */
> +static rtx_code
> +mask_to_unsigned_comparison (int mask)
> +{
> +  switch (mask)
> +    {
> +    case 8:
> +      return LTU;
> +    case 4:
> +      return GTU;
> +    case 2:
> +      return EQ;
> +
> +    case 10:
> +      return LEU;
> +    case 6:
> +      return GEU;
> +
> +    case 12:
> +      return NE;
> +
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +
>  /* Return a mask describing the COMPARISON.  */
>  static int
>  comparison_to_mask (enum rtx_code comparison)
> @@ -2530,53 +2585,6 @@ mask_to_comparison (int mask)
>      }
>  }
>
> -/* Return true if CODE is valid for comparisons of mode MODE, false
> -   otherwise.
> -
> -   It is always safe to return false, even if the code was valid for the
> -   given mode as that will merely suppress optimizations.  */
> -
> -static bool
> -comparison_code_valid_for_mode (enum rtx_code code, enum machine_mode mode)
> -{
> -  switch (code)
> -    {
> -      /* These are valid for integral, floating and vector modes.  */
> -      case NE:
> -      case EQ:
> -      case GE:
> -      case GT:
> -      case LE:
> -      case LT:
> -       return (INTEGRAL_MODE_P (mode)
> -               || FLOAT_MODE_P (mode)
> -               || VECTOR_MODE_P (mode));
> -
> -      /* These are valid for floating point modes.  */
> -      case LTGT:
> -      case UNORDERED:
> -      case ORDERED:
> -      case UNEQ:
> -      case UNGE:
> -      case UNGT:
> -      case UNLE:
> -      case UNLT:
> -       return FLOAT_MODE_P (mode);
> -
> -      /* These are filtered out in simplify_logical_operation, but
> -        we check for them too as a matter of safety.   They are valid
> -        for integral and vector modes.  */
> -      case GEU:
> -      case GTU:
> -      case LEU:
> -      case LTU:
> -       return INTEGRAL_MODE_P (mode) || VECTOR_MODE_P (mode);
> -
> -      default:
> -       gcc_unreachable ();
> -    }
> -}
> -
>  /* Canonicalize RES, a scalar const0_rtx/const_true_rtx to the right
>     false/true value of comparison with MODE where comparison operands
>     have CMP_MODE.  */
> @@ -2625,17 +2633,16 @@ relational_result (machine_mode mode, machine_mode cmp_mode, rtx res)
>  }
>
>  /* Simplify a logical operation CODE with result mode MODE, operating on OP0
> -   and OP1, which should be both relational operations.  Return 0 if no such
> -   simplification is possible.  */
> +   and OP1, in the case where both are relational operations.  Assume that
> +   OP0 is inverted if INVERT0_P is true.
> +
> +   Return 0 if no such simplification is possible.  */
>  rtx
>  simplify_context::simplify_logical_relational_operation (rtx_code code,
>                                                          machine_mode mode,
> -                                                        rtx op0, rtx op1)
> +                                                        rtx op0, rtx op1,
> +                                                        bool invert0_p)
>  {
> -  /* We only handle IOR of two relational operations.  */
> -  if (code != IOR)
> -    return 0;
> -
>    if (!(COMPARISON_P (op0) && COMPARISON_P (op1)))
>      return 0;
>
> @@ -2643,28 +2650,64 @@ simplify_context::simplify_logical_relational_operation (rtx_code code,
>         && rtx_equal_p (XEXP (op0, 1), XEXP (op1, 1))))
>      return 0;
>
> +  if (side_effects_p (op0))
> +    return 0;
> +
>    enum rtx_code code0 = GET_CODE (op0);
>    enum rtx_code code1 = GET_CODE (op1);
>
> -  /* We don't handle unsigned comparisons currently.  */
> -  if (code0 == LTU || code0 == GTU || code0 == LEU || code0 == GEU)
> -    return 0;
> -  if (code1 == LTU || code1 == GTU || code1 == LEU || code1 == GEU)
> -    return 0;
> +  /* Assume at first that the comparisons are on integers, and that the
> +     operands are therefore ordered.  */
> +  int all = 14;
> +  int mask0 = unsigned_comparison_to_mask (code0);
> +  int mask1 = unsigned_comparison_to_mask (code1);
> +  bool unsigned_p = (IN_RANGE (mask0 & 12, 4, 8)
> +                    || IN_RANGE (mask1 & 12, 4, 8));
> +  if (unsigned_p)
> +    {
> +      /* We only reach here when comparing integers.  Reject mixtures of signed
> +        and unsigned comparisons.  */
> +      if (mask0 == 0 || mask1 == 0)
> +       return 0;
> +    }
> +  else
> +    {
> +      /* See whether the operands might be unordered.  */
> +      if (HONOR_NANS (GET_MODE (XEXP (op0, 0))))
> +       all = 15;
> +      mask0 = comparison_to_mask (code0) & all;
> +      mask1 = comparison_to_mask (code1) & all;
> +    }
>
> -  int mask0 = comparison_to_mask (code0);
> -  int mask1 = comparison_to_mask (code1);
> +  if (invert0_p)
> +    mask0 = mask0 ^ all;
>
> -  int mask = mask0 | mask1;
> +  int mask;
> +  if (code == AND)
> +    mask = mask0 & mask1;
> +  else if (code == IOR)
> +    mask = mask0 | mask1;
> +  else if (code == XOR)
> +    mask = mask0 ^ mask1;
> +  else
> +    return 0;
>
> -  if (mask == 15)
> +  if (mask == all)
>      return relational_result (mode, GET_MODE (op0), const_true_rtx);
>
> -  code = mask_to_comparison (mask);
> +  if (mask == 0)
> +    return relational_result (mode, GET_MODE (op0), const0_rtx);
>
> -  /* Many comparison codes are only valid for certain mode classes.  */
> -  if (!comparison_code_valid_for_mode (code, mode))
> -    return 0;
> +  if (unsigned_p)
> +    code = mask_to_unsigned_comparison (mask);
> +  else
> +    {
> +      code = mask_to_comparison (mask);
> +      /* LTGT and NE are arithmetically equivalent for ordered operands,
> +        with NE being the canonical choice.  */
> +      if (code == LTGT && all == 14)
> +       code = NE;
> +    }
>
>    op0 = XEXP (op1, 0);
>    op1 = XEXP (op1, 1);
> @@ -3217,21 +3260,6 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>        break;
>
>      case COMPARE:
> -      /* Convert (compare (gt (flags) 0) (lt (flags) 0)) to (flags).  */
> -      if (((GET_CODE (op0) == GT && GET_CODE (op1) == LT)
> -          || (GET_CODE (op0) == GTU && GET_CODE (op1) == LTU))
> -         && XEXP (op0, 1) == const0_rtx && XEXP (op1, 1) == const0_rtx)
> -       {
> -         rtx xop00 = XEXP (op0, 0);
> -         rtx xop10 = XEXP (op1, 0);
> -
> -         if (REG_P (xop00) && REG_P (xop10)
> -             && REGNO (xop00) == REGNO (xop10)
> -             && GET_MODE (xop00) == mode
> -             && GET_MODE (xop10) == mode
> -             && GET_MODE_CLASS (mode) == MODE_CC)
> -           return xop00;
> -       }
>        break;
>
>      case MINUS:
> @@ -6405,6 +6433,31 @@ simplify_context::simplify_relational_operation_1 (rtx_code code,
>           && INTVAL (XEXP (tmp, 1)) == GET_MODE_PRECISION (int_mode) - 1)
>         return simplify_gen_binary (AND, mode, XEXP (tmp, 0), const1_rtx);
>      }
> +
> +  /* For two booleans A and B:
> +
> +     A >  B == ~B & A
> +     A >= B == ~B | A
> +     A <  B == ~A & B
> +     A <= B == ~A | B
> +     A == B == ~A ^ B (== ~B ^ A)
> +     A != B ==  A ^ B
> +
> +     simplify_logical_relational_operation checks whether A and B
> +     are booleans.  */
> +  if (code == GTU || code == GT)
> +    return simplify_logical_relational_operation (AND, mode, op1, op0, true);
> +  if (code == GEU || code == GE)
> +    return simplify_logical_relational_operation (IOR, mode, op1, op0, true);
> +  if (code == LTU || code == LT)
> +    return simplify_logical_relational_operation (AND, mode, op0, op1, true);
> +  if (code == LEU || code == LE)
> +    return simplify_logical_relational_operation (IOR, mode, op0, op1, true);
> +  if (code == EQ)
> +    return simplify_logical_relational_operation (XOR, mode, op0, op1, true);
> +  if (code == NE)
> +    return simplify_logical_relational_operation (XOR, mode, op0, op1);
> +
>    return NULL_RTX;
>  }
>
> @@ -8493,6 +8546,63 @@ test_scalar_int_ext_ops2 (machine_mode bmode, machine_mode mmode,
>                  simplify_gen_unary (TRUNCATE, smode, mreg, mmode));
>  }
>
> +/* Test comparisons of comparisons, with the inner comparisons being
> +   between values of mode MODE2 and producing results of mode MODE1,
> +   and with the outer comparisons producing results of mode MODE0.  */
> +
> +static void
> +test_comparisons (machine_mode mode0, machine_mode mode1, machine_mode mode2)
> +{
> +  rtx reg0 = make_test_reg (mode2);
> +  rtx reg1 = make_test_reg (mode2);
> +
> +  static const rtx_code codes[] = {
> +    EQ, NE, LT, LTU, LE, LEU, GE, GEU, GT, GTU
> +  };
> +  constexpr auto num_codes = ARRAY_SIZE (codes);
> +  rtx cmps[num_codes];
> +  rtx vals[] = { constm1_rtx, const0_rtx, const1_rtx };
> +
> +  for (unsigned int i = 0; i < num_codes; ++i)
> +    cmps[i] = gen_rtx_fmt_ee (codes[i], mode1, reg0, reg1);
> +
> +  for (auto code : codes)
> +    for (unsigned int i0 = 0; i0 < num_codes; ++i0)
> +      for (unsigned int i1 = 0; i1 < num_codes; ++i1)
> +       {
> +         rtx cmp_res = simplify_relational_operation (code, mode0, mode1,
> +                                                      cmps[i0], cmps[i1]);
> +         if (i0 >= 2 && i1 >= 2 && (i0 ^ i1) & 1)
> +           ASSERT_TRUE (cmp_res == NULL_RTX);
> +         else
> +           {
> +             ASSERT_TRUE (cmp_res != NULL_RTX
> +                          && (CONSTANT_P (cmp_res)
> +                              || (COMPARISON_P (cmp_res)
> +                                  && GET_MODE (cmp_res) == mode0
> +                                  && REG_P (XEXP (cmp_res, 0))
> +                                  && REG_P (XEXP (cmp_res, 1)))));
> +             for (rtx reg0_val : vals)
> +               for (rtx reg1_val : vals)
> +                 {
> +                   rtx val0 = simplify_const_relational_operation
> +                     (codes[i0], mode1, reg0_val, reg1_val);
> +                   rtx val1 = simplify_const_relational_operation
> +                     (codes[i1], mode1, reg0_val, reg1_val);
> +                   rtx val = simplify_const_relational_operation
> +                     (code, mode0, val0, val1);
> +                   rtx folded = cmp_res;
> +                   if (COMPARISON_P (cmp_res))
> +                     folded = simplify_const_relational_operation
> +                       (GET_CODE (cmp_res), mode0,
> +                        XEXP (cmp_res, 0) == reg0 ? reg0_val : reg1_val,
> +                        XEXP (cmp_res, 1) == reg0 ? reg0_val : reg1_val);
> +                   ASSERT_RTX_EQ (val, folded);
> +                 }
> +           }
> +       }
> +}
> +
>
>  /* Verify some simplifications involving scalar expressions.  */
>
> @@ -8517,6 +8627,8 @@ test_scalar_ops ()
>    test_scalar_int_ext_ops2 (DImode, HImode, QImode);
>    test_scalar_int_ext_ops2 (DImode, SImode, QImode);
>    test_scalar_int_ext_ops2 (DImode, SImode, HImode);
> +
> +  test_comparisons (QImode, HImode, SImode);
>  }
>
>  /* Test vector simplifications involving VEC_DUPLICATE in which the
> diff --git a/gcc/testsuite/gcc.dg/torture/pr117186.c b/gcc/testsuite/gcc.dg/torture/pr117186.c
> new file mode 100644
> index 00000000000..83bcb650102
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr117186.c
> @@ -0,0 +1,15 @@
> +/* { dg-do run } */
> +
> +[[gnu::noipa]] int
> +f1 (int x, int y)
> +{
> +  return (x < y) < (y < x);
> +}
> +
> +int
> +main (void)
> +{
> +  if (f1 (-1, 0))
> +    __builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr117186.c b/gcc/testsuite/gcc.target/aarch64/pr117186.c
> new file mode 100644
> index 00000000000..d6e2a4af58c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr117186.c
> @@ -0,0 +1,128 @@
> +/* { dg-options "-O" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +/*
> +** f1:
> +** (
> +**     cmp     w0, w1
> +**     cset    w0, gt
> +** |
> +**     cmp     w1, w0
> +**     cset    w0, lt
> +** )
> +**     ret
> +*/
> +int
> +f1 (int x, int y)
> +{
> +  return (x < y) < (y < x);
> +}
> +
> +/*
> +** f2:
> +**     mov     w0, 0
> +**     ret
> +*/
> +int
> +f2 (int x, int y)
> +{
> +  return (x <= y) < (x < y);
> +}
> +
> +/*
> +** f3:
> +**     cmp     (w0, w1|w1, w0)
> +**     cset    w0, ne
> +**     ret
> +*/
> +int
> +f3 (int x, int y)
> +{
> +  return (x <= y) == (x < y);
> +}
> +
> +/*
> +** f4:
> +**     cmp     (w0, w1|w1, w0)
> +**     cset    w0, ne
> +**     ret
> +*/
> +int
> +f4 (int x, int y)
> +{
> +  return (y < x) != (x < y);
> +}
> +
> +/*
> +** f5:
> +**     cmp     (w0, w1|w1, w0)
> +**     cset    w0, eq
> +**     ret
> +*/
> +int
> +f5 (int x, int y)
> +{
> +  return (y >= x) > (y > x);
> +}
> +
> +/*
> +** f6:
> +** (
> +**     cmp     w0, w1
> +**     cset    w0, ge
> +** |
> +**     cmp     w1, w0
> +**     cset    w0, le
> +** )
> +**     ret
> +*/
> +int
> +f6 (int x, int y)
> +{
> +  return (x < y) < (y <= x);
> +}
> +
> +/*
> +** f7:
> +**     mov     w0, 1
> +**     ret
> +*/
> +int
> +f7 (int x, int y)
> +{
> +  return (x < y) <= (x <= y);
> +}
> +
> +/*
> +** f8:
> +** (
> +**     cmp     w0, w1
> +**     cset    w0, hi
> +** |
> +**     cmp     w1, w0
> +**     cset    w0, ls
> +** )
> +**     ret
> +*/
> +int
> +f8 (unsigned int x, unsigned int y)
> +{
> +  return (x < y) < (y < x);
> +}
> +
> +/*
> +** f9:
> +** (
> +**     cmp     w0, w1
> +**     cset    w0, (hs|cs)
> +** |
> +**     cmp     w1, w0
> +**     cset    w0, (lo|cc)
> +** )
> +**     ret
> +*/
> +int
> +f9 (unsigned int x, unsigned int y)
> +{
> +  return (x < y) < (y <= x);
> +}
> --
> 2.25.1
>
  
Richard Sandiford Jan. 10, 2025, 12:56 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Jan 6, 2025 at 2:12 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at
>> https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
>> added code to treat:
>>
>>   (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))
>>
>> as a nop.  This PR shows that that isn't always correct.
>> The compare in the set above is between two 0/1 booleans (at least
>> on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
>> produced the incoming (reg:CC cc) is unconstrained; it could be between
>> arbitrary integers, or even floats.  The fold is therefore replacing a
>> cc that is valid for both signed and unsigned comparisons with one that
>> is only known to be valid for signed comparisons.
>>
>>   (gt (compare (gt cc 0) (lt cc 0) 0)
>>
>> does simplify to:
>>
>>   (gt cc 0)
>>
>> but:
>>
>>   (gtu (compare (gt cc 0) (lt cc 0) 0)
>>
>> does not simplify to:
>>
>>   (gtu cc 0)
>>
>> The optimisation didn't come with a testcase, but it was added for
>> i386's cmpstrsi, now cmpstrnsi.  That probably doesn't matter as much
>> as it once did, since it's now conditional on -minline-all-stringops.
>> But the patch is almost 25 years old, so whatever the original
>> motivation was, it seems likely that other things now rely on it.
>>
>> It therefore seems better to try to preserve the optimisation on rtl
>> rather than get rid of it.  To do that, we need to look at how the
>> result of the outer compare is used.  We'd therefore be looking at four
>> instructions (the gt, the lt, the compare, and the use of the compare),
>> but combine already allows that for 3-instruction combinations thanks
>> to:
>>
>>   /* If the source is a COMPARE, look for the use of the comparison result
>>      and try to simplify it unless we already have used undobuf.other_insn.  */
>>
>> When applied to boolean inputs, a comparison operator is
>> effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
>> simplify_logical_relational_operation already had code to simplify
>> logical operators between two comparison results, but:
>>
>> * It only handled IOR, which doesn't cover all the cases needed here.
>>   The others are easily added.
>>
>> * It treated comparisons of integers as having an ORDERED/UNORDERED result.
>>   Therefore:
>>
>>   * it would not treat "true for LT + EQ + GT" as "always true" for
>>     comparisons between integers, because the mask excluded the UNORDERED
>>     condition.
>>
>>   * it would try to convert "true for LT + GT" into LTGT even for comparisons
>>     between integers.  To prevent an ICE later, the code used:
>>
>>        /* Many comparison codes are only valid for certain mode classes.  */
>>        if (!comparison_code_valid_for_mode (code, mode))
>>          return 0;
>>
>>     However, this used the wrong mode, since "mode" is here the integer
>>     result of the comparisons (and the mode of the IOR), not the mode of
>>     the things being compared.  Thus the effect was to reject all
>>     floating-point-only codes, even when comparing floats.
>>
>>   I think instead the code should detect whether the comparison is between
>>   integer values and remove UNORDERED from consideration if so.  It then
>>   always produces a valid comparison (or an always true/false result),
>>   and so comparison_code_valid_for_mode is not needed.  In particular,
>>   "true for LT + GT" becomes NE for comparisons between integers but
>>   remains LTGT for comparisons between floats.
>>
>> * There was a missing check for whether the comparison inputs had
>>   side effects.
>>
>> While there, it also seemed worth extending
>> simplify_logical_relational_operation to unsigned comparisons, since
>> that makes the testing easier.
>>
>> As far as that testing goes: the patch exhaustively tests all
>> combinations of integer comparisons in:
>>
>>   (cmp1 (cmp2 X Y) (cmp3 X Y))
>>
>> for the 10 integer comparisons, giving 1000 fold attempts in total.
>> It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
>> on the result of the fold, giving 9 checks per fold, or 9000 in total.
>> That's probably more than is typical for self-tests, but it seems to
>> complete in neglible time, even for -O0 builds.
>>
>> Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?
>
> OK.

Thanks!

>> The patch isn't exactly a spot fix, and the bug is ancient, so I suppose
>> the patch probably isn't suitable for backports.
>
> Maybe for GCC 14, but not without some soaking time of course.

OK, I'll leave the PR open for that.

JTFR, I noticed while reading the patch back that I'd fluffed the
attempt to future-proof the aarch64 test.  The ls and lo conditions in:

>> +/*
>> +** f8:
>> +** (
>> +**     cmp     w0, w1
>> +**     cset    w0, hi
>> +** |
>> +**     cmp     w1, w0
>> +**     cset    w0, ls
>> +** )
>> +**     ret
>> +*/
>> +int
>> +f8 (unsigned int x, unsigned int y)
>> +{
>> +  return (x < y) < (y < x);
>> +}
>> +
>> +/*
>> +** f9:
>> +** (
>> +**     cmp     w0, w1
>> +**     cset    w0, (hs|cs)
>> +** |
>> +**     cmp     w1, w0
>> +**     cset    w0, (lo|cc)
>> +** )
>> +**     ret
>> +*/
>> +int
>> +f9 (unsigned int x, unsigned int y)
>> +{
>> +  return (x < y) < (y <= x);
>> +}
>> --
>> 2.25.1
>>

were the wrong way around.  I've pushed the patch with that change.

Richard
  
Andreas Schwab Jan. 11, 2025, 12:42 p.m. UTC | #3
This breaks m68k:

$ ./xgcc -B./  -xc -nostdinc /dev/null -S -o /dev/null -fself-test=../../gcc/testsuite/selftests
../../gcc/simplify-rtx.cc:8600: test_comparisons: FAIL: ASSERT_RTX_EQ (val, folded)
  expected: (const_int -1 [0xffffffffffffffff])
  actual: (const_int 0 [0])
cc1: internal compiler error: in assert_rtx_eq_at, at selftest-rtl.cc:57
0x1d1d7bf internal_error(char const*, ...)
        ../../gcc/diagnostic-global-context.cc:517
0x685a13 fancy_abort(char const*, int, char const*)
        ../../gcc/diagnostic.cc:1722
0xe1d284 selftest::assert_rtx_eq_at(selftest::location const&, char const*, rtx_def*, rtx_def*)
        ../../gcc/selftest-rtl.cc:57
0xe3834c test_comparisons
        ../../gcc/simplify-rtx.cc:8600
0xe3834c test_scalar_ops
        ../../gcc/simplify-rtx.cc:8631
0xe3834c selftest::simplify_rtx_cc_tests()
        ../../gcc/simplify-rtx.cc:9206
0x1c19486 selftest::run_tests()
        ../../gcc/selftest-run-tests.cc:122
0xe66e2a toplev::run_self_tests()
        ../../gcc/toplev.cc:2267
  
Tobias Burnus Jan. 13, 2025, 9:48 a.m. UTC | #4
Andreas Schwab wrote:
> This breaks m68k:

Same issue on GCN, hence I filed https://gcc.gnu.org/PR118418

If I look at the debugging output, see  PR, it seems as if
the self-test function test_comparisons contains the assumption:
    FALSE < TRUE
but if TRUE is -1, that assumption does not hold
(for signed variables).

And both GCN and m68k '#define STORE_FLAG_VALUE -1',
as Andreas noted.

Tobias
  

Patch

diff --git a/gcc/rtl.h b/gcc/rtl.h
index 3d211953b7d..3b676c46880 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3478,7 +3478,8 @@  private:
   rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
   rtx simplify_associative_operation (rtx_code, machine_mode, rtx, rtx);
   rtx simplify_distributive_operation (rtx_code, machine_mode, rtx, rtx);
-  rtx simplify_logical_relational_operation (rtx_code, machine_mode, rtx, rtx);
+  rtx simplify_logical_relational_operation (rtx_code, machine_mode, rtx, rtx,
+					     bool = false);
   rtx simplify_binary_operation_series (rtx_code, machine_mode, rtx, rtx);
   rtx simplify_distribute_over_subregs (rtx_code, machine_mode, rtx, rtx);
   rtx simplify_shift_const_int (rtx_code, machine_mode, rtx, unsigned int);
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index c77ce956332..71c5d3c1b1b 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2446,6 +2446,61 @@  simplify_context::simplify_associative_operation (rtx_code code,
   return 0;
 }
 
+/* If COMPARISON can be treated as an unsigned comparison, return a mask
+   that represents it (8 if it includes <, 4 if it includes > and 2
+   if it includes ==).  Return 0 otherwise.  */
+static int
+unsigned_comparison_to_mask (rtx_code comparison)
+{
+  switch (comparison)
+    {
+    case LTU:
+      return 8;
+    case GTU:
+      return 4;
+    case EQ:
+      return 2;
+
+    case LEU:
+      return 10;
+    case GEU:
+      return 6;
+
+    case NE:
+      return 12;
+
+    default:
+      return 0;
+    }
+}
+
+/* Reverse the mapping in unsigned_comparison_to_mask, going from masks
+   to comparisons.  */
+static rtx_code
+mask_to_unsigned_comparison (int mask)
+{
+  switch (mask)
+    {
+    case 8:
+      return LTU;
+    case 4:
+      return GTU;
+    case 2:
+      return EQ;
+
+    case 10:
+      return LEU;
+    case 6:
+      return GEU;
+
+    case 12:
+      return NE;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Return a mask describing the COMPARISON.  */
 static int
 comparison_to_mask (enum rtx_code comparison)
@@ -2530,53 +2585,6 @@  mask_to_comparison (int mask)
     }
 }
 
-/* Return true if CODE is valid for comparisons of mode MODE, false
-   otherwise.
-
-   It is always safe to return false, even if the code was valid for the
-   given mode as that will merely suppress optimizations.  */
-
-static bool
-comparison_code_valid_for_mode (enum rtx_code code, enum machine_mode mode)
-{
-  switch (code)
-    {
-      /* These are valid for integral, floating and vector modes.  */
-      case NE:
-      case EQ:
-      case GE:
-      case GT:
-      case LE:
-      case LT:
-	return (INTEGRAL_MODE_P (mode)
-		|| FLOAT_MODE_P (mode)
-		|| VECTOR_MODE_P (mode));
-
-      /* These are valid for floating point modes.  */
-      case LTGT:
-      case UNORDERED:
-      case ORDERED:
-      case UNEQ:
-      case UNGE:
-      case UNGT:
-      case UNLE:
-      case UNLT:
-	return FLOAT_MODE_P (mode);
-
-      /* These are filtered out in simplify_logical_operation, but
-	 we check for them too as a matter of safety.   They are valid
-	 for integral and vector modes.  */
-      case GEU:
-      case GTU:
-      case LEU:
-      case LTU:
-	return INTEGRAL_MODE_P (mode) || VECTOR_MODE_P (mode);
-
-      default:
-	gcc_unreachable ();
-    }
-}
-
 /* Canonicalize RES, a scalar const0_rtx/const_true_rtx to the right
    false/true value of comparison with MODE where comparison operands
    have CMP_MODE.  */
@@ -2625,17 +2633,16 @@  relational_result (machine_mode mode, machine_mode cmp_mode, rtx res)
 }
 
 /* Simplify a logical operation CODE with result mode MODE, operating on OP0
-   and OP1, which should be both relational operations.  Return 0 if no such
-   simplification is possible.  */
+   and OP1, in the case where both are relational operations.  Assume that
+   OP0 is inverted if INVERT0_P is true.
+
+   Return 0 if no such simplification is possible.  */
 rtx
 simplify_context::simplify_logical_relational_operation (rtx_code code,
 							 machine_mode mode,
-							 rtx op0, rtx op1)
+							 rtx op0, rtx op1,
+							 bool invert0_p)
 {
-  /* We only handle IOR of two relational operations.  */
-  if (code != IOR)
-    return 0;
-
   if (!(COMPARISON_P (op0) && COMPARISON_P (op1)))
     return 0;
 
@@ -2643,28 +2650,64 @@  simplify_context::simplify_logical_relational_operation (rtx_code code,
 	&& rtx_equal_p (XEXP (op0, 1), XEXP (op1, 1))))
     return 0;
 
+  if (side_effects_p (op0))
+    return 0;
+
   enum rtx_code code0 = GET_CODE (op0);
   enum rtx_code code1 = GET_CODE (op1);
 
-  /* We don't handle unsigned comparisons currently.  */
-  if (code0 == LTU || code0 == GTU || code0 == LEU || code0 == GEU)
-    return 0;
-  if (code1 == LTU || code1 == GTU || code1 == LEU || code1 == GEU)
-    return 0;
+  /* Assume at first that the comparisons are on integers, and that the
+     operands are therefore ordered.  */
+  int all = 14;
+  int mask0 = unsigned_comparison_to_mask (code0);
+  int mask1 = unsigned_comparison_to_mask (code1);
+  bool unsigned_p = (IN_RANGE (mask0 & 12, 4, 8)
+		     || IN_RANGE (mask1 & 12, 4, 8));
+  if (unsigned_p)
+    {
+      /* We only reach here when comparing integers.  Reject mixtures of signed
+	 and unsigned comparisons.  */
+      if (mask0 == 0 || mask1 == 0)
+	return 0;
+    }
+  else
+    {
+      /* See whether the operands might be unordered.  */
+      if (HONOR_NANS (GET_MODE (XEXP (op0, 0))))
+	all = 15;
+      mask0 = comparison_to_mask (code0) & all;
+      mask1 = comparison_to_mask (code1) & all;
+    }
 
-  int mask0 = comparison_to_mask (code0);
-  int mask1 = comparison_to_mask (code1);
+  if (invert0_p)
+    mask0 = mask0 ^ all;
 
-  int mask = mask0 | mask1;
+  int mask;
+  if (code == AND)
+    mask = mask0 & mask1;
+  else if (code == IOR)
+    mask = mask0 | mask1;
+  else if (code == XOR)
+    mask = mask0 ^ mask1;
+  else
+    return 0;
 
-  if (mask == 15)
+  if (mask == all)
     return relational_result (mode, GET_MODE (op0), const_true_rtx);
 
-  code = mask_to_comparison (mask);
+  if (mask == 0)
+    return relational_result (mode, GET_MODE (op0), const0_rtx);
 
-  /* Many comparison codes are only valid for certain mode classes.  */
-  if (!comparison_code_valid_for_mode (code, mode))
-    return 0;
+  if (unsigned_p)
+    code = mask_to_unsigned_comparison (mask);
+  else
+    {
+      code = mask_to_comparison (mask);
+      /* LTGT and NE are arithmetically equivalent for ordered operands,
+	 with NE being the canonical choice.  */
+      if (code == LTGT && all == 14)
+	code = NE;
+    }
 
   op0 = XEXP (op1, 0);
   op1 = XEXP (op1, 1);
@@ -3217,21 +3260,6 @@  simplify_context::simplify_binary_operation_1 (rtx_code code,
       break;
 
     case COMPARE:
-      /* Convert (compare (gt (flags) 0) (lt (flags) 0)) to (flags).  */
-      if (((GET_CODE (op0) == GT && GET_CODE (op1) == LT)
-	   || (GET_CODE (op0) == GTU && GET_CODE (op1) == LTU))
-	  && XEXP (op0, 1) == const0_rtx && XEXP (op1, 1) == const0_rtx)
-	{
-	  rtx xop00 = XEXP (op0, 0);
-	  rtx xop10 = XEXP (op1, 0);
-
-	  if (REG_P (xop00) && REG_P (xop10)
-	      && REGNO (xop00) == REGNO (xop10)
-	      && GET_MODE (xop00) == mode
-	      && GET_MODE (xop10) == mode
-	      && GET_MODE_CLASS (mode) == MODE_CC)
-	    return xop00;
-	}
       break;
 
     case MINUS:
@@ -6405,6 +6433,31 @@  simplify_context::simplify_relational_operation_1 (rtx_code code,
 	  && INTVAL (XEXP (tmp, 1)) == GET_MODE_PRECISION (int_mode) - 1)
 	return simplify_gen_binary (AND, mode, XEXP (tmp, 0), const1_rtx);
     }
+
+  /* For two booleans A and B:
+
+     A >  B == ~B & A
+     A >= B == ~B | A
+     A <  B == ~A & B
+     A <= B == ~A | B
+     A == B == ~A ^ B (== ~B ^ A)
+     A != B ==  A ^ B
+
+     simplify_logical_relational_operation checks whether A and B
+     are booleans.  */
+  if (code == GTU || code == GT)
+    return simplify_logical_relational_operation (AND, mode, op1, op0, true);
+  if (code == GEU || code == GE)
+    return simplify_logical_relational_operation (IOR, mode, op1, op0, true);
+  if (code == LTU || code == LT)
+    return simplify_logical_relational_operation (AND, mode, op0, op1, true);
+  if (code == LEU || code == LE)
+    return simplify_logical_relational_operation (IOR, mode, op0, op1, true);
+  if (code == EQ)
+    return simplify_logical_relational_operation (XOR, mode, op0, op1, true);
+  if (code == NE)
+    return simplify_logical_relational_operation (XOR, mode, op0, op1);
+
   return NULL_RTX;
 }
 
@@ -8493,6 +8546,63 @@  test_scalar_int_ext_ops2 (machine_mode bmode, machine_mode mmode,
 		 simplify_gen_unary (TRUNCATE, smode, mreg, mmode));
 }
 
+/* Test comparisons of comparisons, with the inner comparisons being
+   between values of mode MODE2 and producing results of mode MODE1,
+   and with the outer comparisons producing results of mode MODE0.  */
+
+static void
+test_comparisons (machine_mode mode0, machine_mode mode1, machine_mode mode2)
+{
+  rtx reg0 = make_test_reg (mode2);
+  rtx reg1 = make_test_reg (mode2);
+
+  static const rtx_code codes[] = {
+    EQ, NE, LT, LTU, LE, LEU, GE, GEU, GT, GTU
+  };
+  constexpr auto num_codes = ARRAY_SIZE (codes);
+  rtx cmps[num_codes];
+  rtx vals[] = { constm1_rtx, const0_rtx, const1_rtx };
+
+  for (unsigned int i = 0; i < num_codes; ++i)
+    cmps[i] = gen_rtx_fmt_ee (codes[i], mode1, reg0, reg1);
+
+  for (auto code : codes)
+    for (unsigned int i0 = 0; i0 < num_codes; ++i0)
+      for (unsigned int i1 = 0; i1 < num_codes; ++i1)
+	{
+	  rtx cmp_res = simplify_relational_operation (code, mode0, mode1,
+						       cmps[i0], cmps[i1]);
+	  if (i0 >= 2 && i1 >= 2 && (i0 ^ i1) & 1)
+	    ASSERT_TRUE (cmp_res == NULL_RTX);
+	  else
+	    {
+	      ASSERT_TRUE (cmp_res != NULL_RTX
+			   && (CONSTANT_P (cmp_res)
+			       || (COMPARISON_P (cmp_res)
+				   && GET_MODE (cmp_res) == mode0
+				   && REG_P (XEXP (cmp_res, 0))
+				   && REG_P (XEXP (cmp_res, 1)))));
+	      for (rtx reg0_val : vals)
+		for (rtx reg1_val : vals)
+		  {
+		    rtx val0 = simplify_const_relational_operation
+		      (codes[i0], mode1, reg0_val, reg1_val);
+		    rtx val1 = simplify_const_relational_operation
+		      (codes[i1], mode1, reg0_val, reg1_val);
+		    rtx val = simplify_const_relational_operation
+		      (code, mode0, val0, val1);
+		    rtx folded = cmp_res;
+		    if (COMPARISON_P (cmp_res))
+		      folded = simplify_const_relational_operation
+			(GET_CODE (cmp_res), mode0,
+			 XEXP (cmp_res, 0) == reg0 ? reg0_val : reg1_val,
+			 XEXP (cmp_res, 1) == reg0 ? reg0_val : reg1_val);
+		    ASSERT_RTX_EQ (val, folded);
+		  }
+	    }
+	}
+}
+
 
 /* Verify some simplifications involving scalar expressions.  */
 
@@ -8517,6 +8627,8 @@  test_scalar_ops ()
   test_scalar_int_ext_ops2 (DImode, HImode, QImode);
   test_scalar_int_ext_ops2 (DImode, SImode, QImode);
   test_scalar_int_ext_ops2 (DImode, SImode, HImode);
+
+  test_comparisons (QImode, HImode, SImode);
 }
 
 /* Test vector simplifications involving VEC_DUPLICATE in which the
diff --git a/gcc/testsuite/gcc.dg/torture/pr117186.c b/gcc/testsuite/gcc.dg/torture/pr117186.c
new file mode 100644
index 00000000000..83bcb650102
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr117186.c
@@ -0,0 +1,15 @@ 
+/* { dg-do run } */
+
+[[gnu::noipa]] int
+f1 (int x, int y)
+{
+  return (x < y) < (y < x);
+}
+
+int
+main (void)
+{
+  if (f1 (-1, 0))
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/pr117186.c b/gcc/testsuite/gcc.target/aarch64/pr117186.c
new file mode 100644
index 00000000000..d6e2a4af58c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr117186.c
@@ -0,0 +1,128 @@ 
+/* { dg-options "-O" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+/*
+** f1:
+** (
+**	cmp	w0, w1
+**	cset	w0, gt
+** |
+**	cmp	w1, w0
+**	cset	w0, lt
+** )
+**	ret
+*/
+int
+f1 (int x, int y)
+{
+  return (x < y) < (y < x);
+}
+
+/*
+** f2:
+**	mov	w0, 0
+**	ret
+*/
+int
+f2 (int x, int y)
+{
+  return (x <= y) < (x < y);
+}
+
+/*
+** f3:
+**	cmp	(w0, w1|w1, w0)
+**	cset	w0, ne
+**	ret
+*/
+int
+f3 (int x, int y)
+{
+  return (x <= y) == (x < y);
+}
+
+/*
+** f4:
+**	cmp	(w0, w1|w1, w0)
+**	cset	w0, ne
+**	ret
+*/
+int
+f4 (int x, int y)
+{
+  return (y < x) != (x < y);
+}
+
+/*
+** f5:
+**	cmp	(w0, w1|w1, w0)
+**	cset	w0, eq
+**	ret
+*/
+int
+f5 (int x, int y)
+{
+  return (y >= x) > (y > x);
+}
+
+/*
+** f6:
+** (
+**	cmp	w0, w1
+**	cset	w0, ge
+** |
+**	cmp	w1, w0
+**	cset	w0, le
+** )
+**	ret
+*/
+int
+f6 (int x, int y)
+{
+  return (x < y) < (y <= x);
+}
+
+/*
+** f7:
+**	mov	w0, 1
+**	ret
+*/
+int
+f7 (int x, int y)
+{
+  return (x < y) <= (x <= y);
+}
+
+/*
+** f8:
+** (
+**	cmp	w0, w1
+**	cset	w0, hi
+** |
+**	cmp	w1, w0
+**	cset	w0, ls
+** )
+**	ret
+*/
+int
+f8 (unsigned int x, unsigned int y)
+{
+  return (x < y) < (y < x);
+}
+
+/*
+** f9:
+** (
+**	cmp	w0, w1
+**	cset	w0, (hs|cs)
+** |
+**	cmp	w1, w0
+**	cset	w0, (lo|cc)
+** )
+**	ret
+*/
+int
+f9 (unsigned int x, unsigned int y)
+{
+  return (x < y) < (y <= x);
+}