[v4,2/4] tree-optimization/116024 - simplify C1-X cmp C2 for unsigned types

Message ID 20240930080408.2501963-3-artemiy@synopsys.com
State Committed
Commit 65b33d43d29b148e127b1ba997f1bbc2c7028b94
Headers
Series tree-optimization/116024 - match.pd: add 4 int-compare simplifications |

Checks

Context Check Description
rivoscibot/toolchain-ci-rivos-apply-patch success Patch applied
rivoscibot/toolchain-ci-rivos-lint success Lint passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gcv-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gcv-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc_zba_zbb_zbc_zbs-lp64d-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gc-lp64d-non-multilib success Build passed
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc-lp64d-non-multilib success Build passed
rivoscibot/toolchain-ci-rivos-test success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Test passed

Commit Message

Artemiy Volkov Sept. 30, 2024, 8:04 a.m. UTC
  Implement a match.pd transformation inverting the sign of X in
C1 - X cmp C2, where C1 and C2 are integer constants and X is
of an unsigned type, by observing that:

(a) If cmp is == or !=, simply move X and C2 to opposite sides of the
comparison to arrive at X cmp C1 - C2.

(b) If cmp is <:
	- C1 - X < C2 means that C1 - X spans the range of 0, 1, ..., C2 - 1;
        - This means that X spans the range of C1 - (C2 - 1),
	  C1 - (C2 - 2), ..., C1;
	- Subtracting C1 - (C2 - 1), X - (C1 - (C2 - 1)) is one of 0, 1,
	  ..., C1 - (C1 - (C2 - 1));
        - Simplifying the above, X - (C1 - C2 + 1) is one of 0, 1, ...,
         C2 - 1;
        - Summarizing, the expression C1 - X < C2 can be transformed
	  into X - (C1 - C2 + 1) < C2.

(c) Similarly, if cmp is <=:
	- C1 - X <= C2 means that C1 - X is one of 0, 1, ..., C2;
	- It follows that X is one of C1 - C2, C1 - (C2 - 1), ..., C1;
        - Subtracting C1 - C2, X - (C1 - C2) has range 0, 1, ..., C2;
        - Thus, the expression C1 - X <= C2 can be transformed into
	  X - (C1 - C2) <= C2.

(d) The >= and > cases are negations of (b) and (c), respectively.

This transformation allows to occasionally save load-immediate /
subtraction instructions, e.g. the following statement:

300 - (unsigned int)f() < 100;

now compiles to

addi    a0,a0,-201
sltiu   a0,a0,100

instead of

li      a5,300
sub     a0,a5,a0
sltiu   a0,a0,100

on 32-bit RISC-V.

Additional examples can be found in the newly added test file.  This
patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
and additionally regtested on riscv32.

gcc/ChangeLog:

        PR tree-optimization/116024
        * match.pd: New transformation around integer comparison.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/pr116024-1.c: New test.

Signed-off-by: Artemiy Volkov <artemiy@synopsys.com>
---
 gcc/match.pd                               | 23 +++++++-
 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c | 65 ++++++++++++++++++++++
 2 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c
  

Comments

Jeff Law Oct. 8, 2024, 11:56 p.m. UTC | #1
On 9/30/24 2:04 AM, Artemiy Volkov wrote:
> Implement a match.pd transformation inverting the sign of X in
> C1 - X cmp C2, where C1 and C2 are integer constants and X is
> of an unsigned type, by observing that:
> 
> (a) If cmp is == or !=, simply move X and C2 to opposite sides of the
> comparison to arrive at X cmp C1 - C2.
> 
> (b) If cmp is <:
> 	- C1 - X < C2 means that C1 - X spans the range of 0, 1, ..., C2 - 1;
>          - This means that X spans the range of C1 - (C2 - 1),
> 	  C1 - (C2 - 2), ..., C1;
> 	- Subtracting C1 - (C2 - 1), X - (C1 - (C2 - 1)) is one of 0, 1,
> 	  ..., C1 - (C1 - (C2 - 1));
>          - Simplifying the above, X - (C1 - C2 + 1) is one of 0, 1, ...,
>           C2 - 1;
>          - Summarizing, the expression C1 - X < C2 can be transformed
> 	  into X - (C1 - C2 + 1) < C2.
> 
> (c) Similarly, if cmp is <=:
> 	- C1 - X <= C2 means that C1 - X is one of 0, 1, ..., C2;
> 	- It follows that X is one of C1 - C2, C1 - (C2 - 1), ..., C1;
>          - Subtracting C1 - C2, X - (C1 - C2) has range 0, 1, ..., C2;
>          - Thus, the expression C1 - X <= C2 can be transformed into
> 	  X - (C1 - C2) <= C2.
> 
> (d) The >= and > cases are negations of (b) and (c), respectively.
> 
> This transformation allows to occasionally save load-immediate /
> subtraction instructions, e.g. the following statement:
> 
> 300 - (unsigned int)f() < 100;
> 
> now compiles to
> 
> addi    a0,a0,-201
> sltiu   a0,a0,100
> 
> instead of
> 
> li      a5,300
> sub     a0,a5,a0
> sltiu   a0,a0,100
> 
> on 32-bit RISC-V.
> 
> Additional examples can be found in the newly added test file.  This
> patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
> and additionally regtested on riscv32.
> 
> gcc/ChangeLog:
> 
>          PR tree-optimization/116024
>          * match.pd: New transformation around integer comparison.
> 
> gcc/testsuite/ChangeLog:
> 
>          * gcc.dg/tree-ssa/pr116024-1.c: New test.
> 
> Signed-off-by: Artemiy Volkov <artemiy@synopsys.com>
Thanks.  I've pushed this to the trunk.
jeff
  

Patch

diff --git a/gcc/match.pd b/gcc/match.pd
index b074f49eebd..46195a603d0 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -9020,7 +9020,28 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 				     TYPE_SIGN (TREE_TYPE (@0)));
 	   constant_boolean_node (less == ovf_high, type);
 	 })
-      (rcmp @1 { res; }))))))
+      (rcmp @1 { res; })))
+/* For unsigned types, transform like so (using < as example):
+	 C1 - X < C2
+      ==>  C1 - X = { 0, 1, ..., C2 - 1 }
+      ==>  X = { C1 - (C2 - 1), ..., C1 + 1, C1 }
+      ==>  X - (C1 - (C2 - 1)) = { 0, 1, ..., C1 - (C1 - (C2 - 1)) }
+      ==>  X - (C1 - C2 + 1) = { 0, 1, ..., C2 - 1 }
+      ==>  X - (C1 - C2 + 1) < C2.
+
+      Similarly,
+	 C1 - X <= C2 ==> X - (C1 - C2) <= C2;
+	 C1 - X >= C2 ==> X - (C1 - C2 + 1) >= C2;
+	 C1 - X > C2 ==> X - (C1 - C2) > C2.  */
+   (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
+     (switch
+       (if (cmp == EQ_EXPR || cmp == NE_EXPR)
+	 (cmp @1 (minus @0 @2)))
+       (if (cmp == LE_EXPR || cmp == GT_EXPR)
+	 (cmp (plus @1 (minus @2 @0)) @2))
+       (if (cmp == LT_EXPR || cmp == GE_EXPR)
+	 (cmp (plus @1 (minus @2
+		   (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))))))
 
 /* Canonicalizations of BIT_FIELD_REFs.  */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c
new file mode 100644
index 00000000000..91cb6a7c4f1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1.c
@@ -0,0 +1,65 @@ 
+/* PR tree-optimization/116024 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-forwprop1-details" } */
+
+#include <stdint.h>
+
+uint32_t f(void);
+
+int32_t i2(void)
+{
+  uint32_t l = 10 - (uint32_t)f();
+  return l <= 20; // f() + 10 <= 20 
+}
+
+int32_t i2a(void)
+{
+  uint32_t l = 10 - (uint32_t)f();
+  return l < 30; // f() + 19 < 30 
+}
+
+int32_t i2b(void)
+{
+  uint32_t l = 200 - (uint32_t)f();
+  return l <= 100; // f() - 100 <= 100 
+}
+
+int32_t i2c(void)
+{
+  uint32_t l = 300 - (uint32_t)f();
+  return l < 100; // f() - 201 < 100
+}
+
+int32_t i2d(void)
+{
+  uint32_t l = 1000 - (uint32_t)f();
+  return l >= 2000; // f() + 999 >= 2000
+}
+
+int32_t i2e(void)
+{
+  uint32_t l = 1000 - (uint32_t)f();
+  return l > 3000; // f() + 2000 > 3000
+}
+
+int32_t i2f(void)
+{
+  uint32_t l = 20000 - (uint32_t)f();
+  return l >= 10000; // f() - 10001 >= 10000
+}
+
+int32_t i2g(void)
+{
+  uint32_t l = 30000 - (uint32_t)f();
+  return l > 10000; // f() - 20000 > 10000
+}
+
+/* { dg-final { scan-tree-dump-times "Removing dead stmt:.*?- _" 8 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 10.*\n.*<= 20" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 19.*\n.*<= 29" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 4294967196.*\n.*<= 100" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 4294967095.*\n.*<= 99" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 999.*\n.*> 1999" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 2000.*\n.*> 3000" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 4294957295.*\n.*> 9999" 1 "forwprop1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 4294947296.*\n.*> 10000" 1 "forwprop1" } } */