[v2] MATCH: Simplify `a rrotate (32-b) -> a lrotate b` [PR109906]
Checks
Context | Check | Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 | success | Test passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm | success | Test passed
Commit Message
The pattern `a rrotate (32-b)` should be optimized to `a lrotate b`.
The same is also true for `a lrotate (32-b)`. It can be optimized to
`a rrotate b`.
This patch adds the following patterns:
a rrotate (32-b) -> a lrotate b
a lrotate (32-b) -> a rrotate b
PR tree-optimization/109906
gcc/ChangeLog:
* match.pd (a rrotate (32-b) -> a lrotate b): New pattern
(a lrotate (32-b) -> a rrotate b): New pattern
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr109906.c: New test.
---
gcc/match.pd | 9 ++++++
gcc/testsuite/gcc.dg/tree-ssa/pr109906.c | 40 ++++++++++++++++++++++++
2 files changed, 49 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr109906.c
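The identity behind the two patterns in the commit message can be checked outside the compiler. The following is a minimal sketch, not part of the patch: the helper names `rotl32`, `rotr32` and `count_rotate_mismatches` are hypothetical, the operand is assumed to be exactly 32 bits wide, and `b` is restricted to 1..31.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the patch: rotate a 32-bit value
   left or right by n, with n reduced modulo 32 so a count of 0 is safe.  */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Count the (x, b) pairs for which either rewrite
     a rrotate (32-b) -> a lrotate b
     a lrotate (32-b) -> a rrotate b
   would change the result, for b in 1..31 over a few sample values.  */
static unsigned count_rotate_mismatches(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x80000000u, 0xdeadbeefu, 0xffffffffu
    };
    unsigned mismatches = 0;

    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned b = 1; b < 32; b++) {
            if (rotr32(samples[i], 32 - b) != rotl32(samples[i], b))
                mismatches++;
            if (rotl32(samples[i], 32 - b) != rotr32(samples[i], b))
                mismatches++;
        }
    return mismatches;
}
```

Calling `count_rotate_mismatches()` from a small driver should report no mismatches, since rotating right by `32-b` moves every bit to the same position as rotating left by `b`.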
Comments
On 9/18/24 12:45 AM, Eikansh Gupta wrote:
> The pattern `a rrotate (32-b)` should be optimized to `a lrotate b`.
> The same is also true for `a lrotate (32-b)`. It can be optimized to
> `a rrotate b`.
>
> This patch adds following patterns:
> a rrotate (32-b) -> a lrotate b
> a lrotate (32-b) -> a rrotate b
>
> PR tree-optimization/109906
>
> gcc/ChangeLog:
>
> * match.pd (a rrotate (32-b) -> a lrotate b): New pattern
> (a lrotate (32-b) -> a rrotate b): New pattern
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr109906.c: New test.
Note I think the testcase is specific to 32-bit int targets. You
probably need to use a target selector to limit it appropriately.
> /* { dg-require-effective-target int32 } */
Is probably the magic selector you want...
And you need to indicate where it was bootstrapped and regression tested.
I think the match.pd bits are fine. So it's really just a matter of
dotting i's and crossing t's on process and testsuite adjustment.
Jeff
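For illustration, the top of the new test might look as follows with the selector suggested above added. This is only a sketch of the requested adjustment, not a revised patch: the placement of the `dg-require-effective-target` line is an assumption, and the helper is copied from the submitted test to show where the literal 32 ties the test to a 32-bit `unsigned`.

```c
/* PR tree-optimization/109906 */
/* { dg-do compile } */
/* { dg-require-effective-target int32 } */
/* { dg-options "-O1 -fdump-tree-optimized-raw" } */

/* Rotate right written with shifts.  The literal 32 below is why the
   test only makes sense where unsigned int is 32 bits wide.  */
static inline
unsigned rrotate(unsigned x, int t)
{
  if (t >= 32) __builtin_unreachable();
  unsigned tl = x >> (t);
  unsigned th = x << (32-t);
  return tl | th;
}
```

On targets that do not satisfy the selector, the test would be reported as unsupported rather than failing.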
@@ -4759,6 +4759,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
build_int_cst (TREE_TYPE (@1),
element_precision (type)), @1); }))
+/* a rrotate (32-b) -> a lrotate b */
+/* a lrotate (32-b) -> a rrotate b */
+(for rotate (lrotate rrotate)
+     orotate (rrotate lrotate)
+ (simplify
+  (rotate @0 (minus INTEGER_CST@1 @2))
+   (if (TYPE_PRECISION (TREE_TYPE (@0)) == wi::to_wide (@1))
+     (orotate @0 @2))))
+
/* Turn (a OP c1) OP c2 into a OP (c1+c2). */
(for op (lrotate rrotate rshift lshift)
(simplify
new file mode 100644
@@ -0,0 +1,40 @@
+/* PR tree-optimization/109906 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized-raw" } */
+
+/* Implementation of rotate right operation */
+static inline
+unsigned rrotate(unsigned x, int t)
+{
+  if (t >= 32) __builtin_unreachable();
+  unsigned tl = x >> (t);
+  unsigned th = x << (32-t);
+  return tl | th;
+}
+
+/* Here rotate left is achieved by doing rotate right by (32 - x) */
+unsigned rotateleft(unsigned t, int x)
+{
+  return rrotate (t, 32-x);
+}
+
+/* Implementation of rotate left operation */
+static inline
+unsigned lrotate(unsigned x, int t)
+{
+  if (t >= 32) __builtin_unreachable();
+  unsigned tl = x << (t);
+  unsigned th = x >> (32-t);
+  return tl | th;
+}
+
+/* Here rotate right is achieved by doing rotate left by (32 - x) */
+unsigned rotateright(unsigned t, int x)
+{
+  return lrotate (t, 32-x);
+}
+
+/* Shouldn't have instruction for (32 - x). */
+/* { dg-final { scan-tree-dump-not "minus_expr" "optimized" } } */
+/* { dg-final { scan-tree-dump "rrotate_expr" "optimized" } } */
+/* { dg-final { scan-tree-dump "lrotate_expr" "optimized" } } */
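The guard in the match.pd hunk, `TYPE_PRECISION (TREE_TYPE (@0)) == wi::to_wide (@1)`, restricts the rewrite to the case where the constant equals the bit width of the rotated operand. The sketch below illustrates why that matters using a 64-bit operand. It is not part of the patch, and `rotl64`, `rotr64` and `rewrite_holds_for_constant` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical 64-bit rotate helpers, not part of the patch.  */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return (x << n) | (x >> ((64 - n) & 63));
}

static uint64_t rotr64(uint64_t x, unsigned n)
{
    n &= 63;
    return (x >> n) | (x << ((64 - n) & 63));
}

/* Return 1 if rotr64 (x, c - b) == rotl64 (x, b) for every sample x
   and every b in 1..31, and 0 as soon as one pair differs.  */
static int rewrite_holds_for_constant(unsigned c)
{
    static const uint64_t samples[] = {
        UINT64_C(1), UINT64_C(0x8000000000000000),
        UINT64_C(0xdeadbeef00000001)
    };

    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned b = 1; b < 32; b++)
            if (rotr64(samples[i], c - b) != rotl64(samples[i], b))
                return 0;
    return 1;
}
```

With a 64-bit operand, `rewrite_holds_for_constant(64)` holds while `rewrite_holds_for_constant(32)` does not, which is the situation the precision check is there to exclude.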