[V2,rs6000] Tweak modulo define_insns to eliminate register copy
Commit Message
Updated patch with review comments addressed: fixed up testcase and added
another testcase to verify peephole is functional.
Don't force target of modulo into a distinct register.
The define_insns for the modulo operation currently force the target register
to a distinct reg in preparation for a possible future peephole combining
div/mod. But this can lead to cases of a needless copy being inserted. Fixed
with the following patch.
Bootstrapped and regression tested on powerpc64le.
Ok for master?
-Pat
2023-03-21 Pat Haugen <pthaugen@linux.ibm.com>
gcc/
* config/rs6000/rs6000.md (*mod<mode>3, umod<mode>3): Add
non-earlyclobber alternative.
gcc/testsuite/
* gcc.target/powerpc/mod-no_copy.c: New.
* gcc.target/powerpc/mod-peephole.c: New.
Comments
Hi!
On Tue, Mar 21, 2023 at 07:10:04AM -0500, Pat Haugen wrote:
> Updated patch with review comments addressed: fixed up testcase and added
> another testcase to verify peephole is functional.
>
> Don't force target of modulo into a distinct register.
>
> The define_insns for the modulo operation currently force the target register
> to a distinct reg in preparation for a possible future peephole combining
> div/mod. But this can lead to cases of a needless copy being inserted. Fixed
> with the following patch.
> +/* { dg-final { scan-assembler-not {\mmodsd\M} } } */
> +/* { dg-final { scan-assembler-not {\mmodud\M} } } */
You can do
/* { dg-final { scan-assembler-not {\mmod[su]d\M} } } */
if you want?
With or without that, okay for trunk. Thanks!
Segher
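
For readers unfamiliar with the Tcl word-boundary escapes, the suggested pattern {\mmod[su]d\M} matches either modsd or modud as a whole word. The following is only a rough C approximation using POSIX extended regular expressions, which have no \m or \M; the boundaries are emulated with explicit character classes, and the function name has_mod_insn is hypothetical. It is not how DejaGnu evaluates scan-assembler-not.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE contains modsd or modud as a whole word, 0 if not,
   and -1 if the pattern fails to compile.  The leading and trailing
   groups stand in for Tcl's \m and \M (assumption: a word character
   is an alphanumeric or underscore).  */
int has_mod_insn(const char *line)
{
    static const char pattern[] =
        "(^|[^[:alnum:]_])mod[su]d($|[^[:alnum:]_])";
    regex_t re;
    int found;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    found = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return found;
}
```

A single character class [su] covers both the signed and unsigned mnemonics, which is why one directive can replace the two separate ones.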
@@ -3437,9 +3437,9 @@ (define_expand "mod<mode>3"
;; In order to enable using a peephole2 for combining div/mod to eliminate the
;; mod, prefer putting the result of mod into a different register
(define_insn "*mod<mode>3"
- [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r")
- (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
- (match_operand:GPR 2 "gpc_reg_operand" "r")))]
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
+ (mod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+ (match_operand:GPR 2 "gpc_reg_operand" "r,r")))]
"TARGET_MODULO"
"mods<wd> %0,%1,%2"
[(set_attr "type" "div")
@@ -3447,9 +3447,9 @@ (define_insn "*mod<mode>3"
(define_insn "umod<mode>3"
- [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r")
- (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
- (match_operand:GPR 2 "gpc_reg_operand" "r")))]
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=&r,r")
+ (umod:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
+ (match_operand:GPR 2 "gpc_reg_operand" "r,r")))]
"TARGET_MODULO"
"modu<wd> %0,%1,%2"
[(set_attr "type" "div")
b/gcc/testsuite/gcc.target/powerpc/mod-no_copy.c
new file mode 100644
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+/* Verify r3 is used as source and target, no copy inserted. */
+
+long foo (long a, long b)
+{
+ return (a % b);
+}
+
+unsigned long foo2 (unsigned long a, unsigned long b)
+{
+ return (a % b);
+}
+
+/* { dg-final { scan-assembler-not {\mmr\M} } } */
b/gcc/testsuite/gcc.target/powerpc/mod-peephole.c
new file mode 100644
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2" } */
+
+/* Verify peephole fires to combine div/mod using same opnds. */
+
+long foo (long a, long b)
+{
+ long x, y;
+
+ x = a / b;
+ y = a % b;
+ return (x + y);
+}
+
+unsigned long foo2 (unsigned long a, unsigned long b)
+{
+ unsigned long x, y;
+
+ x = a / b;
+ y = a % b;
+ return (x + y);
+}
+
+/* { dg-final { scan-assembler-not {\mmodsd\M} } } */
+/* { dg-final { scan-assembler-not {\mmodud\M} } } */
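
As a rough illustration of what the mod-peephole.c test checks: when the quotient a / b is already available, the remainder can be derived from it as a - (a / b) * b, so no separate modsd or modud is needed. The sketch below is plain C showing that identity, not the peephole2 itself; the names rem_from_quot, urem_from_quot, and check_identity are hypothetical, and b is assumed to be nonzero with no overflow in a / b.

```c
#include <assert.h>

/* Signed remainder recomputed from the quotient.  C division truncates
   toward zero, so this agrees with a % b whenever a / b is defined.  */
long rem_from_quot(long a, long b)
{
    long q = a / b;
    return a - q * b;
}

/* Unsigned variant of the same identity.  */
unsigned long urem_from_quot(unsigned long a, unsigned long b)
{
    unsigned long q = a / b;
    return a - q * b;
}

/* Compare both helpers against the built-in % operator.
   Assumes a >= 0 so the unsigned conversion keeps the same value.  */
void check_identity(long a, long b)
{
    assert(rem_from_quot(a, b) == a % b);
    if (a >= 0 && b > 0)
        assert(urem_from_quot((unsigned long) a, (unsigned long) b)
               == (unsigned long) a % (unsigned long) b);
}
```

Calling check_identity(7, 3) or check_identity(-7, 3) exercises both the positive and negative signed cases.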