[v1,2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
Commit Message
When encountering a prescaled (biased) value as a candidate for
sh[123]add.uw, the combine pass will present this as shifted by the
aggregate amount (prescale + shift-amount) with an appropriately
adjusted mask constant that has fewer than 32 bits set.
E.g., here's the failing expression seen in combine for a prescale of
1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
Trying 7, 8 -> 10:
    7: r78:SI=r81:DI#0<<0x1
      REG_DEAD r81:DI
    8: r79:DI=zero_extend(r78:SI)
      REG_DEAD r78:SI
   10: r80:DI=r79:DI<<0x2+r82:DI
      REG_DEAD r79:DI
      REG_DEAD r82:DI
Failed to match this instruction:
(set (reg:DI 80 [ cD.1491 ])
    (plus:DI (and:DI (ashift:DI (reg:DI 81)
                (const_int 3 [0x3]))
            (const_int 17179869176 [0x3fffffff8]))
        (reg:DI 82)))
To address this, we introduce a splitter handling these cases.
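For illustration only, the equivalence the splitter relies on can be modelled in plain C for the prescale-1, shift-2 case from the dump above. This is a sketch, not part of the patch: the function names are hypothetical, and uint64_t/uint32_t stand in for DImode/SImode registers.

```c
#include <stdint.h>

/* Original three-instruction sequence: slli, zext.w, sh2add.  */
static uint64_t seq_with_zext(uint64_t rs, uint64_t rs2)
{
    uint32_t t = (uint32_t)(rs << 1);  /* insn 7: r78:SI = r81:DI#0 << 1 */
    uint64_t z = (uint64_t)t;          /* insn 8: r79:DI = zero_extend (r78:SI) */
    return (z << 2) + rs2;             /* insn 10: r80:DI = (r79:DI << 2) + r82:DI */
}

/* Expression that combine presents and fails to match.  */
static uint64_t combined_form(uint64_t rs, uint64_t rs2)
{
    return ((rs << 3) & UINT64_C(0x3fffffff8)) + rs2;
}

/* Result of the split: slli by the prescale, then sh2add.uw.  */
static uint64_t split_form(uint64_t rs, uint64_t rs2)
{
    uint64_t t = rs << 1;                                   /* slli   t, rs, 1 */
    return ((t << 2) & (UINT64_C(0xffffffff) << 2)) + rs2;  /* sh2add.uw rd, t, rs2 */
}
```

All three functions compute the same value for any inputs; the low three bits of rs << 3 are always zero, which is why the mask shows up as 0x3fffffff8 rather than 0x3fffffffc.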
gcc/ChangeLog:
* config/riscv/bitmanip.md: Add split to handle opportunities
for slli + sh[123]add.uw
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zba-shadd.c: New test.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
---
gcc/config/riscv/bitmanip.md | 44 ++++++++++++++++++++++
gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++++++
2 files changed, 57 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c
Comments
LGTM, you can commit that without [3/3] if you like :)
On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
> [...]
Thanks, applied to master!
For [3/3], I'll submit a new standalone patch with the requested changes.
On Tue, 7 Jun 2022 at 12:25, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> LGTM, you can commit that without [3/3] if you like :)
../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_insn*, rtx_def**)':
../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer expressions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compare]
  110 |       if ((scale + bias) != UINTVAL (operands[2]))
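The diagnostic above comes from comparing the int sum scale + bias against the unsigned value returned by UINTVAL. As a generic sketch of one way to make such a comparison sign-consistent (this is an assumption for illustration, not necessarily what the fix commit does; the function name is hypothetical and unsigned long stands in for the host's unsigned HOST_WIDE_INT):

```c
#include <stdbool.h>

/* Returns true when scale + bias equals the combined shift amount.
   Writing `(scale + bias) != shift` directly would compare an int
   against an unsigned long and trigger -Wsign-compare.  */
static bool shift_matches(int scale, int bias, unsigned long shift)
{
    /* scale + bias equals ctz (mask) in the splitter, so it is never
       negative and the cast does not change its value.  */
    return (unsigned long)(scale + bias) == shift;
}
```

Declaring the intermediate values with an unsigned type in the first place would be another option.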
Hi Andreas:
Fixed via https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7,
thanks!
On Fri, Jun 17, 2022 at 4:34 PM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> ../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_insn*, rtx_def**)':
> ../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer expressions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compare]
>   110 |       if ((scale + bias) != UINTVAL (operands[2]))
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
Kito, thanks: you were a few minutes ahead of my fix there.
On Fri, 17 Jun 2022 at 16:00, Kito Cheng <kito.cheng@gmail.com> wrote:
> Hi Andreas:
>
> Fixed via https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7,
> thanks!
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 0ab9ffe3c0b..6c1ccc6f8c5 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -79,6 +79,50 @@ (define_insn "*shNadduw"
   [(set_attr "type" "bitmanip")
    (set_attr "mode" "DI")])
 
+;; During combine, we may encounter an attempt to combine
+;;   slli rtmp, rs, #imm
+;;   zext.w rtmp, rtmp
+;;   sh[123]add rd, rtmp, rs2
+;; which will lead to the immediate not satisfying the above constraints.
+;; By splitting the compound expression, we can simplify to a slli and a
+;; sh[123]add.uw.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+                                    (match_operand:QI 2 "immediate_operand"))
+                         (match_operand:DI 3 "consecutive_bits_operand"))
+                 (match_operand:DI 4 "register_operand")))
+   (clobber (match_operand:DI 5 "register_operand"))]
+  "TARGET_64BIT && TARGET_ZBA"
+  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
+   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
+                                                  (match_dup 7))
+                                       (match_dup 8))
+                               (match_dup 4)))]
+{
+  unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
+  /* scale: shift within the sh[123]add.uw */
+  int scale = 32 - clz_hwi (mask);
+  /* bias: pre-scale amount (i.e. the prior shift amount) */
+  int bias = ctz_hwi (mask) - scale;
+
+  /* If the bias + scale don't add up to operand[2], reject. */
+  if ((scale + bias) != UINTVAL (operands[2]))
+    FAIL;
+
+  /* If the shift-amount is out-of-range for sh[123]add.uw, reject. */
+  if ((scale < 1) || (scale > 3))
+    FAIL;
+
+  /* If there's no bias, the '*shNadduw' pattern should have matched. */
+  if (bias == 0)
+    FAIL;
+
+  operands[6] = GEN_INT (bias);
+  operands[7] = GEN_INT (scale);
+  operands[8] = GEN_INT (0xffffffffULL << scale);
+})
+
 (define_insn "*add.uw"
   [(set (match_operand:DI 0 "register_operand" "=r")
         (plus:DI (zero_extend:DI
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shadd.c b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
new file mode 100644
index 00000000000..33da2530f3f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zba -mabi=lp64" } */
+
+unsigned long foo(unsigned int a, unsigned long b)
+{
+  a = a << 1;
+  unsigned long c = (unsigned long) a;
+  unsigned long d = b + (c<<2);
+  return d;
+}
+
+/* { dg-final { scan-assembler "sh2add.uw" } } */
+/* { dg-final { scan-assembler-not "zext" } } */
\ No newline at end of file
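
As a rough standalone model of the preparation statements in the splitter above, the scale/bias recovery can be sketched in C. Assumptions: HOST_WIDE_INT is 64 bits wide, the GCC/Clang builtins __builtin_clzll and __builtin_ctzll stand in for clz_hwi and ctz_hwi, and the struct and function names are hypothetical. The checks mirror the ones in the patch, in the same order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the values the splitter places in
   operands[6], operands[7] and operands[8].  */
struct shadd_split {
    int bias;          /* shift amount for the leading slli */
    int scale;         /* N in shNadd.uw */
    uint64_t uw_mask;  /* 0xffffffff << scale */
};

/* Returns false where the splitter would FAIL.  */
static bool decompose_mask(uint64_t mask, unsigned shift,
                           struct shadd_split *out)
{
    if (mask == 0)
        return false;  /* the builtins are undefined for zero */

    /* scale: how far the top set bit lies above bit 31.  */
    int scale = 32 - __builtin_clzll(mask);
    /* bias: remaining part of the combined shift.  */
    int bias = __builtin_ctzll(mask) - scale;

    /* scale + bias equals ctz (mask), so the cast is value-preserving.  */
    if ((unsigned)(scale + bias) != shift)
        return false;
    if (scale < 1 || scale > 3)
        return false;
    if (bias == 0)
        return false;  /* plain shNadd.uw case, handled elsewhere */

    out->bias = bias;
    out->scale = scale;
    out->uw_mask = UINT64_C(0xffffffff) << scale;
    return true;
}
```

For the expression in the commit message (mask 0x3fffffff8, combined shift 3) this yields bias 1 and scale 2, i.e. slli by 1 followed by sh2add.uw.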