[v1,8/8] RISC-V: bitmanip: relax minmax to operate on GPR

Message ID 20211111141020.2738001-9-philipp.tomsich@vrull.eu
State Deferred, archived
Headers
Series Improvements to bitmanip-1.0 (Zb[abcs]) support |

Commit Message

Philipp Tomsich Nov. 11, 2021, 2:10 p.m. UTC
  While min/minu/max/maxu instructions are provided for XLEN only, these
can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
always sign-extended, which ensures that the XLEN-wide instructions
can be used for signed and unsigned comparisons on SImode yielding a
correct ordering of value.

This commit
 - relaxes the minmax pattern to express for GPR (instead of X only),
   providing both a si3 and di3 expansion on RV64
 - adds a sign-extending form for thee si3 pattern for RV64 to all REE
   to eliminate redundant extensions
 - adds test-cases for both

gcc/ChangeLog:

	* config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or
	  DImode) on RV64.
	* config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
          pattern for REE.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
          operands checking that no redundant sign- or zero-extensions
          are emitted.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---

 gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
 gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
 2 files changed, 28 insertions(+), 6 deletions(-)
  

Comments

Kito Cheng Nov. 11, 2021, 4 p.m. UTC | #1
Hi Philipp:

We can't pretend we have SImode min/max instruction without that semantic.
Give this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0,
but with this patch and compile with rv64gc_zba_zbb -O3, the output
become 8589934592 8589934591 = 8589934592

-------------Testcase---------------
#include <stdio.h>
long long __attribute__((noinline, noipa))
foo6(long long a, long long b)
{
  int xa = a;
  int xb = b;
  return (xa > xb ? xa : xb);
}
int main() {
  long long a = 0x200000000ll;
  long long b = 0x1ffffffffl;
  long long c = foo6(a, b);
  printf ("%lld %lld = %lld\n", a, b, c);
  return 0;
}
--------------------------------------
v64gc_zba_zbb -O3 w/o this patch:
foo6:
        sext.w  a1,a1
        sext.w  a0,a0
        max     a0,a0,a1
        ret

--------------------------------------
v64gc_zba_zbb -O3 w/ this patch:
foo6:
        max     a0,a0,a1
        ret

On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> While min/minu/max/maxu instructions are provided for XLEN only, these
> can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
> always sign-extended, which ensures that the XLEN-wide instructions
> can be used for signed and unsigned comparisons on SImode yielding a
> correct ordering of value.
>
> This commit
>  - relaxes the minmax pattern to express for GPR (instead of X only),
>    providing both a si3 and di3 expansion on RV64
>  - adds a sign-extending form for thee si3 pattern for RV64 to all REE
>    to eliminate redundant extensions
>  - adds test-cases for both
>
> gcc/ChangeLog:
>
>         * config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or
>           DImode) on RV64.
>         * config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
>           pattern for REE.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
>           operands checking that no redundant sign- or zero-extensions
>           are emitted.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
>
>  gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
>  gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
>  2 files changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> index 000deb48b16..2a28f78f5f6 100644
> --- a/gcc/config/riscv/bitmanip.md
> +++ b/gcc/config/riscv/bitmanip.md
> @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
>    [(set_attr "type" "bitmanip")])
>
>  (define_insn "<bitmanip_optab><mode>3"
> -  [(set (match_operand:X 0 "register_operand" "=r")
> -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
> -                          (match_operand:X 2 "register_operand" "r")))]
> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
> +                            (match_operand:GPR 2 "register_operand" "r")))]
>    "TARGET_ZBB"
>    "<bitmanip_insn>\t%0,%1,%2"
>    [(set_attr "type" "bitmanip")])
>
> +(define_insn "<bitmanip_optab>si3_sext"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
> +                            (match_operand:SI 2 "register_operand" "r"))))]
> +  "TARGET_64BIT && TARGET_ZBB"
> +  "<bitmanip_insn>\t%0,%1,%2"
> +  [(set_attr "type" "bitmanip")])
> +
>  ;; orc.b (or-combine) is added as an unspec for the benefit of the support
>  ;; for optimized string functions (such as strcmp).
>  (define_insn "orcb<mode>2"
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> index f44c398ea08..7169e873551 100644
> --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
>
>  long
>  foo1 (long i, long j)
> @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
>    return i > j ? i : j;
>  }
>
> +unsigned int
> +foo5(unsigned int a, unsigned int b)
> +{
> +  return a > b ? a : b;
> +}
> +
> +int
> +foo6(int a, int b)
> +{
> +  return a > b ? a : b;
> +}
> +
>  /* { dg-final { scan-assembler-times "min" 3 } } */
> -/* { dg-final { scan-assembler-times "max" 3 } } */
> +/* { dg-final { scan-assembler-times "max" 4 } } */
>  /* { dg-final { scan-assembler-times "minu" 1 } } */
> -/* { dg-final { scan-assembler-times "maxu" 1 } } */
> +/* { dg-final { scan-assembler-times "maxu" 3 } } */
> +/* { dg-final { scan-assembler-not "zext.w" } } */
> +/* { dg-final { scan-assembler-not "sext.w" } } */
> --
> 2.32.0
>
  
Philipp Tomsich Nov. 11, 2021, 4:18 p.m. UTC | #2
Kito,

Unless I am missing something, the problem is not the relaxation to
GPR, but rather the sign-extending pattern I had squashed into the
same patch.
If you disable "<bitmanip_optab>si3_sext", a sext.w will be have to be
emitted after the 'max' and before the return (or before the SImode
output is consumed as a DImode), pushing the REE opportunity to a
subsequent consumer (e.g. an addw).

This will generate
   foo6:
      max a0,a0,a1
      sext.w a0,a0
      ret
which (assuming that the inputs to max are properly sign-extended
SImode values living in DImode registers) will be the same as
performing the two sext.w before the max.

Having a second set of eyes on this is appreciated — let me know if
you agree and I'll revise, once I have collected feedback on the
remaining patches of the series.

Philipp.


On Thu, 11 Nov 2021 at 17:00, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Hi Philipp:
>
> We can't pretend we have SImode min/max instruction without that semantic.
> Give this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0,
> but with this patch and compile with rv64gc_zba_zbb -O3, the output
> become 8589934592 8589934591 = 8589934592
>
> -------------Testcase---------------
> #include <stdio.h>
> long long __attribute__((noinline, noipa))
> foo6(long long a, long long b)
> {
>   int xa = a;
>   int xb = b;
>   return (xa > xb ? xa : xb);
> }
> int main() {
>   long long a = 0x200000000ll;
>   long long b = 0x1ffffffffl;
>   long long c = foo6(a, b);
>   printf ("%lld %lld = %lld\n", a, b, c);
>   return 0;
> }
> --------------------------------------
> v64gc_zba_zbb -O3 w/o this patch:
> foo6:
>         sext.w  a1,a1
>         sext.w  a0,a0
>         max     a0,a0,a1
>         ret
>
> --------------------------------------
> v64gc_zba_zbb -O3 w/ this patch:
> foo6:
>         max     a0,a0,a1
>         ret
>
> On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
> >
> > While min/minu/max/maxu instructions are provided for XLEN only, these
> > can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
> > always sign-extended, which ensures that the XLEN-wide instructions
> > can be used for signed and unsigned comparisons on SImode yielding a
> > correct ordering of value.
> >
> > This commit
> >  - relaxes the minmax pattern to express for GPR (instead of X only),
> >    providing both a si3 and di3 expansion on RV64
> >  - adds a sign-extending form for thee si3 pattern for RV64 to all REE
> >    to eliminate redundant extensions
> >  - adds test-cases for both
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or
> >           DImode) on RV64.
> >         * config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
> >           pattern for REE.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
> >           operands checking that no redundant sign- or zero-extensions
> >           are emitted.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> >
> >  gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
> >  gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
> >  2 files changed, 28 insertions(+), 6 deletions(-)
> >
> > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> > index 000deb48b16..2a28f78f5f6 100644
> > --- a/gcc/config/riscv/bitmanip.md
> > +++ b/gcc/config/riscv/bitmanip.md
> > @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
> >    [(set_attr "type" "bitmanip")])
> >
> >  (define_insn "<bitmanip_optab><mode>3"
> > -  [(set (match_operand:X 0 "register_operand" "=r")
> > -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
> > -                          (match_operand:X 2 "register_operand" "r")))]
> > +  [(set (match_operand:GPR 0 "register_operand" "=r")
> > +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
> > +                            (match_operand:GPR 2 "register_operand" "r")))]
> >    "TARGET_ZBB"
> >    "<bitmanip_insn>\t%0,%1,%2"
> >    [(set_attr "type" "bitmanip")])
> >
> > +(define_insn "<bitmanip_optab>si3_sext"
> > +  [(set (match_operand:DI 0 "register_operand" "=r")
> > +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
> > +                            (match_operand:SI 2 "register_operand" "r"))))]
> > +  "TARGET_64BIT && TARGET_ZBB"
> > +  "<bitmanip_insn>\t%0,%1,%2"
> > +  [(set_attr "type" "bitmanip")])
> > +
> >  ;; orc.b (or-combine) is added as an unspec for the benefit of the support
> >  ;; for optimized string functions (such as strcmp).
> >  (define_insn "orcb<mode>2"
> > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> > index f44c398ea08..7169e873551 100644
> > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
> >
> >  long
> >  foo1 (long i, long j)
> > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
> >    return i > j ? i : j;
> >  }
> >
> > +unsigned int
> > +foo5(unsigned int a, unsigned int b)
> > +{
> > +  return a > b ? a : b;
> > +}
> > +
> > +int
> > +foo6(int a, int b)
> > +{
> > +  return a > b ? a : b;
> > +}
> > +
> >  /* { dg-final { scan-assembler-times "min" 3 } } */
> > -/* { dg-final { scan-assembler-times "max" 3 } } */
> > +/* { dg-final { scan-assembler-times "max" 4 } } */
> >  /* { dg-final { scan-assembler-times "minu" 1 } } */
> > -/* { dg-final { scan-assembler-times "maxu" 1 } } */
> > +/* { dg-final { scan-assembler-times "maxu" 3 } } */
> > +/* { dg-final { scan-assembler-not "zext.w" } } */
> > +/* { dg-final { scan-assembler-not "sext.w" } } */
> > --
> > 2.32.0
> >
  
Kito Cheng Nov. 11, 2021, 4:27 p.m. UTC | #3
IIRC it's not work even without sign extend pattern since I did similar
experimental before (not for RISC-V, but same concept), I guess I need more
time to test that.

Philipp Tomsich <philipp.tomsich@vrull.eu> 於 2021年11月12日 週五 00:18 寫道:

> Kito,
>
> Unless I am missing something, the problem is not the relaxation to
> GPR, but rather the sign-extending pattern I had squashed into the
> same patch.
> If you disable "<bitmanip_optab>si3_sext", a sext.w will be have to be
> emitted after the 'max' and before the return (or before the SImode
> output is consumed as a DImode), pushing the REE opportunity to a
> subsequent consumer (e.g. an addw).
>
> This will generate
>    foo6:
>       max a0,a0,a1
>       sext.w a0,a0
>       ret
> which (assuming that the inputs to max are properly sign-extended
> SImode values living in DImode registers) will be the same as
> performing the two sext.w before the max.
>
> Having a second set of eyes on this is appreciated — let me know if
> you agree and I'll revise, once I have collected feedback on the
> remaining patches of the series.
>
> Philipp.
>
>
> On Thu, 11 Nov 2021 at 17:00, Kito Cheng <kito.cheng@gmail.com> wrote:
> >
> > Hi Philipp:
> >
> > We can't pretend we have SImode min/max instruction without that
> semantic.
> > Give this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0,
> > but with this patch and compile with rv64gc_zba_zbb -O3, the output
> > become 8589934592 8589934591 = 8589934592
> >
> > -------------Testcase---------------
> > #include <stdio.h>
> > long long __attribute__((noinline, noipa))
> > foo6(long long a, long long b)
> > {
> >   int xa = a;
> >   int xb = b;
> >   return (xa > xb ? xa : xb);
> > }
> > int main() {
> >   long long a = 0x200000000ll;
> >   long long b = 0x1ffffffffl;
> >   long long c = foo6(a, b);
> >   printf ("%lld %lld = %lld\n", a, b, c);
> >   return 0;
> > }
> > --------------------------------------
> > v64gc_zba_zbb -O3 w/o this patch:
> > foo6:
> >         sext.w  a1,a1
> >         sext.w  a0,a0
> >         max     a0,a0,a1
> >         ret
> >
> > --------------------------------------
> > v64gc_zba_zbb -O3 w/ this patch:
> > foo6:
> >         max     a0,a0,a1
> >         ret
> >
> > On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich
> > <philipp.tomsich@vrull.eu> wrote:
> > >
> > > While min/minu/max/maxu instructions are provided for XLEN only, these
> > > can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
> > > always sign-extended, which ensures that the XLEN-wide instructions
> > > can be used for signed and unsigned comparisons on SImode yielding a
> > > correct ordering of value.
> > >
> > > This commit
> > >  - relaxes the minmax pattern to express for GPR (instead of X only),
> > >    providing both a si3 and di3 expansion on RV64
> > >  - adds a sign-extending form for thee si3 pattern for RV64 to all REE
> > >    to eliminate redundant extensions
> > >  - adds test-cases for both
> > >
> > > gcc/ChangeLog:
> > >
> > >         * config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or
> > >           DImode) on RV64.
> > >         * config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
> > >           pattern for REE.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
> > >           operands checking that no redundant sign- or zero-extensions
> > >           are emitted.
> > >
> > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > ---
> > >
> > >  gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
> > >  gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
> > >  2 files changed, 28 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/gcc/config/riscv/bitmanip.md
> b/gcc/config/riscv/bitmanip.md
> > > index 000deb48b16..2a28f78f5f6 100644
> > > --- a/gcc/config/riscv/bitmanip.md
> > > +++ b/gcc/config/riscv/bitmanip.md
> > > @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
> > >    [(set_attr "type" "bitmanip")])
> > >
> > >  (define_insn "<bitmanip_optab><mode>3"
> > > -  [(set (match_operand:X 0 "register_operand" "=r")
> > > -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
> > > -                          (match_operand:X 2 "register_operand"
> "r")))]
> > > +  [(set (match_operand:GPR 0 "register_operand" "=r")
> > > +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand"
> "r")
> > > +                            (match_operand:GPR 2 "register_operand"
> "r")))]
> > >    "TARGET_ZBB"
> > >    "<bitmanip_insn>\t%0,%1,%2"
> > >    [(set_attr "type" "bitmanip")])
> > >
> > > +(define_insn "<bitmanip_optab>si3_sext"
> > > +  [(set (match_operand:DI 0 "register_operand" "=r")
> > > +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1
> "register_operand" "r")
> > > +                            (match_operand:SI 2 "register_operand"
> "r"))))]
> > > +  "TARGET_64BIT && TARGET_ZBB"
> > > +  "<bitmanip_insn>\t%0,%1,%2"
> > > +  [(set_attr "type" "bitmanip")])
> > > +
> > >  ;; orc.b (or-combine) is added as an unspec for the benefit of the
> support
> > >  ;; for optimized string functions (such as strcmp).
> > >  (define_insn "orcb<mode>2"
> > > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> > > index f44c398ea08..7169e873551 100644
> > > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> > > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> > > @@ -1,5 +1,5 @@
> > >  /* { dg-do compile } */
> > > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> > > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
> > >
> > >  long
> > >  foo1 (long i, long j)
> > > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
> > >    return i > j ? i : j;
> > >  }
> > >
> > > +unsigned int
> > > +foo5(unsigned int a, unsigned int b)
> > > +{
> > > +  return a > b ? a : b;
> > > +}
> > > +
> > > +int
> > > +foo6(int a, int b)
> > > +{
> > > +  return a > b ? a : b;
> > > +}
> > > +
> > >  /* { dg-final { scan-assembler-times "min" 3 } } */
> > > -/* { dg-final { scan-assembler-times "max" 3 } } */
> > > +/* { dg-final { scan-assembler-times "max" 4 } } */
> > >  /* { dg-final { scan-assembler-times "minu" 1 } } */
> > > -/* { dg-final { scan-assembler-times "maxu" 1 } } */
> > > +/* { dg-final { scan-assembler-times "maxu" 3 } } */
> > > +/* { dg-final { scan-assembler-not "zext.w" } } */
> > > +/* { dg-final { scan-assembler-not "sext.w" } } */
> > > --
> > > 2.32.0
> > >
>
  
Kito Cheng Nov. 11, 2021, 4:42 p.m. UTC | #4
Hi Philipp:

This testcase got wrong result with this patch even w/o
<bitmanip_optab>si3_sext pattern:

#include <stdio.h>

#define MAX(A, B) ((A) > (B) ? (A) : (B))

long long __attribute__((noinline, noipa))
foo6(long long a, long long b, int c)
{
  int xa = a;
  int xb = b;
  return MAX(MAX(xa, xb), c);
}
int main() {
  long long a = 0x200000000ll;
  long long b = 0x1ffffffffl;
  int c = 10;
  long long d = foo6(a, b, c);
  printf ("%lld %lld %d = %lld\n", a, b, c, d);
  return 0;
}

On Fri, Nov 12, 2021 at 12:27 AM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> IIRC it's not work even without sign extend pattern since I did similar experimental before (not for RISC-V, but same concept), I guess I need more time to test that.
>
> Philipp Tomsich <philipp.tomsich@vrull.eu> 於 2021年11月12日 週五 00:18 寫道:
>>
>> Kito,
>>
>> Unless I am missing something, the problem is not the relaxation to
>> GPR, but rather the sign-extending pattern I had squashed into the
>> same patch.
>> If you disable "<bitmanip_optab>si3_sext", a sext.w will be have to be
>> emitted after the 'max' and before the return (or before the SImode
>> output is consumed as a DImode), pushing the REE opportunity to a
>> subsequent consumer (e.g. an addw).
>>
>> This will generate
>>    foo6:
>>       max a0,a0,a1
>>       sext.w a0,a0
>>       ret
>> which (assuming that the inputs to max are properly sign-extended
>> SImode values living in DImode registers) will be the same as
>> performing the two sext.w before the max.
>>
>> Having a second set of eyes on this is appreciated — let me know if
>> you agree and I'll revise, once I have collected feedback on the
>> remaining patches of the series.
>>
>> Philipp.
>>
>>
>> On Thu, 11 Nov 2021 at 17:00, Kito Cheng <kito.cheng@gmail.com> wrote:
>> >
>> > Hi Philipp:
>> >
>> > We can't pretend we have SImode min/max instruction without that semantic.
>> > Give this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0,
>> > but with this patch and compile with rv64gc_zba_zbb -O3, the output
>> > become 8589934592 8589934591 = 8589934592
>> >
>> > -------------Testcase---------------
>> > #include <stdio.h>
>> > long long __attribute__((noinline, noipa))
>> > foo6(long long a, long long b)
>> > {
>> >   int xa = a;
>> >   int xb = b;
>> >   return (xa > xb ? xa : xb);
>> > }
>> > int main() {
>> >   long long a = 0x200000000ll;
>> >   long long b = 0x1ffffffffl;
>> >   long long c = foo6(a, b);
>> >   printf ("%lld %lld = %lld\n", a, b, c);
>> >   return 0;
>> > }
>> > --------------------------------------
>> > v64gc_zba_zbb -O3 w/o this patch:
>> > foo6:
>> >         sext.w  a1,a1
>> >         sext.w  a0,a0
>> >         max     a0,a0,a1
>> >         ret
>> >
>> > --------------------------------------
>> > v64gc_zba_zbb -O3 w/ this patch:
>> > foo6:
>> >         max     a0,a0,a1
>> >         ret
>> >
>> > On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich
>> > <philipp.tomsich@vrull.eu> wrote:
>> > >
>> > > While min/minu/max/maxu instructions are provided for XLEN only, these
>> > > can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
>> > > always sign-extended, which ensures that the XLEN-wide instructions
>> > > can be used for signed and unsigned comparisons on SImode yielding a
>> > > correct ordering of value.
>> > >
>> > > This commit
>> > >  - relaxes the minmax pattern to express for GPR (instead of X only),
>> > >    providing both a si3 and di3 expansion on RV64
>> > >  - adds a sign-extending form for thee si3 pattern for RV64 to all REE
>> > >    to eliminate redundant extensions
>> > >  - adds test-cases for both
>> > >
>> > > gcc/ChangeLog:
>> > >
>> > >         * config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or
>> > >           DImode) on RV64.
>> > >         * config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
>> > >           pattern for REE.
>> > >
>> > > gcc/testsuite/ChangeLog:
>> > >
>> > >         * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
>> > >           operands checking that no redundant sign- or zero-extensions
>> > >           are emitted.
>> > >
>> > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
>> > > ---
>> > >
>> > >  gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
>> > >  gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
>> > >  2 files changed, 28 insertions(+), 6 deletions(-)
>> > >
>> > > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
>> > > index 000deb48b16..2a28f78f5f6 100644
>> > > --- a/gcc/config/riscv/bitmanip.md
>> > > +++ b/gcc/config/riscv/bitmanip.md
>> > > @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
>> > >    [(set_attr "type" "bitmanip")])
>> > >
>> > >  (define_insn "<bitmanip_optab><mode>3"
>> > > -  [(set (match_operand:X 0 "register_operand" "=r")
>> > > -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
>> > > -                          (match_operand:X 2 "register_operand" "r")))]
>> > > +  [(set (match_operand:GPR 0 "register_operand" "=r")
>> > > +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
>> > > +                            (match_operand:GPR 2 "register_operand" "r")))]
>> > >    "TARGET_ZBB"
>> > >    "<bitmanip_insn>\t%0,%1,%2"
>> > >    [(set_attr "type" "bitmanip")])
>> > >
>> > > +(define_insn "<bitmanip_optab>si3_sext"
>> > > +  [(set (match_operand:DI 0 "register_operand" "=r")
>> > > +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
>> > > +                            (match_operand:SI 2 "register_operand" "r"))))]
>> > > +  "TARGET_64BIT && TARGET_ZBB"
>> > > +  "<bitmanip_insn>\t%0,%1,%2"
>> > > +  [(set_attr "type" "bitmanip")])
>> > > +
>> > >  ;; orc.b (or-combine) is added as an unspec for the benefit of the support
>> > >  ;; for optimized string functions (such as strcmp).
>> > >  (define_insn "orcb<mode>2"
>> > > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
>> > > index f44c398ea08..7169e873551 100644
>> > > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
>> > > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
>> > > @@ -1,5 +1,5 @@
>> > >  /* { dg-do compile } */
>> > > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
>> > > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
>> > >
>> > >  long
>> > >  foo1 (long i, long j)
>> > > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
>> > >    return i > j ? i : j;
>> > >  }
>> > >
>> > > +unsigned int
>> > > +foo5(unsigned int a, unsigned int b)
>> > > +{
>> > > +  return a > b ? a : b;
>> > > +}
>> > > +
>> > > +int
>> > > +foo6(int a, int b)
>> > > +{
>> > > +  return a > b ? a : b;
>> > > +}
>> > > +
>> > >  /* { dg-final { scan-assembler-times "min" 3 } } */
>> > > -/* { dg-final { scan-assembler-times "max" 3 } } */
>> > > +/* { dg-final { scan-assembler-times "max" 4 } } */
>> > >  /* { dg-final { scan-assembler-times "minu" 1 } } */
>> > > -/* { dg-final { scan-assembler-times "maxu" 1 } } */
>> > > +/* { dg-final { scan-assembler-times "maxu" 3 } } */
>> > > +/* { dg-final { scan-assembler-not "zext.w" } } */
>> > > +/* { dg-final { scan-assembler-not "sext.w" } } */
>> > > --
>> > > 2.32.0
>> > >
  
Philipp Tomsich Nov. 11, 2021, 6:33 p.m. UTC | #5
Kito,

Thanks for the reality-check: the subreg-expressions are getting in the way.
I'll drop this from v2, as a permanent resolution for this will be a
bit more involved.

Philipp.

On Thu, 11 Nov 2021 at 17:42, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Hi Philipp:
>
> This testcase got wrong result with this patch even w/o
> <bitmanip_optab>si3_sext pattern:
>
> #include <stdio.h>
>
> #define MAX(A, B) ((A) > (B) ? (A) : (B))
>
> long long __attribute__((noinline, noipa))
> foo6(long long a, long long b, int c)
> {
>   int xa = a;
>   int xb = b;
>   return MAX(MAX(xa, xb), c);
> }
> int main() {
>   long long a = 0x200000000ll;
>   long long b = 0x1ffffffffl;
>   int c = 10;
>   long long d = foo6(a, b, c);
>   printf ("%lld %lld %d = %lld\n", a, b, c, d);
>   return 0;
> }
>
> On Fri, Nov 12, 2021 at 12:27 AM Kito Cheng <kito.cheng@gmail.com> wrote:
> >
> > IIRC it's not work even without sign extend pattern since I did similar experimental before (not for RISC-V, but same concept), I guess I need more time to test that.
> >
> > Philipp Tomsich <philipp.tomsich@vrull.eu> 於 2021年11月12日 週五 00:18 寫道:
> >>
> >> Kito,
> >>
> >> Unless I am missing something, the problem is not the relaxation to
> >> GPR, but rather the sign-extending pattern I had squashed into the
> >> same patch.
> >> If you disable "<bitmanip_optab>si3_sext", a sext.w will be have to be
> >> emitted after the 'max' and before the return (or before the SImode
> >> output is consumed as a DImode), pushing the REE opportunity to a
> >> subsequent consumer (e.g. an addw).
> >>
> >> This will generate
> >>    foo6:
> >>       max a0,a0,a1
> >>       sext.w a0,a0
> >>       ret
> >> which (assuming that the inputs to max are properly sign-extended
> >> SImode values living in DImode registers) will be the same as
> >> performing the two sext.w before the max.
> >>
> >> Having a second set of eyes on this is appreciated — let me know if
> >> you agree and I'll revise, once I have collected feedback on the
> >> remaining patches of the series.
> >>
> >> Philipp.
> >>
> >>
> >> On Thu, 11 Nov 2021 at 17:00, Kito Cheng <kito.cheng@gmail.com> wrote:
> >> >
> >> > Hi Philipp:
> >> >
> >> > We can't pretend we have SImode min/max instruction without that semantic.
> >> > Give this testcase, x86 and rv64gc print out 8589934592 8589934591 = 0,
> >> > but with this patch and compile with rv64gc_zba_zbb -O3, the output
> >> > become 8589934592 8589934591 = 8589934592
> >> >
> >> > -------------Testcase---------------
> >> > #include <stdio.h>
> >> > long long __attribute__((noinline, noipa))
> >> > foo6(long long a, long long b)
> >> > {
> >> >   int xa = a;
> >> >   int xb = b;
> >> >   return (xa > xb ? xa : xb);
> >> > }
> >> > int main() {
> >> >   long long a = 0x200000000ll;
> >> >   long long b = 0x1ffffffffl;
> >> >   long long c = foo6(a, b);
> >> >   printf ("%lld %lld = %lld\n", a, b, c);
> >> >   return 0;
> >> > }
> >> > --------------------------------------
> >> > v64gc_zba_zbb -O3 w/o this patch:
> >> > foo6:
> >> >         sext.w  a1,a1
> >> >         sext.w  a0,a0
> >> >         max     a0,a0,a1
> >> >         ret
> >> >
> >> > --------------------------------------
> >> > v64gc_zba_zbb -O3 w/ this patch:
> >> > foo6:
> >> >         max     a0,a0,a1
> >> >         ret
> >> >
> >> > On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich
> >> > <philipp.tomsich@vrull.eu> wrote:
> >> > >
> >> > > While min/minu/max/maxu instructions are provided for XLEN only, these
> >> > > can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
> >> > > always sign-extended, which ensures that the XLEN-wide instructions
> >> > > can be used for signed and unsigned comparisons on SImode yielding a
> >> > > correct ordering of value.
> >> > >
> >> > > This commit
> >> > >  - relaxes the minmax pattern to express for GPR (instead of X only),
> >> > >    providing both a si3 and di3 expansion on RV64
> >> > >  - adds a sign-extending form for thee si3 pattern for RV64 to all REE
> >> > >    to eliminate redundant extensions
> >> > >  - adds test-cases for both
> >> > >
> >> > > gcc/ChangeLog:
> >> > >
> >> > >         * config/riscv/bitmanip.md: Relax minmax to GPR (i.e SImode or
> >> > >           DImode) on RV64.
> >> > >         * config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
> >> > >           pattern for REE.
> >> > >
> >> > > gcc/testsuite/ChangeLog:
> >> > >
> >> > >         * gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
> >> > >           operands checking that no redundant sign- or zero-extensions
> >> > >           are emitted.
> >> > >
> >> > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> >> > > ---
> >> > >
> >> > >  gcc/config/riscv/bitmanip.md                 | 14 +++++++++++---
> >> > >  gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
> >> > >  2 files changed, 28 insertions(+), 6 deletions(-)
> >> > >
> >> > > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> >> > > index 000deb48b16..2a28f78f5f6 100644
> >> > > --- a/gcc/config/riscv/bitmanip.md
> >> > > +++ b/gcc/config/riscv/bitmanip.md
> >> > > @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
> >> > >    [(set_attr "type" "bitmanip")])
> >> > >
> >> > >  (define_insn "<bitmanip_optab><mode>3"
> >> > > -  [(set (match_operand:X 0 "register_operand" "=r")
> >> > > -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
> >> > > -                          (match_operand:X 2 "register_operand" "r")))]
> >> > > +  [(set (match_operand:GPR 0 "register_operand" "=r")
> >> > > +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
> >> > > +                            (match_operand:GPR 2 "register_operand" "r")))]
> >> > >    "TARGET_ZBB"
> >> > >    "<bitmanip_insn>\t%0,%1,%2"
> >> > >    [(set_attr "type" "bitmanip")])
> >> > >
> >> > > +(define_insn "<bitmanip_optab>si3_sext"
> >> > > +  [(set (match_operand:DI 0 "register_operand" "=r")
> >> > > +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
> >> > > +                            (match_operand:SI 2 "register_operand" "r"))))]
> >> > > +  "TARGET_64BIT && TARGET_ZBB"
> >> > > +  "<bitmanip_insn>\t%0,%1,%2"
> >> > > +  [(set_attr "type" "bitmanip")])
> >> > > +
> >> > >  ;; orc.b (or-combine) is added as an unspec for the benefit of the support
> >> > >  ;; for optimized string functions (such as strcmp).
> >> > >  (define_insn "orcb<mode>2"
> >> > > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> >> > > index f44c398ea08..7169e873551 100644
> >> > > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> >> > > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> >> > > @@ -1,5 +1,5 @@
> >> > >  /* { dg-do compile } */
> >> > > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> >> > > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
> >> > >
> >> > >  long
> >> > >  foo1 (long i, long j)
> >> > > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
> >> > >    return i > j ? i : j;
> >> > >  }
> >> > >
> >> > > +unsigned int
> >> > > +foo5(unsigned int a, unsigned int b)
> >> > > +{
> >> > > +  return a > b ? a : b;
> >> > > +}
> >> > > +
> >> > > +int
> >> > > +foo6(int a, int b)
> >> > > +{
> >> > > +  return a > b ? a : b;
> >> > > +}
> >> > > +
> >> > >  /* { dg-final { scan-assembler-times "min" 3 } } */
> >> > > -/* { dg-final { scan-assembler-times "max" 3 } } */
> >> > > +/* { dg-final { scan-assembler-times "max" 4 } } */
> >> > >  /* { dg-final { scan-assembler-times "minu" 1 } } */
> >> > > -/* { dg-final { scan-assembler-times "maxu" 1 } } */
> >> > > +/* { dg-final { scan-assembler-times "maxu" 3 } } */
> >> > > +/* { dg-final { scan-assembler-not "zext.w" } } */
> >> > > +/* { dg-final { scan-assembler-not "sext.w" } } */
> >> > > --
> >> > > 2.32.0
> >> > >
  

Patch

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 000deb48b16..2a28f78f5f6 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -260,13 +260,21 @@  (define_insn "bswap<mode>2"
   [(set_attr "type" "bitmanip")])
 
 (define_insn "<bitmanip_optab><mode>3"
-  [(set (match_operand:X 0 "register_operand" "=r")
-        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
-			   (match_operand:X 2 "register_operand" "r")))]
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
+			     (match_operand:GPR 2 "register_operand" "r")))]
   "TARGET_ZBB"
   "<bitmanip_insn>\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
+(define_insn "<bitmanip_optab>si3_sext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
+			     (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT && TARGET_ZBB"
+  "<bitmanip_insn>\t%0,%1,%2"
+  [(set_attr "type" "bitmanip")])
+
 ;; orc.b (or-combine) is added as an unspec for the benefit of the support
 ;; for optimized string functions (such as strcmp).
 (define_insn "orcb<mode>2"
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
index f44c398ea08..7169e873551 100644
--- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
+++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
+/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
 
 long
 foo1 (long i, long j)
@@ -25,7 +25,21 @@  foo4 (unsigned long i, unsigned long j)
   return i > j ? i : j;
 }
 
+unsigned int
+foo5(unsigned int a, unsigned int b)
+{
+  return a > b ? a : b;
+}
+
+int
+foo6(int a, int b)
+{
+  return a > b ? a : b;
+}
+
 /* { dg-final { scan-assembler-times "min" 3 } } */
-/* { dg-final { scan-assembler-times "max" 3 } } */
+/* { dg-final { scan-assembler-times "max" 4 } } */
 /* { dg-final { scan-assembler-times "minu" 1 } } */
-/* { dg-final { scan-assembler-times "maxu" 1 } } */
+/* { dg-final { scan-assembler-times "maxu" 3 } } */
+/* { dg-final { scan-assembler-not "zext.w" } } */
+/* { dg-final { scan-assembler-not "sext.w" } } */