Message ID | 20211111141020.2738001-9-philipp.tomsich@vrull.eu |
---|---|
State | Deferred, archived |
Headers |
From: Philipp Tomsich <philipp.tomsich@vrull.eu>
To: gcc-patches@gcc.gnu.org
Cc: wilson@tuliptree.org, kito.cheng@gmail.com, Philipp Tomsich <philipp.tomsich@vrull.eu>
Subject: [PATCH v1 8/8] RISC-V: bitmanip: relax minmax to operate on GPR
Date: Thu, 11 Nov 2021 15:10:20 +0100
Message-Id: <20211111141020.2738001-9-philipp.tomsich@vrull.eu>
In-Reply-To: <20211111141020.2738001-1-philipp.tomsich@vrull.eu>
References: <20211111141020.2738001-1-philipp.tomsich@vrull.eu> |
Series |
Improvements to bitmanip-1.0 (Zb[abcs]) support
|
|
Commit Message
Philipp Tomsich
Nov. 11, 2021, 2:10 p.m. UTC
While min/minu/max/maxu instructions are provided for XLEN only, these
can safely operate on GPRs (i.e. SImode or DImode for RV64): SImode is
always sign-extended, which ensures that the XLEN-wide instructions
can be used for signed and unsigned comparisons on SImode yielding a
correct ordering of values.
This commit
- relaxes the minmax pattern to be expressed for GPR (instead of X only),
providing both a si3 and di3 expansion on RV64
- adds a sign-extending form of the si3 pattern for RV64 to allow REE
to eliminate redundant extensions
- adds test-cases for both
gcc/ChangeLog:
* config/riscv/bitmanip.md: Relax minmax to GPR (i.e. SImode or
DImode) on RV64.
* config/riscv/bitmanip.md (<bitmanip_optab>si3_sext): Add
pattern for REE.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-min-max.c: Add testcases for SImode
operands checking that no redundant sign- or zero-extensions
are emitted.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---
gcc/config/riscv/bitmanip.md | 14 +++++++++++---
gcc/testsuite/gcc.target/riscv/zbb-min-max.c | 20 +++++++++++++++++---
2 files changed, 28 insertions(+), 6 deletions(-)
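The commit message's claim about sign-extended SImode values can be checked with a small C model. This is only a sketch under stated assumptions: registers are modelled as 64-bit integers, and the helper names (max_xlen, maxu_xlen, sext_w, claim_holds_for) are hypothetical, not GCC or RISC-V identifiers. It suggests the claim holds when both inputs really are sign-extended 32-bit values, and says nothing about inputs whose upper 32 bits hold other data.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the RV64 Zbb max/maxu instructions: full 64-bit compares. */
static int64_t max_xlen(int64_t a, int64_t b) { return a > b ? a : b; }
static uint64_t maxu_xlen(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Model of sext.w: sign-extend the low 32 bits of a register. */
static int64_t sext_w(uint64_t reg)
{
    uint64_t low = reg & 0xffffffffu;
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000;
}

/* Returns 1 if the 64-bit max/maxu of two sign-extended 32-bit values
   equals the sign-extended 32-bit signed/unsigned maximum. */
static int claim_holds_for(uint32_t a, uint32_t b)
{
    int64_t ra = sext_w(a), rb = sext_w(b);   /* register contents */
    int32_t sa = (int32_t)ra, sb = (int32_t)rb;
    int32_t smax = sa > sb ? sa : sb;         /* 32-bit signed max */
    uint32_t umax = a > b ? a : b;            /* 32-bit unsigned max */

    return max_xlen(ra, rb) == (int64_t)smax
        && maxu_xlen((uint64_t)ra, (uint64_t)rb) == (uint64_t)sext_w(umax);
}

/* A few boundary cases for the model above. */
static void check_claim(void)
{
    assert(claim_holds_for(0u, 0u));
    assert(claim_holds_for(0xffffffffu, 1u));
    assert(claim_holds_for(0x80000000u, 0x7fffffffu));
}
```

Calling check_claim() exercises the boundary cases; the precondition that both registers are already sign-extended is the part the review below calls into question.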
Comments
Hi Philipp:

We can't pretend we have an SImode min/max instruction without that semantic.
Given this testcase, x86 and rv64gc print out
  8589934592 8589934591 = 0
but with this patch, compiled with rv64gc_zba_zbb -O3, the output becomes
  8589934592 8589934591 = 8589934592

-------------Testcase---------------
#include <stdio.h>
long long __attribute__((noinline, noipa))
foo6(long long a, long long b)
{
  int xa = a;
  int xb = b;
  return (xa > xb ? xa : xb);
}
int main() {
  long long a = 0x200000000ll;
  long long b = 0x1ffffffffl;
  long long c = foo6(a, b);
  printf ("%lld %lld = %lld\n", a, b, c);
  return 0;
}
--------------------------------------
rv64gc_zba_zbb -O3 w/o this patch:
foo6:
        sext.w  a1,a1
        sext.w  a0,a0
        max     a0,a0,a1
        ret

--------------------------------------
rv64gc_zba_zbb -O3 w/ this patch:
foo6:
        max     a0,a0,a1
        ret

On Thu, Nov 11, 2021 at 10:10 PM Philipp Tomsich <philipp.tomsich@vrull.eu> wrote:
> diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> index 000deb48b16..2a28f78f5f6 100644
> --- a/gcc/config/riscv/bitmanip.md
> +++ b/gcc/config/riscv/bitmanip.md
> @@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
>    [(set_attr "type" "bitmanip")])
>
>  (define_insn "<bitmanip_optab><mode>3"
> -  [(set (match_operand:X 0 "register_operand" "=r")
> -        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
> -                           (match_operand:X 2 "register_operand" "r")))]
> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
> +                             (match_operand:GPR 2 "register_operand" "r")))]
>    "TARGET_ZBB"
>    "<bitmanip_insn>\t%0,%1,%2"
>    [(set_attr "type" "bitmanip")])
>
> +(define_insn "<bitmanip_optab>si3_sext"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
> +                                            (match_operand:SI 2 "register_operand" "r"))))]
> +  "TARGET_64BIT && TARGET_ZBB"
> +  "<bitmanip_insn>\t%0,%1,%2"
> +  [(set_attr "type" "bitmanip")])
> +
>  ;; orc.b (or-combine) is added as an unspec for the benefit of the support
>  ;; for optimized string functions (such as strcmp).
>  (define_insn "orcb<mode>2"
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> index f44c398ea08..7169e873551 100644
> --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
> +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */
>
>  long
>  foo1 (long i, long j)
> @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
>    return i > j ? i : j;
>  }
>
> +unsigned int
> +foo5(unsigned int a, unsigned int b)
> +{
> +  return a > b ? a : b;
> +}
> +
> +int
> +foo6(int a, int b)
> +{
> +  return a > b ? a : b;
> +}
> +
>  /* { dg-final { scan-assembler-times "min" 3 } } */
> -/* { dg-final { scan-assembler-times "max" 3 } } */
> +/* { dg-final { scan-assembler-times "max" 4 } } */
>  /* { dg-final { scan-assembler-times "minu" 1 } } */
> -/* { dg-final { scan-assembler-times "maxu" 1 } } */
> +/* { dg-final { scan-assembler-times "maxu" 3 } } */
> +/* { dg-final { scan-assembler-not "zext.w" } } */
> +/* { dg-final { scan-assembler-not "sext.w" } } */
> --
> 2.32.0
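The two instruction sequences reported above can be replayed in plain C to see where the values diverge. This is a rough model, not compiler output: registers are treated as int64_t, and the names max_xlen, sext_w, foo6_without_patch and foo6_with_patch are hypothetical stand-ins for the listed assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the RV64 Zbb max instruction: a signed 64-bit compare. */
static int64_t max_xlen(int64_t a, int64_t b) { return a > b ? a : b; }

/* Model of sext.w: sign-extend the low 32 bits of a register. */
static int64_t sext_w(int64_t reg)
{
    uint64_t low = (uint64_t)reg & 0xffffffffu;
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000;
}

/* Sequence without the patch: sext.w a1; sext.w a0; max a0,a0,a1. */
static int64_t foo6_without_patch(int64_t a0, int64_t a1)
{
    a1 = sext_w(a1);
    a0 = sext_w(a0);
    return max_xlen(a0, a1);
}

/* Sequence with the patch: a bare max on the incoming registers. */
static int64_t foo6_with_patch(int64_t a0, int64_t a1)
{
    return max_xlen(a0, a1);
}

/* Replays the inputs from the testcase above. */
static void check_testcase(void)
{
    /* Low words are 0 and -1, so the 32-bit maximum is 0. */
    assert(foo6_without_patch(0x200000000LL, 0x1ffffffffLL) == 0);
    /* The 64-bit compare sees the upper bits and picks 0x200000000. */
    assert(foo6_with_patch(0x200000000LL, 0x1ffffffffLL) == 8589934592LL);
}
```

Under this model the difference comes entirely from the upper 32 bits of the incoming long long arguments, which the int truncation in the testcase is supposed to discard.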
Kito,

Unless I am missing something, the problem is not the relaxation to
GPR, but rather the sign-extending pattern I had squashed into the
same patch.
If you disable "<bitmanip_optab>si3_sext", a sext.w will have to be
emitted after the 'max' and before the return (or before the SImode
output is consumed as a DImode), pushing the REE opportunity to a
subsequent consumer (e.g. an addw).

This will generate
foo6:
        max     a0,a0,a1
        sext.w  a0,a0
        ret
which (assuming that the inputs to max are properly sign-extended
SImode values living in DImode registers) will be the same as
performing the two sext.w before the max.

Having a second set of eyes on this is appreciated; let me know if
you agree and I'll revise, once I have collected feedback on the
remaining patches of the series.

Philipp.
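The equivalence claimed above rests on the parenthetical assumption that both inputs are already sign-extended. A minimal C sketch of the two orderings, again modelling registers as int64_t with hypothetical function names, suggests they agree on sign-extended inputs but can diverge when the upper 32 bits hold arbitrary data. The diverging input pair below is made up for illustration and does not come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the RV64 Zbb max instruction: a signed 64-bit compare. */
static int64_t max_xlen(int64_t a, int64_t b) { return a > b ? a : b; }

/* Model of sext.w: sign-extend the low 32 bits of a register. */
static int64_t sext_w(int64_t reg)
{
    uint64_t low = (uint64_t)reg & 0xffffffffu;
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000;
}

/* Reference ordering: extend both inputs, then take the maximum. */
static int64_t sext_then_max(int64_t a0, int64_t a1)
{
    return max_xlen(sext_w(a0), sext_w(a1));
}

/* Proposed ordering: max a0,a0,a1 followed by sext.w a0,a0. */
static int64_t max_then_sext(int64_t a0, int64_t a1)
{
    return sext_w(max_xlen(a0, a1));
}

static void check_orderings(void)
{
    /* Properly sign-extended inputs: both orderings agree. */
    assert(max_then_sext(-1, 5) == sext_then_max(-1, 5));
    /* Inputs from Kito's testcase: both happen to give 0. */
    assert(max_then_sext(0x200000000LL, 0x1ffffffffLL) == 0);
    assert(sext_then_max(0x200000000LL, 0x1ffffffffLL) == 0);
    /* Hypothetical inputs with stale upper bits: low words are 1 and 2. */
    assert(sext_then_max(0x100000001LL, 2) == 2);
    assert(max_then_sext(0x100000001LL, 2) == 1);
}
```

In this model the late sext.w repairs the result's upper bits but cannot undo a comparison that was already decided by them.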
IIRC it doesn't work even without the sign-extend pattern, since I ran a
similar experiment before (not for RISC-V, but the same concept). I guess
I need more time to test that.
Hi Philipp:

This testcase gets a wrong result with this patch even w/o the
<bitmanip_optab>si3_sext pattern:

#include <stdio.h>

#define MAX(A, B) ((A) > (B) ? (A) : (B))

long long __attribute__((noinline, noipa))
foo6(long long a, long long b, int c)
{
  int xa = a;
  int xb = b;
  return MAX(MAX(xa, xb), c);
}
int main() {
  long long a = 0x200000000ll;
  long long b = 0x1ffffffffl;
  int c = 10;
  long long d = foo6(a, b, c);
  printf ("%lld %lld %d = %lld\n", a, b, c, d);
  return 0;
}
Kito,

Thanks for the reality-check: the subreg-expressions are getting in the way.
I'll drop this from v2, as a permanent resolution for this will be a bit
more involved.

Philipp.
> > + [(set (match_operand:DI 0 "register_operand" "=r") > >> > > + (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r") > >> > > + (match_operand:SI 2 "register_operand" "r"))))] > >> > > + "TARGET_64BIT && TARGET_ZBB" > >> > > + "<bitmanip_insn>\t%0,%1,%2" > >> > > + [(set_attr "type" "bitmanip")]) > >> > > + > >> > > ;; orc.b (or-combine) is added as an unspec for the benefit of the support > >> > > ;; for optimized string functions (such as strcmp). > >> > > (define_insn "orcb<mode>2" > >> > > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > index f44c398ea08..7169e873551 100644 > >> > > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c > >> > > @@ -1,5 +1,5 @@ > >> > > /* { dg-do compile } */ > >> > > -/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */ > >> > > +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */ > >> > > > >> > > long > >> > > foo1 (long i, long j) > >> > > @@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j) > >> > > return i > j ? i : j; > >> > > } > >> > > > >> > > +unsigned int > >> > > +foo5(unsigned int a, unsigned int b) > >> > > +{ > >> > > + return a > b ? a : b; > >> > > +} > >> > > + > >> > > +int > >> > > +foo6(int a, int b) > >> > > +{ > >> > > + return a > b ? a : b; > >> > > +} > >> > > + > >> > > /* { dg-final { scan-assembler-times "min" 3 } } */ > >> > > -/* { dg-final { scan-assembler-times "max" 3 } } */ > >> > > +/* { dg-final { scan-assembler-times "max" 4 } } */ > >> > > /* { dg-final { scan-assembler-times "minu" 1 } } */ > >> > > -/* { dg-final { scan-assembler-times "maxu" 1 } } */ > >> > > +/* { dg-final { scan-assembler-times "maxu" 3 } } */ > >> > > +/* { dg-final { scan-assembler-not "zext.w" } } */ > >> > > +/* { dg-final { scan-assembler-not "sext.w" } } */ > >> > > -- > >> > > 2.32.0 > >> > >
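
The failure reported in the thread can be reproduced without a RISC-V toolchain. The sketch below models the XLEN-wide max and sext.w as plain C functions on 64-bit values. The names max_xlen, sext_w, max_si_reference, max_si_relaxed, max_si_sext_after and check_models are hypothetical helpers for illustration only, not anything in GCC, and the three variants only approximate the instruction sequences quoted in the thread. The input pair (0x100000001, 2) is an assumption added here; the other values come from Kito's testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the RV64 Zbb `max` instruction: a signed compare on the
   full 64-bit register contents.  */
int64_t max_xlen(int64_t a, int64_t b)
{
  return a > b ? a : b;
}

/* Model of `sext.w`: keep the low 32 bits and sign-extend them to 64.  */
int64_t sext_w(int64_t x)
{
  uint32_t lo = (uint32_t)x;
  int64_t v = (int64_t)lo;            /* zero-extended low word */
  return (lo & 0x80000000u) ? v - ((int64_t)1 << 32) : v;
}

/* sext.w; sext.w; max: the sequence quoted as "w/o this patch".  */
int64_t max_si_reference(int64_t a, int64_t b)
{
  return max_xlen(sext_w(a), sext_w(b));
}

/* max only: the sequence quoted as "w/ this patch".  */
int64_t max_si_relaxed(int64_t a, int64_t b)
{
  return max_xlen(a, b);
}

/* max; sext.w: the alternative discussed in the thread.  */
int64_t max_si_sext_after(int64_t a, int64_t b)
{
  return sext_w(max_xlen(a, b));
}

/* Inputs from the testcase in the thread, plus one hypothetical
   pair (0x100000001, 2) chosen for illustration.  */
void check_models(void)
{
  int64_t a = 0x200000000LL, b = 0x1ffffffffLL;

  assert(max_si_reference(a, b) == 0);
  assert(max_si_relaxed(a, b) == 0x200000000LL);
  /* Extending after the max happens to agree for these inputs ...  */
  assert(max_si_sext_after(a, b) == 0);
  /* ... but not when the upper 32 bits decide the comparison.  */
  assert(max_si_reference(0x100000001LL, 2) == 2);
  assert(max_si_sext_after(0x100000001LL, 2) == 1);
}
```

Calling check_models() from a test driver exercises all three variants. In this model, extending after the max matches the reference only when both inputs are already sign-extended 32-bit values, which is the assumption stated in the thread. A value narrowed from long long through a subreg does not guarantee that.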
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 000deb48b16..2a28f78f5f6 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -260,13 +260,21 @@ (define_insn "bswap<mode>2"
   [(set_attr "type" "bitmanip")])

 (define_insn "<bitmanip_optab><mode>3"
-  [(set (match_operand:X 0 "register_operand" "=r")
-        (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r")
-                           (match_operand:X 2 "register_operand" "r")))]
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+        (bitmanip_minmax:GPR (match_operand:GPR 1 "register_operand" "r")
+                             (match_operand:GPR 2 "register_operand" "r")))]
   "TARGET_ZBB"
   "<bitmanip_insn>\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])

+(define_insn "<bitmanip_optab>si3_sext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (sign_extend:DI (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r")
+                                            (match_operand:SI 2 "register_operand" "r"))))]
+  "TARGET_64BIT && TARGET_ZBB"
+  "<bitmanip_insn>\t%0,%1,%2"
+  [(set_attr "type" "bitmanip")])
+
 ;; orc.b (or-combine) is added as an unspec for the benefit of the support
 ;; for optimized string functions (such as strcmp).
 (define_insn "orcb<mode>2"
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
index f44c398ea08..7169e873551 100644
--- a/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
+++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -O2" } */
+/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64 -O2" } */

 long
 foo1 (long i, long j)
@@ -25,7 +25,21 @@ foo4 (unsigned long i, unsigned long j)
   return i > j ? i : j;
 }

+unsigned int
+foo5(unsigned int a, unsigned int b)
+{
+  return a > b ? a : b;
+}
+
+int
+foo6(int a, int b)
+{
+  return a > b ? a : b;
+}
+
 /* { dg-final { scan-assembler-times "min" 3 } } */
-/* { dg-final { scan-assembler-times "max" 3 } } */
+/* { dg-final { scan-assembler-times "max" 4 } } */
 /* { dg-final { scan-assembler-times "minu" 1 } } */
-/* { dg-final { scan-assembler-times "maxu" 1 } } */
+/* { dg-final { scan-assembler-times "maxu" 3 } } */
+/* { dg-final { scan-assembler-not "zext.w" } } */
+/* { dg-final { scan-assembler-not "sext.w" } } */