Message ID | 20221228181817.193462-1-rzinsly@ventanamicro.com |
---|---|
State | Deferred, archived |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9103F3858410 for <patchwork@sourceware.org>; Wed, 28 Dec 2022 18:18:52 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oa1-x2b.google.com (mail-oa1-x2b.google.com [IPv6:2001:4860:4864:20::2b]) by sourceware.org (Postfix) with ESMTPS id 81F983858D35 for <gcc-patches@gcc.gnu.org>; Wed, 28 Dec 2022 18:18:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 81F983858D35 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=ventanamicro.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=ventanamicro.com Received: by mail-oa1-x2b.google.com with SMTP id 586e51a60fabf-14fb3809eaeso13368189fac.1 for <gcc-patches@gcc.gnu.org>; Wed, 28 Dec 2022 10:18:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ventanamicro.com; s=google; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=OATrbedgnWuHl4XAbcKkJKdbj2f1VYlzm3/+97zV2LA=; b=ETPo8fZPRh9eK5bgz96Prkuz6ppwDvF/kOdnbIV4ExwaS8tq5MF6f6dY4qfnl1EODY AjMZ7+kFQ1blS8HNcdRynYhNmTS0HjFcbXFoiShdzLEzatrJf38sHpxEP9j1x6pHrA1K t34SNv474yFqtO019zQI8Hsp4p/AYj21FhcyGoqm9HJi6fUlYN0+oPLpZxgYGeGNzhIP iBbIV0+K3Asn0cLCf1bmMq9rniEL4FLOkrZOwI4Z6/mOqEdmNZuOTv7pae5rBqo8JzYz DMVupF8lvnmDgynzcVVFQvcp2EDD0X65ssg1eGRgYizq55rVcgFxVHmctfDkq6wCHzEo NwYw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=OATrbedgnWuHl4XAbcKkJKdbj2f1VYlzm3/+97zV2LA=; b=iqqJ8WRyA17hEOFY6XWLFUUaU+b2ywXNcESfHmAZeHYr2NOaLkyI3qTvd/lfBT76RO GbwTBr8KUI09qbLHhKcmN5nBlI0jb0EksITHrx8AeKTVvn0X6UkrQ9Bat4xvpOXzlB++ G46uCED9izOoYkrLg0v77woX1UHI4iuI1ium99xmwrQGB5hJ+dIfMEs2YrlS8bbquV7f ZKwzmtRsBDOfJH410sXHMAs0qAUU5lzY8k6pfv5KdgArqbZKW6d/aT2PB8+Jg+JUp6nj +etEntXCZeaig5CHYLfDRnfQ5fpIND8Y6X6KtMAZVNlHwtJ/Uy8ARvbAUkvPSrxaqhtZ h76w== X-Gm-Message-State: AFqh2koWzp62rT7xirSDoVSgV1eharjaPT4R0SBMBY/R032LbJfhFPfq oBgOhtiPbZ3wmciQUTIj1gwLYQcAZ+FyFY0J X-Google-Smtp-Source: AMrXdXv9M/HHW6oLtVy3VuGqp/P+VwUm1WvdoF/4SZic3dvV/rouBxYX59JLVCMbNqg2H8UBPjxiUQ== X-Received: by 2002:a05:6871:414:b0:144:57b1:c8cf with SMTP id d20-20020a056871041400b0014457b1c8cfmr17503640oag.47.1672251512404; Wed, 28 Dec 2022 10:18:32 -0800 (PST) Received: from marvin.dc1.ventanamicro.com ([2804:14d:baa4:50a9::1000]) by smtp.gmail.com with ESMTPSA id s24-20020a056870ea9800b0014c7958c55bsm7731479oap.42.2022.12.28.10.18.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Dec 2022 10:18:31 -0800 (PST) From: Raphael Moreira Zinsly <rzinsly@ventanamicro.com> To: gcc-patches@gcc.gnu.org Cc: jeffreyalaw@gmail.com, philipp.tomsich@vrull.eu, Raphael Moreira Zinsly <rzinsly@ventanamicro.com> Subject: [PATCH] RISC-V: Optimize min/max with SImode sources on 64-bit Date: Wed, 28 Dec 2022 15:18:17 -0300 Message-Id: <20221228181817.193462-1-rzinsly@ventanamicro.com> X-Mailer: git-send-email 2.38.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
RISC-V: Optimize min/max with SImode sources on 64-bit
|
|
Commit Message
Raphael Moreira Zinsly
Dec. 28, 2022, 6:18 p.m. UTC
The Zbb min/max pattern was not matching 32-bit sources when compiling for 64-bit. This patch separates the pattern into SImode and DImode, and use a define_expand to handle SImode on 64-bit. zbb-min-max-02.c generates different code as a result of the new expander. The resulting code is as efficient as the old code. Furthermore, the special sh1add pattern that appeared in zbb-min-max-02.c is tested by the zba-shNadd-* tests. gcc/ChangeLog: * config/riscv/bitmanip.md (<bitmanip_optab><mode>3): Divide pattern into <bitmanip_optab>si3_insn and <bitmanip_optab>di3. (<bitmanip_optab>si3): Handle SImode sources on TARGET_64BIT. gcc/testsuite: * gcc.target/riscv/zbb-abs.c: New test. * gcc.target/riscv/zbb-min-max-02.c: Addapt the expected output. --- gcc/config/riscv/bitmanip.md | 38 ++++++++++++++++--- gcc/testsuite/gcc.target/riscv/zbb-abs.c | 18 +++++++++ .../gcc.target/riscv/zbb-min-max-02.c | 2 +- 3 files changed, 52 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-abs.c
Comments
On 12/28/22 11:18, Raphael Moreira Zinsly wrote: > The Zbb min/max pattern was not matching 32-bit sources when > compiling for 64-bit. > This patch separates the pattern into SImode and DImode, and > use a define_expand to handle SImode on 64-bit. > zbb-min-max-02.c generates different code as a result of the new > expander. The resulting code is as efficient as the old code. > Furthermore, the special sh1add pattern that appeared in > zbb-min-max-02.c is tested by the zba-shNadd-* tests. > > gcc/ChangeLog: > > * config/riscv/bitmanip.md > (<bitmanip_optab><mode>3): Divide pattern into > <bitmanip_optab>si3_insn and <bitmanip_optab>di3. > (<bitmanip_optab>si3): Handle SImode sources on > TARGET_64BIT. > > gcc/testsuite: > > * gcc.target/riscv/zbb-abs.c: New test. > * gcc.target/riscv/zbb-min-max-02.c: Addapt the > expected output. So we need to do a bit of x86 debugging. Given the regressions below on the x86 testsuite, we should assume there may be other targets where the optimization might result in testsuite regressions. The good news is we can can use my upstream GCC tester to help identify some of these cases. So I'll put the simplify-rtx change into my tester and see what pops out overnight on the embedded targets. You're also missing a ChangeLog entry for the simplify-rtx change. Sorry I didn't catch that sooner. --- /home/jlaw/gcc.sum 2022-12-28 15:59:52.612140312 -0700 +++ gcc/testsuite/gcc/gcc.sum 2022-12-28 17:14:52.661333526 -0700 @@ -182730,9 +182730,9 @@ PASS: gcc.target/i386/pr106095.c (test for excess errors) PASS: gcc.target/i386/pr106122.c (test for excess errors) PASS: gcc.target/i386/pr106231-1.c (test for excess errors) -PASS: gcc.target/i386/pr106231-1.c scan-assembler-not cltq +FAIL: gcc.target/i386/pr106231-1.c scan-assembler-not cltq PASS: gcc.target/i386/pr106231-2.c (test for excess errors) -PASS: gcc.target/i386/pr106231-2.c scan-assembler-not cltq +FAIL: gcc.target/i386/pr106231-2.c scan-assembler-not cltq PASS: gcc.target/i386/pr106273.c (test for excess errors) PASS: gcc.target/i386/pr106273.c scan-assembler-not andn[ \\t]+%rdi, %r11, %rdi PASS: gcc.target/i386/pr106303.c (test for excess errors) @@ -186148,7 +186148,7 @@ PASS: gcc.target/i386/pr91704.c scan-assembler-not \tvpsubusb\t PASS: gcc.target/i386/pr91704.c scan-assembler-times \tvpcmpgtb\t%ymm 1 PASS: gcc.target/i386/pr91824-1.c (test for excess errors) -PASS: gcc.target/i386/pr91824-1.c scan-assembler-not cltq +FAIL: gcc.target/i386/pr91824-1.c scan-assembler-not cltq PASS: gcc.target/i386/pr91824-2.c (test for excess errors) PASS: gcc.target/i386/pr91824-2.c scan-assembler-not cltq PASS: gcc.target/i386/pr91824-2.c scan-assembler-not movl\t%eax, %eax @@ -186561,9 +186561,9 @@ PASS: gcc.target/i386/pr95483-7.c (test for excess errors) PASS: gcc.target/i386/pr95483-7.c scan-assembler-times vmovdqu[ \\t]+[^{\n]*\\)[^\n]*%xmm[0-9]+(?:\n|[ \\t]+#) 2 PASS: gcc.target/i386/pr95535-1.c (test for excess errors) -PASS: gcc.target/i386/pr95535-1.c scan-assembler-not cltq +FAIL: gcc.target/i386/pr95535-1.c scan-assembler-not cltq PASS: gcc.target/i386/pr95535-2.c (test for excess errors) -PASS: gcc.target/i386/pr95535-2.c scan-assembler-not cltq +FAIL: gcc.target/i386/pr95535-2.c scan-assembler-not cltq PASS: gcc.target/i386/pr95740.c (test for excess errors) PASS: gcc.target/i386/pr95740.c scan-assembler-times (?n)incl[\\t ]*%eax 1 PASS: gcc.target/i386/pr95740.c scan-assembler-times (?n)incq[\\t ]*%rax 1
On Wed, Dec 28, 2022 at 10:36 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 12/28/22 11:18, Raphael Moreira Zinsly wrote: > > The Zbb min/max pattern was not matching 32-bit sources when > > compiling for 64-bit. > > This patch separates the pattern into SImode and DImode, and > > use a define_expand to handle SImode on 64-bit. > > zbb-min-max-02.c generates different code as a result of the new > > expander. The resulting code is as efficient as the old code. > > Furthermore, the special sh1add pattern that appeared in > > zbb-min-max-02.c is tested by the zba-shNadd-* tests. > > > > gcc/ChangeLog: > > > > * config/riscv/bitmanip.md > > (<bitmanip_optab><mode>3): Divide pattern into > > <bitmanip_optab>si3_insn and <bitmanip_optab>di3. > > (<bitmanip_optab>si3): Handle SImode sources on > > TARGET_64BIT. > > > > gcc/testsuite: > > > > * gcc.target/riscv/zbb-abs.c: New test. > > * gcc.target/riscv/zbb-min-max-02.c: Addapt the > > expected output. > So we need to do a bit of x86 debugging. > > Given the regressions below on the x86 testsuite, we should assume there > may be other targets where the optimization might result in testsuite > regressions. Thanks for checking it! > The good news is we can can use my upstream GCC tester to help identify > some of these cases. So I'll put the simplify-rtx change into my tester > and see what pops out overnight on the embedded targets. > > You're also missing a ChangeLog entry for the simplify-rtx change. > Sorry I didn't catch that sooner. There is no simplify-rtx change in this patch, I think you may be mixing my patches ;-)
On 12/29/22 05:23, Raphael Zinsly wrote: > On Wed, Dec 28, 2022 at 10:36 PM Jeff Law <jeffreyalaw@gmail.com> wrote: >> >> >> >> On 12/28/22 11:18, Raphael Moreira Zinsly wrote: >>> The Zbb min/max pattern was not matching 32-bit sources when >>> compiling for 64-bit. >>> This patch separates the pattern into SImode and DImode, and >>> use a define_expand to handle SImode on 64-bit. >>> zbb-min-max-02.c generates different code as a result of the new >>> expander. The resulting code is as efficient as the old code. >>> Furthermore, the special sh1add pattern that appeared in >>> zbb-min-max-02.c is tested by the zba-shNadd-* tests. >>> >>> gcc/ChangeLog: >>> >>> * config/riscv/bitmanip.md >>> (<bitmanip_optab><mode>3): Divide pattern into >>> <bitmanip_optab>si3_insn and <bitmanip_optab>di3. >>> (<bitmanip_optab>si3): Handle SImode sources on >>> TARGET_64BIT. >>> >>> gcc/testsuite: >>> >>> * gcc.target/riscv/zbb-abs.c: New test. >>> * gcc.target/riscv/zbb-min-max-02.c: Addapt the >>> expected output. >> So we need to do a bit of x86 debugging. >> >> Given the regressions below on the x86 testsuite, we should assume there >> may be other targets where the optimization might result in testsuite >> regressions. > > Thanks for checking it! lol, you're right. I'm mushing together your stuff in my head! So we've got a couple TODOs for the ctz/clz improvements before they can be submitted ;-) jeff
On Wed, 28 Dec 2022 at 19:18, Raphael Moreira Zinsly < rzinsly@ventanamicro.com> wrote: > The Zbb min/max pattern was not matching 32-bit sources when > compiling for 64-bit. > This patch separates the pattern into SImode and DImode, and > use a define_expand to handle SImode on 64-bit. > zbb-min-max-02.c generates different code as a result of the new > expander. The resulting code is as efficient as the old code. > Furthermore, the special sh1add pattern that appeared in > zbb-min-max-02.c is tested by the zba-shNadd-* tests. > > gcc/ChangeLog: > > * config/riscv/bitmanip.md > (<bitmanip_optab><mode>3): Divide pattern into > <bitmanip_optab>si3_insn and <bitmanip_optab>di3. > (<bitmanip_optab>si3): Handle SImode sources on > TARGET_64BIT. > > gcc/testsuite: > > * gcc.target/riscv/zbb-abs.c: New test. > * gcc.target/riscv/zbb-min-max-02.c: Addapt the > expected output. > --- > gcc/config/riscv/bitmanip.md | 38 ++++++++++++++++--- > gcc/testsuite/gcc.target/riscv/zbb-abs.c | 18 +++++++++ > .../gcc.target/riscv/zbb-min-max-02.c | 2 +- > 3 files changed, 52 insertions(+), 6 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-abs.c > > diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md > index d17133d58c1..abf08a29e89 100644 > --- a/gcc/config/riscv/bitmanip.md > +++ b/gcc/config/riscv/bitmanip.md > @@ -360,14 +360,42 @@ > DONE; > }) > > -(define_insn "<bitmanip_optab><mode>3" > - [(set (match_operand:X 0 "register_operand" "=r") > - (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r") > - (match_operand:X 2 "register_operand" "r")))] > - "TARGET_ZBB" > +(define_insn "<bitmanip_optab>si3_insn" > + [(set (match_operand:SI 0 "register_operand" "=r") > + (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r") > + (match_operand:SI 2 "register_operand" "r")))] > + "!TARGET_64BIT && TARGET_ZBB" > "<bitmanip_insn>\t%0,%1,%2" > [(set_attr "type" "bitmanip")]) > > +(define_insn "<bitmanip_optab>di3" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (bitmanip_minmax:DI (match_operand:DI 1 "register_operand" "r") > + (match_operand:DI 2 "register_operand" "r")))] > + "TARGET_64BIT && TARGET_ZBB" > + "<bitmanip_insn>\t%0,%1,%2" > + [(set_attr "type" "bitmanip")]) > + > +(define_expand "<bitmanip_optab>si3" > + [(set (match_operand:SI 0 "register_operand" "=r") > + (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r") > + (match_operand:SI 2 "register_operand" "r")))] > + "TARGET_ZBB" > + " > +{ > + if (TARGET_64BIT) > + { > + rtx op1_x = gen_reg_rtx (DImode); > + emit_move_insn (op1_x, gen_rtx_SIGN_EXTEND (DImode, operands[1])); > + rtx op2_x = gen_reg_rtx (DImode); > + emit_move_insn (op2_x, gen_rtx_SIGN_EXTEND (DImode, operands[2])); > + rtx dst_x = gen_reg_rtx (DImode); > + emit_insn (gen_<bitmanip_optab>di3 (dst_x, op1_x, op2_x)); > + emit_move_insn (operands[0], gen_lowpart (SImode, dst_x)); > + DONE; > + } > +}") > We have two issues around min/max here: 1. That it doesn't apply to the SImode abs case (which is due to expand_abs_nojump() blindly testing for the current mode in smax_optab). 2. That we have to reduce the number of extensions to the least amount. The above addresses expand_abs_nojump(), but makes the general solution harder as the middle-end needs to know there is no native SImode min/max available. We still plan (proof-of-concept works, but a final patch will likely not be ready before very late in January) to submit a patch to improve the expansion of MIN_EXPR/MAX_EXPR that utilizes the type-precision and value-ranges to not even create the sign-extensions in the first place. If we do the above, the middle-end will blindly emit this sequence with the 2 sign-extensions — which may or may not be eliminated later by combining with a w-form. I'll also add an enhancement to expand_abs_nojump() to our list of changes for the min/max enhancement during the lowering. Note that, if we decide to go ahead with using this as a temporary solution until our change is ready, you'll also need to add a cost for the SImode max. Philipp. > + > ;; Optimize the common case of a SImode min/max against a constant > ;; that is safe both for sign- and zero-extension. > (define_insn_and_split "*minmax" > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-abs.c > b/gcc/testsuite/gcc.target/riscv/zbb-abs.c > new file mode 100644 > index 00000000000..6ef7efdbd49 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/zbb-abs.c > @@ -0,0 +1,18 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gc_zbb" } */ > +/* { dg-skip-if "" { *-*-* } { "-O0" } } */ > + > +#define ABS(x) (((x) >= 0) ? (x) : -(x)) > + > +int > +foo (int x) > +{ > + return ABS(x); > +} > + > +/* { dg-final { scan-assembler-times "neg" 1 } } */ > +/* { dg-final { scan-assembler-times "max" 1 } } */ > +/* { dg-final { scan-assembler-not "sraiw" } } */ > +/* { dg-final { scan-assembler-not "xor" } } */ > +/* { dg-final { scan-assembler-not "subw" } } */ > + > diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c > b/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c > index b462859f10f..b9db655d55d 100644 > --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c > +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c > @@ -9,6 +9,6 @@ int f(unsigned int* a) > } > > /* { dg-final { scan-assembler-times "minu" 1 } } */ > -/* { dg-final { scan-assembler-times "sext.w" 1 } } */ > +/* { dg-final { scan-assembler-times "sext.w|addw" 1 } } */ > /* { dg-final { scan-assembler-not "zext.w" } } */ > > -- > 2.38.1 > >
On 12/29/22 07:50, Philipp Tomsich wrote: > > > We have two issues around min/max here: > 1. That it doesn't apply to the SImode abs case (which is due to > expand_abs_nojump() blindly testing for the current mode in smax_optab). Mode testing is inherent in the optab query interface. > 2. That we have to reduce the number of extensions to the least amount. Yes. I'd already indicated to Raphael that y'all were poking in this space and that I consider the two approaches complementary. Essentially we want to reduce unnecessary extensions by type adjustment in gimple when we can prove that safe while at the same time we want the RTL bits to generate good code if we're unable to adjust the types. > > The above addresses expand_abs_nojump(), but makes the general solution > harder as the middle-end needs to know there is no native SImode min/max > available. I think we can probably deal with this without too much trouble. We just need a reasonable API to ask "is this optab supported directly by the target, or is it expanded by the target". I can envision a couple ways to do that, one of which would be to query the rtx cost. If the resultant cost is > COSTS_N_INSNS (1), then its synthesized, otherwise it's native. > We still plan (proof-of-concept works, but a final patch will likely not > be ready before very late in January) to submit a patch to improve the > expansion of MIN_EXPR/MAX_EXPR that utilizes the type-precision and > value-ranges to not even create the sign-extensions in the first place. Which would be fantastic. > If we do the above, the middle-end will blindly emit this sequence with > the 2 sign-extensions — which may or may not be eliminated later by > combining with a w-form. I think the question becomes whether or not we think the work you're doing would likely subsume what Raphael has done. Your work and Raphael's work both target the gimple->RTL expansion phase. Yours tries to adjust types so that we have fewer extensions. Raphael's work adjusts how we expand when we're asked for a 32bit min/max. Those seem complementary to me -- if you're unable to adjust the types, then go ahead and ask for that 32bit min/max and let Raphael's expander generate reasonable code for it. If there's an API that allows you to query for native vs synthesized support, then you should have the backend info necessary to drive your type adjustment work. > > Note that, if we decide to go ahead with using this as a temporary > solution until our change is ready, you'll also need to add a cost for > the SImode max. Agreed. I haven't exposed Raphael to rtx_cost yet, but now seems like a good a time as any. Raphael, you're going to want to look in riscv.cc::riscv_rtx_costs. We'll want a case for MIN/MAX. For TARGET_ZBB and word sized operations, the cost is probably COSTS_N_INSNS (1), for TARGET_ZBB a subword operation (ie SImode for TARGET_64BIT) it'd be COSTS_N_INSNS (2). In both of those cases I think we return true. For !TARGET_ZBB, COSTS_N_INSNS (N) where N is the number of instructions to implement a min/max without TARGET_ZBB would be the right value. I just don't know it offhand. Jeff
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md index d17133d58c1..abf08a29e89 100644 --- a/gcc/config/riscv/bitmanip.md +++ b/gcc/config/riscv/bitmanip.md @@ -360,14 +360,42 @@ DONE; }) -(define_insn "<bitmanip_optab><mode>3" - [(set (match_operand:X 0 "register_operand" "=r") - (bitmanip_minmax:X (match_operand:X 1 "register_operand" "r") - (match_operand:X 2 "register_operand" "r")))] - "TARGET_ZBB" +(define_insn "<bitmanip_optab>si3_insn" + [(set (match_operand:SI 0 "register_operand" "=r") + (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r")))] + "!TARGET_64BIT && TARGET_ZBB" "<bitmanip_insn>\t%0,%1,%2" [(set_attr "type" "bitmanip")]) +(define_insn "<bitmanip_optab>di3" + [(set (match_operand:DI 0 "register_operand" "=r") + (bitmanip_minmax:DI (match_operand:DI 1 "register_operand" "r") + (match_operand:DI 2 "register_operand" "r")))] + "TARGET_64BIT && TARGET_ZBB" + "<bitmanip_insn>\t%0,%1,%2" + [(set_attr "type" "bitmanip")]) + +(define_expand "<bitmanip_optab>si3" + [(set (match_operand:SI 0 "register_operand" "=r") + (bitmanip_minmax:SI (match_operand:SI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "r")))] + "TARGET_ZBB" + " +{ + if (TARGET_64BIT) + { + rtx op1_x = gen_reg_rtx (DImode); + emit_move_insn (op1_x, gen_rtx_SIGN_EXTEND (DImode, operands[1])); + rtx op2_x = gen_reg_rtx (DImode); + emit_move_insn (op2_x, gen_rtx_SIGN_EXTEND (DImode, operands[2])); + rtx dst_x = gen_reg_rtx (DImode); + emit_insn (gen_<bitmanip_optab>di3 (dst_x, op1_x, op2_x)); + emit_move_insn (operands[0], gen_lowpart (SImode, dst_x)); + DONE; + } +}") + ;; Optimize the common case of a SImode min/max against a constant ;; that is safe both for sign- and zero-extension. (define_insn_and_split "*minmax" diff --git a/gcc/testsuite/gcc.target/riscv/zbb-abs.c b/gcc/testsuite/gcc.target/riscv/zbb-abs.c new file mode 100644 index 00000000000..6ef7efdbd49 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-abs.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbb" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" } } */ + +#define ABS(x) (((x) >= 0) ? (x) : -(x)) + +int +foo (int x) +{ + return ABS(x); +} + +/* { dg-final { scan-assembler-times "neg" 1 } } */ +/* { dg-final { scan-assembler-times "max" 1 } } */ +/* { dg-final { scan-assembler-not "sraiw" } } */ +/* { dg-final { scan-assembler-not "xor" } } */ +/* { dg-final { scan-assembler-not "subw" } } */ + diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c b/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c index b462859f10f..b9db655d55d 100644 --- a/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max-02.c @@ -9,6 +9,6 @@ int f(unsigned int* a) } /* { dg-final { scan-assembler-times "minu" 1 } } */ -/* { dg-final { scan-assembler-times "sext.w" 1 } } */ +/* { dg-final { scan-assembler-times "sext.w|addw" 1 } } */ /* { dg-final { scan-assembler-not "zext.w" } } */