| Message ID | c9934c80-5182-9a7b-f9fe-f7b14e458e16@rivosinc.com |
|---|---|
| State | New |
| Headers | From: Michael Collison <collison@rivosinc.com>; To: gcc-patches@gcc.gnu.org; Cc: Richard Biener <richard.guenther@gmail.com>; Date: Thu, 10 Nov 2022 21:28:01 -0500; Message-ID: <c9934c80-5182-9a7b-f9fe-f7b14e458e16@rivosinc.com>; Subject: [PATCH v2] match.pd: rewrite select to branchless expression |
| Series | [v2] match.pd: rewrite select to branchless expression |
Commit Message
Michael Collison
Nov. 11, 2022, 2:28 a.m. UTC
This patch transforms ((x & 0x1) == 0) ? y : z <op> y into
(-(typeof(y))(x & 0x1) & z) <op> y, where op is '^' or '|'. It also
transforms (cond (and (x , 0x1) != 0), (z op y), y ) into (-(and (x ,
0x1)) & z ) op y.

Matching these patterns allows GCC to generate branchless code for one of
the functions in coremark.

Bootstrapped and tested on x86 and RISC-V. Okay?

Michael.

2022-11-10  Michael Collison  <collison@rivosinc.com>

	* match.pd ((x & 0x1) == 0) ? y : z <op> y
	-> (-(typeof(y))(x & 0x1) & z) <op> y.

2022-11-10  Michael Collison  <collison@rivosinc.com>

	* gcc.dg/tree-ssa/branchless-cond.c: New test.

---

Changes in v2:

- Rewrite comment to use C syntax
- Guard against 1-bit types
- Simplify pattern by using zero_one_valued_p

 gcc/match.pd                                  | 24 +++++++++++++++++
 .../gcc.dg/tree-ssa/branchless-cond.c         | 26 +++++++++++++++++++
 2 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
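For readers who want to see why the rewrite is sound, here is a small standalone sketch. It is not part of the patch: it assumes unsigned int operands as in the new test, and the function names and sample values are made up for illustration. The point is that -(x & 1) is 0 when the low bit is clear and all-ones when it is set, so ANDing it with z gives either 0 or z, and both 0 ^ y and 0 | y are just y.

```c
/* Branchy form, as a programmer would write it.  */
unsigned int xor_branchy (unsigned int x, unsigned int y, unsigned int z)
{
  return ((x & 1) == 0) ? y : (z ^ y);
}

/* Branchless form: the negated bit acts as a mask that keeps or drops z.  */
unsigned int xor_branchless (unsigned int x, unsigned int y, unsigned int z)
{
  return (-(x & 1) & z) ^ y;
}

/* Same idea for '|', written with the "!= 0" orientation of the condition.  */
unsigned int ior_branchy (unsigned int x, unsigned int y, unsigned int z)
{
  return ((x & 1) != 0) ? (z | y) : y;
}

unsigned int ior_branchless (unsigned int x, unsigned int y, unsigned int z)
{
  return (-(x & 1) & z) | y;
}

/* Compare both forms over a small grid of (hypothetical) sample inputs
   and return how many results disagree.  */
int count_mismatches (void)
{
  static const unsigned int samples[] = { 0u, 1u, 2u, 3u, 0x55aau, 0x8000u, ~0u };
  const int n = (int) (sizeof samples / sizeof samples[0]);
  int mismatches = 0;

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      for (int k = 0; k < n; k++)
        {
          unsigned int x = samples[i], y = samples[j], z = samples[k];
          if (xor_branchy (x, y, z) != xor_branchless (x, y, z))
            mismatches++;
          if (ior_branchy (x, y, z) != ior_branchless (x, y, z))
            mismatches++;
        }
  return mismatches;
}
```

Calling count_mismatches () from a test driver is expected to return 0. The sketch only checks the arithmetic identity; it says nothing about which code a given GCC version actually emits.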
Comments
On Fri, 11 Nov 2022 at 07:58, Michael Collison <collison@rivosinc.com> wrote:
> [...]
> +int f1(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z ^ y;
> +}
> +
> +int f2(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z ^ y : y;
> +}
> +
> +int f3(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z | y;
> +}
> +
> +int f4(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z | y : y;
> +}

Sorry to nitpick -- since the pattern gates on INTEGRAL_TYPE_P, would
it be a good idea to have these tests for other integral types besides
int, such as {char, short, long}?

Thanks,
Prathamesh
Hi Prathamesh,

It is my understanding that INTEGRAL_TYPE_P applies to the other
integer types you mentioned (char, short, long). In fact, the test
function that motivated this match has a mixture of char and short, and
the INTEGRAL_TYPE_P check does not restrict matching.

On 11/11/22 02:44, Prathamesh Kulkarni wrote:
> Sorry to nitpick -- since the pattern gates on INTEGRAL_TYPE_P, would
> it be a good idea to have these tests for other integral types besides
> int, such as {char, short, long}?
>
> Thanks,
> Prathamesh
On 11/11/22 06:00, Michael Collison wrote:
> Hi Prathamesh,
>
> It is my understanding that INTEGRAL_TYPE_P applies to the other
> integer types you mentioned (char, short, long). In fact the test
> function that motivated this match has a mixture of char and short and
> does not restrict matching.

What I think Prathamesh is asking is whether or not we want to have
tests with different types. It's less about correctness, I think, and
more about ensuring that the testsuite covers those types in case they
regress in the future.

jeff
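As an illustration of the coverage being discussed, the following is one possible shape for such tests. It is a hypothetical sketch, not part of the posted patch: the macro name and the generated function names are invented here. It deliberately leaves out dg-final directives, because integer promotion of char and short operands introduces conversions, and whether the scan counts from the original test carry over to those types has not been checked.

```c
/* Hypothetical helper: stamps out the four select shapes from the
   posted test for an arbitrary unsigned integral type T.  */
#define DEFINE_SELECT_TESTS(NAME, T)                                      \
  T NAME##_xor_eq (T x, T y, T z) { return ((x & 1) == 0) ? y : z ^ y; } \
  T NAME##_xor_ne (T x, T y, T z) { return ((x & 1) != 0) ? z ^ y : y; } \
  T NAME##_ior_eq (T x, T y, T z) { return ((x & 1) == 0) ? y : z | y; } \
  T NAME##_ior_ne (T x, T y, T z) { return ((x & 1) != 0) ? z | y : y; }

/* One instantiation per type mentioned in the review.  */
DEFINE_SELECT_TESTS (uc, unsigned char)
DEFINE_SELECT_TESTS (us, unsigned short)
DEFINE_SELECT_TESTS (ul, unsigned long)
```

Using a macro keeps the per-type variants textually identical, so a regression that affects only one width would show up as a difference between otherwise equal functions.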
On Fri, Nov 11, 2022 at 3:28 AM Michael Collison <collison@rivosinc.com> wrote:
> [...]
> Bootstrapped and tested on x86 and RISC-V. Okay?

OK.

Thanks,
Richard.
Richard,

Can you submit this patch for me while I sort out git write access?

On 11/18/22 07:57, Richard Biener wrote:
> On Fri, Nov 11, 2022 at 3:28 AM Michael Collison <collison@rivosinc.com> wrote:
>> [...]
>> Bootstrapped and tested on x86 and RISC-V. Okay?
> OK.
>
> Thanks,
> Richard.
On Thu, Dec 1, 2022 at 7:57 PM Michael Collison <collison@rivosinc.com> wrote:
>
> Richard,
>
> Can you submit this patch for me while I sort out git write access?

Done. I had to apply the patch manually; in future, please make sure
to send patches that can be applied with git am.

Thanks,
Richard.
diff --git a/gcc/match.pd b/gcc/match.pd
index 194ba8f5188..258531e9046 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3486,6 +3486,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
   (max @2 @1))
 
+/* ((x & 0x1) == 0) ? y : z <op> y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (eq zero_one_valued_p@0
+            integer_zerop)
+        @1
+        (op:c @2 @1))
+  (if (INTEGRAL_TYPE_P (type)
+       && TYPE_PRECISION (type) > 1
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+       (op (bit_and (negate (convert:type @0)) @2) @1))))
+
+/* ((x & 0x1) == 0) ? z <op> y : y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (ne zero_one_valued_p@0
+            integer_zerop)
+        (op:c @2 @1)
+        @1)
+  (if (INTEGRAL_TYPE_P (type)
+       && TYPE_PRECISION (type) > 1
+       && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+       (op (bit_and (negate (convert:type @0)) @2) @1))))
+
 /* Simplifications of shift and rotates. */
 
 (for rotate (lrotate rrotate)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
new file mode 100644
index 00000000000..68087ae6568
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int f1(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z ^ y;
+}
+
+int f2(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z ^ y : y;
+}
+
+int f3(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z | y;
+}
+
+int f4(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z | y : y;
+}
+
+/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
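A note on the dg-final counts, offered as a reading of the test rather than anything stated in the thread: if each of the four functions is simplified as intended, it would presumably contain one negation and two ANDs (the x & 1 itself plus the mask applied to z), which would account for the expected 4 occurrences of " -" and 8 of " & ". The hand-written function below approximates that shape for f1; the name f1_shape and the temporaries are invented for illustration, and the real GIMPLE dump uses SSA names and may differ between GCC versions.

```c
/* Hand-written approximation of the expected optimized form of f1.  */
unsigned int f1_shape (unsigned int x, unsigned int y, unsigned int z)
{
  unsigned int bit = x & 1;        /* first " & ": isolate the low bit */
  unsigned int mask = -bit;        /* the single " -": 0 or all-ones */
  unsigned int masked = mask & z;  /* second " & ": 0 or z */
  return masked ^ y;               /* y when the bit is clear, z ^ y otherwise */
}
```

Note also that the second match.pd pattern matches on ne, so in C terms it corresponds to the ((x & 1) != 0) ? z <op> y : y orientation used by f2 and f4.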