Message ID | 20221108195509.2701313-1-philipp.tomsich@vrull.eu |
---|---|
State | Deferred, archived |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 5C792385AC2E for <patchwork@sourceware.org>; Tue, 8 Nov 2022 19:55:30 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lj1-x232.google.com (mail-lj1-x232.google.com [IPv6:2a00:1450:4864:20::232]) by sourceware.org (Postfix) with ESMTPS id B3C9D3858003 for <gcc-patches@gcc.gnu.org>; Tue, 8 Nov 2022 19:55:13 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B3C9D3858003 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-lj1-x232.google.com with SMTP id b9so22698500ljr.5 for <gcc-patches@gcc.gnu.org>; Tue, 08 Nov 2022 11:55:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=HpWFTKpR8RTITEEcJM7g/Smup+i4QN4fi/YyUSK0Xog=; b=hsO8yqqX8jOgr7zCgELNVFwA21cyCGXIoMyA2KgeqN1F3VlDlTjzXwVBn9+fkk6Plv gVsshmvWKRRSOIq3H95+GmxT0PbqGFQ0NSRQOz+nAnf+Nia+pSq3ETjqjsBYMPslzeXE ltkq8qUXaLm6Km4+vGtdX1iWBgxyMvK+1hedsEKIsSS/HX3AhGM7TWWsyfo512shk1m6 hGBKH3vDjnHBayvGSTcSC1iRnVFvs86Zj66sP+CZB6ycMhJwOvFk42sMpY0TJ0YHwxV9 WRV6PRyBSl4CM6nvPxcuJJpeI6kHiYlBI24VAhILqpNiXtOHb8A2xLKUnEGGL+fHUnWY ickA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=HpWFTKpR8RTITEEcJM7g/Smup+i4QN4fi/YyUSK0Xog=; b=1Krh+EGyfyvLkgKbMVikbHaDW180ILprLyiKLa9w877Y82KSYAPJEjzdO0POZVTKxp tuPbVK49kkvedYfqgN3szkuKpNbFsfBQZiMI3kr+Uk5/0vkfscyV4wb+acs+smHw7fpj A07ufTu10mf601SAYoe81xddYSNG4+nQyb2ia7HrbHN7vsIB21sKKfu4Wbunaj/TN3iw Bp8/ceGNYdggcG02/C8X3Vmq20RD3Yn7465H9Zer01XS3jL2z7UUAn0xf1y4lZAXLZPR v7OwA0mZCq1Sxk5OGXsMVgdnV7mc6ddg/HgBEcuRWoHedprrHA8RH6nH1QsfiIo2FtRi /4ig== X-Gm-Message-State: ACrzQf3hqq3v19NxFBp1/pxFAgAhUowPrfS+/+0aTqpmBhSLJ9LaROdu NaT3oDUL3Y/0/iSVTGunXy9cjxdXCjgpo2M3 X-Google-Smtp-Source: AMsMyM4xrTQvK9ZQPfVXzHRpf4CE0AsHWcIAUSHJrIpncfU5RXZcjudifgA50BRW5of7UYZbNKKGHw== X-Received: by 2002:a2e:918a:0:b0:277:46da:16a6 with SMTP id f10-20020a2e918a000000b0027746da16a6mr17914946ljg.200.1667937312037; Tue, 08 Nov 2022 11:55:12 -0800 (PST) Received: from ubuntu-focal.. ([2a01:4f9:3a:1e26::2]) by smtp.gmail.com with ESMTPSA id f6-20020ac25326000000b0049487818dd9sm1918404lfh.60.2022.11.08.11.55.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 08 Nov 2022 11:55:11 -0800 (PST) From: Philipp Tomsich <philipp.tomsich@vrull.eu> To: gcc-patches@gcc.gnu.org Cc: Vineet Gupta <vineetg@rivosinc.com>, Kito Cheng <kito.cheng@gmail.com>, Jeff Law <jlaw@ventanamicro.com>, Palmer Dabbelt <palmer@rivosinc.com>, Christoph Muellner <christoph.muellner@vrull.eu>, Philipp Tomsich <philipp.tomsich@vrull.eu> Subject: [PATCH] RISC-V: branch-(not)equals-zero compares against $zero Date: Tue, 8 Nov 2022 20:55:09 +0100 Message-Id: <20221108195509.2701313-1-philipp.tomsich@vrull.eu> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
RISC-V: branch-(not)equals-zero compares against $zero
|
|
Commit Message
Philipp Tomsich
Nov. 8, 2022, 7:55 p.m. UTC
If we are testing a register or a paradoxical subreg (i.e. anything that is not a partial subreg) for equality/non-equality with zero, we can generate a branch that compares against $zero. This will work for QI, HI, SI and DImode, so we enable this for ANYI. 2020-08-30 gcc/ChangeLog: * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern. --- gcc/config/riscv/riscv.md | 13 +++++++++++++ 1 file changed, 13 insertions(+)
Comments
On 11/8/22 12:55, Philipp Tomsich wrote: > If we are testing a register or a paradoxical subreg (i.e. anything that is not > a partial subreg) for equality/non-equality with zero, we can generate a branch > that compares against $zero. This will work for QI, HI, SI and DImode, so we > enable this for ANYI. > > 2020-08-30 gcc/ChangeLog: > > * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern. I've gone back an forth on this a few times. As you know, I hate subregs in the target descriptions and I guess I need to extend that to querying if something is a subreg or not rather than just subregs appearing in the RTL. Presumably the idea behind rejecting partial subregs is the bits outside the partial is unspecified, but that's also going to be true if we're looking at a hardreg in QImode (for example) irrespective of it being wrapped in a subreg. I don't doubt it works the vast majority of the time, but I haven't been able to convince myself it'll work all the time. How do we ensure that the bits outside the mode are zero? I've been bitten by this kind of problem before, and it's safe to say it was exceedingly painful to find. Jeff
On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote: > > On 11/8/22 12:55, Philipp Tomsich wrote: >> If we are testing a register or a paradoxical subreg (i.e. anything that is not >> a partial subreg) for equality/non-equality with zero, we can generate a branch >> that compares against $zero. This will work for QI, HI, SI and DImode, so we >> enable this for ANYI. >> >> 2020-08-30 gcc/ChangeLog: >> >> * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern. > > I've gone back an forth on this a few times. As you know, I hate > subregs in the target descriptions and I guess I need to extend that to > querying if something is a subreg or not rather than just subregs > appearing in the RTL. > > > Presumably the idea behind rejecting partial subregs is the bits outside > the partial is unspecified, but that's also going to be true if we're > looking at a hardreg in QImode (for example) irrespective of it being > wrapped in a subreg. > > > I don't doubt it works the vast majority of the time, but I haven't been > able to convince myself it'll work all the time. How do we ensure that > the bits outside the mode are zero? I've been bitten by this kind of > problem before, and it's safe to say it was exceedingly painful to find. I don't really understand the middle-end issues here (if there are any?), but I'm pretty sure code like this has passed by a few times before and we've yet to find a reliable way to optimize these cases. There's a bunch of patterns where knowing the XLEN-extension of shorter values would let us generate better code, but there's also cases where we'd generate worse code by ensure any extension scheme is followed. Every time I've seen this come up before I've managed to convince myself we can't really fix the problem in the backend, though: if we always generate extended values in registers then we just push the cost over to the other patterns. The only way I've come up with to handle something like this is to push more types into the middle-end so we can track these high bits and generate the faster sequences where we know what they are. That seems like a huge mess, though, and every time it comes up folks run away ;) Sorry if that's kind of vague, I usually find a way to break these but my box isn't cooperating with GCC builds today so I haven't even gotten that far yet...
On Fri, 18 Nov 2022 at 05:53, Palmer Dabbelt <palmer@rivosinc.com> wrote: > > On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote: > > > > On 11/8/22 12:55, Philipp Tomsich wrote: > >> If we are testing a register or a paradoxical subreg (i.e. anything that is not > >> a partial subreg) for equality/non-equality with zero, we can generate a branch > >> that compares against $zero. This will work for QI, HI, SI and DImode, so we > >> enable this for ANYI. > >> > >> 2020-08-30 gcc/ChangeLog: > >> > >> * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern. > > > > I've gone back an forth on this a few times. As you know, I hate > > subregs in the target descriptions and I guess I need to extend that to > > querying if something is a subreg or not rather than just subregs > > appearing in the RTL. > > > > > > Presumably the idea behind rejecting partial subregs is the bits outside > > the partial is unspecified, but that's also going to be true if we're > > looking at a hardreg in QImode (for example) irrespective of it being > > wrapped in a subreg. > > > > > > I don't doubt it works the vast majority of the time, but I haven't been > > able to convince myself it'll work all the time. How do we ensure that > > the bits outside the mode are zero? I've been bitten by this kind of > > problem before, and it's safe to say it was exceedingly painful to find. > > I don't really understand the middle-end issues here (if there are > any?), but I'm pretty sure code like this has passed by a few times > before and we've yet to find a reliable way to optimize these cases. > There's a bunch of patterns where knowing the XLEN-extension of shorter > values would let us generate better code, but there's also cases where > we'd generate worse code by ensure any extension scheme is followed. > > Every time I've seen this come up before I've managed to convince myself > we can't really fix the problem in the backend, though: if we always > generate extended values in registers then we just push the cost over to > the other patterns. The only way I've come up with to handle something > like this is to push more types into the middle-end so we can track > these high bits and generate the faster sequences where we know what > they are. That seems like a huge mess, though, and every time it comes > up folks run away ;) You are perfectly right that this problem can not be fixed in the backend, at least not in a general manner (i.e., additional patterns can resolve some of the cases and it is messy in the backend). In fact, we are looking at fixing this before/during lowering by avoiding the extension whenever possible (based on the type information and even value ranges). However, this work will miss GCC13 and that is the reason why the band-aid was submitted here. > Sorry if that's kind of vague, I usually find a way to break these but > my box isn't cooperating with GCC builds today so I haven't even gotten > that far yet... I am gathering the original rationale why this should be safe from our internal communication (the change is 2 years old, after all) and will follow up. If you find a way to break this in the meantime, please let us know. Philipp.
On 11/17/22 21:53, Palmer Dabbelt wrote: > On Thu, 17 Nov 2022 14:44:31 PST (-0800), jeffreyalaw@gmail.com wrote: >> >> On 11/8/22 12:55, Philipp Tomsich wrote: >>> If we are testing a register or a paradoxical subreg (i.e. anything >>> that is not >>> a partial subreg) for equality/non-equality with zero, we can >>> generate a branch >>> that compares against $zero. This will work for QI, HI, SI and >>> DImode, so we >>> enable this for ANYI. >>> >>> 2020-08-30 gcc/ChangeLog: >>> >>> * config/riscv/riscv.md (*branch<mode>_equals_zero): Added pattern. >> >> I've gone back an forth on this a few times. As you know, I hate >> subregs in the target descriptions and I guess I need to extend that to >> querying if something is a subreg or not rather than just subregs >> appearing in the RTL. >> >> >> Presumably the idea behind rejecting partial subregs is the bits outside >> the partial is unspecified, but that's also going to be true if we're >> looking at a hardreg in QImode (for example) irrespective of it being >> wrapped in a subreg. >> >> >> I don't doubt it works the vast majority of the time, but I haven't been >> able to convince myself it'll work all the time. How do we ensure that >> the bits outside the mode are zero? I've been bitten by this kind of >> problem before, and it's safe to say it was exceedingly painful to find. > > I don't really understand the middle-end issues here (if there are > any?), but I'm pretty sure code like this has passed by a few times > before and we've yet to find a reliable way to optimize these cases. > There's a bunch of patterns where knowing the XLEN-extension of > shorter values would let us generate better code, but there's also > cases where we'd generate worse code by ensure any extension scheme is > followed. It's not really the extension scheme, though that is a subset of the concerns in this space. Essentially we have to be 100% sure that the bits outside of the branch mode (QI/HI/SI) and XLEN are zero, it's not just the sign bit. This becomes even more of a concern as we exploit the bitmanip extensions more aggressively. The SUBREG check is supposed to avoid that problem, but I'm not convinced it's sufficient. Philipp claims that PROMOTE_MODE plus WORD_REGISTER_OPERATIONS is sufficient here, but I'm not sure that's the case. He's digging out the rationale from some internal archives which we'll dig into once he finds it. I'd be happy to be proved wrong :-) jeff
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 798f7370a08..97f6ca37891 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -2205,6 +2205,19 @@ ;; Conditional branches +(define_insn "*branch<mode>_equals_zero" + [(set (pc) + (if_then_else + (match_operator 1 "equality_operator" + [(match_operand:ANYI 2 "register_operand" "r") + (const_int 0)]) + (label_ref (match_operand 0 "" "")) + (pc)))] + "!partial_subreg_p (operands[2])" + "b%C1\t%2,zero,%0" + [(set_attr "type" "branch") + (set_attr "mode" "none")]) + (define_insn "*branch<mode>" [(set (pc) (if_then_else