[v2] RISC-V: Split unordered FP comparisons into individual RTL insns

  We have unordered FP comparisons implemented as RTL insns that produce 
multiple machine instructions.  Such RTL insns are hard to match with a 
processor pipeline description and additionally there is a redundant 
SNEZ instruction produced on the result of these comparisons even though 
the FLT.fmt and FLE.fmt machine instructions already produce either 0 or 
1, e.g.:

long
flt (double x, double y)
{
  return __builtin_isless (x, y);
}

with `-O2 -fno-finite-math-only -ftrapping-math -fno-signaling-nans' 
gets compiled to:

	.globl	flt
	.type	flt, @function
flt:
	frflags	a5
	flt.d	a0,fa0,fa1
	fsflags	a5
	snez	a0,a0
	ret
	.size	flt, .-flt

because the middle end can't see through the UNSPEC operation unordered 
FP comparisons have been defined in terms of.

These instructions are only produced via an expander already, so change 
the expander to emit individual RTL insns for each machine instruction 
in the ultimate sequence produced rather than deferring to a 
single RTL insn producing the whole sequence at once.
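
As a rough illustration of what the sequence does, here is a toy software 
model in C, one statement per machine instruction.  It is not GCC code: 
`fflags', `FFLAGS_NV', `model_flt_d' and `model_isless' are names made up 
for this sketch, and the model only covers the quiet NaN case.

```c
/* Toy software model of the quiet comparison sequence; not GCC code.
   `fflags' stands in for the accrued exception flags register and
   FFLAGS_NV for its invalid operation bit (bit 4 in the RISC-V ISA).  */

#define FFLAGS_NV 0x10u

static unsigned int fflags;

/* Model of FLT.D: signals invalid on any NaN operand, yields 0 or 1.  */
static long
model_flt_d (double x, double y)
{
  if (x != x || y != y)		/* Either operand is a NaN.  */
    {
      fflags |= FFLAGS_NV;
      return 0;
    }
  return x < y;
}

/* Model of the whole sequence, one statement per machine instruction.  */
static long
model_isless (double x, double y)
{
  unsigned int saved = fflags;		/* frflags a5 */
  long result = model_flt_d (x, y);	/* flt.d a0,fa0,fa1 */
  fflags = saved;			/* fsflags a5 */
  return result;
}
```

The result of `model_flt_d' is already 0 or 1, which is why no further 
normalisation step (the SNEZ above) is needed on top of it.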

	gcc/
	* config/riscv/riscv.md (UNSPECV_FSNVSNAN): New constant.
	(QUIET_PATTERN): New int attribute.
	(f<quiet_pattern>_quiet<ANYF:mode><X:mode>4): Emit the intended 
	RTL insns entirely within the preparation statements.
	(*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_default)
	(*f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_snan): Remove 
	insns.
	(*riscv_fsnvsnan<mode>2): New insn.

	gcc/testsuite/
	* gcc.target/riscv/fle-ieee.c: New test.
	* gcc.target/riscv/fle-snan.c: New test.
	* gcc.target/riscv/fle.c: New test.
	* gcc.target/riscv/flef-ieee.c: New test.
	* gcc.target/riscv/flef-snan.c: New test.
	* gcc.target/riscv/flef.c: New test.
	* gcc.target/riscv/flt-ieee.c: New test.
	* gcc.target/riscv/flt-snan.c: New test.
	* gcc.target/riscv/flt.c: New test.
	* gcc.target/riscv/fltf-ieee.c: New test.
	* gcc.target/riscv/fltf-snan.c: New test.
	* gcc.target/riscv/fltf.c: New test.
---
Hi Kito,

 Thank you for your review.

> Thanks for detail analysis and performance number report, I am concern
> about this patch might let compiler schedule the fsflags/frflags with
> other floating point instructions, and the major issue is we didn't
> model fflags right in GCC as you mentioned in the first email.

 I have now looked through various places and we're good.

 First of all the C language standard has this:

"F.8.1 Environment management

"IEC 60559 requires that floating-point operations implicitly raise 
floating-point exception status flags, and that rounding control modes can 
be set explicitly to affect result values of floating-point operations. 
When the state for the FENV_ACCESS pragma (defined in <fenv.h>) is "on", 
these changes to the floating-point state are treated as side effects 
which respect sequence points."

We don't actually support the FENV_ACCESS pragma, however we provide for 
having FP environment support in the C library (e.g. `riscv_getflags' and 
`riscv_setflags' among others in the RISC-V port of glibc):

"  * 'The default state for the 'FENV_ACCESS' pragma (C99 and C11
     7.6.1).'

"    This pragma is not implemented, but the default is to "off" unless
     '-frounding-math' is used in which case it is "on"."

(I find this misleading, because my interpretation of our documentation 
and code is that the default is "on" whenever `-frounding-math' and 
`-ftrapping-math' are active both at a time; arguably the text is however 
correct, because `-ftrapping-math' is on by default, so it doesn't have to 
"be used" and it's `-fno-trapping-math' that has to "be unused" for the 
effect to be in place of what FENV_ACCESS "on" would do should we 
implement it).

 Now `riscv_getflags' and `riscv_setflags' use `asm volatile' to peek or 
poke at IEEE exception flags, so for these pieces of code to work the 
compiler has to make sure not to reorder volatile instructions around 
trapping instructions and that is handled in `can_move_insns_across':

	  if (may_trap_or_fault_p (PATTERN (insn))
	      && (trapping_insns_in_across
		  || other_branch_live != NULL
		  || volatile_insn_p (PATTERN (insn))))
	    break;

(cf.: <https://gcc.gnu.org/ml/gcc-patches/2013-01/msg00254.html>, and 
commit c6d851b95a7b).  And we consider FP arithmetic instructions trapping 
under `-ftrapping-math' in `may_trap_p_1':

    case NEG:
    case ABS:
    case SUBREG:
    case VEC_MERGE:
    case VEC_SELECT:
    case VEC_CONCAT:
    case VEC_DUPLICATE:
      /* These operations don't trap even with floating point.  */
      break;

    default:
      /* Any floating arithmetic may trap.  */
      if (FLOAT_MODE_P (GET_MODE (x)) && flag_trapping_math)
	return 1;

(other relevant operations are handled elsewhere within this switch 
statement).  This is consistent with my observations where only FLD or 
FABS.D (neither trapping) get reordered across FRFLAGS (volatile by means 
of `unspec_volatile').
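
 For illustration only, accessors of this kind could be sketched as 
follows.  The `sketch_*' names are hypothetical and this is not the glibc 
source; the non-RISC-V fallback exists solely so that the sketch builds on 
other hosts.  The last function relies on the property discussed above, 
i.e. that under `-ftrapping-math' the comparison is not moved across the 
volatile accesses.

```c
/* Hypothetical accessors in the style of `riscv_getflags' and
   `riscv_setflags'; all names here are made up for illustration.  */

#if defined (__riscv) && defined (__riscv_flen)
static inline unsigned int
sketch_getflags (void)
{
  unsigned int flags;
  /* `volatile' keeps the compiler from deleting this access.  */
  __asm__ volatile ("frflags %0" : "=r" (flags));
  return flags;
}

static inline void
sketch_setflags (unsigned int flags)
{
  __asm__ volatile ("fsflags %0" : : "r" (flags));
}
#else
/* Fallback for other hosts: a plain variable stands in for the flags
   register, so no real exception state is observed here.  */
static volatile unsigned int sketch_fflags;

static inline unsigned int
sketch_getflags (void)
{
  return sketch_fflags;
}

static inline void
sketch_setflags (unsigned int flags)
{
  sketch_fflags = flags;
}
#endif

/* Quiet less-than built from the accessors: save, compare, restore.  */
static inline long
sketch_quiet_lt (double x, double y)
{
  unsigned int saved = sketch_getflags ();
  long result = x < y;
  sketch_setflags (saved);
  return result;
}
```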

 However following the observations above I chose to update the test cases 
to better reflect how we control IEEE exception handling and use 
`-fno-trapping-math' rather than `-ffinite-math-only' to disable the use 
of unordered comparison operations.

> So I think we should model this right before we split that, I guess
> that would be a bunch of work:
> 1. Add fflags to the hard register list.
> 2. Add (clobber (reg fflags)) or (set (reg fflags) (fpop
> (operands...))) to those floating point operations which might change
> fflags
> 3. Rewrite riscv_frflags and riscv_fsflags pattern by right RTL
> pattern: (set (reg) (reg fflags)) and (set (reg fflags) (reg)).
> 4. Then split *f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_default and
> *f<quiet_pattern>_quiet<ANYF:mode><X:mode>4_snan pattern.
> 
> However I am not sure about the code gen impact of 2, especially the
> impact to the combine pass, not sure if you are interested to give a
> try?

 I think we can look into it long-term as a further optimisation, but to 
solve the problem of insn scheduling at hand my current proposal seems 
adequate enough and I suspect adding full data dependency tracking for the 
IEEE flags could be a rabbit hole (ISTM after all no other target does 
that, and I smell there's been a reason for that).

 FAOD if at all I'd envision doing such tracking individually for each
exception flag, following "Accumulating CSRs" listings from Section 14.3 
"Source and Destination Register Listings" in the unprivileged ISA spec.

 While re-reviewing code I have spotted I previously missed operand 
constraints for the new `*riscv_fsnvsnan<mode>2' insn, so I have added 
them now.

 I have verified that the new test cases still pass with the update in 
place, and I have rerun full `-fsignaling-nans' regression testing for the 
constraint fix.  OK to apply?

  Maciej

Changes from v1:

- Add operand constraints for the new `*riscv_fsnvsnan<mode>2' insn.

- In test cases use `-fno-trapping-math' rather than `-ffinite-math-only'; 
  consequently force `-fno-finite-math-only' so that any defaults do not 
  interfere with the results expected.
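
For reference, a test case for the non-trapping variant might look roughly 
like the sketch below.  This is not a copy of any of the files added by the 
patch: the exact option set and the patterns scanned for are assumptions 
made for illustration.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fno-finite-math-only -fno-trapping-math -fno-signaling-nans" } */

long
flt (double x, double y)
{
  return __builtin_isless (x, y);
}

/* Assumed expectations: a plain comparison instruction (either operand
   order accepted), with no flag save/restore and no SNEZ.  */
/* { dg-final { scan-assembler "\tf(lt|gt)\\.d\t" } } */
/* { dg-final { scan-assembler-not "frflags" } } */
/* { dg-final { scan-assembler-not "snez" } } */
```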
---
 gcc/config/riscv/riscv.md                  |   67 +++++++++++++++--------------
 gcc/testsuite/gcc.target/riscv/fle-ieee.c  |   12 +++++
 gcc/testsuite/gcc.target/riscv/fle-snan.c  |   12 +++++
 gcc/testsuite/gcc.target/riscv/fle.c       |   12 +++++
 gcc/testsuite/gcc.target/riscv/flef-ieee.c |   12 +++++
 gcc/testsuite/gcc.target/riscv/flef-snan.c |   12 +++++
 gcc/testsuite/gcc.target/riscv/flef.c      |   12 +++++
 gcc/testsuite/gcc.target/riscv/flt-ieee.c  |   12 +++++
 gcc/testsuite/gcc.target/riscv/flt-snan.c  |   12 +++++
 gcc/testsuite/gcc.target/riscv/flt.c       |   12 +++++
 gcc/testsuite/gcc.target/riscv/fltf-ieee.c |   12 +++++
 gcc/testsuite/gcc.target/riscv/fltf-snan.c |   12 +++++
 gcc/testsuite/gcc.target/riscv/fltf.c      |   12 +++++
 13 files changed, 179 insertions(+), 32 deletions(-)

gcc-riscv-fcmp-split.diff

Message ID	alpine.DEB.2.20.2207011936120.10833@tpp.orcam.me.uk
State	Deferred, archived
Date	Mon, 4 Jul 2022 15:12:08 +0100 (BST)
From	"Maciej W. Rozycki" <macro@embecosm.com>
To	Kito Cheng <kito.cheng@gmail.com>
Cc	GCC Patches <gcc-patches@gcc.gnu.org>, Andrew Waterman <andrew@sifive.com>
Subject	[PATCH v2] RISC-V: Split unordered FP comparisons into individual RTL insns
